ooVoo
ooVoo is a multi-platform video conferencing and chat application that advertises support for up to 12 simultaneous video streams.
Unlike Skype, ooVoo does not use peer-to-peer connections, relying instead on a cloud-based configuration. Apparently because of this, ooVoo's CEO has stated that ooVoo isn't typically used for "scheduled international calls or professional meetings," and that it focuses instead on casual users. Nevertheless, the Washington Post reported in 2011 that ooVoo had been approved for official communications within the US House of Representatives.
The ooVoo Interface
ooVoo presents a cluttered, non-native interface spread across multiple windows for its various tasks: a contacts list, a text chat window, a video chat window, and so on. Every window in ooVoo displays advertisements. Presumably these are targeted: during our intercontinental chats, I was shown a Volkswagen ad composed of text in Spanish, German, and English simultaneously. Some ads automatically play audio, ruining any recording you might be making. This is significant, since call recording is a built-in feature.
In a video conference, ooVoo partitions the window based on how many participants are present. In the default view with two people, the video streams appear as adjacent rectangles. With three people present, the right and left streams are angled in a gratuitous 3D effect. As an alternative, any single stream can be designated to fill the window, with the remaining streams relegated to a single corner.
Testing ooVoo
We tested ooVoo as a three-person team, with one person in Weimar and two people at separate locations in San Diego. Audio fidelity during conversation was relatively poor, as has come to be expected from internet telephony. All three of us experienced intermittent feedback bursts, in which audio delayed by about half a second would return loudly for a few seconds at a time. ooVoo's automatic gain control had little effect on this.
Weimar occasionally experienced bandwidth issues, which ooVoo deals with by automatically dropping the affected video stream, while keeping audio alive.
Exploring
We explored latency pragmatically. As an initial experiment, we attempted to clap steadily as a group, and failed. This failure found new form in our second attempt, where our hands were kept clearly visible in front of the camera, providing both strong visual and audio cues for timing.
The power of feedback in mediating our group dynamic led us to our next experiment. We decided upon a feedback-based "call and response" exercise that would demonstrate the audio, visual, and perceptual latencies present.
Weimar found a way to stream a video file in place of the webcam image, and generated a series of stroboscopic videos displaying a burst of white between lulls of blackness. When darkness fell in San Diego, Weimar began broadcasting. With each flash, Weimar uttered the German word for light, "Licht." The San Diegans, from independent locations, monitored Weimar fullscreen. Whenever they witnessed the light of Weimar, they responded in their own native tongues, Spanish and English: "Luz!" "Light!"
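The stroboscopic pattern described above can be sketched as a per-frame brightness schedule. This is only a minimal illustration, not the method Weimar actually used: the function names `strobe_schedule` and `flash_onsets`, the frame rate, and the flash and dark durations are all assumptions chosen for the example.

```python
def strobe_schedule(fps, flash_seconds, dark_seconds, cycles):
    """Return one grayscale value per frame: 255 for white, 0 for black.

    All parameters are hypothetical; the timing of the original videos
    is not documented.
    """
    flash_frames = round(fps * flash_seconds)
    dark_frames = round(fps * dark_seconds)
    frames = []
    for _ in range(cycles):
        frames.extend([0] * dark_frames)     # lull of blackness
        frames.extend([255] * flash_frames)  # burst of white
    return frames


def flash_onsets(frames, fps):
    """Return the time in seconds at which each flash begins."""
    onsets = []
    previous = 0
    for index, value in enumerate(frames):
        if value == 255 and previous == 0:
            onsets.append(index / fps)
        previous = value
    return onsets
```

For instance, `strobe_schedule(30, 0.1, 1.9, 3)` would describe three two-second cycles at 30 frames per second, each ending in a three-frame flash. Writing the frames to an actual video file is left out here, since it depends on whichever encoder is at hand.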
With microphone gains and speakers set to maximum, an unsteady but hypnotic mantra emerged. The whole process was recorded by Weimar, who was witness to all three streams. Playback provided clear documentation of the timing discrepancies of audio and video between the three locations. Unable to resist the cycle of feedback, Weimar again recorded the viewing of this recording. We present the result here.
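One way to put numbers on the timing discrepancies visible in such a recording is to pair each flash with the first response that follows it. The sketch below assumes the flash and response times have already been read off the recording by hand. The function name `response_latencies` and the sample timestamps are hypothetical and are not measurements from our session.

```python
def response_latencies(flash_times, response_times):
    """Pair each flash with the first response before the next flash.

    Both inputs are assumed to be sorted lists of timestamps in seconds.
    Returns one latency per flash, or None where no response was found.
    """
    latencies = []
    for i, flash in enumerate(flash_times):
        # A response only counts for this flash if it arrives before the next one.
        if i + 1 < len(flash_times):
            next_flash = flash_times[i + 1]
        else:
            next_flash = float("inf")
        match = next(
            (r for r in response_times if flash <= r < next_flash), None
        )
        latencies.append(None if match is None else match - flash)
    return latencies


# Hypothetical timestamps, for illustration only.
flashes = [2.0, 4.0, 6.0]
responses = [2.5, 6.75]
print(response_latencies(flashes, responses))
```

Running this once per participant would give a rough per-location latency, though the figures would combine network delay with human reaction time rather than isolating either.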
As soon as it is uploaded.