One of the big imponderables of desktop video conferencing is whether anyone will actually want it. Do you really want people looking at you while they chat to you in the office? Will the population in general be happy with moderate resolution and slightly jerky displays? Since the middle of 1990, users at the Olivetti Research Lab and at the nearby University of Cambridge Computer Laboratory have had videoconferencing systems on their desks, linked by a proprietary Asynchronous Transfer Mode network, and the insight gained into how these systems work in practice is rare. It is, to be sure, a small, rather specialised community: there are only about 40 units in use, with a total of around 50 users, and the Lab acknowledges quite bluntly that the results will not scale up accurately to larger populations. Nonetheless, the researchers say that their expectations have been confounded in many cases, as facilities became popular for quite unexpected reasons.
The system that Olivetti put together in the late 1980s, dubbed Pandora, is notable for its flexibility: every Pandora’s Box contains six T45 Transputers, each with its own discrete function, some handling video, one for audio, and another for data. The whole lot is then combined by the final Transputer for display as a standard X Window desktop. The end product is a very versatile multimedia workstation which, thanks to the way that it is networked, can offer anything from multi-way video conferencing in multiple windows to audio feeds from a networked compact disc player or networked video from national television. Over the years the Lab has learned quite a bit about how these services get used in practice.
Videophone
Potentially good news for telecommunications companies is that two-way videophone conversations are popular and typically last longer than telephone conversations for the same users. A summary of findings suggests that ‘At Pandora resolutions, body language is passed over well and it is possible to tell whether the correspondent is still interested in what is being said. In effect conversations can last quite a long time because of this feedback.’ Which means that people at Olivetti manage to look a lot more interested than most people during phone calls. Nor did potential problems with eye contact prove insurmountable. Cameras mounted to the side of or above the screen mean that users are never looking at each other eye to eye, but luckily the comparatively low resolution of the screen means that this is not immediately apparent. In fact, it turned out that users would put up with lower quality pictures than expected: what they really hate is audio that breaks up. When it comes to pictures, the question that has been exercising the designers’ minds is whether users prefer their pictures to be quick-moving but granular, or high resolution and jerky. And the answer, according to the Labs, is that users prefer to sacrifice resolution for speed: it is, after all, natural to see someone through a fog, but to speak to someone whose lips don’t move in time with their words isn’t part of our day-to-day experience. Most videoconferencing networks have strict rules about who can see whom. With Pandora, by contrast, anyone can peek into anyone’s room to see if they are busy, without getting permission in advance: the links are bi-directional, so that each user sees a picture of the other pop up on their screen. What they cannot do, without permission, is listen in to the room. The Labs have found that taking a peek is moderately popular, with users popping up for three or four seconds before terminating a call with a smile and a wave.
If video-conferencing and video-mail (voice-mail with pictures) have proved popular, then the multimedia applications have flopped. The researchers provided a facility whereby video clips can be embedded into standard electronic mail and, much to their surprise, found that it did not get used much.
By Chris Rose
Now, with a huge wave of multimedia and Object Linking and Embedding-enabled applications poised to crash over the user, it becomes important to determine exactly why people are not using the facility at the Labs. The simplest explanation is that the actual process of generating these documents is too complex: that the tools that the Labs have developed are just not good enough. More disturbing, though, for the likes of Microsoft Corp and Lotus Development Corp is the possibility that text and video just don’t mix in business situations; that, in the words of the summary: ‘video-mail and text mail provide different methods of communication and it is quite natural to make a choice of the one better suited at a particular time. For the majority of communications the video-mail system is appropriate because it is quicker, but where particular accuracy is concerned text-mail is used instead.’ The Lab is loath to interpret these findings just yet. Research engineer Chris Turner says that it may just be that today’s users have been conditioned to think of word processors as just that, and still have difficulty getting used to the idea of moving pictures in there. On the other hand, it could be that video and electronic mail represent fundamentally different modes of communication. But even if that is the case, it’s worth noting that the researchers are hardly the most representative set of users, in that they already have videoconferencing available to them. Bearing in mind that most of us don’t, it could be that video-enabled electronic mail has a window of opportunity. And before we do have videoconferencing on our desks, we are going to have to move them: yes, it is the prosaic problems of lighting and office layout that stand between people and the brave new world in question. As Andy Hopper, director of corporate research at Ing C Olivetti & Co SpA, points out, people don’t light their offices like film studios and until they do, or charge-coupled device cameras get better, the outlook will remain murky.
It is not only the lighting that differentiates office from film set: acoustics also need consideration, as does users’ propensity for standing with their backs to the camera. This last problem is at the front of the researchers’ minds as they develop Medusa, daughter of Pandora. Medusa takes what has been learned from Pandora and attempts to implement it in a hardware- and software-independent way: peripherals are no longer attached to Pandora’s Box, but directly to the Asynchronous Transfer Mode network. These include processor farms, RAID file repositories and, for input, camera and audio clusters.
Fishy eye
Most important from the user’s point of view (literally) is the sudden growth in the number of cameras. Today’s videoconference users are used to being transfixed by a single fishy eye. With Medusa, the number of cameras per user will be increased to eight or perhaps 16, scattered around the office in an attempt to enable users to talk and work more naturally. At the other end of the wire, the recipient gets all the streams, which could be quite a problem if they were just dumped onto the desktop. Instead, the researchers are looking at designing proxy ‘intelligent’ agents to act as directors and help choose the active window, so that as the person moves from one end of the room to the other the system can ‘cut’ from one highlighted shot to the next. In the end, though, it is the viewer who is in control and able to select which view is most appropriate. The extra complication and expense are designed to bring videoconferencing closer to face-to-face contact: multiple views allow the eye to flick about the room, to switch between the person’s face and the piece of paper that they are holding. The multi-stream capabilities become even more important where conferences are concerned, and the Lab is looking at equipping its lecture theatre with 32, maybe 64 cameras, which should stop anyone from dozing off during presentations. With Medusa, the scope for experimentation once again broadens out and Andy Hopper admits ‘I don’t know how it will work out.’ He talks enthusiastically about the possibilities of using the multiple streams of data to generate stereoscopic images and asks his audience: ‘What do you think, is that any good, is that worth a billion dollars?’