Bringing `behaviors' to VRML: Making sense of the avatar debate
Commercial success of VRML awaits avatar standards outcome
By Sue Wilcox
What is holding back the deployment of the Virtual Reality Modeling Language (VRML) is the lack of a common specification for handling one's virtual representation in 3D worlds -- the avatar -- as it crosses between worlds created by different vendors. This article explores, among other things, three current proposals for standardizing avatar interoperability. (7,240 words)
At the Earth to Avatars (E2A) conference in San Francisco in October, two of the original architects of the Virtual Reality Modeling Language (VRML) explained that they never planned for 3D virtual space to fulfill everyone's dreams of cyberspace at the first attempt. Mark Pesce and Tony Parisi said they always expected VRML to pass through three stages: geometry, behaviors, and multi-user interactions. "Cyberspace will not be built in a day," they reiterated.
Last August at the SIGGRAPH conference, the second stage was reached: the much-debated VRML 2.0 specification was approved, and among other things it defined how to handle the behavior of objects. But despite a lot of talk about the usefulness of data visualizations created with VRML 2.0, marketing and business types were still searching for signs of a 'killer app' to grow the market dramatically. And the most 'killer' use of VRML seemed to be in 3D chat environments, where other human beings provided the added value needed to make sites popular -- without being too expensive to create and refresh. Given that VRML is a single-user three-dimensional environment, users preferred to be in static VRML 1.0 spaces (or VRML 1.0 spaces enhanced with Netscape's Live3D extensions, anyway) if it meant they could talk to other people while they were there. But this interaction all depended on proprietary multi-user technology to provide the means for users to see each other's representations -- avatars -- and to chat with each other.
Meanwhile, VRML software companies have also been experimenting with adding behaviors to avatars and objects that are similar to those possible with VRML 2.0, but which use proprietary formats for the most part. This was partly due to impatience with the gestation time of VRML 2.0 and partly due to a belief that proprietary solutions are better than the evolving standard ones. The avatars debate has arisen because the multiple proprietary solutions regarding how to bring avatars together to interact have fractured the potential market. VRML 2.0 will let your avatar interact with objects in a scene but does not provide a standard way to present the interactions to other avatars. In addition, there is no standard way for avatars to see and share the state of a scene, and no standard way to move from one world to another or to look from one world into another.
The result is that "pure" VRML 2.0 is little used while developers focus their efforts on formulating the multi-user capabilities necessary to produce a fully featured interactive 3D space for a multiple-user killer app. Or they're working with proprietary technologies that can do this already, e.g., Superscape plc's .svr format, Worlds Inc.'s ActiveWorld, or Oz Interactive Inc.'s Oz-Virtual.
The focus on the future and VRML alternatives has been encouraged by the slow development of VRML 2.0 browsers and authoring tools. Essentially, the VRML community is looking to VRML 3.0, and the completion of the definition of VRML, before committing to developing content that uses the specification.
The avatar standards issue is therefore crucial to the success of VRML as a commercially viable language. Until there is some common definition of an avatar, and universality of movement between spaces on the Internet, it seems unlikely that any VRML company can hope to make serious money. Although the debate focuses on avatars, it is really just a special case of object interactions passing between a variety of servers in real-time. Talking about avatars personalizes the debate and brings up special issues to do with the nature of identity, security, interpersonal relations, and the nature of society on the Internet.
A number of companies have come together in interest groups supporting various approaches. Three groups in particular have published first drafts of their specifications. (Note that the groups overlap in some areas.)
A competition everyone
Now the task for the VRML community will be to fit the variety of approaches together. Let's examine the three proposals more closely.
This means many parameters must be settled as part of the standard or assigned to be the responsibility of the world creator. From how tall an avatar can be to fit into a world, to how complex it can be to render in a reasonable time, to how you can tell if it's a person driving the avatar or an artificial intelligence -- the debate has been multifarious, provoking, fantastic, and deeply interesting. A wide range of participants have been attracted to the discussions on the UA mailing lists. Subjects covered have included a range of philosophical, social, and technical issues:
Maclen Marvit, teleologist of Worlds in San Francisco, provides this overview of UA's approach: "We are at a point in our industry where lots of companies are doing innovative things, both technically and artistically. The goal of UA is to allow users to move as freely as possible between the technologies and find the best experiences in each, while maintaining a consistent identity. So if Bernie moves from one `world' [developed using] one technology to another `world' in another technology, he can maintain his avatar's representation, his Internet phone number and his proof of identity."
A UA file for avatars as currently proposed clarifies how this will work. It would contain:
Living Worlds: making VRML 2.0 applications interpersonal and interoperable
LW will define the minimum set of system features needed to enable shared environments and the minimum set of standard interfaces required to let third-party applications interact with the shared environment. To help assess the requirements for the LW specification, the authors created a simple domestic scenario, which is summarized below and available in full as part of the draft specification. (See Resources section, below.) Imagine you are a virtual home owner receiving guests for the evening. Your avatar is in the 3D representation of your home when there is a knock at the door; you open it, and two avatars representing two of your friends walk in to visit with you. One is followed in by a pet, and both are carrying packages.
The requirements for the Living Worlds specification are illustrated in the context of the "visitors for the evening" scenario. Here is the wish list:
Bob Rockwell is chief technology officer at Black Sun Interactive, which is part of the Living Worlds group. His comment on these requirements for LW's specification is useful: "It is important to bring out that [this] list is not of things we want to standardize, but rather of things we want to hook up to in a standard way. Our job is to provide standard hooks to non-standard features that are still proprietary. To say it from the other direction, we want to pave a path for proprietary functions into the standard arena where everyone can use them."
Interoperability, openness, and innovation are LW's buzz words.
This diagram shows the major components of a multiuser system, which are the user interface, the VRML 2.0 browser, the VRML world itself, any external applications used in the world, and the MUtech. The numbers on the diagram label the interfaces. One through five are those of concern to the Living Worlds group. Number six is the job of the implementer of the MUtech and Number seven is the connection to other users over the network. LW has decided, in outline, how the components will communicate with each other. They have set the minimum requirements for interoperability between components (so any component can be supplied by any vendor) and started to design a framework so VRML 2.0 can support a multiuser application built with these components.
Living Worlds is strictly a VRML-focused effort. LW defines how to write a VRML wrapper so MUtech can plug into the browser. All of its concrete proposals for interface code are in VRML. It builds on VRML 2.0 and its authors are adamant that "it must be built entirely with VRML 2.0 mechanisms, defining as implementation specific anything which cannot be made to work inside the current standard." The design of LW assumes that VRML will be the development language of choice among cyberspace implementors.
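To make the idea of a VRML wrapper concrete, here is a minimal sketch of what a Living Worlds-style shared-object PROTO might look like. The node, field, and event names are invented for illustration and are not taken from the LW draft:

```vrml
#VRML V2.0 utf8

# Hypothetical wrapper PROTO: the MUtech drives set_position over the
# network and listens to position_changed; the wrapped geometry is
# ordinary VRML 2.0 content supplied by the world author.
PROTO SharedObject [
  field    SFString objectId ""        # identity shared across clients
  field    MFNode   content  []        # the ordinary VRML content
  eventIn  SFVec3f  set_position       # driven by the MUtech
  eventOut SFVec3f  position_changed   # reported back to the MUtech
]
{
  Transform {
    children            IS content
    set_translation     IS set_position
    translation_changed IS position_changed
  }
}
```

A MUtech plug-in would route the eventIn/eventOut pairs over its proprietary network layer, while everything inside the wrapper stays within standard VRML 2.0 mechanisms, as the LW authors require.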
Living Worlds aims to provide a framework for VRML 2.0 worlds to support interaction among Universal Avatars and promote a social community. The External Authoring Interface (EAI) proposal, developed by Silicon Graphics and Black Sun, is a key component of Living Worlds; it enables communication between a VRML world and its external environment. (Details about the EAI are located at a URL found in Resources, below.)
LW's portable VRML 2.0 API represents intransitive behaviors and the unspecified proprietary (Java) APIs support transitive behaviors. Problems with the definition of the terms transitive and intransitive [coined by Moses Ma, whose company I-Games (formerly, Velocity Games) is part of the UA team] are shown by an example: What happens if you wave your hand (intransitive) but it intersects with my head (transitive impact)? Even accepting all that Living Worlds can do, a content developer must still build or license a proprietary multiuser API and server in order to run a multiuser world.
Ultimately the Living Worlds group intends to handle everything via dynamically downloaded Java applets. But at present, most systems prohibit Java applets from accessing local files, which makes it impossible, for example, to connect to locally installed third-party software features. Until Java can access local files, the management of downloads and local access will be left to proprietary MUtech solutions.
Franz Buchenberger, chief executive officer of Black Sun, points out that until LW and the rest of the standardization initiative came along "each vendor of VRML-based online technology had to develop and market servers, avatars, clients, and worlds in order to ensure interoperability between all the pieces."
Now it will be possible to develop just content (avatars or worlds), or multi-user servers, knowing the other pieces will work together with it. This should enable an accelerated rate of development and better user confidence. Gregory Slayton, president of ParaGraph International agrees: "The Living Worlds standard should open up revenue-producing opportunities for companies that will specialize in helping people to develop their own personalized digital representations, or avatars, and to update them over time."
The supporting companies of the Living Worlds proposal have agreed that LW shall form a working group under the VRML Consortium, which was set up a few months ago to promote VRML standardization across the entire industry. The working group will benefit from the structure provided by the Consortium and from the input provided by all the Consortium member companies.
"This will ensure that the resulting standards take into account the needs of the entire community" said Tony Parisi, president of InterVista Software Inc and co-developer of VRML. The rationale for taking Living Worlds under the VRML Consortium umbrella is summarized by Bob Rockwell: "We want a standard which will `canalize' innovation toward a common standard. Hence, the creation of a standard set of interfaces for proprietary solutions to write to, and a clear statement of direction that we want those innovative proprietary efforts to eventually be brought into as a best-of-breed solution for the future standard."
Does this mean that, one day, all the proprietary solutions would become one? "More likely, there will emerge a common core of features for which the proprietary systems agree to use common interfaces," he says, "just as we have already done for avatar motion and behaviors that avatars are capable of."
The roadmap for the first draft of the LW specification includes: handling avatars and other active objects, simple communication using chat and business cards, a capabilities interface for whiteboards, and a simple security/rights model. The development timescale for the LW project is tight because the authors feel that the only way to test their ideas is by getting examples up and running, and trying out the interoperability of a group of worlds built according to their principles. They want a body of application experience to be accumulating while the VRML community debates the proposals for a VRML 2.x.
Draft 1.0 of the LW specification is expected before the end of 1996 so there can be implementations available before the community meets at the VRML '97 Symposium in February. (For more information on the symposium, see the Resources listing below.) Looking ahead, Black Sun's Rockwell says that "Phase Two will be the distillation of our implementation experience with Phase One into a set of proposed extensions to VRML -- not just a library of ExternPROTOs but a new set of language elements covering this domain. Also, we expect to identify a few areas where the current standard needs improving, either wholly new features or things that should be done differently.
"The most obvious example is that of object-to-object collision detection, which the graphics guys were unwilling to accept in 2.0 because it caused them headaches (made certain things they do to optimize browser performance difficult to impossible). But object-to-object collision detection is utterly essential to multi-user work."
If there is an area of disagreement between UA and LW it is on the timing of "nailing down network protocols." UA wants to standardize an API -- essentially a standard protocol -- for applications to communicate through over the Internet, whereas LW wants to postpone this until VRML is multiuser.
Comparing Universal Avatars and Living Worlds approaches
One company involved with both the UA and the LW proposals is IBM. The company's "avatar wrangler," Abbott Brush, clarifies its view of the two approaches: "The Universal Avatar proposal deals with issues ranging from persistent avatar identity to conventions for an inter-avatar behavior framework. The Living Worlds proposal involves specifying interoperable multi-user technology for inter-personal communications. IBM is very interested in providing avatars that are `personal, portable, and persistent,' so we are very pleased to be involved with both these complementary initiatives."
There is undoubtedly going to be a lot of what Abbott describes as "back room sessions" as the teams try to work out how they can position their ongoing work to ensure collaboration. To perhaps point up the politicking, Abbott notes that Universal Worlds/Open Community will probably wind up supporting Living Worlds and will be a reference implementation of it. He explains that it is based on Mitsubishi Electric Research Laboratory's (MERL's) SPLINE system, not VRML, which is of great interest to the other UA members who have non-VRML solutions. Moreover, UW/OC also includes application programming interfaces (APIs) for authentication and commerce. However, this goes beyond LW 1.0.
"Based on our existing commerce products, IBM is very interested in providing front-end technology to our transaction systems," Abbott says. "We will follow UW/OC, however, we are committed to VRML-based solutions." (To see IBM's avatar demo, refer to URL listed in Resources, below.)
Open Community: formerly Universal Worlds
The Open Community proposal is a joint development of members of the UA group and MERL (Mitsubishi Electric Research Laboratory). Open Community is based on SPLINE (Scalable Platform for Large Interactive Networked Environments), a software architecture developed by MERL for dealing with multiple users over distributed systems. SPLINE has been in development for over three years; its Diamond Park demo runs only on SGI machines and is built on the Performer graphics toolkit. (Diamond Park is a virtual world where avatars can move around and talk to other avatars, ride bicycles in a velodrome, create new worlds, and play multiuser games.) Mitsubishi is currently working on commercializing SPLINE through its VSIS division and is moving it across to the personal computer and VRML.
The particular features of SPLINE that make it so useful as the foundation for Open Community are: it uses "regions" to scale the world to the number of users; its multi-protocol communication system lets it cope with new media types easily; peer-to-peer communication reduces bandwidth and latency for data transmission; and strict object ownership prevents conflicts between interacting objects. (Regions are areas of a world, so a whole world is not presented at once, just the region around the user's viewpoint.)
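The region idea can be sketched in a few lines of Java. This is a hypothetical illustration of grid-based region filtering, not SPLINE's actual API; the class and method names are invented:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: divide the world's ground plane into a grid of
// square regions and subscribe a client only to the regions around its
// viewpoint, so updates from distant parts of the world never travel.
class RegionGrid {
    private final double regionSize;

    RegionGrid(double regionSize) { this.regionSize = regionSize; }

    // Map a world position to a region id by grid cell.
    String regionFor(double x, double z) {
        long gx = (long) Math.floor(x / regionSize);
        long gz = (long) Math.floor(z / regionSize);
        return gx + ":" + gz;
    }

    // The 3x3 block of regions surrounding the viewpoint; a client
    // would subscribe to exactly these and ignore everything else.
    Set<String> regionsOfInterest(double x, double z) {
        Set<String> ids = new HashSet<>();
        for (int dx = -1; dx <= 1; dx++)
            for (int dz = -1; dz <= 1; dz++)
                ids.add(regionFor(x + dx * regionSize, z + dz * regionSize));
        return ids;
    }
}
```

With 100-meter regions, an avatar standing at (50, 50) would subscribe to nine regions and receive no traffic from the rest of the world, which is what lets the scheme scale with the number of users.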
SPLINE also enables the use of non-VRML applications such as a shared whiteboard or other collaborative applications in a virtual space. Many in the 3D community hope that Mitsubishi is going to donate SPLINE as part of the open standard for Open Community, which Glenn Coffman, director of market and business development at VSIS Inc., says is their expectation.
The "world model"
Open Community servers
Yet other servers act as name servers that associate names with objects and enable a process to find a remote object even if it is outside its current region. By spreading the load of managing the entire distributed system in this manner, no one server is overloaded and all servers can provide the best possible service to the client processes.
The Open Community
There are three kinds of data in the world model: small rapidly changing objects, large slowly changing objects, and continuous streams of data. Open Community has an efficient scheme for synchronizing these different kinds of data. Small objects travel as UDP (User Datagram Protocol) or TCP packets. Large objects (including graphic models, recorded sounds, and behaviors) are identified by URLs and fetched via standard Web protocols, in standard formats such as VRML, MIDI, and Java. Streamed data (such as audio or video) travels in a multicast or unicast message. As there is no limitation on the size of large objects, preloading is used to reduce latency time, but CD-ROM data storage can also be used with Open Community to reduce download time.
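The three-way split can be illustrated with a small Java sketch; the type and method names here are hypothetical and are not part of the Open Community API:

```java
// Hypothetical illustration of Open Community's three data categories
// and their transports, as described in the article.
enum Transport { UDP_OR_TCP, WEB_BY_URL, MULTICAST_STREAM }

class WorldData {
    // Small, rapidly changing objects (e.g., avatar positions) go as
    // UDP or TCP packets; large, slowly changing objects (models,
    // sounds, behaviors) are named by URL and fetched over standard
    // Web protocols; continuous media is streamed via multicast or
    // unicast messages.
    static Transport transportFor(String kind) {
        switch (kind) {
            case "small":  return Transport.UDP_OR_TCP;
            case "large":  return Transport.WEB_BY_URL;
            case "stream": return Transport.MULTICAST_STREAM;
            default: throw new IllegalArgumentException("unknown kind: " + kind);
        }
    }
}
```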
But even for an AI, if it has no eyes it can be hard to tell which way an avatar is facing. The Open Community world model contains an object class used to identify avatars and a convention is imposed that the object faces down the negative Z axis. There is also an extensible facility for labeling objects with tags that carry semantic information. Conventions will be needed to establish how to use these tags so interoperability is maintained.
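As a worked illustration of the facing convention, the following hypothetical Java helper rotates the default facing vector (0, 0, -1) about the vertical Y axis by an avatar's yaw angle; none of this code comes from the Open Community specification:

```java
// Hypothetical helper: given an avatar's rotation about the Y (up)
// axis, compute its world-space facing direction. By the convention
// described above, the untransformed avatar faces down the negative
// Z axis, i.e., yaw 0 means facing (0, 0, -1).
class Facing {
    // Returns {x, z} of the facing vector for a yaw angle in radians,
    // obtained by applying the standard Y-axis rotation matrix to
    // the vector (0, 0, -1).
    static double[] facing(double yaw) {
        double x = -Math.sin(yaw);
        double z = -Math.cos(yaw);
        return new double[] { x, z };
    }
}
```

A yaw of zero yields (0, -1) in the XZ plane (straight down negative Z); a quarter turn yields (-1, 0). An AI, or any other process reading the world model, could use such a helper plus the avatar-class convention to answer "which way is that avatar facing?" without rendering anything.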
The buzzwords for Open Community are: portable, interconnected, scalable, multi-user, and interactive. These worlds are intended to run on any manufacturer's Open Community-compatible server. Server software developers can then choose to differentiate themselves by optimizing their offerings for speed, robustness, and additional proprietary features (such as latency mitigation technology, morphing, interpolation algorithms, etc.). Visual and audio rendering become just browser plug-ins running on top of Open Community. This means Open Community can be easily interfaced to any renderer. As rendering engine choice is usually an emotional subject, this flexibility is a real advantage. A content developer using Open Community can license a multiuser API and server from a third party and then write multiuser behaviors in any language -- Java being the most portable -- so the behaviors can run on any machine. The advantage of using the Open Community Java API and VRML prototype layer is that developers can create a world that operates with anyone else's server.
Karl Jacob, chief executive officer of Dimension X, comments: "The Open Community effort will go a long way to achieve multi-user capability and help encourage adoption of VRML."
While UA has proposed standardizing many of the features required in avatars, such as persistence, scale, and the design features of the interface used for socialization, Open Community has implemented these concepts already in the classes spAvatarinfo and spAvatar. This is one example of how closely the two groups are working. Chaco Communications' president, Dan Greening, supports the spirit of cooperation between the three standards groups: "The Open Community proposal provides the missing link to worldwide social computing. Its Java libraries were designed for seamless support of multi-user environments such as those proposed in Living Worlds."
"We look forward to participating in the next step in the evolution of VRML, and integrating Open Community capabilities with those of the Universal Avatars and Living Worlds proposals," Black Sun's Rockwell says. "We were delighted to discover how neatly the Open Community design work dovetails into LW's proposed VRML interfaces, and welcome MERL's generous commitment to support the ongoing evolution of Living Worlds and the VRML standard."
Virtual world jumpstart
Once avatar creation companies have become common, the prices of avatars can come down. Currently, Third Dimension is charging $49.95 to make a photorealistic avatar head. Lowering the price of avatars will make them more widely available and grow the market. Once there are avatars in cyberspace with spending capacities, and once they can be influenced by advertisements and virtual salespeople, then creating worlds to attract avatars can be expected to proliferate.
A final specification is needed for the networking protocols to be used between server and client to form what the Open Community people call Universal Cyberspace. The dream scenario is for a baseball player in a virtual world running on one server to be able to hit a ball out of the park and have it crash through the window of a storefront running on a different company's server.
But outside the dream is the commercial world and the mass market. When former Apple Computer Chairman John Sculley gave his analysis of the future of cyberspace at the Earth to Avatars conference recently, he said that once the technology is shown to work and standards are agreed, the big league players will move into cyberspace.
Advertising has not been shown to be effective on the Internet. If you look at what is available in the way of banners and little animations on the Web, and compare that to a regular television commercial, the lack of impact on the Web is profoundly noticeable. It will take expensive production techniques and lots of bandwidth to output the sort of quality ads on the Web that people are used to on TV. And it takes a revenue stream to support that investment. Only companies with multiple media assets can justify and afford multi-channel distribution and advertising. Sculley, who is now the president of Live Picture, predicts that the likes of Disney and Time-Warner will become the moguls of interactive 3D content provision on the Internet. And that content may well be free.
As avatars become members of self-organizing groups, Sculley sees them as "a driving force shaping the economics of this industry." The avatar standards movement is the next step to achieving cyberspace.
See you on the strip.
About the author
Sue Ki Wilcox is an author and specialist in Internet 3-D computer graphics, virtual reality, VRML tools, and other resources for worlds builders. Often found out and about as conference reporter for Web Techniques magazine, Sue is organizing the VRMLocity conference to be held in San Francisco in June 1997. Sue is co-author of the book EZ-GO: Oriental Strategy in a Nutshell. Reach her at firstname.lastname@example.org.
The following companies have expressed support for the Living Worlds initiative: 3Name3D, 3D Labs, Acuris, 3rd Dimension Technologies, Aereal Inc., Apple Computer, Archite X, Axial Systems, Barnegat Communications, Black Sun Interactive, Boxoffice.net, Chaco Communications, CyberPuppy, CyberTown, Extempo Systems, Fujitsu Laboratories LTD, First Virtual Holdings, GrR HomeNet, IBM, Integrated Data Systems, Intel Corporation, Intervista Software Inc, Media Authoring Center - George Mason University, Netcarta, Neuromedia Studios, OnLive Technologies, Oracle Corp., OZ Interactive, ParaGraph International, PeopleWorld, Planet 9 Studios, Sense 8, Silicon Graphics, Sony Corporation, Velocity, Vivid Studios, VREAM, VRMLSite, Worlds Inc.
The following definitions are particularly applicable to Living Worlds but will probably spread into more general usage.