There are a lot of things in Neal Stephenson’s The Diamond Age which I love. If I’m honest with myself, I hope to see mediatronic paper and animated digital chops, for example, become real in my lifetime. There are other aspects of the world created in that novel, for example the massive inequality in a post-scarcity society, which I hope we won’t see but fear we are already walking the path towards. At the core of the book, though, is one idea that some of my recent reading has prompted me to think about again.
The 2009 paper, Serious Games in Cultural Heritage, by Anderson et al., is a fun read, reporting on the state of the art at the time. There are some lovely lines which I’d like to take issue with. The authors, for example, hint at an opinion that a serious game doesn’t need to be fun. To which my reply is that if it’s not fun, then it’s all “serious” and not a “game,” even if it does make use of gaming technology. The authors cite two examples of virtual reconstructions of Roman life, Rome Reborn and Ancient Pompeii, which use gaming technology as a research tool: “[Rome Reborn] aims to develop a researchers’ toolkit for allowing archeologists to test past and current hypotheses surrounding architecture, crowd behavior, social interactions, topography, and urban planning and development.” More fun comes from the Virtual Egyptian Temple and The Ancient Olympic Games examples, which have playful or ludic elements in them, even if it’s only piecing pots back together or successfully answering quizzes set by what the paper calls a “pedagogical agent.” (Crikey! I’m returning to the Ludology vs Narratology debate again – on the side of the Ludologists!)
The paper also discusses the pedagogical value of some commercial games, which Burton calls “documentary games.” The most recent example of this genre brought to my attention is Call of Juarez: Gunslinger (with thanks to Chad at westernreboot). Of course, another feature of many modern commercial games that the paper highlights is the bundled content creation tools that allow you to create your own cultural heritage environment; indeed, the Virtual Egyptian Temple mentioned above was built with the Unreal Engine toolset.
There’s also a section on all the various “realities” that gaming technology has to offer, which I’ll return to when I finally get round to writing up Pine and Korn’s Infinite Possibility, and a section on the various gaming technologies (rendering effects, artificial intelligence and the like) which a cultural heritage modeler can use, all of which makes the paper a very good primer on the subject (and one I wish I’d found earlier).
What led me to that paper was looking deeper at one of the poster presentations I saw last week. I didn’t get a chance to talk to (I guess) Joao Neto, who was deep in a conversation I didn’t want to interrupt, so I did some Googling instead. Part of a team working to interpret Monserrate Palace in Sintra, Portugal, Joao and Maria Neto did some of the usual stuff: creating a 3D model from architectural drawings and laser scanning to show how the palace developed over time; an interactive application called The Lords of Monserrate, exploring the lives of the different owners of the palace over the centuries; and The Restoration, which appears to be a mobile app that recognizes the distinctive plasterwork in each room and interprets the restoration process there. But they also experimented with what they called Embodied Conversational Agents.
These are virtual historical characters, “equipped with the complete vital informational [sic] of a heritage site.” The idea was that the virtual character would capture the visitor’s interest with a non-interactive animated opening scene, in the manner of a cut-scene in a video game, but would then open up a real-time conversation that would immerse the visitor with realistic “face movements, full-body animations and complex human emotions.” The conversation would be more sophisticated than a simple question-and-answer system by being “context aware”: the knowledge base is broken up into modules, making genuinely interactive responses more achievable.
In order to achieve this ambition, we developed an Embodied Conversational Agent Framework – ECA Framework. This framework allows the creation, configuration and usage of virtual agents throughout various kinds of multimedia applications. Based on a spoken dialogue system, an Automatic Speech Recognition (ASR), Text-to-Speech (TTS) engines, a Language Interpretation, VHML Processing, Question & Answer and Behavior modules are used. These essential features have very different roles in the global virtual agent framework procedure, but they all work together to accomplish realistic facial and body animations, as well as complex behavior and disposition.
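Reading between the lines of that description, I imagine the “context aware” modular knowledge base working something like the sketch below. To be clear, this is my own toy illustration, not the team’s actual ECA Framework: all the class and module names are invented, and real speech recognition and synthesis are replaced by plain text.

```python
# Toy sketch of a context-aware, modular conversational agent.
# The module names echo the quoted description; the implementation is invented.

from dataclasses import dataclass


@dataclass
class KnowledgeModule:
    """One topic-specific slice of the knowledge base."""
    topic: str
    answers: dict  # keyword found in the question -> answer text


@dataclass
class ConversationalAgent:
    modules: list
    context: str = ""  # last topic discussed, used to bias module choice

    def interpret(self, utterance: str) -> str:
        # Stand-in for the Language Interpretation module (and for ASR):
        # just normalize the text.
        return utterance.lower().strip("?! .")

    def answer(self, utterance: str) -> str:
        text = self.interpret(utterance)
        # "Context aware": try the module matching the current topic first,
        # then fall back to the others.
        ordered = sorted(self.modules, key=lambda m: m.topic != self.context)
        for module in ordered:
            for keyword, reply in module.answers.items():
                if keyword in text:
                    self.context = module.topic  # remember where we are
                    return reply
        return "I'm not sure -- could you ask about the palace or its owners?"


agent = ConversationalAgent(modules=[
    KnowledgeModule("architecture", {
        "plasterwork": "Each room's distinctive plasterwork was conserved during the restoration.",
    }),
    KnowledgeModule("owners", {
        "owners": "The palace passed through several owners over the centuries.",
    }),
])
print(agent.answer("Tell me about the plasterwork"))
```

In a full system the returned text would feed the TTS and behavior modules to drive speech and facial animation; keeping each topic in its own module is what lets the agent stay coherent across follow-up questions.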
Which all sounds like an amazing feat, even if the end result is (and I’m sure it must be) a little bit clunky. I’d love to see it in action. But what does this have to do with Neal Stephenson and The Diamond Age? Well, the subtitle of that book and the McGuffin (though plot-wise, it’s much more than a McGuffin) is A Young Lady’s Illustrated Primer. In the story, A Young Lady’s Illustrated Primer is an interactive book, a pedagogic tool commissioned by a very wealthy nobleman to ensure that his daughter’s educational development is superior to her peers’. Many of the characters that the reader meets in the Primer are sophisticated virtual agents like those described by Neto and Neto. But some are voiced by a “ractor,” an interactive actor whose voice, expressions and movements are transmitted live to become the voice, expressions and movements of the character in the Primer. One of the characters in Stephenson’s novel makes her living as a ractor, playing characters like Kate “in the ractive version of Taming of the Shrew (which was a butcherous kludge, but popular with a certain sort of male user),” and, to “fill in the blanks when things got slow, she also had standing bids, under another name, for easier work: mostly narration jobs, plus anything having to do with children’s media.”
I used to be a “ractor” of sorts, as a costumed interpreter at all sorts of historic sites. I’m proud that my colleagues and I offered one of the most interactive and immersive of all the interpretation media available. But having professional people on site is expensive, and not all volunteers have the skills, confidence or desire to take on historical roles. So I’m wondering if another approach to Neto and Neto’s Embodied Conversational Agents is now technically possible.
Could a virtual character be remotely controlled in real time by a human “ractor”? And could that ractor fill their working day becoming different characters (even at different cultural heritage sites) as and when required? The relatively small audience for cultural heritage, after all, makes a live ractor experiment a more realistic possibility than it would be for a popular commercial video game.
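For what it’s worth, the plumbing for such a live link doesn’t seem exotic. Here’s a toy sketch of the idea: the ractor’s rig emits timestamped performance events (speech, expression, gesture) that an on-site avatar replays in order. Everything here is hypothetical; a real system would stream audio and motion-capture data over a low-latency transport such as WebRTC, and the in-process queue merely stands in for the network.

```python
# Toy sketch of a live "ractor" link: the performer's capture rig emits
# timestamped events and the on-site virtual character applies them.
# All names are hypothetical; the queue stands in for a network transport.

import json
from queue import Queue


def encode_event(kind: str, payload: dict, t_ms: int) -> str:
    """Serialize one performance event (speech, expression, gesture)."""
    return json.dumps({"t": t_ms, "kind": kind, "payload": payload})


class VirtualCharacter:
    """On-site avatar that replays the ractor's events in arrival order."""

    def __init__(self, name: str):
        self.name = name
        self.log = []  # applied events, for inspection

    def apply(self, raw: str) -> None:
        event = json.loads(raw)
        # A real avatar would drive TTS, lip-sync and skeletal animation here.
        self.log.append((event["t"], event["kind"], event["payload"]))


# Simulate the live link with an in-process queue.
link = Queue()
link.put(encode_event("speech", {"text": "Welcome to the palace."}, 0))
link.put(encode_event("expression", {"face": "smile"}, 120))

avatar = VirtualCharacter("guide")
while not link.empty():
    avatar.apply(link.get())

print(avatar.log)
```

Because each event carries its own timestamp, the avatar could buffer and smooth playback to hide network jitter, which would matter for keeping lip-sync and gesture convincing to a visitor standing in the room.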
I REALLY want to try this out. Who wants to help me?