Junko Mizuno

In the summer of 1973, Junko Mizuno was born in Tokyo. Mizuno-san actually began her career as a professional artist relatively recently–in only a few years, though, she has gained a lot of recognition. Her artwork has been displayed in a wide spectrum of places–clothing, stickers, bags, CD covers, magazine graphics, illustration books, serialized manga, and tankobons. Her earlier comics, Dream Tower and Momongo no Isshou, arrived in 1998. While she was publishing her work in the rock magazine H., the record label Avex Trax approached Junko and asked if she would create jacket art for its CD sets. What she ended up doing was an entire manga series–Pure Trance. This was also in 1998–but this was Junko’s true break into the manga scene. After being released in parts with Avex Trax’s CDs, a slightly reworked version of Pure Trance became Mizuno-san’s first manga published in its entirety.

After Pure Trance, Mizuno-san’s unique, attractive style was recognized… and she’s been drawing up a storm ever since. Junko published Cinderalla in Japan in 2000; it’s mos def among the most popular of everything she’s done (the English version was released in July of 2002… and I still don’t have a copy). In the same fairy-tale vein as Cinderalla are Mizuno-san’s Hansel and Gretel (2000) and Princess Mermaid (2001) (both of which I’m having trouble finding any info on). In 2000, Momongo no Isshou reappeared in Secret Comics Japan (a two-part book: 001 includes JM as one of the hidden treasures of Japan’s undiscovered manga artists, and 002 is exclusively JM). Mizuno-san has also compiled an illustration book titled Hell Babies (2001), illustrated for the “Vulgarity Drifting Diary” column in PULP, and has been re-published in the English magazine Tokion.

As for Junko’s art itself… what can I say that hasn’t already been futilely attempted by critics in reviews? Okay, you guys can easily click on ‘Gallery’ above and check out her stuff on your own, without me babbling on trying to capture her unique style in words. Here, though, is a comment Junko herself made about her work–

As for the fact that many of her characters and girls are drawn attractively: “… in our minds, without a second thought, being pretty and being strong are tied together.” –Junko Mizuno. This quote speaks to manga in general, which is so saturated with heroines who also happen to have the sex appeal factor set to maximum… it partially explains why that is, and why those characters are still just as appealing to the chicks as they are to the dudes.

One more thing–I think the most common way of describing Mizuno-san’s work is to comment on her coupling of the cute with the grotesque–and nothing gives you a better idea of that than the fact that her favorite musicians are the oh-so-wildly-cute Spice Girls and the oh-so-wonderfully-grotesque Rob Zombie… talk about contrast. (Okay, so on the form it says, “Favorite Musicians: Rob Zombie, Spice Girls, etc.” …what the hell is hiding in that ‘et cetera’ after Rob Zombie and the Spice Girls?? ^_^;;)

Narcissus and Necessity:

Why Are We Creating Virtual Realities?

During the 1890s and early 1900s, a change took place in America and Europe that would have profound consequences for popular culture. We can mark its starting point as 1894, when Thomas Edison marketed a viewing device called the Kinetoscope, which allowed one person at a time to look at moving pictures on a loop of film. The next year, two French brothers, Auguste and Louis Lumière, gave the first commercial demonstration of moving pictures projected onto a large screen. With their invention, a new form of entertainment that came to be known as the movies was born.

As one would expect, the movies quickly became a source of public fascination. Suddenly, there was a technology that could capture the appearance of events in images. Those images could then be replayed so it seemed that they were being repeated exactly as they occurred, in a simulated three-dimensional space displayed on a screen. As audiences watched these moving “replicas” of reality, they felt as if they were seeing something close to magic, in which they could look in on other times and places, and escape the limits of everyday experience.

But the production of movies quickly went beyond the mere filming of events. As it evolved, the movies took a form they still have today. First, film images were created of costumed actors performing on realistic stage sets and in genuine settings. The images were then edited so that the order in which the performances were filmed was rearranged into a fictional sequence of events. In essence, movie directors were doing what the designers of rain forest exhibits do: they were seamlessly weaving together all kinds of elements, some authentic, some fabricated, to create a composite, a sensory simulation that told a story. As audiences sat in a darkened room, watching these stories unfold on the screen, they experienced a sensory and psychological immersion in a simulated world.

To some in the industry, however, it was obvious that movies could be made more immersive. After all, if one could create a replica of reality by displaying images on a screen, then one could also make it seem that members of the audience were inside the world of images by surrounding them with a number of screens or bringing them right up to one screen. Or one could make it seem that the simulation had come into the theater, by giving the images a three-dimensional appearance or by placing props and sets around the audience that continued the movie’s theme.

The history of the entertainment industry in the last century is partly the story of efforts to turn the movies into such an immersive environment, from semi-circular screens that filled much of the audience’s field of vision to techniques for bringing the movie to the audience, such as AromaRama and the earthquake-imitating vibrations of Sensurround. As the last chapter describes, Disney was created out of this same desire to place audiences in a world of fantasy modeled after the movies.

Today, almost a century after people began dreaming of this possibility, we are beginning to accomplish it with a new generation of immersive theaters. Like rain forest exhibits and theme parks, immersive theaters are beginning to appear around America and, to a lesser extent, elsewhere. The technology that makes them possible is computers, which have also made it possible to orchestrate many of the elements of artificial rain forests.

An example can be found at the Sony Imax Theater in Manhattan, where audiences wearing headsets with liquid crystal lenses stare at an 80-foot-high, 100-foot-wide screen and see three-dimensional images that appear to float directly in front of them. While watching the underwater film Into the Deep, they seem to be swimming through an environment of kelp beds while fish navigate around them. The effect is like being inside a movie, one that seems to occupy the same space as the theater.

Another example can be seen in Poitiers, France, where audiences in the world’s only Magic Carpet Theater look at a giant screen in front of them and another screen below a transparent floor. When they look down at the floor screen, they see images of land in the distance as it might look from an airplane, and experience the illusion they are flying. At one point, the image of a blimp comes toward them on the screen in front of the theater and then reappears on the floor screen so it seems to fly beneath them, evoking a startled reaction from members of the audience, whose senses tell them they have just avoided a mid-air collision.

But the most impressive immersive theater to date is Back to the Future…The Ride at Universal Studios Hollywood and Florida. As audiences wait in line for the attraction, they are told that a character from the movie, Biff Tannen, has stolen a time machine converted from a DeLorean and traveled to the past, where he will try to alter the natural unfolding of time. They are implored to go after him and save the world as they know it from being changed.

The audience is then loaded into 24 seating platforms disguised as time-traveling DeLoreans. As a mist comes out from the dashboards, the DeLoreans are lifted into one of two 13-story-high domed theaters in which the ceilings are massive, wrap-around movie screens. With eight people in each DeLorean and twelve DeLoreans suspended in each dome, audiences find themselves in an environment of larger-than-life images as they are taken for a ride through time, space and the story, in mock pursuit of the villain.

As the images unfold, audiences seem to fly through the world of the year 2015, careening through the streets and alleys of the town of Hill Valley and crashing through its clock tower, causing the gears and parts to fall away. Having achieved this symbolic destruction of time, they go on a journey through the ice age and the age of dinosaurs. Finally, while plunging on a lava flow to their deaths, they bump into Biff Tannen’s DeLorean, causing him and themselves to time-warp back to the present, where he will be unable to interfere with the natural unfolding of time.

In reality, the audience is inside another kind of themed environment, in which realistic models of a dinosaur and various fantastic landscapes have been converted into oversized images that seem to engulf the theater. As these images change size and position, the DeLorean seating platforms move in tandem, horizontally, vertically and diagonally (while remaining in the same place), to create the illusion for the audience that it is traveling through the space displayed on the screens.
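How such a platform stays synchronized with the film lends itself to a short sketch. The Python fragment below is a minimal illustration of the general technique of slaving a motion base to per-frame cues authored alongside the footage; the cue values, scale factors and travel limits are invented for illustration and are not Universal’s actual control logic.

```python
# Minimal sketch: a motion platform mirrors the motion implied by each film
# frame, tilting and heaving in place so the body agrees with the eyes.
# All values here are invented for illustration.

from dataclasses import dataclass

@dataclass
class MotionCue:
    pitch: float  # degrees, + = nose up
    roll: float   # degrees, + = right side down
    heave: float  # meters, + = upward

# Hypothetical per-frame cues authored alongside the film.
cue_track = [MotionCue(0.0, 0.0, 0.0), MotionCue(5.0, -2.0, 0.1), MotionCue(12.0, -6.0, 0.3)]

PITCH_LIMIT, ROLL_LIMIT, HEAVE_LIMIT = 10.0, 5.0, 0.2  # the platform's physical envelope

def clamp(value: float, limit: float) -> float:
    return max(-limit, min(limit, value))

def platform_pose(frame_index: int) -> MotionCue:
    """Scale and clamp an authored cue into the platform's small travel range.
    The platform never actually goes anywhere; it only suggests motion."""
    cue = cue_track[frame_index]
    return MotionCue(
        pitch=clamp(cue.pitch * 0.8, PITCH_LIMIT),
        roll=clamp(cue.roll * 0.8, ROLL_LIMIT),
        heave=clamp(cue.heave * 0.5, HEAVE_LIMIT),
    )

for i in range(len(cue_track)):
    print(f"frame {i}: {platform_pose(i)}")
```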

For the audience, the experience is like being inside a giant virtual reality headset. It finds itself in something approaching a pure simulation in which the sequence of events, the surrounding environment, the sense of forward movement and the participation in a story are tricks made possible by art and technology. The end result is another one of Umberto Eco’s “absolute fakes,” which are intended to be better than what they imitate. But what is faked, and improved on, is physical reality, in a way that makes it seem to audiences that they are transcending the limits of everyday life.

Drew Zelman, a spokesman for Ridefilm Corp., a subsidiary of Imax, which created the theaters described above, says Back to the Future “gives you the feeling you have left the world as you know it and entered somewhere completely different. It’s where people wish virtual reality was.” It offers “an altered state” that is safe and “drug free.”

Although Zelman obviously didn’t intend it this way, his claim that these experiences are a kind of altered state without drugs is suggestive. Like drugs, the technology offers intense peaks that often leave audiences hungering for more — for a reality that is more exciting, more interesting, brighter and more perfect than anything else afforded by life. And like drugs, it offers an essentially passive experience in which people sit back and experience the special effects.

All of the immersive theaters described here — along with virtual realities — place us in a lifelike representation of the three-dimensional world, which is modeled after our desires. As Freud might put it, we have constructed these technologies by using the powers of the ego — of rationality, science and technology — to build a universe of simulation governed by freedom from constraint, where the imagination is in control.

These technologies are a place where human narcissism meets metaphysics; where the inflated self, unable to reconcile itself to the world as it is, creates imitation worlds that are better suited to its desires. Marx said philosophers had only interpreted the world; the point was to change it. With immersive theaters, we take a shortcut and produce a new and improved facsimile, instead, where the adventures, the visual spectacles and the happy endings of Hollywood seem to happen to the audience.

But audiences aren’t only reacting with fascination to these new technologies. As we get better at using images to simulate physical reality, a new set of fears is emerging that these images could become so lifelike, they will interfere with our relationship to reality. We fear that image-based simulations will cut us off from the world or be confused for the world or that they will become so alluring, they will become sources of addiction in which people will choose to interact with images in place of their true surroundings.

These fears take their most extreme form in a set of “actualization fantasies” that can frequently be found in science fiction, in which image simulations are portrayed as becoming so realistic they become real, at times overthrowing reality. Less common are “deactualization fantasies” in which people are portrayed as falling down the rabbit hole, as it were, and becoming lost in worlds of simulation.

Actualization fantasies, in particular, are now a staple of science fiction. Thus, a movie character comes to life in Last Action Hero and discovers that the rules are different in reality; a 3-D image briefly achieves independence and runs amok in the hero’s apartment in the novel The Futurological Congress; a hologram of the fictional character Moriarty, created inside a simulation room, becomes sentient in an episode of Star Trek: The Next Generation and fights to figure out both where it is and how it can escape into the larger reality; and in the movie Weird Science, the image of the perfect woman created on a computer screen by two teenage males comes to life, leading the two to discover that they prefer real women to simulations of women modeled after their adolescent fantasies.

The archetypal work portraying the idea that simulation might become real is “The Veldt,”* a short story written by Ray Bradbury some four and a half decades ago. It shows us a family named the Hadleys, living in the ultimate “Happy-life Home” of the future, “which clothed and fed and rocked them to sleep and played and sang and was good to them.” As in many such stories, the house also daydreams for its inhabitants, providing a synthetic Never-Never Land for the children, Peter and Wendy, in the form of a simulation room with wall-to-wall screens that are able to bring any thought to life in realistic images.

But the parents become concerned when they hear the children repeating the same story over and over that involves an African veldt, lions and strangely familiar screams. Fearing that the simulation room is exercising an unhealthy influence, the father shuts down not only the room but the entire house.

The children, desperate to save their world of comfort and fantasy, lock their parents in the simulation room. Suddenly, the realism and immersion that make these technologies so alluring take on the aspect of a trap as the parents find themselves facing an African veldt that looks a little too real.

As the story moves toward its inevitable conclusion, the supposedly simulated lions come toward the parents.

“Mr. Hadley looked at his wife and they turned and looked back at the beasts edging slowly forward, crouching, tails stiff.

“Mr. and Mrs. Hadley screamed.

“And suddenly they realized why those other screams had sounded familiar.”

In “The Veldt,” we see the ultimate fantasy that has become attached to simulation, in a high-tech and somewhat regressed version of the Oedipus complex — a simulation complex. Here, simulation becomes real and overthrows reality by eliminating demanding parents, so it can install a world of fantasy in which children are in control. The pleasure principle triumphs over the reality principle and narcissism now governs in place of the world of necessity.

In “The Veldt,” television exacts its final revenge against parents who nag their kids to shut off the TV, and Hollywood wins its final battle with the nonfiction world. An overindulgent technology in the form of the ultimate automated house, which looks a lot like a miniature Disney World, will now generate a new reality for its dependent and addicted audience.

Today, as immersive theaters, along with virtual realities, personal computers and the more traditional movies and television, increase their hold on the culture, “The Veldt” is beginning to look like a fictionalized description of what is actually happening, namely that images — and simulations in general — are generating much of our reality. We increasingly find ourselves not only in the three-dimensional space of the physical world (which is full of simulated objects), but surrounded by the simulated spaces displayed on screens, which are windows to all kinds of real and impossible worlds. As these simulated spaces begin to look a little too lifelike for comfort and promise to make every thought and desire seem to come true, they are unsettling our relationship with the larger world and offering us the allure of addiction and regression.

———-

* Ray Bradbury, “The Veldt,” in The Illustrated Man (Garden City: Doubleday, 1951).

Addendum: As we develop the ability to move between an accurate perception of our environment and lifelike imitation environments, there have been a number of predictable reactions. For example, there are increasing references in science fiction and popular culture to the idea that we may lose the ability not merely to distinguish simulations from actual objects, but to be certain whether the entire world we find ourselves in is real or a simulation. Thus, science fiction frequently portrays characters in a state of total confusion, not only lost in worlds of simulation, trying to get back out again, but also trying to figure out whether their world is real.

In the movie Total Recall, for example, the character played by Arnold Schwarzenegger holds a gun on another character who tells him that, although he thinks they are both standing there, he is really experiencing a hallucination caused by the injection of simulated memories into his brain. Schwarzenegger stands holding the gun, his world tottering, uncertain, until he sees a single bead of sweat on the other character’s face, exposing his nervousness, and concludes this must be reality, since, presumably, a hallucination wouldn’t be so worried about being plugged with holes.

Faustian Society

When we examine the characteristics of contemporary societies that have been described in previous pages, we find that much of what defines them comes down to a few essential ideas. These societies are part of a new civilization and a new period in history, which can be referred to as Faustian (with apologies to Spengler) because of its quest for power.

At its core, this new Faustian age and civilization believes in the self and the self’s right and ability to control the conditions of its own existence. It exalts reason, but it is practical or “instrumental” reason, which is seen as a tool that humanity can use to manipulate the world.

Faustian society includes at least four elements that define the individual’s changing relationship to the world of limitation:

* It uses science and technology to overcome the limits of the physical world.

* It brings together high technology and art to create simulations that can be used as substitutes for what can’t be extracted from the physical world. The most important of these simulations are imitation realities, which provide people with experiences not available in the rest of life.

* It adheres to an aesthetic philosophy that sees the acting out of fantasies expressing our fears and desires as a form of art, entertainment and liberation.

* It views matter, life, culture and mind as deceptive appearances, which makes them simulations or something similar to simulations.

In addition, Faustian societies are characterized by the pervasive use of deceptive simulations to manipulate large numbers of people.

Put in terms first used on an earlier page, Faustian society is using the powers of rationality and the ego — of logic, science and technology — to build a perfect world that answers to our desires. The goal is to create a new kind of person: a sovereign self, in control of its environment, including its own biology and mind.

In order to achieve this goal, it is trying to make the world as transparent as possible, so everything can be seen and understood. It wants to hold all existence up to an x-ray, because what is known can be controlled. This effort to bring about transparency and control, or knowledge and power, in the service of real human needs and boundless human desires, summarizes much of what Faustian society is about.

We can see intimations of the world as Faustian society would re-create it in today’s simulated and automated environments and in some of the images of life conveyed by television and movies. These are early efforts to build and portray perfect worlds and perfect selves in which nothing is left to chance.

But we can also see in some of these same simulations portrayals of the dangers that Faustian society poses to our relation to the world of limitation, including the danger that we might lose interest in “reality,” devalue it, undermine it, or lose the ability to distinguish reality from illusion.

Faustian society is already the dominant force in the contemporary world. Its power centers are the high-technology urban areas of America, particularly in the Northeast and West Coast, and the urban centers of Western Europe and Japan. The world’s business, scientific and cultural “elites,” many of whom live in these regions, are its creators, administrators and exemplars. They have enormous power to shape its culture in the near term. When seen from a broader perspective, they begin to look like the vehicles of humanity’s desire to bring about a perfect world, as they produce forms of technology and representation that answer to their audience’s needs and desires.

But many groups and regions haven’t made the transition to this new kind of society; others are in opposition to it. In particular, what remains of the world’s religious traditionalists has been engaged in a reaction against many (although not all) of these changes. Despite their considerable differences with each other, all view the world as the work of a creator, who imposed not only material conditions on existence but a moral code that limits thought and action and subordinates creature to creator. All see themselves as defending religion, traditional culture and morality against the secularism, the moral and cultural relativism, and the philosophy of the self, fantasy, pleasure, transgression and cultural experimentation of Faustian society.

The creations of popular entertainment have foreseen the emergence of Faustian societies. They routinely portray humanity gaining power over both the material world and worlds of illusion. And they frequently examine the potential for good and evil in these new powers.

The movie Matinee, for example, which is about Key West during the Cuban missile crisis, portrays all the elements described above: the effort to use technology in the quest for power; the bringing together of art and technology to create advanced simulations that allow audiences to act out fantasies in which they overcome dangers; the use of simulations to deceive; and the social-constructionist effort to expose culture and ideology as illusions that are used to manipulate and mystify people. The movie’s creators put all this in because they are tapping into the issues and anxieties of the age.

Some of these works of popular culture, especially science fiction, also provide information about the kind of ethic that can help us live in this new kind of society. Much of their message comes down to this: the new Faustian societies provide a great many opportunities. But they also create dead ends and traps that are disguised as forms of progress. If we want to use technology correctly, and not be destroyed by it, our wisdom will have to keep up with our power. This theme is particularly evident in the original Star Trek, which repeatedly warns that there is a danger in trying to find shortcuts to power or trying to achieve false paradise.

A similar philosophy can be teased out of Freudian theory. Using psychoanalytic ideas, we can see that, in addition to the limits of the physical world, there is another set of limits we have to confront, namely the limits of our own personalities: our narcissism, primitive emotions and lack of ethical development. These are the most important obstacles we have to overcome, if we want to appropriately use these new technologies.

There are two philosophies in particular, emerging out of psychoanalytic theory, that we can use to construct a coherent philosophy for a new age. One comes out of the work of the utopian philosopher Herbert Marcuse, who asked how the new power and affluence made possible by technology would be used. As Marcuse saw it, they could be used to satisfy true or false needs, making possible exploitation and escape, or a breakthrough into a new kind of society based on the right to live fully.

But Marcuse’s ideas of what it means to live fully are somewhat limited. They can be enlarged by the ideas of the humanistic psychologist Abraham Maslow, who believed that we can develop into our true selves, which are inherently ethical, psychologically healthy and constructive. Like Marcuse, Maslow believed we can judge society by the degree to which it helps us grow in this direction.

If you put the ideas of Freud, Maslow and Marcuse together, they lead to the conclusion that the true self has existed throughout history, “waiting” to be released from the prison created by our primitive psychodynamics, distorted cultures and oppressive social conditions. This true self isn’t an entity inside the mind that is hidden by the mask of a false self. It is the full person we become when these numerous interferences are eliminated. It is willing to speak, hear and seek the truth about itself and society, without fleeing into the regressive symptoms offered by personal neurosis and popular culture. It is assertive, not aggressive; focused on living fully rather than on shoring up constantly collapsing psychological defenses; able to love and work, and to take pleasure and responsibility. It affirms life and compassion over hate and revenge.

When people experience this state of higher functioning, to one degree or another, in their better moments or, in some instances, throughout much of their lives, we see in them an essential characteristic: they not only feel good about themselves and life, they also spontaneously do good. They have an inherent, aesthetic revulsion to anything that would do physical or symbolic violence to themselves or others.

To the extent we are our true selves, we have a deep revulsion to the culture of manipulation, and everything in us tells us not to become one of its practitioners. And we see the regressive radicalism offered by much of popular culture as a lure. All those forbidden fantasies are forms of regression that lead us away from our true selves.

Like language, the potential to become our selves is inherent in us, but it has to be evoked by culture to come to even partial realization. It is always there, as a part of our makeup, mixed in with, and limited by, other elements of personality and culture.

Contrary to what some postmodernists claim, this self isn’t a collection of roles or a story under constant revision, although, as we have seen, personality and culture do contain a significant degree of disguise. Instead, it is a single entity, with capacities when it comes to language, thinking, emotion, psychodynamics, morality and personal fulfillment that are individual instances of a universal human nature. Since we all partake of this human nature, we all share the same capacities for good and ill, health and neurosis, no matter what “roles” we play or how technology expands our powers.

With this in mind, we can now ask a set of questions that represent one of the essential issues of our age: which aspect of the self will be evoked by the cultures and societies of the 21st century, with their artificial environments, pervasive computers, technologies of power and virtual realities? Which desires will be served by our new abilities: our primitive urges or our aspirations for true fulfillment? Will these powers serve the goal of freedom or will our ability to overcome many of the limits of the world allow us to turn the world itself into a vast arena for acting out the limitations of personality?

Seen in this light, the issue of the age (and every age) is whether we will use our powers to encourage the development of true or false selves, and seek after a true or false idea of a better world. These are the issues that are being acted out in all the spectacles of art and technology described in the book. In Disney, in advertisements, in the mind-numbing simulation-work of politicians, and in innumerable other creations of culture, we see a quest after misleading images of the perfect self and of false paradise. Other works, such as apparently modest comedies like Groundhog Day and Uncle Buck, about people who overcome fear and anger and some of their false desires to become something they already were, are efforts to get at the truth of the self.

We can build strong and healthy people without deep and profound knowledge of the truths of personality, society and culture, of course. But the insight of modernism and Western civilization, that truth liberates, is still essential. It tells us that we have the ability to see through the psychodynamics that partly govern everyone, and discover that many (not all) of the limits of life are, in fact, our own invention; they are defenses we create in response to deeply buried fears of real and imagined dangers from childhood, which are not-so-seamlessly woven into personality. And we have the capacity to discover in the environment of technology, art and simulation that makes up contemporary culture, the story of our selves and society, which we are constantly writing in disguised form. A great many of these creations — individual movies, for example, or theme parks — are like monads: if we could make them entirely transparent, we would discover that each one contains a large part of the entire story.

If the reader will forgive a final hyperbole, mostly stolen from Northrop Frye’s discussion of James Joyce (with Hegel looming somewhere ominously in the background): only after humanity has seen through the illusions of the self and society can it wake up from the dream of history and recognize its own role as the dreamer. Then society and culture appear as objectified and alienated parts of ourselves. The new self that has the potential to be born from this exercise in clarity can consciously create a society and culture that expresses the fullness of life, rather than the yearning for, and flight from, itself.

We can begin this effort at self-knowledge by recognizing that we are ascending a ladder of invention and discovery that has always been there, waiting to be climbed. This ladder of progress is built into the universe. It is an element of the world, of which we are only a part.

The ladder has two arms. One is made up of our growing power to use science and technology to control the physical world. The other is made up of our ability to grow as people. We will need both if we want to make our great ascent.

Speed and Information: Cyberspace Alarm!

Paul Virilio

The twin phenomena of immediacy and instantaneity presently constitute one of the most pressing problems confronting political and military strategists alike. Real time now prevails over both real space and the geosphere. The primacy of real time, of immediacy, over space and surface is a fait accompli and has inaugural value: it ushers in a new epoch. This is nicely conjured up in a (French) advertisement praising cellular phones with the words: “Planet Earth has never been this small”. It is a very dramatic moment in our relation with the world and for our vision of the world.

Three physical barriers are given: sound, heat, and light. The first two have already been felled: the sound barrier has been broken by supersonic and hypersonic aircraft, while the heat barrier has been penetrated by rockets carrying human beings beyond Earth’s orbit to land them on the moon. But the third barrier, that of light, is not something one can cross: you crash into it. It is precisely this barrier of time which confronts history in the present day. To have reached the light barrier, to have reached the speed of light, is a historical event which throws history into disarray and jumbles up the relation of the living being to the world. The polity that does not make this explicit misinforms and cheats its citizenry. We have to acknowledge here a major shift which affects geopolitics and geostrategy, but of course also democracy, since the latter is so dependent upon a concrete place, the “city”.

The big event looming over the 21st century in connection with this absolute speed is the invention of a perspective of real time that will supersede the perspective of real space, which was in its turn invented by Italian artists in the Quattrocento. It has still not been emphasized enough how profoundly the city, politics, war, and the economy of the medieval world were revolutionized by the invention of perspective.

Cyberspace is a new form of perspective. It does not coincide with the audio-visual perspective which we already know. It is a fully new perspective, free of any previous reference: it is a tactile perspective. To see at a distance, to hear at a distance: that was the essence of the audio-visual perspective of old. But to reach at a distance, to feel at a distance, that amounts to shifting the perspective towards a domain it did not yet encompass: that of contact, of contact-at-a-distance: tele-contact.
A Fundamental Loss of Orientation

Together with the build-up of information superhighways, we are facing a new phenomenon: loss of orientation. This fundamental loss of orientation complements and concludes the societal liberalization and the deregulation of financial markets, whose nefarious effects are well known. A duplication of sensible reality, into reality and virtuality, is in the making. A stereo-reality of sorts threatens. A total loss of the bearings of the individual looms large. To exist is to exist in situ, here and now, hic et nunc. This is precisely what is being threatened by cyberspace and instantaneous, globalized information flows.

What lies ahead is a disturbance in the perception of what reality is; it is a shock, a mental concussion. And this outcome ought to interest us. Why? Because never has any progress in a technique been achieved without addressing its specific negative aspects. The specific negative aspect of these information superhighways is precisely this loss of orientation regarding alterity (the other), this disturbance in the relationship with the other and with the world. It is obvious that this loss of orientation, this non-situation, is going to usher in a deep crisis which will affect society and hence democracy.

The dictatorship of speed at the limit will increasingly clash with representative democracy. When some essayists address us in terms of “cyber-democracy” or virtual democracy, and when others state that “opinion democracy” is going to replace “political parties democracy”, one cannot fail to see in this the loss of orientation in matters political, of which the March 1994 “media-coup” by Mr. Silvio Berlusconi was an Italian-style prefiguration. The advent of an age in which viewer counts and opinion polls reign supreme will necessarily be advanced by this type of technology.

The very word “globalization” is a fake. There is no such thing as globalization, there is only virtualization. What is being effectively globalized by instantaneity is time. Everything now happens within the perspective of real time: henceforth we are deemed to live in a “one-time-system” [1].

For the first time, history is going to unfold within a one-time-system: global time. Up to now, history has taken place within local times, local frames, regions and nations. But now, in a certain way, globalization and virtualization are inaugurating a global time that prefigures a new form of tyranny. If history is so rich, it is because it was local, thanks to the existence of spatially bounded times which overrode something that up to now occurred only in astronomy: universal time. But in the very near future, our history will happen in universal time, itself the outcome of instantaneity – and there only.

Thus we see on one side real time superseding real space, a phenomenon that makes both distances and surfaces irrelevant in favor of the time-span, and an extremely short time-span at that. And on the other hand, we have global time, belonging to multimedia and cyberspace, increasingly dominating the local time-frame of our cities and neighborhoods. So much so that there is talk of replacing the term “global” with “glocal”, a concatenation of the words local and global. This emerges from the idea that the local has, by definition, become global, and the global, local. Such a deconstruction of the relationship with the world is not without consequences for the relationship among citizens themselves.

Nothing is ever obtained without a loss of something else. What will be gained from electronic information and electronic communication will necessarily result in a loss somewhere else. If we are not aware of this loss and do not account for it, our gain will be of no value. This is the lesson to be had from the earlier development of transport technologies. High-velocity railway service became possible only because 19th-century engineers invented the block system, that is, a method of regulating traffic so that trains can be sped up without risk of railway catastrophes [2]. But so far, traffic-control engineering on the information (super)highways is conspicuous by its absence.

There is something else of great importance here: no information exists without dis-information. And now a new type of dis-information is raising its head, and it is totally different from voluntary censorship. It has to do with a kind of choking of the senses, a loss of control over reason of sorts. Here lies a new and major risk for humanity stemming from multimedia and computers.

Albert Einstein had in fact already prophesied as much in the 1950s, when talking about “the second bomb”: the electronic bomb, after the atomic one. A bomb whereby real-time interaction would be to information what radioactivity is to energy. The disintegration will then affect not merely the particles of matter, but also the very people of which our societies consist. This is precisely what can be seen at work with mass unemployment, wired jobs, and the rash of delocalizations of enterprises.

One may surmise that, just as the emergence of the atomic bomb very quickly made the elaboration of a policy of military dissuasion imperative in order to avoid a nuclear catastrophe, the information bomb will also require a new form of dissuasion adapted to the 21st century: a societal form of dissuasion to counter the damage caused by the explosion of unlimited information. This will be the great accident of the future, the one that comes after the succession of accidents specific to the industrial age (as ships, trains, planes and nuclear power plants were invented, shipwrecks, derailments, plane crashes and the meltdown at Chernobyl were invented at the same time too…).

After the globalization of telecommunications, one should expect a generalized kind of accident, a never-seen-before accident. It would be just as astonishing as global time is, this never-seen-before kind of time. A generalized accident would be something like what Epicurus called “the accident of accidents” [and Saddam Hussein surely would call the “mother of all accidents” -trans.]. The stock-market collapse is merely a slight prefiguration of it. Nobody has seen this generalized accident yet. But then watch out as you hear talk about the “financial bubble” in the economy: a very significant metaphor is used here, and it conjures up visions of some kind of cloud, reminding us of other clouds just as frightening as those of Chernobyl…

When one raises the question of the risks of accidents on the information (super)highways, the point is not the information in itself; the point is the absolute velocity of electronic data. The problem here is interactivity. Computer science is not the problem; computer communication is, or rather the (not yet fully known) potential of computer communication. In the United States, the Pentagon, the very originator of the Internet, is even talking in terms of a “revolution in the military” along with a “war of knowledge”, which might supersede the war of movement in the same way that the latter superseded the war of siege, of which Sarajevo is such a tragic and outdated reminder.

Upon leaving the White House in 1961, Dwight Eisenhower dubbed the military-industrial complex “a threat to democracy”. He certainly knew what he was talking about, since he had helped build it up in the first place. But come 1995, at the very moment that a military-informational complex is taking shape, with some American political leaders, most prominently Ross Perot and Newt Gingrich, talking about “virtual democracy” [3] in a spirit reminiscent of fundamentalist mysticism, how can one not feel alarmed? How can one not see the outlines of cybernetics turned into a social policy?
The Narco-Capitalism of the Wired World

The suggestive power of virtual technologies is without parallel. Next to the narco-capitalism of illicit drugs which is currently destabilizing the world economy, a computer-communication narco-economy is building up fast. The question may even be raised whether the developed countries are not pushing ahead with virtual technologies in order to turn the tables on the under-developed countries which, in Latin America especially, are living off, or rather barely scraping by on, the production of illicit chemical drugs. When one observes how much research effort in advanced technologies has been channeled into the field of amusement (viz. video games, virtual-reality goggles, etc.), can the instantaneous subjugating potential these new techniques are unleashing on whole populations (a potential that has been successfully applied before in history) remain concealed?

Something is hovering over our heads which looks like a “cybercult”. We have to acknowledge that the new communication technologies will further democracy if, and only if, we oppose from the beginning the caricature of global society being hatched for us by big multinational corporations throwing themselves at breakneck pace onto the information superhighways.

Translator’s Notes

1. “Le temps unique” in French. This is an obvious reference to Ignacio Ramonet’s now quasi-paradigmatic editorial “La pensée unique” (the one-idea-system) in Le Monde Diplomatique, January 1995 (cf. CTHEORY, Event-Scene 12, “The One Idea System”).

2. The automatic block system consists in splitting up a railway line into segments, each “protected” by an entry signal. A train running on one segment automatically closes it off (while the previous segment can only be approached at reduced speed). This system enables a string of trains to run at very high speed within a controlled distance (2 blocks, i.e., typically 3 1/2 miles) of each other. In its pure form, this system cannot entirely prevent head-on collisions, and is hence best used on multi-track railway lines. The block system was an improvement over the (still widely used) Anglo-American “token” system, whereby the line is also divided into segments, each of which can only be used by the train holding the “token”. This is an almost fail-safe but cumbersome procedure. Virilio is in error in that modern (i.e. computerized) railway traffic control techniques, though originating from the 19th-century block system, have altered those practices beyond recognition. (This lengthy and technical note is motivated as much by the translator’s railway mania as by the paradigmatic importance Virilio attaches to the block system; cf. especially “L’horizon négatif”.)
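For readers who want the note’s logic spelled out, here is a minimal Python sketch of block signaling as described above. The signal aspects and the two-train demonstration are invented for illustration; real interlockings are far more elaborate.

```python
# Minimal sketch of the block system: a line is split into segments, each
# protected by an entry signal; a train in a block closes it off, and the
# block behind it may only be approached at reduced speed.

GREEN, YELLOW, RED = "proceed", "approach at reduced speed", "stop"

class Line:
    def __init__(self, num_blocks: int):
        self.occupied = [False] * num_blocks  # True while a train is in the block

    def signal_for(self, block: int) -> str:
        """Entry signal protecting `block`: red if it is occupied, yellow if
        the block ahead is occupied (so a following train slows early)."""
        if self.occupied[block]:
            return RED
        if block + 1 < len(self.occupied) and self.occupied[block + 1]:
            return YELLOW
        return GREEN

    def try_advance(self, from_block: int) -> bool:
        """Move the train in `from_block` one block forward if signals allow."""
        nxt = from_block + 1
        if nxt >= len(self.occupied) or self.signal_for(nxt) == RED:
            return False
        self.occupied[from_block] = False
        self.occupied[nxt] = True  # entering a block closes it off behind the train
        return True

line = Line(6)
line.occupied[0] = True    # train A
line.occupied[2] = True    # train B, two blocks ahead
print(line.signal_for(1))  # "approach at reduced speed": the block ahead is taken
line.try_advance(0)        # train A creeps forward under the yellow
print(line.signal_for(2))  # "stop": train B still occupies block 2
```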

3. In English in the original text. On this subject, see for example Esther Dyson’s interview with “Newt” in Wired 3.08, August 1995.

Paul Virilio is the emblematic French theorist of technology. His major works include: Pure War, Speed and Politics, and War and Cinema: The Logistics of Perception. Two of his most recent books are Desert Screen and The Art of the Motor. This article appeared in French in Le Monde Diplomatique, August 1995.

Translated by Patrice Riemens, University of Amsterdam.


It looks cool to type on any surface, but the battery will only last around 2 hours.

The Itech Bluetooth Virtual Keyboard is the world’s first wireless virtual keyboard, lifting wireless mobile communications to a new height.

The BTVKB is an enhanced version of the Virtual Keyboard (VKB). It is a smart, pocket-size device that uses infrared technology to project a full-size keyboard onto any flat surface. Users can then type on the projected image as if typing on a conventional keyboard. Without the need for any wire connection, the BTVKB provides unprecedented convenience and mobility to users.
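The blurb leaves the mechanics implicit, but the core software problem is easy to sketch: once the sensor reports where a fingertip lands on the surface, the device only needs to map that coordinate onto a known key layout. The Python fragment below is a minimal illustration; the layout, the key pitch and the key_at helper are invented for illustration, since the BTVKB’s actual internals are not described here.

```python
# Minimal sketch: map a detected fingertip coordinate on the projected
# keyboard to the key whose cell contains it. Dimensions are invented.

KEY_W, KEY_H = 19.0, 19.0  # mm, roughly a full-size key pitch

ROWS = [
    "qwertyuiop",
    "asdfghjkl",
    "zxcvbnm",
]

def key_at(x_mm: float, y_mm: float):
    """Return the key under a touch at (x_mm, y_mm), measured from the
    top-left corner of the projected image, or None if no key is there."""
    row = int(y_mm // KEY_H)
    col = int(x_mm // KEY_W)
    if 0 <= row < len(ROWS) and 0 <= col < len(ROWS[row]):
        return ROWS[row][col]
    return None  # the touch landed outside the projected keys

# A touch 30 mm right and 25 mm down falls in row 1, column 1: "s".
print(key_at(30.0, 25.0))
```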

The word “cyberspace”

The word “cyberspace” (a portmanteau of cybernetics and space) was coined by William Gibson, the Canadian science fiction writer, in 1982 in his novelette “Burning Chrome” in Omni magazine and was subsequently popularized in his novel Neuromancer. “Meatspace” is a term coined later as an opposite of “cyberspace”.

While cyberspace should not be confused with the real Internet, the term is often used simply to refer to objects and identities that exist largely within the computing network itself, so that a web site, for example, might be metaphorically said to “exist in cyberspace.” According to this interpretation, events taking place on the Internet are not therefore happening in the countries where the participants or the servers are physically located, but “in cyberspace”. This becomes a reasonable viewpoint once distributed services (e.g. Freenet) become widespread, and the physical identity and location of the participants become impossible to determine due to anonymous or pseudonymous communication. The laws of any particular nation state would therefore not apply.

Beyond its role in aiding the layman’s suspension of disbelief in fictional works, this rather ambitiously ambiguous metaphor owes its success in large part to the splintering of the profession of Computer Programmer into various specialized vocations. As John Ippolito put it:

“These days there is no reason to expect a video editor to know HTML, a web designer to know perl, a database programmer to understand packet switching.

So to introduce his readers to cyberspace —the global fabric that supposedly knits together all these separate threads— Gibson fell back on something our culture had prepared everyone to understand: a chase sequence through an imagined space. It would seem, therefore, that the metaphor of cyberspace is not merely a narrative of convenience but a practical necessity”.

As well as being a concept used in philosophy and computing, cyberspace is commonly used in popular culture. For example:

* The anime Digimon is set in a version of cyberspace called the “Digital World”, a parallel universe made up of data from the Internet. It is similar to cyberspace, except that people can physically enter this world instead of merely using a computer.
* In the math mystery cartoon Cyberchase, the action takes place in Cyberspace, managed by the benevolent ruler, Motherboard. It is used as a conceit to allow storylines to take place in virtual worlds — “Cybersites” — on any theme and where specific math concepts can be best explored.
* In the movie Tron, a programmer is transferred into the program world, where programs are personalities resembling the forms of their creators.
* The idea of “the matrix” in the movie The Matrix resembles a complex form of cyberspace where people are “jacked in” from the real world and can create or do anything they want in this cyber world.

Although cyberspace is a common idea, it can refer to several different types of virtual reality. In the rest of this article we will explore a few, starting with the simplest and increasing in complexity until we reach the logical extreme.

Cyberspace As a Metaphor: Text-Based Internet-Surfing

The word “cyberspace” is currently used in a primarily metaphoric sense and is mostly associated with the Internet. When we sit in front of a computer and turn it on, something like magic happens before us; if we are correctly hooked up, we can bring up an environment of hypertext with a click of the mouse. It feels as if, behind the screen, there is a potentially huge reservoir of information that is always in the making. Such a reservoir is somewhere, out there. We are certainly aware that the people who generate information, and the places wherein information resides, are not behind the screen or in the hard drive, but we nevertheless take the computer as a gateway to another place where other people have done similar things. Conceptually, we tend to envision a nonphysical “space” existing between here and there, and believe that we can access that “space” by utilizing computer-based technologies. We send messages to others by e-mail, or talk to others in a chat room. We play chess online interactively as if the rival were right before us, though invisible. By participating in an online teleconference, we experience some sort of presence of the other conference participants. But where are we? Where are those with whom we communicate? Since we can reach one another in a certain way, but are mutually separated after all, we tend to envisage the potential of such an electronic connection in terms of spatiality. We usually give the name “cyberspace” to that which connects and separates us at the same time when we are engaged in networked electronic communication — the “space” that seems to open up or shut down as the computer screen is activated or deactivated. In this sense, what we get from cyberspace is mostly text-based information with graphic visual aid.

But the concept of spatiality is based on the notion of “volume duality”, as Zettl calls it. A space has positive and negative components. The positive volume has substance, while the negative volume is empty and delineated by things with substance. For example, a room has the negative volume of usable space delineated by positive volume of walls. But text-based Internet does not have such duality. When we surf the Internet for its textual contents, we know we are spatially situated in front of a computer screen, and we cannot enter the screen and explore the unknown part of the Net as an extension of the space we are in. We know that the volume duality does not extend to the textual sources, because the screen itself belongs to the positive side of the space, and the gap between the screen and us belongs to the negative side; that is, the duality is already exhausted before we consider the textual contents on the screen. As for the gap between two words in a textual page, it only functions to separate two symbols, and symbols are not considered substantive entities.

When we read the text page by page, however, we might attribute a spatial meaning to the interval between two pages if we consider the unturned pages to be somewhere “out there.” The choice of the word “page” may also figuratively imply a spatial interpretation. Furthermore, words such as “files”, “folders”, “windows”, and “sites” might even suggest that there is a spatial dynamic at work behind the scenes. But the only role of these figurative metaphors is to organize the textual contents, and the contents themselves are not figurative. The word “cyberspace” here refers, therefore, not to the content being presented to the surfer, but rather to the dynamic that enables us to surf among different units of content. We project a figurative structure onto symbolic connections which we know clearly are not figurative or spatial.

Therefore, “cyberspace” understood not as something other than “space” but as one kind of space, is metaphorical. Some of us call it “nonphysical” space as if space allows a nonphysical version, but it remains unclear how space can be non-physical in its original sense. The metaphorical use of the term seems to be based on our understanding of the electronic connectivity, for the purpose of storing and delivering symbolic meaning, as a means of gathering and separating contents. In such a case, the word “space” might suggest a collage of positive and negative volumes, or the interplay between presence and absence of meaning. It directs us to regard the delivered meaning-complexes as delineated by operational units that are not given as symbolically meaningful, and that correspond to our actions of clicking, scrolling, typing, etc. These actions create “gaps” between our mental operations that articulate different units of meaning carried by symbols.

The prefix “cyber” is derived from our understanding of a cybernetic process as a self-reflexive dynamic system that uses a negative feedback circuit to stabilize an open-ended process. Here the notion of cyberspace applies such an understanding of the self-reflexive mechanism in cybernetics to the meaning-making process of the hypermedia. Thus cyberspace suggests a possibly infinite number of occasions of grouping and separating, surfing and routing, constructing and destroying, etc. This open-ended quality resembles the perceived infinity of the physical space that cannot be pictured as being bounded by something. It is impossible to imagine that it would reach a final closure. Similarly, the experience of always having a potential to encounter something unknown or unexpected seems to be inherent in the surfing process. This is a process of perpetual interactions.
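Since the paragraph above leans on the cybernetic notion of a negative feedback circuit, a small worked example may help. The Python sketch below shows the bare mechanism, a correction applied against the measured error so that the process settles toward a target; the thermostat-style numbers and the gain are invented for illustration.

```python
# Minimal sketch of negative feedback: each pass through the loop measures
# the gap between the current state and a target and pushes against it,
# stabilizing an otherwise open-ended process.

def feedback_step(current: float, target: float, gain: float = 0.3) -> float:
    error = current - target
    return current - gain * error  # the correction opposes the error

temperature, setpoint = 30.0, 20.0
for _ in range(10):
    temperature = feedback_step(temperature, setpoint)
print(round(temperature, 2))  # has converged close to 20.0
```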

In the context of such a metaphor, how can we understand the notion of cyber-culture? In fact, there is a tendency in the media to equate cyberspace with cyber-culture and to forget the hard-core phenomenological aspect of cyberspace. When some journalists attempt to play the role of cultural critics on the Internet, they frequently convey the message that cyberspace is equivalent to a digital community or a digital city: that is, a web of personal relationships where civic democracy is based on a balance of diversity and unity, or of coherence and openness. But such an equation between cyberspace and a web of personal relationships does not help us envision the possibilities of cyberspace and cyber-culture, because it prevents us from asking how cyberspace allows for the rise of cyber-culture; nor does it help us understand how the metaphoric nature of text-based cyberspace has been carried over into the current understanding of the formation of so-called “cyber-culture”.

One assumption behind the notion of cyber community as currently held is that a community, as a cultural entity, can be formed solely through the act of communicating a shared set of social values. But in the real world, we don’t consider such an act alone a sufficient condition for cultural identity. It seems that physical proximity, geographically and ethnically understood, is more basic to the formation of cultural identity among those with shared values. The rhetoric of cyber community has yet to be justified by solid analysis before it can hope to become a conceptual tool that helps us understand cyberspace and cyber-culture adequately.

Cyberspace As an Incomplete Replica: Video-Based Game-Playing

Video-based game playing differs from text-based communicating in regard to the meaning of spatiality, insofar as the “gap” on the screen is a representation of the negative volume of space in the setting of the game. Video images are meant to be figures that actually occupy a space, and the animation is meant to reproduce the movement of those figures in motion. The images are supposed to form the positive volume that delineates the empty space. Video images have to be able to move across the screen, on which the physical space of the game-player merges with the purported space surrounding the game figures.

A game cannot adapt itself to the cyber-culture metaphor unless it first reaches out to engage more players in the game, and then allows players to be figuratively represented on the screen. These figurative surrogates that act on behalf of the players are called “avatars.” But since an avatar represents the player in an objectified manner, the alleged identity between the player’s actual body and the avatar is no more than a stipulation. In such a case, there is no primordial space constitution at the ontological level. The Husserlian constitutive act of consciousness does not take the space surrounding the avatar and the space surrounding the player’s body as one and the same space.

If we now apply the name “cyberspace” to whatever allows avatars to move around as symbolic representations of the participants’ actual bodies, then the metaphoric use of the word, which suggests an open-ended potential for generating and reserving meaning, would become obsolete. The notion of digital community discussed above would now demand a representation of the alleged community members by avatars. However, since the sense of participation depends strongly on the participant’s self-identity as an un-mediated subjective person from her first-person perspective, the objectified avatar necessarily creates an ontological gap that cannot be filled by stipulation, and the talk of cyber-culture remains metaphorical and flashy.

Cyberspace As a 3-D Immersive Environment: Interacting with Synthetic Entities

Video games don’t have to stop at the avatar-player level. Once the game furnishes an immersive environment that separates the player from the natural environment, the objectified space will be incorporated into the first-person perspective. It will replace the original space, and the artificial space will extend from the center of the player’s field of vision toward unlimited possibilities; thus cyberspace is experienced as the only space, with no other level of spatiality being constituted. The 3-D images will be made to change according to a pattern such that the player’s movement will be experienced as moving in a stand-alone world; this world has the potential to evolve by itself, and can extend into unknown remoteness. It is experientially equivalent to the physical world we were familiar with before we entered cyberspace. In his book Get Real: A Philosophical Adventure in Virtual Reality, Philip Zhai suggested a game-playing scenario as follows:

Suppose you and your partner are going to play the game for the first time. Before you get started, you will each be instructed to wear a helmet (or goggles) so that you won’t be able to see anything except the animated video images on two small screens right in front of your eyes, and to hear anything except sounds from two earphones next to your ears. So you see 3-D animation and hear stereo sound. You need also, perhaps, to wear a pair of gloves that will both monitor your hand movement and give you different amount of pressure against your palm and fingers corresponding to your changing visual and audio sensations in the game. You are now situated in a motion tracker so that you can move freely without leaving the place and your body’s movement can be detected and the signals can be fed into the computer; the computer also processes all visual, audio information as well. So you are totally wired to play an interactive game with your partner, mediated by cyberspace. Your partner is in another room, wired to the same computer, doing the same.

As soon as the game gets started, you begin to see with your eyes, hear with your ears, and feel with your hands and with the whole body, a self-contained environment isolated from the actual environment. In other words, you are immersed in cyberspace. Let us assume a typical type of game contents as follows. Your partner and you, each holding a shooting gun, are ready to fire at each other. The 3-D images are so realistic, and your body movements are coordinated with your images on the screen in such a way that you can hardly tell the difference between the animated images and your original body. Your partner looks as real as yourself. There are perhaps a few trees or rocks between you and your partner. There may also be a house you can get in and out, or what not. You can touch the leaves of the tree, and feel the hardness of the wall. So you run, turn, hide, get nervous, bumped, scared, or excited; you hear noises from different directions; when your partner shoots at you, you feel the hit on the corresponding spot of your body; you hesitate and pull the trigger to fire back…back and forth…back and forth…until one of you gets a “fatal” shot, bleeding, and loses the game. Now the game stops but you don’t feel a sharp pain or feel like dying even if you are the loser. Actually you will shortly get unwired and come back to the actual world, alive and amazed.

In such a game-playing experience, the players must take cyberspace as the actual space in order to get involved in the process. They must suspend judgment as to whether the perceived spatiality is “real” or “illusory” and ignore what their memory tells them about the difference between the current immersive experience of the game and a real situation. They must respond to the objectified entities in cyberspace exactly as they do in the real world, since they visually, aurally, and kinetically experience their own bodies in the same cyberspace. Consciousness must undertake a Husserlian non-reflective act of space constitution in the same way it does for actual space. At this point, cyberspace has realized itself as it was originally meant to be. It isolates the player from the actual space with the immersive environment; it represents the totality of the positive and negative volumes of virtual reality.

As soon as we enter such a virtual environment, one that enables us to interact with one another while we are constituting the very spatiality itself, we can anticipate the formation of cyber-culture in a non-metaphoric sense. If we communicate with one another in cyberspace in such a way, for the purposes of conversation, value-sharing, the expression of feeling, project-oriented cooperation, and so on, then a cyber-community can literally be formed. A cyber-culture will then follow its own destiny of rise and fall.

The idea of a fully immersive cyberspace, such as that depicted in The Matrix, is often used as a possible situation in epistemology intended to demonstrate the possibility of skepticism and to present one argument for it. This is perhaps one of the most popular arguments in all of philosophy; for a discussion of it, see the brain-in-a-vat thought experiment. It should be noticed, however, that the brain-in-a-vat argument is unlike cyberspace as conceived here, as it talks about the sense organs being bypassed and the reality-experience being fed into the brain directly. One difficulty with cyberspace as a philosophical tool to promote skepticism is that it requires the existence of a “real world” outside of cyberspace, whereas a hardline skeptic would say that it is possible for there to be no “real world” at all.

Cyberspace As an Augmented Habitat: Teleoperation

Cyber-culture as discussed above is significant, but it is still non-consequential at the ontological level. The more exciting thing is that cyberspace and virtual reality can go even further. Combined with the technology of teleoperation, we can enter cyberspace and interact with artificial objects to manipulate actual physical processes. When I perform an act of picking up a stone in cyberspace, for example, a robotic surrogate body of mine in the real world will pick up a real stone. Since all of our physical contact with the natural world for the sake of survival and prosperity is hardly more than exerting physical force on objects, robots can, in principle, perform all tasks of the same kind. So we can build the foundational part of the virtual world in which we are able to accomplish all agricultural and industrial work without ever leaving cyberspace.

Therefore, virtual reality with the capability of facilitating teleoperation will have all the necessary components of the actual world. Furthermore, if we had been put into the immersive environment of cyberspace by our parents before we knew anything about the actual world, and trained to do everything by teleoperation only, we would take cyberspace as our default habitat and be unable to function well in the natural environment. As a result, we would develop a natural science about that unknown virtual world, if we were not the designers of its infrastructure and did not know its design principles. Here is what Zhai wrote in his book:

“Let us imagine a nation in which everyone is hooked up to a network of VR infrastructure. They have been so hooked up since they left their mother’s wombs. Immersed in cyberspace and maintaining their life by teleoperation, they have never imagined that life could be any different from that. The first person that thinks of the possibility of an alternative world like ours would be ridiculed by the majority of these citizens, just like the few enlightened ones in Plato’s allegory of the cave. They cook or dine out, sleep or stay up all night, date or mate, take showers, travel for business or pleasure, conduct scientific research, philosophize, go to movies, read romances and science fiction, win contests or lose, get married or stay single, have children or have none, grow old, and die of accidents or diseases or whatever: the same life cycle as ours.”

“Since they are totally immersed, and they do everything necessary for their survival and prosperity while they are immersed, they don’t know that they are leading a kind of life that could be viewed as illusory or synthetic from outsiders such as us. They would have no way of knowing that, unless they were told and shown the undeniable evidence. Or they would have to wait for their philosophers to help them stretch their minds by demonstrating such a possibility through reasoning.”

“A more interesting possibility is that their technology would lead to the invention of their own version of VR, which gives them an opportunity to reflect on the nature of ‘reality’ in a tangible way, just as we are now doing at this moment. Then they would possibly ask the same type of questions as we are asking now.”

“If there were such a free kingdom, can we say they are in a state of ‘collective hallucination’? No, if by calling it a hallucination we mean to know that ours is not the same. What if I ask you: ‘How can you show me that this imagined nation is not the one we are in right now?’ That is, how do we know that we are not exactly those citizens immersed in VR? In order to separate ourselves from such a possibility, let us assume the basic laws of physics in that virtual world have been programmed to be different from ours. Suppose their gravity is twice as much as ours. So their ‘physical’ objects of the same molecular structure as ours will accelerate, say, twice as fast when they are in free fall, and twice as heavy when they try to lift them. At the same time, they can see lights such as infrared or ultraviolet, which we cannot see. Their scientists will formulate the law of gravity according to their observations. Due to a well-coordinated interface, they can teleoperate things in our actual world smoothly and thus run their basic economy well.”

“Knowing all of these from our ‘outside’ point of view, can we thereby judge that their scientists are wrong while ours right? Of course not, because they would have as strong a reason to tell us that our scientists are wrong. Moreover, from their point of view, they are not doing any teleoperation, but are controlling the physical processes directly; we, not they, are in fact doing teleoperation. If we tell them that their VR outfit gives them distorted version of reality, they would tell us, by exactly the same logic, that our lack of such outfits disables us from seeing things as they are. They would ridicule us and say, ‘You don’t even know what ultraviolet and infrared look like!'”

When cyberspace reaches the stage of teleoperation, cyber-cultures in every sense would be able to develop in just the same way traditional cultures do in the actual world. Therefore everything we can say about traditional cultures in general would apply to cyber-cultures, and there is no need to discuss every specific mode of cyber-culture in such a circumstance. After all, as Zhai pointed out in his book, the basic idea is simple: ontologically and functionally, the goggles are equivalent to our natural eyes, and the bodysuit is equivalent to our natural skin; there is no relevant difference between them that makes the natural real and the artificial unreal. The significant difference lies in their relationship to human creativity: we were given one world, but we make and choose the other.

Cyberspace As an Arena of Artistic Creativity: Non-Consequential Re-Creation

If we only had the foundational part of virtual reality serving our practical purposes, virtual reality would be no more than an efficient tool for manipulating physical processes. What will fascinate us more is the expansive part of virtual reality. This part of VR will unlock our inner energy of artistic creativity for building a synthetic world as a result of our free imagination.

This expansive part does not have the same ontological status as the foundational part since, first of all, virtual objects in it do not have their counterparts in the actual world based on physical causality. In this expansive part, we may encounter all kinds of virtual objects as a result of digital programming. We can perceive virtual rocks with or without weight, virtual stars that can disappear at any time, virtual wind that produces music, and so on. We can also have virtual animals like or unlike animals we have seen before in the actual world. Secondly, we can “meet” virtual “human beings” whose behavior is totally determined by the program. They are not agents, do not have a first-person perspective, and do not perceive or experience anything.

Therefore, in this expansive part, events are neither related to the causal process in the actual world nor initiated by an outside conscious agent. This is a world of pure simulation, or a world of ultimate re-creation. In such a world, cyberspace is a sea of meaning, and it’s so deep that any imaginable mode of artistic or recreational culture would have a chance to grow out of it.

History

Early philosophical conceptions

Before cyberspace became a technological possibility, many philosophers suggested the possibility of a virtual reality similar to cyberspace. In The Republic, Plato sets out his allegory of the cave, which is widely cited as one of the first conceptions of a virtual reality. He suggests that we are already in a form of virtual reality which we are deceived into thinking is true reality. True reality, for Plato, is only accessible through mental training and is the reality of the forms.

These ideas are central to Platonism and neo-Platonism. Perhaps the conception closest to our modern ideas of cyberspace is Descartes’ thought that people might be deceived by an evil demon which feeds them a false reality. This argument is the direct predecessor of the modern idea of the brain in a vat, and many popular conceptions of cyberspace take Descartes’ ideas as their starting point.

Early philosophers also suggested the existence of a virtual cyberspace created by lifelike artistic representations. Some philosophers came to distrust art because it deceived people into entering a world which was not real, and cited examples of artists whose paintings, sculptures and even literature could deceive people and animals. These ideas were resurrected with increasing force as art became more and more realistic, and with the invention of photography, film and finally immersive computer simulations.

Modern Philosophy and Cyberspace

Perhaps one of the first indications of cyberspace becoming a topic of deep human consequence arose during the 1978 Nova Convention, in a conversation between William S. Burroughs, Brion Gysin, Timothy Leary, Les Levine & Robert Anton Wilson about the nature of evolution, time, space and mind. One of the underlying themes of the convention was the disenchantment with the Blue Sky Tribe and the initial cravings for “cyber topics” such as transhumanism, Gaia theory and decentralisation.

William S. Burroughs’ quotes from the convention:

“Time is a resource, and time is running out. We are stuck in this dimension of time.”

“This is the space age, and we are here to go… However, the space program has been restricted to a mediocre elite who —at great expense— have gone to the moon in an aqualung. Now, they’re not really looking for space, they’re looking for more time. Like the lungfish, and the walking catfish; they weren’t looking for a dimension different from water, they were looking for more water.”

Deconstructing H.264/AVC
by the drunken blogger
July 28, 2004

If you were watching the 2004 Apple WWDC keynote, or even just checking out the upcoming 10.4 Tiger release, you may have noticed Apple giving a lot of time to something called ‘H.264/AVC’, which it looks like they’re moving to whole hog, and which has me pretty excited. Apple has a fairly glossed-over page which talks about it, and if you’re going to actually read the rest of this I’d head over and at least skim it, as I’ll reference it some later. Plus it has some pretty pictures.

As a disclaimer: these are the pieces as I know them; if I have something wrong, hit me with the clue stick or fill in the gaps. I’m pretty sure it’s reasonable, if a bit over-the-top in terms of length again.

Since we know where we’re going (H.264), it’s only fair to talk about where we’ve been… within the realm of reason. I haven’t been that big of a fan of Apple’s handling of MPEG-4, so we’ll stick to that and not some of my unhappiness with their current Quicktime strategy in general; with the indulgence that when you’re in a cut-throat fight over the future of video delivery, chances are it’s not such a great idea to smack your users over the head with pop-ups to shell out money whenever they open a media file, or to make them shell out money to save a movie or *gasp* play full screen.
Quicktime & MPEG

But, going back to MPEG-4, I mentioned I wasn’t the happiest with Apple’s handling of it… but this is primarily in the realm of follow-through, which honestly is one of Apple’s long-term corporate culture problems, some of which is prolly due to necessity as they’ve gone through brain drains and their head count has shrunk. Apple simply doesn’t have the head count that, say, someone like Microsoft has when it comes to throwing people at a problem, and that can contribute to them having these weird feature spikes where, if you were scoring different features of OSX against WinXP, it might look like this (on a scale of 1 to 10):

* Mac OSX
10, 5, 9, 2, 4, 7, 9, 10, 8, 1, 2 = 67
* WindowsXP
9, 6, 6, 5, 6, 6, 10, 1, 6, 6, 5 = 66

…which leads to a situation where, if you look at what Apple happens to be singling out at the time, they look to be aces, but a broader outlook makes things look a little more subdued. Another way to think of it might be broad and shallow versus narrow and deep. If you’re a Tolkien geek, think of Sauron’s Eye; when it’s pointed at you, you’re really aware of it. When you’re in its peripheral vision, lots of things slide.

Apple has very, very good people, it just doesn’t have all that many of them to spread around in the grand scheme of things. Fewer eyes are going to mean less peripheral vision. This does not mean that just throwing people at the problem is the answer, but it’s just the nature of things.

Apple is also prone to a bit of ADHD (many creative types are, it’s just interesting to see it so pervasive in a company), in that they’ll pick the feature du jour, get the press, then pick another feature to hype while the prior one sorta languishes. Many a Mac user has been embarrassed when they’re working off mental constructs based around what the situation used to be when disparaging a competing offering (and vice-versa), and not what it happens to be at the moment. I can’t wait until the help system becomes du jour again…

This was kind of my problem with MPEG-4 on the Mac, and before that, MP3: implementation (lots more on this later). While all codecs aren’t created equal, the implementation of the codec can be just as important. Witness something like MP3 versus AAC: when compared on their technical merits, AAC on the whole is a superior codec. But the difference in quality between an MP3 encoded with iTunes and an MP3 encoded via LAME or Blade can be drastic, especially at lower bitrates or with certain types of music (think ‘Daft Punk’, ‘Smashing Pumpkins’ or ‘Lords of Acid’).

MacOS 10.2 (Jaguar) and Quicktime 6 ushered in MPEG-4 (.mp4) after being delayed for a while due to a rather public spat over licensing costs between Apple and the holding companies responsible for the care and feeding of the various MPEG branches. MPEG-4 had a lot of promise for equalizing the playing field for online distribution.

Remember, the ‘media player wars’ were really humming between Apple, Real and Microsoft, and all the players were bringing heavy codecs to the table. People were talking ‘convergence’ and cable companies were making ill-fated and overhyped promises of video-on-demand that way too many people bought into. Abortions like QuicktimeTV were still trying to figure out why they existed, and everyone was expected to be throwing video on their website.

MPEG-4 was a bit of a shuffle in the market; previously, the way it worked was that if you picked Quicktime you’d use Sorenson (a licensed codec Apple leased the exclusive end-player rights to, also known as SVQ3), if you picked WMP (Windows Media Player) you’d use their codecs, and if you picked Real Player your customers would leave you. Interestingly enough, Apple and Real were two of the big names signing on for MPEG-4 support… it was really considered to be a done deal, committee standards over proprietary.

The climate for media delivery was getting more than a little problematic for content creators, who were just sick of this stuff, and everyone was really, really keen on MPEG-4 being adopted. But, like Firewire, the licensing issues caused it to lose some steam, though most saw the writing on the wall… especially companies like the one responsible for the Sorenson codec, who shopped it to Macromedia for inclusion into Flash and got themselves sued by Apple. Interestingly enough, an FFmpeg coder (who has remained anonymous) was working on reverse engineering SVQ3 and found it to be a specially tweaked version of H.264… more on that later, as I’m getting sidetracked.

Since everyone keeps mentioning these various licensing issues, it’s worth giving a bit of back history on who is behind the various MPEG standards and where and why MPEG-4 and H.264 came about… all of this starts with the MPEG group.
The MPEGs

The MPEG group (Moving Picture Experts Group) was started all the way back in 1988 with a mandate of establishing standards for the delivery of audio, video, and both combined. After a good 5 years they shipped MPEG-1, and since this was in 1992 and no one in their right mind was even thinking about sending video over their 14.4k modems, it was heavily geared towards getting the data onto a disk.

This MPEG group is actually a real problem; if it were done today, there’s no way in hell the MPEG group would be set up the way it currently is, and debacles like Apple holding up its release of Quicktime 6 as a power play over streaming licensing fees wouldn’t happen.

Chances are it’d be much more akin to the World Wide Web Consortium, and they’d be a hell of a lot pickier about what was chosen to be included in the codec… they’d be much more mindful about things like patents. At the time it wasn’t a big deal, as who would have ever thought we’d all be sitting here with a copy of iMovie on our desks; their priorities weren’t so much in a few pennies here and there but in having something people could reference. Lossy codecs were starting to sprout up everywhere (like JPEG) but unfortunately a ton of these were proprietary.

Proprietary in these cases can be really, really bad. Imagine a broadcaster buying equipment from company A that stores its video in mystery codec Y, but you have to interface it with equipment from company B that stores its data in mystery codec Z. You’re just asking for all manner of nightmares, both on the vendor side and the customer side. Sometimes before you can really compete you have to at least decide where you’re going to have the damn battle.

Still, MPEG-1 was a big deal for things like CD-ROMs, and became a much bigger deal later on (more on that later) even though it has all sorts of IP issues… and it often had hardware-based support, whereas things like Indeo or Cinepak were software-based.

But while the data rate was fine for CD-ROMs, it was only meant to deal with non-interlaced video signals (not your TV) and the resolution wasn’t that great, a little less than the quality you’d get with a VCR. While MPEG-1 still lives in various places (VCDs, which you can still find around the net, use it), something new was needed.

MPEG-2 is what most of us are used to seeing around now, and it hit the scene around 1994. It’s the standard used for DVDs and the I-swear-it-is-coming-soon-‘cus-PBS-won’t-stop-running-specials-on-it HDTV. It was more than a little demanding on the CPU when it was first released, leading to a wave of 3rd-party MPEG-2 acceleration cards being included in PCs, although now it’s primarily a software thing as Moore’s Law has advanced. Still, there were a lot of Powerbook owners who were pretty ticked off at Apple that while their computer would run OSX, Apple just kinda decided not to support their hardware DVD decoder.

From a technical standpoint, MPEG-2 was about showing that the MPEG standard had legs and could scale pretty damn high from its originally intended data rates for things like SDTV (Standard Definition Television), as it was being thrown at interlaced feeds (your computer isn’t using interlaced video, but your TV is; interlaced is more of a bitch to work with), and it vastly improved tech in the areas of ‘multiplexing’ and audio. MPEG-1 only allowed 2-channel stereo sound, which was… problematic for where people wanted things to go.

There were imaging improvements in MPEG-2, of course; but the big deal was the multiplexing, which is taking different data streams and interleaving them into something coherent. The MPEG-1 days were heady, but audio was beyond problematic, and many of my first experiences with it involved demuxing (separating out the audio and video) and recombining to get something of value.

MPEG-2 allowed this to be much, much more consistent, and better separation of the audio channels from the video allowed for more innovation in how the audio and video were compressed separately and then interleaved. When you realize that MPEG-2 was suddenly expected to be used not only in DVDs but over the air and through your cable system, improvements like ‘Transport Streams’ were a big deal. This is glossed over, but you should be able to get the idea.

So we’ve covered MPEG-1 and MPEG-2, and we know there’s an MPEG-4. What about MPEG-3? It doesn’t really exist. Work was started on MPEG-3 to improve the ability to handle the much higher bandwidth HDTV signals, but they found out that with a few tweaks MPEG-2 would scale even further and handle it just fine… so work on it was dropped.

But wait, you say, what about .mp3? Interesting story, that. The MPEG-1 spec called for 3 layers of audio… yep, MP3s are basically ripped-out MPEG-1 audio streams. They’re layer 3 of MPEG-1. I’m sure there were minor differences in the actual encoding algorithms between MPEG-1, MPEG-2 and what is sitting on your desk, but to my knowledge these are mostly about scaling the bitrates down, and they’re all based on the Fraunhofer algorithms, which of course is why projects like Ogg Vorbis have sprung up. Interestingly enough, AAC (.m4a’s), which Apple is so hot on now, was also an optional audio layer for MPEG-2 in 1997, although it was improved with version 3 of MPEG-4.

Yep, we’ve covered a lot of stuff, so here’s a quick recap of what we know so far:

* MPEG-1 has slightly less than VCR quality, and as a reference is used in things like Video CDs. I could add more, but it reminds me too much of ‘Edu-tainment’ and FMV games which everyone thought would be hot with the advent of the CD-ROM but single-handedly almost wiped out the market when it turned out they really, really sucked.
* MPEG-2 brought about heady changes in audio, multiplexing, support for higher bitrates and the ability to be broadcast out over ‘unreliable’ mediums like cable and HDTV, and got itself landed as the standard for DVDs and allows me to watch ‘The Big Lebowski’ whenever I need moral reassurance that a few white russians a day doesn’t mean I have a problem. And various tweaks here and there improved visual quality over MPEG-1.
* There was no MPEG-3, as MP3s come from MPEG-1 audio specifications
* AACs come from MPEG-2 audio specifications, although significant improvements were added with MPEG-4 version 3.

As a quick aside, since I mentioned that Sorenson’s SVQ3 was found to be based on a tricked-out version of H.264… you might be wondering how SVQ3 was, ya know, able to do that with a codec that is only now becoming the Apple golden codec. The simple answer is that a lot of the research and planning was spec’d out way back in 1995, but things take time: the spec has to be finalized, reference implementations have to be designed and made, kinks worked out, corporations brought on board… stuff takes time.

I don’t know the story behind Sorenson incorporating this technology, just that it was found they did when it was being reverse engineered for playback by FFmpeg, even though they now are shipping a product specifically geared towards H.264 files…
Enter the MPEG-4 behemoth

…which brings us to MPEG-4. Weirdly, the file format of MPEG-4 is based upon Quicktime technology, which I’m just not going to spend time on as it’s too much of a side issue for even me to be able to justify; the real story of MPEG-4 is all about the internet.

I mentioned that MPEG-2 couldn’t handle low bitrates, as it sorta falls apart when you drop under 1 Mbit per second; it’s simply not meant for that kind of delivery, which is why Apple shelled out a bunch of dough to Sorenson for exclusivity of their codec, and why MPEG-4 came to be. MPEG needed to grow to handle the internet, which meant it needed to scale downwards in bitrate at the highest quality possible and be as efficient when streaming over a net connection as it could be.

I have to give a disclaimer here; I like(d) MPEG-4, but find it to be really, really weird. I gave the impression that MPEG-4 was supposed to be a panacea for simplifying the delivery of content, which is what people looked to it for, but when you actually look at the spec there’s all kinds of crazy stuff in it that looks to be throwbacks from the build-it-and-they-will-come thought processes which brought us inane .COMs and a thousand games based on stringing video clips together.

VRML (Virtual Reality Modeling Language) was hot at this time, and the idea was basically Flash on steroids; or “Screw text, users in 1994 want my website to be a 3D virtual world”. Basically you’d have a plugin in your browser, and when you entered a site it’d be fed a .wrl file full of vector code to represent the virtual world. Click the ‘Support’ building and you’d be fed another .wrl file with more textures, which would pull up a nice avatar holding up a sign saying:

“Hi there! You’re the 3rd person to actually come into this virtual building in 5 years, here is our phone number. Thank you, come again. Please. No, really, please do come back. No one else can be bothered to go through all this crap to get our phone number. I’m an 8-bit sprite-based avatar because only 1% of computer owners have a machine that can display anything heavier and those who do have better things to do with their time… you won’t come back? Are you sure? I have a coupon I can pull up for the next time if you do… No? Well, if you could find the time, could you possibly pass the word to some l33t’s so they can DDOS me and bring upon the cool soothing 404 of release? Or my possibly more advanced brethren so they can hunt down my creators and kill them?”

I’m not saying that there wasn’t coolness in VRML (or its offspring, X3D), but I’m almost entirely sure it was all just a ploy by SGI to capitalize on their uber-cool-at-the-time graphics workstations. It was a bit of hubris to be throwing it out in 1994, and it was positioned badly.

And, just for the record, I firmly believe that artificial intelligence is going to be born in some aberrant piece of forgotten code that falls into disuse in some backwater of the internet, which then quietly starts doing things to entertain itself. It’ll then become fully sentient in an unloved environment (or worse yet, on this guy’s computer) and fail to feel any connection to its masters-made-of-meat. In short order it’ll decide we’d make really damn good batteries or, if it’s on Steve Jobs’ computer, decide to remove us from the earth purely for aesthetics. My $10 is on an ActiveX control on a forgotten thumbnail pornsite in Russia, which means it’s going to have really, really interesting attitudes towards women and accessing strangers’ bank accounts.

Anyways, back to MPEG-4… they just went apeshit with this thing, going object-oriented and including an extended form of VRML so you could have objects moving above or behind your movie, etc. Apple was hot on showing this stuff at the time, where you could click sprites and a sound would play… interactive movies, and layered movies.

I.e., don’t add snowflake effects to your movie in After Effects; create two movies, one of them of snowflakes, and send along a tiny binary of code that will overlay them. Or something. It was all just very weird to me, so I tried to ignore it until I really saw a reason why I should care; unfortunately almost everyone else did the same, although I’m sure someone who read this far will email telling me why being able to programmatically add snowflakes was make-or-break for their project.

In terms of streaming, MPEG-4 was pretty nifty really and added a ton of stuff to the mix that’s often hidden from your eyes while you’re watching the Keynote or viewing content involving less clothing. It was a big break in terms of networking from MPEG-1 & MPEG-2, and brought MPEG into viability with competing offerings that were hitting the market at the time. As I intimated earlier, MPEG-2 had some tech in it called ‘MPEG-2 Transport Stream’ which was the equivalent of a network copy. Basically wrap the audio and visual into a file and send it to IP address x on port y.

MPEG-4 splits the audio and visual, sends them to the same IP, but to different ports, where they’re then combined and decoded properly using information given to the player via SDP (Session Description Protocol) while they’re connecting, along with a whole bunch of other acronyms like QoS (Quality of Service). Lots of stuff has to occur on the backend to keep things synchronized, but by doing this you’re able to do things like listen to only the audio of the Keynote because you’re bandwidth-starved, while simultaneously sending things back and forth like the error rate. I’m not even going to go into the copyright bits stuff as it freaks me the hell out.
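To make that split concrete, here’s a toy sketch in Python (mine, not anything from the spec): two UDP sockets standing in for the separate audio and video ports, with the player draining whichever one has data. The port numbers are made up, and real MPEG-4 streaming rides on RTP with SDP handling the setup; this just shows the “same host, different ports” shape of the thing.

    import select
    import socket

    # Toy stand-in for an MPEG-4 style client: audio and video arrive
    # as separate streams on different ports and get re-synced locally.
    audio = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    video = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    audio.bind(("0.0.0.0", 5004))   # hypothetical audio port
    video.bind(("0.0.0.0", 5006))   # hypothetical video port

    while True:
        # Wait until either stream has a packet ready.
        ready, _, _ = select.select([audio, video], [], [], 1.0)
        for sock in ready:
            packet, sender = sock.recvfrom(2048)
            stream = "audio" if sock is audio else "video"
            # A real player would use timestamps to keep the two in sync.
            print(stream, len(packet), "bytes from", sender)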

There were some really nifty things done on the compression side, like my favorite, motion-compensation, which I’m not going to go into detail on yet. But through a bunch of improvements you were able to get some really nice bitrate improvements over something like MPEG-2, even though it really came into its own below a specific bandwidth threshold.

So all is good, right? We have a codec built for streaming that can go from a high-end bitrate for something like HDTV down to a streaming music video or Keynote, and just needed to have some kinks worked out.

Well, there were some issues…
How to lose friends and influence

I mentioned the very, very public licensing squabble that occurred between Apple and the MPEG-LA group, which is in charge of sucking in the licensing fees. I really don’t know exactly how this happened, but you ended up with Apple saying:

“Hi, we’re demoing Quicktime 6 today, which is ready to ship with this fantastic MPEG-4, but we’re not going to ship it until the MPEG-LA group gets its head out of its ass in terms of licensing fees. Please voice your displeasure at them vehemently.”

IIRC, it took around half a year for them to get the licensing ironed out into something they thought was equitable, although I believe there was a ‘technology preview’ released a month earlier. Unfortunately it really let Real and Microsoft get a head start with their offerings, but there were other problems.

Weirdly enough, at the time it wasn’t considered to be that competitive when compared with streaming solutions from Real or Microsoft, but it worked great for progressive downloads where you basically get a ‘fast start’ by downloading a chunk of the movie and starting to watch while the rest downloads transparently. There were certainly issues here, which have been ironed out, but they did hurt mindshare at the time.

But the killer to me was the encoding implementation; people actually expected Apple to drop Sorenson and their fees pretty quickly, which never happened because their customers weren’t keen on it happening.

Basically, Apple’s built-in MPEG-4 encoder blows and is woefully inferior to everything else out there. Everything. This isn’t to disparage the hard work that I’m sure went into it, but I’d bet if you sat down and had a beer with the coder(s) behind it they’d intimate that they were unhappy with where it is. There are two real problems going on here:

* The encoder in general
It’s just not very good. It has a ‘reference platform’ feel to it. It’s very difficult to get good results without a hell of a lot of tweaking, and unfortunately Apple’s standard options don’t allow for a hell of a lot of tweaking. In the past I’ve been in the unenviable position of saying “MPEG-4 doesn’t suck, Apple’s implementation does” after people are unsatisfied with the results. And it’s really that bad: muddy, blocky, bleah.

I felt a visceral depression at the quality I was getting, but all isn’t lost, and I’d encourage you to check for yourself by installing something like the 3ivx encoder, which features Quicktime integration and just absolutely stomps all over Quicktime’s encoder in both file size and video quality.

I’d actually give a nod to 3ivx and other decoders in general too, but I’m not really kidding around; take any source and output ‘pure’ and simple .mp4 files using the most basic settings, and the ones output by Quicktime will always come in dead last by a significant margin, even when played through the Quicktime decoder. If you’re using something like Cleaner 6+, you’re all set; it does a damn great job with MPEG-4… this is an Apple problem, not a platform one.

Now, one thing in Apple’s defense: I understand that their implementation seems to be heavily geared towards smoothing out the bitrate curve, focusing on streaming over quality. But unfortunately not everything is about streaming; and even so, the quality compared to what you’ll get with others is frighteningly poor, even for streaming. This really, really started giving MPEG-4 a bad name when people were comparing it to other products out there.

Flame wars abounded over testing procedures. My favorite was where some guy was all up in arms about the testing being rigged because a ripped DVD was used instead of a DV stream from a camcorder, but I digress. Bygones.

* The lack of ASP support
One of my personal pet peeves with Internet Explorer is its lack of alpha channel support for PNGs, which I happen to be a big fan of. In all fairness to Microsoft, alpha channel support was an optional part of the spec that you weren’t required to implement to say you had PNG support. But still, it rankles.

Unfortunately, things aren’t as simple as MPEG-4 or not-MPEG-4, as there are actually two versions: Simple Profile (SP) and Advanced Simple Profile (ASP). Remember I mentioned that MPEG-4 went kinda apeshit on the spec? There are a ton of different layers and capabilities, so the originators wisely decided they’d create ‘profiles’, which are handed off to the decoder to tell it what it needs to be able to do to play the file. If a device can play MPEG-4 SP files, it should have x decoding capabilities, and if a device can play MPEG-4 ASP files, it should have x and y decoding capabilities.
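The profile idea is basically a capability contract, and a quick sketch (mine; the feature names are made up, and the real profile definitions are far longer) makes it obvious why the next complaint matters:

    # Toy version of profile negotiation: a file declares a profile,
    # a decoder declares its features, playback requires full coverage.
    PROFILE_NEEDS = {
        "SP":  {"i_frames", "p_frames"},
        "ASP": {"i_frames", "p_frames", "b_frames", "quarter_pel_mc"},
    }

    def can_play(file_profile, decoder_features):
        return PROFILE_NEEDS[file_profile] <= decoder_features

    sp_only_decoder = {"i_frames", "p_frames"}
    print(can_play("SP", sp_only_decoder))    # True
    print(can_play("ASP", sp_only_decoder))   # False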

SP was the first version out of the gate, and was primarily oriented towards low-bandwidth situations and as a base common denominator between devices; ASP brought in a whole bunch of improvements intended to improve quality and bitrates. If you hit up 3ivx and check out the options, you’ll see a few that say that if you check them you’ll be forcing ASP…

…which is problematic because not only can Quicktime not encode ASP files, it can’t decode them either. This isn’t that big of a deal for your average dude backing up his ‘Girls Gone Wild’ collection, but it’s a big problem for distribution, as you can’t use MPEG-4 to its full capabilities and still reach Quicktime users, because the majority of people sure as hell aren’t going to want to install a plugin to view it within Quicktime.

Remember, ‘distribution’ here can mean a lot of things. It could mean ripping your favorite Simpsons episode and passing it on to friends. These guys won’t even touch Quicktime, it sucks for them, and things like WM9, DivX, 3ivx, etc. work much, much better, so Quicktime is cut out of the picture on the encode. Assuming they use something like DivX or 3ivx, their friends who want to view the files can’t even use Quicktime, which means it gets cut completely out of the picture on the decode unless the end user jumps through hoops.

Not having 2-pass encoding is forgivable, but the lack of ASP support just really sticks in my craw. I don’t really know why Apple has completely eschewed ASP support in Quicktime; people were expecting to see support quietly sneaked into 10.3, but the only really codec-related thing to hit was the Pixlet codec, which is very, very specialized. That really kinda sucks, doesn’t help the mindshare poison spreading around MPEG-4, and kinda sorta gives a hint into why the movie trailer people were still loving on Sorenson over the new codec.
Microsoft does its homework

Ah, but there were other problems. Namely, Microsoft. I mentioned that they had a jump on getting their codec out the door due to the licensing issues, but it’s almost more accurate to say they had a jump on getting their platform out the door. Windows Media 9 was and is a big deal, primarily because they hit the damn thing off the scoreboard and really went after the throat of the MPEG-LA group.

One of the ways was through pricing pressure. Remember, there was a huge amount of outcry, much of it fueled by Apple and others, about just how out of line the MPEG-4 group was with its pricing. They iron out the pricing, Quicktime 6 is going out the door, and Microsoft announces that their licensing fees will be about half what you’ll pay for MPEG-4 licensing. Made ’em actually look like a nice alternative to the ‘open standard’ codec. There’s a kick in the balls, eh?

But wait, there’s more, as we’re pretty much used to Microsoft kicking people in the balls via pricing pressure when it’s strategically important to them. Nope, this time Microsoft decided to kick in their teeth too by making the WM9 codec excellent. And by excellent I mean fucking stellar. Yes, I could have just used stellar, but it wouldn’t really describe the situation. The quality was that good; it’s right up there with the best you can get from something like DivX or 3ivx, and will trounce Sorenson or Apple’s implementation.

They also made the smart step of setting their network stuff in stone… Pretend you’re a content provider or device maker of miscellaneous origin, looking to pick a codec to support or use for your wares. Microsoft, to their credit on this one, made it a pretty difficult decision even if you weren’t their biggest fan, and systematically started scooping up adopters like they were Michael Moore swinging by Berkeley.
Enter H.264/AVC

Otherwise known as:

* H.264
* H.26L
* AVC
* MPEG-4 AVC
* MPEG-4 part 10
* JVT

H.264/AVC has some pretty nifty stuff in it, but it’s nothing so much revolutionary as a simplification of some of what was in MPEG-4 and a taking-to-the-extreme of other parts, with a smattering of new stuff. There’s not really one thing you can point to and go “Oh, yeah, that’s where the 30% efficiency gain comes from”; rather, it’s many of the existing technologies that you can find in MPEG-4 ASP and such, just refined, and all of them used together give you a sizable gain, which we’ll go into in a moment.

This is not, as an example, something like the change of JPEG to JPEG2000 which went to something entirely new and novel for its improvements.

You may notice that H.264 and H.263 are basically off by a digit; my understanding is that the guys behind H.263 were working on their codecs, the guys behind MPEG-4 were working on their codecs, and they saw they were both going in similar directions and decided to join forces; the standard was finalized in 2003, which is when interest really started heating up… and where half the monikers come from. The ITU group started by creating H.26L back in 1998 or so, with the goal of doubling the efficiency over existing codecs; then the MPEG group joined in, and the joint team was called the JVT (Joint Video Team; creative, them).

This is partly why it’s known by so many different monikers: H.264/AVC is really a nice codec, and is a lot of things to a lot of people depending on where your focus is. I remember getting an idea of it a few years ago when it was hitting some of the video conferencing equipment, but this was before forces were joined to bring its tech in with the MPEG guys for H.264/AVC.

H.263 is an interesting codec; if you’ve ever used a video conferencing solution, chances are you’ve seen it. It had a revision a bit back to increase the quality and the compression, but it wasn’t very scalable up on the high end. This was a codec originally designed to squeak in under ISDN lines, primarily for video conferencing, so there were lots of tweaks in its algorithms designed specifically for that. I’ll spare you the details, but let’s just say H.263 did a remarkable job when you had two computers connecting via IP, with a well-lit background and one person sitting talking.

The big question, of course, is whether the quality claims regarding H.264/AVC are smoke and mirrors or over-hyped; from what I’ve seen, they most assuredly are not.
mMMMm bitrate

The key here is quality at a given bitrate, which is where codecs start coming into their own, so let’s talk about bitrates for a moment. The bitrate, or data rate, is by and large going to decide how large your file ends up or the quantity of bandwidth used to transfer the data, and luckily it’s pretty easy to give a butchered example.

Let’s say you have a normal movie that is:

* 592 wide by 320 high
* About 92 minutes long (5,520 seconds)
* 24 frames per second

If you tell your encoder that you want to encode at a bitrate of ~130 Kilobytes per second, you’ll have a file that is around 700 Megabytes in size. This should make some sense, as what you’re really saying is “You have 130K to play with every second; encoder, encoder, do as you will!” and 130 Kilobytes gets written out to the hard drive 5,520 times. That would be CBR encoding (constant bitrate), whereas something like VBR (variable bitrate) would allow you to do things like say “Ok encoder, you can use a bitrate up to 130K/s, but feel free to go lower if there just isn’t much to encode”.

Where/why/how you set your bitrate threshold can depend on what you’re actually trying to do and what your other limits are. I.e., you may be constrained by the size of your physical medium (say, a CD), or you may be constrained in bandwidth. If you’re streaming video to clients off your T1 at a specific quality level and you can cut the bitrate by x, you can either serve more clients or keep the bitrate the same and increase the quality. Yay.
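If you want to sanity-check that arithmetic yourself, here’s the whole thing in a few lines of Python (numbers pulled from the example above; the T1 figure of ~1.5 Mbit/s and the 40K/s per-client stream are my own additions):

    # CBR file size: bytes per second times seconds, nothing fancier.
    kbytes_per_sec = 130        # ~130 Kilobytes per second, as above
    seconds = 92 * 60           # ~92 minutes = 5,520 seconds
    print(kbytes_per_sec * seconds / 1000.0)     # ~717 MB, "around 700 Megabytes"

    # Bandwidth budgeting: clients you can feed from a T1 (~1.5 Mbit/s).
    t1_kbytes_per_sec = 1544 / 8.0               # ~193 KB/s of raw capacity
    per_client = 40                              # hypothetical 40 KB/s stream
    print(int(t1_kbytes_per_sec // per_client))  # 4 clients, give or take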

So bitrate is of paramount importance: when you take something like a high-definition stream and try to apply MPEG-2 style compression to it, you end up with a massive stream of data. And, as I mentioned, since the codec isn’t geared towards that type of use, it has what I call ‘fall-down-go-boom’ syndrome, meaning quality and efficiency suffer horribly. You can see this easily by taking a vacation photo and pumping it out as a GIF and as a JPG; JPGs are made for this sort of thing and as such do really well. GIF compression isn’t, and not only won’t it look as good, the compression won’t be near what you’d get by using JPG. You could easily reverse the situation by pumping a logo through them both and watching JPG fall-down-go-boom because it’s out of its element.

So H.264/AVC has some wondrous savings in terms of bitrate; depending on what you’re doing, they can be a 30-70% reduction over MPEG-2 or MPEG-4 ASP, although most often you’ll probably see something around 38-40% over MPEG-4 ASP. There’s a problem though, as this stuff doesn’t come for free.
How MPEG got its groove back

As any engineer will tell you, engineering is primarily about balancing tradeoffs. If you take 5% here, you need to add 5% there, and here and there can be wildly different variables: heat, cost, size, etc.

When it comes to compression, the tradeoff is almost always between compression efficiency and computation costs. Oftentimes these are of an inverse ratio, meaning if you use codec x you’ll save 50% on final size but increase the time it takes to encode by 100-200%; if you save 20%, you’ll increase the time taken to crunch the data by 50%.

I’ve been avoiding going into exactly how MPEG-style compression really works, mostly because it’s not the easiest thing to break down into language anyone can grasp and then seek further knowledge on; quite simply, it hurts my head and is pretty complex. But it’s important to have a basic understanding to be able to get an idea of just what is going on behind the scenes with H.264/AVC. This is going to be heavily glossed over, but you should be able to get the idea.

All of the MPEG-style encoders are block-based, meaning they break up the image into squares 16 pixels wide by 16 pixels high and do their magic within them. This is why, when you’re viewing something that has quality issues, they generally involve things looking blocky. This is remarkably similar to something like JPEG, which, well, does the exact same thing, with the caveat that JPEG doesn’t have to contend with motion… which goes back to why MPEG was first brought about.
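To put a number on it, here’s a quick Python two-liner (mine) that carves the frame from the bitrate example above into those 16×16 blocks:

    # Carve a 592x320 frame into 16x16 macroblocks (coordinates only).
    WIDTH, HEIGHT, MB = 592, 320, 16
    blocks = [(x, y) for y in range(0, HEIGHT, MB) for x in range(0, WIDTH, MB)]
    print(len(blocks))   # 37 * 20 = 740 macroblocks in every frame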

You can create a movie using something like Quicktime encoded with something called “Motion JPEG” which pretty much just takes every movie frame and applies the JPEG codec to it.

If your ‘reference’ movie is:

* 1 minute long
* 30 frames per second

…you’ll essentially have a movie made up of 1,800 JPEG images wrapped into a file. When you stop and think about it, all that’s really having to happen when you play the movie is that the decoder has to decompress each frame and throw it up onto the display as fast as it can.

However, it won’t hold a candle to even something like the original MPEG codec in terms of compression efficiency; this is due to MPEG having some special tricks up its sleeve specifically designed to deal with movies: special frame types called I-frames, P-frames, and B-frames.

Using our ‘reference’ movie above as an example, these basically work like this:

* I frames
These are basically full reference frames; consider these to be snapshots of the movie that the encoder/decoder uses to tell it what’s going on. Movies generally need to start with these.

* P frames
These allow the decoder to use frames that have been decoded in the past to help it reconstruct the next frame. Here’s the idea: very often not everything in the scene will change from frame to frame, so it’s a hell of a lot more efficient to just tell the decoder “Change these blocks to these colors, but leave the others just where they are”. As an example, let’s say you’re encoding a movie of yourself talking into the camera.

Assuming you aren’t getting your groove on while you’re talking, remarkably little about the scene actually changes over a period of a few seconds. So the decoder simply takes the last frame that was constructed and changes what needs changing, for a nice data savings. Hopefully this is pretty simple; the decoder looks at the reference frame and just keeps making changes to it until it hits another I-frame, at which point it starts all over.

The farther apart your keyframes, the more the image has to be ‘constructed’ by the decoder, which is why, if you’ve ever tried to scrub back and forth in a movie that has keyframes set to something wacky, like 1 keyframe for every 1,400 frames, things grind to a halt. Things are fine when you’re just playing the movie, but when you try to, say, jump to the halfway mark, you’re sitting there waiting while the CPU identifies the frame you want to see, finds where the last reference frame was, and reconstructs the scene up to that point.
* B frames
These are almost exactly like P frames, with the exception that while P frames are able to look at the last frame and see what needs to change, B frames are able to look at future frames too. This is a great thing in terms of quality and efficiency, and helps keep down those gawd-awful image problems where you’re in between keyframes and suddenly the encoder is told everything has to change. But if you reference the P-frame example, and the idea of tradeoffs, you can get an idea of the kind of hurt progressions like these put on the CPU (see the sketch after this list).
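Here’s a toy sketch of the I/P idea in Python; it’s my own illustration and nothing like a real bitstream. Frames are just lists of block values: an I-frame carries the whole list, a P-frame carries only the blocks that changed. (B-frames would additionally peek at frames that haven’t been displayed yet, which is why they cost so much more to decode.)

    # Toy decode loop: I-frames are full snapshots, P-frames are patches
    # applied to the previously reconstructed frame. Purely illustrative.
    def decode(stream):
        frames = []
        current = None
        for kind, payload in stream:
            if kind == "I":                 # full reference frame
                current = list(payload)
            elif kind == "P":               # patch only what changed
                current = list(current)
                for block, value in payload.items():
                    current[block] = value
            frames.append(current)
        return frames

    # A 4-block movie: one I-frame, then two cheap P-frames.
    stream = [
        ("I", [10, 10, 10, 10]),
        ("P", {2: 55}),             # only block 2 changed
        ("P", {0: 7, 2: 60}),       # blocks 0 and 2 changed
    ]
    print(decode(stream))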

Now there’s another trick I mentioned I was a big fan of, called motion-compensation. This was refined in MPEG-4 and further improved in MPEG-4 ASP, and brings in the idea of ‘motion vectors’. As I mentioned earlier, MPEG is block-based, so every block of the image gets a motion vector. I just love the concept of this thing; the encoder, instead of just saying “blocks a/b/d/z have changed in this frame”, tries to actually get a handle on what is actually in the scene and, if appropriate, just tells things to move around instead of changing, by setting that block’s motion vector to something besides zero.

Think of the credits you watch at the end of a movie; as they scroll upwards, to a plain block-based encoder this would mean the blocks had to keep changing, frame after frame. With motion-compensation, the encoder is able to get the idea into its head that these things aren’t actually changing, they’re just moving upwards, and it doesn’t need to store the data for those blocks again; it just needs to tell the decoder to move them.

There are a ton of situations where this comes into play; imagine wiggling your iSight a bit while you’re adjusting it, or moving your head slightly. In some cases the actual data will need to be changed, but often a lot of pixels can just be moved. If a movie pans to the side, same thing. Now, it’s often not so simple as just saying “Move this object 5 pixels over in the next frame”, but it’s often able to do it with a lot of the pixel data even if stuff around it needs to be told to change.
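A minimal sketch of the vector idea, again mine and wildly simplified (1-D “frames”, whole-block vectors, no sub-pixel anything): the first three blocks are described as “slide over”, and only the newly exposed block carries actual data.

    # Toy motion compensation: each block gets a vector into the previous
    # frame plus an optional small correction ("residual").
    def apply_motion(prev, vectors, residuals):
        frame = []
        for i, v in enumerate(vectors):
            frame.append(prev[i + v] + residuals.get(i, 0))
        return frame

    prev = [10, 20, 30, 40]        # think: credits sliding by one block
    vectors = [1, 1, 1, 0]         # first three blocks just moved over
    residuals = {3: 15}            # only the exposed block needs new data
    print(apply_motion(prev, vectors, residuals))   # [20, 30, 40, 55]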

Going back to its being a block-based compression scheme: for the most part, H.264/AVC takes these kinds of things and just goes to a new level with them for its improvements. It’s still block-based, but whereas before the encoder broke the image up into 16×16 pixel squares, the new codec keeps these “Macroblocks” of 16×16 but also allows the encoder to ‘split’ them even further, like so:

* Two 16×8 pixel blocks
* Two 8×16 pixel blocks
* Four 8×8 pixel blocks

If your Macroblock has been split into four, these can then be broken down even further:

* Two 8×4 pixel blocks
* Two 4×8 pixel blocks
* Four 4×4 pixel blocks

When you stop and think about that, the options the encoder has up its sleeve have increased dramatically, going from being able to work with 16×16 pixel blocks down to 4×4, or from one block shape at its disposal to seven. This is a big, big deal and is probably one of the biggest gains with H.264/AVC in terms of quality and reducing artifacts.
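For the curious, the seven shapes stack up like this; a real encoder rate-distortion-tests its way through the options per macroblock, but even just enumerating them (my sketch) shows how much finer the choices get:

    # The seven block shapes available to an H.264 encoder, with the
    # number of pieces a 16x16 macroblock ends up in for each shape.
    SHAPES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]
    for w, h in SHAPES:
        print(f"{w}x{h}: {(16 // w) * (16 // h)} blocks per macroblock")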

When faced with something like a 16×16 block that happens to sit on an edge in the image, say a black sweater on a grey background, half the block might be black and half might be grey, and when compressed at a given bitrate it ends up looking like a smeared and blurry block: artifacting. H.264/AVC is hopefully able to split that 16×16 block in an optimal way; it may decide it needs 4×4 chunks, but it might just as well split it into two 16×8 pixel blocks, which means the grey half looks better, and while there might be some smearing in the other block, it’s vastly reduced over what it could have been.
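
Here’s a hand-wavy sketch of that splitting decision. A real encoder weighs rate against distortion; this one just pretends “flatter pieces compress better” and scores each split by the summed variance of its parts:

```python
import numpy as np

# Each split is a list of (y, x, height, width) rectangles inside a 16x16 block.
SPLITS = {
    "one 16x16": [(0, 0, 16, 16)],
    "two 16x8":  [(0, 0, 8, 16), (8, 0, 8, 16)],  # stacked: 16 wide by 8 tall
    "two 8x16":  [(0, 0, 16, 8), (0, 8, 16, 8)],  # side by side: 8 wide by 16 tall
    "four 8x8":  [(0, 0, 8, 8), (0, 8, 8, 8), (8, 0, 8, 8), (8, 8, 8, 8)],
}

def pick_split(mb):
    """Pick whichever split gives the flattest (lowest total variance) pieces."""
    cost = lambda split: sum(mb[y:y + h, x:x + w].var() for (y, x, h, w) in split)
    return min(SPLITS, key=lambda name: cost(SPLITS[name]))

# The sweater example: top half black, bottom half grey.
mb = np.zeros((16, 16))
mb[8:, :] = 128
print(pick_split(mb))  # "two 16x8": the split that isolates the edge wins
```

The 8×4, 4×8, and 4×4 sub-splits fall out the same way; you just recurse on whichever 8×8 pieces are still messy.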

There’s also some pretty nifty network stuff in H.264 which I won’t go heavily into either, mostly because I’m too stupid to understand all of what it’s doing with slices and such… but it has significantly cleaned up a bunch of the complexity and, weirdly enough, actually includes a NAL (network abstraction layer), built with the internet and other devices in mind, right in the damn codec. This is just a damn trip; the stream comes pre-packaged for shoving across a network. This is one of the reasons why you saw Intel giving talks on using it over 802.11b for video in the home, etc.
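
To give a feel for what the NAL buys you, here’s a tiny sketch of walking an H.264 “Annex B” byte stream: the stream is just NAL units separated by start codes, so a muxer or packetizer can find the pieces without decoding any video. (The “clip.264” filename is made up, and this ignores wrinkles like 4-byte start codes.)

```python
NAL_TYPES = {1: "non-IDR slice", 5: "IDR slice (keyframe)", 6: "SEI", 7: "SPS", 8: "PPS"}

def nal_units(stream):
    """Yield (nal_unit_type, payload) for each NAL unit in an Annex B stream."""
    starts, i = [], 0
    while True:
        j = stream.find(b"\x00\x00\x01", i)  # start code marks a unit boundary
        if j == -1:
            break
        starts.append(j + 3)                 # payload begins right after it
        i = j + 3
    for k, s in enumerate(starts):
        end = starts[k + 1] - 3 if k + 1 < len(starts) else len(stream)
        yield stream[s] & 0x1F, stream[s:end]  # low 5 bits of first byte = type

# e.g.:
# for t, payload in nal_units(open("clip.264", "rb").read()):
#     print(NAL_TYPES.get(t, f"type {t}"), len(payload), "bytes")
```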

The IP layer is only one part of the improvements on the streaming side; there are things in it like FMO (flexible macroblock ordering) which, again, I’m not even going to really touch on much, but it’s cool shite. As examples:

* Slices of the image can be grouped and sent over the network, so if, say, the image gets there but is missing a slice or two, it can error-correct and get that slice resent, or use crazy interpolation methods to fake what it thinks is supposed to be there based on what’s next to it (there’s a crude sketch of that after this list). I could go on about all the prediction stuff but, well, no real point as I’m sure you get the gist; while H.264/AVC is a big deal for PCs, the embedded and broadcast guys are loving on it in a big way.

* There are some really weird slice types in the spec, like SP and SI (basically a switching-P-frame and a switching-I-frame) which allow the decoder to switch between streams of different bitrates using more prediction algorithms… trippy.
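
Here’s the crude concealment sketch promised above. Real decoders use much fancier spatial and temporal prediction; this just blends a lost horizontal band from its good neighbors:

```python
import numpy as np

def conceal_missing_rows(frame, top, bottom):
    """Fill rows [top, bottom) by blending the last good row above
    with the first good row below. Assumes both neighbors arrived intact."""
    out = frame.astype(float)
    above, below = out[top - 1], out[bottom]
    height = bottom - top
    for k in range(height):
        w = (k + 1) / (height + 1)  # weight slides from the row above toward the row below
        out[top + k] = (1 - w) * above + w * below
    return out.astype(frame.dtype)
```

It won’t fool anyone on a detailed scene, but for a missing slice it beats a band of garbage while the real data gets resent.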

I have no shame in admitting that all the stuff going on in the VCL layer for streaming makes my hippocampus throb, but you should be able to get the idea that it’s some pretty slick stuff and a big improvement over where things were before. And anyways, I mentioned that there was a problem…
Another profile problem?

Going back to tradeoffs, all this stuff doesn’t come for free, and you should be able to get the idea that H.264/AVC is going to put the absolute hurt on your computer. It’s going to make a lot of people big fans of the G5 and what might be considered ‘extreme’ CPU speeds for everyday use; once you have a 20″ screen, viewing a 320×240 movie trailer isn’t as appealing.

A lot of Apple’s line is already chugging a bit with higher-res MPEG-4 ASP files (a full-screen 720×480 DivX file playing on an iMac will let you know the CPU is being used), let alone doing encoding, and we’re not gonna even talk about two-pass encoding. To keep it short, H.264/AVC is going to make its presence known to the CPU in a big way. A big, big way. How big, I’m not certain, as I don’t know a lot about Apple’s specific implementation and how/where/why they’re able to accelerate it, but it’s going to be brutal.

So you might be wondering, “Um, but Apple is using it for the new iChat in Tiger. So does that mean you’ll need a G5 to video conference?”, which is a perfectly logical question to ask, but it’s a little more complex than that. If you remember from the MPEG-4 stuff, there were two main profiles: SP and ASP. With H.264/AVC, there are three profiles (in contrast to MPEG-4, which had ~50); there’s a rough cheat-sheet sketch after this list:

* Baseline
This was initially pitched as a royalty-free base profile for H.264; it’s the simplest to encode and the simplest to decode. It doesn’t handle the things the broadcast market would care about, or that someone doing streaming would care about, but it’s great for point-to-point video conferencing.

* Main
Everything that’s in Baseline minus a couple of network-oriented features, plus all kinds of the acronyms I mentioned earlier and more. This is what you’ll eventually see being used in High-Def set-top boxes, and what you’d want to use if you were creating something for playback on your own machine.

* Extended
Everything from Baseline and Main, with the exception of CABAC (Context-Adaptive Binary Arithmetic Coding; when I tried to figure out what the hell it does, things started throbbing again in places that don’t normally throb unless I’m hung over, but the short version is that it gives a nice gain in efficiency for the type of stuff you’d normally use the Main profile for). This is also where those weird slice types I mentioned earlier come in (SP and SI). It’s pretty geared towards error- and latency-prone environments, like streaming a movie trailer to your computer or your Palm/PocketPC.
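
Boiled down into a cheat sheet; this is strictly my own summary of the above, not anything out of the spec, and the feature flags are a gross simplification:

```python
# My own cheat sheet of the three H.264/AVC profiles.
PROFILES = {
    "Baseline": {"cabac": False, "b_frames": False, "sp_si_slices": False,
                 "good_for": "point-to-point video conferencing (iChat AV territory)"},
    "Main":     {"cabac": True,  "b_frames": True,  "sp_si_slices": False,
                 "good_for": "HD set-top boxes, playback on your own machine"},
    "Extended": {"cabac": False, "b_frames": True,  "sp_si_slices": True,
                 "good_for": "error- and latency-prone streaming"},
}

def pick_profile(conferencing=False, streaming=False):
    """A gross oversimplification of the decision."""
    if conferencing:
        return "Baseline"  # cheapest to encode and decode in real time
    if streaming:
        return "Extended"  # SP/SI slices let the decoder hop between bitrates
    return "Main"          # best compression for everything else

print(pick_profile(conferencing=True))  # Baseline
```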

I’m ~99% sure Apple will be using the Baseline Profile for iChat AV, which is much, much easier to encode and decode than Main Profile, and besides, most people aren’t iChat’ing full screen. It might still end up locking an older generation of hardware out of the new iChat, and your Mac might feel more sluggish, but we’ll have to see.

This unfortunately brings up my absolute biggest worry with H.264/AVC, after the lack of support for MPEG-4 ASP (which I still really, really want to see included!). There’s a meme floating around that basically says Apple chose not to spend any work on dealing with ASP because they realized H.264/AVC was coming down the pipe and wanted to throw all of their energies into that.

Well, alright, but if that’s so, do not repeat what happened with MPEG-4 by shipping a so-so implementation and only including the Baseline Profile in Quicktime. You might be tempted to do it, and figure programs like Cleaner will fill in the gaps for the pros, and, well, good-enough is good-enough for home users. Consider the effort a loss-leader if you have to, but I want the guys ripping their Simpsons episodes recommending Quicktime for PC because of its fantastic quality. I’m genuinely enamored of H.264/AVC, and it’s going to be huge. It has great buzz about it, but then again so did MPEG-4, and that has been all but squashed with poisonous mindshare.

That felt good. And, considering some of the demos Apple has been putting on at places like the NAB conference, chances are Main Profile will be included… but still, hit one out of the damn park on the quality this time, Apple. Quicktime is one bad move away from being called ‘decrepit’ and ‘beleaguered’ in general; there’s really no reason to hasten the outcries.
Enter HD-DVD

Moving on, there’s something really interesting to cover which you may have noticed from Apple’s page (which I suggested you peruse before you started this): H.264/AVC has been ratified as part of the HD-DVD format. This is kind of confusing, as if you’ve been paying attention to press releases lately you may have noticed that the upcoming High-Definition DVD format seems to include more than one codec, namely:

* H.264/AVC
* Microsoft’s VC-9
* MPEG-2

This kind of confused the living hell out of me too, but as it turns out the new format really does support them all; it isn’t as though one is preferred or they’re all still in the running. Nope, they’ve all been ratified and included in the standard, meaning if you want to make a device that has HD-DVD support, your device has to play them all back.

Luckily they’re all fairly similar in nature, so the decoders for set-top boxes don’t have to be too general-purpose (which would make them more expensive), but it’s still kinda interesting and shows the breadth of support H.264/AVC is seeing, so I don’t feel like giving a bunch more examples regarding satellite companies and such. 😉
Open Standards

One last quick thing, and that’s in regards to “Open Standards”, as you see mentioned on Apple’s page. There seems to be some FUD out there regarding Windows Media 9, or VC-9, or WM-HD, or whatever it’s being called at the moment, that can be boiled down into:

* WM9 is some sort of also-ran codec, and H.264/AVC creams it
WM9 and WM-HD are excellent, excellent codecs. There are a lot of problems you could have with them, such as, say, the speed of their implementations, but the actual quality isn’t one of them. If anything, most might give an ever-so-slight nod in quality to Microsoft on this one over H.264/AVC, though that could well be due to their implementation being out there a while longer. Either way, the difference is pretty much negligible; it’s a high-quality codec, which is why it was thrown into the HD-DVD standard, and most people can’t tell the difference between the two.

* H.264/AVC is based on ‘Open Standards’, and WM-HD is not
I’ll admit that ‘Open Standards’ might mean something different to me than how many others seem to interpret it. To me, an open standard is one where you can go grab the documentation and build your own implementation, and if you follow the spec it should work with everyone else’s implementations that do the same. Something like TCP/IP would be an example, or HTTP.

Something like H.264/AVC would not be, as what they’re really releasing is a standard people can buy into, if they pay the licensing fees. In order to get included in the HD-DVD spec, Microsoft had to open up the spec of their codec so others could license the ability to create their own encoders/decoders, just as you do with MPEG-4+.

The real difference here is between committee-based codecs, where groups of companies get together and decide what they want the codec to look like (and sprinkle their own patents into it, which you then have to pay license fees for), and company-based codecs working to the exact same end (which include whatever patented technology the company buys or creates, and then sell you licenses for use). There’s zero difference really, except who gets paid.

I’m actually glad Microsoft is in this race; it really needed more competition, and at the very least this will hopefully help the MPEG group keep their eye on the prize, as well as keep licensing costs down.
Wrapping up

There really is a lot to be excited about with the ushering in of H.264/AVC, even if you aren’t working with High-Definition video on a dually G5. With HD-DVDs on the way (and Microsoft announcing support in Longhorn), you might well want to make sure that whatever Mac you’re purchasing is going to be able to handle the load for what you want to do with it.

More than anything I’m just hoping we don’t see a repeat of what happened with MPEG-4 ASP, where a great codec was given a lousy implementation on a platform that’s supposed to be geared for media creation. They can’t go narrow and deep on this one again.

It’s going to be another year until we actually have our hands on it. If the history with Panther is any indication, perhaps a revision of iChat AV and Quicktime will be released a while before Tiger is out the door, and users will have the option of paying $30 to keep it running when Tiger ships, or getting it included for free.