The word “cyberspace”

The word “cyberspace” (a portmanteau of cybernetics and space) was coined by William Gibson, the Canadian science fiction writer, in 1982 in his novelette “Burning Chrome” in Omni magazine and was subsequently popularized in his novel Neuromancer. “Meatspace” is a term coined later as an opposite of “cyberspace”.

While cyberspace should not be confused with the real Internet, the term is often used simply to refer to objects and identities that exist largely within the computing network itself, so that a web site, for example, might be metaphorically said to “exist in cyberspace.” According to this interpretation, events taking place on the Internet are not therefore happening in the countries where the participants or the servers are physically located, but “in cyberspace”. This becomes a reasonable viewpoint once distributed services (e.g. Freenet) become widespread, and the physical identity and location of the participants become impossible to determine due to anonymous or pseudonymous communication. The laws of any particular nation state would therefore not apply.

Besides aiding the layman’s suspension of disbelief in fictional works, the success of this rather ambitiously ambiguous metaphor is in large part due to the splintering of the profession of Computer Programmer into various specialized vocations. As John Ippolito put it:

“These days there is no reason to expect a video editor to know HTML, a web designer to know Perl, or a database programmer to understand packet switching.

“So to introduce his readers to cyberspace —the global fabric that supposedly knits together all these separate threads— Gibson fell back on something our culture had prepared everyone to understand: a chase sequence through an imagined space. It would seem, therefore, that the metaphor of cyberspace is not merely a narrative of convenience but a practical necessity.”

As well as being a concept used in philosophy and computing, cyberspace has been commonly used in popular culture, for example:

* The anime Digimon is set in a version of cyberspace called the “Digital World”. The Digital World is a parallel universe made up of data from the Internet. It is similar to cyberspace, except that people can physically enter this world instead of merely accessing it through a computer.
* In the math mystery cartoon Cyberchase, the action takes place in Cyberspace, managed by the benevolent ruler, Motherboard. It is used as a conceit to allow storylines to take place in virtual worlds — “Cybersites” — on any theme and where specific math concepts can be best explored.
* In the movie Tron, a programmer is transferred into the program world, where programs are personalities resembling the forms of their creators.
* The idea of “the matrix” in the movie The Matrix resembles a complex form of cyberspace where people are “jacked in” from the real world, and can create anything and do anything they want in this cyber world.

Although cyberspace is a common idea, it can refer to several different types of virtual reality. The rest of this article explores a few, starting with the simplest and increasing in complexity until reaching the logical extreme.

Cyberspace As a Metaphor: Text-Based Internet-Surfing

The word “cyberspace” is currently used in a primarily metaphoric sense and is mostly associated with the Internet. When we sit in front of a computer and turn it on, something like magic happens before us; if we are correctly hooked up, we can bring up an environment of hypertext with a click of the mouse. It feels as though, behind the screen, there is a vast reservoir of information that is always in the making. Such a reservoir is somewhere, out there. We are certainly aware that the people who generate information, and the places where information resides, are not behind the screen or in the hard drive, but we nevertheless take the computer as a gateway to another place where other people have done similar things. Conceptually, we tend to envision a nonphysical “space” existing between here and there, and believe that we can access that “space” by using computer-based technologies. We send messages to others by e-mail, or talk to others in a chat room. We play chess online interactively as if the rival were right before us, though invisible. By participating in an online teleconference, we experience some sort of presence of the other participants. But where are we? Where are those with whom we communicate? Since we can reach one another in a certain way, yet remain separated after all, we tend to envisage the potential of such an electronic connection in terms of spatiality. Usually, we use “cyberspace” to name what connects and separates us at the same time when we are engaged in networked electronic communication: the “space” that seems to open up or shut down as the computer screen is activated or deactivated. In this sense, what we get from cyberspace is mostly text-based information with graphical visual aids.

But the concept of spatiality is based on the notion of “volume duality”, as Zettl calls it. A space has positive and negative components. The positive volume has substance, while the negative volume is empty and delineated by things with substance. For example, a room has the negative volume of usable space delineated by positive volume of walls. But text-based Internet does not have such duality. When we surf the Internet for its textual contents, we know we are spatially situated in front of a computer screen, and we cannot enter the screen and explore the unknown part of the Net as an extension of the space we are in. We know that the volume duality does not extend to the textual sources, because the screen itself belongs to the positive side of the space, and the gap between the screen and us belongs to the negative side; that is, the duality is already exhausted before we consider the textual contents on the screen. As for the gap between two words in a textual page, it only functions to separate two symbols, and symbols are not considered substantive entities.

When we read the text page by page, however, we might attribute a spatial meaning to the interval between two pages if we consider the unturned pages to be somewhere “out there.” The choice of the word “page” may also figuratively implicate a spatial interpretation. Furthermore, words such as “files”, “folders”, “windows”, and “sites” might even suggest that there is a spatial dynamic at work behind the scenes. But the only role of these figurative metaphors is to organize the textual contents, and the contents themselves are not figurative. The word “cyberspace” here refers, therefore, not to the content being presented to the surfer, but rather to the dynamic that enables us to surf among different units of content. We project a figurative structure onto symbolic connections that we know clearly are not figurative or spatial.

Therefore, “cyberspace” understood not as something other than “space” but as one kind of space, is metaphorical. Some of us call it “nonphysical” space as if space allows a nonphysical version, but it remains unclear how space can be non-physical in its original sense. The metaphorical use of the term seems to be based on our understanding of the electronic connectivity, for the purpose of storing and delivering symbolic meaning, as a means of gathering and separating contents. In such a case, the word “space” might suggest a collage of positive and negative volumes, or the interplay between presence and absence of meaning. It directs us to regard the delivered meaning-complexes as delineated by operational units that are not given as symbolically meaningful, and that correspond to our actions of clicking, scrolling, typing, etc. These actions create “gaps” between our mental operations that articulate different units of meaning carried by symbols.

The prefix “cyber” is derived from our understanding of a cybernetic process as a self-reflexive dynamic system that uses a negative feedback circuit to stabilize an open-ended process. Here the notion of cyberspace applies such an understanding of the self-reflexive mechanism in cybernetics to the meaning-making process of the hypermedia. Thus cyberspace suggests a possibly infinite number of occasions of grouping and separating, surfing and routing, constructing and destroying, etc. This open-ended quality resembles the perceived infinity of the physical space that cannot be pictured as being bounded by something. It is impossible to imagine that it would reach a final closure. Similarly, the experience of always having a potential to encounter something unknown or unexpected seems to be inherent in the surfing process. This is a process of perpetual interactions.
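The cybernetic mechanism the prefix borrows from can be made concrete with a minimal sketch of a negative feedback circuit: a generic proportional controller, not anything specific to hypermedia, with the setpoint, gain, and step count chosen purely for illustration.

```python
# Minimal sketch of a negative feedback loop: the system measures its
# deviation from a setpoint and feeds a correction back in the opposite
# direction, stabilizing an otherwise open-ended process.

def feedback_step(state, setpoint, gain=0.5):
    error = setpoint - state       # measure the deviation
    return state + gain * error    # correct *against* the deviation

state = 0.0
for _ in range(20):
    state = feedback_step(state, setpoint=10.0)

# After repeated corrections, the state has converged close to the setpoint.
```

Each iteration halves the remaining error, so the drift is damped rather than amplified; this self-correcting quality is what the text maps onto the open-ended grouping and separating of hypermedia.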

In the context of such a metaphor, how can we understand the notion of cyber-culture? In fact, there is a tendency in the media to equate cyberspace with cyber-culture and to forget the hard-core phenomenological aspect of cyberspace. When some journalists attempt to play the role of cultural critics on the Internet, they frequently convey the message that cyberspace is equivalent to a digital community or a digital city: a web of personal relationships in which civic democracy is based on a balance of diversity and unity, or of coherence and openness. But such an equation between cyberspace and a web of personal relationships does not help us envision the possibilities of cyberspace and cyber-culture, because it prevents us from asking how cyberspace allows for the rise of cyber-culture; nor does it help us understand that the metaphoric nature of text-based cyberspace has been carried over into the current understanding of the formation of the so-called “cyber-culture”.

One assumption behind the notion of cyber community as currently held is that a community, as a cultural entity, can be formed solely through the act of communicating a shared set of social values. But in the real world, we don’t consider such an act alone a sufficient condition for cultural identity. It seems that physical proximity, geographically and ethnically understood, is more basic to the formation of cultural identity among those with shared values. The rhetoric of cyber community has yet to be justified by solid analysis before it can hope to become a conceptual tool that helps us understand cyberspace and cyber-culture adequately.

Cyberspace As an Incomplete Replica: Video-Based Game-Playing

Video-based game playing differs from text-based communicating with regard to the meaning of spatiality, insofar as the “gap” on the screen is a representation of the negative volume of space in the setting of the game. Video images are meant to be figures that actually occupy a space, and the animation is meant to reproduce the movement of those figures in motion. Images are supposed to form the positive volume that delineates the empty space. Video images have to be able to move across the screen, on which the physical space of the game-player merges with the purported space surrounding the game figures.

A game cannot adapt itself to the cyber-culture metaphor unless it first reaches out to engage more players, and then allows those players to be figuratively represented on the screen. These figurative surrogates that act on behalf of the players are called “avatars.” But since an avatar represents the player in an objectified manner, the alleged identity between the player’s actual body and the avatar is no more than a stipulation. In such a case, there is no primordial space constitution at the ontological level: the Husserlian constitutive act of consciousness does not take the space surrounding the avatar and the space surrounding the player’s body to be one and the same space.

If we now use “cyberspace” to name what allows avatars to move around as symbolic representations of the participants’ actual bodies, then the metaphoric use of the word, which suggests an open-ended potential for generating and reserving meaning, would become obsolete. The notion of digital community discussed above would now demand a representation of the alleged community members by avatars. However, since the sense of participation depends strongly on the participant’s self-identity as an unmediated subjective person from her first-person perspective, the objectified avatar necessarily creates an ontological gap that cannot be filled by stipulation, and talk about cyber-culture remains metaphorical and flashy.

Cyberspace As a 3-D Immersive Environment: Interacting with Synthetic Entities

Video games don’t have to stop at the avatar-player level. Once the game furnishes an immersive environment that separates the player from the natural environment, the objectified space will be incorporated into the first-person perspective. It will replace the original space, and the artificial space will extend from the center of the player’s field of vision to unlimited possibilities; thus cyberspace is experienced as the only space, with no other level of spatiality being constituted. The 3-D images will be made to change according to a pattern such that the player’s movement will be experienced as movement in a stand-alone world; this world has the potential to evolve by itself, and can extend into unknown remoteness. It is experientially equivalent to the physical world we were familiar with before we entered cyberspace. In his book Get Real: A Philosophical Adventure in Virtual Reality, Philip Zhai suggests a game-playing scenario as follows:

Suppose you and your partner are going to play the game for the first time. Before you get started, you will each be instructed to wear a helmet (or goggles) so that you won’t be able to see anything except the animated video images on two small screens right in front of your eyes, or to hear anything except sounds from two earphones next to your ears. So you see 3-D animation and hear stereo sound. You may also need to wear a pair of gloves that will both monitor your hand movements and exert different amounts of pressure against your palm and fingers, corresponding to your changing visual and audio sensations in the game. You are now situated in a motion tracker so that you can move freely without leaving the place; your body’s movements can be detected and the signals fed into the computer, which processes all the visual and audio information as well. So you are totally wired to play an interactive game with your partner, mediated by cyberspace. Your partner is in another room, wired to the same computer, doing the same.

As soon as the game starts, you begin to see with your eyes, hear with your ears, and feel with your hands and your whole body a self-contained environment isolated from the actual environment. In other words, you are immersed in cyberspace. Let us assume a typical game scenario as follows. You and your partner, each holding a gun, are ready to fire at each other. The 3-D images are so realistic, and your body movements are so well coordinated with your images on the screen, that you can hardly tell the difference between the animated images and your original body. Your partner looks as real as yourself. There are perhaps a few trees or rocks between you and your partner. There may also be a house you can enter and leave, or what not. You can touch the leaves of the tree and feel the hardness of the wall. So you run, turn, hide, get nervous, bumped, scared, or excited; you hear noises from different directions; when your partner shoots at you, you feel the hit on the corresponding spot of your body; you hesitate and pull the trigger to fire back…back and forth…back and forth…until one of you takes a “fatal” shot, bleeds, and loses the game. Now the game stops, but you don’t feel sharp pain or feel like dying even if you are the loser. You will shortly get unwired and come back to the actual world, alive and amazed.

In such a game-playing experience, the players must take cyberspace as the actual space in order to get involved in the process. They must suspend judgment as to whether the perceived spatiality is “real” or “illusory” and ignore what their memory tells them about the difference between the current immersive experience of the game and a real situation. They must respond to the objectified entities in cyberspace exactly as they do in the real world, since they visually, aurally, and kinetically experience their own bodies in the same cyberspace. Consciousness must undertake a Husserlian non-reflective act of space constitution in the same way it does for actual space. At this point, cyberspace has realized itself as it was originally meant to be: it isolates the player from the actual space with the immersive environment, and it represents the totality of the positive and negative volumes of virtual reality.

As soon as we enter into such a virtual environment that enables us to interact with one another while we are constituting the very spatiality itself, we can anticipate the formation of cyber-culture in a non-metaphoric sense. If we communicate with one another in cyberspace in such a way for the purposes of conversation, value-sharing, feeling-expressing, or project-oriented cooperation, etc., then a cyber-community can be literally formed. A cyber-culture will then follow its own destiny of rise and fall.

The idea of a fully immersive cyberspace, such as that depicted in The Matrix, is often used in epistemology as a possible situation intended to demonstrate the possibility of skepticism and to present one argument for it. This is perhaps one of the most popular arguments in all of philosophy; for a discussion of it, see brain-in-a-vat. It should be noticed, however, that the brain-in-a-vat argument is unlike cyberspace as conceived here, as it involves the sense organs being bypassed and the experience of reality being fed into the brain directly. One difficulty with cyberspace as a philosophical tool to promote skepticism is that it requires the existence of a ‘real world’ outside of cyberspace, whereas a hardline skeptic would say that it is possible for there to be no ‘real world’ at all.

Cyberspace As an Augmented Habitat: Teleoperation

Cyber-culture as discussed above is significant, but it is still non-consequential at the ontological level. More exciting is that cyberspace and virtual reality can go even further. Combined with the technology of teleoperation, they let us enter cyberspace and interact with artificial objects so as to manipulate actual physical processes. When I perform the act of picking up a stone in cyberspace, for example, a robotic surrogate body of mine in the real world will pick up a real stone. Since all of our physical contact with the natural world, for the sake of survival and prosperity, is hardly more than exerting physical force on objects, robots can, in principle, perform all tasks of the same kind. So we can build the foundational part of the virtual world in which we are able to accomplish all agricultural and industrial work without ever leaving cyberspace.

Therefore, virtual reality with the capability of facilitating teleoperation will have all the necessary components of the actual world. Furthermore, if we had been put into the immersive environment of cyberspace by our parents before we knew anything about the actual world, and trained to do everything by teleoperation only, we would take cyberspace as our default habitat and be unable to function well in the natural environment. We would even develop a natural science about that unknown virtual world, if we were not the designers of its infrastructure and did not know its design principles. Here is what Zhai wrote in his book:

“Let us imagine a nation in which everyone is hooked up to a network of VR infrastructure. They have been so hooked up since they left their mother’s wombs. Immersed in cyberspace and maintaining their life by teleoperation, they have never imagined that life could be any different from that. The first person that thinks of the possibility of an alternative world like ours would be ridiculed by the majority of these citizens, just like the few enlightened ones in Plato’s allegory of the cave. They cook or dine out, sleep or stay up all night, date or mate, take showers, travel for business or pleasure, conduct scientific research, philosophize, go to movies, read romances and science fiction, win contests or lose, get married or stay single, have children or have none, grow old, and die of accidents or diseases or whatever: the same life cycle as ours.”

“Since they are totally immersed, and they do everything necessary for their survival and prosperity while they are immersed, they don’t know that they are leading a kind of life that could be viewed as illusory or synthetic from outsiders such as us. They would have no way of knowing that, unless they were told and shown the undeniable evidence. Or they would have to wait for their philosophers to help them stretch their minds by demonstrating such a possibility through reasoning.”

“A more interesting possibility is that their technology would lead to the invention of their own version of VR, which gives them an opportunity to reflect on the nature of ‘reality’ in a tangible way, just as we are now doing at this moment. Then they would possibly ask the same type of questions as we are asking now.”

“If there were such a free kingdom, can we say they are in a state of ‘collective hallucination’? No, if by calling it a hallucination we mean to know that ours is not the same. What if I ask you: ‘How can you show me that this imagined nation is not the one we are in right now?’ That is, how do we know that we are not exactly those citizens immersed in VR? In order to separate ourselves from such a possibility, let us assume the basic laws of physics in that virtual world have been programmed to be different from ours. Suppose their gravity is twice as much as ours. So their ‘physical’ objects of the same molecular structure as ours will accelerate, say, twice as fast when they are in free fall, and twice as heavy when they try to lift them. At the same time, they can see lights such as infrared or ultraviolet, which we cannot see. Their scientists will formulate the law of gravity according to their observations. Due to a well-coordinated interface, they can teleoperate things in our actual world smoothly and thus run their basic economy well.”

“Knowing all of these from our ‘outside’ point of view, can we thereby judge that their scientists are wrong while ours right? Of course not, because they would have as strong a reason to tell us that our scientists are wrong. Moreover, from their point of view, they are not doing any teleoperation, but are controlling the physical processes directly; we, not they, are in fact doing teleoperation. If we tell them that their VR outfit gives them distorted version of reality, they would tell us, by exactly the same logic, that our lack of such outfits disables us from seeing things as they are. They would ridicule us and say, ‘You don’t even know what ultraviolet and infrared look like!'”

When cyberspace reaches the stage of teleoperation, cyber-cultures in every sense will be able to develop in just the same way traditional cultures do in the actual world. Therefore everything we can say about traditional cultures in general will apply to cyber-cultures, and there is no need to discuss every specific mode of cyber-culture in such a circumstance. After all, as Zhai pointed out in his book, the basic idea is simple: ontologically and functionally, the goggles are equivalent to our natural eyes, and the bodysuit is equivalent to our natural skin; there is no relevant difference between them that makes the natural real and the artificial unreal. The significant difference lies in their relationship to human creativity: we were given one world, but we make and choose the other.

Cyberspace As an Arena of Artistic Creativity: Non-Consequential Re-Creation

If we only had the foundational part of virtual reality serving our practical purposes, virtual reality would be no more than an efficient tool for manipulating physical processes. What will fascinate us more is the expansive part of virtual reality. This part of VR will unlock our inner energy of artistic creativity for building a synthetic world as a result of our free imagination.

This expansive part does not have the same ontological status as the foundational part since, first of all, virtual objects in it do not have their counterparts in the actual world based on physical causality. In this expansive part, we may encounter all kinds of virtual objects as a result of digital programming. We can perceive virtual rocks with or without weight, virtual stars that can disappear at any time, virtual wind that produces music, and so on. We can also have virtual animals like or unlike animals we have seen before in the actual world. Secondly, we can “meet” virtual “human beings” whose behavior is totally determined by the program. They are not agents, do not have a first-person perspective, and do not perceive or experience anything.

Therefore, in this expansive part, events are neither related to the causal process in the actual world nor initiated by an outside conscious agent. This is a world of pure simulation, or a world of ultimate re-creation. In such a world, cyberspace is a sea of meaning, and it’s so deep that any imaginable mode of artistic or recreational culture would have a chance to grow out of it.


Early philosophical conceptions

Before cyberspace became a technological possibility, many philosophers suggested the possibility of a virtual reality similar to cyberspace. In The Republic, Plato sets out his allegory of the cave, widely cited as one of the earliest conceptions of a virtual reality. He suggests that we are already in a form of virtual reality which we are deceived into thinking is true reality. True reality, for Plato, is accessible only through mental training and is the reality of the Forms.

These ideas are central to Platonism and Neoplatonism. Perhaps the conception closest to our modern idea of cyberspace is Descartes’ thought that people might be deceived by an evil demon which feeds them a false reality. This argument is the direct predecessor of the modern idea of the brain in a vat, and many popular conceptions of cyberspace take Descartes’ ideas as their starting point.

Early philosophers also suggested the existence of a virtual cyberspace created by lifelike artistic representations. Some philosophers came to distrust art because it deceived people into entering a world which was not real, and cited examples of artists whose paintings, sculptures, and even literature could deceive people and animals. These ideas were resurrected with increasing force as art became more and more realistic, and with the invention of photography, film, and finally immersive computer simulations.

Modern Philosophy and Cyberspace

Perhaps one of the first indications of cyberspace becoming a topic of deep human consequence arose during the 1978 Nova Convention, in a conversation between William S. Burroughs, Brion Gysin, Timothy Leary, Les Levine and Robert Anton Wilson about the nature of evolution, time, space and mind. One of the underlying themes of the convention was disenchantment with the Blue Sky Tribe and the initial cravings for “cyber topics” such as transhumanism, Gaia theory and decentralisation.

Quotes from William S. Burroughs at the convention:

“Time is a resource, and time is running out. We are stuck in this dimension of time.”

“This is the space age, and we are here to go. However, the space program has been restricted to a mediocre elite who —at great expense— have gone to the moon in an aqualung. Now, they’re not really looking for space, they’re looking for more time. Like the lungfish and the walking catfish, they weren’t looking for a dimension different from water; they were looking for more water.”

Deconstructing H.264/AVC by the drunken blogger

Deconstructing H.264/AVC
July 28, 2004

If you were watching the 2004 Apple WWDC keynote, or even just checking out the upcoming 10.4 Tiger release, you may have noticed Apple giving a lot of time to something called ‘H.264/AVC’, which it looks like they’re moving to whole hog, and which has me pretty excited. Apple has a fairly glossed-over page which talks about it; if you’re going to actually read the rest of this, I’d head over and at least skim it, as I’ll reference it later. Plus it has some pretty pictures.

As a disclaimer: these are the pieces as I know them; if I have something wrong, hit me with the clue stick or fill in the gaps. I’m pretty sure it’s reasonable, if a bit over-the-top in terms of length again.

Since we know where we’re going (H.264), it’s only fair to talk about where we’ve been… within the realm of reason. I haven’t been that big a fan of Apple’s handling of MPEG-4, so we’ll stick to that and not some of my unhappiness with their current Quicktime strategy in general; with the indulgence that when you’re in a cut-throat fight over the future of video delivery, chances are it’s not such a great idea to smack your users over the head with pop-ups to shell out money whenever they open a media file, or to make them shell out money to save a movie or *gasp* play full screen.
Quicktime & MPEG

But, going back to MPEG-4: I mentioned I wasn’t the happiest with Apple’s handling of it… but this is primarily in the realm of follow-through, which honestly is one of Apple’s long-term corporate-culture problems, some of which is prolly due to necessity as they’ve gone through brain drains and their head count has shrunk. Apple simply doesn’t have the head count that, say, Microsoft has when it comes to throwing people at a problem, which can contribute to these weird feature spikes where, if you were scoring different features of OSX against WinXP, it might look like this (on a scale of 1 to 10):

* Mac OSX
10, 5, 9, 2, 4, 7, 9, 10, 8, 1, 2 = 67
* WindowsXP
9, 6, 6, 5, 6, 6, 10, 1, 6, 6, 5 = 66

…which leads to a situation where, if you look at what Apple happens to be singling out at the time, they look to be aces, but a broader outlook makes things look a little more subdued. Another way to think of it might be broad and shallow versus narrow and deep. If you’re a Tolkien geek, think of Sauron’s Eye: when it’s pointed at you, you’re really aware of it. When you’re in its peripheral vision, lots of things slide.
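The "spiky versus even" contrast in those made-up scores can be put in numbers: the two totals are nearly identical, but the spread differs. A quick sketch using the hypothetical figures from the list above (not real benchmark data):

```python
from statistics import mean, pstdev

# Hypothetical feature scores from the comparison above (illustrative only).
osx = [10, 5, 9, 2, 4, 7, 9, 10, 8, 1, 2]
xp  = [9, 6, 6, 5, 6, 6, 10, 1, 6, 6, 5]

# Totals are a wash: 67 vs 66.
print(sum(osx), sum(xp))

# The population standard deviation captures the "feature spikes":
# OSX's profile swings between 1s and 10s far more than XP's does,
# even though the averages are nearly the same.
print(round(pstdev(osx), 2), round(pstdev(xp), 2))
```

In other words, the higher standard deviation on the OSX side is the "narrow and deep" profile; the flatter XP numbers are "broad and shallow."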

Apple has very, very good people; it just doesn’t have all that many of them to spread around in the grand scheme of things. Fewer eyes are going to mean less peripheral vision. This does not mean that just throwing people at the problem is the answer; it’s just the nature of things.

Apple is also prone to a bit of ADHD (many creative types are; it’s just interesting to see it so pervasive in a company), in that they’ll pick the feature du jour, get the press, then pick another feature to hype while the prior one sorta languishes. Many a Mac user has been embarrassed when disparaging a competing offering (and vice-versa) while working off mental constructs based around what the situation used to be, not what it happens to be at the moment. I can’t wait until the help system becomes du jour again…

This was kind of my problem with MPEG-4 on the Mac, and before that, MP3: implementation (lots more on this later). While all codecs aren’t created equal, the implementation of a codec can be just as important. Witness something like MP3 versus AAC: when compared on their technical merits, AAC is on the whole the superior codec. But the difference in quality between an MP3 encoded with iTunes and an MP3 encoded via LAME or Blade can be drastic, especially at lower bitrates or with certain types of music (think ‘Daft Punk’, ‘Smashing Pumpkins’ or ‘Lords of Acid’).

MacOS 10.2 (Jaguar) and Quicktime 6 ushered in MPEG-4 (.mp4) after being delayed for a while due to a rather public spat over licensing costs between Apple and the holding companies responsible for the care and feeding of the various MPEG branches. MPEG-4 had a lot of promise for equalizing the playing field for online distribution.

Remember, the ‘media player wars’ were really humming between Apple, Real and Microsoft, and all the players were bringing heavy codecs to the table. People were talking ‘convergence’ and cable companies were making ill-fated and overhyped promises of video-on-demand that way too many people bought into. Abortions like QuicktimeTV were still trying to figure out why they existed, and everyone was expected to be throwing video on their website.

MPEG-4 was a bit of a shuffle in the market; previously the way it worked was that if you picked Quicktime you’d use Sorenson (a codec Apple licensed the exclusive end-player rights to, also known as SVQ3); if you picked WMP (Windows Media Player) you’d use their codecs; if you picked Real Player, your customers would leave you. Interestingly enough, both Apple and Real were two of the big names signing on for MPEG-4 support… it was really considered to be a done deal, committee standards over proprietary.

The climate for media delivery was getting more than a little problematic for content creators, who were just sick of this stuff, and everyone was really, really keen on MPEG-4 being adopted. But, like Firewire, the licensing issues caused it to lose some steam. Most saw the writing on the wall anyway… especially companies like the one responsible for the Sorenson codec, who shopped it to Macromedia for inclusion in Flash and got themselves sued by Apple. Interestingly enough, an FFmpeg coder (who has remained anonymous) was working on reverse engineering SVQ3 and found it to be a specially tweaked version of H.264… more on that later, as I’m getting sidetracked.

Since everyone keeps mentioning these various licensing issues, it’s worth giving a bit of back history on who is behind the various MPEG standards and where and why MPEG-4 and H.264 came about… all of this starts with the MPEG group.

The MPEG group (Moving Picture Experts Group) was started all the way back in 1988 with a mandate of establishing standards for the delivery of audio, visual, and both combined. After a good four years they shipped MPEG-1, and since this was 1992 and no one in their right mind was even thinking about sending video over their 14.4k modems, it was heavily geared towards getting the data onto a disc.

This MPEG group is actually a real problem; if it were formed today, there’s no way in hell it would be set up the way it currently is, and debacles like Apple holding up its release of Quicktime 6 as a power play over streaming licensing fees wouldn’t happen.

Chances are it’d be much more akin to the World Wide Web Consortium, and they’d be a hell of a lot pickier about what was chosen to be included in the codec… they’d be much more mindful about things like patents. At the time it wasn’t a big deal; who would have ever thought we’d all be sitting here with a copy of iMovie on our desk? Their priorities weren’t so much a few pennies here and there as having something people could reference. Lossy codecs were starting to sprout up everywhere (like JPEG), but unfortunately a ton of these were proprietary.

Proprietary in these cases can be really, really bad. Imagine a broadcaster buying equipment from company A that stores your video in mystery codec Y, but you have to interface it with equipment from company B that stores its data in mystery codec Z. You’re just asking for all manner of nightmares on both the vendor side and the customer side. Sometimes before you can really compete you have to at least decide where you’re going to have the damn battle.

Still, MPEG-1 was a big deal for things like CD-ROMs, and became a much bigger deal later on (more on that later) even though it has all sorts of IP issues… and it often had hardware-based support, whereas things like Indeo or Cinepak were software-based.

But while the data rate was fine for CD-ROMs, it was only meant to deal with non-interlaced video signals (not your TV), and the resolution wasn’t that great, a little less than the quality you’d get with a VCR. While MPEG-1 still lives on in various places (VCDs use it, and you can still find those around the net), something new was needed.

MPEG-2 is what most of us are used to seeing around now, and it hit the scene around 1994. It’s the standard used for DVDs and the I-swear-it-is-coming-soon-‘cus-PBS-won’t-stop-running-specials-on-it HDTV. It was more than a little demanding on CPU when it was first released, leading to a wave of third-party MPEG-2 acceleration cards being included in PCs, although now it’s primarily a software thing as Moore’s Law has advanced. Still, there were a lot of Powerbook owners who were pretty ticked off at Apple that while their computer would run OSX, Apple just kinda decided not to support their hardware DVD decoder.

From a technical standpoint MPEG-2 was about showing that the MPEG standard had legs and could scale pretty damn high from its original intended data rates for things like SDTV (Standard Definition Television), as it was being thrown at interlaced feeds (your computer isn’t using interlaced video, but your TV does; interlaced is more of a bitch to work with), and it vastly improved tech in the areas of ‘multiplexing’ and audio. MPEG-1 only allowed 2-channel stereo sound, which was… problematic for where people wanted things to go.

There were imaging improvements in MPEG-2, of course, but the big deal was the multiplexing, which is taking different data streams and interleaving them into something coherent. The MPEG-1 days were heady, but audio was beyond problematic, and many of my first experiences with it involved demuxing (separating out the audio and video) and recombining to get something of value.

MPEG-2 made this much, much more consistent, and better separation of the audio channels from the video allowed for more innovation in how the audio and video were compressed separately and then interleaved. When you realize that MPEG-2 was suddenly expected to be used not only on DVDs but over the air and through your cable system, improvements like ‘Transport Streams’ were a big deal. This is glossed over, but you should be able to get the idea.

So we’ve covered MPEG-1 and MPEG-2, and we know there’s an MPEG-4. What about MPEG-3? It doesn’t really exist. Work was started on MPEG-3 to improve the ability to handle the much higher bandwidth HDTV signals, but they found out that with a few tweaks MPEG-2 would scale even further and handle it just fine… so work on it was dropped.

But wait, you say, what about .mp3? Interesting story, that. The MPEG-1 spec called for 3 layers of audio… yep, MP3s are basically ripped-out MPEG-1 audio streams; they’re layer 3 of MPEG-1. I’m sure there were minor differences in the actual encoding algorithms between MPEG-1, MPEG-2 and what’s sitting on your desk, but to my knowledge these are mostly about scaling the bitrates down, and they’re all based on the Fraunhofer algorithms, which of course is why projects like Ogg Vorbis have sprung up. Interestingly enough, AAC (.m4a), which Apple is so hot on now, was also an optional audio layer for MPEG-2 in 1997, although it was improved with version 3 of MPEG-4.

Yep, we’ve covered a lot of stuff, so here’s a quick recap of what we know so far:

* MPEG-1 has slightly less than VCR quality, and as a reference is used in things like Video CDs. I could add more, but it reminds me too much of ‘Edu-tainment’ and FMV games, which everyone thought would be hot with the advent of the CD-ROM but which single-handedly almost wiped out the market when it turned out they really, really sucked.
* MPEG-2 brought about heady changes in audio, multiplexing, support for higher bitrates, and the ability to be broadcast over ‘unreliable’ mediums like cable and HDTV. It got itself landed as the standard for DVDs, and allows me to watch ‘The Big Lebowski’ whenever I need moral reassurance that a few White Russians a day doesn’t mean I have a problem. And various tweaks here and there improved visual quality over MPEG-1.
* There was no MPEG-3; MP3s come from the MPEG-1 audio specification.
* AACs come from MPEG-2 audio specifications, although significant improvements were added with MPEG-4 version 3.

As a quick aside, since I mentioned that Sorenson’s SVQ3 was found to be based on a tricked-out version of H.264… you might be wondering how SVQ3 was, ya know, able to do that with a codec that is only now becoming the Apple golden codec. The simple answer is that a lot of the research and planning was spec’d out way back in 1995, but things take time: the spec has to be finalized, reference implementations have to be designed and made, kinks worked out, corporate adoption won… stuff takes time.

I don’t know the story behind Sorenson incorporating this technology, just that it was found they did when it was being reverse engineered for playback by FFmpeg, even though they’re now shipping a product specifically geared towards H.264 files…
Enter the MPEG-4 behemoth

…which brings us to MPEG-4. Weirdly, the MPEG-4 file format is based upon Quicktime technology, which I’m just not going to spend time on as it’s too much of a side issue for even me to justify; the real story of MPEG-4 is all about the internet.

I mentioned that MPEG-2 couldn’t handle low bitrates; it sorta falls apart when you drop under 1 Mbit per second. It’s simply not meant for that kind of delivery, which is why Apple shelled out a bunch of dough to Sorenson for exclusivity of their codec, and why MPEG-4 came to be. MPEG needed to grow to handle the internet, which meant it needed to scale downwards in bitrate at the highest quality possible and be as efficient when streaming over a net connection as it could be.

I have to give a disclaimer here; I like(d) MPEG-4, but find it to be really, really weird. I gave the impression that MPEG-4 was supposed to be a panacea for simplifying the delivery of content, and it was looked to for that, but when you actually look at the spec there’s all kinds of crazy stuff in it that looks like throwbacks to the build-it-and-they-will-come thought processes which brought us inane .COMs and a thousand games based on stringing video clips together.

VRML (Virtual Reality Modeling Language) was hot at this time, and the idea was basically Flash on steroids; or “Screw text, users in 1994 want my website to be a 3D virtual world”. Basically you’d have a plugin in your browser, and when you entered a site it’d be fed a .wrl file full of vector code to represent the virtual world. Click the ‘Support’ building and you’d be fed another .wrl file with more textures which would pull up a nice avatar holding up a sign saying:

“Hi there! You’re the 3rd person to actually come into this virtual building in 5 years, here is our phone number. Thank you, come again. Please. No, really, please do come back. No one else can be bothered to go through all this crap to get our phone number. I’m an 8-bit sprite-based avatar because only 1% of computer owners have a machine that can display anything heavier, and those who do have better things to do with their time… you won’t come back? Are you sure? I have a coupon I can pull up for next time if you do… No? Well, if you could find the time, could you possibly pass the word to some l33t’s so they can DDOS me and bring upon the cool soothing 404 of release? Or my possibly more advanced brethren so they can hunt down my creators and kill them?”

I’m not saying that there wasn’t coolness in VRML (or its offspring, X3D), but I’m almost entirely sure it was all just a ploy by SGI to capitalize on their uber-cool-at-the-time graphics workstations. It was a bit of hubris to be throwing it out in 1994, and it was positioned badly.

And, just for the record, I firmly believe that artificial intelligence is going to be born in some aberrant piece of forgotten code that falls into disuse in some backwater of the internet, which then quietly starts doing things to entertain itself. It’ll then become fully sentient in an unloved environment (or worse yet, on this guy’s computer) and fail to feel any connection to its masters-made-of-meat. In short order it’ll decide we’d make really damn good batteries or, if it’s on Steve Jobs’ computer, decide to remove us from the earth purely for aesthetics. My $10 is on an ActiveX control on a forgotten thumbnail pornsite in Russia, which means it’s going to have really, really interesting attitudes towards women and accessing strangers’ bank accounts.

Anyways, back to MPEG-4… they just went apeshit with this thing, going object-oriented and including an extended form of VRML so you could have objects moving above or behind your movie, etc. Apple was hot on showing this stuff at the time: you could click sprites and a sound would play… interactive movies, and layered movies.

I.e., don’t add snowflake effects to your movie in After Effects; create two movies, one of them snowflakes, and send along a tiny binary of code that will overlay them. Or something. It was all just very weird to me, so I tried to ignore it until I really saw a reason why I should care; unfortunately, almost everyone else did the same, although I’m sure someone who read this far will email telling me why being able to programmatically add snowflakes was make-or-break for their project.

In terms of streaming, MPEG-4 was pretty nifty really and added a ton of stuff to the mix that’s often hidden from your eyes while you’re watching the Keynote or viewing content involving less clothing. It was a big break in terms of networking from MPEG-1 & MPEG-2, and brought MPEG into viability with competing offerings that were hitting the market at the time. As I intimated earlier, MPEG-2 had some tech in it called ‘MPEG-2 Transport Stream’ which was the equivalent of a network copy. Basically wrap the audio and visual into a file and send it to IP address x on port y.

MPEG-4 splits the audio and visual, sends them to the same IP but to different ports, where they’re then combined and decoded properly using information handed over via the SDP (Session Description Protocol) while connecting, along with a whole bunch of other acronyms like QoS (Quality of Service). Lots of stuff has to occur on the backend to keep things synchronized, but by doing this you’re able to do things like listen to only the audio of the Keynote because you’re bandwidth-starved, while simultaneously sending things back and forth like the error rate. I’m not even going to go into the copyright bits as they freak me the hell out.
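
To make that split concrete, here’s a minimal, hypothetical SDP description of the sort a server might hand a client during setup. The address, ports and payload numbers are invented for illustration; the thing to notice is the separate `m=audio` and `m=video` lines, each with its own port:

```
v=0
o=- 2890844526 2890842807 IN IP4 192.0.2.10
s=Keynote stream (example)
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 96
a=rtpmap:96 mpeg4-generic/44100/2
m=video 49172 RTP/AVP 97
a=rtpmap:97 MP4V-ES/90000
```

A bandwidth-starved client can simply set up the audio stream and skip the video one.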

There were some really nifty things done on the compression side, like my favorite, motion compensation, which I’m not going to go into detail on yet. But through a bunch of improvements you were able to get some really nice bitrate improvements over something like MPEG-2, even though it really came into its own below a specific bandwidth threshold.

So all is good, right? We have a codec built for streaming that can go from a high-end bitrate for something like HDTV down to a streaming music video or Keynote, and just needed to have some kinks worked out.

Well, there were some issues…
How to lose friends and influence

I mentioned the very, very public licensing squabble that occurred between Apple and the MPEG-LA group, which is in charge of sucking in the licensing fees. I really don’t know exactly how this happened, but you ended up with Apple saying:

“Hi, we’re demoing Quicktime 6 today, which is ready to ship with this fantastic MPEG-4 support, but we’re not going to ship it until the MPEG-LA group gets its head out of its ass in terms of licensing fees. Please voice your displeasure at them vehemently.”

IIRC, it took around half a year for them to get the licensing ironed out into something they thought was equitable, although I believe a ‘technology preview’ was released a month earlier. Unfortunately, it really let Real and Microsoft get a head start with their offerings, and there were other problems.

Weirdly enough, at the time it wasn’t considered all that competitive when compared with streaming solutions from Real or Microsoft, but it worked great for progressive downloads, where you basically get a ‘fast start’ by downloading a chunk of the movie and starting to watch while the rest downloads transparently. There were certainly issues here, which have since been ironed out, but they did hurt mindshare at the time.

But the killer to me was the encoding implementation; people actually expected Apple to drop Sorenson and their fees pretty quickly, which never happened because their customers weren’t keen on it happening.

Basically, Apple’s built-in MPEG-4 encoder blows and is woefully inferior to everything else out there. Everything. This isn’t to disparage the hard work that I’m sure went into it, but I’d bet if you sat down and had a beer with the coder(s) behind it, they’d intimate that they were unhappy with where it is. There are two real problems going on here:

* The encoder in general
It’s just not very good. It has a ‘reference platform’ feel to it. It’s very difficult to get good results without a hell of a lot of tweaking, and unfortunately Apple’s standard options don’t allow for a hell of a lot of tweaking. In the past I’ve been in the unenviable position of saying “MPEG-4 doesn’t suck, Apple’s implementation does” after people are unsatisfied with the results. And it’s really that bad: muddy, blocky, bleah.

I felt a visceral depression at the quality I was getting, but all isn’t lost, and I’d encourage you to check for yourself by installing something like the 3ivx encoder, which features Quicktime integration and just absolutely stomps all over Quicktime’s encoder in both file size and video quality.

I’d actually give a nod to 3ivx and other decoders in general too, but I’m not really kidding around: take any source, output ‘pure’ and simple .mp4 files using the most basic settings, and the ones output by Quicktime will always come in dead last by a significant margin, even when played through the Quicktime decoder. If you’re using something like Cleaner 6+, you’re all set; it does a damn great job with MPEG-4… this is an Apple problem, not a platform one.

Now, one thing in Apple’s defense: I understand that their implementation seems to be heavily geared towards smoothing out the bitrate curve, focusing on streaming over quality. But unfortunately not everything is about streaming; and even so, the quality compared to what you’ll get with others is frighteningly poor, even for streaming. This really, really started giving MPEG-4 a bad name when people were comparing it to other products out there.

Flame wars abounded over testing procedures. My favorite was where some guy was all up in arms about the testing being rigged because a ripped DVD was used instead of a DV stream from a camcorder. But I digress. Bygones.

* The lack of ASP support
One of my personal pet peeves with Internet Explorer is its lack of alpha channel support for PNGs, which I happen to be a big fan of. In all fairness to Microsoft, alpha channel support was an optional part of the spec that you weren’t required to implement to say you had PNG support. But still, it rankles.

Unfortunately, things aren’t as simple as MPEG-4 or not-MPEG-4, as there are actually two profiles in play: Simple Profile (SP) and Advanced Simple Profile (ASP). Remember I mentioned that MPEG-4 went kinda apeshit on the spec? There are a ton of different layers and capabilities, so the originators wisely decided to create ‘profiles’, which are handed off to the decoder to tell it what it needs to be able to do to play the file. If a device can play MPEG-4 SP files, it should have x decoding capabilities, and if a device can play MPEG-4 ASP files, it should have x and y decoding capabilities.

SP was the first version out of the gate, and was primarily oriented towards low-bandwidth situations and serving as a base common denominator between devices; ASP brought in a whole bunch of improvements intended to improve quality and bitrates. If you hit up 3ivx and check out the options, you’ll see a few noting that if you check them you’ll be forcing ASP…

…which is problematic because not only can Quicktime not encode ASP files, it can’t decode them either. This isn’t that big of a deal for your average duck backing up his ‘Girls Gone Wild’ collection, but it’s a big problem for distribution: you can’t use MPEG-4 to its full capabilities and still reach Quicktime viewers, because the majority of people sure as hell aren’t going to want to install a plugin to view your files within Quicktime.

Remember, ‘distribution’ here can mean a lot of things. It could mean ripping your favorite Simpsons episode and passing it on to friends. These guys won’t even touch Quicktime; it sucks for them, and things like WM9, DivX, 3ivx, etc. work much, much better, so Quicktime is cut out of the picture on the encode. Assuming they use something like DivX or 3ivx, their friends who want to view the files can’t use Quicktime either, which means it gets cut completely out of the picture on the decode unless the end user jumps through hoops.

Not having 2-pass encoding is forgivable, but the lack of ASP support just really sticks in my craw. I don’t really know why Apple has completely eschewed ASP support in Quicktime. People were expecting to see support quietly sneaked into 10.3, but the only codec-related thing to hit was the Pixlet codec, which is very, very specialized and really kinda sucks. None of this helps the mindshare poison spreading around MPEG-4, and it kinda sorta gives a hint as to why the movie trailer people were still loving on Sorenson over the new codec.
Microsoft does its homework

Ah, but there were other problems. Namely, Microsoft. I mentioned that they had a jump on getting their codec out the door due to the licensing issues, but it’s almost more accurate to say they had a jump on getting their platform out the door. Windows Media 9 was and is a big deal, primarily because they hit the damn thing off the scoreboard and really went after the throats of the MPEG-LA group.

One of the ways was through pricing pressure. Remember, there was a huge amount of outcry, much of it fueled by Apple and others, about just how out of line the MPEG-4 group was with its pricing. The pricing gets ironed out, Quicktime 6 goes out the door, and Microsoft announces that their licensing fees will be about half what you’ll pay for MPEG-4 licensing. Made ’em actually look like a nice alternative to the ‘open standard’ codec. There’s a kick in the balls, eh?

But wait, there’s more, as we’re pretty much used to Microsoft kicking people in the balls via pricing pressure when it’s strategically important to them. Nope, this time Microsoft decided to kick in their teeth too by making the WM9 codec excellent. And by excellent I mean fucking stellar. Yes, I could have just used stellar, but it wouldn’t really describe the situation. The quality was that good; it’s right up there with the best you can get from something like DivX or 3ivx, and will trounce Sorenson or Apple’s implementation.

They also made the smart step of setting their network stuff in stone… Pretend you’re a content provider or device maker of miscellaneous origin, looking to pick a codec to support or use for your wares. Microsoft, to their credit on this one, made it a pretty difficult decision even if you weren’t their biggest fan, and systematically started scooping up adopters like Michael Moore swinging by Berkeley.
Enter H.264/AVC

Otherwise known as:

* H.264
* H.26L
* MPEG-4 part 10

H.264/AVC has some pretty nifty stuff in it, but it’s nothing so much revolutionary as a simplification of some of what was in MPEG-4 and a taking-to-an-extreme of other parts, with a smattering of new stuff. There’s not really one thing you can point to and go “Oh, yeah, that’s where the 30% efficiency gain comes from”; rather, it’s many of the existing technologies you can find in MPEG-4 ASP and such, just refined, and all of them used together give you a sizable gain, which we’ll go into in a moment.

This is not, as an example, something like the change from JPEG to JPEG2000, which went to something entirely new and novel for its improvements.

You may notice that H.264 and H.263 are basically off by a digit; my understanding is that the guys behind H.263 were working on their codecs, the guys behind MPEG-4 were working on theirs, and when they saw they were both going in similar directions they decided to join forces, with things landing in 2003, which is when interest really started heating up… and where half the monikers come from. The ITU group started by creating H.26L back in 1998 or so, with the goal of doubling the efficiency over existing codecs; then the MPEG group joined in, and the joint team was called JVT (Joint Video Team; creative, them).

This is partly why it’s known by so many different monikers: H.264/AVC is really a nice codec, and is a lot of things to a lot of people depending on where your focus is. I remember getting an idea of it a few years ago when it was hitting some of the video conferencing equipment, but this was before forces were joined to bring its tech in with the MPEG guys for H.264/AVC.

H.263 is an interesting codec; if you’ve ever used a video conferencing solution, chances are you’ve seen it. It had a revision a while back to increase the quality and the compression, but it wasn’t very scalable up on the high end. This was a codec originally designed to squeak in under ISDN lines, primarily for video conferencing, so there were lots of tweaks in its algorithms designed specifically for that. I’ll spare you the details, but let’s just say H.263 did a remarkable job when you had two computers connecting via IP, a well-lit background, and one person sitting and talking.

The big question, of course, is whether the quality claims regarding H.264/AVC are smoke and mirrors or over-hyped; from what I’ve seen, they most assuredly are not.
mMMMm bitrate

The key here is quality at a given bitrate, which is where codecs start coming into their own, so let’s talk about bitrates for a moment. The bitrate, or data rate, by and large decides how large your file ends up, or the quantity of bandwidth used to transfer the data, and luckily it’s pretty easy to give a butchered example.

Let’s say you have a normal movie that is:

* 592 wide by 320 high
* About 92 minutes long (5,520 seconds)
* 24 frames per second

If you tell your encoder that you want to encode at a bitrate of ~130 Kilobytes per second, you’ll have a file that is around 700 Megabytes in size. This should make some sense, as what you’re really saying is “You have 130K to play with every second; encoder, encoder, do as you will!” and 130 Kilobytes gets written out to the hard drive 5,520 times. That would be CBR encoding (constant bitrate), whereas something like VBR (variable bitrate) would allow you to do things like say “Ok encoder, you can use a bitrate up to 130K/s, but feel free to go lower if there just isn’t much to encode”.
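
The arithmetic above, spelled out (using the same rough numbers as the example, so kilo = 1024 and a flat CBR budget):

```python
# Rough CBR file-size math for the example movie above.
bitrate_kb_per_s = 130   # kilobytes written out per second (CBR)
duration_s = 92 * 60     # ~92 minutes = 5,520 seconds
size_kb = bitrate_kb_per_s * duration_s
size_mb = size_kb / 1024
print(round(size_mb), "MB")  # ~700 MB, as claimed
```

Run the same numbers with a VBR encoder and `size_mb` becomes an upper bound rather than the actual size, since the encoder is free to spend less than 130K on quiet seconds.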

Where/why/how you set your bitrate threshold depends on what you’re actually trying to do and what your other limits are. I.e., you may be constrained by the size of your physical medium (say, a CD), or you may be constrained in bandwidth. If you’re streaming video to clients off your T1 at a specific quality level and you can cut the bitrate by x, you can either serve more clients, or keep the bitrate the same and increase the quality. Yay.

So bitrate is of paramount importance: when you take something like a high-definition stream and try to apply MPEG-2 style compression to it, you end up with a massive stream of data. And, as I mentioned, since the codec isn’t geared towards that type of use, it has what I call ‘fall-down-go-boom’ syndrome, meaning quality and efficiency suffer horribly. You can see this easily by taking a vacation photo and pumping it out as a GIF and as a JPG; JPGs are made for this sort of thing and as such do really well. GIF compression isn’t, and the photo not only won’t look as good, the compression won’t be near what you’d get by using JPG. You could easily reverse the situation by pumping a logo through them both and watching JPG fall-down-go-boom because it’s out of its element.

So H.264/AVC has some wondrous savings in terms of bitrate; depending on what you’re doing, it can be a 30-70% reduction over MPEG-2 or MPEG-4 ASP, although most often you’ll probably see something around 38-40% over MPEG-4 ASP. There’s a problem, though, as this stuff doesn’t come for free.
How MPEG got its groove back

As any engineer will tell you, engineering is primarily about balancing tradeoffs. If you take 5% here, you need to add 5% there, and ‘here’ and ‘there’ can be wildly different variables: heat, cost, size, etc.

When it comes to compression, the tradeoff is almost always between compression efficiency and computation cost. Often these are inversely related: if you use codec x you’ll save 50% on final size but increase the time it takes to encode by 100-200%; if you save 20%, you’ll increase the time taken to crunch the data by 50%.

I’ve been avoiding going into exactly how MPEG-style compression really works, mostly because it’s not the easiest thing to break down into language anyone can grasp and then seek further knowledge on; quite simply, it hurts my head and is pretty complex. But it’s important to have a basic understanding to get an idea of just what is going on behind the scenes with H.264/AVC. This is going to be heavily glossed over, but you should be able to get the idea.

All of the MPEG-style encoders are block-based, meaning they break the image up into squares 16 pixels wide by 16 high and do their magic within them. This is why, when you’re viewing something with quality issues, they generally involve things looking blocky. It’s remarkably similar to something like JPEG, which, well, does the exact same thing, with the caveat that JPEG doesn’t have to contend with motion… which goes back to why MPEG was first brought about.
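
A minimal sketch of that block split. The 16×16 ‘macroblock’ size is the real MPEG number; the rest (a grayscale frame as a list of rows, dimensions that divide evenly, no padding) is simplified for illustration:

```python
# Split a W x H grayscale frame into 16x16 macroblocks.
# Real codecs pad frames whose dimensions aren't multiples of 16;
# this toy version assumes they divide evenly.
BLOCK = 16

def macroblocks(frame):
    h, w = len(frame), len(frame[0])
    blocks = []
    for y in range(0, h, BLOCK):
        for x in range(0, w, BLOCK):
            blocks.append([row[x:x + BLOCK] for row in frame[y:y + BLOCK]])
    return blocks

# A 32x48 frame yields (32/16) * (48/16) = 2 * 3 = 6 macroblocks.
frame = [[0] * 48 for _ in range(32)]
print(len(macroblocks(frame)))  # 6
```

Everything downstream, from DCT quantization to motion vectors, operates on these little squares, which is why heavy compression artifacts show up as a grid.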

You can create a movie using something like Quicktime encoded with something called “Motion JPEG”, which pretty much just takes every movie frame and applies the JPEG codec to it.

If your ‘reference’ movie is:

* 1 minute long
* 30 frames per second

…you’ll essentially have a movie made up of 1,800 JPEG images wrapped into a file. When you stop and think about it, all that’s really having to happen when you play the movie is that the decoder has to decompress each frame and throw it up onto the display as fast as it can.
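
Back-of-the-envelope, with a purely hypothetical per-frame JPEG size of 25 KB (a made-up number for illustration, not a measured one):

```python
# Motion JPEG: every frame is a standalone JPEG.
frames = 1 * 60 * 30              # 1 minute at 30 fps = 1,800 frames
kb_per_jpeg = 25                  # hypothetical per-frame JPEG size
mjpeg_kb = frames * kb_per_jpeg
print(mjpeg_kb // 1024, "MB/minute")   # ~43 MB for one minute
# Versus the earlier ~130 KB/s CBR budget from the bitrate example:
print(130 * 60 // 1024, "MB/minute")   # ~7 MB for one minute
```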

However, it won’t hold a candle to even something like the original MPEG codec in terms of compression efficiency; this is due to MPEG having some special tricks up its sleeve specifically designed to deal with movies: special frame types called I-frames, P-frames, and B-frames.

Using our ‘reference’ movie above as an example, these basically work like this:

* I-frames
These are basically full reference frames; consider them to be snapshots of the movie that the encoder/decoder uses to tell it what’s going on. Movies generally need to start with one of these.

* P-frames
These allow the decoder to use frames that have been decoded in the past to help it reconstruct the next frame. Here’s the idea: very often, not everything in the scene will change from frame to frame, so it’s a hell of a lot more efficient to just tell the decoder “Change these blocks to these colors, but leave the others just where they are”. As an example, let’s say you’re encoding a movie of yourself talking into the camera.

Assuming you aren’t getting your groove on while you’re talking, remarkably little about the scene actually changes over a period of a few seconds. So the decoder simply takes the last frame that was constructed and changes what needs changing, for a nice data savings. Hopefully this is pretty simple; the decoder looks at the reference frame and just keeps making changes to it until it hits another keyframe, at which point it starts all over.

The farther apart your keyframes, the more the image has to be ‘constructed’ by the decoder, which is why, if you’ve ever tried to scrub back and forth in a movie that has keyframes set to something wacky, like 1 keyframe for every 1,400 frames, things grind to a halt. Things are fine when you’re just playing the movie, but when you try to, say, jump to the halfway mark, you’re sitting there waiting while the CPU identifies the frame you want to see, finds where the last reference frame was, and reconstructs the scene up to that point.
* B frames
These are almost exactly like P frames, with the exception that while P frames are only able to look at past frames to see what needs to change, B frames are able to look at future frames too. This is a great thing in terms of quality and efficiency, and helps keep down those gawd-awful image problems where you’re in between keyframes and suddenly the encoder is told everything has to change. But if you think back to the P-frame example, and the idea of tradeoffs, you can get an idea of the kind of hurt progressions like these put on the CPU.
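To make the I/P-frame idea concrete, here’s a toy decoder sketch (a hypothetical illustration in Python, not real bitstream handling: B-frames are ignored, and color names stand in for block pixel data):

```python
# Toy I/P-frame decoder: an I-frame carries a full snapshot of the picture,
# while a P-frame carries only the blocks that changed since the last frame.
def decode(stream):
    picture = {}
    frames = []
    for kind, payload in stream:
        if kind == "I":
            picture = dict(payload)   # full reference frame: start fresh
        else:                         # "P": patch only the changed blocks
            picture.update(payload)
        frames.append(dict(picture))
    return frames

# Frame 0 is a keyframe; frames 1 and 2 each change a single block.
stream = [
    ("I", {"block0": "grey", "block1": "grey", "block2": "black"}),
    ("P", {"block2": "grey"}),
    ("P", {"block0": "black"}),
]
frames = decode(stream)
print(frames[2])  # {'block0': 'black', 'block1': 'grey', 'block2': 'grey'}
```

The data savings come from the P-frame payloads being tiny compared to a full snapshot; the cost, as described above, is that seeking means walking forward from the last I-frame.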

Now there’s another feature I mentioned I was a big fan of, called motion compensation. Motion compensation itself goes back to the earliest MPEG codecs, but MPEG-4 refined it and MPEG-4 ASP improved it further, with the idea of ‘motion vectors’. As I mentioned earlier, MPEG is block-based, so every block of the image gets a motion vector. I just love the concept of this thing: the encoder, instead of just saying “blocks a/b/d/z have changed in this frame”, tries to actually get a handle on what is in the scene and, if appropriate, just tells things to move around instead of changing, by setting those blocks’ motion vectors to something besides zero.

Think of the credits you watch at the end of a movie: as they scroll upwards, to the encoder this would normally mean the blocks above and below each line of text have to change. With motion compensation, the encoder is able to get the idea into its head that these things aren’t actually changing, they’re just moving upwards, so it doesn’t need to store the data for those blocks again; it just needs to tell the decoder to move them.

There are a ton of situations where this comes into play; imagine wiggling your iSight a bit while you’re adjusting it, or moving your head slightly. In some cases the actual data will need to be changed, but often a lot of pixels can just be moved. If a movie pans to the side, same thing. Now, it’s often not as simple as just saying “Move this object 5 pixels over in the next frame”, but the encoder can often do it with a lot of the pixel data even if stuff around it needs to be told to change.

Going back to it being a block-based compression scheme, for the most part H.264/AVC takes these kinds of techniques and just goes to a new level with them for its improvements. It’s still block-based, but whereas before the encoder broke the image up into 16×16 pixel squares, the new codec keeps these 16×16 “macroblocks” but also allows the encoder to ‘split’ them even further, like so:

* Two 16×8 pixel blocks
* Two 8×16 pixel blocks
* Four 8×8 pixel blocks

If your macroblock has been split into four 8×8 blocks, these can then be broken down even further:

* Two 8×4 pixel blocks
* Two 4×8 pixel blocks
* Four 4×4 pixel blocks

When you stop and think about that, the options the encoder has up its sleeve have increased dramatically: it has gone from working with 16×16 pixel blocks down to 4×4, or from one block shape at its disposal to seven. This is a big, big deal and is probably one of the biggest gains with H.264/AVC in terms of quality and reducing artifacts.
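Counting the shapes listed above as a quick sanity check (illustrative Python; the list is just the shapes named in the text, not pulled from the spec):

```python
# The block shapes available to an H.264/AVC encoder, per the lists above:
# the full macroblock, its first-level splits, and the splits of an 8x8 block.
macroblock = [(16, 16)]
first_level = [(16, 8), (8, 16), (8, 8)]
sub_splits = [(8, 4), (4, 8), (4, 4)]

shapes = macroblock + first_level + sub_splits
print(len(shapes))  # 7 block shapes, versus the single 16x16 older MPEG had

# Every shape tiles a 16x16 macroblock evenly, which is what lets the
# encoder mix and match partitions within a single block.
assert all((16 * 16) % (w * h) == 0 for w, h in shapes)
```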

When faced with a 16×16 block that happens to contain an edge of some sort, say a black sweater on a grey background, half the block might be black and half might be grey, and when compressed at a given bitrate it ends up looking like a smeared and blurry block: artifacting. H.264/AVC is hopefully able to split that 16×16 block in an optimal way; it may decide it needs to split the block into 4×4 chunks, but it might just as well split it into two 16×8 pixel blocks, which means the grey half looks better, and while there might be some smearing in the other block, it’s vastly reduced from what it could have been.

There’s also some pretty nifty network stuff in H.264 which I won’t go heavily into either, mostly because I’m too stupid to understand all of what it’s doing with slices and such… but it has significantly cleaned up a bunch of the complexity and, weirdly enough, actually includes a NAL (network abstraction layer), built with the internet and other devices in mind, right in the damn codec. This is just a damn trip; you can just slap this onto a fixed IP address and go. It’s one of the reasons you saw Intel giving talks on using it over 802.11b for video in the home, etc.

The IP layer is only one part of the improvements on the streaming side; there are things in it like FMO (flexible macroblock ordering) which, again, I’m not even going to really touch on much, but it’s cool shite. As examples:

* Slices of the image can be grouped and sent over the network, so if, say, the image gets there but is missing a slice or two, it can error-correct and get that slice re-sent, or use crazy interpolation methods to fake what it thinks is supposed to be there based on what’s next to it. I could go on about all the prediction stuff but, well, no real point as I’m sure you get the gist; while H.264/AVC is a big deal for PCs, the embedded and broadcast guys are loving on it in a big way.

* There are some really weird slice types in the spec, like SP and SI (basically a switching-P-frame and a switching-I-frame) which allow the decoder to switch between streams of different bitrates using more prediction algorithms… trippy.

I have no shame in admitting that all the stuff going on in the VCL layer for streaming makes my hippocampus throb, but you should be able to get the idea that it’s some pretty slick stuff and a big improvement over where things were before. And anyways, I mentioned that there was a problem…
Another profile problem?

Going back to tradeoffs, all this stuff doesn’t come for free, and you should be able to get the idea that H.264/AVC is going to put the absolute hurt on your computer. It’s going to make a lot of people big fans of the G5 and what might be considered ‘extreme’ CPU speeds for everyday use, because if you have a 20″ screen, viewing a 320×240 movie trailer just isn’t that appealing.

A lot of Apple’s line is already chugging a bit with higher-res MPEG-4 ASP files (a full screen 720×480 DivX file playing on an iMac will let you know the CPU is being used), let alone doing encoding, and we’re not even gonna talk about two-pass encoding. To keep it short, H.264/AVC is going to make its presence known to the CPU in a big way. A big, big way. How big I’m not certain, as I don’t know a lot about Apple’s specific implementation and how/where/why they’re able to accelerate it, but it’s going to be brutal.

So you might be wondering, “Um, but Apple is using it for the new iChat in Tiger. So does that mean you’ll need a G5 to video conference?” which is a perfectly logical question to ask, but it’s a little more complex than that. If you remember from the MPEG-4 stuff, there were two main profiles: SP and ASP. With H.264/AVC, there are three profiles (in contrast to MPEG-4, which had ~50):

* Baseline
This was initially spun out as a royalty-free base profile for H.264. It’s the simplest to encode and the simplest to decode; it doesn’t handle the things the broadcast market or someone doing streaming would care about, but it’s great for point-to-point video conferencing.

* Main
Everything that’s in Baseline minus a couple of network-oriented features, but all kinds of the acronyms I mentioned earlier, and more, are in this one. This is what you’ll eventually see being used in High-Def set top boxes, and what you’d want to use if you were creating something for playback on your own machine.

* Extended
Everything from Baseline and Main, with the exception of CABAC (Context-Adaptive Binary Arithmetic Coding; when I tried to figure out what the hell it does, things started throbbing again in places that don’t normally throb unless I’m hung over, but if you’re working with the type of stuff you’d normally use the Main profile for, it gives you a nice gain in efficiency). This is where those weird slice types I mentioned earlier (SP and SI) come in. It’s pretty geared towards error- and latency-prone environments, like streaming a movie trailer to your computer or your Palm/PocketPC.
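Boiling those three profiles down into a simplified summary (a sketch in Python dict form; the real spec enumerates many more features, and the flags here only cover the ones discussed above):

```python
# Rough summary of the three H.264/AVC profiles as described above.
profiles = {
    "Baseline": {"cabac": False, "sp_si_slices": False,
                 "good_for": "point-to-point video conferencing"},
    "Main":     {"cabac": True,  "sp_si_slices": False,
                 "good_for": "broadcast / playback on your own machine"},
    "Extended": {"cabac": False, "sp_si_slices": True,
                 "good_for": "error- and latency-prone streaming"},
}

# e.g. the efficiency gain from CABAC is a Main-profile feature:
print(profiles["Main"]["cabac"])   # True
```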

I’m ~99% sure Apple will be using the Baseline Profile for iChat AV, which is much, much easier to encode and decode than the Main Profile, and most people aren’t iChat’ing full screen anyways. It might still end up shutting an older generation of hardware out of the new iChat, and your mac might feel more sluggish, but we’ll have to see.

This unfortunately brings up my absolute biggest worry with H.264/AVC, after the lack of MPEG-4 ASP support (which I still really, really want to see included!). There’s a meme floating around that basically says Apple chose not to spend any work on dealing with ASP because they realized H.264/AVC was coming down the pipe and wanted to throw all of their energies into that.

Well, alright, but if that’s so, do not repeat what happened with MPEG-4 by shipping a less-than-fantastic implementation that only includes the Baseline Profile in Quicktime. You might be tempted to do it, figuring programs like Cleaner will fill in the gaps for the pros and, well, good-enough is good-enough for home users. Consider the effort a loss-leader if you have to, but I want the guys ripping their Simpsons episodes recommending Quicktime for PC because of its fantastic quality. I’m really somewhat enamored of H.264/AVC, and it’s going to be huge. It has great buzz about it, but then again so did MPEG-4, and that has been all but squashed by poisonous mindshare.

That felt good. And, considering some of the demos Apple has been putting on at places like the NAB conference, chances are the Main Profile will be included… but still, hit one out of the damn park on quality this time, Apple. Quicktime is one bad move away from being called ‘decrepit’ and ‘beleaguered’ in general; there’s really no reason to hasten the outcries.
Enter HD-DVD

Moving on, there’s something really interesting to cover which you may have noticed from Apple’s page, which I suggested you peruse before you started this: H.264/AVC has been ratified as part of the HD-DVD format. This is kind of confusing, because if you’ve been paying attention to press releases lately you may have noticed that the upcoming High-Definition DVD format seems to include more than one codec, namely:

* H.264/AVC
* Microsoft’s VC-9
* MPEG-2

This kind of confused the living hell out of me too, but as it turns out the new format really does support them all; it isn’t as though one is preferred or they’re all still in the running. Nope, they’ve all been ratified and included in the standard, meaning if you want to make a device with HD-DVD support, your device has to play them all back.

Luckily they’re all fairly similar in nature, so the decoders for set top boxes don’t have to be too general-purpose (which would make them more expensive). But it’s still kinda interesting, and it shows the breadth of support H.264/AVC is seeing, so I don’t feel like giving a bunch more examples regarding satellite companies and such. 😉
Open Standards

One last quick thing, and that’s in regards to the “Open Standards” you see mentioned on Apple’s page. There seems to be some FUD out there regarding Windows Media 9, or VC-9, or WM-HD, or whatever it’s being called at the moment, that can be boiled down to:

* WM9 is some sort of also-ran codec, and H.264/AVC creams it
WM9 and WM-HD are excellent, excellent codecs. There are problems you could have with them, such as, say, the speed of their implementations, but the actual quality isn’t one of them. If anything, most might give an ever-so-slight nod in quality to Microsoft on this one over H.264/AVC, but that could well be due to their implementation having been out there a while longer. Either way, the difference is pretty much negligible; it’s a high-quality codec, which is why it was thrown into the HD-DVD standard, and most can’t tell the difference between the two.

* H.264/AVC is based on ‘Open Standards’, and WM-HD is not
I’ll admit that ‘Open Standards’ might mean something different to me than how many others seem to interpret it. To me, an open standard is one where you can go grab the documentation and build your own implementation, and if you follow the spec it should work with everyone else’s implementation that does the same. Something like TCP/IP would be an example, or HTTP.

Something like H.264/AVC would not be, as what they’re really releasing is a standard people can buy into, if they pay the licensing fees. In order to get included in the HD-DVD spec, Microsoft had to open up the spec of their codec so others could license the ability to create their own encoders/decoders, just as you do with MPEG-4+.

The real difference here is between committee-based codecs, where groups of companies get together and decide what they want the codec to look like (and sprinkle in their own patents, which you then have to pay license fees for), and company-based codecs working to the exact same end (which include whatever patented technology the company buys or creates, and then sell you licenses for use). There’s zero difference really, except in who gets paid.

I’m actually glad Microsoft is in this race; it really needed more competition, and at the very least this will hopefully help the MPEG group keep their eye on the prize as well as keep licensing costs down.
Wrapping up

There really is a lot to be excited about with the ushering in of H.264/AVC, even if you aren’t working with High-Definition video on a dual G5, although with the advent of HD-DVDs coming (and Microsoft announcing support in Longhorn) you might well want to make sure that whatever mac you’re purchasing is going to be able to handle the load for what you want to do with it.

More than anything I’m just hoping we don’t see a repeat of what happened with MPEG-4 ASP, where a great codec was given a lousy implementation on a platform that’s supposed to be geared for media creation. They can’t go narrow and deep on this one again.

It’s going to be another year until we actually have our hands on it. If the history with Panther is any indication, perhaps a revision of iChat AV and Quicktime will be released a while before Tiger is out the door, and users will have the option of paying $30 to keep it running when Tiger ships or getting it included for free.


Convergence Kills
August 02, 2004

There are some really interesting things going on in iPod land, starting with the fact that RealNetworks announced that, after being dissed and dismissed by Apple when they approached them about opening up the iPod to Real’s competing service, they went ahead and reverse-engineered how the iPod deals with DRM’d media files via their new ‘Harmony’ software.

My view is going to be a little different from the direction other people are taking on this one, as I believe the last few steps point to greater maneuvering as a whole… they’ve found their Next Big Thing™. I don’t really think RealNetworks themselves are significant; they’re just the most desperate.
Real confusion

Harmony would allow users of RealNetworks’ own music store, which uses a different Digital Rights Management scheme, to sync with and use the iPod. Remember, the iPod is able to play a few other unencumbered-by-DRM formats like MP3s, AACs, etc. But the iPod is the only player that can play Apple’s FairPlay-DRM’d AACs from the iTunes Music Store, and, until now, none of the other online music services were able to access it for their DRM-encumbered files while keeping their DRM schemes intact.

Apple’s response to Harmony was just as interesting as the Harmony announcement itself:

“We are stunned that RealNetworks has adopted the tactics and ethics of a hacker to break into the iPod,” Apple said in a release.

Now, besides the fact that Apple’s response was decidedly uncool for a company whose products must stay cool at all costs, it’s also perplexing, because if the things Apple has said in the past are true, it shouldn’t be that big of a deal.

Remember, Apple has gone to great lengths to talk about how the iTunes store is a loss leader to sell iPods. This makes sense; from each $0.99 song sold Apple gets a percentage, but it’s not a large percentage, and when you factor in all the costs involved, what they make is a pittance compared to the margin skimmed off the sale of an iPod. Apple gets people hooked on how easy it is to buy music, they decide they have to have it with them while they’re out and about, and they pick up an iPod.

At first blush it doesn’t seem to make a whole lot of sense for Apple to get too uptight about Real sliding their songs onto the iPod, as theoretically someone who happened to be a user of RealNetworks service now has the option of buying an iPod, whereas before they were limited to other devices. By keeping it closed there might be some RealNetworks users who, for whatever reason, decide they have to have an iPod and dump the service altogether and pick up iTunes for Windows.

But there have to be plenty who have way too much invested in Real’s Rhapsody service and will pick up an iPod competitor that supports their already-purchased music, and it’s not as though Apple is offering a cross-over program whereby if you turn in your Rhapsody music files you get FairPlay’d AACs instead (feel free to use that one). Either way, they’re selling iPods, and that’s really where the money is, right? Not the few cents from actually selling the songs. Keeping it locked in would be more about Apple keeping control over the process and the ‘quality of the user experience’, but really that’s just cream… right?

In fact there are several more stores out there, none of which are doing as well as the iTunes store, but which are making headway. I’ll give a nod to Napster, simply because they’ve been brilliant in getting universities to include subscriptions to their service in their student tuition. Those students can still use other players for their mobile-Napster needs, but they aren’t able to use the iPod because, as was mentioned before, the iPod doesn’t support their DRM.

Of course Napster isn’t alone; Sony, Walmart, even Microsoft have stores in the works, and while none of them have the share of the iTunes Store, they have been gaining users. And when Microsoft’s store ships, built into the OS and all that, you better believe that even if it sucks it’ll get users. It took their various MSN services years to gain on AOL, and they’re still behind, but MSN is now a valid competitor in its many forms… I know I’m getting real sick of people asking if I have MSN. These people won’t be buying iPods, and eventually that’ll add up…

It would make sense to sell to these people, as Apple themselves have stated they’ve reached “supply equilibrium” with the traditional iPod, while they still aren’t quite able to meet demand for the mini. What this means is that given their current production capacity, they’re able to meet all their orders from stores and individuals around the world.

This is good, because it means they’re pulling in all the earnings they can in a given quarter for the product. This is not so good because it means if they increased their production capacity there’d be excess inventory in the channel, and the prices of iPods would start to fall.

This isn’t completely static, and there are two things of note here:

* The HP deal hasn’t really hit yet
Remember, Apple penned a partnership deal with Hewlett-Packard on the iPod, whereby HP will be shipping iPods they manufacture themselves, co-branded with the Apple and HP logos. While Apple is at supply equilibrium, it simply doesn’t have the distribution capability of an HP or a Dell, who have their fingers everywhere, both in institutions and worldwide. Remember, there’s no China Apple Store… Apple has massive brandwidth at the moment, but the pipes for channeling it are humble compared to the big guys’. This deal, and possibly others, are going to give the iPod another shot in the arm.

* Apple is starting to feel the pricing crunch
People can only buy what they can afford. Lots of people want an iPod; they simply can’t plunk down $300 for a digital music player. Some of them might save their pennies; others will buy something cheaper, even if it’s not what they really want.

The fact that the MP3 player market is still doubling while Apple has reached supply equilibrium points to them having pretty much sapped what growth they can get with the iPod at its current price point. The early adopters with the cash have bought in. But drop the price point down into the realm where the masses can afford it and things go boom. The obvious example here would be the CD player or the Walkman, but analogies to TVs, DVD players, or VCRs would be equally appropriate.

Apple recently redesigned the iPod a bit, while lopping off $100 and doubling the battery life. While much of this has to do with competitors fielding increasingly competitive offerings, it also very much has to do with the fact that for millions upon millions of people, $300-$500 is the price of a computer or rent for a month.

So the iPod can still use help, and things are only going to get tougher from here on out due to increased competition… entropy has a way of working through marketshare that isn’t artificially dominated, and much of the fruit from the deals rivals have been inking won’t show up right away. But it’s coming…
The papier-mâché trojan horse

Considering that Apple has stated over and over again that the iTunes Store is really a trojan horse for selling iPods due to the margin differences, it would make absolutely perfect sense for them to look the other way while Real allows users of its store to choose iPods for their mobile music needs. That’s absolutely well-founded business logic, but methinks thou doth protest too much.

To recap the popular trojan horse meme: remember that when the iPod was first released for Windows, Apple incorporated a 3rd-party piece of software for Windows to sync and connect. This worked reasonably well, until iTunes for Windows was ready to ship. After using the mac base as beta testers for their software (giving mac users first crack at the store to reassure the music labels), iTunes for Windows was released with great fanfare.

The iTunes Store wasn’t really aimed at those downloading gigabytes of music from Kazaa, but rather at those who were sick of doing it due to the lousy experience it can offer, and those who never really did it at all. Make it painless… short and sweet, and they’ll load up on those FairPlay-DRM’d AACs. Eventually they’ll want to take them with them, and the only thing that plays FairPlay AACs is the iPod. Cha-ching: they’re feeding off each other, with the low-margin iTunes store as a loss-leader for the high-margin iPod.

This has been a wildly successful meme, mostly because, like all successful lies, it has a kernel of truth behind it. It’s been picked up everywhere. When Napster released their branded MP3 player, the big thing you heard repeated was: “Smart. Remember, Apple only makes pennies on each iTunes song, but a bundle on each iPod. Napster’s business model wouldn’t hold up without using their service as a trojan horse.”

Again, there is truth to the above, and a whole lot of truth as far as Napster is concerned, but it’s a short-sighted, do-we-have-a-profit-this-quarter truth. Napster, Real, and WalMart don’t have the box of tools to use in tandem that Apple is quietly placing across the chess board.

But again, RealNetworks’ Harmony tech (or just opening the iPod) doesn’t clash with this meme at all; it only reinforces it and helps sell iPods, and it’s arguably the only thing keeping Real’s market cap from equaling their cash hoard. Apple could simply be overzealous in wanting to control everything, but considering their screw-up over not opening the original MacOS is held up in Business 101 as a staple example of what not to do, I really doubt anyone there wants to be responsible for repeating it with the iPod.

Most people believe that opening up the iPod is in its future, due to past history and simple economics, and Apple has even hinted that, provided the other stores start getting some serious share, they’ll look at doing it. If things turn really bad with Real, they could simply issue a software fix and make things pretty miserable for them, while selling iPods all the way. So why would Apple be so… vehemently… against it?

We’re thinking about the iTunes Store, Napster, and Rhapsody through trojan-horse-colored glasses, and not as what the store truly is: The Gateway to DRM Content on the Desktop. RealNetworks is stepping on that, and it’s the long-term lifeblood of the company.

That’s why Apple is freaked out about what Real is doing; it knows the iPod is going to be a surprisingly short-term success story, and that its era of growth is going to die out much faster than expected. This might sound stupid at first, given how little Apple actually makes from the store and how well the iPod is doing now…
Convergence kills

It’s a sad truth, but yes, the iPod is going to go away. Everyone knows it; they just don’t know when. This isn’t to dismiss the fact that it shot out of the gates on a wildly successful run and became to MP3 players what Kleenex is to tissues, but it’s eventually going to start losing share in one form or another. This could be from pricing pressure, from a competitor or two hitting some products out of the park, from Apple getting lazy, or just from a few missteps.

Given enough time, any number of the things mentioned above would start to erode the iPod’s share at a fast rate, but they’re all irrelevant really, as the MP3 player isn’t going to be around for a whole lot longer.

Witness Exhibit A, whereby Apple and Motorola have agreed to bring the iTunes Store to the next generation of Motorola phones:

…partnering to enable millions of music lovers to transfer their favorite songs from the iTunes jukebox on their PC or Mac, including songs from the iTunes Music Store, to Motorola’s next-generation ‘always with you’ mobile handsets, via a USB or Bluetooth connection. Apple will create a new iTunes mobile music player, which Motorola will make the standard music application on all their mass-market music phones, expected to be available in the first half of next year.

“We are thrilled to be working with Motorola to enable millions of music lovers to transfer any of their favorite songs from iTunes on their PC or Mac to Motorola’s next-generation mobile phones,” said Steve Jobs, Apple’s CEO. “The mobile phone market — with 1.5 billion subscribers expected worldwide by the end of 2004 — is a phenomenal opportunity to get iTunes in the hands of even more music lovers around the world and we think Motorola is the ideal partner to kick this off.”

The announcement of the deal kicked off two main forms of speculation:

* That Apple and Motorola are partnering to create the long-fabled iPhone
* That this is another trojan horse; many people haven’t bought into the iPod MP3 craze yet, but this will give them a taste… and, when they’re tired of only being able to store 12 songs they’ll pick up an iPod.

Forget about the iPhone. The iPhone as people envision it just isn’t going to happen. The market is ungodly saturated, and while Apple could theoretically make a bundle with a sleekly designed, pricey offering, there’s really only so much they can do here. Remember, they didn’t create the iPod OS; they bought it. They don’t do that, and they aren’t some independent design firm you call when you want something sleek. In fact, most of what’s in the iPod wasn’t designed by Apple at all, and while they could do much the same with a mythical iPhone as they did with the iPod (cobble a bunch of tech from others into something cool), the growth just isn’t there in that market. Everyone is already eating at each others’ share.

You also have the fact that phones, while they can access the internet, are by and large massively dependent on the subscriber network. If cell phones were using VOIP and plugging into massive 802.11g meshes, it might be a different story. But they’re not, so in creating an iPhone, Apple would have to pick a network and play by its rules, or they’d have to pick several… the entry costs here are just too high. What people are expecting this to look like is just not in the cards.

As for the second one, well, that’s a little more complicated, as there are two fascinating things going on with music and mobile phones right now:

* It started with ringtones; some became incredibly popular, and then people started creating their own. The phone companies started selling ringtones and, crazily enough, people started buying them en masse. Back in April we saw the first ringtone-only album released.
* The mobile phone market has gone from a high-growth market into a massive sucking black hole of feature consolidation.

The latter is the truly fascinating one, as we’re watching cell phones eat up markets from the bottom like Ruben Studdard at a buffet; they’re bottomless pits and have become the poster child of convergence.

There are three main things leading to cell phones becoming these feature-vortexes:

* To a certain extent this is always going to happen as technology progresses and prices fall for a given tech. When you’re buying a $29 webcam, you have to start wondering just how much of that $29 is overhead from everything surrounding the sale. It has to be packaged, shipped around, go through a few distributors… the actual technology is in the sub-$5 range. Past a certain point, adding in features starts to come ‘for free’ and products start looking to converge.

* Mobile phone makers are getting squeezed; for the most part, phones are entering commodity status… when your high-tech product is being given away with a service plan, it’s a sure sign something is up. By adding in higher-res screens, games, microphones, cameras, etc., they can keep an elevated price point and higher margins. Without that, the tech would either be priced artificially high (making them ripe for a competitor to swoop in) or drop to a point where they’re paying you to buy it. As technology progresses, you start having trouble buying a phone that doesn’t come with stuff you aren’t interested in; you end up getting it anyway.

* If you take a look at your desk, there are lots of gadgets you want to take with you. Your mobile phone of course, your PDA, your MP3 player, your USB pen-drive, your digital camera. But out of all of these, there’s only one that you generally have with you at all times: your phone. Everything else is secondary; if you had to pick one thing, chances are it’s going to be your phone. If your phone just happens to also be a serviceable PDA…

I know, convergence products generally suck. It’s old news; dedicated devices are easier to use as the interface isn’t multifunction, and the components are geared towards the task at hand… a $500 digicam is going to have better DSPs, better optics, and a better interface than a $500 phone that happens to include a camera. But the word here is serviceable. If it’s “good enough”, and you’re going to need your phone with you anyway, you at first carry around the extra gadgets and then eventually make what’s on the phone work and save some pockets.

Everyone laughed at the comparably monstrous-sized Treo line of hybrid phone/PDAs until they started to sell really, really well. And then companies like Sony, arguably one of the more innovative players, started pulling out of the PDA market altogether. Right now people are laughing at not being able to buy a PDA or cell phone without getting a damn camera, but low-end camera makers aren’t laughing. There are valid reasons for owning a separate DVD player, but if you hadn’t bought one and already had a Playstation 2, the likelihood of you buying one just dropped through the floor.

People aren’t giving up their big fat digital SLRs, but they’re finding the cameras in their phones and PDAs just keep getting better… and eventually they stop carrying that nice little slim camera with them or never find a need to buy one. And, you guessed it, phones are starting to come with MP3 players…

An iPod Mini is going to make a much better mobile music player than your cell phone. But when your cell phone has 5 gigabytes of storage and Bluetooth headphones… the writing is on the wall here. All that’s missing is a little time. Apple is one solid-state storage breakthrough (and the networks getting their act together on 3G) away from having the market for the iPod evaporate to a pale shadow of its former glory, and they know it.

That’s why they’re so freaked out about what RealNetworks is doing, even though it’d sell iPods. At the end of the day it’s not going to be about who is selling which end-play device; it’s going to be about who is sitting in the middle. And Apple wants to be that benevolent dictator, parceling out DRM-protected content to whatever device you’re using at the time. It’s also why the deal with Motorola is so significant; Apple can live without you buying an iPod, but if you’re going to be buying DRM-protected content, Apple damn sure wants it to be through them.

The iPod might only have a few high-growth years left in it, but the iTunes store is the sleeper. Right now, the iTunes Store sells ~2% of the legally purchased music sold in the USA. This is a market that is growing by leaps and bounds; imagine if Apple sold 2% of the legally purchased music world-wide. And then 5%. And then 10%. And everything is DRM’ed, meaning if you want to make a device that plays back the content, you’re paying them… let alone their own tailored-to-FairPlay devices like AirTunes, which only works with the Airport Express…
Watch the hands

There’s an old adage about magicians: if you want to learn the trick, close your ears and open your eyes. Well, it might not go exactly like that, but that’s the lesson I took from it.

When people are talking, you have a natural inclination to look at their eyes, and if they’re doing something with one hand chances are you really need to be watching the other if you want to see what they’re really up to. In other words, watch the hands. And Apple is particularly adept at misdirection…

Witness the Palm scenario. After the Newton was put in its grave, PDAs suddenly got really, really hot and Apple was doing lots of neat industrial design things. They were continuously asked about creating their own PDA, but they basically dismissed the entire market as irrelevant. Steve Jobs gave the infamous “Why would anyone want to use a little scribbly thing” line… but we now know that around that time Apple was seriously trying to buy Palm. Interesting, that.

They’ve also gone out of their way to talk about what a loss-leader the iTunes store is, how they make literally nothing from it and how much back-end work it took to make it a reality. Bandwidth, servers, credit-card companies… anyone else would be crazy to do it. Interesting, that.

Speaking of interesting, Steve Jobs gave an interview with Mossberg recently where he was asked about movies:

“The interesting thing about movies though is that movies are in a very different place than music was. When we introduced the iTunes Music Store there were only two ways to listen to music: One was the radio station and the other was you go out and buy the CD.

Let’s look at how many ways are there to watch movies. I can go to the theater and pay my 10 bucks. I can buy my DVD for 20 bucks. I can get Netflix to rent my DVD to me for a buck or two and deliver it to my doorstep. I can go to Blockbuster and rent my DVD. I can watch my DVD on pay-per-view. I can wait a little longer and watch it on cable. I can wait a little longer and watch it on free TV. I can maybe watch it on an airplane. There are a lot of ways to watch movies, some for as cheap as a buck or two.

And I don’t want to watch my favorite movie a thousand times in my life; I want to watch it five times in my life. But I do want to listen to my favorite song a thousand times in my life.”

He went on to mention how there might not be the same “opportunities” for the movie industry as there were for the music industry, but the above is what I’d saved. While it’s perfectly solid logic, to my admittedly paranoid mind what it says is that Apple is in some really hot and heavy talks with the MPAA and the movie studios right now. One thing Apple isn’t mentioning: when it came to actually getting legal music online at the time, it was cumbersome, laden with heavily-restricted DRM and just a general pain in the ass.

The whole process, until the iTunes Store, was needlessly complex and convoluted. Remember, iTunes wasn’t the first online music store; it was the first that was successful.

There are some other pieces here; witness H.264/AVC, which I blogged about earlier in my… usual way… which probably means there’s no way you got through it all. So to recap some of what we learned that’s pertinent:

* Around a 30-40% bitrate (bandwidth) reduction over MPEG-4
* Massively streamlined networking; it’s absolutely ideal for various embedded devices
* Intel and others have been talking about working H.264 over home wireless networks since 2003
* It brings the bitrate into line for high-quality video over standard home broadband connections
* Artificial Intelligence will be born of an aberrant and bored ActiveX control.
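To put that first bullet in concrete terms, here’s a back-of-the-envelope sketch of what a 30-40% bitrate cut buys you for a two-hour movie; the 1500 kbps MPEG-4 baseline is purely an assumed, illustrative figure:

```python
# What a 30-40% bitrate reduction means for a 2-hour movie.
# The 1500 kbps MPEG-4 baseline is an assumed, illustrative figure.

def size_gb(kbps, seconds):
    """Stream payload in gigabytes for a given bitrate and duration."""
    return kbps * 1000 * seconds / 8 / 1e9

movie_seconds = 2 * 60 * 60
mpeg4_kbps = 1500

print(f"MPEG-4 @ {mpeg4_kbps} kbps: {size_gb(mpeg4_kbps, movie_seconds):.2f} GB")
for reduction in (0.30, 0.40):
    kbps = mpeg4_kbps * (1 - reduction)
    print(f"H.264  @ {kbps:.0f} kbps: {size_gb(kbps, movie_seconds):.2f} GB")
```

Shaving a third or more off the payload is exactly what nudges DVD-ish quality from painful to plausible over a home broadband connection.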

Hmm… embedded devices. Apple sells one of those now, don’t they? That nifty little $150 product called Airport Express, featuring AirTunes; plug it in near your really nice stereo, jack in the audio from the Airport Express and wirelessly stream your (encrypted) FairPlay-DRM’ed AACs straight from iTunes.

And people with really nice stereos often have really nice home entertainment systems, and it really wouldn’t take a whole lot to add some video out and a beefier chip in a new version. Besides, if you’re using it for its AV functionality, chances are you have no interest in the USB port.

Wouldn’t that be nice? If you’re going to watch your home movies… why limit yourself to your computer? You could have the same ‘living room’ button right in iMovie. Sure, you could rip them to a DVD, but that takes a surprisingly long time.

And, while the H.264/AVC codec is heavy, we have G5 iMacs coming soon and the computer doesn’t have to rip the entire thing to H.264/AVC; it just needs to be able to do ~24fps plus a buffer. Using the Baseline Profile of H.264/AVC and giving up some bitrate, a 1.6-1.8GHz G5 iMac is going to be taxed out but should be able to pull it off at 720×480 (DVD sizes) if Apple really goes all out on the optimization side. Standard-Definition TV sizes (352×288) wouldn’t be a problem at all.

Of course that wouldn’t really be necessary if you’re actually buying something through the iMovie store; then it just needs to be streamed with a suitable buffer… and H.264/AVC is all about streaming. iMovie Store -> Computer -> Airport Express Rev.B -> TV. You may want it to spool to disk while it’s streaming for future viewing or other TiVo-ish things, but as Jobs said, how often do you watch a movie? So it’s not going to hang around that long; at most we’re talking about some fine points changing in the FairPlay DRM scheme.
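The stream-with-a-buffer idea comes down to a simple relationship between the stream’s bitrate and the link speed; a minimal sketch, with all the figures assumed for illustration:

```python
# Start-up buffering needed so playback never stalls: total bits required
# (bitrate * duration) must not outrun bits delivered (link * elapsed time).
# The bitrates and durations below are illustrative assumptions.

def prebuffer_seconds(bitrate_kbps, link_kbps, duration_s):
    """Seconds to prebuffer before playback can start safely."""
    if link_kbps >= bitrate_kbps:
        return 0.0  # link outpaces the stream; play immediately
    deficit = bitrate_kbps - link_kbps
    return duration_s * deficit / link_kbps

movie = 2 * 60 * 60  # a two-hour movie
print(prebuffer_seconds(900, 1500, movie))  # fast link: no wait at all
print(prebuffer_seconds(900, 768, movie))   # slower DSL-ish link: a long wait
```

Which is the whole point of getting the bitrate down: once the link is faster than the stream, the buffer shrinks to nearly nothing and “watch now” becomes real.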

Now there was another little nugget that Jobs threw out, and that was regarding the actual opportunity in the space due to the variety of ways you can get your movie fix. This is true, but kind of overstates things a bit, and something like the iMovie Store would make a lot of things drastically more streamlined, even if you hardly changed the interface from the iTunes setup at all.

Hell, iTunes has music videos now, which are practically the equivalent of trailers… and really, while Jobs has a very good point about the frequency with which you’ll be watching stuff, for the most part that just means you’ll rarely want to actually own something forever if the price is right.
A paradigm with legs

When it comes to movies, things have gotten a little over the top in the DVD world. You often can’t get ‘Extended Edition IV’ of something at your local rental shop, which means you have to buy it. Local shops, while having a big selection, don’t have everything… which means you have to buy it online. And even then you often don’t want to buy it: you’ve probably already seen the movie three times in its various forms, and you might just want access to the special ‘making of’ features, not four copies of the same movie with 5 minutes of extra footage in each edition. You just want the really new stuff, which you can’t buy separately.

If it’s local, it also assumes that you haven’t been drinking with friends late at night when someone says “Oh, I’ve never seen that” and it’s decided that, for whatever reason, that person’s life simply can’t continue properly until they’ve seen The Adventures of Buckaroo Banzai. You can use NetFlix, but you have to wait until it’s in your slot and available, then shipped. Same for ordering online. Remarkably similar to where music was, eh?

And then there’s TV, which is starting to develop a love affair with DVD in a big, big way; except it’s still kind of a bitch. I’ll use Farscape as an example, since I’m an unabashed fan (though I’ll spare you most of that), and it’s a great illustration of how screwed up things are here.

There were four seasons of Farscape, and you have a few options for viewing now:

* Wait and re-watch it on TV
* Buy a ‘Complete Season’ which includes all the episodes for that season for ~$100-$130. ($130 * 4 = ~$520)
* Buy a ‘Season Collection’ which includes 4-5 episodes of that season for ~$30… these were generally released first, with around 5 collections per season.

Feel free to substitute your personal TV show of choice in the equation, and one of the above options might very well make sense for you. But for myself it’s just annoying as hell.

You see, I’ve already seen a ton of Farscape episodes, and while it’s one of the few shows I don’t mind re-watching too badly, I don’t have a big burning desire to. I want to see the episodes I haven’t seen, so a while back I went to TV Tome, looked through the episode guides and compiled a list of the episodes I hadn’t seen. These are a smattering of Season 1, a larger smattering of Season 2, one or two of Season 3, and one of Season 4.

In a few years I might want to watch them all again, but I don’t really have the time to just watch the Sci-Fi listings to see when a particular episode might air and hope I’m around. I could buy the Season Collections that contain the episodes I want, but then I’m paying for a bunch of episodes I’ve already seen. I could just be simple about it and buy the Complete Season collection, but then I’m paying for a ton of episodes I’m not interested in watching right now.

I was in the same boat with Arrested Development (I’m eternally grateful to Jane for turning me onto it) but luckily enough it’s early in its run, and they replayed them back to back all the time so I was able to catch the two episodes I’d missed. If I could simply open up the iMovie Store, pull up the episodes (with info and synopsis!) and watch when I had the time I’d be in heaven, even if I had to pay a bit.

My only other real option is to turn to BitTorrent or something similarly illegal, which, while it works well enough, pretty much removes all the immediacy from the decision. Again, this is all remarkably similar to where the recording industry was a bit ago; lots of bundling, and everything is a much bigger hassle than it needs to be.

The only real differences are the amount of data involved (an AAC at 128 kbps versus an H.264/AVC stream at 500-1000 kbps) and the fact that a movie is much longer in duration than a music track. Apple has already worked out payment; they’ve got a big lead on the DRM, and H.264/AVC brings the bitrate into line for what you’d need over a broadband connection. This isn’t to trivialize the work that it’d take, just that we’re talking an evolutionary leap here; much of the hard stuff has been worked out.
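That difference in scale is easy to put numbers on. A quick sketch; the track and movie durations and the 750 kbps H.264 midpoint are assumed figures, not anything Apple published:

```python
# Comparing the payload of an iTunes-style track with a feature film.
# Durations and the 750 kbps H.264/AVC midpoint are illustrative assumptions.

def megabytes(kbps, seconds):
    """Payload in megabytes for a given bitrate and duration."""
    return kbps * 1000 * seconds / 8 / 1e6

song = megabytes(128, 4 * 60)        # 4-minute AAC at 128 kbps
movie = megabytes(750, 2 * 60 * 60)  # 2-hour H.264/AVC at 750 kbps

print(f"song:  {song:.1f} MB")
print(f"movie: {movie:.0f} MB")
print(f"ratio: {movie / song:.0f}x")
```

A movie is a couple of hundred times the payload of a song, which is exactly why the H.264 bitrate reduction and broadband penetration are the gating factors.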

If this sounds too pie-in-the-sky for you, or too reminiscent of the cable companies’ promises in the late 90s of video-on-demand that never materialized… it never materialized because the technology and the infrastructure were never really there. It was primarily bluffing against the internet hype. Remember that for a while there the telcos were laying fiber like they’d skipped the chapter in business involving the railroad boom & bust back in the 1850s.

All the pieces are here for this now, and you’re going to be seeing it very, very soon. You’ll probably see it first from the satellite companies, but what they’re doing in places like China and South Korea right now is absolutely amazing… and Apple wants to be the gatekeeper here.
Inching towards the endgame

If I’m even close to right, look for more deals with phone makers as time goes on; the reason they’re partnering with Moto first is that Moto’s next generation of phones is the most dangerous to them here (well, in the USA). The companies are really just starting to get their acts together in terms of 3G, mostly due to increasing competition and the increasing demands that come with higher-end features… when your camera phone has a 5-megapixel CCD, emailing that thing off is a chore.

And there is no real blame here; the iPod’s era of growth being cut short isn’t due to any fault of Apple’s, and they aren’t the only ones being caught in this squeeze. There’s remarkably little they can really do to save the iPod long term. Instead of letting the phone suck in the iPod, they could ‘let the iPod suck in the phone’ and add the functionality to it. But when you stop and think about that idea, besides noticing that it’s almost Buddhist in nature, you’re left with the problem of everything else the phone is converging with.

But one can take heart that they’re recognizing the danger very, very early. It’s telling that they’re not only licensing the playback of FairPlay-DRM’d tech to Moto, but also building the playback software that will ride on top of it. That’s the long-term endgame they’re moving towards, and the iPod, AirTunes and other things to come will be pawns in that game; they’ll all reinforce Apple’s DRM even if it costs some sales.

If you’re having trouble picturing that endgame, think of Microsoft’s ill-fated HailStorm initiative. One part of it involved them holding all of your personal information in escrow, including payment information, so they’d be your gateway to purchasing anything on the internet, all the while siphoning off pennies here and pennies there.

They’ve also recently been working hard to incorporate DRM into the BIOS of your motherboard and pervasively through the operating system… partly in an effort to put a hurt on piracy and the like, but very much also an effort to court the media companies. You see, Microsoft makes money when people decide they need (and buy) new computers, and people don’t buy new computers to be able to browse the web faster (unless they’re using OS X).

Apple is playing towards that exact same endgame, but with a twist: they’re creating a new light-DRM platform that rides on top of everyone else’s platform. iMacs, Windows, mobile phones, everything. Google is also creating a platform riding on the backs of other platforms… except theirs is based around becoming the access point for all things internet. Apple wants that, but for DRM’d content.

They weren’t kidding around with their vision of the computer as a hub for your digital life, they just forgot to mention that the hub will come with a lock. And guess who owns the keys?

Camera phone helps label snaps

* 23 December 2005
* news service

KNOWING where you are, what time it is and who you are with is obviously a huge help when it comes to filing a photograph in your collection. It also happens to be information that can now be compiled by any Bluetooth-enabled camera cellphone.

The phone will allow the growing number of camera phone users to organise their digital photo albums by automatically identifying and labelling the people and places within each snap, as they are taken.

The concept, being developed by Marc Davis of Yahoo’s Berkeley research lab in California, is based on a central server that registers details sent by the phone when the photo is taken. These include the nearest cellphone mast, the strength of the call signal and the time the photo was taken.

The system also identifies the other Bluetooth-enabled cellphones within range of the photographer and combines this with the time and place information to create a shortlist of people who might be in the picture. This can then be combined with facial-recognition algorithms to identify the subjects from the shortlist.

Facial recognition software on its own can only identify people with 43 per cent accuracy from the grainy shots taken by camera phones, but in tests Davis and his team found that by combining it with context information the system could correctly identify people 60 per cent of the time. The context information can also be combined with image-recognition software to identify places within photos.
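The fusion Davis describes can be sketched roughly: weight each face-recognition candidate by whether context (co-present Bluetooth phones, time, place) puts that person near the shot. Everything below — the names, scores, and the boost factor — is an illustrative assumption, not Davis’s actual algorithm:

```python
# Toy sketch of context-boosted face recognition: candidates whose phones
# were detected near the photographer get their raw face-match score
# boosted. Names, scores and the boost factor are all made up.

def rank_subjects(face_scores, nearby_people, boost=1.5):
    """Re-rank face-recognition candidates using Bluetooth co-presence."""
    combined = {
        person: score * (boost if person in nearby_people else 1.0)
        for person, score in face_scores.items()
    }
    return sorted(combined, key=combined.get, reverse=True)

face_scores = {"alice": 0.4, "bob": 0.5, "carol": 0.3}
nearby = {"alice", "carol"}  # phones seen in Bluetooth range of the shot

print(rank_subjects(face_scores, nearby))
# alice's 0.4 * 1.5 = 0.6 now beats bob's unboosted 0.5
```

The point is that weak evidence from a grainy photo plus weak evidence from co-presence can together beat either signal alone — which is how 43 per cent accuracy climbs to 60.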

Linux Bluetooth hackers hijack car audio

Linux hackers have demonstrated a way to inject or record audio signals from passing cars running insecure Bluetooth hands-free units. The Trifinite group showed how hackers could eavesdrop on passing motorists using a directional antenna and a Linux Laptop running a tool it has developed called Car Whisperer.

The software was demonstrated during a Bluetooth security talk at last week’s What the Hack hacker festival in The Netherlands. Trifinite has developed a specialism in unearthing Bluetooth security shortcomings, the latest of which illustrates implementation problems rather than deeper-seated security concerns with the protocol. Car Whisperer only works because many car manufacturers use standard Bluetooth passkeys such as “0000” or “1234”, which are easy to guess. “This is often the only authentication that is needed to connect,” according to Trifinite.

Once connected, hackers can interact with other drivers or even eavesdrop on conversations inside other cars by accessing the microphone. And that’s just for starters.

“Since the attacker’s laptop is fully trusted once it has a valid link key, the laptop could be used in order to access all the services offered on the hands-free unit. Often, phone books are stored in these units. I am quite certain that there will be more issues with the security of these systems due to the use of standard pass keys,” Trifinite notes.
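The underlying weakness is trivially small guess space. This is not Car Whisperer’s actual code — the pairing check below is a simulated stand-in, not a real Bluetooth API — but it shows why a handful of factory-default passkeys is all an attacker needs to try:

```python
# Why fixed default passkeys are so weak: the guess list is tiny.
# This simulates trying common PINs against a pairing check; the
# accepts_pin callback is a stand-in, not a real Bluetooth API.

COMMON_PINS = ["0000", "1234", "1111", "8888"]

def crack(accepts_pin):
    """Return the first common PIN the device accepts, else None."""
    for pin in COMMON_PINS:
        if accepts_pin(pin):
            return pin
    return None

# Simulated hands-free unit shipped with the factory default "1234".
device = lambda pin: pin == "1234"
print(crack(device))  # → 1234
```

Against a randomly chosen PIN the attacker would face 10,000 four-digit possibilities; against a fixed factory default the search collapses to a list of three or four.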

Reproduced from an article published by SecurityFocus
© 2005 SecurityFocus

BlueTooth Hacking: Step by Step Guide

You have heard of BlueSnarfing, but how does it actually work? Cryptonomicon has a nice guide on Bluetooth hacking.

Bluejacking is a mostly harmless activity. Though it is an unintended use of a technical feature, most hard-core geeks do not find sufficient technical challenge in it. For the more serious hacker looking to explore the security features of their handset, more technically demanding sport is required.

In summary, the steps are:

1. Read “War Nibbling: Bluetooth Insecurity” for an overview
2. Get BlueZ, a Bluetooth networking stack that runs on Linux
3. Investigate the security characteristics of your handset through the BlueTooth Security Database or BlueStumbler
4. Use BlueSniff and RedFang to eavesdrop on Bluetooth conversations
5. Finally, use BTScanner to query your device and report common settings