
The Future of Language (graphlang)

by Stephen Malinowski


The future of human language, considered in light of the history of language, the history of communication technologies, and recent trends.

Features of spoken language

Spoken language can do some things better than others. For example:

Speed. For spontaneous communication, nothing is as fast as speech, though some things come close: shorthand, sign language, and the court reporting machine. Some people have also gotten nearly as fast at typing as at speaking, so there are contexts in which typing can serve as a “real-time” form of communication (e.g. online chat rooms).

Portability. Mouths and ears are permanently attached to our heads, they’re seldom busy with other things, and they function reliably under a wide range of conditions.

Visual. Speech is not visual. If you consider how much “visual intelligence” we have, you’ll realize what a profound limitation this is. We often supplement our speech with manual gestures (e.g. “it was THIS big”), facial expressions, pictures and diagrams, and other visual aids. A picture is sometimes worth a thousand words, a tremendous efficiency if it takes only a few seconds to look at the picture. Pictures are often more memorable and more compelling than speech.

Animated. Speech cannot easily convey a sequence of images, a motion, a gesture. (Try to tell someone how to tie their shoes!)

Permanence. Speech is not permanent; once you’ve said something, it may be remembered, but the original is gone.

Duplicated. When you say something, you are saying it once, for the people who can hear you. For someone else to hear it, either you or somebody else will have to say it again.

Remote. You speak to the people “within earshot.” Others cannot hear you.

Rearranged. When you say something, it comes out in the order you said it; if you want to express it in some other order, you have to say it over again in that order. This can be an asset or a liability. If you want your listener to consider something a certain way, you can lead them through in a particular order, and present things in an order that’s easier to understand, or more convincing, than another. However, if the thing you’re describing has many relationships that exist at once, you may have to make several trips through to get everything covered, and your listener may lose track of things.

Features of language-related innovations

Now, let’s see how various language-related innovations stack up:

                  Fast Portable Visual Animated Permanent Duplicated Remote Rearranged
Spoken Language    X      X
Gesture            X      X       X       X                                     X
Sign Language      X      X       X       X
Writing, Typing    X                                X                  X
Printing                                            X         X        X
Drawing                           X                 X                  X
Photography                       X                 X         X        X
Telephone          X                                                   X
Cell phone         X      X                                            X
Fax machine                       X                 X                  X
Word Processor                                      X                           X
Email                                               X         X        X        X
Online Chat        X                                          X        X
Desktop Pub.                      X                 X         X                 X
World Wide Web                    X       X         X         X        X        X
Film                              X       X         X         X        X
Television                        X       X                   X        X
Multimedia                        X       X         X         X        X
PowerPoint                        X       X         X                           X

Apart from the cell phone, which simply carries speech, only gesture and sign language share the speed and portability of spoken language, and these features count for a lot: nearly every human being communicates fluently with some combination of speech, gesture, and sign. Could we develop a visual language that was as fast and portable as speech? I believe that, as computers become more ubiquitous, we will.
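
The table above can also be treated as a small data set and queried. The following is only a sketch: the feature sets are transcribed from the table, while the names FEATURES and with_features are illustrative choices, not anything from the original analysis.

```python
# Feature sets transcribed from the table above. The names FEATURES and
# with_features are illustrative assumptions, not part of the article.
FEATURES = {
    "Spoken Language": {"fast", "portable"},
    "Gesture": {"fast", "portable", "visual", "animated", "rearranged"},
    "Sign Language": {"fast", "portable", "visual", "animated"},
    "Writing, Typing": {"fast", "permanent", "remote"},
    "Printing": {"permanent", "duplicated", "remote"},
    "Drawing": {"visual", "permanent", "remote"},
    "Photography": {"visual", "permanent", "duplicated", "remote"},
    "Telephone": {"fast", "remote"},
    "Cell phone": {"fast", "portable", "remote"},
    "Fax machine": {"visual", "permanent", "remote"},
    "Word Processor": {"permanent", "rearranged"},
    "Email": {"permanent", "duplicated", "remote", "rearranged"},
    "Online Chat": {"fast", "duplicated", "remote"},
    "Desktop Pub.": {"visual", "permanent", "duplicated", "rearranged"},
    "World Wide Web": {"visual", "animated", "permanent", "duplicated",
                       "remote", "rearranged"},
    "Film": {"visual", "animated", "permanent", "duplicated", "remote"},
    "Television": {"visual", "animated", "duplicated", "remote"},
    "Multimedia": {"visual", "animated", "permanent", "duplicated", "remote"},
    "PowerPoint": {"visual", "animated", "permanent", "rearranged"},
}


def with_features(*wanted):
    """Return the innovations that have every feature in `wanted`."""
    need = set(wanted)
    return [name for name, feats in FEATURES.items() if need <= feats]


print(with_features("fast", "portable", "visual"))
```

On this transcription, the query returns only Gesture and Sign Language, while the World Wide Web row has every feature except fast and portable.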


For those of you lucky enough not to have been exposed to it, PowerPoint is an example of what’s called “presentation software.” In its simplest (and, unfortunately, most common and most stultifying) form, a PowerPoint presentation is a series of computer-generated slides with text on them, accompanied by a person reading the slides to you in a droning voice, turning the incomplete sentences on the slides into complete ones. Microsoft may think I have an attitude problem but, in fact, I’m very impressed by PowerPoint. And I’m not the only one: thousands, perhaps millions, of businesspeople use it to make presentations. The reason for this is that PowerPoint gives them something previously available only to film makers and television studios: the ability to create a multimedia presentation to express their ideas.

The Future of Language: graphlang

PowerPoint, for all its popularity, is still a pretty crude tool. Compared to speech, it takes a long time to assemble a PowerPoint presentation, and the elements of the presentation are limited to text, graphics (which must be imported from other applications), and some simple charts. In the future, we will evolve a mode of communication that combines the speed, portability, and flexibility of speech with the graphical power of television. The result, a graphical language (which I will call graphlang), will be adopted because, in certain contexts, it will be more efficient, more precise, more expressive, more universal, and more persuasive than spoken language.

How might graphlang evolve?

To imagine what graphlang might be, let’s start with computer voice recognition, and add features from there.

Once computers can reliably recognize what you’re saying, they will be able to provide various useful services, such as recording it. Recording your speech as text is much more useful than recording it as sound. The computer could search its archive for things similar to what you are saying now, and remind you of who said them, and of other things that are related. It could also show you (and, with your permission, your audience) images (still or animated) that had previously been associated with what you are now saying. So, if you were telling somebody about your trip to Japan, your words could result in the computer displaying photographs you shot on your trip. The computer could also learn which photos you preferred, and when, so that, as you told various people about your trip, the narrative would evolve collaboratively, with the computer reminding you what you might want to say, and you telling the computer what you wanted it to help you show. And, of course, the material supplied by the computer needn’t be limited to photographs; anything could be included: music and other sound recordings, spreadsheets, diagrams, etc.
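
As a rough illustration of the idea, here is a minimal sketch of an archive that links spoken words to media and learns which items the speaker tends to choose. It assumes voice recognition has already produced a text transcript; the Archive class, its methods, and the file names are all hypothetical.

```python
from collections import defaultdict


class Archive:
    """Hypothetical archive linking spoken keywords to media items."""

    def __init__(self):
        # keyword -> {media item -> how often the speaker chose it}
        self.scores = defaultdict(lambda: defaultdict(int))

    def associate(self, keyword, item):
        """Record that `item` was shown while `keyword` was being said."""
        self.scores[keyword.lower()][item] += 1

    def suggest(self, transcript, limit=3):
        """Rank items by their total score over the words in the transcript."""
        totals = defaultdict(int)
        for word in transcript.lower().split():
            word = word.strip(".,!?")
            for item, score in self.scores.get(word, {}).items():
                totals[item] += score
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))
        return [item for item, _ in ranked[:limit]]


archive = Archive()
archive.associate("Japan", "kyoto_temple.jpg")
archive.associate("Japan", "tokyo_street.jpg")
archive.associate("Japan", "kyoto_temple.jpg")  # chosen again, so it ranks higher
archive.associate("trip", "airport.jpg")

print(archive.suggest("Let me tell you about my trip to Japan."))
```

Counting how often an item was chosen is the simplest possible stand-in for "learning which photos you preferred"; a real system would presumably also weigh who the audience was and when the item was last used.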

Once you got used to using a computer to help you create documentaries of your experiences, you’d want more expressive control, like a cinematographer has. Simple controls (cuts, dissolves, fades) would lead to more sophisticated ones: zoom, pan, lighting, depth of field, etc. And like a cinematographer, you would learn the “language” and “rhetoric” of film: how to direct the viewer’s attention, how to pace the presentation to make it dramatic.

Of course, there are images you’d want to include that weren’t things you’d ever taken a picture of. There are lots of places you could get images to include:

  • from archives of images
  • from people you were talking to (that is, from their computer)
  • drawings or diagrams you’d made yourself
  • images the computer generated for you at your direction

    The last of these is where graphlang starts to get really interesting. What kinds of images would you want the computer to create for you, and how would you specify them? Here are some possibilities, from simple to sophisticated:

  • Gestures. At the simplest, you could direct the computer to do expressive gestures, the cinematographic equivalent of smileys.
  • Simulations. If you were trying to describe a process or behavior, you could sketch a few “key frames” and have the computer generate the “tweens” (in-between frames), to create an animation. Or, you might specify the behavior algorithmically, and have the computer generate a simulation.
  • Iconels. The computer would have a library of iconographic elements (iconels), which you’d combine to create a tableau vivant, a living scene. The iconels would have various attributes and behaviors associated with them, so that they’d function appropriately in your tableau: a warning light would blink, a drain would suck things down it, etc. Like words, iconels would be developed by the users of graphlang, and would be passed from person to person.
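
The “tweens” mentioned under Simulations can be sketched in a few lines. This example assumes the simplest case, straight-line (linear) interpolation between two key frames, each given as an (x, y) position; real animation tools typically add easing curves and more elaborate frame data. The function name tween is an illustrative choice.

```python
def tween(start, end, steps):
    """Linearly interpolate between two key frames.

    Each key frame is a tuple of numbers (here an x, y position).
    Returns `steps` in-between frames, excluding the key frames themselves.
    """
    frames = []
    for i in range(1, steps + 1):
        t = i / (steps + 1)  # fraction of the way from start to end
        frames.append(tuple(a + (b - a) * t for a, b in zip(start, end)))
    return frames


key_a = (0.0, 0.0)
key_b = (8.0, 4.0)
print(tween(key_a, key_b, 3))  # [(2.0, 1.0), (4.0, 2.0), (6.0, 3.0)]
```

You supply only the two key frames; the computer fills in everything between them.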

    Could graphlang be the universal language?

    Graphlang would not exist apart from verbal language. Rather, the two will co-exist. A graphlang “show” would almost always have a narration (and might even be meaningless without it), and in some cases might only serve to punctuate what was spoken (like a gesture).

    However, graphlang will allow certain functions of words to be eliminated in certain contexts. For example, the graphlang equivalent of “the explosion happened after John lit the fuse” might have these visual elements:

  • explosion
  • John
  • fuse
  • lighting action

    with the “after” provided either by a literal time sequence (a series of frames), or by an iconel that indicates temporal relations, or by a visual ordering that suggests the flow of time (such as: John, fuse, and lighting action on the left, explosion on the right).

    This method of expression opens up the possibility of a language which is very easy to translate, because it can be broken into two parts: things like objects (nouns) that can be simply and expressively represented iconically, and relations between objects, which would be expressed by visual relations. The objects would be relatively easy to translate, and the relations would not need to be translated.
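
One way to picture this two-part split is as a data structure: a list of objects plus a list of relations between them. The sketch below is an assumption about how such a scene might be stored, not a description of any existing system; the Relation class, the relation kinds, and the label tables are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    kind: str    # e.g. "acts_on" or "before"; shown visually, never translated
    source: str  # identifier of an object or event in the scene
    target: str


# Hypothetical scene for "the explosion happened after John lit the fuse".
objects = ["john", "fuse", "lighting", "explosion"]
relations = [
    Relation("agent_of", "john", "lighting"),
    Relation("acts_on", "lighting", "fuse"),
    Relation("before", "lighting", "explosion"),
]

# Only the object labels need a per-language lookup (illustrative entries).
LABELS = {
    "en": {"john": "John", "fuse": "fuse",
           "lighting": "lighting", "explosion": "explosion"},
    "es": {"john": "John", "fuse": "mecha",
           "lighting": "encendido", "explosion": "explosión"},
}


def render(language):
    """Label the objects for one language; relations pass through unchanged."""
    labels = LABELS[language]
    return [labels[name] for name in objects], relations


english_objects, english_relations = render("en")
spanish_objects, spanish_relations = render("es")
print(spanish_objects)
print(english_relations == spanish_relations)  # True
```

Translating the scene touches only the label lookup; the relations, which would be drawn as positions, sequences, or iconels, are identical in both renderings.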

    Would people want to use graphlang?

    Talk is cheap; at least, it’s easy: we just open our mouths and out it comes. It’s hard to imagine graphlang evolving to the point where it was easier than speaking. Would we be willing to put in more effort to communicate with graphlang?

    I think so, in certain contexts, anyway. People are already using PowerPoint, after all! And there are lots of other contexts in which people use computers to have a conversation. An accountant, for example, may discuss a client’s options by showing simulations using a spreadsheet program. Likewise, an architect may use a simulation to show a client a proposed living space. It wouldn’t be necessary for graphlang to be adopted universally for it to be useful. As few as two co-workers who needed to develop their own way of talking about what they were working on might use graphlang.

    People with big budgets and a wide audience have been using the language of film to “win friends and influence people” on TV for a long time now; they recognize the persuasive power of the image, especially the animated image. So, it seems inevitable that some kind of multimedia presentation tool will evolve to allow people to express themselves “more graphically,” and therefore, more persuasively.

    I’ve spoken of graphlang mostly in terms of functionality. There’s also the potential for an aesthetic side to it, if you have beautiful images to express your point of view.

    How far-fetched is this?

    How far-out is the idea of graphlang? To me, it seems reasonable, and likely, in some form or other. But there’s no question that it is quite different from how we communicate now. However, there have been great revolutions in language in the past. For example, ask yourself:

  • Could a person before the age of writing imagine writing a novel?
  • Could a person before the age of the printing press imagine a web page and the Internet?
  • Could a person before the age of electricity imagine a feature-length motion picture?
  • Could a person who’d never heard music imagine the effect created by a symphony orchestra?

    Drop me a note

    I’ve put this article up on the web to get a conversation going. If you'd like to join in, please drop me a note.


    Following are some things I wrote about graphlang that didn’t make it into the first draft (above).
    There have been various technological innovations at the fringes of language, but we’re soon going to be living in a world in which computer technology will be ubiquitous enough (that is, it’ll be almost everywhere) that using a computer to help you communicate will be something everyone does. In the past, language has evolved (by people moving around in the world, sharing words, mixing languages together to form new ones, inventing new words to cover different situations, etc.). In the future, the change will be in us, or rather, in the tools we have to do language with. In the past, we’ve done language with our mouths and ears. (There’s also been sign language, which works much the same way, just using hands to “say” words and eyes to “hear” them.) But we’ll start developing ways of using a computer to show what we mean. Think about the difference between apes and humans, due to language. Apes grunt and gesture, and they can communicate a certain amount of stuff, but words are a thousand times more informationally dense than gestures. There’s the saying “a picture is worth a thousand words.” What will it mean when we can “talk” in pictures, using a computer? Will we become a thousand times as able to express ourselves? What will this mean? Anyway, you can see why I’m excited about the prospects ...
    Metaphor is a way to bring images into language ... graphlang is another way.
    From a conversation with Octaviano Romano:

    [You have read at least a little Birkerts?]

    Had only encountered him peripherally until this morning; now I've seen him head on ... a little, at least. Some true, some over-reacting, some insightful, some stupid ... all provocative, but, from the perspective of what I've been thinking about, all beside the point. Sure, you can ask "what's going to happen to Language As We Know It?" That's valid. And the prospects are grim. But the prospects for something that will replace what's being lost -- or was never there in the first place -- are bright.

    Language evolved because it is more practical to trade verbal tokens of experience than to share experience directly. But that's been changing; I can show you a picture in a magazine. I can show you a videotape. These are slow, though, and spoken language is still unbeatable in the context of one-on-one, one-off communication (with written language being a close or distant second, depending ...).

    Not for long, though (at least, not long in the context of the history of language). In 1996, I made my first documentary video, as a substitute for talking in person. It was revolutionary. However, it took a week of pretty much continuous effort (writing the script, gathering the props, renting and borrowing equipment, filming) to produce it, and it's an effort I'm not likely to undertake again without a significant increase in motivational momentum. But the required effort is decreasing. A digital camera is tiny and portable; digital editing is simple. Eventually, all the hard parts will be eliminated, and anybody who wants to communicate by documentary will be able to.

    However, a video documentary, as useful and efficient as it is, is 20th century technology. Most of the silicon-assisted language enhancements (and replacements) have yet to be developed. Utterances and gestures are still the main forms of expression that we can create spontaneously, in real-time. When we can do the same with images, when we have a store of sharable images that is as accessible and flexible as our share of narratives, when we have syntactical tools for binding non-verbal thought ... then, we'll be able to say we're in a truly post-verbal age.

    Not that language will disappear. It will always maintain certain advantages (at least, until carbon is replaced by silicon altogether). We learn verbal (and gestural) language at a very early age, and it's very flexible. Being restricted to non-verbal, silicon-assisted tools would be like being restricted to using off-the-shelf software for everything, and not being able to write applications from scratch; there are always situations that nobody's anticipated, where you need to deal with the nuts and bolts.

    Not to mention that language and human interaction are inextricably intertwined.

    No, language will always be with us. But its role will become more managerial. As now, it will always be the glue that holds things together, the stick that pushes thoughts (and other things) along. But the variety of things that language pushes will increase dramatically ... the soup of expression will become much more heterogeneous.

    I agree that language is inefficient. But I don't find typing slow compared to speech; when I write email, I spend most of my time thinking (and editing). What typing is slow compared to is: communicating contexts.

    For example, I was just thinking about something I've tried recently at work: being the "scribe" during company meetings. I would take a laptop to the meeting, and take notes; the idea was that these notes would be distributed afterwards, so that people wouldn't have to take notes themselves, and nothing would drop through the cracks. (There were lots of reasons that it didn't work, but typing fast enough was not one of them!)

    Anyway, I conceived this context in a few seconds, but it took more than two minutes to figure out a way to get it into words that would make sense to you.

    From a conversation with Octaviano Romano:
    OR>We tend to exalt vision as the highest sense.
    Recently, I've taken to conceiving of a different facility as being
    the most essential: sequencing.
    OR>have the visual arts not offered some compelling
    OR>visual languages or elements thereof?
    Absolutely.  For example, in film, there is the language of the motion
    of the camera.  Very developed, very subtle, very persuasive.  But also,
    very hard to do (compared with tossing together a sentence).
    OR>You're suggesting linguistification of 2D imagery.
    OR>But you're suggesting something more than a two-dimensional
    OR>version of sign-language.  Real pictures ...
    OR>Not just still images, not just little multimedia clips;
    OR>maybe something with the power of a documentary, as in yr experiment,
    OR>but easier to compose
    "Compose" has two meanings here, and I want the looser of the two.
    I can toss off sentences with ease.  See?  Easy.  Likewise, I can toss
    off musical phrases at the piano (as can any jazz player).  There may
    not be a whole lot of meaning in these utterances, just a noncey, local
    meaning.  This is the aspect of "composing" I'm getting at.  My toss-off
    sentences aren't novels, and my musical licks aren't symphonies, or even
    compositions.  But novels and symphonies are possible because writers and
    composers have fluency with the basic elements, and can toss off expressive,
    useful pieces.
    OR>Are we to build new imagery from small, sensible graphical units/words,
    OR>or what?  Not sign-language style, but some way to de/construct imagery
    OR>using agreed units.
    Yep.  There need to be two separate types of syntax: the syntax of images,
    and the syntax of (I'm guessing) graphic-specifying gestural and verbal commands.
    OR>What sort of units and how to build with them?
    OR>Maybe these won't always be identifiable as units; maybe the trick of
    OR>'words' won't necessarily apply; but the relational aspect always will.
    OR>Maybe look for clues in software like Photoshop: filters, layers, effects?
    No, I think the key is to start with gesture.  Imagine that you're doing
    something that's a cross between laying down dominoes, doing sculpture,
    doing a sketch: you can utter image tokens (and sequences of image tokens)
    that you've developed before, you can shape these tokens (either to change
    them, or to express a change or sequence itself), and you can create new
    tokens from scratch.
    OR>But the isolated word's 'meaning' is just
    OR>residue roughly recollectively indicative of the way it was used.
    OR>The word may have no intrinsic meaning other than that residue.
    Well, sure, but the "residue" may have been strengthened by lots of
    consistent use, and a linguistic structure (with support from other
    words) that makes it hang together.  I mean, the meaning of "is" may
    be characterized as residue, but it's a pretty powerful residue.
    OR>Pictorial elements may not have intrinsic meanings either.
    OR>They might seem to, when they can evoke a stronger response
    OR>than a written counterpart.  They might start with intrinsic
    OR>meaning, e.g. a photo clearly relates to its original referent.
    OR>But people will interpret reference & referent differently.
    OR>Start using pictorial or other visual elements linguistically,
    OR>and apparent meanings will shift as elements get reused in
    OR>other contexts.  Some might come to operate like 'words',
    OR>with a certain reusability, but many won't.
    What will be most powerful, of course, is the combination of pictures
    and words.  Words are great for directing mental traffic, establishing
    relations that aren't easy to depict visually.
    OR>So are we going to be focusing on binding visual elements,
    OR>or functionally processing them (filtering, layering, combining,
    OR>breaking, reconstituting, reinterpreting)?
    Yes, both.

    Verbal language is a self-contained, self-sufficient world: anything that happens in language, you can share via language. But there are things outside language that cannot easily be shared through language: images, sounds, etc. With graphlang, the range of things that are encompassed increases; less of our experience is “outside” language.

    Revision History

  • 1999sep15 original idea
  • 1999oct15 first draft of this page
  • 2000feb20 put it up on the web