Stephen Baker

The Boost
Why computers can't figure out words

July 5, 2010 - Jeopardy book

There was a time when, if you wanted a quick lunch, you told the cook behind the counter exactly what you wanted. "Easy on the onions, and why don't you slice one of those pickles real fine and put it in there, with just a dab of mayonnaise?" Then we industrialized the process, and the people behind the counter at McDonald's don't even have to know about food or money: They just hit a code for the order on the register.

J. Storrs Hall uses that example in Beyond A.I. to introduce what he calls "formalist float." We formalize information for efficiency, whether in communication or logistics, and in the process we lose customized detail. That's the price we pay for the systems we build. We attempt to codify justice into laws, education into curricula, and information into ones and zeros.

In each of these examples, there's a sizeable gap between what the system decrees and what the individual wants or needs, or is attempting to communicate.

I've been thinking about this formalist float as I write about computers struggling with human language. The "real" world, with all of its complexity, cannot be rendered accurately in symbols. That's why we use the symbols in the first place: to generalize. In a sense, each word we use is as imprecise as that key the McDonald's worker punches for the Quarter Pounder. There's something I'm thinking, and it's as unique to me as the sandwich with the pickle and the dab of mayonnaise. But you and I don't share a word for that exact thought. So I come as close as I can with our formal vocabulary, and then we use gestures, voice tone, context, and shared memories to narrow the gap between the formal and the individual case. It's that cultural negotiation that the computer cannot understand.

We need words to be inexact, because if they were too precise we'd each have a unique vocabulary of several billion words, all of them unintelligible to everyone else. (Maybe that's what animals have.) I'd have a unique word for the sip of coffee I just took at 6:59 on this fifth of July, which was flavored with the anxiety that I'd better get out on my bike before the day heats up. (That would be as useless to me as to everyone else. A word has to be used at least twice to have any purpose.)

If you think about it, each word is a lingua franca, a fragment of a clumsy common language we struggle with. Imagine I say that I'm "weary." I'm thinking one thing, and you might have a very different idea. Maybe I carried a load a long way in the sun. I may have a troubled child. I may have argued with my editor or spent fruitless hours trying to balance my checkbook. You certainly have different ideas, based on your own experience, about what "weary" means. In addition to all those different meanings, it might also send other signals to you. Maybe where you come from, it has a slightly rarefied feel, and you're wondering whether I'm signaling my sophistication. In any case, neither of us knows what the other is thinking. But that single word "weary" extends a tiny bridge between us.

Now, with that bridge in place, the word shared, we dig deeper to see if we can agree on its meaning. You study my expression and my tone of voice. That communicates a lot. Someone who has won the Boston Marathon might look contentedly weary. Another, in a divorce hearing, looks anything but. I may slack my jaw in an exaggerated way, illustrating the word with a gesture, as if to say, "Know what I mean?" In this tiny negotiation, we're bridging the formalist float. And closing that gap is the challenge for computers like IBM's Watson, the one I'm writing about.

As computers struggle to bridge the formalist float, millions of humans are making it even more difficult for them. We're distancing ourselves from formal structures. With shorthand and abbreviations in text messages, many of us are creating our own patois. Humans have done this forever. It's how Spanish, Portuguese, Italian and French all grew out of Latin. But technology is speeding it up. The meaning of a single emoticon--;>)--evolves day by day, tribe by tribe.

Verbally, we're making it even harder. I hear conversations all the time in which people bypass the formal vocabulary altogether and rely entirely on sounds, gestures and tone. "So I'm like uuuun, and she's like hhhmmm?" Characters in Jane Austen's novels would find words for these feelings, perhaps "befuddled" and "huffy." Computers could look those words up and have at least an inkling of what we're talking about. They'll never bridge the formalist float entirely--our complexity cannot be reduced to ones and zeros. But eliminating words from our discourse makes their job even tougher.

©2018 Stephen Baker Media, All rights reserved.
