Stephen Baker


Ferrucci on Watson's Jeopardy performance: Day One

February 14, 2011 - News

I went over to IBM Research this morning and watched Watson's performance on Jeopardy with David Ferrucci, the chief scientist on the project and the lead protagonist, along with Watson, in Final Jeopardy.

First things first. IBM's nightmare scenario can now be discarded. Watson can compete with the best human players. There was fear going in that one of the humans--and specifically Brad Rutter--might dominate the buzz. This didn't happen. If anything, Watson was the fastest of the three on the buzzer.

Here's one clue that tripped up Watson:
"It was the anatomical oddity of US gymnast George Eyser who won a gold medal on the parallel bars in 1904."

Ken Jennings won the buzz and guessed that Eyser was missing an arm. This was wrong, and Watson got the rebound. It answered, "What is leg?" Initially, as the game was taped, host Alex Trebek accepted the answer. But a judge stopped the game. After a lengthy parley, it was agreed that Watson missed the important point. Eyser's oddity was not his leg, but that he was missing one. They retaped the segment and Watson lost $1,000.

Ferrucci explained today that an "oddity" is a hard thing for a computer to puzzle out. What is it exactly? Well, the word is subjective from the get-go. We might find it an oddity if someone walked around my town in New Jersey carrying groceries on his head. In much of the world, this is normal.
So Watson, lacking its own point of view, would have had to find documents indicating not only that Eyser was missing a leg, but that this was odd.

It came close. It concluded that leg was what distinguished Eyser. But it left it at that. Ferrucci said that Watson has been programmed to respond succinctly. Sometimes an extra detail can screw up an otherwise correct answer. If the host was not satisfied with leg, he could have asked for more details, and perhaps Watson would have added that it was missing. (This sometimes happens. For example, if Watson's response is The St. Louis Cardinals, it can expand it to "The St. Louis Cardinals Baseball Club." But I haven't seen it refocus an answer, as it would have had to here. In any case, it never got the chance, because Trebek initially accepted "leg." This also deprived Rutter of a chance at an easy rebound.)

Watson's second notable mistake was an embarrassing one. Jennings missed the decade that gave birth to Oreos, guessing the 1920s. The deaf and blind Watson, oblivious to its opponents, simply repeated Jennings' incorrect response.

Ferrucci was especially happy with Watson's final response, which pulled the machine into a tie with Rutter (at $5,000).

The clue, under the category Literary Characters APB: "Wanted for general evilness, last seen at the Tower of Barad-Dur. It's a giant eye, folks, kinda hard to miss." On this one, Watson first used "Tower of Barad-Dur" to lead it to J.R.R. Tolkien's Lord of the Rings. Then it had to find an "eye" associated with "evilness."

"If you manually craft a database of all eyes," Ferrucci said, referring to a traditional question-answering approach, "it's not clear that you'd include Sauron," a character in the book. But Watson managed to locate Sauron, and overcame the doubts it had that a character could also be an eye. It had 74% confidence in Sauron--and responded correctly.

One more point from the game. Ferrucci and his team were irked that "APB" in the Literary Characters category was spelled out for the contestants. Jennings and Rutter could glean from this that the characters would be villains. Watson could not benefit from this intelligence. That may have led it to lose out on a Harry Potter clue to Rutter, who identified the book's uber-villain, Voldemort. Watson was befuddled on the clue. It had 37% confidence in Harry Potter, only 20% in Voldemort.

Still, a good opening performance for the computer. In Double Jeopardy, on Tuesday, much more money will be at stake.

©2014 Stephen Baker Media, All rights reserved.