Philosophy of Science Portal: Watson wins on "Jeopardy"

Ken Jennings, left, and Brad Rutter

A three-day landmark battle between Ken Jennings, Brad Rutter, and Watson [an IBM computer] ended today with Watson winning.

"Computer crushes the competition on 'Jeopardy!'"

by

Frazier Moore

February 16th, 2011

USA TODAY

The computer brained its human competition in Game 1 of the Man vs. Machine competition on "Jeopardy!"

On the 30-question game board, veteran "Jeopardy!" champs Ken Jennings and Brad Rutter managed only five correct responses between them during the Double Jeopardy round that aired Tuesday. They ended the first game of the two-game face-off with paltry earnings of $4,800 and $10,400 respectively.

Watson, their IBM supercomputer nemesis, emerged from the Final Jeopardy round with $35,734.

Tuesday's competition began with Jennings (who has the longest "Jeopardy!" winning streak at 74 games) making the first choice. But Watson jumped in with the correct response: What is leprosy?

He followed that with bang-on responses Franz Liszt, dengue fever, violin, Rachmaninoff and albinism, then landed on a Daily Double in the "Cambridge" category.

"I'll wager $6,435," Watson (named for IBM founder Thomas J. Watson) said in his pleasant electronic voice.

"I won't ask," said host Alex Trebek, wondering with everybody else where that figure came from.

But Watson knew what he was doing. Sir Christopher Wren was the correct response, and Watson's total vaulted to $21,035 as the humans stood by helplessly.

Watson blew his next response. But so did both his opponents. He guessed Picasso. Jennings guessed Cubism. Rutter guessed Impressionism. (Correct question: What is modern art?)

Back to Watson, who soon hit the game's second Daily Double. But even when he was only 32% sure (you could see his precise level of certainty displayed on the screen), Watson correctly guessed Baghdad as the city from whose national museum the ancient Lion of Nimrud ivory relief went missing (along with "a lot of other stuff") in 2003. Watson added $1,246 to his stash.

He even correctly identified the Church Lady character from "Saturday Night Live."

One answer stumped everyone: "A Titian portrait of this Spanish king was stolen at gunpoint from an Argentine museum in 1987." (Correct response: Philip.) Jennings shook his head. Rutter wrenched his face. Watson, as usual, seemed unfazed.

Even when he bungled Final Jeopardy, Watson (with his 10 offstage racks of computer servers) remained poised.

The answer: "Its largest airport is named for a World War II hero; its second largest, for a World War II battle."

Both Jennings and Rutter knew the right response was Chicago.

Watson guessed doubtfully, "What is Toronto?????" It didn't matter. He had shrewdly wagered only $947.

The trio will return on Wednesday, when their second game is aired. The overall winner will collect $1 million.

The bouts were taped at the IBM research center in Yorktown Heights, N.Y., last month.

"What’s With Watson’s Weird Bets? And Other Questions About IBM’s Jeopardy-Playing Machine"

by

Kathy Ceceri

February 16th, 2011

Wired

Tuesday night at Rensselaer Polytechnic Institute the computer guys from IBM had to explain to an auditorium full of Jeopardy and computer geeks how their supercomputer Watson responded “What is Toronto????” in the category of “U.S. Cities.” And what was with all the odd dollar amounts on the Daily Double wagers?

It all has to do with the fact that Watson is, well, he’s Not Human.

For three nights this week the IBM computer named after the company’s founder is a featured player on the popular trivia game show. At RPI’s EMPAC facility in Troy, NY Tuesday, a panel including Chris Welty and Adam Lally of IBM’s Watson team told a packed crowd that came to watch the show’s broadcast on a 56-foot-wide HD video screen that the methods Watson uses to come up with its Jeopardy responses are very different than those used by champions like Ken Jennings and Brad Rutter.

Watson’s brain has been packed full of facts and definitions from sources like Wikipedia and WordNet. Then the machine weighs its sources according to their proven accuracy during practice games. Of course, sometimes the computer needs a little help weeding out inappropriate responses.

“We had to put in a profanity filter after one bad incident,” Lally said with a laugh.

When looking for possible responses, Watson ranks choices according to the number of connections between the words in the clue. So for the Final Jeopardy answer involving airports named for World War II heroes and battles, the computer looked for cities with airports whose names had a link to the conflict. Categories have been found to be a less reliable way to narrow down responses; that’s why “Toronto” ranked higher than “Chicago” (the correct response) for a category that should have excluded cities in Canada.
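The effect Lally describes can be illustrated with a toy ranking, sketched here in Python. This is not IBM's code: the function names `score` and `rank`, the feature names `clue_links` and `category_match`, and every number below are assumptions made up for illustration. The sketch only shows how a candidate that fails the category check can still come out on top when the category carries little weight.

```python
from typing import Dict, List

Features = Dict[str, float]

# Made-up evidence values, chosen only so that the effect is visible.
# "clue_links" stands for how strongly a candidate connects to the clue text;
# "category_match" is 1.0 if the candidate fits the category, else 0.0.
CANDIDATES: Dict[str, Features] = {
    "Chicago": {"clue_links": 0.5, "category_match": 1.0},
    "Toronto": {"clue_links": 0.8, "category_match": 0.0},
}

# Hypothetical weightings: one that barely trusts the category, one that
# trusts it as much as the clue text.
LOW_CATEGORY_TRUST = {"clue_links": 0.8, "category_match": 0.2}
EQUAL_TRUST = {"clue_links": 0.5, "category_match": 0.5}


def score(features: Features, weights: Dict[str, float]) -> float:
    """Weighted sum of a candidate's evidence values."""
    return sum(weights[name] * value for name, value in features.items())


def rank(candidates: Dict[str, Features], weights: Dict[str, float]) -> List[str]:
    """Return candidate names ordered from highest to lowest score."""
    return sorted(
        candidates,
        key=lambda name: score(candidates[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    print(rank(CANDIDATES, LOW_CATEGORY_TRUST))
    print(rank(CANDIDATES, EQUAL_TRUST))
```

With the low category weight, "Toronto" is ranked first in this toy example; with equal weights, "Chicago" is. The real system's features and weights are not described in the article.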

As for the weird wagers — Watson bet $1246 on one Daily Double and $6,435 on the other — Lally said the Watson team did include game theorists, who programmed winning strategies (like searching the board’s higher-priced clues early for hidden Daily Doubles) into Watson’s game plan. But though human players usually pick round numbers when naming their Daily Double wager, there’s no reason why they have to — so the team decided to let Watson choose its own amount based on its own algorithms.

Although on Tuesday night it looked like Watson had an advantage over his human opponents when it came to “buzzing in,” Welty said that wasn’t the case. Watson is programmed to wait until the light next to host Alex Trebek is lit, alerting players that they are allowed to press the buttons that signal their desire to respond (buzzing in early gets a player locked out momentarily). Human players get a split-second head start, he said, because they are listening for Trebek to finish reading the clue rather than waiting for the light.

Despite Watson’s big international gaffe, Welty pointed out that the machine still won with about $25,000 to spare. And though winning is nice, the ultimate goal of the Watson project isn’t to develop computers that can beat humans at their own game. Rather, the game is a means for developing a computer that can communicate better with its flesh and blood counterparts. And by that measure, Watson is a success.

As Welty noted, “Watson is better than any machine before at processing natural language.”

RPI is posting streaming video of its Jeopardy panels (minus the show itself) on its website. The two-game match wraps up tonight, Wednesday; check the Jeopardy website for details.

Post script...

Watson is nothing more than a fast machine with access to a huge database and the ability to statistically provide answers. Ask Watson if God exists or any other abstract question. LOSER.

"Computer Wins on ‘Jeopardy!’: Trivial, It’s Not"

by

John Markoff

February 16th, 2011

The New York Times

In the end, the humans on “Jeopardy!” surrendered meekly.

Facing certain defeat at the hands of a room-size I.B.M. computer on Wednesday evening, Ken Jennings, famous for winning 74 games in a row on the TV quiz show, acknowledged the obvious. “I, for one, welcome our new computer overlords,” he wrote on his video screen, borrowing a line from a “Simpsons” episode.

From now on, if the answer is “the computer champion on ‘Jeopardy!,’ ” the question will be, “What is Watson?”

For I.B.M., the showdown was not merely a well-publicized stunt and a $1 million prize, but proof that the company has taken a big step toward a world in which intelligent machines will understand and respond to humans, and perhaps inevitably, replace some of them.

Watson, specifically, is a “question answering machine” of a type that artificial intelligence researchers have struggled with for decades — a computer akin to the one on “Star Trek” that can understand questions posed in natural language and answer them.

Watson showed itself to be imperfect, but researchers at I.B.M. and other companies are already developing uses for Watson’s technologies that could have significant impact on the way doctors practice and consumers buy products.

“Cast your mind back 20 years and who would have thought this was possible?” said Edward Feigenbaum, a Stanford University computer scientist and a pioneer in the field.

In its “Jeopardy!” project, I.B.M. researchers were tackling a game that requires not only encyclopedic recall, but the ability to untangle convoluted and often opaque statements, a modicum of luck, and quick, strategic button pressing.

The contest, which was taped in January here at the company’s T. J. Watson Research Laboratory before an audience of I.B.M. executives and company clients, played out in three televised episodes concluding Wednesday. At the end of the first day, Watson was in a tie with Brad Rutter, another ace human player, at $5,000 each, with Mr. Jennings trailing with $2,000.

But on the second day, Watson went on a tear. By night’s end, Watson had a commanding lead with a total of $35,734, compared with Mr. Rutter’s $10,400 and Mr. Jennings’ $4,800.

But victory was not cemented until late in the third match, when Watson was in Nonfiction. “Same category for $1,200” it said in a manufactured tenor, and lucked into a Daily Double. Mr. Jennings grimaced.

Even later in the match, however, had Mr. Jennings won another key Daily Double it might have come down to Final Jeopardy, I.B.M. researchers acknowledged.

The final tally was $77,147 to Mr. Jennings’ $24,000 and Mr. Rutter’s $21,600.

More than anything, the contest was a vindication for the academic field of computer science, which began with great promise in the 1960s with the vision of creating a thinking machine and which became the laughingstock of Silicon Valley in the 1980s, when a series of heavily funded start-up companies went bankrupt.

Despite its intellectual prowess, Watson was by no means omniscient. On Tuesday evening during Final Jeopardy, the category was U.S. Cities and the clue was: “Its largest airport is named for a World War II hero; its second largest for a World War II battle.”

Watson drew guffaws from many in the television audience when it responded “What is Toronto?????”

The string of question marks indicated that the system had very low confidence in its response, I.B.M. researchers said, but because it was Final Jeopardy, it was forced to give a response. The machine did not suffer much damage. It had wagered just $947 on its result.

“We failed to deeply understand what was going on there,” said David Ferrucci, an I.B.M. researcher who led the development of Watson. “The reality is that there’s lots of data where the title is U.S. cities and the answers are countries, European cities, people, mayors. Even though it says U.S. cities, we had very little confidence that that’s the distinguishing feature.”

The researchers also acknowledged that the machine had benefited from the “buzzer factor.”

Both Mr. Jennings and Mr. Rutter are accomplished at anticipating the light that signals it is possible to “buzz in,” and can sometimes get in with virtually zero lag time. The danger is to buzz too early, in which case the contestant is penalized and “locked out” for roughly a quarter of a second.

Watson, on the other hand, does not anticipate the light, but has a weighted scheme that allows it, when it is highly confident, to buzz in as quickly as 10 milliseconds, making it very hard for humans to beat. When it was less confident, it buzzed more slowly. In the second round, Watson beat the others to the buzzer in 24 out of 30 Double Jeopardy questions.

“It sort of wants to get beaten when it doesn’t have high confidence,” Dr. Ferrucci said. “It doesn’t want to look stupid.”
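A confidence-weighted buzzer of the kind described here might look roughly like the following Python sketch. Only the 10-millisecond floor comes from the article; the upper delay, the cut-off below which the machine stays silent, the linear mapping, and the name `buzz_delay_ms` are all assumptions for illustration, not details of I.B.M.'s scheme.

```python
from typing import Optional

MIN_DELAY_MS = 10.0    # fastest buzz reported in the article
MAX_DELAY_MS = 200.0   # assumed slowest buzz, purely illustrative
BUZZ_THRESHOLD = 0.5   # assumed: below this confidence, do not buzz at all


def buzz_delay_ms(confidence: float) -> Optional[float]:
    """Map a confidence in [0, 1] to a buzz delay in milliseconds.

    Returns None when confidence is under the (assumed) threshold,
    meaning the machine lets the other players have the clue.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < BUZZ_THRESHOLD:
        return None
    # slack is 0.0 at full confidence and 1.0 right at the threshold
    slack = (1.0 - confidence) / (1.0 - BUZZ_THRESHOLD)
    return MIN_DELAY_MS + slack * (MAX_DELAY_MS - MIN_DELAY_MS)


if __name__ == "__main__":
    for c in (1.0, 0.9, 0.6, 0.3):
        print(c, buzz_delay_ms(c))
```

In this sketch a fully confident response buzzes at the 10-millisecond floor, a less confident one waits longer, and a weak one does not buzz, which is one way to read the remark that the machine "wants to get beaten" when it is unsure.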

Both human players said that Watson’s button pushing skill was not necessarily an unfair advantage. “I beat Watson a couple of times,” Mr. Rutter said.

When Watson did buzz in, it made the most of it. Showing the ability to parse language, it responded to, “A recent best seller by Muriel Barbery is called ‘This of the Hedgehog,’ ” with “What is Elegance?”

It showed its facility with medical diagnosis. With the answer: “You just need a nap. You don’t have this sleep disorder that can make sufferers nod off while standing up,” Watson replied, “What is narcolepsy?”

The coup de grâce came with the answer, “William Wilkinson’s ‘An Account of the Principalities of Wallachia and Moldavia’ inspired this author’s most famous novel.” Mr. Jennings wrote, correctly, Bram Stoker, but realized he could not catch up with Watson’s winnings and wrote out his surrender.

Both players took the contest and its outcome philosophically.

“I had a great time and I would do it again in a heartbeat,” said Mr. Jennings. “It’s not about the results; this is about being part of the future.”

For I.B.M., the future will happen very quickly, company executives said. On Thursday it plans to announce that it will collaborate with Columbia University and the University of Maryland to create a physician’s assistant service that will allow doctors to query a cybernetic assistant. The company also plans to work with Nuance Communications Inc. to add voice recognition to the physician’s assistant, possibly making the service available in as little as 18 months.

“I have been in medical education for 40 years and we’re still a very memory-based curriculum,” said Dr. Herbert Chase, a professor of clinical medicine at Columbia University who is working with I.B.M. on the physician’s assistant. “The power of Watson-like tools will cause us to reconsider what it is we want students to do.”

I.B.M. executives also said they are in discussions with a major consumer electronics retailer to develop a version of Watson, named after I.B.M.’s founder, Thomas J. Watson, that would be able to interact with consumers on a variety of subjects like buying decisions and technical support.

Dr. Ferrucci sees none of the fears that have been expressed by theorists and science fiction writers about the potential of computers to usurp humans.

“People ask me if this is HAL,” he said, referring to the computer in “2001: A Space Odyssey.” “HAL’s not the focus, the focus is on the computer on ‘Star Trek,’ where you have this intelligent information seek dialog, where you can ask follow-up questions and the computer can look at all the evidence and tries to ask follow-up questions. That’s very cool.”

"What IBM’s ‘Watson’ Finds Elementary, And Not So Much"

by

Adrian Covert

February 16th, 2011

Wired

When we last left our heroes, Watson and the undefeated Jeopardy champion Brad Rutter were tied for the lead with $5,000 apiece. Ken Jennings, meanwhile, trailed with $2,000. Watson had some shaky moments in the early stages of round one, but kept it together enough to stay on top and frustrate the Mormon Machine. Let’s rejoin the action…

Watson went on a spree early in the match, correctly responding to clues about diseases and classical music left and right. (Jeopardy questions are called “clues” and the contestants’ answers, which must be phrased as a question, are called “responses.”) After running his score up to $14,600 (and denying the other two even a chance to chime in), Watson hit the “Daily Double.”

Unlike Monday night, when he could only wager a max of $1,000, Watson had some freedom to play around with his money total. How much did he risk? $6,435. That drew a solid laugh from the crowd and prompted Trebek to proclaim, “I won’t ask. I won’t ask.” Predictably, Watson produced the correct response to a clue about architecture.

And then, Watson hit a hiccup. He picked the category dealing with fine art, and was given the following clue: “In May 2010, five paintings worth $125 million by Braque, Matisse & three others left Paris’ Museum of this art period.”

Latching on to the keywords “3 others,” Watson responded “Picasso.” The correct response, which required the contestants to complete the museum name, was “modern art.” But the other two were as puzzled by the clue as Watson was, thinking they had to name a more specific era. So no harm no foul.

A couple of questions later, Watson hit the “Daily Double” again, this time in the fine art category. This time around he wagered $1,246. While he understood what the clue was asking for this time (the city from which some art was stolen), his confidence percentages were shockingly low across the board, with his most confident response (in this case Baghdad) coming in at 32 percent. He even went as far as to mention he was guessing before answering. But Watson got it right.

From there, it was mostly a trivia bloodbath. Jennings and Rutter could hardly get a word in, as Watson wiped out the board full of clues regarding hedgehogs, Cambridge University, and terms including the words “church” or “state.” By the end of Double Jeopardy, Jennings and Rutter could only give the camera looks which sat somewhere between vexed and nonplussed. That’s because Jennings had $2,400, Rutter had $5,400 and Watson had $36,681.

For Final Jeopardy, the three challengers were given the category “U.S. Cities” and asked to name the city with one airport named after a World War II hero and another named for a World War II battle. Jennings and Rutter both responded correctly with “Chicago.” What was Watson’s response? “Toronto.” As in Ontario. As in Canada. Watson was totally confused to the point where it didn’t consider the restrictions set by the category itself.

But how much did he wager? In typical Watson form, he only bet $947, likely realizing it could wager $0 and still win. So while there’s still a whole other round to be played tomorrow night, the truth is that with $4,800 and $10,400 respectively, Jennings and Rutter will be hard-pressed to catch up to Watson’s $35,734. Barring a catastrophic meltdown, tomorrow night’s show will be a victory lap, celebrating the triumph of machine over man.

But now that we’ve been able to watch Watson in action for an entire round, we have a better idea of his strengths and weaknesses when it comes to answering open-ended questions.

What Watson Does Well

Memory: Watson doesn’t have to worry about forgetting anything. Whatever is loaded into his system is retained perfectly (the advantage of running on all zeroes and ones).

Reaction times: As an emotionless machine, Watson is better suited to react to the signal telling contestants to buzz in. He excels when clues are phrased as directly as possible. When given simple sentence structures clearly asking for who, what, when or where, Watson is unstoppable. We, as humans, can’t compete with that.

Wagering: When Watson hits Daily Doubles and the Final Jeopardy stage, he is able to analyze his confidence in the given category (or other similar ones) and his overall probability of winning, then uses those two factors to determine an optimal dollar amount. If he’s winning big or is not as confident in the category, he’ll tend to wager more conservatively. If he is down in cash or very confident in the category, he will wager more. Because he can compute the factors with a numerical preciseness humans cannot, his wagers take on strange dollar amounts. Watson has also been programmed with a historical knowledge of where Daily Doubles are most commonly found, so he can determine their most probable locations.
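To make the wagering idea concrete, here is a toy Python heuristic in the same spirit. It is not Watson's algorithm, which the article does not spell out: the function `toy_wager`, the `max_fraction` cap, and the formula itself are assumptions. It only captures the two tendencies described above, betting more when confident and less when comfortably ahead, and it shows why the output is rarely a round number.

```python
def toy_wager(score: int, best_opponent: int, confidence: float,
              max_fraction: float = 0.5) -> int:
    """Toy wager heuristic (an illustrative assumption, not IBM's method).

    score          current bankroll of the player making the wager
    best_opponent  bankroll of the strongest opponent
    confidence     estimated chance of a correct response, in [0, 1]
    max_fraction   assumed cap on the share of the bankroll to risk
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if score <= 0:
        return 0
    # 0.0 when tied or behind; approaches 1.0 as the lead grows
    lead_ratio = max(0.0, (score - best_opponent) / score)
    # how far confidence exceeds a coin flip; never negative
    edge = max(0.0, 2.0 * confidence - 1.0)
    # a big lead shrinks the bet by up to half
    fraction = max_fraction * edge * (1.0 - 0.5 * lead_ratio)
    return round(score * fraction)


if __name__ == "__main__":
    print(toy_wager(14_600, 400, 0.85))    # confident, far ahead
    print(toy_wager(36_681, 5_400, 0.30))  # unsure: the toy rule bets nothing
    # Score arithmetic reported in the articles above:
    print(14_600 + 6_435)   # Daily Double: $14,600 plus the $6,435 wager
    print(36_681 - 947)     # Final Jeopardy: $36,681 minus the $947 wager
```

The last two lines check the reported totals: $14,600 plus the $6,435 Daily Double wager is $21,035, and $36,681 minus the $947 Final Jeopardy wager is $35,734, matching the scores quoted in the coverage.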

What Watson Doesn’t Do Well

Complex Syntax: When sentence structures become complex, or the question is asking contestants to consider two indirectly related factors or ideas, Watson tends to get confused. His confidence drops and his reaction times slow.

Art: For whatever reason, Watson doesn’t know a damn thing about art. He struggled with nearly every clue in the category Tuesday, incorrectly responding to one clue, getting beat to the punch on another clue and failing to buzz in on another. And the one he got right? He had a confidence of only 32 percent.

Eliminating Previous Wrong Responses From Consideration: IBM programmers didn’t think Watson would ever have an issue with using the same incorrect response or wrong response structure as a contestant answering before him. Well, he ran into the problem twice last night when he repeated one of Jennings’s incorrect responses, then failed to realize he had to include the word “missing” when replying to a clue about a gymnast with one leg.

While the suspense seems to be (mostly) taken out of Wednesday’s final throwdown, it’s still pretty amazing to see a digital machine interpret and respond to the human language. Be sure to check back here tomorrow night for our thoughts on the conclusion of the IBM Jeopardy Challenge.

"Is It Time to Welcome Our New Computer Overlords?"

by

Ben Zimmer

February 18th, 2011

The Atlantic

Oh, that Ken Jennings, always quick with a quip. At the end of the three-day Jeopardy! tournament pitting him and fellow human Brad Rutter against the IBM supercomputer Watson, he had a good one. When it came time for Final Jeopardy, he and Rutter already knew that Watson had trounced the two of them, the best competitors that Jeopardy! had ever had. So, on his written response to a clue about Bram Stoker, the author of Dracula, Jennings wrote, "I, for one, welcome our new computer overlords."

Now, think about that sentence. What does it mean to you? If you are a fan of The Simpsons, you'll be able to identify it as a riff on a line from the 1994 episode, "Deep Space Homer," wherein clueless news anchor Kent Brockman is briefly under the mistaken impression that a "master race of giant space ants" is about to take over Earth. "I, for one, welcome our new insect overlords," Brockman says, sucking up to the new bosses. "I'd like to remind them that as a trusted TV personality, I can be helpful in rounding up others to toil in their underground sugar caves."

Even if you're not intimately familiar with that episode (and you really should be), you might have come across the "Overlord Meme," which uses Brockman's line as a template to make a sarcastic statement of submission: "I, for one, welcome our (new) ___ overlord(s)." Over on Language Log, where I'm a contributor, we'd call this kind of phrasal template a "snowclone," and that one's been on our radar since 2004. So it's a repurposed pop-culture reference wrapped in several layers of irony.

But what would Watson make of this smart-alecky remark? The question-answering algorithms that IBM developed to allow Watson to compete on Jeopardy! might lead it to conjecture that it has something to do with The Simpsons -- since the full text of Wikipedia is among its 15 terabytes of reference data, and the Kent Brockman page explains the Overlord Meme. After all, Watson's mechanical thumb had beaten Ken and Brad's real ones to the buzzer on a Simpsons clue earlier in the game (identifying the show as the home of Itchy and Scratchy). But beyond its Simpsonian pedigree, this complex use of language would be entirely opaque to Watson. Humans, on the other hand, have no problem identifying how such a snowclone works, appreciating its humorous resonances, and constructing new variations on the theme.

All of this is to say that while Ken and Brad lost the battle, Team Carbon is still winning the language war against Team Silicon. The "war" metaphor, incidentally, had been playing out for weeks, stoked by IBM and Jeopardy! to build public interest in the tournament. The press gladly played along, supplying headlines like the one in the Science Times from Tuesday, "A Fight to Win the Future: Computers vs. Humans." IBM knew from the Kasparov vs. Deep Blue days that we're all suckers for the "man vs. machine" trope, going back to John Henry's mythical race against the steam-powered hammer. It certainly makes for a better storyline than, say, "Check out the latest incremental innovations that Natural Language Processing researchers are making in the field of question-answering!"

I first encountered IBM's hype about the tournament last month, during the NFL's conference championship games, when Dave Ferrucci, the ebullient lead engineer on the project, showed up in commercial breaks to tell us about the marvels of Watson. One commercial intriguingly opens with Groucho Marx telling his classic joke: "One morning I shot an elephant in my pajamas. How he got into my pajamas, I don't know." Ferrucci then begins: "Real language is filled with nuance, slang and metaphor. It's more than half the world's data. But computers couldn't understand it." He continues: "Watson is a computer that uncovers meaning in our language, and pinpoints the right answer, instantly. It uses deep analytics to answer questions computers never could before, even the ones on Jeopardy!" Then a Jeopardy! clue is displayed: "Groucho quipped, 'One morning I shot' this 'in my pajamas.'"

Now, that's a provocative set of claims. Watson's performance in the tournament (despite a few howlers along the way) clearly demonstrates that it is very skilled in particular types of question-answering, and I have no doubt it could handle that Groucho clue with aplomb. But does that mean that Watson "understands" the "nuance, slang and metaphor" of natural language? That it "uncovers meaning in our language"? Depends what you mean by "meaning," and how you understand "understanding."

Elsewhere, Ferrucci has been more circumspect about Watson's level of "understanding." In an interview with IBM's own magazine ForwardView, he said, "For a computer, there is no connection from words to human experience and human cognition. The words are just symbols to the computer. How does it know what they really mean?" In other words, for all of the impressive NLP programming that has gone into Watson, the computer is unable to penetrate the semantics of language, or comprehend how meanings of words are shot through with allusions to human culture and the experience of daily life.

Such a sober assessment doesn't jibe with popular perceptions of artificial intelligence, of course. We want our "smart computers" to engage with us linguistically like HAL did in Stanley Kubrick's 2001 -- well, except for the murderous rampage part. (Ferrucci and his team prefer to use a different pop-cultural point of reference: the original series of Star Trek, with its more benign on-board talking computer.) But let's remember how HAL was introduced, via a BBC interview watched by the spaceship crew in the film:

The sixth member of the Discovery crew was not concerned about the problems of hibernation. For he was the latest result in machine intelligence -- the HAL 9000 computer, which can reproduce, though some experts still prefer to use the word 'mimic,' most of the activities of the human brain, and with incalculably greater speed and reliability.

Is Watson, despite its limitations, nonetheless a precursor to a HAL-esque machine that can "mimic" natural language, if not "reproduce" it? Baby steps, baby steps. First we need a computer that doesn't give Toronto as an answer to a clue about "U.S. Cities," as Watson memorably did for Final Jeopardy in the first game. And we'd also want it to know that the "anatomical oddity" of Olympian gymnast George Eyser was not his leg, but his missing leg.

Those were two isolated gaffes in a pretty clean run by Watson against his human foes, but they'll certainly be remembered at IBM. For proof, see Stephen Baker's book Final Jeopardy, an engaging inside look at the Watson project, culminating with the Jeopardy! showdown in the final chapter. (In a shrewd marketing move, the book was available electronically without its final chapter before the match, and then the ending was given to readers as an update immediately after the conclusion of the tournament.) Baker writes:

As this question-answering technology expands from its quiz show roots into the rest of our lives, engineers at IBM and elsewhere must sharpen its understanding of contextual language. Smarter machines will not call Toronto a U.S. city, and they will recognize the word "missing" as the salient fact in any discussion of George Eyser's leg. Watson represents merely a step in the development of smart machines. Its answering prowess, so formidable on a winter afternoon in 2011, will no doubt seem quaint in a surprisingly short time.

Baker's undoubtedly right about that, but we're still dealing with the limited task of question-answering, not anything even vaguely approaching full-fledged comprehension of natural language, with all of its "nuance, slang and metaphor." If Watson had chuckled at that "computer overlords" jab, then I'd be a little worried.

Watson again

IBM's "Watson" on Jeopardy this Fall

And finally, here is an essay by Ken Jennings...

"My Puny Human Brain"

Jeopardy! genius Ken Jennings on what it's like to play against a supercomputer.

by

Ken Jennings

February 16th, 2011

Slate

When I was selected as one of the two human players to be pitted against IBM's "Watson" supercomputer in a special man-vs.-machine Jeopardy! exhibition match, I felt honored, even heroic. I envisioned myself as the Great Carbon-Based Hope against a new generation of thinking machines—which, if Hollywood is to be believed, will inevitably run amok, build unstoppable robot shells, and destroy us all. But at IBM's Thomas J. Watson Research Lab, an Eero Saarinen-designed fortress in the snowy wilds of New York's Westchester County, where the shows taped last month, I wasn't the hero at all. I was the villain.

This was to be an away game for humanity, I realized as I walked onto the slightly-smaller-than-regulation Jeopardy! set that had been mocked up in the building's main auditorium. In the middle of the floor was a huge image of Watson's on-camera avatar, a glowing blue ball crisscrossed by "threads" of thought—42 threads, to be precise, an in-joke for Douglas Adams fans. The stands were full of hopeful IBM programmers and executives, whispering excitedly and pumping their fists every time their digital darling nailed a question. A Watson loss would be invigorating for Luddites and computer-phobes everywhere, but bad news for IBM shareholders.

The IBM team had every reason to be hopeful. Watson seems to represent a giant leap forward in the field of natural-language processing—the ability to understand and respond to everyday English, the way Ask Jeeves did (with uneven results) in the dot-com boom. Jeopardy! clues cover an open domain of human knowledge—every subject imaginable—and are full of booby traps for computers: puns, slang, wordplay, oblique allusions. But in just a few years, Watson has learned—yes, it learns—to deal with some of the myriad complexities of English. When it sees the word "Blondie," it's very good at figuring out whether Jeopardy! means the cookie, the comic strip, or the new-wave band.

I expected Watson's bag of cognitive tricks to be fairly shallow, but I felt an uneasy sense of familiarity as its programmers briefed us before the big match: The computer's techniques for unraveling Jeopardy! clues sounded just like mine. That machine zeroes in on key words in a clue, then combs its memory (in Watson's case, a 15-terabyte data bank of human knowledge) for clusters of associations with those words. It rigorously checks the top hits against all the contextual information it can muster: the category name; the kind of answer being sought; the time, place, and gender hinted at in the clue; and so on. And when it feels "sure" enough, it decides to buzz. This is all an instant, intuitive process for a human Jeopardy! player, but I felt convinced that under the hood my brain was doing more or less the same thing.

Indeed, playing against Watson turned out to be a lot like any other Jeopardy! game, though out of the corner of my eye I could see that the middle player had a plasma screen for a face. Watson has lots in common with a top-ranked human Jeopardy! player: It's very smart, very fast, speaks in an uneven monotone, and has never known the touch of a woman. But unlike us, Watson cannot be intimidated. It never gets cocky or discouraged. It plays its game coldly, implacably, always offering a perfectly timed buzz when it's confident about an answer. Jeopardy! devotees know that buzzer skill is crucial—games between humans are more often won by the fastest thumb than the fastest brain. This advantage is only magnified when one of the "thumbs" is an electromagnetic solenoid triggered by a microsecond-precise jolt of current. I knew it would take some lucky breaks to keep up with the computer, since it couldn't be beaten on speed.

During my 2004 Jeopardy! streak, I was accustomed to mowing down players already demoralized at having to play a long-standing winner like me. But against Watson I felt like the underdog, and as a result I started out too aggressively, blowing high-dollar-value questions on the decade in which the first crossword puzzle appeared (the 1910s) and the handicap of Olympic gymnast George Eyser (he was missing his left leg). At the end of the first game, Watson had what seemed like an insurmountable lead of more than $30,000. I tried to keep my chin up, but in the back of my mind, I was already thinking about a possible consolation prize: a second-place finish ahead of the show's other human contestant and my quiz-show archrival, undefeated Jeopardy! phenom Brad Rutter.

In the final round, I made up ground against Watson by finding the first "Daily Double" clue, and all three of us began furiously hunting for the second one, which we knew was my only hope for catching Watson. (Daily Doubles aren't distributed randomly across the board; as Watson well knows, they're more likely to be in some places than others.) By process of elimination, I became convinced it was hiding in the "Legal E's" category, and, given a 50-50 chance between two clues, chose the $1200 one. No dice. Watson took control of the board and chose "Legal E's" for $1600. There was the Daily Double. Game over for humanity.

IBM has bragged to the media that Watson's question-answering skills are good for more than annoying Alex Trebek. The company sees a future in which fields like medical diagnosis, business analytics, and tech support are automated by question-answering software like Watson. Just as factory jobs were eliminated in the 20th century by new assembly-line robots, Brad and I were the first knowledge-industry workers put out of work by the new generation of "thinking" machines. "Quiz show contestant" may be the first job made redundant by Watson, but I'm sure it won't be the last.

But there's no shame in losing to silicon, I thought to myself as I greeted the (suddenly friendlier) team of IBM engineers after the match. After all, I don't have 2,880 processor cores and 15 terabytes of reference works at my disposal—nor can I buzz in with perfect timing whenever I know an answer. My puny human brain, just a few bucks worth of water, salts, and proteins, hung in there just fine against a jillion-dollar supercomputer.

"Watching you on Jeopardy! is what inspired the whole project," one IBM engineer told me, consolingly. "And we looked at your games over and over, your style of play. There's a lot of you in Watson." I understood then why the engineers wanted to beat me so badly: To them, I wasn't the good guy, playing for the human race. That was Watson's role, as a symbol and product of human innovation and ingenuity. So my defeat at the hands of a machine has a happy ending, after all. At least until the whole system becomes sentient and figures out the nuclear launch codes. But I figure that's years away.

[Ken Jennings won 74 consecutive games of Jeopardy! in 2004. He is the author of Brainiac, Ken Jennings's Trivia Almanac, and the forthcoming Maphead: Charting the Wide, Weird World of Geography Wonks.]


Wednesday, February 16, 2011
