IBM Watson & Jeopardy: the Hype Train that Never Left the Station

Watson on Jeopardy

In January 2011, IBM amazed the world by hosting the television show Jeopardy and pitting its AI computer, named Watson, against two of the show's best champions, Ken Jennings and Brad Rutter. Watson's victory was both shocking and exciting.

Ken Jennings (who finished second to Watson) magnanimously stated, “I, for one, welcome our new computer overlords.” It was official: machines had finally usurped humans. It appeared that IBM had achieved the impossible dream that science-fiction novelists had been predicting for decades.

One of Watson’s lead developers, Dr. David Ferrucci, added to the quickly escalating hype with:

“People ask me if this is HAL in ‘2001: A Space Odyssey.’ HAL’s not the focus; the focus is on the computer on ‘Star Trek,’ where you have this intelligent information seek dialogue, where you can ask follow-up questions and the computer can look at all the evidence and tries to ask follow-up questions. That’s very cool.”

Seven years later, the hype train still appears to be stuck at the station. The promise of 2011 has yet to materialize. How could something so utterly amazing in 2011 still be struggling to find its place in the business world of today?

To answer that question, we need to go back in time to 2011 and look at exactly how Watson beat two extremely brilliant humans. What was its secret? And exactly how much AI was really involved?

Once you look at how Watson won, it’s not really that surprising that it was never able to capitalize on that victory in future years.

Those watching the show when it aired in February 2011 were told that Watson would not be allowed to connect to the internet. That would be cheating, right?

But what was not explained was that the team that built Watson had spent the previous five years downloading the internet onto Watson. Or, to be more precise, the parts of the internet that they knew Jeopardy took its questions from.

This was why the show was hosted at IBM offices. “Watson” was actually a massive room full of IBM hardware. Specifically, 90 IBM Power 750 servers, each of which contains 32 POWER7 processor cores running at 3.55 GHz, according to the company. This architecture allows massively parallel processing on an embarrassingly large scale: as David Ferrucci told Daily Finance, Watson can process 500 gigabytes per second, the equivalent of the content in about a million books.
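As a quick sanity check on those figures, the arithmetic can be worked through in a few lines of Python. The numbers are the ones quoted above; the variable names are ours, and the per-book size is only what the "million books" comparison implies, not a figure published by IBM.

```python
# Figures quoted in the article; the variable names are illustrative only.
SERVERS = 90
CORES_PER_SERVER = 32
THROUGHPUT_GB_PER_SEC = 500
BOOKS_PER_SEC = 1_000_000

# Total processor cores across the whole room of hardware.
total_cores = SERVERS * CORES_PER_SERVER

# Average size of one "book" implied by the comparison,
# in kilobytes (assuming 1 GB = 1,000,000 KB).
implied_kb_per_book = THROUGHPUT_GB_PER_SEC * 1_000_000 / BOOKS_PER_SEC

print(f"Total POWER7 cores: {total_cores}")
print(f"Implied size per book: {implied_kb_per_book:.0f} KB")
```

In other words, the quoted setup works out to 2,880 cores, and the "million books" comparison assumes roughly half a megabyte of text per book.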

Moreover, each of those processors was equipped with 256 GB of RAM so that Watson could retain about 200 million pages' worth of data about the world. (During a game, Watson didn’t rely on data stored on hard drives because they would have been too slow to access.)

So, how did the Watson team know which parts of the internet to download and index? Well, that was the easy part. As anyone who has studied the game can tell you, Jeopardy answers can mostly be found in either Wikipedia or specific encyclopedias. “All” they had to do was download what they knew would be the source of the questions (called “answers” on Jeopardy) and then turn unstructured content into semi-structured content by plucking out what they felt would be applicable for any question (names, titles, places, dates, movies, books, songs, etc.). However, doing that was no simple feat, and it was one of the reasons why it took them five years to make Watson a formidable opponent.
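To give a feel for what "turning unstructured content into semi-structured content" means, here is a deliberately naive sketch in Python. It is not IBM's pipeline: the `Record` schema, the `extract` function and the three regular expressions are all our own assumptions, chosen only to illustrate plucking years, quoted titles and proper names out of free text.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Record:
    """Semi-structured view of one passage (hypothetical schema)."""
    source_text: str
    years: list = field(default_factory=list)
    quoted_titles: list = field(default_factory=list)
    proper_names: list = field(default_factory=list)


# Naive patterns, assumed for illustration only.
TITLE_RE = re.compile(r'"([^"]+)"')                       # text in double quotes
YEAR_RE = re.compile(r"\b(?:1\d{3}|20\d{2})\b")           # years 1000-2099
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")  # 2+ capitalized words


def extract(text: str) -> Record:
    titles = TITLE_RE.findall(text)
    # Remove quoted titles first so their words and digits are not
    # mistaken for names or years.
    remainder = TITLE_RE.sub(" ", text)
    return Record(
        source_text=text,
        years=YEAR_RE.findall(remainder),
        quoted_titles=titles,
        proper_names=NAME_RE.findall(remainder),
    )


record = extract('Stanley Kubrick directed "2001: A Space Odyssey" in 1968.')
print(record.proper_names, record.quoted_titles, record.years)
```

Even this toy version needs a special case to stop the "2001" in the title from being read as a date, which hints at why doing this reliably across millions of pages was so labor-intensive.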

It was a massively labor-intensive operation that took a large staff of IBM PhDs many years to accomplish. This was the first red flag that the Watson team should have been aware of.

Unfortunately, their goal at the time was to build a computer that could win a TV quiz show. What they should have been building was something that could be implemented in any environment without the need for an army of highly paid professors. Instead they were told, “Build something that can win on Jeopardy”. And that’s exactly what they did.

To this day, Watson is still notoriously picky about the kind of data it can analyze, and still needs a huge amount of manual intervention. In the world of AI, automation is a key requirement of any solution that hopes to be accepted in the business world. Labor-intensive solutions are expensive to build and require constant maintenance, which costs even more.

IBM made this work for Jeopardy because the project was not bound by a realistic budget or timeline (it ended up taking five years), and all Watson had to do was win one game. In the real world of business, budgets matter, and high maintenance costs will destroy any ROI.

Figure 1: Wikipedia-in-a-box (the IBM Watson server farm)

So, as you can see, from the get-go Watson had every advantage possible. While it wasn’t allowed to connect to the internet (remember, that would be cheating), it didn’t have to. The team had already spent five years indexing the parts of the internet they knew Watson would need.

And then there were the rules of Jeopardy. These played directly into the plexiglass “hand” of Watson. Anyone who has seen the show knows how it works. The “answer” is displayed on the screen, and the host reads it aloud. When he has finished, a light appears on a big screen and the competitors can press a button. The first person to press the button gets to state what the “question” is.

But here’s why a machine has a massive built-in advantage. On average, it takes six seconds for the host to read the question out loud. That means that both Watson and the humans get six seconds to figure out whether they know the answer. Six seconds for a room full of computers is a huge amount of time. Try typing a question into Google on your phone and see how fast it is. Yes, less than one second.

What this means is that by the time the light comes up on the board, Watson already knows the answer (almost all of the time). But, oftentimes, so do really smart humans. So, theoretically, when the light flashes, it’s possible that everyone knows the answer (and that night the questions weren’t particularly hard).

Who gets to answer the question comes down to who can press the buzzer the fastest.

What the TV audience didn’t get to see was that, to make this seem kind of human, the Watson team had rigged the machinery with a plexiglass hand that would automatically press the buzzer the moment the light appeared on the board. This took about 10 milliseconds. The best reaction time any human can expect is around 100 milliseconds (try running this test yourself and see how fast you can do it). That meant that no human who waited to see the light could press the buzzer faster than Watson.

The only real way for the humans to stand a chance was to anticipate the light going on (though if they clicked too early they invoked a quarter of a second delay before they could click again). But to help out the humans, Watson was programmed not to anticipate the light.
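The buzzer race described above can be sketched as a tiny Python model. The three timing constants come from the figures quoted in this article; the function names and the rule for what a human does after buzzing early (fall back to reacting to the light once the lockout ends) are simplifying assumptions of ours, not the show's actual buzzer logic.

```python
MACHINE_LATENCY_MS = 10   # Watson's press after the light, as quoted above
HUMAN_REACTION_MS = 100   # best-case human reaction time, as quoted above
EARLY_LOCKOUT_MS = 250    # "quarter of a second" penalty for buzzing early


def effective_buzz_time(press_ms: float) -> float:
    """Time (ms after the light at t = 0) at which a human's buzz registers.

    A negative press_ms means the player pressed before the light came on.
    Assumption: after an early press the player is locked out for
    EARLY_LOCKOUT_MS and can then do no better than a normal reaction.
    """
    if press_ms >= 0:
        return press_ms
    return max(press_ms + EARLY_LOCKOUT_MS, HUMAN_REACTION_MS)


def winner(human_press_ms: float) -> str:
    """Who gets to answer, given when the human first presses the buzzer."""
    human_time = effective_buzz_time(human_press_ms)
    return "human" if human_time < MACHINE_LATENCY_MS else "watson"


for press in (100, 5, -20):
    print(f"human presses at {press:>4} ms -> {winner(press)} wins")
```

Under these assumptions the human only wins by landing a press in the first few milliseconds after the light comes on, which is a matter of anticipation and luck rather than reaction.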

If you watch the show, you can see that Ken Jennings clicked his buzzer on almost every question that Watson won on. He just wasn’t fast enough. No human could be, unless they got really lucky with their anticipation of the light going on.

Also, what wasn’t apparent to the viewers was that Watson wasn’t “listening” to any questions. An ASCII text file was transmitted to Watson the moment the question appeared on the board. Watson then parsed that question (which was itself a very impressive feat by IBM) to figure out its true intent, and IBM used a voice feature to read out the answer.

In 2011 this was a genuine achievement by IBM, and we do salute the team that worked on it. What they did was not easy and did advance our understanding of AI. But not really in the way the world understood it. Watson was one small (massively over-hyped) step forward, not the huge leap it appeared to be.

What IBM did was build a fantastic Jeopardy machine. It did use elements of AI, but wasn’t quite the miracle it seemed to be. Yes, it was part AI, but it was also part “Wizard of Oz”. And because it was pretty much a one-trick pony in 2011, IBM has subsequently struggled to make it work in the enterprise, though they have tried.

What has become very apparent since 2011 is that what “worked” for winning at Jeopardy doesn’t work today in the real world. AI has come a long way since those pioneering days, and approaches to creating the ultimate Q&A machine have altered dramatically.

While we commend Watson, and the awareness it created, we believe there are better ways to implement an AI solution that do not require an army of PhD graduates. AI is something that should be, and can be, implemented in a matter of weeks and easily maintained, delivering ROI on day 1.

Please contact us to learn more…
