IBM's Project Debater Gets Its First True Test at THINK

Developing AI systems that can play games has been integral to the AI field since its inception, writes Dan Lahav, Computer Scientist and Debate Expert, IBM Research. In a seminal 1949 paper, Claude Shannon described how to program a computer to play chess at a reasonable level, and a decade later Arthur Samuel of IBM presented the first checkers program.

Fast forward to 2011 when AI was on full display as Watson competed on Jeopardy, arguably thawing a period described as the AI winter. A few days after this historic moment, IBM Researchers started a journey based on a surprisingly simple idea – can we develop an AI system that will be able to argue?

Argumentation and rhetoric already fascinated the greatest Greek philosophers more than two millennia ago, and the ability to argue and to express our reasoning to others laid the cornerstones of civilized society. It is a defining feature of what it means to be human. Thus, developing an AI system for this presumably exclusively human arena represented a formidable challenge.

Project Debater: Being Put to the Test at THINK

Given this challenge, IBM Researchers developed Project Debater, the first ever AI system that can meaningfully engage with humans in a full live debate.

At the start of each debate, a controversial topic is chosen and Project Debater creates a coherent four-minute speech by picking the most relevant and persuasive claims and evidence out of 10 billion sentences from trusted newspapers and journals.

Project Debater then listens to the opponent’s four-minute response, identifies the key claims, and instantly generates a compelling rebuttal, exhibiting human-like reasoning. Both sides then deliver summary speeches. The audience is asked to vote before the debate starts and again at the end.

Project Debater represents a new kind of AI challenge because debating and arguing are far more open-ended activities than playing chess or Go. Even in competitive debate, the rules stem from the human culture of discussion and are not well defined like the rules that determine the allowed moves of a bishop in chess, for example. This fundamental difference carries important implications. First, in complex board games an AI system may come up with any tactic to ensure winning, even if the associated moves cannot be easily interpreted by humans. In debate this is no longer the case; the AI system must adapt to human rationale and propose lines of reasoning that humans can follow and empathize with.

Second, in sharp contrast to previous game-related challenges, in debate there is no natural scoring function the AI system can rely upon. The value of individual moves, i.e., arguments, is often inherently subjective; furthermore, there is not even an agreed objective metric to determine the ‘winner’. Project Debater demonstrates that AI can play a significant role in this uncharted territory as well. We believe it will enable a novel form of decision making that synergistically combines human and machine, allowing people to make more informed decisions.

How would you quantify the value of a claim? Or the value of a piece of supporting evidence? One of the challenges for the researchers was working in an area where there are no right or wrong answers and where labeled data are scarce.

To develop Project Debater, the IBM Research team had to endow the system with three capabilities, each breaking new ground in AI:

1) Data-driven speech writing and delivery: Project Debater is the first demonstration of a computer that can digest massive corpora and, given a short description of a controversial topic, write a well-structured speech and deliver it with clarity and purpose, even incorporating humor where appropriate.

2) Listening comprehension: the ability to identify the key concepts and claims hidden within long continuous spoken language.

3) Modeling human dilemmas: modeling the world of human controversy and dilemmas in a unique knowledge representation, enabling the system to suggest principled arguments as needed.

Tune in Live

In June we unveiled the technology in a controlled environment on IBM premises, in a closed setting with 50 attendees and with two debaters who had debated with the system previously.

But now, this week at THINK in San Francisco, the stakes are much higher. No longer in the safe confines of IBM, we are demonstrating the technology in a public conference center, with an audience of 800 people and a global livestream. In addition, our human debater, Harish Natarajan, is one of the best of the best: he has won the most debating competitions and was the 2012 European debate champion. He has also never competed against Project Debater.

While it can be described as a competition, this is not the right perspective. When humans and machines work together, everybody wins because we complement each other. But find out for yourself.

Watch the live debate at 5PM PST, 8PM ET, 11 Feb (1AM GMT, 12 Feb): https://youtu.be/m3u-1yttrVw
