Alibaba AI Scores Higher than Human in a Q&A Task

An AI research laboratory attached to Chinese tech giant Alibaba has developed a deep learning model that has scored higher than a human on a complex question-answering task in Microsoft’s machine reading comprehension test, MS MARCO.

MS MARCO is a collection of large-scale datasets developed for such deep learning tasks. The dataset includes 1,010,916 user queries to Microsoft’s search engine Bing, and 182,669 natural language answers.

The Alibaba DAMO Academy Language Technology Lab focusses on Natural Language Processing (NLP) used across the Alibaba platform. Its deep learning model also took the top position when tested on retrieval of specific passages, the latest example of China’s rapidly advancing AI capabilities.

It is the second time that the NLP model has outperformed a human. Last year, it scored higher than a human being on Stanford University’s reading-comprehension test, SQuAD, a simpler test based on a significantly smaller dataset.

“The application of the technology is broad, from intelligent chatbots to search engines that can come back with direct answers rather than links to webpages,” Alibaba said.

Alibaba AI — Q&A Task (04/01/2018-Present)

Dr. Luo Si, Leader of the Alibaba DAMO Academy Language Technology Lab, commented in an emailed statement: “We are thrilled by the rapid development of NLP over the past year and the exciting prospects of the technology in real-life uses.”

“NLP has been a core technology that underpins Alibaba’s business for serving hundreds of millions of customers on our e-commerce platforms.”

Moreover their Rouge-L passed the Human benchmark! https://t.co/mGJaJIryt9

— MSMarco (@MSMarcoAI) June 24, 2019
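For readers unfamiliar with the metric in the tweet: ROUGE-L scores a generated answer by the longest common subsequence (LCS) of words it shares with a reference answer, combining precision and recall into an F-score. The sketch below is an illustration only, not MS MARCO’s official scorer. The function names, the lowercase whitespace tokenisation and the default `beta` of 1.0 are all assumptions made for this example, and the official evaluation may differ on each.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-score between a candidate answer and one reference answer.

    Assumptions for illustration: lowercase whitespace tokenisation and
    beta=1.0 (equal weight on precision and recall).
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    print(rouge_l("the cat sat on the mat", reference))
    print(rouge_l("the cat sat", reference))
```

An answer identical to the reference scores the maximum, while a shorter answer that covers only part of the reference keeps full precision but loses recall, which lowers its score.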

Alibaba AI Lab

One of the DAMO lab’s key models is the Ali Reader, which interprets user requests and queries by processing unstructured text such as documents, descriptions and web pages, and then summarises an answer to the query for the user.

Alibaba currently uses the Ali Reader in its Alicare and intelligence service robot platforms. The intelligence robot service is built on NLP technology that enables smart dialogue through a range of clients, such as websites, mobile apps, and robots.

Dr. Luo Si said: “Moving forward, we plan to put NLP on our cloud computing platform Alibaba Cloud, so more clients especially businesses in retail, tourism and public services that involves Q&A tasks could benefit from the technology.”

“We are also exploring a multi-module model that combines features of NLP, speech AI and machine translation. With such a model, users with different languages can communicate more freely with each other and interact with the machine in real-time without language barriers.”

The model used in the MS MARCO test was first developed as a deep cascade model by Alibaba’s researchers, the company said in a release. It was then put to the test in Alibaba’s AliMe Chatbot system, where it assisted with online inquiries from over two million daily visitors to the company’s retail platforms. An enriched BERT model was then developed to enhance both the accuracy and efficiency of machine understanding. (BERT is a technique for NLP pre-training that Google open-sourced late last year.)

Currently, the BERT model sits in the top three on another world-class language understanding test, the GLUE Benchmark, which counts DeepMind among its organisers.
