
Facebook Says Its New AI Training Recipe Upgrades Google’s Natural Language Processing System

Approach "leads to better downstream task performance"

By CBR Staff Writer

Facebook has created and published a new AI training recipe dubbed “RoBERTa” – a tool that has shot to the top of the General Language Understanding Evaluation (GLUE) benchmark leaderboard, Facebook said today.

GLUE is a collection of tools for evaluating the performance of models across a diverse set of existing Natural Language Processing (NLP) tasks. It is designed to help researchers develop ways for their AI systems to process language in a way that is not exclusive to a single task, genre, or dataset.

RoBERTa is an optimisation of BERT, Google’s popular system for pre-training NLP models, which Google open sourced in November last year. It relies on unannotated text drawn from the web, as opposed to a language corpus that has been labelled specifically for a given task.

As Facebook AI put it in a new paper published on Tuesday: “RoBERTa builds on BERT’s language masking strategy, wherein the system learns to predict intentionally hidden sections of text within otherwise unannotated language examples.


“RoBERTa, which was implemented in PyTorch, modifies key hyperparameters in BERT, including removing BERT’s next-sentence pretraining objective, and training with much larger mini-batches and learning rates. This allows RoBERTa to improve on the masked language modeling objective compared with BERT and leads to better downstream task performance. We also explore training RoBERTa on an order of magnitude more data than BERT, for a longer amount of time. We used existing unannotated NLP data sets as well as CC-News, a novel set drawn from public news articles.”
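The language masking strategy described above can be sketched in plain Python. This is an illustrative toy, not Facebook’s actual implementation: the token list, the 15 per cent masking rate, and the `[MASK]` symbol are placeholder assumptions chosen only to show the idea of hiding tokens and keeping the originals as prediction targets.

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Randomly hide a fraction of tokens; the model must predict the originals.

    Illustrative sketch of BERT-style language masking only.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)   # hide the token from the model
            labels.append(tok)          # the hidden original is the training target
        else:
            masked.append(tok)
            labels.append(None)         # not a prediction target
    return masked, labels

tokens = "the quick brown fox jumps over the lazy dog".split()
masked, labels = mask_tokens(tokens)
```

During pre-training, a model sees `masked` as input and is scored only on how well it recovers the hidden originals in `labels` — no human annotation of the text is needed, which is what makes the approach self-supervised.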

The results showed that tuning BERT’s training procedure can significantly improve its performance on a variety of NLP tasks, the team said.
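As a rough sketch, the procedural changes the paper describes — dropping the next-sentence objective and training with much larger mini-batches and learning rates — can be expressed as a configuration diff. The numeric values below are illustrative placeholders, not the paper’s actual settings.

```python
# Illustrative pretraining configurations; values are hypothetical,
# chosen only to show the direction of the changes described.
bert_style = {
    "objectives": ["masked_lm", "next_sentence_prediction"],
    "batch_size": 256,
    "peak_learning_rate": 1e-4,
}

roberta_style = {
    "objectives": ["masked_lm"],   # next-sentence pretraining objective removed
    "batch_size": 8192,            # much larger mini-batches
    "peak_learning_rate": 6e-4,    # higher peak learning rate
}
```

The point of the comparison is that RoBERTa changes how BERT is trained rather than its underlying architecture.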

Its release is part of Facebook’s “ongoing commitment to advancing the state-of-the-art in self-supervised systems that can be developed with less reliance on time- and resource-intensive data labeling”. It comes days after it also shared a dataset dubbed “WikiMatrix” that includes 135 million parallel sentences for 1,620 different language pairs in 85 different languages. The dataset was extracted from Wikipedia with the aim of directly training neural machine translation systems between distantly related languages, without the need to first translate to English.

Facebook AI has published the RoBERTa model along with its pretraining and fine-tuning code, implemented in PyTorch, as well as the WikiMatrix dataset and examples.
