August 21, 2019

LinkedIn Open Sources the Brains Behind its Abuse Prevention Algorithm

"Software library out into the developer community"

By CBR Staff Writer

LinkedIn, the world’s largest professional social networking site, announced last week that it has open-sourced a machine learning library named ‘Isolation Forest’, an implementation of the widely used machine learning algorithm of the same name. LinkedIn uses the library to detect online abuse and protect its users from it.

LinkedIn’s implementation of the Isolation Forest is, at its core, machine learning: a modern approach to writing software in which the software, rather than a human, makes decisions based on what it has learned from data. In its article announcing the new library, the firm outlined the “unique challenges” it faces in using machine learning to tackle online abuse.

Chief among these challenges, the firm said, are two. The first is labelling the data, a step typically required in ‘supervised learning’ (a widely used approach to machine learning). The second is adversarial adaptivity, which simply means that the ‘abusers’ are “quick to adapt and evolve”.

Detecting Suspicious Abnormalities in the Data

As a result of these challenges, the team took a different approach. It still uses machine learning, but relies on a well-known algorithm called ‘Isolation Forest’, which identifies outliers (essentially, data points that look different from the norm) in unlabelled data. Because outliers are few and unlike the rest, a small number of random splits of the data is usually enough to isolate them, which makes it much easier to tell when something in the data is strange.
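To make the idea concrete, here is a minimal pure-Python sketch of the general Isolation Forest technique. It is not LinkedIn's library, whose API this article does not describe, and every name in it (`build_tree`, `path_length`, `anomaly_scores`, `c_factor`) is hypothetical and chosen for illustration. The defaults of 100 trees and a sample size of 256 are assumptions, not values taken from LinkedIn. Each tree splits the data on a random feature at a random threshold; points that end up alone after only a few splits receive a score close to 1.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over
    n points; used to normalise path lengths."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation via Euler's constant
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:  # nothing left to split on this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Number of splits needed to reach a leaf, plus an adjustment for
    leaves that still hold several points."""
    if "size" in node:
        return depth + c_factor(node["size"])
    go_left = point[node["feature"]] < node["split"]
    child = node["left"] if go_left else node["right"]
    return path_length(point, child, depth + 1)


def anomaly_scores(points, n_trees=100, sample_size=256, seed=0):
    """Return one score per point in (0, 1]; higher means more anomalous."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for p in points:
        avg = sum(path_length(p, t) for t in trees) / n_trees
        scores.append(2.0 ** (-avg / norm))
    return scores
```

Given a list of equal-length numeric tuples, `anomaly_scores(points)` returns one score per point, and no labels are needed at any stage. In this sketch a point that sits far from the bulk of the data should rank at or near the top.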

With this in mind, the team explained how the approach helps it identify potentially abusive behaviour and safeguard its users: “For some types of abuse, such as spam, it is possible to have a scalable review process where humans label training examples as spam or not spam. There are other types of abuse, such as scraping, where this kind of scalable human labeling is much more difficult, or impossible.”

Open Sourced and Applicable for Payment Fraud Through to Data Center Monitoring

As noted earlier, LinkedIn has released this software library to the developer community, meaning that developers at other firms can apply this unsupervised machine learning approach in a variety of contexts. The team suggests the following potential uses:

  • Automation detection
  • Payment fraud
  • ML health assurance
  • Data center monitoring

In closing, other implementations of Isolation Forest are already available to developers. However, given the size and scale of LinkedIn’s platform and of the firm’s engineering output, this open-sourced library is likely to benefit other technology teams looking to solve similar outlier detection problems and, most importantly, to protect users in a variety of contexts.
