February 20, 2020 (updated 27 Jul 2022, 10:10am)

Vulnerabilities in the Core: Key Lessons from a Major Open Source Census

"Hundreds of thousands of open source software packages are in production applications throughout the supply chain..."

By CBR Staff Writer

A major new Open Source census has identified the Top 20 most commonly used free and open source software (FOSS) components in production applications.

The Linux Foundation/Laboratory for Innovation Science at Harvard (LISH) “Census II” report, published this week, represents what it describes as the “first steps toward addressing the structural issues that threaten the FOSS ecosystem.”

What “Structural Issues”?

The report aims to examine the risk of vulnerabilities in these projects due to widespread use of outdated versions; understaffed projects; and the existence of known security flaws. (As the list reveals, many are only sporadically updated.)

It comes amid growing concerns in some quarters about the “back-dooring” of open source software code bases, following several recent such attacks.

(Most famously, a malicious actor gained publishing rights to the event-stream package, a popular JavaScript library, and then wrote a backdoor into the package. In July 2019, a Ruby developer’s repository was also taken over and code back-doored.)

Jim Zemlin, executive director at the Linux Foundation, said: “The report begins to give us an inventory of the most important shared software and potential vulnerabilities and is the first step to understand more about these projects so that we can create tools and standards that results in trust and transparency in software.”

He added: “Open source is an undeniable and critical part of today’s economy, providing the underpinnings for most of our global commerce. Hundreds of thousands of open source software packages are in production applications throughout the supply chain, so understanding what we need to be assessing for vulnerabilities is the first step for ensuring long-term security and sustainability of open source software.”

Software Bill of Materials

It also comes as the US federal government looks to create a Software Bill of Materials that will require all industries to detail the composition of their software systems.

The census authors note: “There is far too little data on actual FOSS usage. Although public data on package downloads, code changes, and known security vulnerabilities abound, the view on where and how FOSS packages are being used remains opaque.

“Accurate project identification impacts not only academia, but the private sector as well. As cyberattacks and security breaches increase, all companies, not just Big Tech, will need to become more cognizant of which components comprise their websites and applications, as well as the origins of those components.”
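To make the idea of a component inventory concrete, here is a minimal sketch in Python. It is only an illustration, not the format of any official Software Bill of Materials: it lists the name and version of each Python distribution installed in the current environment using the standard library's `importlib.metadata`. The function name `list_components` is hypothetical and chosen for this example.

```python
from importlib import metadata


def list_components():
    """Return sorted (name, version) pairs for installed Python distributions.

    Hypothetical helper for illustration; a real SBOM would also record
    origins, licences and dependency relationships.
    """
    components = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions whose metadata lacks a name
            components.add((name, dist.version))
    # Sort case-insensitively by package name
    return sorted(components, key=lambda c: c[0].lower())


if __name__ == "__main__":
    for name, version in list_components():
        print(f"{name}=={version}")
```

Even a flat list like this shows which components an application environment actually contains, which is the starting point the census authors describe.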

Open Source Census: The Top 10 FOSS Components in Production Applications 

Here are the Top 10 most-used FOSS packages*, listed in alphabetical order. (Titles are hyperlinked to repositories.) Because these are dominated by JavaScript-related packages, the open source census also compiled a non-JavaScript list, shown further below.

1: async

A utility module which provides functions for working with asynchronous JavaScript.

2: inherits

A browser-friendly inheritance fully compatible with standard node.js inherits.

3: isarray

Array#isArray for older browsers and deprecated Node.js versions.

4: kind-of

Get the native JavaScript type of a value.

5: lodash

Another modern JavaScript utility library.

6: minimist

This module is the guts of optimist’s argument parser.

7: natives

Do stuff with Node.js’s native JavaScript modules.

8: qs

A querystring parsing and stringifying library with some added security.

9: readable-stream

Node.js core streams for userland.

10: string_decoder

Node-core string_decoder for userland.

How Were These Identified?

The research drew on public data sets and on private usage data from Software Composition Analysis (SCA) and application security companies, including Snyk and the Synopsys Cybersecurity Research Center (CyRC), working in partnership with the Linux Foundation’s Core Infrastructure Initiative (CII). The SCA partners provided data from automated scans of production systems within their customers’ environments.
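As a rough illustration of how scan data can be turned into a "most used" ranking, the sketch below tallies how many scans each package appears in. It is not the census's actual methodology, which this article does not detail; the function `rank_packages` and the sample scan data are hypothetical.

```python
from collections import Counter


def rank_packages(scans, top_n=10):
    """Count how many scans each package appears in and return the top_n."""
    counts = Counter()
    for packages in scans:
        counts.update(set(packages))  # count each package once per scan
    # Sort by count (descending), then by name so ties are deterministic
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical scan results, one list of package names per scanned system
scans = [
    ["lodash", "async", "qs"],
    ["lodash", "minimist"],
    ["lodash", "async", "async"],
]
print(rank_packages(scans, top_n=2))  # [('lodash', 3), ('async', 2)]
```

Counting each package once per scan keeps a single system that bundles many copies of a package from skewing the ranking.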

The list below covers the most-used non-JavaScript FOSS packages among those reported in the private usage data contributed by SCA partners.

The non-JavaScript FOSS packages Top 10

1: com.fasterxml.jackson.core:jackson-core
A core part of Jackson that defines Streaming API as well as basic shared abstractions.

2: com.fasterxml.jackson.core:jackson-databind
A general data-binding package for Jackson (2.x): works on streaming API (core) implementation(s).

3: com.google.guava:guava
Google core libraries for Java.

4: commons-codec
Apache Commons Codec (TM) software that provides implementations of common encoders and decoders such as Base64, Hex, Phonetic and URLs.

5: commons-io
Commons IO is a library of utilities to assist with developing IO functionality.

6: httpcomponents-client
The Apache HttpComponents project is responsible for creating and maintaining a toolset of low level Java components focused on HTTP and associated protocols.

7: httpcomponents-core

8: logback-core
A generic logging framework for Java.

9: org.apache.commons:commons-lang3
A package of Java utility classes for the classes that are in java.lang’s hierarchy, or are considered to be so standard as to justify existence in java.lang.

10: slf4j:slf4j
A simple logging facade for Java.

“FOSS was long seen as the domain of hobbyists and tinkerers. However, it has now become an integral component of the modern economy and is a fundamental building block of everyday technologies like smart phones, cars, the Internet of Things, and numerous pieces of critical infrastructure,” said Frank Nagle, a professor at Harvard Business School and co-director of the Census II project. “Understanding which components are most widely used and most vulnerable will allow us to help ensure the continued health of the ecosystem and the digital economy.”

The full Linux Foundation report can be read here [pdf].

* A unit of software that can be installed and managed by a package manager — in turn, defined as “software that automates the process of installing/managing packages.”
