April 15, 2019

AWS Gets 18 New Public Datasets: From African Soil Chemistry, via Wind Power Data

109 datasets now available

By CBR Staff Writer

Eighteen new public datasets are now available on a growing AWS registry, ranging from an encyclopedia of DNA elements to African soil chemistry data, via meteorological conditions and turbine power for more than 126,000 wind power sites.

The data was added to the public cloud giant’s Public Datasets programme, which provides free cloud storage for public datasets. AWS users can then choose to build services on top of it using a broad range of its commercial tools.

Amazon says it hopes the programme will help developers create “new cloud-native techniques, formats, and tools that lower the cost of working with data”.

New Public Datasets: A Snapshot

Among the newly added datasets is nine years’ worth of georeferenced soil sample data, collected through the Africa Soil Information Service (AfSIS) project from 2009 to 2018. (Researchers have already used this data to train machine learning algorithms that predict crop yield.)

One of the other newly added datasets was submitted by the University of Washington and contains 2PB of observations from the Murchison Widefield radio telescope array in Western Australia. These observations were taken in order to help detect the signatures of the first formation of galaxies and stars, which can give us a greater understanding of the evolution of the universe.

The University of Pennsylvania, meanwhile, has added a large-scale multilingual dataset of images paired with words. This dataset matches words with their equivalents in 97 other languages. Words in each language are stored in parallel with the images that represent them.

Image: “Kucing”, the Indonesian word for cat. Source: University of Pennsylvania

If an organisation joins the Amazon Public Dataset project with the intention of adding its work, it is required to take on some responsibilities in relation to it, including maintaining and managing the quality of all data content it submits to the programme. Any contributor is also required to make “reasonable efforts” to optimise the end user experience.

12 Trillion Lidar Point Cloud Records

Last February the United States Geological Survey (USGS) uploaded its 3D Elevation Program (3DEP) dataset to AWS. It contains 12 trillion LIDAR point cloud records from over 1,200 projects across the United States.

The 3DEP initiative collects three-dimensional information from all over the US using LIDAR technology. A laser-based remote sensing device is fitted to an aircraft, enabling it to collect billions of LIDAR pulse returns that help to build a 3D map of an area.

Image Source: USGS

Kevin Gallagher, Associate Director for USGS Core Science System, commented: “The 3D Elevation Program was founded on the concept that high resolution elevation data should be provided unlicensed, free and open to the public.”

“This agreement with Amazon helps to fulfill that promise by providing cloud-access to the trillions of data points collected through the Program.”

“The democratization of elevation data is a tremendous achievement by the community of partners leading this effort and promises to revolutionize approaches to applications from flood forecasting and geologic assessments to precision agriculture and infrastructure development.”

There are now 109 datasets available under the programme.

A full registry is available on AWS’s GitHub repo.
