Amazon Web Services (AWS) has launched a “data exchange” service that lets customers connect to and ingest data from an array of third-party data providers, so they can augment their data sets and build richer machine learning models.
The AWS Data Exchange launches with more than 1,000 licensable data products from over 80 data providers. AWS cites a “diverse catalogue” of free and paid offerings across financial services, health care / life sciences, geospatial, weather, and mapping.
“Once subscribed, customers can use the AWS Data Exchange API or console to load data they subscribe to directly into Amazon S3”, AWS said.
Providers include Change Healthcare, Deloitte, Foursquare and Reuters. (Change Healthcare has access to over 14 billion healthcare transactions and $1 trillion in claims annually, while Foursquare processes data on 220 million consumers.)
The service includes an integration with machine learning specialist Databricks, whose Pankaj Dugar said: “[The service allows customers to combine] third-party data with their existing data lakes to perform advanced data science and analytics at scale.”
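For readers wondering what loading subscribed data “directly into Amazon S3” through the API might involve, here is a minimal Python sketch. It assumes the boto3 SDK's `dataexchange` client and its `EXPORT_ASSETS_TO_S3` job type; the helper names `build_export_request` and `export_to_s3`, along with every ID, bucket name and key prefix, are hypothetical placeholders rather than anything AWS quoted, and the request shape should be checked against the current AWS documentation before use.

```python
from typing import Any, Dict, List


def build_export_request(data_set_id: str, revision_id: str,
                         asset_ids: List[str], bucket: str,
                         prefix: str = "") -> Dict[str, Any]:
    """Build keyword arguments for an export-to-S3 job (assumed request shape)."""
    # One destination per asset; the object key is simply prefix + asset ID here.
    destinations = [
        {"AssetId": asset_id, "Bucket": bucket, "Key": f"{prefix}{asset_id}"}
        for asset_id in asset_ids
    ]
    return {
        "Type": "EXPORT_ASSETS_TO_S3",
        "Details": {
            "ExportAssetsToS3": {
                "DataSetId": data_set_id,
                "RevisionId": revision_id,
                "AssetDestinations": destinations,
            }
        },
    }


def export_to_s3(request: Dict[str, Any], region: str = "us-east-1") -> str:
    """Create and start the export job, returning its job ID.

    Needs AWS credentials and an active subscription to the data set.
    """
    import boto3  # imported lazily so the request builder works without the SDK

    client = boto3.client("dataexchange", region_name=region)
    job = client.create_job(**request)
    client.start_job(JobId=job["Id"])
    return job["Id"]
```

A subscriber would call something like `export_to_s3(build_export_request("example-data-set-id", "example-revision-id", ["example-asset-id"], "example-bucket", "incoming/"))`, substituting identifiers from their own subscription.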
AWS Data Exchange
Currently, many organisations and academic institutions use third-party data to assist with scientific research or to train machine learning models.
Problems can arise when data is shipped in a physical format, which can cause delivery delays, or when organisations have to manage credentials for multiple File Transfer Protocol (FTP) hosts, creating a messy sharing ecosystem.
Trying to negotiate the delivery of third-party data, while also establishing automated update protocols to ensure that applications, data lakes and ML models receive the latest information, can lead to inconsistent data ingestion and may reduce a company’s business advantage.
On the other side of the coin, data providers can struggle to reach everyone who would be interested in their datasets, because reaching all potential customers requires significant investment and marketing engagement. AWS notes that the difficulty is compounded for data providers' customers, who also have to “manage disparate billing relationships and licensing agreements with every data provider they use.”
Addressing privacy concerns, AWS notes that the Data Exchange “prohibits” the sharing of sensitive personal data or any data that is not legally available. Data providers can also review each subscription so they can manage compliance concerns.