Amazon Web Services’s AWS Lake Formation – a new data lake service generally available today – has its default permissions for newly created databases and tables set to “everyone” in a configuration choice that left cloud security experts alarmed.
Misconfigured cloud storage is a leading source of data breaches. While the onus is on users to ensure services are configured properly (AWS has a dizzying array of security features), pressure has been building on service providers to limit permissions by default. And the setting is not the only one to raise questions in the new service…
What is AWS Lake Formation?
The new offering aims to make it simpler to create data lakes (repositories that hold large amounts of raw data in its native format) by automating the cataloguing of data and securely making it available for analytics via other AWS services.
It emphasises its strong security controls, with biotech company Amgen saying in a press release today: “Setting up security and access controls for each AWS account… could be cumbersome. AWS Lake Formation streamlines the process with a central point of control while also enabling us to manage who is using our data, and how.”
But independent AWS security consultant Scott Piper was among those noticing that the new service seemed to have some eyebrow-raising default settings.
AWS Lake Formation Default Settings: “Everyone”?
He said: “Its settings by default appear to be making all your data public.
“My hope is the Lake Formation team is not aware that ‘Everyone’ on AWS usually means public (e.g. that’s what it means for S3) and hoping this is accessible only to the principals in the account.” [UPDATED: Something later confirmed by AWS. Read on…]
AWS later updated the new service’s site to clarify what “everyone” means here.
“The grant ‘all to everyone’ settings below are turned on for this account for compatibility with any existing Glue permissions. ‘Everyone’ here means every IAM principal in your account, not the public. These settings are not recommended for production environments and we strongly suggest that you revoke them.”
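For readers who want to act on that advice, the revocation can be sketched in Python. This is a minimal sketch, not AWS’s own procedure: it assumes the “Everyone” group is exposed through the Lake Formation API as the principal `IAM_ALLOWED_PRINCIPALS`, and that the client passed in behaves like boto3’s `lakeformation` client (`get_data_lake_settings`, `put_data_lake_settings`, `revoke_permissions`). The helper names are hypothetical.

```python
from typing import Any, Dict, Iterable

# Assumption: this is the API-level name of the group the console labels "Everyone".
EVERYONE = "IAM_ALLOWED_PRINCIPALS"


def locked_down_settings(current: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the data lake settings with both default grants emptied,
    so newly created databases and tables no longer get 'all to everyone'."""
    updated = dict(current)
    updated["CreateDatabaseDefaultPermissions"] = []
    updated["CreateTableDefaultPermissions"] = []
    return updated


def revoke_everyone_request(database: str) -> Dict[str, Any]:
    """Build the arguments for revoking the 'Everyone' grant on one existing database."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": EVERYONE},
        "Resource": {"Database": {"Name": database}},
        "Permissions": ["ALL"],
    }


def apply_lockdown(client: Any, databases: Iterable[str]) -> None:
    """Empty the account-wide defaults, then revoke the grant on each named database.

    `client` is assumed to behave like boto3.client("lakeformation").
    """
    current = client.get_data_lake_settings()["DataLakeSettings"]
    client.put_data_lake_settings(DataLakeSettings=locked_down_settings(current))
    for name in databases:
        client.revoke_permissions(**revoke_everyone_request(name))
```

With boto3 installed and credentials configured, this could be called as `apply_lockdown(boto3.client("lakeformation"), ["sales_db"])`, where `sales_db` is a made-up database name. The sketch only covers database-level grants; tables that already exist carry their own grants and would need the same treatment.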
Yet it wasn’t the only configuration that raised eyebrows: a managed policy setting also drew the attention of Piper, an ex-NSA staffer turned AWS security expert.
— Scott Piper (@0xdabbad00) August 8, 2019
Piper told Computer Business Review: “cloudtrail:LookupEvents tells you all of the activity happening in the account, so all of the APIs being called.
“The managed policy is restricted around privileges you would need to use lakeformation (i.e. least privilege) and then it gives you this, which gives you visibility into everything happening in the account, so you can see new users being added or removed, new cloudformation stacks created, etc.”
He added: “This gives more visibility into the account than is needed. AWS has done this before with their ECR policy where for some reason the web console apparently wants this privilege in order to tell the user when they’ve encountered errors somehow. I didn’t really understand AWS’s reasoning when I had reported the problem for ECR (they haven’t gotten back to me for this latest issue).
“This privilege doesn’t directly grant you access to sensitive info but potentially can because unfortunately a lot of cloudformation templates contain secrets (access keys, passwords, etc.). A non-technical explanation might be that this privilege lets you see the equivalent of all of the email subject lines, senders and receivers, in a company, when they should only be seeing their own.”
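Teams wanting to check whether one of their own policies carries the same privilege could scan the policy document for it. The following is a rough sketch in Python: the function name and the example policy are illustrative, not the actual AWS managed policy, and the check deliberately ignores `Deny` statements, `NotAction`, resources and conditions, so it is no substitute for a proper IAM evaluation.

```python
import fnmatch
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """IAM allows a single item where a list is expected; normalise to a list."""
    return value if isinstance(value, list) else [value]


def allows_action(policy: Dict[str, Any], action: str) -> bool:
    """Return True if any Allow statement has an Action pattern matching `action`.

    Action names are compared case-insensitively and '*' / '?' wildcards are
    honoured. Deny statements, NotAction, Resource and Condition are ignored.
    """
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        for pattern in _as_list(statement.get("Action", [])):
            if fnmatch.fnmatchcase(action.lower(), pattern.lower()):
                return True
    return False


# Illustrative policy only: a narrowly scoped service grant plus the broad
# CloudTrail read privilege discussed above.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["lakeformation:*", "cloudtrail:LookupEvents"],
            "Resource": "*",
        }
    ],
}

if allows_action(example_policy, "cloudtrail:LookupEvents"):
    print("Policy grants account-wide visibility into CloudTrail events")
```

Running the same function over each policy attached to a role gives a quick first pass at spotting where this privilege has been handed out more widely than intended.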
2.3 Billion Files Exposed Online
Piper flagged the AWS Lake Formation default settings amid an ongoing problem of companies failing to properly configure their cloud or other database settings.
Security firm Digital Shadows has identified a colossal 2.3 billion files exposed online, half of them via the Server Message Block (SMB) protocol, a file-sharing technology first designed in 1983. Other commonly misconfigured technologies include FTP services, rsync, and Network Attached Storage devices.
(Amazon introduced a new feature, “Block Public Access”, in November 2018 that has sharply reduced exposure of S3 buckets, Digital Shadows’ “Too Much Information” report reveals, noting “from the 16 million files we detected in Oct 2018 coming from S3 buckets, we are now [May 2019] detecting less than 2,000 files being exposed.”)
AWS Lake Formation: Don’t People Just Use S3 Storage Anyway?
While many AWS users already create data lakes using S3 buckets, AWS says the process can still be time-consuming.
In an announcement today, the largest public cloud provider said: “Customers [would previously still] need to provision and configure storage, move data from disparate sources into the data lake, and extract the schema and add metadata tags to make it accessible from a searchable data catalog.”
Partitioning, indexing, and transforming the data to optimise the performance and cost of running analytics on it was still time-consuming, it noted, as was setting up data access roles and enforcing security policies. As a result, it said, customers can take up to several months to set up a data lake.
The service itself is free for AWS users, who pay only for the underlying AWS services used (e.g. S3, Athena, etc.). AWS Lake Formation provides a “single, centralised place to set up and manage data access policies, governance, and auditing across Amazon S3 and multiple analytics engines”, the company said in its release.
The service is available today in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland).
Additional regions are “coming soon”.