Dangers of Data Sprawl Increase During the Remote Work Revolution

In my experience working with Global 2000 enterprise companies, particularly those with active software development projects, I have noticed a troubling trend, writes Jason Truppi, co-founder of cybersecurity services firm ShiftState Security.

Whether the development project is outsourced or completely in-house, the misuse of sensitive private data is overwhelmingly common, and security requirements are often waived in favor of the needs of the business.

As a security professional who has worked hundreds of breaches, I know what inevitably happens to that data. The unfortunate truth is that the data eventually gets leaked, exposed, stolen, and misused through misconfiguration, mishandling, or direct exploitation. Last year the average cost of a data breach was approximately $4 million, and there were plenty of breaches to point to that could have been mitigated or even prevented with proper data access control.

This isn’t just an overzealous hypothetical—it happens all the time. Facebook announced that around 100 developer partners had direct access to private, sensitive user data. Likewise, Twitter had a situation in which usernames and passwords were stored in plain text due to a logging bug. These sorts of breaches aren’t just issues for social media sites, though. Capital One, The Red Cross, Booz Allen, and countless others have fallen victim to similar issues. There are seemingly limitless examples of data being stored by third parties and/or cloud storage platforms, which are eventually breached.

As software eats the world, more and more companies are investing in outsourced development and cloud data storage (data warehouses and lakes) for quicker development cycles and broader business access. Both scenarios create a perfect storm for significantly increasing risk to the business. And as the business’s need to access the data expands, the data receives less scrutiny and less control. Here are a few observations I’ve made about practices that open companies to additional data risk:

Production data used for development and testing – Software development inherently requires a minimum amount of production data during the building and testing process. Due to the demand, development teams frequently access sensitive data from internal corporate resources to meet development milestones and quality benchmarks. Unfortunately, developers have notoriously lax security controls on their work devices. If you talk to your dev teams, they will argue that adding multiple endpoint security and systems management tools interferes with their applications’ communications or slows down their machines. In turn, many of the developers who hold this private data remove the security and operational controls on their machines, lobby for their removal, or circumvent corporate policies entirely. While I understand their reasoning for pushing back on security controls, this means the industry overall leaves itself unnecessarily vulnerable in an effort to protect productivity.

Given that most companies make these tradeoffs, this places them in the precarious position of sharing and storing sensitive data on a number of developer machines (connected not only to the corporate network, but also to partner networks and other third parties) without proper security controls or governance.

Increased access to cloud data storage – The move to cloud storage is nothing new, but what is alarming is how much data is being stored in data warehouses and data lakes, and how many more people in an organization have access to that data than ever before. Adding more people and more data to a centralized repository increases the risk that the data will not be governed properly. The question I usually ask companies is, “Who is in charge of data security?” The answers I receive usually amount to finger-pointing between developers, security teams and compliance teams. What you will find is that there is no real champion with the right amount of cross-domain knowledge, security experience or enforcement power for the security of that data.

Data exposed to newly remote workers in response to COVID-19 – Essential business functions need to continue during this pandemic, but that means that employees will be accessing more data through untrusted devices than ever before. Companies have scrambled to buy new software and hardware to support the rapid shift to remote work, but many were not prepared and were forced to allow employees to access corporate resources from their personal devices. This can lead to unnecessary exposure of data onto devices that are outside the security boundaries of a company.

What If There Was A Way To Mitigate These Risks?

Of course there are mitigations to these problems. It just depends on what problem you are trying to solve.

Data synthesis: There’s no way around the fact that developers need realistic data during their development phases, but time and time again the practice has proven a dangerous one, often exposing your organization to risk unnecessarily. This is where data synthesis comes in. Real production data can be transformed into synthesized data which functions exactly like real data with none of the associated risk. This means that the synthetic data can be transferred to any part of your organization, or third parties, without concerns over potential exposure or violating data regulations. This is a great way to mitigate data sprawl for development projects on critical data sets.
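As a rough illustration of the idea, here is a minimal Python sketch of one naive way to synthesize a table. Everything in it is an assumption for illustration: the field names (`name`, `email`, `age`, `plan`), the sample rows, and the `synthesize` helper are hypothetical. It replaces direct identifiers outright and samples the remaining columns independently, so it ignores correlations between columns and offers no formal privacy guarantee on its own. Dedicated synthesis tools go considerably further.

```python
import random
import statistics


def synthesize(records, n, seed=0):
    """Return n synthetic rows shaped like `records` (a list of dicts).

    Hypothetical helper: columns are treated independently, which is a
    simplification and not how production-grade synthesis tools work.
    """
    rng = random.Random(seed)
    ages = [r["age"] for r in records]
    mu, sigma = statistics.mean(ages), statistics.pstdev(ages)
    lo, hi = min(ages), max(ages)
    plans = [r["plan"] for r in records]

    rows = []
    for i in range(n):
        age = round(rng.gauss(mu, sigma))
        rows.append({
            # Direct identifiers are never copied; they are generated.
            "name": f"user_{i:04d}",
            "email": f"user_{i:04d}@example.invalid",
            # Numeric field: drawn from a normal fit, clamped to the observed range.
            "age": min(max(age, lo), hi),
            # Categorical field: sampled at the observed frequencies.
            "plan": rng.choice(plans),
        })
    return rows


# Made-up "production" rows, used only to drive the example.
production = [
    {"name": "Alice Smith", "email": "alice@corp.example", "age": 34, "plan": "pro"},
    {"name": "Bob Jones", "email": "bob@corp.example", "age": 51, "plan": "free"},
    {"name": "Carol Wu", "email": "carol@corp.example", "age": 28, "plan": "free"},
    {"name": "Dan Patel", "email": "dan@corp.example", "age": 45, "plan": "pro"},
]

synthetic = synthesize(production, n=5)
for row in synthetic:
    print(row)
```

The synthetic rows keep the shape developers need (same fields, plausible values) while no real name or email address leaves the production environment.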

Data security as a service: There are data access brokers and data-security-as-a-service tools that focus on securing data flow and access. They can work in cloud environments and/or protect on-prem and legacy applications, depending on your configuration. These software tools can give you very granular access to and control of your data, down to the specific hosts, users, queries, data fields and data types. These technologies are everything we ever wanted from our databases that we never received from database engineers or IT teams. Be sure to baseline your configurations before implementing any particular solution, so you have quality metrics to show your boss or compliance team after implementation.
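To make "granular" concrete, the following Python sketch shows the kind of check a data access broker might apply before returning a row. It is not the API of any particular product: the `Policy` class, the `POLICIES` table, the role names, the host addresses, and the `broker_read` function are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical per-role policy: which hosts may connect and which
    fields may be returned in clear text or in masked form."""
    allowed_hosts: frozenset
    allowed_fields: frozenset
    masked_fields: frozenset = frozenset()


# Illustrative policy table; roles and addresses are made up.
POLICIES = {
    "analyst": Policy(
        allowed_hosts=frozenset({"10.0.0.5"}),
        allowed_fields=frozenset({"plan", "age", "email"}),
        masked_fields=frozenset({"email"}),
    ),
    "support": Policy(
        allowed_hosts=frozenset({"10.0.0.7"}),
        allowed_fields=frozenset({"name", "email", "plan"}),
    ),
}


def broker_read(role, host, row):
    """Return only the fields the role may see, masking where required."""
    policy = POLICIES.get(role)
    if policy is None or host not in policy.allowed_hosts:
        raise PermissionError(f"{role!r} from {host!r} is not allowed")
    result = {}
    for key, value in row.items():
        if key not in policy.allowed_fields:
            continue  # Field is dropped entirely for this role.
        result[key] = "***" if key in policy.masked_fields else value
    return result


ROW = {"name": "Alice Smith", "email": "alice@corp.example", "age": 34, "plan": "pro"}
print(broker_read("analyst", "10.0.0.5", ROW))
print(broker_read("support", "10.0.0.7", ROW))
```

The point of the design is that the decision is made per request, per host and per field, rather than by granting a user blanket access to a whole database.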

Differential privacy: This is a field that has been evolving rapidly over the last several years. The idea is to give business units access to data, or metadata, good enough to give them the insights they need to grow their business, but not granular enough to expose the individual private records. Companies such as Google and Facebook have pioneered these techniques and provide open source projects to help in this process.
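One standard building block in this field is the Laplace mechanism, which adds calibrated random noise to an aggregate before releasing it. The Python sketch below applies it to a simple count query. The `private_count` function, the sample rows, and the choice of `epsilon` are assumptions for illustration only; real deployments should use a vetted library and track the privacy budget across queries.

```python
import random


def private_count(rows, predicate, epsilon, rng=None):
    """Return a noisy count of rows matching `predicate`.

    Hypothetical helper. Adding or removing one person changes a count by
    at most 1, so the sensitivity is 1 and the Laplace noise scale is
    1 / epsilon. Smaller epsilon means more noise and stronger privacy.
    """
    rng = rng or random.Random()
    true_count = sum(1 for r in rows if predicate(r))
    rate = epsilon / 1.0  # rate = 1 / scale, with sensitivity 1
    # The difference of two i.i.d. exponential draws is Laplace-distributed.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


# Made-up rows: 40 "pro" customers and 60 "free" customers.
rows = [{"plan": "pro"}] * 40 + [{"plan": "free"}] * 60

noisy = private_count(rows, lambda r: r["plan"] == "pro", epsilon=1.0)
print(f"noisy count of pro customers: {noisy:.1f}")
```

A business unit still sees roughly how many customers are on each plan, but the answer to any single query does not reveal whether a specific individual is in the data.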

It may seem like a data breach simply couldn’t happen to you, but after working hundreds of breaches globally, I assure you that it can. If you continue to feed a development process that pressures developers to deliver rapidly without regard for security, it’s only a matter of time before you suffer the consequences. Identify a data security champion and start adding stringent access control policies in your organization to regain control.

At the end of the day, most attackers get in the door through social engineering, email and endpoint vulnerabilities, but they are ultimately targeting your data. How do you plan to protect it?
