IT leaders have always had to contend with data silos. Until recently, though, these were at least contained within their organisations. Today, a company’s data assets are likely to reside in a mixture of on-premise, public and private cloud systems. These hybrid clouds present new challenges for data integration, governance and security. A ‘data fabric’ architecture is an emerging model for managing data across heterogeneous environments, one that proponents argue can boost agility by taking the hard work out of data management.
“There’s a massive gap between the amount of data that is out there and the stuff businesses can make use of,” says Doug Cackett, CTO of HPE Ezmeral, a suite of data and infrastructure management systems that includes a data fabric platform. “Closing that gap is really important, and data itself is the fundamental piece of that, not the data science. If you can’t organise your data without spending a massive amount of your time and budget on it, then you’re going to struggle to move forward.”
What is a data fabric architecture?
The term ‘data fabric’ was reportedly coined by data management systems provider NetApp. In a 2016 white paper, the company explained the challenges of managing data in hybrid cloud environments. These include maintaining security, an inability to move data at will, and the complexity of managing data across environments, as each platform tends to have its own tools.
The idea of a data fabric architecture is to bring together data management functions, such as data governance, security management and integration, into a single platform that connects to an organisation’s various data sources. Dave Wells, practice director in data management at consultancy firm Eckerson Group, identifies five characteristics of data fabric architecture as follows:
- Unified data access: a single, cohesive way to access data from multiple sources
- Consolidated data protection: a consistent approach to data back-up, security and recovery wherever the data is generated and stored
- Centralised service level management: a single way of measuring and monitoring service levels related to the responsiveness, availability and reliability of data
- Cloud mobility and portability: supporting the idea of a true hybrid cloud by minimising the friction caused by collating and analysing data from different cloud providers and apps
- Infrastructure resilience: by separating data management from specific technologies and putting it in a single, dedicated environment, a data fabric creates a more resilient system where emerging technologies or new data sources can be connected with minimal disruption
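The first characteristic, unified data access, can be illustrated with a small sketch. It is not taken from any vendor's product: the `DataFabric` class, the `register` and `read` methods and the two sample sources are hypothetical names invented here, and real platforms add governance, caching and query push-down on top of this idea.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Reader = Callable[[], Iterable[Record]]


class DataFabric:
    """Toy registry (hypothetical) exposing many sources through one read() call."""

    def __init__(self) -> None:
        self._sources: Dict[str, Reader] = {}

    def register(self, name: str, reader: Reader) -> None:
        # New sources plug in without changing the consumer-facing interface.
        self._sources[name] = reader

    def read(self, name: str) -> List[Record]:
        if name not in self._sources:
            raise KeyError(f"unknown source: {name}")
        return list(self._sources[name]())

    def sources(self) -> List[str]:
        return sorted(self._sources)


# Stand-ins for an on-premise database and a cloud application (assumed data).
def on_prem_orders() -> List[Record]:
    return [{"order_id": 1, "total": 40.0}]


def cloud_customers() -> List[Record]:
    return [{"customer_id": 7, "name": "Acme"}]


fabric = DataFabric()
fabric.register("orders", on_prem_orders)
fabric.register("customers", cloud_customers)
```

A consumer calls `fabric.read("orders")` or `fabric.read("customers")` in the same way, regardless of where the underlying data lives.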
“It’s a multi-tiered architecture which talks about the data ingestion, the transformation, the governance, the security, the quality and any other layers in-between,” says Noel Yuhanna, VP and principal analyst for enterprise data architecture at Forrester. “It brings them into one agile platform that does the whole layer of data management, which is otherwise carried in a batch-oriented or a siloed manner.”
Having all these functions in one place, rather than in separate systems, can massively speed up the process of data collection and analysis, Yuhanna says, allowing teams to carry out DataOps on information, extracting insights that can make it more useful to the business.
Cackett says this simple, integrated set-up can ease the administrative burden on CIOs and business leaders and help them focus on outcomes. “Data fabric is all about not having to care,” he says. “Wherever I am in the life cycle of the data, be that how data is produced and consumed at the edge, or how it is stored and processed using AI and machine learning algorithms, as a business leader or CIO I don’t want to have to care about it.
“Having a data fabric gives you a single consistent data management view. You’ve defined how you want your data to be collected and behave ahead of time, and this allows you to take a step back.”
The business benefits of a data fabric architecture
The main organisational benefit of a data fabric architecture is agility, says Forrester’s Yuhanna, because it removes much of the heavy data management work that can slow down development. “We’re seeing [data fabric] as a big trend in sectors like financial services, retailers, healthcare, manufacturing, oil and gas because they want this kind of agile system.”
He describes the experience of a financial services company he has worked with. “They get data from partners coming in every week, which arrives in all kinds of formats,” Yuhanna explains. “The company wants to be analysing this data straight away and getting insights from it within 30 minutes, but if you want a developer to write an application to understand some of these data sources it can take three to five days.”
The data fabric architecture selected by the company enables it to process and analyse data much faster, Yuhanna explains, making use of artificial intelligence to automate parts of the process. “The fabric doesn’t need to know what data is coming in or what format, it’s able to contextualise it and give insights from that context,” he says.
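As a rough illustration of format-agnostic ingestion, the sketch below normalises a partner feed into a list of records whether it arrives as JSON or CSV. It uses a simple rule-based check rather than the AI-driven contextualisation Yuhanna describes, and the function name `parse_partner_feed` is a hypothetical one chosen for this example.

```python
import csv
import io
import json
from typing import Dict, List


def parse_partner_feed(payload: str) -> List[Dict[str, str]]:
    """Return rows as string-valued dicts, whatever format the feed used."""
    text = payload.strip()
    # Assumption: JSON feeds start with '[' or '{'; anything else is treated as CSV.
    if text.startswith("[") or text.startswith("{"):
        data = json.loads(text)
        rows = data if isinstance(data, list) else [data]
        return [{str(k): str(v) for k, v in row.items()} for row in rows]
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]
```

Both `parse_partner_feed('[{"id": 1}]')` and `parse_partner_feed("id\n1")` yield the same record shape, so downstream analysis does not need a bespoke application per source.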
By establishing a consistent set of data management services, data fabric architecture can enable automation, which in turn can allow business users to access the data they need without IT support.
“When I worked in IT we’d get calls from business users all the time wanting access to data, and it was slowing them down,” Yuhanna recalls. “This is a shortcut, with appropriate governance built-in, to providing that data to the consumption layer.”
Getting started with a data fabric architecture
Yuhanna argues that IT leaders building a data fabric architecture for their organisations should start small. “There’s no point trying to boil the ocean,” he says. “The idea of it is to get quick results, so if you start with a handful of data sources you will be getting insights from them in a couple of weeks, then you can start building the fabric out, adding more sources or data management methods. It’s a modular approach.”
He adds that it’s important not to overlook the security perspective. “The fabric exposes a lot of data and you want to control who can see what from day one,” he says. “It’s going to contain a lot of sensitive information, so the governance side and deciding what authorisations are required is really important.”
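One way to picture "who can see what" is a deny-by-default field filter applied before data reaches the consumer. The sketch below is a minimal example under stated assumptions: the roles, field names and the `read_as` helper are all hypothetical, and a production fabric would draw its policies from a central governance service rather than a hard-coded dictionary.

```python
from typing import Dict, List, Set

Record = Dict[str, object]

# Hypothetical policy: the fields each role is authorised to see.
POLICY: Dict[str, Set[str]] = {
    "analyst": {"order_id", "total"},
    "finance": {"order_id", "total", "card_last4"},
}


def read_as(role: str, records: List[Record]) -> List[Record]:
    """Return only the fields the role may see; unknown roles see nothing."""
    allowed = POLICY.get(role, set())
    return [{k: v for k, v in rec.items() if k in allowed} for rec in records]


orders = [{"order_id": 1, "total": 40.0, "card_last4": "4242"}]
analyst_view = read_as("analyst", orders)
```

Here `analyst_view` omits the card field, while a role absent from the policy receives empty records rather than the full data.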