Over recent years there has been rapid growth in the adoption of self-service Data Discovery within the enterprise: namely, the use of desktop-based Data Discovery tools to dive into data and provide rapid, on-demand visualisations and analysis. This approach has been likened to the use of Excel, both for its potential ubiquity and because business analysts have relative autonomy to analyse data simply, without needing to rely on the IT department.
The growth of self-service Data Discovery is hard to ignore, but can this approach meet the Business Intelligence (BI) needs of most organisations? Just like the spreadsheets before them, desktop-based Data Discovery tools come with significant limitations. These limitations mean we risk regressing to a world of insecure and ungoverned data silos, multiple versions of the truth and untrustworthy data analysis. And where does this leave us? Ultimately, with contradictory views of organisational performance and poor decision-making.
Understanding the limitations of Data Discovery
To appreciate the risk of relying on desktop-based Data Discovery to meet demand for self-service access to reporting and analytics in the enterprise, we need to understand the evolution of traditional BI solutions. These technologies started out on the desktop, just like many of today’s Data Discovery tools. Analysts would investigate data in their own silo and then present their findings back to the business. It wasn’t until enterprises applied significant pressure that traditional BI vendors developed server-based solutions, which incorporated significant governance capabilities. Today, however, it seems that many organisations have been quick to forget the logic that underpinned server-based BI platforms in the first place.
Today, with the popularity of desktop Data Discovery tools, analysts are once again working with data in their own world, with IT having no visibility or control over what’s happening – just like the days of Excel. We’ve all experienced one form or another of ‘spreadsheet hell’ over the years. One version of the spreadsheet says ‘X’, another says ‘Y’, and there is rarely an easy way to tell which is right. In part this problem stems from a lack of governance and data ‘lineage’ – the ability to understand and control how the data has changed over time, and what calculations or amendments have been made, so that everyone can have a uniform and trustworthy view of the data. Data Discovery tools suffer from a similar shortage of metadata, and the onus is on individuals, working outside a single platform that IT can govern, to manipulate and pull together data from different sources. History has already proved that humans make errors, that those errors creep into analysis and that decisions are then made based on that analysis. The case for governance should be clear.