August 19, 2015

Distancing Hadoop from Big Data & mastering data integration

C-level briefing: Matt Baker, executive director, enterprise strategy, Dell, talks to CBR about why data integration is the biggest challenge facing Big Data and how business is failing to find a use for Hadoop.

By James Nunns

Big Data may yet live up to its hype, but the market must first overcome the challenge of data integration, which Baker says "is probably the single largest problem that our customers and frankly the market faces."

This is a result of the growing amount of important business data that ends up spread across various locations. Dell is working on the problem with Boomi, which Baker describes as an event-based data integration platform.

Baker sees the problem as this: most customers already have a data integration capability, but it usually sits on a very traditional, batch-oriented ETL (Extract, Transform and Load) platform.

Baker said: "If they’re starting there, they’re likely to end in a place that’s not so great.

"Depending on your degree of sophistication, you’re going to want that to be real time more than a batch oriented data integration platform would be."

He points to a growing realisation that there is a big gap in Salesforce.com’s analytics capabilities: many of Dell’s customers use traditional on-premise ERP tools and augment them with off-premise or cloud-based CRM tools such as Salesforce.

"They were having an incredibly difficult time doing just basic reporting and analytics between off site data and the on premise ERP data."


The Big Data market is growing, but Baker points to a need to change the discussion around it, with too much focus being placed on Hadoop.

"We really over indexed on a discussion around unstructured analytics, and Big Data in many people’s minds become synonymous with Hadoop only.

"Hadoop continues to dominate the discussion around Big Data and I’m not sure that that’s really helpful to the world at large. I’d love to see us get back to a broader discussion on data analytics and what problems we’re trying to solve."

While Big Data may have become synonymous with Hadoop, Baker doesn’t see adoption as being broad. He believes the space is in a "state of flux" as businesses struggle to find use cases for it beyond the obvious ones in social media marketing and marketing in general.

One of the issues he identifies is that businesses initially approached it as a technology that was broadly applicable, rather than starting with a question or a query and then deciding what technology was best to answer that question.

Hadoop, it appears, has been applied to questions for which it is not the appropriate tool.

What Baker is seeing is: "People are investing significantly into old school or modernised or more traditional structured database environments to answer and accelerate the plethora of additional questions they have.

"These are better suited for those traditional ways of looking at data and they’re still trying to find the use cases that are best suited for Hadoop."

One area where he does see Hadoop being used is with HDFS as a platform and MapReduce acting almost like an ETL technique for offloading data warehouses. "So it becomes a tier within a traditional warehouse."

The inability to find appropriate use cases for Hadoop appears to align with the common claim regarding the Big Data skills gap.

Although Baker agrees there is an education gap, he also questions what type of business processes and models actually require the high level of analytics that a data scientist offers. Traditional retail businesses, for example, are struggling with more basic analytics questions, he says.

Working out what colour will be popular next year or how many of a dress size to have in stock are the issues businesses struggle with, and they take priority over devising a new business model or augmenting the existing business with a new capability.

However, he warns that if you are mining a broad set of data in order to sell advertising space on a website: "You better know your way around unstructured data analytics because otherwise you’ll never be successful."
