February 26, 2016, updated 31 Aug 2016 12:30pm

5 Hadoop problems and how to fix them

List: Adoption isn't always easy, but CBR can help you overcome the barriers.

By James Nunns

Hadoop may have become almost synonymous with the world of big data, but it is neither the only technology nor the easiest to adopt.

While its ability to store and process vast amounts of data has made it a popular technology, particularly for its potential, there are challenges that must be overcome before adopting technologies in the Hadoop ecosystem.

CBR identifies the problems that you are likely to face and how to overcome them.

 

1. Hadoop confusion

Hadoop has picked up a little bit of a bad reputation for being extremely complex. While there are numerous companies such as Hortonworks, Cloudera and others working on making their own distributions easy to use, there remains complexity.

Selecting the right distribution can be a real challenge, especially as each of them embeds different Hadoop components, for example Cloudera’s Impala in CDH, and configuration managers like Ambari.

Solving the challenge requires knowledge of the Hadoop ecosystem, the vendors and their different offerings. This can seem like an impossible task, but by reading articles that compare the different distributions, perhaps speaking to consultants, and running a proof of concept, the business can find the distribution that fits its needs.

 

2. Finding the use case

Before the business has even gone through the confusion of the Hadoop ecosystem, it should ask serious questions about why it would use the technology in the first place.

Should the business be devoting time and resources to Hadoop technologies if the problems it is trying to solve can be solved some other way entirely? Probably not.

This problem has been highlighted by the analyst firm Gartner, which in a 2015 study found that almost half of the 284 Gartner Research Circle members were finding it difficult to adopt the technology because of uncertainty around how it would provide business value.

To solve the problem, businesses need to understand how much data they have, focus on the business problems they face and consider whether Hadoop is the right technology. Having a strategy in place is vital, and if the business is unsure about use cases then most vendors now have many examples to showcase on their sites.

 

3. Skills gap

The skills gap isn’t unique to Hadoop; it is a problem across the technology sector, but it has been magnified in the world of Hadoop.

The reason, as outlined above, is that it is a complex technology. In the same research where Gartner highlighted the problem of finding a use case, the skills gap came out as the biggest hurdle to adoption.

Learning the skills for big data technology has been a challenge that the area needs to overcome. Developers cannot simply download the technology and start working on it productively; although Hadoop can be run on a single machine for experimentation, a realistic cluster needs several servers.

Overcoming this challenge isn’t easy and there isn’t really a quick fix. As mentioned earlier, there are vendors working on solving this problem but it takes time.

The vendors are busy running training programmes but there are things that businesses can do to help.

Firstly, it is possible to train staff internally so that they get used to the technology they will be using. Secondly, find the right software. This is where knowing the use case and the different vendor distributions can help.

 

4. Integration and management

This is an area that should have been covered when working out the business strategy for the technology. In short, the business should already have decided who will maintain it and whether it will replace existing systems such as the database or existing analytics tools.

Hadoop is typically used in conjunction with other existing technologies in the business but it is necessary to figure out what it will work alongside and what it won’t. Addressing this problem early on will help to save a lot of pain and suffering further down the road.

Like the other problems, this is one vendors have been working on. Most now offer tools and instructions for integrating Hadoop, as do vendors in adjacent areas such as the database space. Database vendors have grown accustomed to working with Hadoop, so some offer native integration, which makes the job even easier.

 

5. Data access

Finally, the work is almost done and the barriers to Hadoop have almost all been overcome but one more remains, which is transforming data into meaningful management information.

Hadoop is good at storing and processing data. At its most basic it is a batch-processing tool, but it doesn’t necessarily offer a lot to the end user in terms of analytics.
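
To make the batch-processing model concrete, here is a minimal sketch in plain Python of the map, shuffle and reduce phases that Hadoop's MapReduce engine is built around. It is an illustration only: it runs on a single machine, uses no Hadoop API, and the function names and sample lines are hypothetical.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three phases in order over the whole input batch."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input standing in for files stored in a cluster.
    sample = ["big data is big", "hadoop stores big data"]
    print(word_count(sample))
```

The whole input is read before any result appears, which is why this style of processing suits large offline jobs better than interactive queries.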

Bringing in data from multiple data sources is relatively easy, but Hadoop is not designed to be particularly interactive, meaning that there is both a skills gap issue and an issue of delivering value to the business.

Big moves have been made in this area, with more and more vendors providing support for the technology in their own offerings. IBM, for example, plans to develop most of its analytics tools around Apache Spark, while SAP has also added the technology to its S/4HANA platform.

To some extent this is a problem that is out of the hands of the business; it is up to the vendors to make the technology accessible to every level of the business and to prove its value.

Most vendors have now got to the point where line-of-business users can use the technology as easily as they use traditional BI solutions.
