This week in big data: Google, Hortonworks, HPE, Apache Spark and big data boost to the UK economy

It has been a busy week in the world of big data, with Gartner holding its Business Intelligence and Analytics Summit in London and Hortonworks teaming up with HPE for Apache Spark upgrades.

To help everyone to stay up to date, CBR has compiled a round-up of some of the major happenings in big data this week.

1. Hortonworks, HPE & Spark

It may have been a busy week in the big data world but a lot of that can be attributed to Hortonworks.

Three key announcements came from the Hadoop vendor, firstly around Apache Spark and a partnership with Hewlett Packard Enterprise.

Advancements to Spark have come about after HPE Labs managed to optimise its "Shuffle Engine", which basically helps to sort data, by rewriting it in C++. This is said to improve performance by up to 15% for certain workloads.

HPE is planning to open source this code and it will work with Hortonworks to do this. Spark version 1.6 will also now be included in the Hortonworks Data Platform (HDP).

Hortonworks also made improvements to DataFlow: the company’s streaming data package will now include Apache Storm and Apache Kafka. This means that customers will no longer need subscriptions to both HDP and DataFlow when using Apache NiFi with Kafka or Storm.

Finally, Hortonworks is reducing the core components’ release cadence to once annually, aligning it with the Open Data Platform initiative.

The company is also releasing a new version of Apache Ambari with SmartSense. This is designed to make Hadoop more manageable and has been described as a single pane of glass.

2. Gartner BI & Analytics Summit

At the start of the week Gartner held its Business Intelligence and Analytics Summit in London. The event featured speakers such as Frank Buytendijk, VP and Gartner Fellow, and Chris Howard, VP Distinguished Analyst at Gartner, alongside exhibitors such as Accenture, Cloudera, Salesforce, Oracle, SAS, Microsoft and many more.

As could perhaps be expected, the event focused on the growth of the BI and analytics market, the problems being faced, how to overcome challenges and numerous other topics.

Among those at the summit were the makers of World of Warcraft, Blizzard Entertainment. The popular game maker was represented by Jon Gleicher, its business and gameplay insights manager.

Gleicher spoke about how the company is using gameplay data to improve both gameplay and the business.
For example, the company uses Apache Kafka to gather gameplay data, stores it in Hadoop with Cloudera, analyses it in SQL or Python and visualises it in Tableau.

In addition to the use cases, there was advice on topics such as the steps to modernising BI and analytics.

3. Google fights the Zika virus

The spread of the Zika virus has seen concerns about it rise dramatically. In fact, Google found that search interest in the virus has grown by 3,000% since November.

To help combat the spread, Google has provided a grant of $1 million to UNICEF workers on the ground. The search engine provider will also provide engineers to work with UNICEF in order to analyse their data and to map and anticipate the spread of the virus.

A platform will be built to process data from sources such as weather and travel platforms, with the data being visualised to identify potential outbreak areas.

The platform will be open source and it aims to identify the risk of transmission for different regions. Although this is being deployed to tackle the Zika virus, it will also be used for future emergencies.

The work by Google doesn’t end there. To improve access to information about the virus, the company has updated its products so that information is now available globally in 16 languages, and it has added Public Health Alerts.

4. Big data and IoT to add £322bn to the UK economy by 2020

The big numbers arrived in force with research from the Centre for Economics and Business Research (Cebr) on behalf of SAS.

Based on 409 interviews with senior UK decision makers and on government data, the research found that big data analytics is expected to contribute £241bn to the UK economy between 2015 and 2020, with IoT expected to add £81bn.

Graham Brough, Cebr CEO, said: "Collecting and storing data is only the beginning. It is the application of analytics that allows the UK to harness the benefits of big data and the IoT."

The big numbers don’t stop there – 182,000 new jobs are expected to be created, while big data is expected to result in efficiency savings of £220.4bn and innovation benefits of £12.4bn.

By 2020 it is expected that 67% of companies will have rolled out big data analytics, up from 56% currently.

Mark Wilkinson, SAS regional VP, Northern Europe and Russia/CIS, said: "The combined benefits of IoT and big data will fuel our economy like nothing else."
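
For readers who want to check how the headline figure is put together, the sums can be sketched in a few lines of Python. This is only an illustration using the numbers quoted in this round-up: the variable names are our own labels, not terms from the Cebr research, and the round-up does not say what makes up the part of the big data total that is not itemised.

```python
from decimal import Decimal

# All monetary figures are in £bn, as quoted in this round-up.
# Variable names are illustrative labels, not terms from the research.
big_data_total = Decimal("241")
iot_total = Decimal("81")
efficiency_savings = Decimal("220.4")
innovation_benefits = Decimal("12.4")

# Headline figure: big data plus IoT contributions.
combined_total = big_data_total + iot_total

# The two big data benefits itemised in the text, and what is left over.
itemised_big_data = efficiency_savings + innovation_benefits
not_itemised = big_data_total - itemised_big_data

# Expected change in the share of companies using big data analytics.
adoption_now, adoption_2020 = 56, 67
adoption_rise_points = adoption_2020 - adoption_now

print(f"Combined contribution: £{combined_total}bn")
print(f"Itemised big data benefits: £{itemised_big_data}bn")
print(f"Not itemised here: £{not_itemised}bn")
print(f"Adoption rise: {adoption_rise_points} percentage points")
```

The £241bn and £81bn contributions add up to the £322bn in the headline, while the two itemised big data benefits account for £232.8bn of the £241bn.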

5. Syncsort

More Hadoop and Spark work is being done, this time by Syncsort. The big data and mainframe software company has made it possible for organisations to work with mainframe data in Hadoop or Spark in its native format, something the company says is essential for maintaining data lineage and compliance.

The new iteration of the DMX-h migration tool brings in a bulk loading feature called Data Funnel that promises to speed up the process of extracting records from DB2, the relational store from IBM.

Previously, customers of Syncsort were able to move information to Hadoop or Spark one database table at a time; now hundreds can be shifted in a single operation. Essentially this will speed up the whole process and make administrators' jobs a bit easier.

The data loading option now means that DMX-h no longer requires information from DB2 to be converted into a format that Hadoop and Spark can natively ingest before performing a transfer. This is again a big time-saving development.
