EMC, a provider of information infrastructure offerings, has entered into an alliance with Cloudera, a provider of Hadoop-based data management software and services, to integrate Cloudera’s technology with the Greenplum technology of EMC’s Data Computing Products division.

The collaboration will allow businesses to manage and analyse large and continuously growing amounts of structured and unstructured information, including log files, sensor data, streaming data, sales receipts, e-mails, research data and images.

The company said that integrating Cloudera’s Distribution for Hadoop (CDH), which collects, consolidates and analyses data, with EMC’s Greenplum parallel processing database and enterprise data cloud platform will provide an architecture for collaborative analysis of large amounts of structured and unstructured data.

Cloudera’s data management platform is built on the Apache Hadoop open-source software package that consolidates data into a single, reliable repository for comprehensive analysis at lower cost, while enabling sophisticated, detailed processing and analysis of the data.

EMC said that data staged by Cloudera’s Distribution for Hadoop will be integrated with its Greenplum Chorus platform, which uses cloud computing techniques and social collaboration for enterprise data warehousing and analytics.

The integration will therefore enable users to seamlessly discover, access and analyse data from both Greenplum databases and Hadoop infrastructure.