March 21, 2017

What is Spark?

This popular open source tool has become the backbone of many projects.

By James Nunns

Apache Spark is an open source parallel processing framework which is designed to run data analytics across clustered computers.

Spark, a general-purpose engine for large-scale data processing, is maintained by the Apache Software Foundation.

Spark is designed to provide programmers with an application programming interface (API) centred on the resilient distributed dataset (RDD), a multiset of data items distributed over a cluster of machines.
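
To make the idea of a resilient distributed dataset more concrete, here is a minimal single-machine sketch in plain Python. It is not Spark's API: the `ToyRDD` class and its methods are hypothetical, written only to show how a multiset of items can be split into partitions and then transformed and reduced one partition at a time. In PySpark, the roughly corresponding calls are `SparkContext.parallelize`, `RDD.filter`, `RDD.map`, `RDD.reduce` and `RDD.collect`, which run the same kind of steps across a real cluster.

```python
from functools import reduce


class ToyRDD:
    """Hypothetical stand-in for an RDD: an immutable multiset split into partitions."""

    def __init__(self, partitions):
        # Copy each partition so the dataset cannot be changed from outside.
        self._partitions = [list(part) for part in partitions]

    @classmethod
    def parallelize(cls, items, num_partitions=2):
        # Deal the items round-robin across partitions. In a real cluster
        # each partition would live on a different machine.
        items = list(items)
        return cls([items[i::num_partitions] for i in range(num_partitions)])

    def map(self, fn):
        # Transformations return a new dataset instead of mutating this one.
        return ToyRDD([[fn(x) for x in part] for part in self._partitions])

    def filter(self, pred):
        return ToyRDD([[x for x in part if pred(x)] for part in self._partitions])

    def reduce(self, fn):
        # Reduce every non-empty partition on its own, then combine the
        # partial results. fn should be associative and commutative.
        # An empty dataset is not handled in this sketch.
        partials = [reduce(fn, part) for part in self._partitions if part]
        return reduce(fn, partials)

    def collect(self):
        # Gather all items into one list. Order follows the partition
        # layout, not the original input order.
        return [x for part in self._partitions for x in part]


numbers = ToyRDD.parallelize(range(1, 11), num_partitions=3)
squares_of_evens = numbers.filter(lambda n: n % 2 == 0).map(lambda n: n * n)
total = squares_of_evens.reduce(lambda a, b: a + b)
print(total)
```

The sketch leaves out the "resilient" part entirely: a real RDD also records how each partition was derived so that a lost partition can be recomputed on another machine.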

Spark can run on Hadoop, on Mesos, in standalone mode, or in the cloud, and it can access numerous data sources such as HDFS, Cassandra, HBase, and S3.


Why has Spark become so popular?
