Microsoft adds a Spark to Machine Learning Library for data scientists

Microsoft releases a new Machine Learning library for Apache Spark, with advanced capabilities for data scientists.

By Hannah Williams

Microsoft has unveiled a new offering for data scientists with the release of a Machine Learning library for Apache Spark.

The aim is to speed up experimentation and help data scientists apply advanced machine learning and deep learning techniques to large datasets.

Apache Spark’s Machine Learning Library (MLlib) is built to make machine learning scalable and easy to use, providing common algorithms for classification, regression, clustering and collaborative filtering.

According to Microsoft, customers already using SparkML have found that it helps in building scalable machine learning models, but have still struggled with its low-level APIs.

To change this, Microsoft’s library builds on the primary Machine Learning API for Spark, the DataFrame-based API in the spark.ml package.

As a result, Machine Learning for Apache Spark simplifies the common tasks involved in building models, and the library offers more consistent APIs for handling different types of data, such as text or categorical values.
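
To illustrate what a consistent API across data types can look like, here is a minimal plain-Python sketch. It is not the spark.ml or Microsoft library API: the CategoryIndexer, TextTokenizer and run_pipeline names are hypothetical stand-ins invented for this example. It only mirrors the fit/transform convention that spark.ml stages such as StringIndexer and Tokenizer follow, applied here to a categorical column and a text column.

```python
# Plain-Python illustration only. These classes are hypothetical stand-ins
# and are NOT part of spark.ml or Microsoft's library.


class CategoryIndexer:
    """Maps each distinct category string in a column to an integer index."""

    def __init__(self, input_col, output_col):
        self.input_col = input_col
        self.output_col = output_col
        self.index = {}

    def fit(self, rows):
        # Assign indices in sorted order so the mapping is deterministic.
        values = sorted({row[self.input_col] for row in rows})
        self.index = {value: i for i, value in enumerate(values)}
        return self

    def transform(self, rows):
        # Return new rows with the index column added; inputs are not mutated.
        return [
            {**row, self.output_col: self.index[row[self.input_col]]}
            for row in rows
        ]


class TextTokenizer:
    """Splits a free-text column into lowercase word tokens."""

    def __init__(self, input_col, output_col):
        self.input_col = input_col
        self.output_col = output_col

    def fit(self, rows):
        # Nothing to learn from the data; kept so every stage looks the same.
        return self

    def transform(self, rows):
        return [
            {**row, self.output_col: row[self.input_col].lower().split()}
            for row in rows
        ]


def run_pipeline(stages, rows):
    """Fits and applies each stage in order, whatever data type it handles."""
    for stage in stages:
        rows = stage.fit(rows).transform(rows)
    return rows


rows = [
    {"plan": "premium", "review": "Fast and reliable"},
    {"plan": "basic", "review": "Too slow"},
]
stages = [
    CategoryIndexer("plan", "plan_index"),
    TextTokenizer("review", "review_tokens"),
]
result = run_pipeline(stages, rows)
```

Because both stages expose the same two methods, the pipeline loop does not need to know whether a stage handles categories or free text, which is the kind of simplification the article describes.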

Ahead of this, Microsoft also released a new Spark connector for Azure Cosmos DB. It is designed to enable real-time data science, machine learning, advanced analytics and exploration over globally distributed data in Azure Cosmos DB.

Azure Cosmos DB is Microsoft’s multi-model database service for mission-critical applications. By connecting Apache Spark to the database, the connector gives customers the opportunity to solve fast-moving data science problems.

In a blog post, Denny Lee, PPM, Azure Cosmos DB, said: “With the updated Spark connector for Azure Cosmos DB data models: Documents, Tables and Graphs.”

Apache Spark combined with Azure Cosmos DB is intended to power machine learning, data science, artificial intelligence and advanced analytics workloads.

Microsoft has also made the Machine Learning library for Apache Spark available on GitHub as an open source project, so customers can access it more easily.
