July 21, 2020

Apache Cassandra 4.0 Lands: Five Times Faster, Audit Logging and More

"Running the biggest baddest workloads on the Internet"

By CBR Staff Writer

Apache Cassandra, the distributed NoSQL database, ranks highly in the “most dreaded” database category of Stack Overflow’s annual developer survey.

That’s despite the open source database’s undeniable utility and resilience, as well as widespread adoption by companies including Apple and Netflix.

(Unlike many databases with a primary/secondary architecture, under which secondaries can only perform read operations, every Cassandra node can perform both reads and writes. That makes it easier to scale and replicate workloads across geographies or hybrid environments by adding clusters.)

Now an Apache Cassandra 4.0 beta has landed (the last full release was in 2015) with over 1,000 bug fixes that may just drive it into the sunlit uplands of “most loved”, or at least stop it keeping company with IBM DB2 and Couchbase. More importantly, its streaming is up to five times faster, says Netflix, and it comes with a host of welcome new features.

The “most dreaded” databases. Credit: Stack Overflow developer survey, 2020.

The Cassandra community describes it as “battle-tested” and says there will be no breaking changes before it goes GA.

(Cassandra 4.0 has seen software, hardware, and QA testing donations from the likes of Amazon, Datastax, Instaclustr and iland.)

Patrick McFadin, who heads up developer relations at Datastax, a Cassandra specialist and lead contributor to the open source database, told Computer Business Review: “The past few years weren’t spent waiting and watching. This is the product of running the biggest baddest workloads on the Internet. The primary goal is to make Cassandra allergic to data loss under any circumstance.

“The Cassandra 4.0 release will be the most stable database ever. Many large companies will most likely be running 4.0 in production before it goes GA. Why? Because they want to believe in it before they put their name on it.”

He added: “This is what a real OSS database looks like.”

Cassandra 4.0: What’s New?

“Globally distributed systems have unique consistency caveats and Cassandra keeps the data replicas in sync through a process called repair. Many of the fundamentals of the algorithm for incremental repair were rewritten to harden and optimize incremental repair for a faster and less resource intensive operation to maintain consistency across data replicas,” Datastax notes.

The beta release includes “Zero Copy” streaming functionality, which the DB’s contributors say makes streaming up to 5x faster without vnodes compared to previous versions. That means a more elastic architecture, particularly in cloud and Kubernetes environments.

As one Netflix contributor puts it on the Cassandra blog: “[When it comes to] Mean Time to Recovery (MTTR) — a KPI that is used to measure how quickly a system recovers from a failure — Zero Copy Streaming has a very direct impact here with a five fold improvement on performance.

“Zero Copy Streaming is [also] ~5x faster. This translates directly into cost for some organizations primarily as a result of reducing the need to maintain spare server or cloud capacity.

“In other situations where you’re migrating data to larger instance types or moving AZs or DCs, this means that instances that are sending data can be turned off sooner saving costs. An added cost benefit is that now you don’t have to over provision the instance. You get a similar streaming performance whether you use a i3.xl or an i3.8xl provided the bandwidth is available to the instance.”
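To put that five-fold claim in context, here is a rough back-of-the-envelope sketch of what it could mean for the time taken to rebuild a node. The data volume and baseline throughput below are hypothetical figures chosen for illustration, not numbers from Netflix or the Cassandra project; only the 5x factor comes from the claims quoted above.

```python
def stream_hours(data_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to stream data_gb gigabytes at throughput_mb_s megabytes per second."""
    seconds = (data_gb * 1000) / throughput_mb_s  # 1 GB treated as 1,000 MB
    return seconds / 3600


# Hypothetical figures, for illustration only.
DATA_PER_NODE_GB = 1000.0   # assumed data held by the node being replaced
BASELINE_MB_S = 50.0        # assumed pre-4.0 streaming throughput
SPEEDUP = 5.0               # the "~5x" figure quoted in the article

before = stream_hours(DATA_PER_NODE_GB, BASELINE_MB_S)
after = stream_hours(DATA_PER_NODE_GB, BASELINE_MB_S * SPEEDUP)

print(f"before: {before:.2f} h, after: {after:.2f} h, saved: {before - after:.2f} h")
```

Under those assumptions, the streaming portion of a recovery drops from roughly five and a half hours to just over one. Real-world gains will depend on network bandwidth, disk speed and cluster configuration.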

Other improvements include a new audit logging feature and a new fqltool that allows the capture and replay of production workloads for analysis. The release has also been put through replay, fuzz, property-based, fault-injection and performance tests on clusters as large as 1,000 nodes, and hundreds of real-world use cases and schemas have been tested.

The curious can visit the Apache Cassandra downloads site or pull the Docker image.

 

 
