Amazon’s AWS has launched a managed version of the open source data streaming tool Apache Kafka.
AWS described Apache Kafka clusters as “challenging to setup, scale, and manage in production”.
Apache Kafka, created and open sourced by LinkedIn in 2011, has evolved from a messaging queue into a full-fledged streaming platform, with tools such as Confluent’s KSQL streaming SQL engine built on top of it.
It is a hugely popular choice for anyone who wants to read, write and process streaming data in real time, at scale, using a SQL-like syntax, and has been adopted by swathes of blue chips, from Apple to Netflix.
(A basic Kafka instance under the AWS managed Kafka service starts at $0.21 per hour. A month’s use would cost approximately $468, AWS said, based on 31 days x 24 hrs/day x 3 brokers = 2,232 hours x $0.21.)
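For readers who want to check the sum, the calculation above can be reproduced in a few lines. This is a minimal sketch using only the figures AWS quoted; the variable names are illustrative, and it assumes the $0.21 rate applies per broker-hour, as the quoted formula implies.

```python
# Figures come from the AWS example quoted above; names are illustrative.
HOURLY_RATE_USD = 0.21  # assumed to be the price per broker-hour
BROKERS = 3
DAYS = 31
HOURS_PER_DAY = 24

# Total billable broker-hours in a 31-day month.
broker_hours = DAYS * HOURS_PER_DAY * BROKERS

# Monthly cost for the broker-hours alone.
monthly_cost = broker_hours * HOURLY_RATE_USD

print(f"{broker_hours} broker-hours at ${HOURLY_RATE_USD}/hour = ${monthly_cost:.2f}")
```

The result covers only the broker-hours in AWS's example; changing `BROKERS` or `DAYS` shows how quickly the bill scales with cluster size.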
AWS Managed Kafka: Cloud Eats Open Source?
The move is the latest example of open source software (OSS) being packaged up as a paid-for cloud service: something that has sparked a renewed debate about the terms of open source licensing.
Some observers at the re:Invent conference in Las Vegas, where the service was announced, had expected AWS to also announce a managed Apache Cassandra service (the open source NoSQL database management system).
The first rule about #cassandra at @aws is they don’t talk about cassandra. Why would they even mention a database that allows you to run in multiple clouds?
— Patrick McFadin (@PatrickMcFadin) November 29, 2018
Many OSS-powered companies like Redis Labs and MongoDB have moved to change their licence terms to avoid “strip mining” of community code by cloud services providers who have contributed little to it.
Redis Labs CMO Manish Gupta told Computer Business Review bluntly: “AWS are simply poaching open source investment.”
He added: “They want to leverage and monetise the free world and cloud has become the new real estate. There is no one answer to this problem…. We are trying one approach to protect and defend ourselves.”
OSS veteran Patrick McFadin, VP of Developer Relations at DataStax, told Computer Business Review: “AWS will stand up and say ‘we love you guys, we’re doing this as a service because we know doing it hurts’. But who maintains that software? The community…. Who fixes it? The magic software gnomes? Open Source infrastructure in particular is at crisis point. And while VCs have pumped loads into open source companies, they are going to ask more and more what the proprietary bits are.”
He added: “Look at Elasticsearch though – when Amazon decided to do it as a service, people thought Elasticsearch is done, but they are doing great after going public. Because they’re offering a better service than Amazon is. Now it’s up to Confluent (the leading Apache Kafka provider) to do so [too].”
Elastic: Proprietary IP Has Its Place
Speaking to Computer Business Review at AWS’s re:Invent conference in Las Vegas, Elastic’s VP Worldwide Marketing Jeff Yoshimura said: “We’re building an open source company: but it’s critical to build proprietary IP around features.
“Amazon decided to take Elasticsearch and build it into their own product… They can’t put our proprietary features into their services though, unless they had a partnership with us”.
(Several core Elasticsearch features are built on proprietary IP; Elastic still offers a wide range of these to users for free, alongside a paid-for, tiered enterprise offering with further features, all built on the same code base.)
He added: “It’s nice [though] that Google Cloud said ‘we’re not going to build a rival offering to Elasticsearch and we want our customers to benefit from those proprietary offerings’, so they have partnered with us instead. We’ve also partnered with Alibaba Cloud.”
He continued: “We do not give away support unless someone is a paying customer… If you rely on a separate support licence for retention of customers, the user suffers as your product isn’t going to improve. We focus on simplicity of use: some companies don’t want to make a feature too simple or they won’t be called for support.”
AWS: Hey, We Love Open Source
AWS appears to have had one contributor to Apache Kafka’s open source code. He made one commit last year and filed a few bugs a few months ago; the rest of his activity was mostly forking the repos, i.e. not affecting the original product.
But the company told Computer Business Review it is an enthusiastic contributor to the OSS world.
An Amazon spokesperson told Computer Business Review: “AWS has over 1400 projects on GitHub. We contribute to projects such as Apache MXNet, FreeRTOS and Kubernetes, amongst others.”
The company meanwhile emphasises ease of use of the managed Kafka service.
“Amazon MSK [managed streaming for Kafka] lets you focus on creating your streaming applications without having to worry about the operational overhead of managing your Apache Kafka environment.”
“Amazon MSK manages the provisioning, configuration, and maintenance of Apache Kafka clusters and Apache Zookeeper nodes for you. Amazon MSK also shows key Apache Kafka performance metrics in the AWS console”, it said.
Users can vote with their wallets: there are other managed Kafka services available, including from the team at Confluent who helped build it. And with open source being just that, dominant providers are going to see an opportunity to package up such services as cloud takes off. There is nothing stopping them (bar licence changes…).
But as several interviewees told Computer Business Review, one area in which AWS needs to be careful is developer goodwill.
As Redis Labs’ Manish Gupta put it: “The decision cycle in the enterprise has turned upside down. You need the developers to be with you. Alienate a lot of developers and you have a problem.”
With AWS generating $27 billion a year, it may not be so concerned.