Note: This is on Tuesday, not Thursday
The Evening's Topic: Stream All the Things!!
Big data started with an emphasis on batch-oriented architectures, where data is captured in large, scalable stores, then processed using batch jobs. To reduce the gap between data arrival and information extraction, these architectures are now evolving to be stream oriented, where data is processed as it arrives. Fast data is the new buzzword.
Microservices are inherently message driven, a core tenet of “Reactive Systems” (http://www.reactivemanifesto.org/): they respond to requests for service and, in turn, send messages to other microservices. Hence, they are also stream oriented, in a sense.
The word “stream” is used in both spheres, partly because it’s trendy, but the resemblance is not superficial. Both are concerned with never-ending sequences of data, and they share many of the same challenges and design patterns. Hence, the movement to stream-oriented architectures is driving a convergence of data-centric and microservice architectures.
I’ll begin the talk by quantifying what streaming means in the context of four axes of concern that cross the fast data and microservice divide:
• Low latency: How low?
• High volume: How high?
• Integration with other tools: Which ones and how?
• Data processing: What kinds? In bulk? As individual events?
Next, I’ll consider specific examples of streaming tools and how they fit on these axes, including heavy hitters in the data world, such as Spark and Kafka, as well as microservice toolkits such as Akka.
Finally, I’ll speculate on the future of these trends and on how I believe fast data and microservice architectures are going to converge, driven by the ever-growing importance of data and the scalability of fast data streaming.
Dean Wampler, Ph.D., is the architect for fast data products at Lightbend (http://lightbend.com/). He specializes in scalable, distributed fast data and streaming systems using tools like Spark, Mesos, Akka, Cassandra, and Kafka (the SMACK stack). Dean is the author of Programming Scala (http://programming-scala.org/) and Functional Programming for Java Developers (http://oreilly.com/catalog/9781449311032/), and the coauthor of Programming Hive (http://shop.oreilly.com/product/0636920023555.do), all from O'Reilly Media. He is a contributor to several open source projects and the co-organizer of several conferences around the world and several user groups in Chicago. Dean can be found on Twitter as @deanwampler (http://twitter.com/deanwampler).