• Kafka Streams Applications

    HaArba'a St 32

    18:00 - 18:30: Networking, mingling & refreshments.
    18:30 - 19:00: Real-time fraud detection with Kafka Streams. Ofir Sharony @ MyHeritage.
    19:00 - 19:30: Querying Kafka with Presto. Itamar Syn-Hershko @ BigData Boutique.

    *** All talks are delivered in English and live-streamed via YouTube ***

    First session description: In this talk, we'll build a gatekeeper for your website. Our fraud detection system will target various types of malicious activity, such as account takeover, parameter tampering, forbidden access and more. We'll try to identify potential attacks and react to them in near real time. Addressing this problem in a classical batch fashion results in a solution that is complex, hard to scale, and far from real time. We'll turn that clumsy implementation into a modern stream-processing windowed aggregation, use Kafka Streams as our streaming framework, and end up with beautiful, clean and maintainable code.

    Bio: Ofir is a backend team lead at MyHeritage, with a passion for event-driven design and stream processing frameworks. Ofir has acquired most of his experience planning scalable server-side solutions and developing data pipelines. He has spoken about these ideas at local and global conferences and has written about them here: https://medium.com/@ofirsharonys

    Second session description: Presto is a state-of-the-art distributed SQL query engine for big data, enabling efficient querying of cold data and a variety of data sources. With an extended SQL language and features like geospatial queries, joins between different data sources (SQL to join data from HDFS, Elasticsearch, and Kafka, anyone?), and the ability to run on containers and cheap servers, Presto is slowly becoming the standard ad-hoc query engine for big data. In this talk, we will present Presto and how it can be used with Kafka. We will discuss data architectures, Presto's features and why it is so good for your data, and finally see how it can be leveraged to query data from Kafka, as well as to execute a single SQL statement that joins data from Kafka with data from SQL, Cassandra, Elasticsearch and more.

    Bio: I'm a search technologies, big data, and distributed systems expert. Over the years I have built and maintained several big mission-critical systems on both Windows and Linux, and gained a lot of experience I now use to perfect systems built to deal with scale. Today I'm a frequent speaker at international conferences and provide on-site training and consultancy services around the world via BigData Boutique.
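    The windowed-aggregation idea from the first session can be sketched without any framework. Below is a minimal Python illustration that counts events per user in tumbling time windows and flags whoever crosses a threshold; the window size, threshold, and all names are hypothetical, not taken from the talk:

```python
from collections import defaultdict

WINDOW_MS = 60_000       # 1-minute tumbling windows (hypothetical)
THRESHOLD = 3            # flag a user after this many events in one window

def window_start(timestamp_ms):
    """Align an event timestamp to the start of its tumbling window."""
    return timestamp_ms - (timestamp_ms % WINDOW_MS)

def detect_suspicious(events):
    """events: iterable of (user_id, timestamp_ms) pairs.
    Returns the set of (user_id, window_start) buckets that crossed THRESHOLD."""
    counts = defaultdict(int)
    flagged = set()
    for user, ts in events:
        key = (user, window_start(ts))
        counts[key] += 1
        if counts[key] >= THRESHOLD:
            flagged.add(key)
    return flagged

# Three rapid events from "mallory" inside one window get flagged;
# "alice" stays under the threshold in each of her windows.
events = [("mallory", 1_000), ("mallory", 2_000), ("mallory", 3_000),
          ("alice", 1_500), ("alice", 70_000)]
print(detect_suspicious(events))  # {('mallory', 0)}
```

    In Kafka Streams the same shape would be expressed with `groupByKey().windowedBy(...).count()` over the input topic, with the framework handling state stores and fault tolerance.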

  • Apache Kafka @ Production

    JoyTunes

    18:00 - 18:30: Networking, mingling & refreshments.
    18:30 - 19:30: So You've Inherited Kafka… Now What? Alon Gavra, Platform Team Lead @ Appsflyer. YouTube Livestream: http://bit.ly/2SrZNc0
    19:30 - 20:00: Handling Transient Failures in Kafka Streams. David Ostrovsky @ Proofpoint. YouTube Livestream: http://bit.ly/2S6xel7

    *** All talks are delivered in English and live-streamed via YouTube ***

    First session description: Kafka is often just a piece of the production stack that no one wants to touch - because it just works. At AppsFlyer, a mobile attribution and analysis platform that generates a constant "storm" of 70B+ events (HTTP requests) daily, Kafka sits at the core of our infrastructure. Recently I inherited the daunting task of managing our Kafka operation and discovered a lot of technical debt we needed to pay down if we wanted to sustain our next phase of growth. This talk dives into how to safely migrate from outdated versions, how to earn developers' trust so they migrate their production services, how to manage and monitor the right metrics and build resiliency into the architecture, and how to plan for continued improvement through paradigms such as sleep-driven design, and much more.

    Bio: Alon Gavra has been with AppsFlyer for the past two years and today serves as the Platform Team Lead. Originally a backend developer, he transitioned to leading the real-time infrastructure team and took on the role of bringing some of the most heavily used infrastructure at AppsFlyer to the next level. A strong believer in sleep-driven design, Alon's main focus is stability and resiliency in building massive data ingestion and storage solutions.

    Second session description: In-order processing and strong delivery guarantees are two of Kafka Streams' greatest strengths. However, they come with an inherent weakness: you must finish processing each message before moving on to the next one in the partition. There is no built-in mechanism to retry handling a message without blocking the processing of the partition in question. At Proofpoint we rely on Kafka to move a lot of data between dozens of different services, which call external third-party APIs, perform I/O, or do various other things that are prone to temporary failures. We care a lot about end-to-end latency, so we're quite reluctant to implement local retry logic in every service, because that would add multiple seconds to the total processing time. We had to implement our own solution to retry temporary processing failures asynchronously, without blocking the processing of subsequent messages. In this session, we'll talk about the considerations that went into designing an asynchronous retry mechanism, why we eventually settled on our current implementation, and whether you might do something different for your own use case.

    Bio: A software developer and architect with over 18 years of industry experience, trainer, and author of multiple courses and books. Currently specializing in big data systems, NoSQL, distributed architecture and cloud computing. Has participated in designing and building dozens of large-scale distributed systems, using NoSQL databases such as Couchbase Server, Cassandra and MongoDB, and open source tools like Elasticsearch, Hadoop, Spark, Storm, Kafka and more. Hands-on experience with cloud environments, including Microsoft Azure, Amazon Web Services, and various private cloud stacks. Certified trainer, with over 20 successful courses taught in Israel and abroad. Experienced at leading a team of developers, defining tasks and project goals, and managing development resources. In-depth knowledge of the .NET Framework, including WPF, Win8, and ASP.NET, and extensive knowledge of database administration and programming with MS-SQL, MySQL, and Oracle.
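    The asynchronous-retry idea described above can be illustrated without Kafka: park a failed message in a delay queue keyed by its next-attempt time, keep consuming, and re-attempt it later with exponential backoff. A framework-free Python sketch follows; treating the message index as a clock, the attempt limit, and all names are simplifications invented here, not Proofpoint's actual implementation:

```python
import heapq

MAX_ATTEMPTS = 3

def process_stream(messages, handler, now=0):
    """Process messages in order; on failure, park the message in a
    retry heap keyed by its next-attempt time instead of blocking
    the rest of the partition."""
    retries = []   # min-heap of (due_time, attempt, message)
    dead = []      # messages that exhausted their retries
    done = []

    def attempt(msg, n, t):
        try:
            done.append(handler(msg))
        except Exception:
            if n + 1 >= MAX_ATTEMPTS:
                dead.append(msg)          # give up: dead-letter it
            else:
                backoff = 2 ** n          # exponential backoff
                heapq.heappush(retries, (t + backoff, n + 1, msg))

    for t, msg in enumerate(messages, start=now):
        # drain any retries that are due before handling the next message
        while retries and retries[0][0] <= t:
            due, n, m = heapq.heappop(retries)
            attempt(m, n, due)
        attempt(msg, 0, t)

    while retries:                         # flush remaining retries
        due, n, m = heapq.heappop(retries)
        attempt(m, n, due)
    return done, dead
```

    The key property is the one the talk calls out: a transiently failing message delays only itself, while later messages in the partition keep flowing. In a real deployment the retry heap would typically be a separate Kafka retry topic rather than in-process state.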

  • #ApacheKafkaTLV hosting Gwen Shapira

    JoyTunes

    18:00 - 18:30: Networking, mingling & refreshments.
    18:30 - 19:30: The Magical Consumer Group Protocol of Apache Kafka. Gwen Shapira, principal data architect @ Confluent. Join the livestream on YouTube: http://bit.ly/2ReFYVf
    19:30 - 20:00: Unlimited Kafka Messages. Maor Mordehay @ Alooma. Join the livestream on YouTube: http://bit.ly/2rQSDiN

    *** All talks are delivered in English and live-streamed via YouTube ***

    First session description: Very few people know that inside Apache Kafka's binary protocol for publishing and retrieving messages hides another protocol - a generic, extensible protocol for managing work assignments between multiple instances of a client application. When multiple Kafka consumers in the same consumer group subscribe to a set of topics, Kafka knows how to assign a subset of the topic partitions to each consumer and how to handle failover automatically. What is less known is that this assignment is determined by the consumer client itself, and that the same protocol can be used by any application for both leader election and task assignment. Let's dive into the internals of this little-known assignment protocol! We'll look in detail at how Kafka Consumers, Connect and the Streams API use this protocol for task management.

    Bio: Gwen Shapira is a principal data architect at Confluent, where she helps customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in building real-time, reliable data-processing pipelines using Apache Kafka. Gwen is an Oracle ACE Director, coauthor of Hadoop Application Architectures, and a frequent presenter at industry conferences. She is also a committer on Apache Kafka and Apache Sqoop. When Gwen isn't coding or building data pipelines, you can find her pedaling her bike, exploring the roads and trails of California and beyond.

    Second session description: As Kafka turns 8 this year, we look at how far we can push its boundaries. Just how extensible can Kafka be? It is a no-brainer that Kafka is the primary choice when looking for an open-source stream processing platform. Where it thrives is in handling relatively small messages. But what if you want to process large or very large messages? At Alooma, we are heavy Kafka users but found ourselves needing and wanting more. So we put our heads together to bring the community a version of Kafka Streams that can handle unlimited Kafka messages. Join us for a sneak peek of our open source project.
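    The partition-assignment step of the protocol described in the first talk can be sketched in a few lines. Below is a single-topic, framework-free Python model loosely following Kafka's range-style assignor (sort both sides, give each consumer a contiguous slice); the real assignors live in the consumer client and work per topic, and all names here are hypothetical:

```python
def range_assign(consumers, partitions):
    """Distribute partitions over consumers the way a range-style
    assignor does for one topic: sort both sides, hand each consumer
    a contiguous slice, with the first consumers absorbing the remainder."""
    consumers = sorted(consumers)
    partitions = sorted(partitions)
    per, extra = divmod(len(partitions), len(consumers))
    assignment, start = {}, 0
    for i, c in enumerate(consumers):
        take = per + (1 if i < extra else 0)
        assignment[c] = partitions[start:start + take]
        start += take
    return assignment

print(range_assign(["c1", "c2"], [0, 1, 2]))  # c1 gets [0, 1], c2 gets [2]
```

    The point the talk makes is that this computation runs inside one elected consumer (the group leader), not on the broker, which is why the same machinery can be reused for arbitrary task assignment.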

  • Apache Kafka Streams Workshop 2/4

    Labs Tel Aviv - Azrieli Sarona Tower

    No talking, just coding! Last chance to catch up with our workshop series!

    18:00 - 18:30: Mingling, Pizza & Beer.
    19:00 - 20:30: Hands-On Kafka Streams Workshop 2/4.

    This is the second workshop in a four-part series dedicated to Kafka Streams. In these workshops, you'll learn how to develop and deploy microservices based on Kafka Streams. After this session, you will be able to develop production-ready stream processing services. Basic knowledge of Apache Kafka and experience in Java are required!

    Bring your laptops - Linux machines only (if you're running Windows, please prepare a VirtualBox Linux machine) - with Java 8 and Kafka 0.11+ installed. Code will be available on our GitHub repo: https://github.com/ApacheKafkaTLV/kstreams [Stay tuned to the ApacheKafkaTLV Slack channel - we'll provide more details]

    The workshop will be delivered by:
    Vladi Feigin - R&D Software Architect @ LivePerson
    Omer Ornan - Software Developer @ LivePerson
    Dimitry Wolfson - Software Developer @ LivePerson
    Ori Donner - DataOps Engineer @ SQream, the fastest SQL GPU DB!

  • Apache Kafka Streams Workshop. Part 1

    Labs Tel Aviv - Azrieli Sarona Tower

    18:00 - 18:30: Mingling, Pizza & Beer.
    18:30 - 19:00: Lecture - a short intro to stream processing and a comparison between leading stream processing frameworks.
    19:00 - 20:30: Hands-On Workshop, Part 1: Stateless Kafka Streams.

    This is the first workshop in a series dedicated to Kafka Streams. In these workshops, you'll learn how to develop and deploy microservices based on Kafka Streams. After this session you will be able to develop production-ready stream processing services. The first session is for Kafka Streams beginners; the next sessions will cover more advanced material. Basic knowledge of Apache Kafka and experience in Java are required.

    Prerequisites: bring your laptops, with Java 8 and Kafka 0.11+ installed. [Stay tuned to the meetup channel - we'll provide more details]

    The lecture and workshop will be delivered by:
    Vladi Feigin - R&D Software Architect @ LivePerson
    Omer Ornan - Software Developer @ LivePerson
    Dimitry Wolfson - Software Developer @ LivePerson

  • KSQL Deep Dive, Kai Waehner - Technology Evangelist at Confluent

    Labs Tel Aviv - Azrieli Sarona Tower

    ----> YouTube Live Stream: http://bit.ly/2Kuyveu <----

    18:00 - 18:30: Mingling, Pizza & Beer.
    18:30 - 19:15: KSQL Deep Dive, Part I.
    19:15 - 19:30: Break.
    19:30 - 20:15: KSQL Deep Dive, Part II.
    20:15 - 20:30: Q&A.

    KSQL Deep Dive Part I:
    - Apache Kafka ecosystem.
    - Kafka Streams as the foundation for KSQL.
    - The motivation for KSQL.
    - Live Demo #1 - KSQL intro.

    KSQL Deep Dive Part II:
    - KSQL architecture.
    - Live Demo #2 - Clickstream analysis.
    - Getting started.

    The rapidly expanding world of stream processing can be daunting, with new concepts to master such as various types of time semantics, windowed aggregates, changelogs, and programming frameworks. KSQL is an open-source, Apache 2.0-licensed streaming SQL engine on top of Apache Kafka which aims to simplify all this and make stream processing available to everyone. Even though it is simple to use, KSQL is built for mission-critical, scalable production deployments (using Kafka Streams under the hood). Benefits of using KSQL include: no coding required; no additional analytics cluster needed; streams and tables as first-class constructs; and access to the rich Kafka ecosystem. This session introduces the concepts and architecture of KSQL and discusses use cases such as streaming ETL, real-time stream monitoring, and anomaly detection. A live demo shows how to set up and use KSQL quickly and easily on top of your Kafka ecosystem.

    Kai Waehner works as a Technology Evangelist at Confluent. Kai's main areas of expertise lie within the fields of big data analytics, machine learning / deep learning, messaging, integration, microservices, stream processing, the Internet of Things and blockchain. He is a regular speaker at international conferences such as JavaOne, O'Reilly Software Architecture and ApacheCon, writes articles for professional journals, and shares his experiences with new technologies on his blog (www.kai-waehner.de/blog). Contact and references: [masked] / @KaiWaehner / www.kai-waehner.de

  • The Move From Micro-Service To Event-Sourcing Architecture

    Agenda:
    18:00 - 18:30: Networking, mingling & refreshments.
    18:30 - 19:30: Let's stop obsessing about data infrastructure. Yair Weinberger, CTO @ Alooma. YouTube live-stream: https://goo.gl/8K74Dp
    19:30 - 20:30: Distributed Kafka Architecture, Taboola Scale. Tal Sliwowicz & Lior Chaga @ Taboola. YouTube live-stream: https://goo.gl/2V6HhE

    First session description: Stop obsessing about data infrastructure - the move from a micro-service-oriented architecture to a modern event-sourcing-based architecture. Presenting: Yair Weinberger, CTO at Alooma.

    Second session description: In this talk, we will present our multi-DC Kafka architecture and discuss how we tackle sending and handling 10B+ messages per day, with maximum availability and no tolerance for data loss. Our architecture includes technologies such as Cassandra, Spark, HDFS, and Vertica - with Kafka as the backbone that feeds them all. Presenting: Tal Sliwowicz, Director of R&D, Infrastructure Engineering, and Lior Chaga, Senior Software Engineer, Data Platform.

    *** All talks will be delivered in English ***
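    The event-sourcing architecture mentioned in the first talk rests on one idea: current state is derived by replaying an append-only log of events, rather than stored as mutable rows. A tiny framework-free Python sketch of that fold; the account domain and event names are invented for illustration:

```python
def apply(balance, event):
    """Fold one event into the current account state."""
    kind, amount = event
    if kind == "deposit":
        return balance + amount
    if kind == "withdraw":
        return balance - amount
    raise ValueError(f"unknown event: {kind}")

def replay(events, initial=0):
    """Current state is nothing more than a left fold over the log."""
    state = initial
    for e in events:
        state = apply(state, e)
    return state

log = [("deposit", 100), ("withdraw", 30), ("deposit", 5)]
print(replay(log))  # 75
```

    With Kafka as the log, a compacted topic plays the role of `log` and each consumer materializes its own view by replaying it, which is what lets the event stream serve as the system of record.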

  • Running Micro-Services on Apache Kafka

    WeWork Sarona

    Agenda:
    18:00 - 18:30: Networking, mingling, Pizza & Beer.
    18:30 - 19:30: Apache Kafka as a Pub/Sub Mechanism in a Microservice-Architected Software Platform. Kobi Hikri. YouTube live-stream: https://goo.gl/Y3s8QV
    19:30 - 20:30: Uses of Stream Processing, Microservices and DDD in FinTech. Anna Keren and Yoav Sharon, Funding Circle. YouTube live-stream: https://goo.gl/CoiHQF

    First talk description: In this lecture we will review the aspects of Apache Kafka that make it a serious candidate for use as a publisher/subscriber messaging system - in particular for our microservice-architecture software platform. We will discuss the built-in support for load balancing, as well as scaling and fault tolerance.

    Kobi Hikri is a self-proclaimed software simplifier. Kobi mostly deals with distributed systems and is an avid software geek. On top of consulting for software enterprises, Kobi writes content for Pluralsight and hack.guides(). In his spare time, you will find him on his mountain bike or with his camera.

    Second talk description: In this talk we'll describe the transition we've made at Funding Circle towards relying on Kafka as the source of truth and system of record for our entire marketplace activity. Funding Circle is the leading global financial technology marketplace connecting investors and borrowers, where the borrowers are small businesses. Over the past 2.5 years we've been migrating our entire stack to use event stream processing on Kafka Streams to solve scaling challenges. We're using DDD when building our new stack, and we'd be happy to share the main lessons we've learned, the tools we've been using and building, and the challenges we're still facing. We'll address the above from both product and engineering perspectives.

    Anna is an engineering manager with over six years of experience in the financial services industry and a software engineering background. Yoav is the head of investor product at Funding Circle, leading the migration to the new architecture from the product point of view.

  • Kafka RDBMS bi-directional integration

    Labs Tel Aviv - Azrieli Sarona Tower

    Agenda:
    18:00 - 18:30: Networking, Mingling, Pizza & Beer.
    18:30 - 19:30: Streaming millions of events per second from Kafka to RDBMS. Shani Einav, Alooma. YouTube live-stream: https://goo.gl/cfyQ2R
    19:30 - 20:30: Building an end-to-end streaming analytics application from RDBMS to dashboard with Kafka. Björn Rost, Pythian. YouTube live-stream: https://goo.gl/ZEXApa

    First talk description: We'll talk about how to use Apache Kafka to build a data pipeline that processes and loads data from Kafka to any RDBMS in real time. We'll hit three main points on the challenges of building a data pipeline:
    - Transforming data in real time while making sure you still deliver exactly once.
    - Processing efficiently by decoupling processing from loading.
    - Loading data into an RDBMS while maintaining the exactly-once promise.
    We'll also dive into the last T in ETLT: transforming data *after* you've already loaded it, to hit that last layer of exactly-once. All of this with a real-life example of Kafka to Redshift!

    First speaker bio: Shani is a backend Java developer who for the past two years has been building a complex stream processing engine on top of Kafka. In the past, she was part of a core backend team at Taboola. Shani holds a degree in Computer Science from the Hebrew University and is currently working towards an MSc in Machine Learning at Tel Aviv University.

    Second talk description: Apache Kafka is a massively scalable message queue that is being used in more and more places, connecting more and more data sources. This presentation will introduce Kafka from the perspective of a mere mortal DBA and share the experience of (and challenges with) getting events from the database to Kafka using Kafka Connect, including poor-man's CDC using flashback queries and traditional logical replication tools. To demonstrate how and why this is a good idea, we will build an end-to-end data processing pipeline. We will discuss how to turn changes in database state into events and stream them into Apache Kafka. We will explore the basic concepts of streaming transformations using windows and KSQL before ingesting the transformed stream into a dashboard application.

    Second speaker bio: Björn Rost is an Oracle Developer Champion, ACE Director, and one of Pythian's top Oracle experts. A popular presenter, Björn travels the world attending technology conferences, sharing insights, and learning with his wide network of peers. Björn is attending ILOUG Tech Days 2018 in Israel.
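    The "poor-man's CDC" idea mentioned in the second talk - turning table changes into events by repeatedly polling for rows modified since a high-water mark, rather than reading the database's change log - can be sketched without a database. This is a generic polling variant, not the Oracle flashback-query approach the talk names; all names are hypothetical, and a production pipeline would typically use Kafka Connect or a log-based CDC tool:

```python
def poll_changes(rows, last_seen):
    """rows: iterable of (id, updated_at, payload) tuples.
    Emit rows modified after last_seen and return the new high-water mark,
    which the caller persists and passes into the next poll."""
    changed = [r for r in rows if r[1] > last_seen]
    new_mark = max((r[1] for r in changed), default=last_seen)
    return changed, new_mark

# One polling cycle: rows 2 and 3 changed since the last mark of 15,
# so they become events and the mark advances to 30.
table = [(1, 10, "a"), (2, 20, "b"), (3, 30, "c")]
events, mark = poll_changes(table, last_seen=15)
print(events, mark)
```

    The weakness worth noting is the same one log-based CDC fixes: polling by timestamp misses deletes and any row updated twice between polls.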
