• Building modern data pipelines by unifying Apache Pulsar, Heron and BookKeeper

    Description: Data pipelines are hard to build and maintain. This is due to the complexity of the big data open source ecosystem, which has numerous software projects, each specializing in solving one piece of the puzzle. In this talk, we will focus on three key open source projects, Apache Pulsar, Apache Heron and Apache BookKeeper, and how they are integrated to make it easy to build data pipelines. Abstract: For today’s enterprises, ensuring that data pipelines are available to every corner of the organization is key to building next generation data-driven applications. In this talk, Karthik Ramasamy of Streamlio will present how to combine three best-of-breed open-source projects into a solid data infrastructure that is easy to develop against and simple to operate at scale in production. He will provide an overview of the merits of the three open source systems and the benefits they bring when integrated: Apache Pulsar (unified queuing and streaming), Apache Heron (stream processing), and Apache BookKeeper (distributed stream storage). Bio: Karthik Ramasamy is the co-founder of Streamlio, which focuses on building next generation real-time processing engines. Before Streamlio, he was the engineering manager and technical lead for real-time analytics at Twitter, where he co-created Twitter Heron. He has two decades of experience working in parallel databases, big data infrastructure, and networking. Karthik is the author of several publications, patents, and the book "Network Routing: Algorithms, Protocols and Architectures". He has a Ph.D. in computer science from the University of Wisconsin, Madison with a focus on big data and databases. Important to know: Please provide your full name for building security.

  • Live Webinar - How Apache Spark Adoption is Empowering Data Driven Enterprises

    Join experts from StreamAnalytix and guest Forrester Research on Oct 13 and learn why Apache Spark is becoming the de facto technology choice for stream processing, real-time analytics, data science and machine learning applications at scale. Topics to be covered: 1. What is driving Spark adoption: influencers, trends, compelling capabilities, use cases, and challenges or inhibitors. 2. Impetus customer success stories around real-time solutions with Spark/StreamAnalytix. 3. An overview of Impetus Visual Spark Studio, a free, newly downloadable IDE that offers breakthrough productivity to learn, develop and deploy Spark-based real-time and advanced analytical applications. When: October 13 (10:00 am PT / 1:00 pm ET). Register now: http://bit.ly/2xO2mv2

  • Creating a successful SRE program like Netflix and Google

    This talk is for engineers, DevOps practitioners, and architects who want to learn how to run production systems: what to monitor, what to alert on, how to structure the team, how to design the deployment process, and so on. There are many talks on how to work with data using Kafka, Flink, Spark and other streaming technologies, but I have not come across many talks dedicated to operating these services, and many other stateless or stateful services, reliably. In this talk we will learn from Jonah Horowitz (@Stripe) and Blake Bisset (Google) what SRE is and how to build it. The event starts at 7:00 pm sharp; doors open at 6:45 pm. Please do not come earlier than 6:45 pm because building security will not let you in before then. Abstract: What isn’t site reliability engineering? Lots of companies claim to have SRE teams, but some don’t quite understand the full value proposition, or which shiny technologies and organizational structures will negatively impact your operations rather than empower your team to accomplish your mission. Jonah Horowitz from Stripe and Blake Bisset (formerly at Google) share stories about anti-patterns in monitoring, incident response, configuration management, and more that they’ve tripped over on their own teams, seen proposed as good practice in talks at other conferences, or heard in talks with peers in the industry. Jonah also explains how Google and Netflix view the role of the SRE and how it differs from the traditional system administrator role. You’ll learn that freedom and responsibility are key, trust is required, and chaos is (sometimes) your friend. Speakers: Jonah Horowitz (https://www.linkedin.com/in/jonahhorowitz/) (Stripe) Jonah Horowitz is a site reliability engineer at Stripe, where he works with all of the company’s individual engineering teams to drive reliability efforts, including monitoring, alerting, deployment pipelines, and chaos resiliency. 
Previously, Jonah worked at several startups around the Bay Area, including Netflix, Quantcast (a leading ad-tech startup, where he grew their network to process over three million events per second), and Looksmart (a contextual advertising company), and was on the founding team of Wal-Mart.com (now @Walmart Labs), where he built out the company’s software deployment pipelines and its product image management systems. Blake Bisset (https://www.linkedin.com/in/bisset/) (Google) Blake Bisset got his first legal tech job at 16. He won’t say how long ago, except that he’s legitimately entitled to make shaky fists while shouting, “Get off my LAN!” He’s cofounded three startups (a joint venture with Dupont/ConAgra, a biotech spinoff from UW, and one that started this time a bunch of kids were sitting around on New Year’s Eve, wondering why they couldn’t watch movies on the internet), only to end up spending a half-decade as an SRM at YouTube and Chrome, where his happiest accomplishment was holding the go/bestpostmortem link for several years. Agenda: 6:45 - 7:10 - networking 7:15 - 8:00 - Creating a successful SRE program like Netflix and Google and Q&A 8:00 - 8:15 - Wallaroo Announcement Wallaroo is an ultrafast and scalable data processing engine that rapidly takes you from prototype to production by eliminating infrastructure complexity. A variety of applications can be built with Wallaroo, from microsecond response to long-running analysis, including monitoring, analytics, model training, predictive analytics, and microservices. Our goal with Wallaroo is to make it really simple to deploy and scale, with the broadest developer support and the best performance! Wallaroo Core will be available as open source (under Apache 2) on 9/29/2017. In this quick talk, you will get an overview of Wallaroo and learn how you can participate in this new community. 
About Stripe: Stripe is a US technology company that operates in over 25 countries and allows both private individuals and businesses to accept payments over the Internet. About our host - Lifion (http://lifion.com/about/): Lifion (an ADP company) is transforming a world of HR pain into useful tools and meaningful experiences for millions of people worldwide. They are bringing together some of the brightest developers, architects and designers in the industry to create a next-generation HR platform.

  • Apache Spark Structured Streaming Upgrade and How Enterprises Can Benefit

    Attend this webinar to learn about Structured Streaming and how it can help you perform complex streaming analytics with improved speed-to-insight. Topics to be covered: 1. Evolution of Spark and its functionality to date, including version 2.2 2. Structured Streaming: technical overview, benefits and limitations 3. How to integrate Structured Streaming with the surrounding stack 4. Talent vs. tooling. Date: August 23 (9:30 am PT / 12:30 pm ET). Register now: http://bit.ly/2frZ1Ku

  • Meetup @ QConNewYork - Survival of the Fittest - Streaming Architectures

    NOTE DIFFERENT LOCATION: Marriott Marquis. This month we are partnering with QCon NYC (https://qconnewyork.com/) and holding a session on one of the evenings of the conference. Michael Hansen, Principal Data Engineer @hbcdigital, will give us his QCon talk, and we will follow up with a discussion and Q&A session with other QCon speakers from the streaming track. Survival of the Fittest - Streaming Architectures Abstract: “Perfect is the enemy of good” - Voltaire. On the journey through life, we learn and adapt via trial and error; software development is no different. We realize and accept that we won’t build the perfect solution the first time around; it takes many iterations. At Gilt.com, now part of HBC Digital, we started processing and streaming event data nearly 5 years ago. Our initial solution was dramatically different from our current solution, and it will likely be different from our solution 5 years from now. The Gilt.com banner, at HBC Digital, is in the business of flash sales, which makes for some interesting use cases in the world of streaming. We release new sales of top designer labels, at up to 70% off retail, on the web and our mobile app, every day at noon and 9 pm. Around the time of these releases, we experience volume spikes between 10X and 100X on our streams. Numerous streaming frameworks, homemade as well as open source, did not pass the evolutionary tests. Frameworks come and go, so this talk is not about the “best” framework or platform to use; rather, it’s about core principles that will stand the tests of streaming evolution. This talk also covers major potential pitfalls that you may stumble over on your path to streaming, as well as how to avoid them. Finally, this talk will cover the next evolutionary step in streaming at HBC Digital. 
Speaker: Michael Hansen (https://www.linkedin.com/in/danishdatageek/) Michael has two decades of technical and leadership experience in engineering big data, data warehouse, and data streaming systems, as well as the full-stack environments, automation, and tooling surrounding these. Currently he is working as Principal Data Engineer at HBC Digital, responsible for all things related to data plumbing. He holds a Bachelor of Science in operations research and industrial engineering from UC Berkeley. Agenda: 7:00 - 7:30 - networking 7:30 - 8:15 - the talk 8:15 - 9:00 - Q&A

  • Introduction to Sendence Wallaroo: An industrial-grade streaming data platform

    Introduction to Sendence Wallaroo: An industrial-grade streaming data platform. Presented by Vid Jain, Founder & President of Sendence and John Mumm, Lead Engineer. Note: An open source version of Sendence Wallaroo will be available in the coming months. Wallaroo turns a group of servers into an industrial-grade processing platform that acts like a single low-cost and highly scalable system. Real-time applications built with Wallaroo are significantly faster than other approaches and get data accuracy, exactly-once processing, an in-memory data store, and resiliency. Other benefits include: quickly writing code once and then deploying it anywhere at any scale; instantly handling real-time data spikes without any application changes; and running 50x faster at 1/3 the cost vs. alternatives. We will talk about use cases including an electronic trading position-keeping system that requires sub-millisecond response time and a cloud monitoring service that requires processing of millions of messages per second. Additionally, we will discuss why we built Wallaroo, the architecture, and the challenges of building a distributed industrial-grade platform. Agenda: 6:30 pm: Guest arrival & networking. 7:00 pm: Introduction to Wallaroo, Q&A. 8:30 pm: Event ends. Speaker bios: Vid Jain holds a Ph.D. from UC Berkeley and has spent the last 20 years pushing the technology envelope in various industries. He was a co-founder of an ad tech startup, worked in the Electronic Trading group at Merrill Lynch and most recently is founder & CEO of Sendence, which provides a software platform for building, deploying and operating serverless, real-time data applications. John Mumm holds a Ph.D. in philosophy from Fordham University and has spent the last several years working with distributed systems and functional programming. 
He is a lead engineer at Sendence, where he works on the core features of Wallaroo, an industrial-grade, low-latency stream processing system written in the Pony programming language. About Sendence Sendence provides software infrastructure that radically simplifies the creation, deployment & operation of any business critical real-time service, on-premise or in the cloud.

  • Don't miss this! Final call - for Hadoop Batch with special offer limited slots

    Hi, After the tremendous success of our last event on Big Data & Hadoop, KRATOES (http://www.kratoes.com/) is launching an on-demand HADOOP (http://unbouncepages.com/hadoop-course/) session. Join to learn, get mentored on your Big Data journey, and broaden your skills. Focus on: • BIG DATA AND HADOOP INTRODUCTION • HDFS ARCHITECTURE, HADOOP CONFIGURATIONS & DATA LOADING • INTRODUCTION TO MAP REDUCE • ADVANCED MAP REDUCE CONCEPTS • INTRODUCTION TO PIG AND ADVANCED PIG • INTRODUCTION TO HIVE AND ADVANCED HIVE • INTRODUCTION TO HBASE AND ADVANCED HBASE • SQOOP AND FLUME • BASIC OOZIE AND OOZIE CONFIGURATION. FOR MORE (http://www.kratoes.com/course-hadoop.html) This promises to be an extremely enriching session and we hope you can make it - Register Now (http://unbouncepages.com/hadoop-course/) All registered attendees will receive a free copy of our latest Hadoop Recording briefing that was created based on specific real-life experiences. http://unbouncepages.com/hadoop-course/ Cheers!

  • Twitter Heron in Practice

    Lifion by ADP

    Twitter generates billions and billions of events per day. Analyzing these events in real time presents a massive challenge. To meet our scaling and operational needs, Twitter designed and deployed Heron, our next generation streaming system. In this talk, we will provide an introduction to Heron, describe how it is being used at Twitter, and share our operating experiences and the challenges of running Heron at scale. We recently announced our open sourcing of Heron under the permissive Apache v2.0 license. Heron has been in production for nearly 2 years and is widely used by several teams for diverse use cases. Prior to Heron, Twitter used Apache Storm, which we open sourced in 2011. Heron features a wide array of architectural improvements and is backward compatible with the Storm ecosystem for seamless adoption. Following the talk, we will provide a hands-on experience with Heron. Bring your laptops and we will show you how to install, use, and operate Heron. Speaker: Karthik Ramasamy (https://www.linkedin.com/in/kramasamy) Karthik is the engineering manager for Real Time Analytics at Twitter and co-creator of Heron. He has two decades of experience working in parallel databases, big data infrastructure and networking. He cofounded Locomatix, a company specializing in real-time stream processing on Hadoop and Cassandra using SQL, which was acquired by Twitter. Before Locomatix, Karthik was at Juniper Networks, where he designed and delivered platforms, protocols, databases and high availability solutions for network routers that are widely deployed in the Internet. Before Juniper, at the University of Wisconsin, he worked extensively on parallel database systems, query processing, scale-out technologies, storage engines and online analytical systems. Several of these research projects were spun off as companies that were later acquired by Teradata. 
He is the author of several publications and patents, and of one of the best-selling books, "Network Routing: Algorithms, Protocols and Architectures." He has a Ph.D. in Computer Science from UW Madison. AGENDA: 6:30 - 7:00 - Registration and networking 7:00 - 7:45 - Heron Introduction - Karthik Ramasamy 7:45 - 8:30 - Hands on Heron

  • Live Webinar: Apache Spark VS Hadoop MapReduce


    Hello, We'd like to invite you to an expert live webinar on 'Apache Spark vs. Hadoop MapReduce (http://unbouncepages.com/sparkvsmapreduce-inter/)' scheduled on 22nd August 2016, Thursday 9:30 PM to 11:00 PM (EDT). TOPICS • Introduction to Big Data and its challenges • Introduction to Hadoop and its characteristics • Hadoop ecosystem • HDFS and MapReduce (YARN) • Advantages and disadvantages of Hadoop • Introduction to Spark and Scala • Why Spark and Scala • Data loading using RDDs • Differences between Spark and Hadoop This promises to be an extremely enriching session and we hope you can make it - Register Now (http://unbouncepages.com/sparkvsmapreduce-inter/) In case you can't make it, sign up anyway and we'll send you the recording. Cheers!