Open Source, Java, Data Pipelines & public city bike. What ties it all together?


Details
This time we are taking advantage of the visit to Madrid of Java Champion Rustam Mehmandarov (@rmehmandarov), leader of JavaZone and
one of the organizers of the Norwegian JUG (@javaBin), to invite him to share with us a real-world example of using Apache Beam.
We couldn't let this opportunity pass. Are you going to let it pass you by?
The talk will be in English; you can find the abstract below.
Abstract:
A few years ago, moving data between applications and datastores meant expensive monolithic stacks from the large software vendors, with little flexibility. Now, with frameworks like Apache Beam and Apache Airflow, we can schedule and run data processing jobs for both streaming and batch with the same underlying code.
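As a rough illustration of that portability (this is a minimal sketch, not code from the talk), the same Apache Beam pipeline in Java can be executed on different runners (DirectRunner, FlinkRunner, DataflowRunner, and so on) chosen at launch time; the input and output paths below are hypothetical:

    import java.util.Arrays;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.TextIO;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.Count;
    import org.apache.beam.sdk.transforms.FlatMapElements;
    import org.apache.beam.sdk.transforms.MapElements;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.TypeDescriptors;

    public class PortableWordCount {
      public static void main(String[] args) {
        // The runner is selected from the command line, e.g. --runner=FlinkRunner
        // or --runner=DataflowRunner; the pipeline code itself does not change.
        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
        Pipeline pipeline = Pipeline.create(options);

        pipeline
            .apply("ReadLines", TextIO.read().from("/tmp/input.txt"))          // hypothetical path
            .apply("SplitWords", FlatMapElements
                .into(TypeDescriptors.strings())
                .via((String line) -> Arrays.asList(line.split("\\s+"))))
            .apply("CountWords", Count.perElement())
            .apply("FormatResults", MapElements
                .into(TypeDescriptors.strings())
                .via((KV<String, Long> kv) -> kv.getKey() + ": " + kv.getValue()))
            .apply("WriteCounts", TextIO.write().to("/tmp/word-counts"));      // hypothetical path

        pipeline.run().waitUntilFinish();
      }
    }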
In this presentation we will demonstrate how this approach can glue your applications together, and we will show how we can run a data pipeline from Apache Kafka through Flink on Hadoop to Hive, and then move it to Pub/Sub, Dataflow and BigQuery by changing only a few lines of Java in our Apache Beam code.
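To give a feel for the "changing a few lines" idea, here is a hedged sketch (not the speaker's actual pipeline) of how only the source step might be swapped from Kafka to Pub/Sub while the rest of the Beam pipeline stays untouched; the broker address, topic names and project ID are made up for illustration:

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.kafka.KafkaIO;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class SwappableSourcePipeline {
      public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        // On-prem variant: read messages from an Apache Kafka topic.
        PCollection<KV<String, String>> messages = p.apply("ReadFromKafka",
            KafkaIO.<String, String>read()
                .withBootstrapServers("kafka:9092")             // hypothetical broker address
                .withTopic("bike-trips")                        // hypothetical topic name
                .withKeyDeserializer(StringDeserializer.class)
                .withValueDeserializer(StringDeserializer.class)
                .withoutMetadata());

        // Cloud variant: replacing the read step above is the source-side change, e.g.
        //   p.apply("ReadFromPubsub", org.apache.beam.sdk.io.gcp.pubsub.PubsubIO
        //       .readStrings().fromTopic("projects/my-project/topics/bike-trips"));
        // The downstream transforms stay the same; only the sink would similarly
        // switch from Hive (on-prem) to BigQuery (cloud).

        p.run().waitUntilFinish();
      }
    }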