Past Meetup

October Meetup: Datalakes and efficient pipelines for logs

113 people went


Hello everybody,

We are happy to announce our October meetup (the first one, 'cause yes, we will have a second one in October), with new subjects that we've been planning for some time: data lakes and log ingestion pipelines. We have 2 wonderful presenters: a long-time friend of our meetup, Radu Gheorghe (@ Sematext), and a newer friend, Cristina Grosu (@ Bigstep). Thank you, Cristina and Radu, for taking the time to be with our community.

So, the agenda for the evening:

6:30 - 7:00 PM gathering and socializing

7:00 - 7:40 PM Efficient pipelines for logs, metrics and the like, Radu Gheorghe (@ Sematext)

This session focuses on the performance and reliability trade-offs of building a pipeline for log analytics. The context is pushing data to a search engine (Elasticsearch/Solr), where the pipeline often takes more resources than the storage itself. We can and should do better :). The tools we'll focus on are Logstash, rsyslog, Beats and Kafka. None of them is limited to logs (e.g. they can be used for metrics) and they're not limited to Elasticsearch/Solr either, so please bring your own big data use-case.

7:40 - 8:20 PM Datalake Architectures Seen in Production, Cristina Grosu (@ Bigstep)

Cristina will be talking about common architectures and challenges encountered while building data lakes using open source tools. Why data lakes? Companies are choosing this strategy to democratize access to data that is usually stored in enterprise data warehouses and managed, accessed and operated mainly by IT professionals. We'll show some data lake implementations running in production and built with Hadoop ecosystem technologies.

8:20 - 9:30 PM Pizza, drinks and socializing, sponsored by Netopia, our happy and steady sponsor.

Hope to see many of you at our meetups,