Elasticsearch as a Primary Store @ Bol.com and Logging @ Adyen

Elastic User Group NL


Hi all,

It's time to meet again, this time at the offices of Adyen in Amsterdam. We're going to have two talks about two very different use cases (see details below).

Agenda is as follows:

18:30 Doors open with some food and drinks
19:00 First talk: "Using Elasticsearch as the Primary Data Store" by Volkan Yazici of Bol.com
19:45 Short break
20:00 Second talk: "Logging at scale: how we grew from 1TB to 1PB+ and why we had to start from scratch" by Lucian Grosu of Adyen
20:45 Wrap-up drinks

See you all there!

More details about the talks and speakers:

:: Using Elasticsearch as the Primary Data Store

The biggest e-commerce company in the Netherlands and Belgium, bol.com, set out on a 4-year journey to rethink and rebuild their entire ETL (Extract, Transform, Load) pipeline, which has been cooking up the data used by its search engine since the dawn of time. This more-than-a-decade-old white-bearded giant, breathing in the dungeons of shady Oracle PL/SQL hacks, was in a state of decay, causing ever-increasing hiccups in production. A rewrite was inevitable. After drafting many blueprints, we went for a Java service backed by Elasticsearch as the primary storage! This idea sent shivers down the spines of even the most senior Elasticsearch consultants we hired, so to ease your mind I’ll walk you through why we took such a radical approach and how we managed to escape our legacy.

Speaker bio: Volkan has been working as a Java plumber in the domain of e-commerce search since 2014. In addition to his daily rescue trips to the land of "Java-based reactive microservices flavored by Spring, Elasticsearch, and RDBMS hazards", he enjoys performing public service for Log4j, Reactor, and OpenJDK Project Loom. His spare time (read as "nights") is mostly occupied by the maintenance of certain Log4j plugins and J2EE record-and-replay suites. Prior to that, you could have found him coding for embedded devices, sending patches to PostgreSQL, implementing data structures in Lisp, and developing distributed software-defined network (SDN) controllers. He holds an internationally accredited Permanent Head Damage, aka, PhD.

:: Logging at scale: how we grew from 1TB to 1PB+ and why we had to start from scratch

It goes without saying that log processing and management are an important part of any infrastructure that expands beyond
4-5 servers. Furthermore, they provide new opportunities, like performing automated actions based on log patterns and
specific events or preventing security incidents.

I'll be talking about how we scaled one of our logging clusters from a few machines to multiple racks' worth of servers,
what choices we made along the way, some of the issues we experienced and how we fixed them. Emphasis will be placed on the
role that the log sources play in the choices made when designing specific parts of the logging pipeline.

In the end, we'll look into why, even after successfully scaling to this size, we had to start from scratch.

Speaker bio: I'm Lucian, a Linux System Engineer focusing on automation, logging and monitoring at Adyen, and a father, husband,
motorcycle enthusiast and OpenBSD fan in my spare time.