
About us
A group for experienced and aspiring data professionals.
Join our Slack: https://datatalks.club/slack.html
Upcoming events (4)

Context Engineering for Agentic Hybrid Applications
Online · Research survey and upcoming trends discussion - Ivan Potapov and Tobias Lindenbauer
As an agent keeps running, its context window balloons with tool logs, stale diffs, and repeated data dumps. The model starts drowning in irrelevant details and falls victim to "lost-in-the-middle" effects, missing critical facts buried deep in oversized prompts.
We'll walk through research on keeping only high-signal observations: masking vs. summarization trade-offs, compressing bulky tool output (drawing on ideas like LLMLingua-2), and pruning dead branches from the agent's trajectory so it stops dragging noise forward. We'll also share insights on cutting LLM call costs along the way.
Then we'll connect those techniques to bigger-picture design: memory hierarchies (session → working set → notes → cross-session) and standardized tool interfaces like MCP that reduce "context debt" and keep the agent's working set clean.
Finally, we'll look at where the field is heading — toward a world where Context Engineering becomes something you train, not just script.
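To make the masking idea above concrete, here is a minimal sketch of observation masking: older tool outputs are replaced with a short placeholder so only the most recent few stay in full. All names here (message shape, `PLACEHOLDER`, `mask_old_observations`) are illustrative, not taken from any specific framework or from the talk.

```python
# Minimal sketch of observation masking for agent context management.
# Older tool observations are replaced with a placeholder; the last
# `keep_last` observations are kept verbatim.

PLACEHOLDER = "[observation masked]"

def mask_old_observations(messages, keep_last=3):
    """Return a copy of the message list with all but the last
    `keep_last` tool observations replaced by a short placeholder."""
    tool_indices = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    to_mask = set(tool_indices[:-keep_last]) if keep_last else set(tool_indices)
    masked = []
    for i, m in enumerate(messages):
        if i in to_mask:
            # Copy the message so the original trajectory is untouched.
            masked.append({**m, "content": PLACEHOLDER})
        else:
            masked.append(m)
    return masked
```

Unlike LLM-based summarization, this needs no extra model call, which is one reason masking can be the more computationally efficient strategy.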
About the Speakers:
Tobias Lindenbauer is an AI researcher at JetBrains Research, where he advances efficient and effective code agents that robustly solve long-horizon software engineering tasks. Currently, he is most interested in efficiency topics, context management, interpretability and data synthesis. He recently presented “The Complexity Trap: Simple Observation Masking Is as Efficient as LLM Summarization for Agent Context Management” at the Deep Learning for Code workshop at NeurIPS 2025, highlighting practical pitfalls of LLM summarization-based context strategies and evidence for more computationally efficient alternatives.
Ivan Potapov is a Research Engineer in Discovery Search & Ranking at Zalando, where he builds search retrieval and ranking systems. He teaches data engineering, AI agents, and LLM alignment, with a focus on bridging software engineering and applied ML. His recent work centers on long-running agents and context engineering—memory, state, and retrieval—exploring why many code-first agent designs fall short. His key thesis: context management is becoming something we train and iterate on, not just script. https://blog.ivan.digital/context-engineering-for-agentic-hybrid-applications-why-code-agents-fail-and-how-to-fix-them-076cab699262
69 attendees
Hands-On Data Engineering: From Zero to Billion-Row Analytics [IN-PERSON!]
Hotel Telegraphenamt, Monbijoustraße 11, Berlin, DE
# Workshop: From Zero to 1 Billion Rows
Process 1 billion+ rows of real-world NHS prescription data with Exasol Personal.
Install the database, build a data pipeline with workflow orchestration, and create an AI-powered dashboard.
## Workshop Outline
- Configure Exasol Personal on AWS
- Build a reliable data ingestion pipeline
- Ingest 1B+ rows with data cleaning in staging
- Create data warehouse tables ready for analytics
- Build an AI-powered dashboard for instant analytics
Note: You need an AWS account with admin permissions if you want to install Exasol Personal yourself. Otherwise we will provide access to a pre-configured Exasol instance for your experiments during the workshop.
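The staging step in the outline follows a common pattern: load raw rows into a staging table, clean them there, then promote only valid rows to the warehouse table. The sketch below uses sqlite3 purely so it runs anywhere; the Exasol syntax, table names, and columns used in the workshop will differ, and the schema here is invented for illustration.

```python
# Illustrative staging-then-warehouse ingestion pattern (sqlite3 stand-in;
# not the workshop's Exasol setup). Malformed rows are filtered during
# promotion from staging to the analytics table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_prescriptions (practice TEXT, items TEXT);
    CREATE TABLE prescriptions (practice TEXT, items INTEGER);
""")

# Raw feed: one malformed row ("n/a") that cleaning should drop.
conn.executemany(
    "INSERT INTO staging_prescriptions VALUES (?, ?)",
    [("A81001", "12"), ("A81002", "n/a"), ("A81003", "7")],
)

# Clean in staging, then promote only numeric rows to the warehouse table.
conn.execute("""
    INSERT INTO prescriptions
    SELECT practice, CAST(items AS INTEGER)
    FROM staging_prescriptions
    WHERE items GLOB '[0-9]*'
""")
```

Keeping raw data in staging means failed loads can be inspected and re-cleaned without re-ingesting the source files.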
## Details
- Date: March 10, 2026
- Time: 18:00 (doors open earlier)
- Location: Hotel Telegraphenamt, Berlin
This workshop is part of Exasol Xperience 2026, an in-person gathering in Berlin for customers, prospects, and partners. The event offers customer success stories, use cases, panel discussions, and a chance to connect with the Exasol community. DataTalks.Club community members can attend the conference for free with code EXA-VIP-RDTC.
https://www.exasol.com/events/exasol-xperience/registration/
## Speaker
Alexey Grigorev - Founder of DataTalks.Club, principal data scientist, and creator of the Zoomcamp free course series. He teaches 100,000+ students worldwide and has 16+ years of software engineering experience.
This workshop is sponsored by Exasol. Thank you for supporting our community!
144 attendees
How to Evaluate MCP-powered AI Agents Beyond Accuracy using Agent GPA
Online · This hands-on workshop introduces the Agent Goal-Plan-Action (Agent GPA) framework, a practical and advanced method for evaluating and improving AI agents.
Moving beyond simple final-answer scoring, Agent GPA focuses on the agent's entire reasoning process: evaluating goal achievement efficiency, plan logic, appropriate tool usage, and execution follow-through.
Agent GPA has achieved state-of-the-art benchmark results on TRAIL/GAIA, with 95% error coverage and 86% error localization, demonstrating the power of process-level evaluation over simple final-answer scoring.
We'll move beyond simple accuracy to analyze the agent's behavior holistically through the Agent GPA lens, which gives a deeper view of the agent's working process. Using Agent GPA, you will diagnose and iteratively improve the agent's performance, specifically addressing frequent issues like planning failures, tool selection errors, and execution gaps. You'll discover how seemingly minor changes, particularly in tool definitions, can lead to measurable improvements in tool selection and tool calling.
What you’ll learn:
- How to build an AI agent powered by Snowflake MCP
- How agents discover and choose tools through MCP
- How to design tool descriptions that influence agent behavior
- How to evaluate agent quality using structured metrics
- How to compare agent versions using observability and traces
- Why data grounding matters for reliable agents
What we’ll do:
- Build an initial agent version connected to Snowflake MCP
- Evaluate its performance using TruLens metrics
- Identify failure modes in tool selection and tool calling
- Improve MCP tool definitions using a coding agent
- Rebuild and re-evaluate a second agent version
- Compare both versions side by side using their traces and evaluation data
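The side-by-side comparison in the last step can be sketched as a simple aggregation over per-trace evaluation records. The record shape and metric names below (goal, plan, action) are hypothetical placeholders for Agent GPA-style scores, not the TruLens API.

```python
# Illustrative comparison of two agent versions using per-trace
# evaluation records. Each record maps metric names to scores in [0, 1].

def compare_versions(records_v1, records_v2, metrics=("goal", "plan", "action")):
    """Average each metric per version and report the delta (v2 - v1)."""
    def averages(records):
        return {m: sum(r[m] for r in records) / len(records) for m in metrics}
    a, b = averages(records_v1), averages(records_v2)
    # Positive deltas mean the second version improved on that metric.
    return {m: round(b[m] - a[m], 3) for m in metrics}
```

For example, if the reworked tool definitions lift the average plan score while goal and action stay flat, the delta dict makes that improvement visible at a glance.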
The workshop uses a concrete example: a health research agent, grounded on clinical trials and PubMed data available from the Snowflake Marketplace.
By the end of the session, you’ll understand how to evaluate AI agents using the Agent GPA framework and move beyond simple accuracy or final-answer scoring. You’ll learn how to analyze an agent’s goals, plans, tool usage, and execution, diagnose failures, and iteratively improve agent performance using structured evaluation and observability.
Please come prepared with a fresh Python environment (such as Jupyter) to run the lab.
About the speaker:
Josh is a developer advocate for AI and Open Source at Snowflake, previously at TruEra (acquired by Snowflake). He is also a maintainer of TruLens, an open-source library for systematically tracking and evaluating LLM-based applications. Josh regularly delivers tech talks and workshops at events including PyData, Devoxx, AI_Dev, AI DevWorld, and AI Camp meetups. He has also developed courses and taught students on a variety of platforms, including Coursera, DeepLearning.ai, Udemy, and DataCamp, and served as an advisor for Trustworthy Machine Learning at Stanford.
This post is sponsored by Snowflake.
35 attendees
Past events (371)


