Exploring Advanced Data Search Strategies


Details
Join us on Wednesday, February 26, for the first 2025 Bucharest Big Data meetup, focused on advanced search methods in data retrieval.
First, Emil Calofir, Head of AI at eSolutions.tech, will cover indexing methods for unstructured data: sparse and dense embeddings, different encoders, and essential document processing steps, ending with a live demo. Then Radu Gheorghe, Software Engineer at Vespa.ai, will discuss the trade-offs of combining mutable and immutable data structures for real-time search and how Vespa's tensor capabilities enhance vector search.
Read on to meet our speakers and the topics they will cover.
First talk:
Search Techniques Using Embeddings
This presentation will cover the indexing techniques for unstructured data that we have tried, including sparse and dense embeddings, along with their strengths and applications. We will also discuss different types of encoders, essential document processing steps, and practical comparisons. We will end the presentation with a short demo.
About the speaker:
Emil Calofir, Head of AI // eSolutions
Emil is an experienced software engineer, with a solid background in the IT and services industry, including healthcare, gaming, telecom, banking, retail, and logistics. He is passionate about Generative AI and is currently the Head of AI at eSolutions.tech, where he leads the integration of AI solutions into different projects. Emil started his career in mobile game development but has since moved into software engineering, making important contributions at Deutsche Bank and Orange Romania, especially in API development, microservices, and data warehousing.
Second talk:
Lexical, Vector, and Hybrid Search With Vespa
Vespa is an open-source search engine optimized for quick, flexible, and scalable retrieval. We'll take a short but deep dive into its capabilities, starting with lexical search: what trade-offs does Vespa make when combining mutable and immutable data structures to provide real-time search? Then we'll move to vector search: how do Vespa's tensors support new similarity models such as ColBERT or ColPali? Finally, we'll look at phased ranking and how you can use it to balance precision against performance.
About the speaker:
Radu Gheorghe, Software Engineer // Vespa.ai
Radu has been working full-time with search for about 13 years, most of them with Lucene-based search engines: Elasticsearch, Solr, and OpenSearch. While he is passionate about Vespa (he should be, he works at Vespa.ai), he can't help but draw parallels and contrasts to frame design choices and their trade-offs.
Agenda
18:30 - 19:00 - Welcome & Networking
19:00 - 19:40 - Emil Calofir: Search Techniques Using Embeddings
19:50 - 20:30 - Radu Gheorghe: Lexical, Vector, and Hybrid Search With Vespa
20:30 - 21:30 - Networking
The event is hosted by ING Hubs Romania. Join us on Wednesday, February 26, at their office (174-176 Calea Victoriei).
This is an in-person event, and presentations will be in English. Please RSVP to secure your spot.
See you there!
