What we’re about
This meetup is for people working with unstructured data. Speakers present on related topics such as vector databases, LLMs, and managing data at scale. The intended audience includes machine learning engineers, data scientists, data engineers, software engineers, and PMs.
This meetup was formerly the Milvus Meetup and is sponsored by Zilliz, the maintainers of Milvus.
Upcoming events (4)
- Unstructured Data in LLMs
88 Colin P Kelly Jr St, San Francisco, CA
This is an in-person event! Registration is required in order to get in. GitHub will email you a form the day before the event, which you will need to complete for your access pass.
Topic: Connecting your unstructured data with Generative AI
What we’ll do:
Have some food and refreshments. Hear three exciting talks and a demo about unstructured data and generative AI.
5:30 - 6:00 - Welcome/Networking/Registration
6:05 - 6:30 - Sourabh Agrawal, Co-founder and maintainer, UpTrain
6:35 - 7:00 - Jiang Chen, Head of Ecosystem & AI Platform, Zilliz
7:05 - 7:30 - Shangyin Tan, key contributor, DSPy
7:35 - 7:45 - Community demo - Ben Cerchio, Co-founder, Secludy
7:45 - 8:30 - Networking
Tech Talk 1: Challenges associated with using LLM-as-a-judge
Speaker: Sourabh Agrawal
Abstract: Using LLMs to judge the quality of LLM applications has gained a lot of interest recently, and rightly so: the approach is highly scalable and addresses the subjective nature of human evaluations. However, building production-grade evaluations is much more complicated than prompting an LLM to act as a judge and grade a given response. In this talk, we will cover the key techniques employed in industry and academia to define LLM-based evaluations effectively, discuss the associated challenges, and look at what lies beyond evaluation. We will also walk through real-world instances of how these evaluations can be leveraged to improve your LLM applications.
Tech Talk 2: Building production ready data pipelines with Milvus and Spark
Speaker: Jiang Chen
Abstract: Spark is a widely used ETL tool for processing, indexing, and ingesting data into the serving stack for search. Milvus is a production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data, extract vector representations, and push the vectors to the Milvus vector database for search serving.
Tech Talk 3: Programming Foundation Models with DSPy
Speaker: Shangyin Tan
Abstract: Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Community Demo: Generating privacy-protected synthetic data using Secludy and Milvus
Speakers: Ben Cerchio
Abstract: During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
Who Should attend:
Anyone interested in talking and learning about Unstructured Data and Generative AI Apps.
When:
June 10, 2023
5:30 PM
Where:
This is an in-person event. Registration using this form is required to get into the event. Advance registration will close 2 days before the event. Sponsored by Zilliz, maintainers of Milvus.
Can't make it in person? We will also be streaming live: https://www.twitch.tv/vectordatabase
- Unstructured Data in LLMs
88 Colin P Kelly Jr St, San Francisco, CA
This is an in-person event! Registration is required in order to get in. GitHub will email you a form the day before the event, which you will need to complete for your access pass.
Topic: Connecting your unstructured data with Generative LLMs
What we’ll do:
Have some food and refreshments. Hear three exciting talks about unstructured data and generative AI.
5:30 - 6:00 - Welcome/Networking/Registration
6:05 - 6:30 - Charles Xie, CEO, Zilliz
6:35 - 7:00 - Alexandre Bonnet, Lead ML Solutions Engineer, Encord
7:05 - 7:30 - Joe Maionchi, VP R&D, Aparavi
7:35 - 7:45 - Community demo
7:45 - 8:30 - Networking
Who Should attend:
Anyone interested in talking and learning about Unstructured Data and Generative AI Apps.
Talk 1: Milvus and Zilliz
Speaker: Charles Xie
Tech Talk 2: Garbage In, Garbage Out: Why poor data curation is killing your AI models (and how to fix it)
Speaker: Alexandre Bonnet, Lead ML Solutions Engineer, Encord
Abstract: Enterprises have traditionally prioritized data quantity, assuming more is better for AI performance. However, a new reality is setting in: high-quality data, not just volume, is the key. This shift exposes a critical gap – many organizations struggle to understand their existing data and lack effective curation strategies and tools. This talk dives into these data challenges and explores the methods of automating data curation.
Tech Talk 3: Unstructured Data Preparation for AI
Speaker: Joe Maionchi
Abstract: Aparavi is a privacy-centric data fabric platform that provides deep intelligence for corporate unstructured data without moving, copying, or sharing the data. It automates data preparation for AI projects, adding classifications, anonymizing PII, and ensuring full traceability of embeddings back to the source.
Who Should attend:
Anyone interested in talking and learning about Unstructured Data and Generative AI Apps.
When:
July 16, 2024
5:30 PM
Where:
This is an in-person event! Registration is required to get in. Registration will close 2 days before the event. Co-sponsored by Aparavi, Encord, and Zilliz (maintainers of Milvus).
Can’t make it in person? Join us virtually on Twitch:
https://www.twitch.tv/vectordatabase
- Unstructured Data in LLMs
88 Colin P Kelly Jr St, San Francisco, CA
This is an in-person event! Registration is required in order to get in. GitHub will email you a form the day before the event, which you will need to complete for your access pass.
Topic: Connecting your unstructured data with Generative AI
What we’ll do:
Have some food and refreshments. Hear three exciting talks about unstructured data and generative AI.
5:30 - 6:00 - Welcome/Networking/Registration
6:05 - 6:30 - Tech Talk 1
6:35 - 7:00 - Aamir Shakir, Co-founder, mixedbread.ai
7:05 - 7:30 - Tech Talk 3
7:35 - 7:45 - Community demos
7:45 - 8:30 - Networking
Tech Talk 2: Building the Future of Neural Search: How to Train State-of-the-Art Embeddings
Speaker: Aamir Shakir
Abstract: Neural search plays a crucial role in Retrieval Augmented Generation (RAG) and various other AI use cases. In this talk, we will discuss the future of neural search, explore interesting challenges we are addressing, and explain how we build our state-of-the-art embedding model, which helps to develop high-quality RAG systems at scale.
Who Should attend:
Anyone interested in talking and learning about Unstructured Data and Generative AI Apps.
When:
August 5, 2023
5:30 PM
Where:
This is an in-person event. Registration using this form is required to get into the event. Advance registration will close 2 days before the event. Sponsored by Zilliz, maintainers of Milvus.
Can't make it in person? We will also be streaming live: https://www.twitch.tv/vectordatabase
- Unstructured Data in LLMs
88 Colin P Kelly Jr St, San Francisco, CA
This is an in-person event! Registration is required in order to get in. GitHub will email you a form the day before the event, which you will need to complete for your access pass.
Topic: Connecting your unstructured data with Generative AI
What we’ll do:
Have some food and refreshments. Hear three exciting talks about unstructured data and generative AI.
5:30 - 6:00 - Welcome/Networking/Registration
6:05 - 6:30 - Tech Talk 1
6:35 - 7:00 - Tech Talk 2
7:05 - 7:30 - Amit Sangani, Senior Director, Meta
7:35 - 7:45 - Community demos
7:45 - 8:30 - Networking
Tech Talk 3: Llama 3 !!
Speaker: Amit Sangani
Who Should attend:
Anyone interested in talking and learning about Unstructured Data and Generative AI Apps.
When:
September 9, 2023
5:30 PM
Where:
This is an in-person event. Registration using this form is required to get into the event. Advance registration will close 2 days before the event. Sponsored by Zilliz, maintainers of Milvus.
Can't make it in person? We will also be streaming live: https://www.twitch.tv/vectordatabase