PyData London - 90th Meetup


## Details
Venue: Riverbank House, 2 Swan Ln, London EC4R 3AD
IMPORTANT: LOCATION UPDATED!
Please note:
- 🚨🚨🚨 A valid photo ID is required by building security. 🚨🚨🚨
- This event follows the NumFOCUS Code of Conduct. Please familiarise yourself with it before the event.
If your RSVP status says "You're going" you will be able to get in. No need to show your RSVP confirmation when signing in.
If you can no longer make it, please unRSVP as soon as you know.
***
Code of Conduct:
This event follows the NumFOCUS Code of Conduct. Please get in touch with the organisers with any questions or concerns regarding the Code of Conduct.
***
As always, there'll be free food & drinks, generously provided by our host, Man Group.
***
Main Talks
Valuable lessons learned on Kaggle's ARC AGI (LLM) Challenge - Ian Ozsvald
Having worked on Kaggle's LLM-based ARC AGI program-writing challenge for 6 months using Llama3, I'll reflect on the lessons learned from building an automatic program generator, evaluating it, finding strong representations for the challenge, using chain-of-thought and program-of-thought styles, and trying some multi-stage critical thinking approaches. You'll get ideas for tuning your own prompts, plus shortcuts to help you evaluate your own LLM usage with greater assurance in the face of non-deterministic outcomes.
Unlocking Scalability: Building High-Capacity Vector Databases with Open-Source Techniques - Sergii Ivakhno
The rapid adoption of large language models and retrieval-augmented generation (RAG) techniques has led to a significant surge in vector database usage, with a large number of open-source and proprietary techniques available on the market. However, the prevalent focus on latency optimization and fast RAM retrieval has limited their applicability to large datasets composed of trillions of records.
In this presentation, I will illustrate how open-source techniques can be leveraged to construct an in-house vector database that embraces scalability and cost-effectiveness, managing hundreds of billions of records. Alongside this, I will showcase how to integrate vector database search with standard SQL queries to produce flexible analytic solutions. More advanced uses such as compressed classification, Big Data clustering and RAG will also be highlighted.
⚡ Lightning Talks
1️⃣ Turn YouTube videos & podcasts into readable Markdown with Whisper and LLMs - Shun Liang
2️⃣ Coming soon ...
Logistics
Doors open at 6.30 pm (get there early as you have to sign in via building security), talks start at 7 pm, and drinks are from 9 pm in the bar. We will have reduced capacity for this event, but there will be plenty of people to discuss data science questions with!
Please unRSVP in good time if you realise you can't make it. We're limited by building security on the number of attendees, so please free up your place for your fellow community members!
Follow @pydatalondon (https://twitter.com/pydatalondon) for updates and early announcements.
