Gen-AI Day: Tech Talks & Hands-on Workshop by speakers from Microsoft & Meta

Hosted by Arivoli Tirouvingadame

Details

Welcome to the free 1-day hands-on "GenAI Day" event!

MORNING SESSION: 9:00 am - 12:30 pm PST

THEME: Data platforms for the GenAI Era

TECH TALK #1:
Title: Intro to Microsoft Fabric
Speaker: Dipti Borkar, VP, Azure Data, Microsoft

Bio:
Dipti Borkar is responsible for Azure Data ISVs, Azure Databricks and Microsoft Fabric Data Dev. Dipti is a senior technology executive and entrepreneur with nearly 20 years of experience in cloud, distributed systems, and database technology, including relational and NoSQL systems. Prior to Microsoft, she founded Ahana (acquired by IBM) and created a cloud managed service for SQL on data lakes. Prior to Ahana, she held various executive roles across product, engineering, and worldwide solutions engineering teams. At IBM, where she started her career as a software engineer, Dipti managed large worldwide dev teams for DB2 Distributed. She is passionate about empowering and mentoring women in tech and women in open source. Dipti holds an MS in Computer Science from UC San Diego with a specialization in databases and an MBA from the Haas School of Business at UC Berkeley.

TECH TALK #2:
Title: CoPilots in Fabric - overview including SynapseML, Cognitive Services
Speaker: Raj Rikhy, Principal GenAI and Data Product Manager, Microsoft

Abstract:
A deep dive into CoPilot for data science and data engineering, with a demo.

Bio:
Raj Rikhy is a Principal Product Manager at Microsoft Azure Data, pioneering generative AI (Copilot) for data engineering and data science in Microsoft Fabric. He formerly led data initiatives at IBM and is passionate about transforming raw data into actionable insights. With a track record of driving innovation and delivering impactful solutions, he thrives on solving complex problems and shaping the future of AI-driven data analytics.

TECH TALK #3:
Title: CosmosDB - the database for GenAI Apps
Speaker: Abinav Rameesh, Principal Product Manager, Microsoft

Bio:
I am a Principal PM Manager on the Azure Cosmos DB team, focusing on the managed MongoDB offerings under the Cosmos DB umbrella. I have spent most of my time at Microsoft in the distributed systems and distributed database space, having worked on high availability, SDKs, and customer success (onboarding some of our largest customers in Azure) and, more recently, on integrating AI functionality into the database engine for the era of AI.

TECH TALK #4:
Title: Underneath the hood of Data service focused CoPilots
Speaker: Avrilia Floratou, Principal Scientist Manager, Microsoft

Bio:
Avrilia is a Principal Scientist Manager at Microsoft’s Gray Systems Lab (GSL). Her research broadly lies in data management, with a recent focus on leveraging Large Language Models (LLMs) to improve user productivity in domains such as data integration, data exploration and code migration. Previously, she worked on leveraging data semantics and domain knowledge to improve the productivity of data scientists. Avrilia received her Ph.D. in Computer Science from the University of Wisconsin-Madison and her B.Sc. from the University of Athens in Greece.

AFTERNOON SESSION: 1:30 pm - 4:00 pm PST

THEME: Hands-on workshop: Introduction to Retrieval-Augmented Generation with Llama 2 and MongoDB

Speaker: Alex Kalinin, Engineering Manager, AI / ML, Ads / Privacy

Abstract:
In this workshop, we’ll learn how to combine retrieval powered by the MongoDB vector database with the generative capabilities of Llama 2 to build an end-to-end Q&A agent.

Bio:
Alex Kalinin currently leads AI/Machine Learning at Meta. Previously, he developed smart home software at Home.ai using computer vision and deep learning. At Yahoo, he led development of Big Data user acquisition systems for the Yahoo Games business. Alex holds an MS in Physics and has published several papers on image recognition and pattern detection.

Agenda:

  1. Introduction to large language models
  2. Few-shot learning. Prompting concepts
  3. Introduction to vector-based search and retrieval. Embeddings

Hands-on coding:

  1. Vectorize documents (PDFs) and load into a vector database for retrieval
  2. Visualize the embedding space and the relation between the query and the documents
  3. Build an end-to-end retrieval-generation pipeline with Llama
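
For a rough feel of how the steps above fit together, here is a minimal sketch in plain Python. It is not the workshop's code. A word-count vector stands in for a learned embedding model, an in-memory list stands in for the MongoDB vector store, and the final call to Llama is left out. All function names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the sample documents are illustrative assumptions.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query (stand-in for a vector database query)."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble the retrieved passages and the question into one prompt for a generative model."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical sample documents; in the workshop these would come from PDFs.
docs = [
    "Kaggle notebooks can run on a GPU.",
    "Embeddings map text to vectors so similar passages sit close together.",
    "Mountain View is a city in California.",
]
query = "How do embeddings represent text?"
top = retrieve(query, docs, k=1)
prompt = build_prompt(query, top)
print(prompt)  # this string would then be passed to the language model
```

In a real pipeline the retrieval step would be a query against the vector database and the prompt would be sent to the model; the sketch only shows the retrieve-then-prompt shape.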

Pre-requisites:

  • Set up an account at [Kaggle.com](https://www.kaggle.com/). Make sure you have access to a GPU.
  • We’ll use [Kaggle.com](https://www.kaggle.com/) to run our code. Kaggle provides 30 hours / week of GPU time.
  • You can use an alternate cloud-based account with access to a GPU and Internet.
  • Knowledge of Python and basic programming.
Data Riders
Hacker Dojo
855 Maude Ave · Mountain View, CA