Explore DBRX: Databricks Open LLM (DBRX Meetup at MIT)

Hosted By
Alan W.

Details

Location: Room 1-190, Building 1, first floor, next to Du Pont Court

Join the Databricks Boston User Group for a special edition meetup with our friends at MIT. We'll be hosting two incredible sessions:

Introducing DBRX: our new SOTA open LLM w/ Dr. Sam Raymond

Last month, Databricks released a new open LLM, DBRX. Across a range of standard benchmarks, DBRX sets a new state of the art for established open LLMs. Moreover, it provides the open community and enterprises building their own LLMs with capabilities that were previously limited to closed model APIs. DBRX surpasses GPT-3.5, and it is competitive with Gemini 1.0 Pro. It is an especially capable code model, surpassing specialized models like CodeLLaMA-70B on programming, in addition to its strength as a general-purpose LLM.

This state-of-the-art quality comes with marked improvements in training and inference performance. DBRX advances the state of the art in efficiency among open models thanks to its fine-grained mixture-of-experts (MoE) architecture. Inference is up to 2x faster than LLaMA2-70B, and DBRX is about 40% of the size of Grok-1 in terms of both total and active parameter counts.

In this meetup, we will dive deeper into how DBRX was made and how you can get started building on top of this new, state-of-the-art open model.

Bio
Dr. Sam Raymond, an expert in machine learning at Databricks, specializes in Large Language Models (LLMs) and Machine Learning Operations (MLOps). His work primarily focuses on the practical applications and advancements of these technologies.

With a Ph.D. in Computational Science and Engineering from MIT, Dr. Raymond combines his industry expertise with a solid academic foundation. He actively bridges the gap between theoretical research and practical implementation in the field of machine learning.

Dr. Raymond also contributes to the education sector, teaching courses on edX such as "LLMs: Application through Production" and "LLMs: Foundation Models from the Ground Up." These courses provide professionals with insights into the application and theory behind LLMs and MLOps.

In his academic role as adjunct professor at Dartmouth College, Dr. Raymond imparts his knowledge and experience to the next generation of engineers and data scientists, enriching his professional experience and influence in the field.

Lilac: Data Exploration and Understanding in the Age of GenAI

Data is at the core of any LLM-based system, whether you are preparing datasets for training models, evaluating model outputs, or filtering Retrieval-Augmented Generation (RAG) data. Exploring and understanding these datasets is critical for building quality GenAI apps. However, analyzing unstructured text data can become extremely difficult in the age of GenAI. Lilac makes exploration of unstructured data easy: it is a delightful tool for data scientists and AI researchers to explore, understand, and modify text datasets in a tractable way. Lilac empowers data scientists and researchers to explore data clusters, derive new data categories using human feedback and classifiers, and tailor datasets based on these insights.

Bio
Daniel Smilkov was the CEO of Lilac, an AI startup that recently joined Databricks. Previously, he spent a decade at Google Research, co-creating and co-leading two large projects, TensorFlow.js and Know Your Data, at the intersection of ML research and visualization. Daniel graduated with a master's degree from the MIT Media Lab in 2014, working with Prof. Cesar Hidalgo in the Macro Connections group, and was the co-creator of the popular email visualization tool Immersion.

Databricks Boston User Group
MIT Building 1
33 Massachusetts Ave · Cambridge, MA