AI & Biology: GenePT – Bridging Text Knowledge & Transcriptomics for Insights!


Details
Join us for our next exciting session as we delve into a fascinating new frontier in single-cell biology:
"GenePT: A Simple But Effective Foundation Model for Genes and Cells Built From ChatGPT" by Yiqun Chen and James Zou
This paper introduces a novel idea: combining the knowledge encoded in text (such as gene literature), as captured by Large Language Models (LLMs) like ChatGPT, directly with gene expression data!
We'll explore how embeddings derived from text descriptions of genes, combined with each cell's transcriptional profile, can unlock biological and clinical insights.
GenePT offers a surprisingly simple yet effective alternative to resource-intensive foundation models. It generates gene embeddings by passing NCBI text descriptions of individual genes through GPT-3.5. It then creates single-cell embeddings in one of two ways: by averaging the gene embeddings, weighted by each gene's expression in the cell, or by embedding a "cell sentence" made of gene names ordered by expression level. Without the need for extensive data curation or additional pretraining, GenePT is efficient, easy to use, and achieves comparable, often better, performance on downstream tasks like classifying gene properties and cell types.
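For anyone who wants a feel for the two cell-embedding strategies before the session, here is a minimal Python sketch. It is an illustration under assumptions, not the authors' code: the function names, the toy 3-dimensional gene embeddings, the L2 normalization step, and the choice to drop genes with zero counts are all hypothetical choices made for this example. Real gene embeddings would come from a text-embedding model, and the cell sentence would itself be sent to that model to get the final cell embedding.

```python
from math import sqrt


def weighted_cell_embedding(expression, gene_embeddings):
    """Expression-weighted average of gene embeddings, scaled to unit length.

    expression: dict mapping gene name -> expression value for one cell.
    gene_embeddings: dict mapping gene name -> list of floats.
    """
    dim = len(next(iter(gene_embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for gene, count in expression.items():
        # Skip unexpressed genes and genes without an embedding (assumption).
        if count <= 0 or gene not in gene_embeddings:
            continue
        for i, value in enumerate(gene_embeddings[gene]):
            total[i] += count * value
        weight_sum += count
    if weight_sum == 0:
        return total
    mean = [v / weight_sum for v in total]
    norm = sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean


def cell_sentence(expression):
    """Gene names ordered by descending expression; zero-count genes dropped."""
    expressed = [(gene, count) for gene, count in expression.items() if count > 0]
    # Ties are broken alphabetically so the output is deterministic.
    expressed.sort(key=lambda pair: (-pair[1], pair[0]))
    return " ".join(gene for gene, _ in expressed)


# Toy data: made-up 3-dimensional "embeddings" for three genes.
gene_embeddings = {
    "CD3E": [1.0, 0.0, 0.0],
    "CD19": [0.0, 1.0, 0.0],
    "ACTB": [0.0, 0.0, 1.0],
}
cell = {"CD3E": 3.0, "CD19": 0.0, "ACTB": 4.0}

embedding = weighted_cell_embedding(cell, gene_embeddings)
sentence = cell_sentence(cell)
print(embedding)
print(sentence)
```

In this toy cell, ACTB is expressed more highly than CD3E, so it comes first in the sentence and pulls the weighted embedding further toward its own direction, while the unexpressed CD19 contributes to neither.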
Come discover how this innovative approach reshapes our understanding of biological foundation models by directly connecting text knowledge with cellular function! As always, it will be an informal and easy-going get-together to learn and chat about the paper. I will prepare a brief presentation to walk us through it.
