[In-person Event] LLM Optimization with GPUs: Performance on Llama 2, Bloom

![[In-person Event] LLM Optimization with GPUs: Performance on Llama 2, Bloom](https://secure.meetupstatic.com/photos/event/d/d/e/f/highres_471176815.jpeg?w=750)
Details
This is an in-person event. Food and drink will be provided. Join us for networking and socializing. The talk takes place at AMD.
Speaker: Nick Ni, Sr. Director of AI Product Management, AMD
Talk Abstract:
Large language models (LLMs) like GPT-3 have demonstrated impressive capabilities in natural language processing. However, running these massive neural networks requires significant computational resources. New data center GPUs from AMD offer powerful performance optimized for training and inference. In this presentation, we explore using an open software platform to run LLMs. We use standard frameworks such as PyTorch and TensorFlow, as well as open libraries such as vLLM, and benchmark the performance of LLMs of various sizes, such as Llama 2 and Bloom. Our results demonstrate that comparable or better performance can be achieved. Key optimizations include efficiently mapping matrix multiplication and attention layers. With careful tuning, better performance and cost-effective deployment of large language models are possible for a wide array of applications.
Meetup agenda:
6-6:30 pm Check in, food & drink, networking
6:30-7:30 pm Talk by Nick Ni
7:30-8 pm Q&A and additional networking
To expedite onsite security registration, please answer these few questions so we can pre-register you: https://docs.google.com/forms/d/e/1FAIpQLSepbm4dPWgMpPTGDf5UzpbhP4RRkZDLCe4Hn1LnnXVOG5jhfA/viewform
