
Exploring LoRA: Fine-tuning LLMs

Hosted By
Nita G.

Details

IMP: Please register and wait for our confirmation email. This is an invite only event with limited seats!

Abstract:
Numerous applications in natural language processing and computer vision rely on adapting a single large-scale, pre-trained language model to multiple downstream applications. This adaptation is typically achieved through fine-tuning, which customises the model to a specific domain and task.

In this talk, we will explore the necessity of fine-tuning pre-trained large models for specialised tasks, starting with the conventional method and its limitations in computational and storage demands. We will then discuss LoRA (Low-Rank Adaptation), one of the Parameter-Efficient Fine-Tuning (PEFT) methods based on adapter modules, which allows task-specific training with far fewer trainable parameters and lower costs.

There is an abundance of articles on applying LoRA fine-tuning to Large Language Models (LLMs) such as LLaMA and the like. However, grasping the methodology and rationale behind LoRA when applying it to large models is challenging because of the inherent complexity of the models. To deepen our understanding of the algorithm, we will apply LoRA to a Multilayer Perceptron (MLP) for a binary classification task, starting with the creation of a toy dataset. We will cover adapter insertion, parameter configuration, and efficiency, as well as practical considerations for model sharing and loading via the Hugging Face Hub. Finally, we will address LoRA's inference behaviour, memory limitations, and potential solutions such as QLoRA.
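As a taste of what the session covers, here is a minimal numpy sketch of the core LoRA idea: the frozen weight W is augmented with a low-rank update B·A, so only r·(d_in + d_out) parameters are trained instead of d_in·d_out. The dimensions and names below are illustrative, not taken from the talk's notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 8, 4, 2                # illustrative sizes; r << min(d_in, d_out)

W = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                # B starts at zero, so the update B @ A starts at zero

x = rng.normal(size=(d_in,))

def forward(x, alpha=1.0):
    # base path uses the frozen W; the adapter path adds the low-rank update
    return W @ x + alpha * (B @ (A @ x))

# Before any training, the adapter contributes nothing:
assert np.allclose(forward(x), W @ x)

# Trainable parameter count: r*(d_in + d_out) instead of d_in*d_out
print(r * (d_in + d_out), "vs", d_in * d_out)   # 24 vs 32
```

At toy sizes the saving is modest, but for a large weight matrix (say 4096×4096 with r=8) the trainable parameters shrink by several orders of magnitude, which is the efficiency argument the talk develops.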

Pre-requisites:

  • This talk is targeted towards folks who have a basic understanding of deep learning neural architectures and their training process.
  • Please bring your laptops for trying out the accompanying Colab notebook.

Takeaways:

  • Workings of adapter-based fine-tuning.
  • Idea behind the LoRA algorithm - how and why it works.
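One "why it works" point the talk touches on (LoRA's inference behaviour) can be sketched in a few lines: because the update is just a matrix, the adapter can be merged into the base weight after training, so inference incurs no extra latency. This is a generic illustration with made-up values, not the talk's code.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_out, r = 8, 4, 2

W = rng.normal(size=(d_out, d_in))   # frozen base weight
A = rng.normal(size=(r, d_in))       # low-rank factors after (hypothetical) training
B = rng.normal(size=(d_out, r))

# Merging folds the adapter into the base weight, removing the extra
# matmuls at inference time:
W_merged = W + B @ A

x = rng.normal(size=(d_in,))
assert np.allclose(W_merged @ x, W @ x + B @ (A @ x))
```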

Depth of the topic - 4/5

About the Speaker:
Preethi Srinivasan is a Solution Consultant at Sahaj Software. She has a Master's (by Research) from IIT Mandi. As part of her master's thesis she worked on applications of deep learning algorithms to medical imaging problems and published work at the NIPS-WIML workshop, IEEE-CBMS-20, and ACCV-20. At Sahaj, she has developed prototypes for video summarisation (in an unsupervised fashion) and video captioning, focusing on extracting meaningful information from video data. She is currently working on building question-answering systems for specific domains based on document analysis, utilising RAG and/or fine-tuning of LLMs.
Social Accounts of Speaker:
Linkedin: https://www.linkedin.com/in/preethi-srinivasan-3a221915/
Github: https://github.com/s3pi
Google Scholar: https://scholar.google.com/citations?user=JMWfeg4AAAAJ&hl=en

Register now on the form https://forms.gle/2eTH4bz7ygXQg9bV8 to participate in the DevDay!

Kindly Note: If you are shortlisted for the event, you will receive a confirmation email from us after you register via the link above. Please present this email on arrival at the venue.

DevDay - Bangalore
Sahaj AI Software Pvt. Ltd.
1st Cross Road, 3rd Block, Koramangala · Bengaluru, Ka