Data Preparation & Fine Tuning for LLMs


Details
Please provide your basic contact information (name, email address) for registration at the entrance.
Speakers:
- "Intro to AI Alliance" by Jaikrishnan Hari, AI Alliance
- "Overview of Data Prep Kit" by Dr. Aanchal Goyal, IBM Research
- "Data Preparation using Data Prep Kit" by Aisha Mohammed Farooq Darga of IBM - This talk introduces Data Prep Kit, a new project accepted by the Linux Foundation, designed to streamline essential data preparation tasks such as content extraction, cleaning, de-duplication and filtering out problematic data. Attendees will learn how the Data Prep Kit can accelerate data preparation, improve overall data quality, and enhance the efficiency of building robust LLM applications.
- "Making LLM fine-tuning accessible with InstructLab" by Ramakrishna Reddy Yekulla (Ramky), Red Hat - The rise of large language models (LLMs) has opened up exciting possibilities for developers looking to build intelligent applications. However, the process of adapting these models to specific use cases can be difficult, requiring deep expertise and substantial resources. In this talk, we'll introduce you to InstructLab, an open-source project that helps make LLM tuning accessible to developers and engineers of all skill levels, on consumer-grade hardware. We'll explore how InstructLab's innovative approach combines collaborative knowledge curation, efficient data generation, and instruction training to enable developers to refine foundation models for specific use cases. Through a live demonstration, you'll learn how to enhance an LLM with new knowledge and capabilities for targeted applications, without needing data science expertise. Join us to explore how LLM tuning can be more accessible and democratized, empowering developers to build on the power of AI in their projects.
Speaker Details:
- Jaikrishnan Hari, AI Alliance - Jai is a Strategy & Business Development Executive for IBM Research. He actively engages clients and partners across industries to bring IBM Research innovations in AI, Quantum Computing, Hybrid Cloud and Security to market and drive industry adoption. He has 23 years of industry experience in business development, product management and consulting across IBM, Nokia, CA, Wipro and Tech Mahindra. His academic background includes a postgraduate degree in management from the Indian Institute of Management Ahmedabad and an engineering degree from NIT Surat. He also has patents granted.
- Dr. Aanchal Goyal is a Senior Research Scientist at the IBM Research Lab (IRL) in Delhi, with over 10 years of experience. Her expertise lies in applied mathematics, data analytics, and artificial intelligence (AI). Dr. Goyal is currently focused on exploring methods for data curation, filtering, and feeding data to existing AI models. This research aims to improve the overall performance and effectiveness of AI systems. Before this, Dr. Goyal was involved in research on compute sustainability. This included quantifying the carbon footprint of on-premise hardware and cloud infrastructure, promoting more environmentally friendly computing practices.
- Aisha Mohammed Farooq Darga, AI Engineer at IBM - Aisha works at the intersection of innovation and real-world problem solving. As part of the Ecosystem Engineering Lab in Client Engineering, she helps bring AI ideas to life, designing and deploying solutions that tackle real business challenges head-on. Her role involves collaborating with clients to develop and implement innovative AI solutions that address complex business challenges. Whether it's natural language processing, machine learning, or data-driven insights, she's all about delivering AI that actually makes a difference. Outside the day-to-day, Aisha is passionate about staying current with emerging AI trends, continuously learning, and contributing back to the tech community whenever she can. She's driven by curiosity, a love for solving tough problems, and a deep belief in the power of AI to create meaningful change.
- Ramakrishna Reddy Yekulla (Ramky), Senior Principal Product Manager, Data + AI Group, Red Hat - Ramky serves as the Senior Principal Product Manager for the Data + AI Group at Red Hat. In this role, he leads the technical strategy and operationalization of artificial intelligence (AI) models, overseeing their entire lifecycle. He ensures seamless alignment between technical execution and strategic business objectives while adeptly managing regulatory compliance and associated risks.
Ramky acts as a vital liaison among data science, platform, engineering, and governance teams, fostering cross-functional collaboration to effectively integrate AI models into Red Hat’s product portfolio and enhance its AI capabilities. Leveraging extensive technical expertise from his tenure as a Principal Architect specializing in developer platforms and security, he brings a wealth of knowledge to his current responsibilities.
In addition to his leadership at Red Hat, Ramky is an active contributor to several prominent open-source projects, including Fedora, Django, GNOME, LibreOffice, and storage solutions such as GlusterFS and Heketi. His professional interests encompass system design, functional programming, and observability, reflecting his commitment to advancing technology and innovation. https://x.com/ramkrsna and https://www.linkedin.com/in/ramkrsna/
Agenda:
10:15 am Building entrance registration opens
10:30 am Refreshments & Networking
10:45 am Welcome & Talks Begin
11:00 am "Intro to AI Alliance"
11:15 am "Overview of Data Prep Kit" and "Data Preparation using Data Prep Kit"
11:45 am "Making LLM fine-tuning accessible with InstructLab"
12:15 pm Q&A
12:30 pm End
Location details: 2nd Floor at IBM EGL, D Block, Embassy Golf Links, Off Indira Nagar-Koramangala Intermediate Ring Road, IBM Ln, Embassy Golf Links Business Park, Challaghatta, Bengaluru, Karnataka 560071
