
About the speaker - https://www.linkedin.com/in/harini-saravanan-63344b278/

About the topic

## TOPIC: Practical Strategies for Fine-Tuning Large Language Models (LLMs) for Multimodal Human/AI Content Detection and Conversational AI

Introduction

  • Opening with a few introductory comments on LLMs and how they are changing the game in text, vision, and multimodal AI tasks.
  • Emphasizing the need for domain-specific fine-tuning, especially for safety, content detection, and grounding models in real-world data.

Section 1:
Foundation of Fine-Tuning LLMs

  • What fine-tuning is, why it matters for taking a model into the real world, and the role specialized datasets play.
  • Discussing the different fine-tuning paradigms: full-model fine-tuning, parameter-efficient methods (e.g., LoRA), and prompt-based fine-tuning.
  • Setting the context for the fine-tuning discussion by briefly describing how I used GPT-2, DistilBERT, and Flan-T5 in different scenarios: text detection, conversation modeling, and intent classification.
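
As a rough back-of-the-envelope comparison of the paradigms above, here is a sketch of trainable-parameter counts per linear layer. The hidden size matches GPT-2-small; the LoRA rank and virtual-token count are hypothetical choices, not the talk's actual settings:

```python
# Trainable-parameter comparison for the three fine-tuning paradigms,
# per linear layer, using illustrative sizes.

def full_params(d_in: int, d_out: int) -> int:
    """Full fine-tuning updates the entire weight matrix plus bias."""
    return d_in * d_out + d_out

def lora_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA trains two low-rank factors A (r x d_in) and B (d_out x r)."""
    return r * d_in + d_out * r

def prompt_params(n_virtual_tokens: int, d_model: int) -> int:
    """Prompt tuning trains only a few virtual token embeddings."""
    return n_virtual_tokens * d_model

d = 768                                  # GPT-2-small hidden size
print(full_params(d, d))                 # 590592 per linear layer
print(lora_params(d, d, r=8))            # 12288, roughly 2% of full
print(prompt_params(20, d))              # 15360 for the whole model
```

The contrast explains why PEFT methods (Section 4) make fine-tuning feasible on modest hardware.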

Section 2:
Dataset Curation & Multimodal Pipelines

  • Describing how I collected, filtered, and balanced datasets, referencing the process with Kaggle stories (human) and Stable Diffusion images (AI).
  • Explaining how datasets for multimodal detection pipelines (GPT-2 for text, EfficientNet for images) must be carefully generated and normalized.
  • Describing the unique challenges of building datasets for conversational safety detection (the PAN12 corpus) and for real-time inference.
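
A minimal sketch of the curation steps above (merge under a common schema, quality-filter, undersample the majority class); the field names and thresholds are placeholders, not the talk's actual pipeline:

```python
import random

# Toy curation: merge human-written and AI-generated samples, drop very
# short texts, and undersample so both classes are equally represented.

def curate(human_texts, ai_texts, min_len=20, seed=0):
    rows = [{"text": t, "label": "human"} for t in human_texts]
    rows += [{"text": t, "label": "ai"} for t in ai_texts]
    rows = [r for r in rows if len(r["text"]) >= min_len]  # quality filter

    by_label = {"human": [], "ai": []}
    for r in rows:
        by_label[r["label"]].append(r)

    n = min(len(v) for v in by_label.values())  # minority-class size
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced += rng.sample(group, n)        # undersample majority
    rng.shuffle(balanced)
    return balanced
```

With 5 usable human texts and 2 usable AI texts, `curate` returns 2 of each, shuffled.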

Section 3:
Model Architecture & Training Procedures

  • Walkthrough of the code and scripts used to build training pipelines, clearly outlining the eight essential parts of a training pipeline (tokenization, model loading, dataset construction, the training loop, logging, evaluation, etc., as relevant to your application).
  • For text detectors and chatbots: explaining how transformers are fine-tuned with supervised labels (e.g., for AI/human distinction or intent/grooming detection).
  • For image detectors: describing the EfficientNet architecture and its use in distinguishing generative content.
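
The pipeline stages above can be sketched end to end with a toy bag-of-words classifier standing in for a transformer. All data is synthetic; this illustrates the structure of a pipeline, not the talk's actual code:

```python
import math

# Toy pipeline: tokenization -> dataset construction -> model setup ->
# training loop with logging -> evaluation.

# 1. Tokenization
def tokenize(text):
    return text.lower().split()

# 2. Dataset construction (label 1 = "AI-like", 0 = "human-like")
corpus = [("as an ai language model i cannot", 1),
          ("i walked my dog in the rain today", 0),
          ("as a large language model i am", 1),
          ("my dog loves the rain and mud", 0)]
vocab = sorted({w for text, _ in corpus for w in tokenize(text)})
index = {w: i for i, w in enumerate(vocab)}

def featurize(text):
    vec = [0.0] * len(vocab)
    for w in tokenize(text):
        if w in index:
            vec[index[w]] += 1.0
    return vec

# 3. Model setup (zero-initialized logistic classifier)
weights = [0.0] * len(vocab)
bias = 0.0

def predict_proba(vec):
    z = bias + sum(w * x for w, x in zip(weights, vec))
    return 1.0 / (1.0 + math.exp(-z))

# 4. Training loop with logging
lr = 0.5
for epoch in range(50):
    loss = 0.0
    for text, label in corpus:
        vec = featurize(text)
        p = min(max(predict_proba(vec), 1e-9), 1 - 1e-9)
        loss += -(label * math.log(p) + (1 - label) * math.log(1 - p))
        grad = p - label                       # dL/dz for logistic loss
        for i, x in enumerate(vec):
            weights[i] -= lr * grad * x
        bias -= lr * grad
    if epoch % 10 == 0:
        print(f"epoch {epoch}: loss={loss:.3f}")  # logging

# 5. Evaluation
correct = sum((predict_proba(featurize(t)) > 0.5) == bool(y) for t, y in corpus)
print("train accuracy:", correct / len(corpus))
```

A real pipeline swaps in a transformer tokenizer, a pretrained checkpoint, and batched optimization, but the stages stay the same.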

Section 4:
Parameter Efficient Fine-Tuning (PEFT)

  • Explaining emerging strategies such as LoRA, adapters, and soft prompts, and discussing how they reduce computational cost while maintaining performance.
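
A minimal sketch of the LoRA update itself, with a hypothetical 4x4 layer: the frozen weight W is adapted by adding a low-rank product scaled by alpha/r, so only the two small factors are trained:

```python
# LoRA in miniature: delta_W = (alpha / r) * B @ A, where
# B is (d_out x r) and A is (r x d_in), both far smaller than W.

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_delta(B, A, alpha):
    r = len(A)                           # rank = number of rows of A
    BA = matmul(B, A)
    return [[(alpha / r) * v for v in row] for row in BA]

# Hypothetical 4x4 layer adapted with rank r = 1
B = [[1.0], [0.0], [0.0], [0.0]]         # d_out x r
A = [[0.0, 2.0, 0.0, 0.0]]               # r x d_in
delta = lora_delta(B, A, alpha=1.0)

trainable = 4 * 1 + 1 * 4                # LoRA params: d_out*r + r*d_in = 8
frozen = 4 * 4                           # full matrix would be 16
```

At inference, delta can be merged into W once, so LoRA adds no serving cost.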

Section 5:
Retrieval-Augmented Generation (RAG) for Enriching Context

  • Sharing my experience in using RAG with LangChain and Hugging Face to improve LLM response quality and ground outputs in real-life documents.
  • Explaining how RAG helps customize conversational bots for enterprise and domain-specific applications.
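
A toy illustration of the retrieval step: rank documents by token overlap with the query and prepend the best hit as context. Real pipelines such as LangChain use dense embeddings and vector stores; the documents here are invented placeholders:

```python
# Minimal retrieval-augmented prompt builder using token overlap.

def score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q) or 1)

def build_prompt(query, docs, k=1):
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    context = "\n".join(ranked[:k])
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = ["Refunds are processed within 14 days of the return.",
        "Our office is closed on public holidays.",
        "Shipping is free for orders over 50 euros."]
prompt = build_prompt("how long do refunds take", docs)
print(prompt)
```

Grounding the generation in retrieved text is what keeps the bot's answers tied to the enterprise documents rather than the model's parametric memory.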

Section 6:
Prompt Engineering & Human Feedback

  • Discussing the practical influence of prompt engineering in fine-tuning, such as zero-, one-, and few-shot prompting for customizing LLM behavior.
  • Discussing how a combination of supervised labels and human-in-the-loop techniques can detect more nuanced AI/human content and conversational intent.
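
A small sketch of zero- versus few-shot prompt assembly: the same classification request with k in-context examples prepended. The labels and examples are made up for illustration:

```python
# Build a zero-shot or few-shot classification prompt from a shared
# instruction plus the first `shots` labeled examples.

EXAMPLES = [("As an AI, I cannot browse the web.", "ai"),
            ("Ugh, my train was late again this morning.", "human")]

def make_prompt(text, shots=0):
    lines = ["Classify each text as 'ai' or 'human'."]
    for ex_text, ex_label in EXAMPLES[:shots]:
        lines.append(f"Text: {ex_text}\nLabel: {ex_label}")
    lines.append(f"Text: {text}\nLabel:")
    return "\n\n".join(lines)

zero_shot = make_prompt("The weather was lovely today.")           # no examples
two_shot = make_prompt("The weather was lovely today.", shots=2)   # few-shot
```

The two-shot variant contains three `Label:` slots: two filled by examples and one left open for the model to complete.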

Section 7:
Evaluation & Bias Mitigation

  • Discussing model evaluation approaches: accuracy, F1-score, confusion matrices, and classification reports.
  • Outlining my approach to dataset balancing (undersampling/oversampling) to mitigate bias and improve generalization, using predator/non-predator corpora as an example.
  • Emphasizing my belief in robust validation, benchmarking, and error analysis for safe and reliable AI application deployment.
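
The metrics above can be computed from raw predictions with a few lines of standard-library Python; the label vectors below are synthetic:

```python
# Confusion-matrix counts and derived metrics for a binary detector
# (1 = AI-generated, 0 = human-written).

def confusion(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def report(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

metrics = report([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
print(metrics)
```

On imbalanced safety corpora, F1 on the rare class is far more informative than raw accuracy, which is why balancing matters in the first place.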

Section 8:
Deployment Challenges & Best Practices

  • Highlighting real-world challenges: model/package compatibility, GPU scaling, adapting to dataset drift, and integrating code frameworks with pre-trained models.
  • Providing practical advice for deploying and maintaining fine-tuned LLMs and multimodal detectors: real-time pipelines, serving models through APIs, and CI/CD for retraining.
  • Discussing emerging trends as future work: multimodal fusion, RLHF, and model distillation.
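
One way to sketch a dataset-drift check for a deployed detector: compare the positive-class rate seen in production against the training rate and flag large gaps as a retraining signal. The tolerance here is illustrative, not a standard threshold:

```python
# Flag label-distribution drift between training data and live traffic.

def drift_alert(train_labels, live_labels, tolerance=0.15):
    train_rate = sum(train_labels) / len(train_labels)
    live_rate = sum(live_labels) / len(live_labels)
    return abs(live_rate - train_rate) > tolerance

train = [1, 0, 1, 0, 1, 0, 1, 0]   # 50% AI-generated at training time
stable = [1, 0, 0, 1, 1, 0]        # 50% in production -> no alert
shifted = [1, 1, 1, 1, 1, 0]       # ~83% in production -> retrain signal

print(drift_alert(train, stable))
print(drift_alert(train, shifted))
```

Wiring a check like this into a CI/CD retraining pipeline turns drift from a silent failure into an actionable alert.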

Conclusion

  • Summarizing what I have learned from these projects about model fine-tuning, multimodal integration, and engineering practical AI-based solutions.
  • Considering the future of LLMs in human/AI content detection and safety in human-LLM conversational interfaces, and emphasizing the importance of ethical AI development.
  • Mentioning possible future directions: scaling to additional media types, multilingual contexts, and adaptive retraining.
