Testing GenAI applications with Docker


Details
The evolution of GenAI applications brings new challenges when selecting testing methods that can effectively evaluate the complexity of the responses generated by LLMs.
The proposal to use an LLM as a Validator Agent is a promising approach, paving the way for a new era of software development and evaluation in the field of artificial intelligence.
This proposal involves defining detailed evaluation criteria and using an LLM as an "Evaluator" to determine whether responses meet the specified requirements. The approach can be applied to validate answers to specific questions, drawing on both the model's general knowledge and specialized information. By incorporating detailed instructions and examples, an Evaluator can provide accurate, justified assessments that make clear why a response is considered correct or incorrect.
In this session we will show langchain for interacting with LLMs, Testcontainers for creating the dependencies needed to use RAG, and Docker Desktop for running local LLMs.
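As a rough illustration of the Evaluator idea described above, the following Python sketch asks a model to judge an answer against a reference and to return a verdict with a justification. It is not taken from the session: the prompt wording, the `evaluate` function, the `Verdict` type, and the `fake_llm` stand-in are all hypothetical, and a real setup would replace `fake_llm` with a call to an actual model (for example through langchain).

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical evaluator prompt: criteria, inputs, and a strict output format.
EVALUATOR_PROMPT = """You are an evaluator. Decide whether the answer
correctly responds to the question, using the reference as ground truth.
Question: {question}
Answer: {answer}
Reference: {reference}
Reply only with JSON: {{"verdict": "yes" or "no", "reason": "<one sentence>"}}"""


@dataclass
class Verdict:
    passed: bool
    reason: str


def evaluate(question: str, answer: str, reference: str,
             llm: Callable[[str], str]) -> Verdict:
    """Ask the evaluator model for a verdict and parse its JSON reply."""
    prompt = EVALUATOR_PROMPT.format(
        question=question, answer=answer, reference=reference
    )
    data = json.loads(llm(prompt))
    passed = data["verdict"].strip().lower() == "yes"
    return Verdict(passed=passed, reason=data["reason"])


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model (assumption, for illustration).

    It only checks whether the answer line mentions "testcontainers".
    """
    answer_line = next(
        line for line in prompt.splitlines() if line.startswith("Answer: ")
    )
    ok = "testcontainers" in answer_line.lower()
    reason = ("mentions the expected library" if ok
              else "does not mention the expected library")
    return json.dumps({"verdict": "yes" if ok else "no", "reason": reason})


if __name__ == "__main__":
    verdict = evaluate(
        "Which library starts containers from test code?",
        "You can use Testcontainers for that.",
        "Testcontainers",
        fake_llm,
    )
    print(verdict.passed, verdict.reason)
```

In a test suite, the assertion would typically be on `verdict.passed`, with `verdict.reason` included in the failure message so that a failing case explains itself.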
Agenda
---
Speaker
Manuel de la Peña - Docker (Software Engineer)
Hosted By
Manuel de la Peña, Software Engineer
Manuel is an OSS Software Engineer at Docker, where he maintains "Testcontainers for Go". He has held various roles in different parts of the development process, working for the regional public administration in Castilla-La Mancha (Spain) from 2003 until 2007 and then at more traditional consulting firms. In 2011 he transitioned to product-oriented and Open Source companies, serving as a support engineer, trainer, and Core Engineer at Liferay, and as QA Tech Lead at Liferay Cloud. From 2019 to 2022 he was involved in Engineering Productivity at Elastic as part of the Observability product, and since 2022 he has been doing pure OSS at AtomicJar, which was acquired by Docker in December 2023. In every job he tries to improve the quality of software products and processes from the automation and testing point of view.
He has also founded and managed a couple of small web development and systems consulting companies. Additionally, he organises the Google Developers Group in Toledo, Spain (GDG Toledo), where they run monthly discussions about software in its various aspects, serving as a small community outside the bustling Madrid. Manuel has delivered talks at different national and international events too.
Manuel holds a BS in Computer Science (UNED Spain) and a master's degree in Research in Software Engineering and Information Systems (UNED Spain). You can find him on the Internet as "mdelapenya" everywhere.
Eun Young Cho, Backend developer & blockchain4good evangelist
Andrés Velasco, Team lead and Software Engineer who loves tech and computers
Javier Lopez de Ancos, Software Engineer
Antonio Crespo Velasco, Co-Organizer
---
Partner
Mes de QA (https://mesdeqa.github.io/)
---
Complete your event RSVP here: https://gdg.community.dev/events/details/google-gdg-toledo-presents-testing-genai-applications-with-docker/.

