
What we’re about
Aggregate Intellect (https://ai.science/) is a generative knowledge management platform that helps businesses improve their workflows using large language models and similar AI technologies.
- Learn about Sherpa (https://sherpa-ai.readthedocs.io/)
- See more (https://ai.science/)
- Join our Slack Community (https://join.slack.com/t/aisc-to/shared_invite/zt-f5zq5l35-PSIJTFk4v60FML177PgsPg)
Upcoming events (1)
One day Workshop - LLM EVALUATION AND QUALITY CONTROL (virtual)
Evaluating the output of large language models is crucial due to the nuances involved in assessing their performance. These models generate text based on statistical patterns and do not possess true understanding or knowledge. Therefore, evaluating their output requires careful consideration of factors such as coherence, relevance, and factual accuracy.
Quality control of the behavior of products built using large language models is another important aspect to consider. These models power products such as chatbots, virtual assistants, and content generation tools. Ensuring that the behavior of these products aligns with product requirements, human preferences, and ethical and legal standards is crucial. It is necessary to implement robust mechanisms for filtering and moderating the generated content to maintain the integrity and trustworthiness of the products and their associated businesses.
# SCHEDULE
FRIDAY, DECEMBER 8th (times are in ET):
- 9:00 Dr. Daniel Rock (Assistant Professor @ The Wharton School, University of Pennsylvania) GPTs are GPTs!!!
- 10:00 Abi Aryan (Founder @ Abide AI) Developing In-House LLM Benchmarks
- 11:00 Percy Chen (PhD Student @ McGill U. / R&D Engineer @ Aggregate Intellect) Sherpa - Open Source Project Update [INCLUDES DEMO]
- 12:00 Suhas Pai (CTO @ Hudson Labs, formerly Bedrock AI) Break! Machine Learning Trivia and Networking
- 13:00 Meg Risdal (Sr. Product Manager @ Google / Kaggle) Empirical Rigor in ML
- 14:00 Dr. Val Andrei Fajardo (Founding ML Engineer @ LlamaIndex) Evaluating Multi-Modal RAG Systems [TUTORIAL]
- 15:00 Dr. Benedicte PIERREJEAN (Sr. ML Scientist @ Ada) Automatic Evaluation of Dialogue Systems [INCLUDES DEMO]
- 16:00 Benjamin Labaschin (MLE @ Workhelix) Normie Tools for Validating LLM Outputs