
AI in production systems + Embedding Recycling to Accelerate ML Inferencing

Hosted By
Kristian A.

Details

We are back with two talks: same location, same people, but now at a new company, [Vespa.ai](https://vespa.ai/)!

Our industry saw groundbreaking changes last year with the massive use of LLMs, Retrieval Augmented Generation, and ever-better-performing models. In this event, we will look at how to deploy this in real-world applications, with some demos as well.

There will be snacks and refreshments as usual.

Using modern AI in production systems
*By Jon Bratseth, CEO, Vespa.ai*

Modern AI is becoming able to solve practical problems in all kinds of applications. Examples range from chatbots such as ChatGPT to recommendation and personalisation, search, and question answering. However, applying these techniques in online production systems can be challenging. In this talk, we'll look at real-world examples of how you can accomplish this at scale, using open-source platforms and without a big budget.

Embedding Recycling to Accelerate ML Inferencing
*By Jo Kristian Bergum, Distinguished Engineer, Vespa.ai*

In the era of self-supervised deep learning, unstructured data in various modalities has become more useful than ever. However, the high computational cost of running task-specific deep-learning models over the same data often presents a significant cost challenge. To address this issue, embedding recycling (ER) has emerged as a promising technique that enables the reuse of intermediate embedding representations for different tasks. By memorizing the outputs of intermediate layers of deep neural networks as embeddings, practitioners can reuse them for various tasks using task-specific layers conditioned on the task-specific input. This talk introduces the concept of recyclable embeddings and how Vespa, an efficient and scalable search and recommendation engine, can efficiently do ML inferencing over many data points with recyclable embeddings.

Key takeaways:

  • Overview of deep-learned embeddings and the growing importance of embedding models for ML
  • How to recycle embeddings for different tasks
  • Practical strategies and best practices for efficiently managing and operating deep-learned embeddings in real-world applications
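The core idea of embedding recycling can be illustrated with a minimal sketch: run the expensive backbone once per document, memoize its intermediate embedding, and serve multiple cheap task-specific heads from that cache. Everything below is hypothetical illustration, not Vespa's API; the "backbone" is a stand-in for a real neural network's intermediate layers.

```python
import numpy as np

DIM = 8

def backbone_embed(text: str) -> np.ndarray:
    """Stand-in for an expensive frozen backbone (e.g. the lower
    transformer layers) that maps text to an intermediate embedding."""
    rng = np.random.default_rng(sum(map(ord, text)))  # deterministic toy embedding
    return rng.standard_normal(DIM)

class EmbeddingCache:
    """Memoize intermediate embeddings so the backbone runs once per input,
    no matter how many downstream tasks consume the result."""
    def __init__(self):
        self._store: dict[str, np.ndarray] = {}
        self.backbone_calls = 0

    def get(self, text: str) -> np.ndarray:
        if text not in self._store:
            self.backbone_calls += 1
            self._store[text] = backbone_embed(text)
        return self._store[text]

# Cheap task-specific heads that operate on the recycled embedding.
def classification_head(emb: np.ndarray) -> float:
    return float(emb @ np.ones(DIM))            # toy linear classifier

def relevance_head(emb: np.ndarray, query_emb: np.ndarray) -> float:
    return float(emb @ query_emb)               # toy dot-product ranking

cache = EmbeddingCache()
docs = ["vespa is fast", "embeddings are useful"]

# Task 1: classification over all documents.
class_scores = [classification_head(cache.get(d)) for d in docs]

# Task 2: query relevance over the same documents, recycling the embeddings.
query_emb = cache.get("open source search")
rank_scores = [relevance_head(cache.get(d), query_emb) for d in docs]

# The backbone ran once per unique input: 2 documents + 1 query.
print(cache.backbone_calls)
```

Both tasks share the cached embeddings, so the number of backbone invocations stays at three (two documents plus the query) even though each document is scored twice. In a real system the cache would live alongside the documents, which is where an engine like Vespa comes in.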
Trondheim Big Data
Prinsens gt. 49 · Trondheim