Python + AI: Vision models
Details
Our third stream in the Python + AI series is all about vision models!
Vision models are LLMs that can accept both text and images, like GPT 4o and 4o-mini. You can use those models for image captioning, data extraction, question-answering, classification, and more!
We'll use Python to send images to vision models, build a basic chat-on-images app, and build a multimodal search engine.
This session is a part of a series! To learn more, click here
Pre-requisites:
If you'd like to follow along with the live examples, make sure you've got a GitHub account.
Habla español? Tendremos una serie para hispanohablantes!