What we're about

This virtual group is for data scientists, machine learning engineers, and open source enthusiasts who want to expand their knowledge of computer vision and complementary technologies. Every month we’ll bring you two diverse speakers working at the cutting edge of computer vision.

What’s computer vision? It’s how systems can derive meaningful information from digital images, videos and other visual inputs — and how they can take actions or make recommendations based on that information.

Use cases for computer vision include autonomous vehicles, facial recognition, inventory management, medical imaging, and more.

  • Are you interested in speaking at a future Meetup?
  • Is your company interested in sponsoring a Meetup?

Contact the Meetup organizers!

This Meetup is sponsored by Voxel51, the lead maintainer of the open source FiftyOne computer vision toolset. To learn more about FiftyOne, visit the project page on GitHub: https://github.com/voxel51/fiftyone

Upcoming events (4)

Getting Started with FiftyOne Workshop (Americas)

Network event

23 attendees from 12 groups hosting

Zoom Registration
https://voxel51.com/computer-vision-events/getting-started-with-fiftyone-workshop-jun-28/

About the Workshop
Want greater visibility into the quality of your computer vision datasets and models? Then join Jacob Marks, PhD, of Voxel51 for this free 90-minute, hands-on workshop to learn how to leverage the open source FiftyOne computer vision toolset.

In the first part of the workshop we’ll cover:

  • FiftyOne Basics (terms, architecture, installation, and general usage)
  • An overview of useful workflows to explore, understand, and curate your data
  • How FiftyOne represents and semantically slices unstructured computer vision data

The second half will be a hands-on introduction to FiftyOne, where you will learn how to do the following (sketched in code after the list):

  • Load datasets from the FiftyOne Dataset Zoo
  • Navigate the FiftyOne App
  • Programmatically inspect attributes of a dataset
  • Add new sample and custom attributes to a dataset
  • Generate and evaluate model predictions
  • Save insightful views into the data
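
For a taste of what the hands-on portion covers, here is a minimal sketch of that workflow using FiftyOne's public Python API (the "quickstart" zoo dataset and the "reviewed" field are illustrative choices, not part of the workshop materials):

```python
import fiftyone as fo
import fiftyone.zoo as foz

# Load a small sample dataset from the FiftyOne Dataset Zoo
dataset = foz.load_zoo_dataset("quickstart")

# Programmatically inspect attributes of the dataset
print(dataset)          # name, media type, and sample fields
print(dataset.first())  # the first sample and its field values

# Add a custom attribute to every sample
for sample in dataset:
    sample["reviewed"] = False
    sample.save()

# Save an insightful view into the data: the 25 most "unique" samples
view = dataset.sort_by("uniqueness", reverse=True).limit(25)
dataset.save_view("most_unique", view)

# Launch the FiftyOne App to browse the dataset interactively
session = fo.launch_app(dataset)
```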

Prerequisites
A working knowledge of Python and basic computer vision. All attendees will get access to the tutorials, videos, and code examples used in the workshop.

July '23 Computer Vision Meetup (Virtual - EU and Americas)

Network event

32 attendees from 13 groups hosting

Zoom Link

https://voxel51.com/computer-vision-events/july-2023-computer-vision-meetup/

Unleashing the Potential of Visual Data: Vector Databases in Computer Vision

Discover the game-changing role of vector databases in computer vision applications. These specialized databases excel at handling unstructured visual data, thanks to their robust support for embeddings and lightning-fast similarity search. Join us as we explore advanced indexing algorithms and showcase real-world examples in healthcare, retail, finance, and more using the FiftyOne engine combined with the Milvus vector database. See how vector databases unlock the full potential of your visual data.
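
As a rough illustration of the kind of integration the talk covers, FiftyOne's brain module can back a similarity index with Milvus; the sketch below assumes a running Milvus server that FiftyOne is configured to reach, with the zoo "quickstart" dataset standing in for your own data:

```python
import fiftyone.brain as fob
import fiftyone.zoo as foz

# Stand-in dataset; in practice this would be your own visual data
dataset = foz.load_zoo_dataset("quickstart")

# Compute embeddings and index them in a Milvus collection
fob.compute_similarity(
    dataset,
    brain_key="milvus_index",
    backend="milvus",
)

# Similarity search: the 15 samples most similar to a chosen image
query_id = dataset.first().id
view = dataset.sort_by_similarity(query_id, k=15, brain_key="milvus_index")
```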

Speaker

Filip Haltmayer is a Software Engineer at Zilliz working in both software and community development.

Computer Vision Applications at Scale with Vector Databases

Vector databases enable semantic search at scale over hundreds of millions of unstructured data objects. In this talk, I will show how you can use multi-modal encoders with the Weaviate vector database to semantically search over images and text, including demos across multiple domains such as e-commerce and healthcare.
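
Here is a rough sketch of what text-to-image semantic search can look like with the Weaviate Python client (v3-style API); the "Product" class, its properties, and a locally running instance with a CLIP-style multi-modal vectorizer module enabled are all assumptions for illustration:

```python
import weaviate

# Connect to a local Weaviate instance (assumed to have a multi-modal
# vectorizer module enabled so text and images share one vector space)
client = weaviate.Client("http://localhost:8080")

# Because images and text are embedded into the same space, a natural
# language query can retrieve matching images
result = (
    client.query
    .get("Product", ["name", "image"])
    .with_near_text({"concepts": ["red running shoes"]})
    .with_limit(5)
    .do()
)
print(result)
```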

Speaker

Zain Hasan is a senior developer advocate at Weaviate, an open source vector database.

Reverse Image Search for Ecommerce Without Going Crazy

Traditional full-text search engines have been on the market for a while, and we are all currently trying to extend them with semantic search. Still, it might be more beneficial for some ecommerce businesses to introduce reverse image search capabilities instead of relying on text only. However, semantic search and reverse image search can and should coexist! You may encounter common pitfalls while implementing both, so why don't we discuss the best practices? Let's learn how to extend your existing search system with reverse image search, without getting lost in the process!
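
As a sketch of the moving parts, reverse image search with the Qdrant Python client might look like the following; the collection name, the 512-dimensional embedding size (matching a common CLIP image encoder), and the placeholder vectors are illustrative assumptions:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

# Connect to a Qdrant instance (assumed running locally)
client = QdrantClient("localhost", port=6333)

# Create a collection sized for the image embeddings
client.recreate_collection(
    collection_name="products",
    vectors_config=VectorParams(size=512, distance=Distance.COSINE),
)

# Index embeddings produced by your image encoder (placeholder vector here)
client.upsert(
    collection_name="products",
    points=[PointStruct(id=1, vector=[0.0] * 512, payload={"sku": "A-100"})],
)

# Reverse image search: embed the query image with the same encoder,
# then fetch its nearest neighbors
hits = client.search(
    collection_name="products",
    query_vector=[0.0] * 512,  # replace with the query image's embedding
    limit=5,
)
```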

Speaker

Kacper Łukawski is a Developer Advocate at Qdrant, an open-source neural search engine.

Fast and Flexible Data Discovery & Mining for Computer Vision at Petabyte Scale

Improving model performance requires methods to discover computer vision data, sometimes from large repositories, whether it's examples similar to previously seen errors, new examples and scenarios, or more advanced techniques such as active learning and RLHF. LanceDB makes this fast and flexible for multi-modal data, with support for vector search, SQL, Pandas, Polars, Arrow, and a growing ecosystem of tools you're already familiar with. We'll walk through some common search examples and show how you can find needles in a haystack to improve your metrics!
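
To make "fast and flexible" concrete, here is a small hedged sketch with the LanceDB Python API; the table schema, file paths, and toy two-dimensional embeddings are invented for illustration:

```python
import lancedb
import pandas as pd

# Connect to a local LanceDB database (a directory on disk)
db = lancedb.connect("./cv_data")

# Create a table from a DataFrame of toy image embeddings and metadata
df = pd.DataFrame({
    "vector": [[0.1, 0.2], [0.3, 0.4]],  # toy 2-dim embeddings
    "uri": ["img_001.jpg", "img_002.jpg"],
    "label": ["pedestrian", "cyclist"],
})
table = db.create_table("images", data=df)

# Vector search combined with an SQL-style filter, returned as Pandas:
# nearest neighbors to a query embedding among "pedestrian" rows only
results = (
    table.search([0.15, 0.25])
    .where("label = 'pedestrian'")
    .limit(5)
    .to_pandas()
)
print(results)
```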

Speaker

Jai Chopra is Head of Product at LanceDB.

How-To Build Scalable Image and Text Search for Computer Vision Data using Pinecone and Qdrant

Have you ever wanted to find the images most similar to an image in your dataset? What if you haven't picked out an illustrative image yet, but you can describe what you are looking for using natural language? And what if your dataset contains millions, or even tens of millions, of images? In this talk, Jacob will show you step by step how to integrate all the technology required to enable search for similar images and search with natural language, and how to scale those searches with Pinecone and Qdrant. He'll dive deep into the tech and show you a variety of practical examples that can help transform the way you manage your image data.
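
For a flavor of how these pieces can fit together, the sketch below uses FiftyOne's brain module with a CLIP model so the index supports both image and natural-language queries; backend="pinecone" assumes configured Pinecone credentials, and backend="qdrant" would work analogously against a running Qdrant instance:

```python
import fiftyone.brain as fob
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")  # stand-in for your images

# Build a CLIP-powered similarity index backed by Pinecone
fob.compute_similarity(
    dataset,
    model="clip-vit-base32-torch",
    brain_key="pinecone_index",
    backend="pinecone",
)

# Image-to-image: samples most similar to a chosen image
query_id = dataset.first().id
similar = dataset.sort_by_similarity(query_id, k=25, brain_key="pinecone_index")

# Text-to-image: describe what you're looking for in natural language
matches = dataset.sort_by_similarity(
    "kites flying in the sky", k=25, brain_key="pinecone_index"
)
```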

Speaker

Jacob Marks is a Machine Learning Engineer and Developer Evangelist at Voxel51.

July '23 Computer Vision Meetup (Virtual - APAC)

Network event

4 attendees from 13 groups hosting

Zoom Link

https://us02web.zoom.us/webinar/register/WN_2H2Kjg8vQoqI0xyMQVLeOw

MARLIN: Masked Autoencoder for Facial Video Representation LearnINg

This talk proposes a self-supervised approach to learning universal facial representations from videos that transfer across a variety of facial analysis tasks such as Facial Attribute Recognition (FAR), Facial Expression Recognition (FER), DeepFake Detection (DFD), and Lip Synchronization (LS). Our proposed framework, named MARLIN, is a facial video masked autoencoder that learns highly robust and generic facial embeddings from abundantly available, non-annotated, web-crawled facial videos. As a challenging auxiliary task, MARLIN reconstructs the spatio-temporal details of the face from densely masked facial regions, which mainly include the eyes, nose, mouth, lips, and skin, to capture local and global aspects that in turn help encode generic and transferable features. Through a variety of experiments on diverse downstream tasks, we demonstrate MARLIN to be an excellent facial video encoder and feature extractor that performs consistently well across FAR (1.13% gain over supervised benchmark), FER (2.64% gain over unsupervised benchmark), DFD (1.86% gain over unsupervised benchmark), and LS (29.36% gain in Fréchet Inception Distance), even in low-data regimes.
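
As a purely conceptual sketch of the masked-autoencoder objective described above (not the authors' code: MARLIN's actual masking is guided toward facial regions, while the plain random masking and the encoder/decoder signatures here are simplifications for illustration):

```python
import torch
import torch.nn.functional as F

def mae_step(encoder, decoder, patches, mask_ratio=0.9):
    """One masked-autoencoding step over flattened video patches.

    patches: (batch, num_patches, patch_dim) spatio-temporal patches
    """
    B, N, _ = patches.shape
    num_masked = int(mask_ratio * N)

    # Randomly split patches into masked (hidden) and visible sets
    perm = torch.rand(B, N).argsort(dim=1)
    masked_idx, visible_idx = perm[:, :num_masked], perm[:, num_masked:]
    batch = torch.arange(B).unsqueeze(1)

    latent = encoder(patches[batch, visible_idx])  # encode visible patches only
    pred = decoder(latent, masked_idx)             # reconstruct the hidden ones

    # The loss is computed only on the masked regions, forcing the
    # encoder to capture transferable local and global structure
    target = patches[batch, masked_idx]
    return F.mse_loss(pred, target)
```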

Zhixi Cai is a Ph.D. student in the Data Science and Artificial Intelligence Department of the Monash University IT Faculty, supervised by Dr. Munawar Hayat, Dr. Kalin Stefanov, and Dr. Abhinav Dhall. His research interests include computer vision, deepfakes, and affective computing.

Aug '23 Computer Vision Meetup (Virtual - APAC)

Network event

1 attendee from 13 groups hosting

Zoom Link

https://us02web.zoom.us/webinar/register/WN_6Qthi0A8QvGcAVlLmxImqQ

Removing Backgrounds Automatically or with the User's Native Language

Image matting, also known as background removal, refers to extracting the accurate foreground from an image, which benefits many downstream applications such as film production and augmented reality. To solve this ill-posed problem, previous methods required extra user inputs involving large amounts of manual effort, such as trimaps or scribbles. In this session, we will introduce our research, which allows users to remove the background automatically or even flexibly select a specific foreground using their native language. We'll also show some fancy demos and illustrate some downstream applications.
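
The compositing equation underlying matting is I = αF + (1 − α)B, where α is the per-pixel opacity of the foreground F over the background B. Below is a minimal NumPy sketch of using a predicted alpha matte to swap backgrounds; it treats the original image as an approximation of the foreground, which holds where α is close to 0 or 1:

```python
import numpy as np

def replace_background(image, alpha, new_background):
    """Composite a matted foreground onto a new background.

    image, new_background: (H, W, 3) float arrays in [0, 1]
    alpha: (H, W) matte in [0, 1], predicted by a matting model
    """
    a = alpha[..., None]  # broadcast the matte over the color channels
    return a * image + (1.0 - a) * new_background
```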

Jizhizi Li recently completed her Ph.D. in Artificial Intelligence at the University of Sydney. She has published several papers in top-tier conferences and journals, including CVPR, IJCV, IJCAI, and Multimedia; her research interests include computer vision, image matting, multi-modal learning, and AIGC.

Past events (17)

June '23 Computer Vision Meetup (Virtual - EU and Americas)