- Network event: Virtual Open Office Hours with Professor Jason Corso - June 10 (72 attendees from 14 groups)
Virtual Open Office Hours with Professor Jason Corso
Drop in on a weekly, informal chat with Professor Jason Corso!
When: Every Monday | 12 PM Eastern
Join the Zoom: https://us02web.zoom.us/j/85383168408
What are Open Office Hours?
These chats are for students, engineers, researchers, founders, open source contributors, coders, roboticists, authors, and sci-fi enthusiasts. Office Hours take place every Monday at noon Eastern Time for 60 minutes.
What topics are open for discussion?
In addition to your questions, Professor Corso would like to hear your perspectives, such as challenges and opportunities in your research or what robot you would choose to join you on a desert island and why.
About Dr. Jason Corso
Dr. Jason Corso is currently a Professor of Robotics and Electrical Engineering & Computer Science at the University of Michigan. He received his Ph.D. in Computer Science at The Johns Hopkins University in 2005. He is a recipient of the NSF CAREER award (2009), ARO Young Investigator award (2010), Google Faculty Research Award (2015) and the DARPA CSSG (2009).
He is also the Co-Founder and Chief Scientist of Voxel51, a computer vision startup building a state-of-the-art platform for video- and image-based applications.
- Network event: Virtual Open Office Hours with Professor Jason Corso - June 17 (10 attendees from 14 groups)
- Network event: Getting Started with FiftyOne Workshop - June 26 (45 attendees from 12 groups)
Where
Virtually over Zoom: https://voxel51.com/computer-vision-events/getting-started-with-fiftyone-workshop-june-26-2024/
About the Workshop
Want greater visibility into the quality of your computer vision datasets and models? Then join Harpreet Sahota, Machine Learning Engineer at Voxel51, for this free 90-minute, hands-on workshop to learn how to leverage the open source FiftyOne computer vision toolset.
In the first part of the workshop we’ll cover:
- FiftyOne Basics (terms, architecture, installation, and general usage)
- An overview of useful workflows to explore, understand, and curate your data
- How FiftyOne represents and semantically slices unstructured computer vision data
The second half will be a hands-on introduction to FiftyOne, where you will learn how to:
- Load datasets from the FiftyOne Dataset Zoo
- Navigate the FiftyOne App
- Programmatically inspect attributes of a dataset
- Add new sample and custom attributes to a dataset
- Generate and evaluate model predictions
- Save insightful views into the data
Prerequisites are a working knowledge of Python and basic computer vision. All attendees will get access to the tutorials, videos, and code examples used in the workshop.
- Network event: AI, Machine Learning and Computer Vision Meetup - June 27 (123 attendees from 14 groups)
When: June 27, 2024 – 10:00 AM Pacific / 1:00 PM Eastern
Register for the Zoom: https://voxel51.com/computer-vision-events/june-27-2024-ai-machine-learning-computer-vision-meetup/
Leveraging Pre-trained Text2Image Diffusion Models for Zero-Shot Video Editing
Text-to-image diffusion models demonstrate remarkable editing capabilities in the image domain, especially after Latent Diffusion Models made diffusion models more scalable. In contrast, video editing still has much room for improvement, particularly given the relative scarcity of video datasets compared to image datasets. Therefore, we will discuss whether pre-trained text-to-image diffusion models can be used for zero-shot video editing without any fine-tuning stage. Finally, we will also explore possible future work and interesting research ideas in the field.
About the Speaker
Bariscan Kurtkaya is a KUIS AI Fellow and a graduate student in the Department of Computer Science at Koc University. His research interests lie in exploring and leveraging the capabilities of generative models in the realm of 2D and 3D data, encompassing scientific observations from space telescopes.
Improved Visual Grounding through Self-Consistent Explanations
Vision-and-language models that are trained to associate images with text have been shown to be effective for many tasks, including object detection and image segmentation. In this talk, we will discuss how to enhance vision-and-language models’ ability to localize objects in images by fine-tuning them for self-consistent visual explanations. We propose a method that augments text-image datasets with paraphrases using a large language model and employs SelfEQ, a weakly supervised strategy that promotes self-consistency in visual explanation maps. This approach broadens the model’s working vocabulary and improves object localization accuracy, as demonstrated by performance gains on competitive benchmarks.
About the Speakers
Dr. Paola Cascante-Bonilla received her Ph.D. in Computer Science at Rice University in 2024, advised by Professor Vicente Ordóñez Román, working on Computer Vision, Natural Language Processing, and Machine Learning. She received a Master of Computer Science at the University of Virginia and a B.S. in Engineering at the Tecnológico de Costa Rica. Paola will join Stony Brook University (SUNY) as an Assistant Professor in the Department of Computer Science.
Ruozhen (Catherine) He is a first-year Computer Science PhD student at Rice University, advised by Prof. Vicente Ordóñez, focusing on efficient algorithms in computer vision with less or multimodal supervision. She aims to leverage insights from neuroscience and cognitive psychology to develop interpretable algorithms that achieve human-level intelligence across versatile tasks.
Combining Hugging Face Transformer Models and Image Data with FiftyOne
Datasets and Models are the two pillars of modern machine learning, but connecting the two can be cumbersome and time-consuming. In this lightning talk, you will learn how the seamless integration between Hugging Face and FiftyOne simplifies this complexity, enabling more effective data-model co-development. By the end of the talk, you will be able to download and visualize datasets from the Hugging Face hub with FiftyOne, apply state-of-the-art transformer models directly to your data, and effortlessly share your datasets with others.
About the Speaker
Jacob Marks, PhD is a Machine Learning Engineer and Developer Evangelist at Voxel51, where he leads open source efforts in vector search, semantic search, and generative AI for the FiftyOne data-centric AI toolkit. Prior to joining Voxel51, Jacob worked at Google X, Samsung Research, and Wolfram Research.
- Network event: AI, Machine Learning and Computer Vision Meetup - July 3 (12 attendees from 14 groups)
When: July 3, 2024 – 9 AM Eastern / 2 PM BST / 6:30 PM IST
Register for the Zoom: https://voxel51.com/computer-vision-events/ai-machine-learning-computer-vision-meetup-july-3-2024/
Performance Optimization for Multimodal LLMs
In this talk we’ll delve into multimodal LLMs, exploring the fusion of language and vision in cutting-edge models. We’ll highlight the challenges of handling diverse data heterogeneity, architecture design choices, strategies for efficient training, and optimization techniques that improve both performance and inference speed. Through case studies and future outlooks, we’ll illustrate the importance of these optimizations in advancing applications across various domains.
About the Speaker
Neha Sharma has a rich background in digital products and technology services, having delivered successful projects for industry giants like IBM and launched innovative products for tech startups. As a Product Manager at Ori, Neha specializes in developing cutting-edge AI solutions, actively engaging with a variety of AI use cases centered on the latest LLMs and demonstrating her commitment to staying at the forefront of AI technology.
Stay tuned! More speakers will be announced shortly.
- Network event: Developing FiftyOne Plugins Workshop - July 10 (1 attendee from 14 groups)
Are you ready to take your computer vision tooling to the next level? Open source FiftyOne is the most flexible computer vision toolkit on the planet. By tapping into its built-in Plugin framework, you can extend your FiftyOne experience and streamline your workflows, building Gradio-like applications with data at their core.
Register for the Zoom: https://voxel51.com/computer-vision-events/developing-fiftyone-plugins-workshop-july-10/
From concept interpolation to image deduplication, optical character recognition, and even curating your own AI art gallery by adding generated images directly into a dataset, your imagination is the only limit. Join us to discover how you can unleash your creativity and interact with data like never before.
In the workshop we’ll cover:
- FiftyOne Plugins – what are they?
- Installing a plugin
- Creating your own Python plugin
- Python plugin tips
- Creating your own JavaScript plugin
- Publishing your plugin
Prerequisites
A working knowledge of Python and basic familiarity with FiftyOne. All attendees will get access to the tutorials, videos, and code examples used in the workshop.
Check out some of these popular plugins
- VoxelGPT: AI Assistant for Computer Vision
- Image Quality Issues
- Image Deduplication
- AI Art Gallery
- Optical Character Recognition
- Visual Question Answering
Resources for the workshop
- FiftyOne Plugins Documentation
- Python Operators API Docs
- FiftyOne Plugins Repo
- Plugins Channel in FiftyOne Community Slack
- Network event: Getting Started with FiftyOne Workshop - July 24 (1 attendee from 14 groups)
Where
Virtually over Zoom: https://voxel51.com/computer-vision-events/getting-started-with-fiftyone-workshop-july-24-2024/
- Network event: Developing FiftyOne Plugins Workshop - Aug 7 (3 attendees from 14 groups)
Register for the Zoom: https://voxel51.com/computer-vision-events/developing-fiftyone-plugins-workshop-aug-7-2024/
- Network event: AI, Machine Learning and Computer Vision Meetup - Aug 8 (17 attendees from 14 groups)
When: August 8, 2024 – 10:00 AM Pacific / 1:00 PM Eastern
Where: Virtual
Register for the Zoom: https://voxel51.com/computer-vision-events/ai-machine-learning-computer-vision-meetup-aug-8-2024/
GenAI for Video: Diffusion-Based Editing and Generation
Recently, diffusion-based generative AI models have gained popularity due to their wide applications in the image domain. Additionally, there is growing attention to the video domain because of its ubiquitous presence in real-world applications. In this talk, we will discuss the future of GenAI in the video domain, highlighting recent advancements and exploring its potential and impact on video editing and generation. We will also examine the challenges and opportunities these technologies present, offering insights into how they can revolutionize the video industry.
About the Speaker
Ozgur Kara is a PhD student in the Computer Science Department at the University of Illinois at Urbana-Champaign. He earned his Bachelor’s degree in Electrical and Electronics Engineering from Boğaziçi University. His research focuses on generative AI and computer vision, particularly their applications in video.
- Network event: AI, Machine Learning and Computer Vision Meetup - Aug 15 (15 attendees from 14 groups)
When: August 15, 2024 – 9 AM Eastern / 2 PM BST / 6:30 PM IST
Where: Virtual
Register for the Zoom: https://voxel51.com/computer-vision-events/ai-machine-learning-computer-vision-meetup-aug-15-2024/
SyntStereo2Real: Edge-Aware GAN for Remote Sensing Image Translation while Maintaining Stereo Constraints
In this talk we’ll explore SyntStereo2Real, a lightweight image-to-image (I2I) translation framework for stereo pairs that produces semantically consistent translations of the left and right images. Edge maps and an efficient warping loss are used to improve the matching of the generated pairs. This approach outperforms the current SOTA on remote sensing and autonomous driving datasets!
About the Speaker
Vasudha Venkatesan is a researcher and engineer with expertise in computer vision and machine learning. She has made significant contributions to the field of remote sensing, particularly addressing challenges related to stereo-matched data and domain generalization.
How to Develop Data Science Projects with Open Source
As a data scientist or ML practitioner, how would you feel if you could independently iterate on your data science projects without ever worrying about operational overheads like deployment or containerization? Let’s find out by walking through a sample project that helps you do just that! We’ll combine Python, AWS, Metaflow, and BentoML into a template/scaffolding project with sample code to train, serve, and deploy ML models…while making it easy to swap in other ML models.
About the Speaker
Jay Cui worked at BENlabs as a machine learning platform engineer. Prior to that, he was an undergraduate student studying applied mathematics with a minor in computer science at BYU.