Visual Question Answering Based on Image and Video


Details
The lecture will cover new research on the semantic understanding of visual scenes, based in part on the CVPR 2020 paper "Hierarchical Conditional Relation Networks (HCRN) for Video Question Answering".
The speaker is the researcher behind this work and one of the paper's authors.
Lecture abstract:
Deep learning has recently achieved remarkable successes and become the de facto approach to many computer vision problems. Its superb performance is, however, mostly limited to tasks requiring visual perception. Tasks that require acquiring new knowledge through multi-step inference remain very challenging.
In this talk, I present our research on learning to reason visually by asking machines to answer a natural-language question based on knowledge presented in a visual scene, either a static image or a dynamic scene from a video. This visual question answering task is multi-disciplinary by nature: it requires high-level understanding of both vision and language, and is hence considered a good proxy for visual reasoning.
git: https://github.com/thaolmk54/hcrn-videoqa
Presenter BIO:
Thao Minh Le is currently a second-year PhD student at the Applied Artificial Intelligence Institute, Deakin University. He works on how machines learn and reason about the world from what they see. His interests are in deep learning and its applications to computer vision and bio-medicine.
Going back in time, he obtained a Bachelor of Engineering from Hanoi University of Science and Technology in 2014 and a Master of Engineering from Tokyo Institute of Technology under the Japanese Government MEXT Scholarship Program in 2018.
His git: https://github.com/thaolmk54
Please register through the Zoom link right after your RSVP. We will send the links to the Zoom event via email only to those who have registered through Zoom.
-------------------------
Find us at:
All lectures are uploaded to our YouTube channel ➜ https://www.youtube.com/channel/UCHObHaxTXKFyI_EI8HiQ5xw
Newsletter for updates about more events ➜ http://eepurl.com/gJ1t-D
Sub-reddit for discussions ➜ https://www.reddit.com/r/2D3DAI/
Discord server for, well, discord ➜ https://discord.gg/MZuWSjF
Blog ➜ https://2d3d.ai
