High-Resolution Networks: A Universal Architecture for Visual Recognition

Hosted By
Peter N.

Details

The lecture will cover HRNet (CVPR 2019), a backbone neural network for a range of machine vision tasks, including a state-of-the-art pose estimation implementation. The lecturer is the paper's author.

Lecture abstract:

Since AlexNet was introduced in 2012, convolutional neural network architectures for visual recognition have developed rapidly. Most milestone architectures, e.g. GoogLeNet, VGGNet, ResNet, and DenseNet, were initially developed for image classification. It has become a golden rule that a classification architecture serves as the backbone for other computer vision tasks.

What’s next for a new architecture that is broadly applicable to general computer vision tasks? Can we design a universal architecture from general computer vision tasks rather than from classification tasks?

We pursued these questions and developed the High-Resolution Network (HRNet), a network that comes from general vision tasks and wins on many fronts of computer vision, including semantic segmentation, human pose estimation, face alignment, and object detection. HRNet is conceptually different from classification architectures: it is designed from scratch rather than derived from one. It breaks the dominant design rule, which goes back to LeNet-5, of connecting convolutions in series from high resolution to low resolution, and instead connects the high- and low-resolution convolutions in parallel.
Project page: https://jingdongwang2017.github.io/Projects/HRNet/
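
For readers who want a concrete picture before the talk, the contrast between the serial and the parallel design can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the HRNet implementation: all function names are hypothetical, a fixed 3x3 mean filter stands in for learned convolutions, and the exchange of information between branches is a detail of the published HRNet design that the abstract above does not spell out.

```python
# Toy sketch: serial (high-to-low) vs. parallel multi-resolution processing.
# All names are hypothetical; a 3x3 mean filter stands in for a learned convolution.

def box_filter(fmap):
    """Stand-in for a 3x3 convolution: 3x3 mean with clamped borders."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    total += fmap[ii][jj]
            out[i][j] = total / 9.0
    return out

def downsample(fmap):
    """Halve the resolution with 2x2 average pooling (assumes even sizes)."""
    h, w = len(fmap), len(fmap[0])
    return [[(fmap[i][j] + fmap[i][j + 1]
              + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

def upsample(fmap):
    """Double the resolution by nearest-neighbour repetition."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def add(a, b):
    """Element-wise sum of two equally sized feature maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def serial_stage(fmap):
    """Classification-style stage: convolve, then drop resolution."""
    return downsample(box_filter(fmap))

def parallel_stage(high, low):
    """Convolve both branches side by side, then exchange information."""
    high, low = box_filter(high), box_filter(low)
    new_high = add(high, upsample(low))    # low-res context flows up
    new_low = add(low, downsample(high))   # high-res detail flows down
    return new_high, new_low

image = [[float(i * 8 + j) for j in range(8)] for i in range(8)]

# Serial design: resolution shrinks at every stage (8x8 -> 4x4 -> 2x2).
serial_out = serial_stage(serial_stage(image))

# Parallel design: the 8x8 branch is kept throughout, next to a 4x4 branch.
high, low = image, downsample(image)
for _ in range(2):
    high, low = parallel_stage(high, low)

print("serial output:", len(serial_out), "x", len(serial_out[0]))
print("parallel outputs:", len(high), "x", len(high[0]),
      "and", len(low), "x", len(low[0]))
```

In the serial path, spatial detail has to be recovered afterwards from a 2x2 map, while the parallel path still holds a full 8x8 representation at the end. That maintained high-resolution branch is the property that matters for position-sensitive tasks such as pose estimation and segmentation.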

Presenter BIO:

Jingdong Wang is a Senior Principal Research Manager with the Visual Computing Group at Microsoft Research Asia (Beijing, China). He received the B.Eng. and M.Eng. degrees from the Department of Automation at Tsinghua University in 2001 and 2004, respectively, and the PhD degree from the Department of Computer Science and Engineering, the Hong Kong University of Science and Technology, Hong Kong, in 2007. His areas of interest include neural network design, human pose estimation, large-scale indexing, and person re-identification. He is an Associate Editor of the IEEE TPAMI, the IEEE TMM and the IEEE TCSVT, and is an area chair of several leading Computer Vision and AI conferences, such as CVPR, ICCV, ECCV, ACM MM, IJCAI, and AAAI. He is an IAPR Fellow and an ACM Distinguished Member.

His representative works include the deep high-resolution network (HRNet), interleaved group convolutions, discriminative regional feature integration (DRFI) for supervised saliency detection, neighborhood graph search (NGS) for large-scale similarity search, and composite quantization for compact coding, among others. He has shipped a number of technologies to Microsoft products, including Bing search, Bing Ads, Cognitive Services, and the XiaoIce chatbot. The NGS algorithm developed in his group serves as a basic building block in many Microsoft products. In the Bing image search engine, the key color filter function is based on the salient object algorithm developed in his group. He pioneered the development of a commercial color-sketch image search system. More information about Dr. Jingdong Wang can be found at https://jingdongwang2017.github.io/.

This is a technical deep learning talk; prior DL knowledge is advised.

Please register through the Zoom link right after your RSVP. We will send the links to the Zoom event via email only to those who have registered through Zoom.

-------------------------
Find us at:

All lectures are uploaded to our Youtube channel ➜ https://www.youtube.com/channel/UCHObHaxTXKFyI_EI8HiQ5xw

Newsletter for updates about more events ➜ http://eepurl.com/gJ1t-D

Sub-reddit for discussions ➜ https://www.reddit.com/r/2D3DAI/

Discord server for, well, discord ➜ https://discord.gg/MZuWSjF

Blog ➜ https://2d3d.ai

2d3d.ai
Online event