AutoNet: Automated Network Construction from Massive Text Corpora [Virtual]
by Jingbo Shang
Abstract:
Mining structured knowledge from massive unstructured text data is a key challenge in data science. In this talk, I will discuss my proposed framework, AutoNet, which transforms unstructured text data into structured heterogeneous information networks, from which actionable knowledge can be uncovered flexibly and effectively. AutoNet is a data-driven approach that uses distant supervision instead of human curation and labeling. It consists of four essential steps: (1) quality phrase mining; (2) entity recognition and typing; (3) relation extraction; and (4) taxonomy construction. Along this line, I have developed a number of state-of-the-art distantly supervised and unsupervised methods and published them in top conferences and journals. Specifically, I will present my work on phrase mining, entity recognition, and taxonomy construction in detail, while touching briefly on my other research topics. Finally, I will summarize the AutoNet framework with a demo video and conclude by discussing future work in collaboration with other disciplines.
Bio:
Jingbo Shang is an Assistant Professor in Computer Science and Engineering and the Halıcıoğlu Data Science Institute at UC San Diego. He obtained his Ph.D. from the University of Illinois at Urbana-Champaign in 2019 and his B.E. from Shanghai Jiao Tong University in 2014. His research focuses on data mining, natural language processing, and machine learning methods that require minimal human effort, and on their applications. His research has been recognized by many prestigious awards, including the Grand Prize of the Yelp Dataset Challenge in 2015, the Google Ph.D. Fellowship in Structured Data and Database Management in 2017, and the SIGKDD Dissertation Award Runner-up in 2020.
=================
Agenda (Pacific Daylight Time, UTC-7)
- 5:30 - 5:40 pm -- Gathering and introductions
- 5:40 - 6:30 pm -- Talk
- 6:30 - 7:00 pm -- Q&A and discussion
Zoom link: https://us02web.zoom.us/j/83577536935?pwd=UElhMjJyc2ZON1VKSTVVVGlrZENidz09
Links to slides and videos of meetup presentations are available on the SDML GitHub repo: https://github.com/SanDiegoMachineLearning/talks
=================
Questions?
Join our Slack channel or leave a comment below if you have any questions about the group or need clarification on anything.
https://join.slack.com/t/sdmachinelearning/shared_invite/zt-6b0ojqdz-9bG7tyJMddVHZ3Zm9IajJA
