Code generation & understanding with Natural Language Generation (NLG)
Details
Talk abstract:
Natural Language Generation (NLG) is one of the most exciting areas of NLP. The success of pre-trained language models such as BERT & GPT has inspired the application of NLG to programming languages. This session will cover how NLG techniques are being used for software code generation & understanding in the real world.
In the first part of this session, I will start with the basics and briefly cover the foundations of NLG - modeling, training, decoding and model evaluation. Then we will take a deeper dive using the example of CodeT5, a pre-trained encoder-decoder model for programming languages (released by Salesforce Research last month, in September 2021). The talk will cover how CodeT5 works, its state-of-the-art benchmark results, and how it could aid software development with text-to-code generation, auto-completion, summarization, etc.
Speaker: Unmesh Kulkarni
Speaker Bio:
Unmesh Kulkarni is currently at Adobe working on product marketing strategy & operations for innovation & collaboration projects, and he also serves as an advisor to AI startups. Previously, Unmesh was VP of Product & Engineering at Civitas AI, where he led development of an NLP-based virtual assistant product that helps cities connect with their residents. Prior to that, he was VP of Engineering & Product Delivery at Covad and VP of Products at EVault, a cloud data company. Unmesh has an MBA from the Wharton School and an MS in Computer Science from IIT (Kanpur).
Related Paper:
"CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation" by Yue Wang et al.
https://arxiv.org/pdf/2109.00859.pdf
Code: https://github.com/salesforce/CodeT5
Agenda:
7-7:15 pm Meet and greet
7:15-8:15 pm Paper presentation and group discussion
8:15-8:30 pm Additional discussions
