Details

Because of the naturalness of software and the rapid evolution of ML techniques, code change patterns (CPATs) recur frequently. They range from simple API migrations to changes involving several complex control structures such as for loops. Performing CPATs manually is tedious, yet current state-of-the-art techniques for inferring transformation rules are not advanced enough to handle unseen variants of complex CPATs, which results in low recall. In this talk, we present a novel, automated workflow that mines CPATs, infers their transformation rules, and then transplants them automatically to new target sites. We designed, implemented, evaluated, and released this workflow as a tool, PYEVOLVE. At its core is a novel transformation rule inference engine that is aware of both data flow and control flow. This technique advances the state of the art for transformation-by-example tools; without it, 70% of the code changes that PYEVOLVE transforms could not be automated. Our thorough empirical evaluation of over 40,000 transformations shows 97% precision and 94% recall. Developers of well-known open-source projects accepted 90% of the CPATs generated by PYEVOLVE, confirming that its changes are useful. JetBrains engineers and ML developers worldwide can leverage our findings and tools to enhance ML systems. We also present our future research on determining which transformations are profitable to transplant to new code sites.
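
To make the idea of a CPAT and its "unseen variants" concrete, here is a small illustrative sketch in Python. It is not taken from the talk or from PYEVOLVE; the function names and the choice of pattern (replacing an accumulating for loop with the built-in `sum`) are assumptions made purely for illustration.

```python
# Hypothetical illustration of a code change pattern (CPAT).
# None of these functions come from PYEVOLVE; they only show the kind of
# before/after pair and variant that the abstract describes.

def total_before(values):
    # Before: accumulate with an explicit for loop.
    total = 0
    for v in values:
        total += v
    return total


def total_after(values):
    # After: the same computation expressed with the built-in sum().
    return sum(values)


def total_variant(values):
    # A variant of the same pattern: a temporary variable sits between the
    # loop header and the accumulation, and the update is spelled out.
    # A purely syntactic rule learned from total_before would not match
    # this shape, although the data flow into `total` is the same.
    total = 0
    for v in values:
        item = v
        total = total + item
    return total


data = [1, 2, 3, 4]
# All three forms compute the same result, so the rewrite preserves behavior here.
assert total_before(data) == total_after(data) == total_variant(data)
```

The variant is the interesting case: matching it requires reasoning about how values flow into the accumulator, not only about the surface syntax of the loop.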

This talk will be jointly presented by Prof. Danny Dig and PhD student Malinda Dilhara.

Danny is a JetBrains Scientific Consultant with the ML4SE team at JetBrains Research, where he is excited to spend a sabbatical in 2023. He is also an associate professor of Computer Science at the University of Colorado and an adjunct professor at the University of Illinois at Urbana-Champaign. He is the Founder and Executive Director of the NSF Center on Pervasive Personalized Intelligence for IoT Systems (http://ppicenter.org). He enjoys doing research in Software Engineering, with a focus on interactive program transformations that improve programmer productivity and software quality. Together with his grad students, he has pushed the frontier of refactoring into cutting-edge domains including AI/ML, mobile, concurrency and parallelism, component-based, testing, and end-user programming. Their research ships with official versions of Eclipse, NetBeans, Visual Studio, and Android Studio. He hopes that this year they will contribute to JetBrains IDEs. Danny travels all over the world to train leaders on personal and professional growth. His goal is to be a transformational leader who equips and inspires the next generation of tech leaders.

Malinda is a PhD candidate in the Department of Computer Science at the University of Colorado Boulder. He received his B.Sc. from the University of Moratuwa, Sri Lanka, in 2015 and is a former senior software engineer at London Stock Exchange Technology (2015-2018). Malinda enjoys doing research on software refactoring, static code analysis, and program synthesis. He was awarded the Gold Prize at the ACM Student Research Competition held at the flagship ACM conference in Software Engineering.

Related topics

Machine Learning
Software Development
Software Engineering
