Towards Exploring the Fine-Grained Code Reuse in the PyPI Ecosystem


Details
Speaker
George Drosos
Abstract
Modern programming languages facilitate software reuse by hosting software repositories containing interdependent software libraries, which form software ecosystems. Such ecosystems present substantial security risks. To understand and mitigate these risks, researchers often analyze software ecosystems through dependency networks, usually at the package level. Recent research has shown that a more fine-grained analysis through call-based dependency networks can offer significantly more precise results.
In this talk, we will discuss how call-based dependency networks can be used to measure two aspects of code reuse that are correlated with security risk: software bloat and technical leverage. Our analysis is performed on PyPI, a popular ecosystem that has recently been the target of many supply chain attacks. With respect to software bloat, our goal is to measure the amount of bloated dependencies and unreachable dependency code that exists on PyPI at the fine-grained level. We also aim to demonstrate a way of measuring technical leverage at the fine-grained level. Technical leverage is a software metric that describes the ratio between the amount of code a package leverages from its dependencies and the size of its own code.
We will also discuss some preliminary results of our analysis. We found that a significant amount of software bloat exists on PyPI, indicating that a debloating process could effectively reduce the attack surface of the ecosystem. We also observed that PyPI applications are highly dependent on third-party code, and are thus exposed to security risks.
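To make the two metrics from the abstract concrete, the following is a minimal sketch on a toy call graph. It is not the speaker's tooling or methodology: the function names, package names, line counts, and the choice to measure size in lines of code and to count only reachable dependency code toward leverage are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy data: function name -> (owning package, lines of code).
FUNCTIONS = {
    "app.main": ("app", 10),
    "app.helper": ("app", 10),
    "libA.parse": ("libA", 30),
    "libA.unused": ("libA", 50),
    "libB.render": ("libB", 40),
}

# Hypothetical call edges: caller -> list of callees.
CALLS = {
    "app.main": ["app.helper", "libA.parse"],
    "app.helper": [],
    "libA.parse": [],
    "libA.unused": ["libB.render"],
    "libB.render": [],
}


def reachable(entry_points, calls):
    """Breadth-first traversal of the call graph from the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        for callee in calls.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def analyze(root_pkg, functions, calls):
    """Compute toy bloat and leverage figures for one package.

    Assumption: every function of the root package is an entry point,
    and the root package has at least one dependency function.
    """
    entry = [f for f, (pkg, _) in functions.items() if pkg == root_pkg]
    live = reachable(entry, calls)

    own_loc = sum(loc for pkg, loc in functions.values() if pkg == root_pkg)
    dep_fns = {f: v for f, v in functions.items() if v[0] != root_pkg}
    total_dep_loc = sum(loc for _, loc in dep_fns.values())
    used_loc = sum(loc for f, (_, loc) in dep_fns.items() if f in live)

    dep_pkgs = {pkg for pkg, _ in dep_fns.values()}
    used_pkgs = {pkg for f, (pkg, _) in dep_fns.items() if f in live}

    return {
        # Dependencies with no function reachable from the root package.
        "bloated_dependencies": sorted(dep_pkgs - used_pkgs),
        # Share of dependency code that is never reached.
        "unreachable_dep_loc_ratio": (total_dep_loc - used_loc) / total_dep_loc,
        # Reachable dependency code relative to the package's own code.
        "technical_leverage": used_loc / own_loc,
    }


report = analyze("app", FUNCTIONS, CALLS)
print(report)
```

In this toy example, a package-level view would count both libA and libB as dependencies of app, while the call-level view shows that libB is never reached and that only part of libA is used.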
Bio
George Drosos is a researcher in software engineering at the Business Analytics Laboratory at Athens University of Economics and Business (AUEB). He holds a BSc with distinction in Management Science and Technology from AUEB, majoring in Software and Data Analysis Technologies. He has previously worked as a Research Software Engineer at the FASTEN Research Project. In the past, he was also involved in research projects focusing on analyzing and detecting compiler bugs. His main research interests include Software Analytics and Compilers.
---
Attend in person, or online at https://kth-se.zoom.us/j/61754874417
