
What's in your AI code? Learn why SCA tools are wrong, and how to deal with it

Hosted By
Ben P. and 4 others

Details

With the rise of AI fueled by Python-based libraries, scanning Python projects and their dependencies for open-source software (OSS) vulnerabilities has become critically important. Python relies on package managers like pip or conda to manage declared dependencies. Dependencies are declared in manifest files, which the package manager uses to install the correct version of each required dependency. However, Python’s dependency management system, coupled with its dynamically typed nature, makes it an especially challenging language to analyze.

Of particular focus is the phenomenon of phantom dependencies: dependencies that a project uses but that are not reported in its manifest file. These hidden dependencies are often provided dependencies, which is especially true for libraries such as tensorflow and pytorch that are essential for AI. They challenge software composition analysis (SCA) of Python code and undermine the reliability of vulnerability results.

For example, OpenAI's baseline codebase depends on tensorflow without declaring it explicitly, which makes tensorflow a phantom dependency. This can cause unexpected behavior and security vulnerabilities. We show how using type-aware program analysis to create call graphs and perform reachability analysis helps us determine the correct dependency set for a codebase, irrespective of what is in the manifest.

Program analysis aims to extract information from software programs to enhance reliability, security, and performance. This session explores program analysis, specifically reachability analysis in Python, and delves into phantom dependencies, which are often overlooked in Python applications.

Python's dynamic typing and interpreted nature make it a challenging subject for reachability analysis. Without type information, it is hard to determine precisely which dependencies and features the code actually uses.
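The following sketch illustrates reachability over a call graph in the simplest possible form, assuming calls can be resolved by name alone. The helpers `build_call_graph` and `reachable` and the `SAMPLE` source are hypothetical, and this is a deliberately naive approximation rather than the type-aware analysis presented in the session.

```python
import ast
from collections import deque

# Hypothetical source: pickle is imported but never reached from main().
SAMPLE = '''
import json
import pickle

def load_config(text):
    return json.loads(text)

def load_cache(blob):
    return pickle.loads(blob)

def main():
    return load_config("{}")
'''


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the names it calls, resolved by name only."""
    graph = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call):
                    func = sub.func
                    if isinstance(func, ast.Name):
                        callees.add(func.id)  # direct call, e.g. load_config(...)
                    elif isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
                        callees.add(func.value.id)  # e.g. json.loads(...) -> "json"
            graph[node.name] = callees
    return graph


def reachable(graph: dict[str, set[str]], entry: str) -> set[str]:
    """Breadth-first search over the call graph starting from the entry point."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        current = queue.popleft()
        for callee in graph.get(current, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


used = reachable(build_call_graph(SAMPLE), "main")
print("json" in used, "pickle" in used)  # True False
```

Name-based resolution breaks down as soon as a call goes through a value of unknown type: for `obj.loads(data)`, nothing in the syntax says whether `obj` is the `json` module, the `pickle` module, or a user-defined object. Resolving that receiver is where type information becomes necessary.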

In summary, program analysis is essential in software development, and Python poses unique challenges for it. Phantom dependencies in Python underscore the significance of meticulous dependency management for code reliability and security. Understanding these concepts is vital for Python developers aiming to build robust software. This session sheds light on the complexities of program analysis and the pitfalls of phantom dependencies, offering insights into Python development and software reliability.

OWASP Northern Virginia Chapter
11955 Democracy Dr · Reston, VA