On the Security Blind Spots of Software Composition Analysis
16 October 2024
Modern software relies heavily on components. Those components are usually published in central repositories and managed by build systems via dependencies. Because of issues around vulnerabilities, licenses, and the propagation of bugs, the study of those dependencies is of interest, and numerous software composition analysis (SCA) tools have emerged for this purpose. Most existing tools analyse the dependency graph constructed from project metadata (declared dependencies). While this is easy to implement and scales well, there are known issues with the accuracy of the analysis. Recently, improvements have been proposed to address the low precision of this approach. We explore a different yet related problem: the recall of SCA, i.e., whether existing methods miss dependencies on vulnerable components. We demonstrate that for the Java / Maven ecosystem this is indeed the case: clones of vulnerable components (often obfuscated to some degree, or “shaded”) are deployed in Maven Central but not marked as vulnerable in vulnerability databases. This can be exploited for subtle package typo-squatting and confusion attacks that evade detection by SCA tools. We demonstrate that such vulnerable clones can be discovered with rather simple tooling. Our approach is lightweight in that it does not require the creation and maintenance of a custom index but uses Maven Central directly, and it is precise by design as it does not introduce new false positives. We evaluate our approach on 29 vulnerabilities with assigned CVEs. We retrieve over 53k potential vulnerable clones from Maven Central. After running our analysis on this set, we detect 727 confirmed vulnerable clones (86 if versions are aggregated) and synthesize proof-of-vulnerability tests for each of those. We demonstrate that existing software composition analysis tools often miss those exposures.
At the time of submission, those results have led to changes to the entries for ten CVEs in the GitHub Security Advisory Database (GHSA) via accepted pull requests.
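To illustrate what retrieving potential clones directly from Maven Central, without a custom index, might look like, here is a minimal sketch in Python. It is not the authors' tooling. It assumes that the public Maven Central search endpoint accepts a fully-qualified class name query (`fc:`) and returns JSON documents with `g`, `a`, and `v` fields for group, artifact, and version. The function names `build_class_query` and `candidate_clones` are hypothetical. The sketch only lists candidates; the confirmation step described in the abstract (proof-of-vulnerability tests) is out of scope here.

```python
import json
from urllib.parse import urlencode

# Assumption: the public Maven Central search REST endpoint.
SEARCH_ENDPOINT = "https://search.maven.org/solrsearch/select"


def build_class_query(fully_qualified_class, rows=20):
    """Build a search URL for artifacts that contain the given class.

    Assumption: the 'fc:' prefix searches by fully-qualified class name.
    """
    params = {"q": f"fc:{fully_qualified_class}", "rows": rows, "wt": "json"}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def candidate_clones(response_text, original_group, original_artifact):
    """Extract (group, artifact, version) triples from a search response,
    skipping the original component itself.

    Everything returned is only a *candidate* clone: it ships a class with
    the same name, which does not by itself prove that it is vulnerable.
    """
    docs = json.loads(response_text)["response"]["docs"]
    candidates = []
    for doc in docs:
        group, artifact = doc.get("g"), doc.get("a")
        if (group, artifact) != (original_group, original_artifact):
            candidates.append((group, artifact, doc.get("v")))
    return candidates


if __name__ == "__main__":
    # Print the query URL only; fetching it requires network access.
    print(build_class_query("org.example.VulnerableClass"))
```

Because shading typically relocates classes into a different package, a query by fully-qualified name would miss renamed clones; searching by simple class name and comparing class contents afterwards would be a natural extension, under the same assumptions.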
Venue : ACM Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses (SCORED '24)
File Name : scor048-dietrich.pdf