Software Supply Chain Smells: Lightweight Analysis for Secure Dependency Management


Authors: Larissa Schmid, Diogo Gaspar, Raphina Liu

Larissa Schmid∗, Diogo Gaspar∗, Raphina Liu∗, Sofia Bobadilla∗, Benoit Baudry†, Martin Monperrus∗
∗KTH Royal Institute of Technology, Stockholm, Sweden
†Université de Montréal, Montréal, Canada
∗{lgschmid,dgaspar,raphina,sofbob,monperrus}@kth.se †benoit.baudry@umontreal.ca

Abstract—Modern software systems heavily rely on third-party dependencies, making software supply chain security a critical concern. We introduce the concept of software supply chain smells as structural indicators that signal potential security risks. We design and evaluate DIRTY-WATERS, a novel tool for detecting such smells in the supply chains of software packages. Through interviews with practitioners, we show that our proposed smells align with real-world concerns and capture signals considered valuable. A quantitative study of popular packages in the Maven and NPM ecosystems reveals that while smells are prevalent in both, they differ significantly across ecosystems, with traceability and signing issues dominating in Maven and most smells being rare in NPM, due to strong registry-level guarantees. Software supply chain smells support developers and organizations in making informed decisions and improving their software supply chain security posture.

Index Terms—Software Supply Chain, Open Source, Software Security

I. INTRODUCTION

The number of packages provided by package managers is continuously increasing, and software projects rely on more dependencies than ever before [1], [2]. Together, these dependencies and their relationships form the software supply chain, i.e., the network of third-party components, build tools, and distribution infrastructure involved in producing a software system [1], [3]. Reusing third-party libraries in software development reduces development costs by avoiding "reinventing the wheel" [3].
Large-scale reuse also introduces new security risks [4], as substantial trust has to be placed in external parties. For example, installing an average NPM package implicitly means trusting 79 transitive packages and 39 maintainers [5]. These trust relationships are often implicit and largely invisible to developers [1]. As a result, many projects unknowingly depend on vulnerable or even malicious packages [6]: software supply chain attacks refer to attackers compromising a single dependency in order to take control of one or more downstream projects [7]. Developers need to comprehend the complex interdependencies among packages in their supply chain [3]. This is challenging. Packages are installed through package managers, which behave differently across programming languages [8]. When they install dependencies as binaries, these binaries do not necessarily correspond to the available source code [9]. Developers have no reliable means to determine whether a dependency can be audited and trusted before including it in their supply chain, or for post-mortem analysis of an incident [1].

In this paper, we present DIRTY-WATERS, a novel approach to detect software supply chain smells and help developers secure their systems. Extending the concept of code smells [10], a software supply chain smell is a package that matches specific patterns indicating potential security issues, either today or in the future [11]. For example, a dependency that does not link to its source code repository is a supply chain smell, because engineers cannot audit the source code before inclusion or investigate it after a security incident. DIRTY-WATERS statically analyzes three sources of information: 1) dependency files, 2) package registries, and 3) GitHub repositories, to detect software supply chain smells. DIRTY-WATERS generates user-friendly reports to help stakeholders assess the quality of their software supply chains.
To evaluate the relevance and usefulness of these smells, we perform two complementary evaluations. First, we conduct structured interviews with practitioners to gather their assessment of the smells and to understand if DIRTY-WATERS could integrate into existing software supply chain security practices. Second, we study the prevalence of these smells in practice, by conducting a quantitative analysis of the packages in the supply chain of the 50 most-depended-upon packages in two ecosystems, Maven and NPM.

Our results show that DIRTY-WATERS captures security-relevant signals that practitioners consider important and could meaningfully strengthen existing software supply chain security processes. Practitioners also proposed additional indicators, especially related to project and maintainer health, that can inform future extensions of DIRTY-WATERS. Our quantitative analysis shows that a non-negligible subset of dependencies exhibits smells warranting further investigation, with clear ecosystem differences: Maven projects are more frequently affected by traceability and signing gaps, issues practitioners rated as high or critical, while most smells in NPM remain comparatively rare, partly due to stronger registry-level guarantees.

The concept of software supply chain smells is novel and important. State-of-the-art software supply chain management relies on: software composition analysis (SCA) tools [12] that scan dependencies and match them against known vulnerability databases; Software Bills of Materials (SBOMs) [13], which provide machine-readable inventories of components and their relationships; and security frameworks, such as in-toto [14] and SLSA [15], that define guidelines for securing build and release processes.
None of these tools and guidelines help developers identify dependencies that are not trustworthy, nor do they highlight which dependencies in a project's supply chain deserve closer inspection. To the best of our knowledge, this paper provides the first systematic and empirically grounded account of software supply chain smells, combining a well-defined taxonomy, practitioner feedback, and a large-scale cross-ecosystem analysis.

To summarize, our contributions are as follows:
• A novel, systematic taxonomy of software supply chain smells, informed by industry.
• An open-source tool, DIRTY-WATERS¹, for automatically detecting those software supply chain smells across two ecosystems (Maven and NPM). DIRTY-WATERS can be readily integrated into modern software engineering CI/CD workflows.
• A qualitative evaluation of the smells and their perceived severity through interviews of 11 senior engineers.
• A quantitative analysis of the prevalence of software supply chain smells in the dependencies of 50 heavily depended-upon projects across two ecosystems.

II. PROBLEM STATEMENT

Modern software development heavily relies on third-party components, and even small projects may depend on hundreds of external packages maintained by different parties [5]. A software supply chain comprises the set of libraries, tools, and infrastructure involved in developing, building, and publishing a software artifact [3]. This definition is recursive: the supply chain of an application is the transitive closure of the supply chains of all its dependencies. This scale makes software supply chains difficult to reason about and challenging to secure. Software supply chain attacks exploit the implicit trust placed in these opaque dependencies by inserting malicious code into components of the supply chain, allowing the attack to propagate to downstream consumers [16].
Attack vectors include the publication of malicious packages, compromise of maintainer accounts, or tampering with build and release infrastructure [17]. Due to the transitive nature of dependencies, a single compromised component can affect a large number of projects, resulting in a wide attack blast radius and delayed detection [5].

In practice, developers have to make critical decisions that affect software supply chain security primarily in two situations: when adding a new dependency, and when updating existing dependencies. In both cases, developers need to assess whether the change is acceptable. Inspecting changes in dozens of direct dependencies is cumbersome. Consider transitive dependencies, which can quickly number in the hundreds and thousands: developers cannot truly assess the impact of a dependency change, and are left with blind approvals of dependency changes.

The problem addressed in this paper is the lack of automated signals to evaluate the security of software dependencies. Adding or updating dependencies may affect reliability, but comprehensive test suites usually catch such issues. Security, however, remains largely invisible, and existing tooling offers only limited support. At best, dependency management tools may report known vulnerabilities in the dependencies (e.g., `npm audit`), but they provide no support in assessing the authenticity of the resulting dependency tree. Consequently, even with the best tools available, developers lack essential tooling regarding traceability, integrity, and provenance in their software supply chains. Developers need effective, scalable signals to analyze dependencies for review and to make informed security decisions when accepting new dependencies or dependency version updates.

¹ https://github.com/chains-project/dirty-waters

III. TAXONOMY OF SOFTWARE SUPPLY CHAIN SMELLS

The well-accepted concept of code smell refers to characteristics in the source code that indicate a potentially deeper problem, or will trigger one in the future [10]. Extending this concept, we define the concept of a software supply chain smell. A software supply chain smell is a package in the dependency tree that matches specific patterns, which indicate potential security issues, current or to come in the future. In this section, we introduce the first taxonomy of software supply chain smells. First, we discuss our methodology to identify the smells in subsection III-A. Next, we present the taxonomy of smells in subsection III-B.

A. Methodology

To identify software supply chain smells, we followed an iterative, practitioner-informed process. We conducted three workshops on software supply chain security with practitioners from development and security teams in Swedish companies, complemented by informal follow-up conversations. Based on these discussions, we derived an initial set of smells reflecting structural indicators that practitioners associate with reduced trust in dependencies. We then reviewed related work on dependency smells to ensure that the proposed smells were original and not already covered by existing dependency smell taxonomies. The resulting smells were implemented in DIRTY-WATERS, which is available as open-source software, and piloted on real-world projects to validate feasibility and gather further feedback. To evaluate the relevance and severity of the smells, we subsequently conduct a practitioner study with a separate group of practitioners (cf. section V), providing empirical validation of the collected smells.

B. Software Supply Chain Smells

In the following, we introduce the software supply chain smells we collected. We also outline potential security issues that each smell indicates and the attacks it enables.
Table I shows an overview of smells and related attack vectors.

ID | SSC Smell                | Distribute Malicious Version | Develop Malicious Package | Inject Into Sources | Take Advantage of Vulnerabilities
 1 | No Source Code URL       | ✓ | ✓ | - | -
 2 | Invalid Source Code URL  | ✓ | ✓ | - | -
 3 | Inaccessible Release Tag | - | - | ✓ | -
 4 | Deprecated               | - | - | - | ✓
 5 | Fork                     | - | - | ✓ | -
 6 | No Code Signature        | ✓ | - | - | -
 7 | Invalid Code Signature   | ✓ | - | - | -
 8 | Aliased                  | - | ✓ | - | -
 9 | No Provenance            | - | - | ✓ | -

TABLE I: Overview of collected Software Supply Chain Smells (SSCS) and related Software Supply Chain (SSC) attacks.

a) No Source Code URL: The package's metadata does not include a URL for its associated source code repository. Without access to the source code, there is no insight into the code that is included and run when including this package in projects. This makes it impossible to audit the package for security vulnerabilities and malicious code. Some legitimate packages may lack public repositories, especially proprietary ones, so this smell does not always indicate malicious intent. However, if the package is meant to be open source, there is a strong expectation that its source code should be publicly available and linked from its metadata. Its absence signals poor transparency or potential concealment of malicious intent: this smell is a facilitator of the attack vectors "distribute malicious version of legitimate package" and "develop and advertise distinct malicious package from scratch" [16]. The ctx PyPI incident [18] is an example where a linked source code repository enabled the detection of the distribution of a malicious version of a legitimate package. There, developers noticed that a new package release had appeared even though the linked GitHub repository had not been updated for years. This mismatch raised suspicion and enabled early detection of the compromise.
The incident highlights that having a linked source code repository is essential for identifying irregularities that may signal a supply chain attack.

b) Invalid Source Code URL: The package's metadata provides an invalid source code URL, e.g., leading to a 404 error. Similar to the previous smell, this prevents checking the source code for malicious content or vulnerabilities, enabling the same attack vectors. An invalid URL may be due to outdated or mistyped metadata after moving a repository. It generally indicates poor maintenance rather than malicious intent. However, it can also be a deliberate attempt to deceive users and simulate legitimacy by creating a false sense of transparency with a fake URL.

c) Inaccessible Commit SHA/Release Tag: The source code repository of the package lacks the commit SHA or release tag specified in the package's metadata. Even though the repository provides access to the source code of the dependency, the actual version used is unknown. Without access to the release tag, it becomes impossible to trace the exact source code for a given package version and to analyze it for a malicious payload. This is of utmost importance for package registries of binary code, such as Maven, or of bundled/minified code, such as NPM. Moreover, not knowing the exact version hinders the investigation of the causes of a security incident, as developers cannot determine if the version of the dependency that is in the supply chain is vulnerable to a certain attack. The lack of the commit SHA or release tag may result from poor release management, or accidental deletion or rewriting of tags. It could also be a deliberate attempt to hide unauthorized code changes. While not all of these reasons indicate malicious intent, the absence of a commit SHA or release tag weakens traceability and prevents reproducible builds of the package.
This is connected to the attack vector "inject into sources of legitimate package": without knowing the exact version, it is impossible to know if malicious code has been injected into an otherwise authentic repository.

d) Deprecated: The package is marked as deprecated by its maintainers, either due to lack of maintenance or because of known vulnerabilities. This indicates that the package is no longer actively updated, making it more likely to contain unpatched security issues and exposing dependent projects to known attacks. Deprecation can be indicated via package metadata (on the package registry) or source repository metadata (e.g., on GitHub). Malicious actors have full access to this information and can exploit it not only by taking advantage of known vulnerabilities but also by taking advantage of unmaintained infrastructure, such as in the attack on the ctx library, where the attacker was able to obtain ownership of the e-mail address associated with the project maintainer because the domain had expired [18].

e) Fork: The source code repository provided by the producer is a fork of an upstream repository. While forks are common in open-source development, they introduce uncertainty about the authenticity of the package. There are legitimate reasons for forks to exist, such as adding custom features, adapting code to special execution contexts, or maintaining compatibility with legacy systems. However, forks can also serve as an attack strategy to disguise malicious changes to the original source code and trick consumers into using compromised versions of trusted projects [19]: attackers may introduce backdoors, add malicious installation or build steps, or replace dependencies with compromised versions while preserving much of the original code structure, relating to the "inject into sources of legitimate package" attack vector [16].
When published in a package registry under a similar or misleading name, this can mislead both automated scanners and human reviewers who assume the fork is the legitimate upstream. A notable example is the hosting of cryptocurrency mining malware on GitHub [20], demonstrating how forking can be abused to create deceptive repositories that appear legitimate.

f) No Code Signature: The package release is not cryptographically signed. Signatures are used to verify the authenticity of a package and can also be used to verify its integrity. Without a signature, the package's contents may have been tampered with during transmission, or the package may have been sourced from an unofficial source. The absence of signatures may be caused by ecosystem limitations, missing tooling support, or low awareness of signing practices, meaning the absence of a signature is not always malicious. But omitting a signature can also be of malicious intent, to obscure a package's origin or that it has been tampered with using a person-in-the-middle attack, relating to the "distribute malicious version of legitimate package" attack [16]. However, we note that the mere existence of a code signature is not sufficient to establish trust. After confirming that a package is signed, the identity of the owner of the key must be verified, and their trustworthiness must be evaluated: a valid signature only proves that the artifact was produced by a particular key, not that the key belongs to, and was used by, a legitimate maintainer. This verification step is inherently non-automatable and requires human judgment or organizational trust policies. Code signing complements provenance: while provenance attests to how and where software was built, code signatures verify who released it and that it has not been modified since.

g) Invalid Code Signature: The package release holds an invalid signature status. This can indicate several serious issues.
It may mean that the package was tampered with after signing, that the signing key does not match the expected one, or that the signature was created using an expired or revoked key. Malicious actors may also intentionally provide an invalid signature to give the illusion of compliance with signing requirements while avoiding verification. An invalid signature does not have to stem from malicious intent, as it could also be due to technical errors, such as expired certificates or broken signing workflows. However, the presence of an invalid signature means the package's authenticity cannot be verified. The CCleaner compromise [21] illustrates this smell: while the malicious version of the software was signed with a certificate that was valid at the time, the certificate was revoked once the breach was discovered. As before, the mere presence of a signature is not enough to trust the package; it is important to also verify the owner of the key.

h) Aliased: The package installs one or more of its dependencies under an alias. Aliasing is the practice of installing a dependency under an alternative, self-chosen name, allowing a package to install and reference different versions of the same dependency under different names. The self-chosen name is then used in the rest of the code base to refer to the dependency. While aliasing can be useful in some contexts, it also reduces transparency in dependency resolution and can obscure the true source or identity of a dependency, making it harder to verify which exact package is being installed and from where. Malicious actors may take advantage of this by redirecting alias targets to dependencies from attacker-controlled sources, compromising downstream systems without obvious signs of tampering.
Moreover, aliasing introduces opportunities for attackers to upload malicious packages that match common alias names, exploiting users or automated systems that assume the alias corresponds to a trusted dependency. This smell is a realization of the attack vector "develop and advertise distinct malicious package from scratch" [16].

i) No Provenance: A provenance file, aka build attestation, is cryptographically signed metadata about a software artifact and its build, containing information on how and where it was built. A package with provenance gives substantial security guarantees about the source code origin. One of the ecosystems that supports provenance is NPM [22], allowing developers to verify where a package was built and who published it. Sigstore [23] is used to sign and log this information, allowing consumers to verify the authenticity and integrity of the package. Frameworks such as SLSA [15] and SSDF [24] also require verifiable provenance of software artifacts to ensure that they can be trusted. Without provenance, consumers of the package have no guarantee that the package was built from the claimed source code or by a trusted entity. Missing provenance enables several attack vectors. Attackers could replace the artifact or compromise build environments to inject malicious code, enabling the "inject during the build of legitimate package" [16] attack vector. A recent example of this is the s1ngularity attack [25] targeting the Nx build system package, where attackers were able to inject code into the GitHub workflow and then publish a compromised package. Notably, the compromised package was published without provenance. We acknowledge that systematic provenance is an ambitious goal. The lack of provenance today mostly stems from ecosystem limitations, missing tooling, or low awareness.
However, our research is forward-looking, and, in the future, the lack of provenance will indicate efforts to conceal untrusted build environments or unauthorized code modifications.

C. Discussion

It is important to note that the absence of smells does not guarantee that a package is safe to use, as the smells listed mostly target the metadata attached to a package to assess its authenticity and origin, but do not check its contents. However, the absence of smells is a prerequisite for further analysis of the package's security. Code analyses can find vulnerabilities in imported code, but if the code is not available, it becomes impossible to analyze it. With the large size of dependency trees and the high cost of code analyses, checking for our software supply chain smells is a proxy to assess whether it is worth putting more effort into analyzing a package. Security scanners often query vulnerability databases that contain information about vulnerabilities associated with packages and their versions; however, if the commit SHA or release tag is missing in the package metadata, a user cannot be sure that the information in the database applies to the package used. Some organizations may have security policies that trust packages from certain organizations or maintainers; however, if no signature is provided, it is impossible to determine whether the package was indeed issued by them.

[Figure 1: information sources (dependency list, package registry, source code repository) feed a three-step pipeline over a software project: 1. Extract Project Dependencies; 2. Fetch Metadata and check for software supply chain smells, repeated per dependency; 3. Generate Software Supply Chain Smells Report.]

Fig. 1: Workflow of DIRTY-WATERS.
Based on the dependency tree of a project, it gathers analytics from the dependencies and the package registries to generate a software supply chain smells report.

IV. DIRTY-WATERS DESIGN & IMPLEMENTATION

In this section, we present the design of DIRTY-WATERS, a novel tool for checking software supply chain smells and informing developers in their dependency security decisions. The core novelty of DIRTY-WATERS is that it detects smells through the combined analysis of 1) the dependency manifest in the repository, 2) the remote package registry, and 3) the source code repository (Section IV-A). Moreover, we design an integration of the analysis into CI/CD pipelines by designing a GitHub action to aid adoption (Section IV-B).

A. Workflow

Figure 1 shows an overview of the workflow of DIRTY-WATERS. DIRTY-WATERS receives as input the project name and version to be analyzed, and the package manager it uses. The workflow consists of three stages. First, the tool extracts the complete dependency list, including both direct and transitive dependencies, from the project's dependency manifest, whose format depends on the package manager used by the project. Next, it fetches and accumulates metadata for each extracted dependency from several sources. For this, the tool starts by fetching metadata from the package registry. If that contains a valid GitHub repository URL, smells pertaining to the repository are checked (Inaccessible Release Tag, Fork). Information about aliasing, i.e., if and which dependencies are installed under alternative names to allow referencing different versions of the same package (cf. III-B), is extracted from the dependency list. The dependency list explicitly defines aliases, mapping an alias name to a specific package source or version.
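For illustration, alias extraction on NPM can be sketched as follows. This is a minimal sketch, not DIRTY-WATERS's actual implementation; it assumes a v1-style package-lock.json with a nested "dependencies" map, where an aliased dependency is recorded with a version specifier of the form "npm:<real-name>@<version>":

```python
import json

def find_aliases(lockfile_path):
    """Scan an npm v1-style lockfile for aliased dependencies.

    In package-lock.json, an aliased dependency carries a version
    specifier "npm:<real-name>@<version>", so the key under
    "dependencies" is the alias, not the real package name.
    """
    with open(lockfile_path) as f:
        lock = json.load(f)

    aliases = {}

    def walk(deps):
        for name, info in deps.items():
            version = info.get("version", "")
            if version.startswith("npm:"):
                # e.g. "npm:lodash@4.17.21" -> the package behind the alias
                aliases[name] = version[len("npm:"):]
            walk(info.get("dependencies", {}))  # recurse into transitive deps

    walk(lock.get("dependencies", {}))
    return aliases
```

Because the lockfile records the fully resolved tree, such a traversal surfaces aliases introduced by transitive dependencies as well, not only those declared directly by the project.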
Finally, the collected data is stored in a file that will later be used to generate the summary of software supply chain smells.

1) Extract Project Dependencies: To extract both direct and transitive dependencies, DIRTY-WATERS analyzes the given project's dependency manifest. DIRTY-WATERS is independent of the specific ecosystem and relies on the respective package manager to provide the complete, resolved list of dependencies. For NPM, the lockfile is used, as it records the exact versions and sources of all dependencies, including transitive ones. Unlike the dependency manifest, which specifies version ranges or constraints, the lockfile captures the resolved, concrete dependency tree. For Maven, we utilize the maven-dependency-plugin for parsing the pom.xml to retrieve the complete dependency tree, including the transitive dependencies.

2) Fetch Metadata: DIRTY-WATERS retrieves various metadata according to the different smells introduced in Section III-B. As shown in Figure 1 and Table II, DIRTY-WATERS utilizes different sources to analyze their prevalence: the package registry is queried for the source code URL, deprecation status, code signature, and provenance file. To check the accessibility of the source code URL, DIRTY-WATERS sends a request and verifies the HTTP response. If the link is accessible, DIRTY-WATERS then assesses whether the repository is a fork from data in the source code repository. Moreover, the accessibility of the release SHA is assessed; if none is present or it is found to be invalid, the accessibility of the release tag is assessed based on the common tagging patterns identified by Keshani et al. [26]. To check if a package is aliased, DIRTY-WATERS utilizes the extracted dependency list from the project that is checked, as the dependency list contains information about aliases used directly and in transitive dependencies.
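The registry-based checks can be sketched for NPM as follows. This is again an illustrative sketch under our own function names, not the tool's code; it assumes the public NPM registry endpoint https://registry.npmjs.org/<package>, whose metadata document carries an optional "repository" field (either a string or an object with a "url" key):

```python
import json
import re
import urllib.error
import urllib.request

NPM_REGISTRY = "https://registry.npmjs.org"

def fetch_npm_metadata(package):
    """Fetch a package's metadata document from the NPM registry
    (network access required)."""
    with urllib.request.urlopen(f"{NPM_REGISTRY}/{package}") as resp:
        return json.load(resp)

def repo_url(metadata):
    """Extract and normalize the source code URL from registry metadata.
    Returns None when the 'No Source Code URL' smell applies."""
    repo = metadata.get("repository")
    if not repo:
        return None
    url = repo.get("url", "") if isinstance(repo, dict) else str(repo)
    url = re.sub(r"^git\+", "", url)  # "git+https://..." -> "https://..."
    url = re.sub(r"\.git$", "", url)
    return url or None

def is_accessible(url, timeout=10):
    """Check for the 'Invalid Source Code URL' smell: does the URL
    resolve without an HTTP error (e.g., a 404)?"""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except (urllib.error.URLError, ValueError):
        return False
```

A missing "repository" field flags smell 1, and a URL that fails the accessibility check flags smell 2; the repository-level checks (fork status, release tags) would then operate on the normalized URL.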
The collected data is stored in a JSON file for report generation, which we call the "Dirty Pond". Due to constraints in Maven's ecosystem and its limited package metadata, not all smell checks are supported for Maven projects, namely deprecation and provenance. Moreover, aliasing is not supported by Maven. Table II shows an overview of the supported smell checks by package manager.

3) Generate Software Supply Chain Smells Report: Utilizing the Dirty Pond, DIRTY-WATERS generates a comprehensive, human-readable Markdown summary of the analysis. The goals of the report are to raise awareness of supply chain smells, support informed decision-making and prioritization of issues, and improve transparency about the trustworthiness of dependencies. Therefore, the report has to prioritize clarity and readability to make results understandable to developers. This includes ensuring actionability by outlining next steps for mitigation, maintaining traceability by linking each smell to the affected packages, and providing contextual explanations that describe why each issue matters and what risks it introduces.

Fig. 2: Excerpt of a software supply chain smells report generated by DIRTY-WATERS.

Figure 2 shows an excerpt of a report generated by DIRTY-WATERS for the qos-ch/slf4j package: the report begins with instructions on how to interpret the results, followed by key points related to the software supply chain smells uncovered by DIRTY-WATERS and why they matter, together with a list of packages that expose each smell. Importantly, the report includes a "Call to Action" section to provide guidelines for developers to fix the smells:

a) No or Invalid Source Code URL, Inaccessible Commit SHA/Release Tag: Consumers should submit a pull request to the dependency's maintainer, requesting correct repository metadata and proper tagging.
b) Deprecated: Confirm the maintainer's deprecation intention and double-check for alternative versions that are not deprecated.

c) Fork: Inspect the package and its GitHub repository to verify that the fork is not malicious.

d) No Code Signature: Open an issue in the dependency's repository to request the inclusion of code signing in the CI/CD pipeline.

e) Invalid Code Signature: Verify the code signature and contact the maintainer to fix the issue.

f) Aliased Package: Check the aliased package and its repository to verify that the alias is not malicious.

g) No Provenance: Open an issue in the dependency's repository to request the inclusion of provenance and build attestation in the CI/CD pipeline.

Note that, while this is still cutting-edge today, we believe that this is the future of software supply chain security.

ID | SSC Smell                | Source | NPM | Maven
 1 | No Source Code URL       |   P    |  ✓  |  ✓
 2 | Invalid Source Code URL  |   P    |  ✓  |  ✓
 3 | Inaccessible Release Tag |   S    |  ✓  |  ✓
 4 | Deprecated               |   P    |  ✓  |  ×
 5 | Fork                     |   S    |  ✓  |  ✓
 6 | No Code Signature        |   P    |  ✓  |  ✓
 7 | Invalid Code Signature   |   P    |  ✓  |  ✓
 8 | Aliased                  |   D    |  ✓  |  ×
 9 | No Provenance            |   P    |  ✓  |  ×

TABLE II: Overview of supported Software Supply Chain Smell (SSCS) checks by package manager and information source: P = package registry, S = source code repository, D = dependency list. Supported checks are indicated by ✓, checks not supported by the package manager by ×.

B. Integration in Continuous Integration

It is crucial to integrate static analysis tools into modern software engineering workflows to foster their adoption [27]. Therefore, we design DIRTY-WATERS-ACTION, a GitHub action that can be integrated into GitHub CI/CD workflows to provide automatic feedback to project developers and maintainers on the security of their software supply chain. The idea is that when the CI is triggered, DIRTY-WATERS is automatically executed after checking out the project's repository at a certain version and installing its dependencies.

Failing the build.
DIRTY-WATERS-ACTION can fail the CI if high-severity smells are found. This enforces immediate attention to critical supply chain issues and ensures that developers address them before merging or deployment. By failing the build and making supply chain security a mandatory check in the build process, DIRTY-WATERS-ACTION elevates software supply chain security from a secondary consideration to a first-level concern for developers. While this enforces stricter security standards, the configuration of DIRTY-WATERS-ACTION allows balancing security enforcement with development flexibility by triggering failures only for high-severity or threshold-exceeding smells, while less critical issues are shown as informative reports.

Automated reporting. DIRTY-WATERS-ACTION publishes the generated software supply chain report as a comment to the pull request or commit, depending on the configuration. This raises awareness by clearly showing which smells exist in the supply chain, helping developers understand risks early and enabling them to assess whether they want to proceed with the change or review it and reduce the number of smells before merging. Attaching the report directly to the PR or commit also provides traceability and documentation, allowing teams to track issues over time and review the history of supply chain problems and their resolution.

V. PRACTITIONER STUDY: EVALUATION OF DIRTY-WATERS SMELLS

To assess the relevance of DIRTY-WATERS (cf. Section III-B), we conduct a campaign of structured interviews with qualified practitioners to answer the following research questions:

RQ1: How severe do practitioners consider DIRTY-WATERS smells?
RQ2: What additional smells do practitioners consider important?

First, we describe the methodology of our user study in Section V-A.
We then report the results and discuss them in two parts: first, practitioners' severity ratings of our proposed smells (Section V-B), and second, the additional smells they suggested (Section V-C).

TABLE III: Years of experience (YoE), sector, and tech stack of the interviewees. Primary tech stack used in bold.

ID  | YoE | Sector                     | Tech Stack
P1  | >20 | IT Consulting              | C/C++, Linux
P2  | 18  | Information Technology     | Java
P3  | 18  | Game Development           | .NET
P4  | >20 | Consulting – Public Sector | Java, JavaScript
P5  | >20 | Information Technology     | Java
P6  | >20 | Information Technology     | Go
P7  | >20 | Information Technology     | Go, Linux (Debian)
P8  | >20 | Information Technology     | Java
P9  | >20 | Information Technology     | Python, Java
P10 | 10  | Information Technology     | Rust, Python, Node.js, Nix
P11 | 20  | Information Technology     | Python

A. Methodology

To evaluate how serious practitioners consider the proposed smells, we conducted interviews with industry professionals who have experience with software supply chain security. This included developers, team leads, and others involved in dependency-related engineering decisions. We contacted potential participants by e-mail, selecting them from a mailing list they had previously subscribed to, which we use to share news about our software supply chain research. The e-mail included a short introduction to our research and an outline of the interview. After they agreed to take part and signed a consent form, we held semi-structured interviews on Zoom, each lasting about 30 minutes. During the interviews, we first asked participants to introduce themselves. We then introduced the DIRTY-WATERS smells from Section III-B to them one by one and asked them to rate the severity of each smell regarding its impact on security, with respect to maliciousness and vulnerability. We also asked for a short reasoning why they felt the rating was appropriate. Next, they were asked to mention any smells they felt were missing.
Finally, they were given the opportunity to describe their current practices for handling software supply chain security. We recorded each interview and transcribed it to text using Whisper (https://github.com/openai/whisper). In total, we interviewed eleven practitioners (cf. Table III). Eight of them had more than twenty years of professional experience, two had eighteen years, and one had around ten years. Nine participants worked in security, while one focused on quality assurance and one was a development manager. Their technical backgrounds were diverse: four primarily worked with Java, two each with Go and Python, and one each with C/C++, .NET, and Rust. Companies they worked for included, but are not limited to, Google, Oracle, Sonatype, and Keyfactor. The opinions shared by the interviewees, however, do not represent an official statement by these companies.

B. RQ1: Rating of Smells

In this section, we present and discuss how practitioners rate the DIRTY-WATERS smells introduced in Section III-B.

1) Results: Table IV shows how the interviewed practitioners rated the severity of our software supply chain smells. Six out of nine smells received at least one rating as critical. Invalid code signature showed strong consensus as a critical smell, with eight participants rating it as critical. Participants mentioned that an invalid signature is a sign of broken integrity and "that something is not as it should be" (P10) to them. One participant (P7), however, classified this smell as low, noting that the signature could become invalid due to an expired key, which would not bother them. Invalid source code URL was rated as critical by four participants, and as high by another three.
Participants reasoned that this signals that the code cannot be trusted and can be a blocker for using it: "You cannot trust the source code" (P2), "[I] would like to know what the source code is and I cannot get the answer to that question" (P7), "that would be a blocker" (P4). Similarly, no source code URL received three ratings as critical and another five as high, with participants noting that there should be no reason not to give the URL in open source development, and that they would therefore not trust the package; for example, P2 noted "I would not trust that (...) and I would not use such a package" and stated that "we should always require that the source code is available". Participants also perceived the inaccessible commit SHA/tag smell as serious, with two ratings as critical and five as high, mentioning that knowing the exact version of a package is a prerequisite for reproducible builds (P5) and that not having it could indicate that "someone is tampering with the package" (P4). P6 stated that this smell is "not as bad as no source code at all, but still bad".

No code signature received more mixed ratings: two participants rated it as critical, and another four as high, reasoning that they cannot be sure by whom the package is distributed and whether the code has been tampered with. Another four participants, however, rated the smell as low or medium; three because they would not expect to see code signatures in their ecosystem (P3/.NET, P8/Java, and P11/Python), and one stressing that the value of a code signature also "depends on the ability to check by whom it is signed" (P7). The deprecated smell was mostly rated between medium and high. Participants rating the smell as critical noted that "if you have it [the dependency] and it is deprecated, then you should invest work on getting rid of this dependency" (P8), and that they are not allowed to use it at all (P2).
Participants rating the smell as medium mentioned a trade-off between the additional work of switching to another package and security risks, which they would assess on a case-by-case basis. Forks were rated as medium by seven out of eleven participants because they are common and legitimate in open-source development. Yet, eight practitioners still expressed that a forked dependency warrants closer inspection to understand its purpose and legitimacy; for example, P2 noted that "we would clearly look into it and try to figure out why they did the fork" and that "we need to dig into these things fairly in detail before using some third-party dependency".

For aliases, two participants reasoned that they should be avoided for reasons of transparency, but do not pose a security risk in themselves. Another participant mentioned relying on security scanners for a problem like that, while another did not judge aliasing as problematic as long as the alias name of dependencies cannot be used in their own code base: "unless you can use the very same alias in your own code base without defining it yourself (...) I would not look at the aliases" (P10). No provenance, i.e., not providing cryptographically signed metadata about a software artifact and its build (cf. Section III-B), was rated low or medium by nine out of eleven practitioners. This is not because they thought it unimportant, but because provenance is not yet widely adopted in practice. Several participants noted that they would like to see more packages providing provenance, stating for example that "this would be great" (P8), and would therefore ideally like to rate this smell higher. Some participants gave multiple ratings for the same smell, feeling that they are context-dependent; for clarity, we conservatively classified them as the respective lower category.
One participant classified the inaccessible commit SHA/release tag smell as "between medium and high"; another participant rated the absence of a tag as high, but the absence of a commit SHA as critical. Similarly, one participant rated the deprecated smell as medium for "packages with small attack surfaces", and as high for packages with large attack surfaces. Another made the rating of the fork smell dependent on whether the original project was still active (low rating) or not (critical rating). One participant did not rate the provenance smell, as it does not apply to their context, where they build all software from source code themselves. Another participant rated neither the no code signature nor the invalid code signature smell, stating that they are not well-phrased, as the smell should be about checking the owner of the signing key, not whether there is a signature. Five participants did not feel comfortable providing a rating for the aliased smell, as the ecosystems they work with do not support aliasing.

2) Discussion: The interview results demonstrate that our proposed software supply chain smells are meaningful and align well with practitioners' real-world concerns. Most smells were rated as critical by at least one participant. In particular, the smells related to source authenticity and signature validity received consistently high or critical ratings. Severity assessments often depended on the participant's background and ecosystem; generally, participants rated smells as low not only if they thought a smell was not a meaningful indicator of a security risk, but also if it did not apply to their particular context. For example, one participant rated the inaccessible release tag smell as medium because they build the source code themselves. Similarly, two participants rated the no code signature smell as low and medium, respectively, arguing that signatures are not a "must have" due to not being widely adopted yet.
Some practitioners emphasized that smells cannot be judged in isolation. Multiple weak indicators could compound, making the combined risk greater than the sum of individual smells. Others also stressed that changes across versions – such as disappearing provenance, altered signing behavior, or unexpected repository moves – would immediately raise suspicion, sometimes more than a static absence of data. Several participants further highlighted that the absence of a smell does not guarantee that a package is secure to use, reinforcing our view that the smells are best viewed as indicators that reveal fundamental reasons not to trust a package. More broadly, participants noted that they are cautious about introducing new dependencies. Integrating smell detection into existing governance processes could provide actionable transparency to enable informed decisions about new dependencies without adding heavy manual effort. Moreover, several participants mentioned gaps in continuous monitoring of dependencies, noting that dependencies are evaluated when first introduced but not consistently re-assessed when updated. DIRTY-WATERS-ACTION could help fill this gap by offering lightweight, repeatable checks at every update.

TABLE IV: Practitioner ratings of software supply chain smells; L = low severity, M = medium, H = high, C = critical, N = no rating. Most common rating in bold.

ID | Smell                    | L | M | H | C | N
 1 | No Source Code URL       | 1 | 2 | 5 | 3 | 0
 2 | Invalid Source Code URL  | 1 | 3 | 3 | 4 | 0
 3 | Inaccessible Release Tag | 1 | 3 | 5 | 2 | 0
 4 | Deprecated               | 1 | 4 | 4 | 2 | 0
 5 | Fork                     | 3 | 7 | 1 | 0 | 0
 6 | No Code Signature        | 2 | 2 | 4 | 2 | 1
 7 | Invalid Code Signature   | 1 | 0 | 1 | 8 | 1
 8 | Aliased                  | 3 | 2 | 1 | 0 | 5
 9 | No Provenance            | 7 | 2 | 1 | 0 | 1

The high severity ratings validate DIRTY-WATERS's relevance.

Answer to RQ1: The results from the interviews with expert developers show that our smells capture signals practitioners consider valuable.
The collected evidence shows that the proposed smells are grounded in practical needs and inform supply chain security decisions.

C. RQ2: Additional Smells

In this section, we present additional smells proposed by practitioners and discuss if and how they could become part of the smells checked by DIRTY-WATERS.

1) Results: All participants suggested one or more additional smells they considered relevant for assessing the security of a package before adding it as a dependency. Table V shows an overview of the proposed smells. Many interviewees highlighted project and maintainer health, pointing to indicators such as contributor and maintainer activity and identity, i.e., who the maintainers are (mentioned by six interviewees), long periods without updates (three mentions), the age of a package (one mention), irregular release patterns (one mention), and missing security testing (one mention). Others focused on the dependency footprint and usage, noting that an overly complex dependency tree (four mentions), limited real use of a dependency (one mention), or unclear adoption levels by other users (two mentions) can signal risk to them. Several suggestions concerned authenticity and trust, including checks for suspicious name similarity (two mentions), verification of signature owners on top of checking for a valid signature, unexpected changes in project location or build setup, or cases where a repository appears to be a copy rather than a proper fork (one mention each). Interviewees also mentioned build and release integrity issues, such as missing reproducible builds (two mentions) or binaries bundled within source releases (one mention). Finally, a few pointed to behavior- and permission-related smells, for example, packages requesting many access rights (one mention) or exhibiting unexpected network behavior during install or runtime (one mention).
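One of the suggested indicators, suspicious name similarity, lends itself to a lightweight metadata-only check. A minimal sketch of such a check follows; the shortlist of popular package names, the distance threshold, and the function names are illustrative assumptions, not part of DIRTY-WATERS:

```python
def edit_distance(a: str, b: str) -> int:
    # Classic dynamic-programming Levenshtein distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

# Hypothetical shortlist of popular package names to compare against.
POPULAR = ["requests", "lodash", "express", "numpy"]

def name_similarity_smell(candidate: str, max_distance: int = 2) -> list[str]:
    """Return popular packages whose name is suspiciously close to `candidate`."""
    return [p for p in POPULAR
            if p != candidate and edit_distance(candidate, p) <= max_distance]

print(name_similarity_smell("reqeusts"))  # flags "requests"
```

A threshold of 2 also catches adjacent-character transpositions, a common typosquatting pattern; in practice such a check would compare against registry-wide popularity data rather than a fixed list.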
2) Discussion: The additional smells proposed by practitioners cover a wide range of potential indicators for issues. Several of these suggestions could be integrated into future extensions of DIRTY-WATERS (cf. Table V), as they align with our design goal of supporting smells that can be extracted reliably from metadata. For example, long periods without updates or the age of a package could be derived from release timestamps, while adoption levels could be approximated using download statistics from package registries. Similarly, suspicious name similarity – often associated with typosquatting attacks – could be detected using package naming metadata and existing tooling for detecting potential typosquatting packages [28]. Some build-related suggestions, such as checking for reproducible builds or binaries bundled within source releases, could also be incorporated, although they may introduce trade-offs in terms of analysis time and are more ecosystem-specific (e.g., particularly relevant for Maven/Java).

Several other suggestions, however, fall outside the scope of our current analysis. Indicators related to project and maintainer activity and identity, irregular release patterns, or the "reasonableness" of a dependency tree are typically gradual and highly contextual, making them difficult to express as generalized smells that are either present or absent. Likewise, signals such as missing security testing, verification of signature owners, unexpected changes in project location or build setup, or distinguishing copied repositories from legitimate forks are challenging to extract automatically and reliably at scale. Finally, smells based on runtime behavior or requested access rights require executing the software and are therefore incompatible with our static, metadata-based approach. The interview results also show a broader set of practical needs.
Participants expressed interest in understanding the amount of additional transitive dependencies included when using a certain package, sometimes in combination with understanding how much of a package is used in their code, which does not translate into a binary smell. To our knowledge, this is the first detailed account in the academic literature of how practitioners themselves evaluate supply chain smells and what additional signals they consider relevant. Our findings provide novel insight into real-world needs and expectations that existing research has not systematically captured.

Answer to RQ2: Practitioners also care about other software supply chain smells related to project health and maintainer identity. The proposed smells call for future research and development based on our grounded findings.

TABLE V: Additional software supply chain smells proposed by practitioners, grouped by category.

Category                    | Proposed smell                 | Mentions | In scope?
Project & maintainer health | Maintainer identity/activity   |    6     | ×
                            | No recent updates              |    3     | ✓
                            | Old package                    |    1     | ✓
                            | Irregular releases             |    1     | ×
                            | No security testing            |    1     | ×
Dependency footprint        | Large dependency tree          |    4     | ×
                            | Unclear adoption               |    2     | ✓
                            | Low real-world usage           |    1     | ✓
Authenticity & trust        | Suspicious name similarity     |    2     | ✓
                            | Project location/build changes |    1     | ×
                            | Copied repository              |    1     | ×
                            | Unverified signature owner     |    1     | ×
Build & release integrity   | No reproducible builds         |    2     | Future
                            | Binaries in source release     |    1     | Future
Behavior & permissions      | Excessive permissions          |    1     | ×
                            | Unexpected network behavior    |    1     | ×

VI. QUANTITATIVE STUDY ACROSS ECOSYSTEMS

In this section, we present a quantitative analysis of software supply chain smells across two package ecosystems. Our goal is to understand how common the smells are in practice and how their prevalence differs between ecosystems. We examine the following research question:

RQ3: What smells do packages in the supply chain of popular packages in different ecosystems exhibit?
First, we describe our methodology for collecting and analyzing packages from each ecosystem in Section VI-A. We then report our findings in Section VI-B, followed by a discussion of threats to validity in Section VI-C.

A. Methodology

We analyze the 50 most depended-on packages in the NPM and Maven ecosystems. We consider a project if it is hosted on GitHub and: 1) for NPM, has a package-lock.json file, 2) for Maven, has a pom.xml file. We use the metric of most depended-on packages as a proxy for ecosystem impact, as retrieved by npm-rank (https://github.com/tristan-f-r/npm-rank) for NPM, and the libraries.io (https://libraries.io/) service for Maven. For each package, we retrieve its GitHub repository path, its most recent version as of November 25, 2025, and the matching release commit or release tag. We use a package's most recent version to ensure that we do not report issues that have already been fixed at the time of analysis. We run DIRTY-WATERS until we obtain fifty successfully completed analyses. For Maven, nine analyses fail, seven due to an error in the maven-dependency-plugin when resolving dependencies, and two due to an error in the resolve-plugin. For NPM, nine analyses fail due to non-supported formatting of lockfiles. We provide the complete list of packages analyzed together with the analysis results as part of our supplementary material (https://github.com/chains-project/dirty-waters-utils/tree/master/scripts/smell-presence-analysis/2025-11).

To answer RQ3, we consider all individual packages that appear in the dependency trees of the 50 analyzed projects per ecosystem. For each package, we analyze whether it directly exhibits any of the defined software supply chain smells, independent of the project that depends on it. This allows us to study how common each smell is among packages in the ecosystem and to compare the distribution of smells between Maven and NPM.
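The inclusion criteria above amount to a simple file-presence filter over candidate repositories. A minimal sketch, assuming repositories are already checked out locally; the helper names and the dictionary layout are hypothetical, not part of DIRTY-WATERS:

```python
from pathlib import Path

# Required manifest per ecosystem, per the inclusion criteria:
# NPM projects need a package-lock.json, Maven projects a pom.xml.
REQUIRED_MANIFEST = {"npm": "package-lock.json", "maven": "pom.xml"}

def eligible(repo_dir: str, ecosystem: str) -> bool:
    """Check whether a checked-out GitHub repository qualifies for analysis."""
    manifest = REQUIRED_MANIFEST[ecosystem]
    return (Path(repo_dir) / manifest).is_file()

def select_candidates(repos: dict[str, str], ecosystem: str, limit: int = 50) -> list[str]:
    """Keep the first `limit` repositories satisfying the inclusion criteria."""
    selected = []
    for name, path in repos.items():
        if eligible(path, ecosystem):
            selected.append(name)
        if len(selected) == limit:
            break
    return selected
```

In the study, candidates are drawn in popularity order, so this filter is applied until fifty analyses complete successfully.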
Fig. 3: Distribution of smells exhibited by packages directly (Maven: n = 1891, NPM: n = 8071). Note that the Deprecated, Aliased, and No Provenance smells are not supported for Maven.

Smell                    | Maven       | NPM
No Source Code URL       | 49 (2.6%)   | 34 (0.4%)
Invalid Source Code URL  | 26 (1.4%)   | 81 (1.0%)
Inaccessible Release Tag | 547 (28.9%) | 464 (5.7%)
Fork                     | 4 (0.2%)    | 86 (1.1%)
Invalid Code Signature   | 44 (2.3%)   | 0 (0.0%)
No Code Signature        | 249 (13.2%) | 2 (0.0%)
Deprecated               | –           | 271 (3.4%)
Aliased                  | –           | 6 (0.1%)
No Provenance            | –           | 8045 (99.7%)

B. RQ3: Direct Smells in Packages

Figure 3 summarizes how many packages exhibit each smell directly, considering all packages present in the dependency trees of the top 50 projects per ecosystem. This amounts to 1891 unique packages for Maven and 8071 for NPM. Knowledge about smells is particularly relevant when adding a new dependency or updating a dependency version, as smells indicate whether a package provides sufficient traceability, integrity, and provenance, supporting informed security decisions.

Let us first discuss the smells for Maven. The most frequent direct smell is inaccessible commit SHA, affecting 547 packages (62.8%). This is significant, as practitioners frequently rated this smell as high or critical due to the resulting loss of traceability and auditability. The strong presence of this smell is evidence that linking released artifacts back to specific source code versions remains a common challenge in the ecosystem. When adding a dependency with this smell, developers cannot reliably trace the released artifact back to a specific source code version, limiting their ability to review changes or respond to incidents. The second most common smell is no code signature (249 packages, 28.6%). Given that Maven has long supported artifact signing [29], this represents a substantial fraction of dependencies lacking integrity and authenticity guarantees.
For a developer evaluating a new dependency, this means there is no cryptographic assurance that the artifact originates from the expected publisher. Other smells are considerably less common: no source code URL (49 packages, 5.6%), invalid code signature (44 packages, 5.1%), invalid source code URL (26 packages, 3%), and forked projects (4 packages, 0.5%).

For NPM, most direct-package smells appear far less frequently in relative terms, though absolute counts remain at around the same level as for Maven due to the larger dataset. Nearly all packages in the dependency tree exhibit the no provenance smell (8045 packages, 99.7%). This aligns with our expectation that the smell is forward-looking, as provenance is not yet widely adopted. The second- and third-most common direct smells are inaccessible commit SHA (464 packages, 5.8%) and deprecated (271 packages, 3.4%). When adding a dependency, these smells signal potential difficulties in reviewing source code or relying on long-term maintenance, respectively. Invalid source code URL (81 packages, 1.0%) and forked projects (86 packages, 1.1%) also occur but remain rare overall. Smells related to signing are nearly absent: no code signature (2 packages, 0.0%) and invalid code signature (0 packages). As packages in NPM are signed by the registry (https://docs.npmjs.com/about-registry-signatures), it is expected to find almost no packages without or with an invalid code signature, highlighting how ecosystem design shapes the supply chain risk profile: developers adding new dependencies in NPM benefit from strong registry-level integrity guarantees by default.
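The per-ecosystem prevalence figures above are simple proportions over the set of unique packages in the dependency trees. A sketch of the computation; the field names in the per-package records are hypothetical, not the actual Dirty Pond schema:

```python
from collections import Counter

def smell_prevalence(packages: list[dict]) -> dict[str, tuple[int, float]]:
    """Count how many unique packages exhibit each smell, and the share of the total."""
    total = len(packages)
    counts = Counter(smell for pkg in packages for smell in set(pkg["smells"]))
    return {smell: (n, round(100 * n / total, 1)) for smell, n in counts.items()}

# Toy example: four packages, two of which lack provenance.
pkgs = [
    {"name": "a", "smells": ["no_provenance"]},
    {"name": "b", "smells": ["no_provenance", "deprecated"]},
    {"name": "c", "smells": []},
    {"name": "d", "smells": ["fork"]},
]
print(smell_prevalence(pkgs)["no_provenance"])  # → (2, 50.0)
```

Deduplicating each package's smells with `set` ensures a package is counted at most once per smell, matching the per-package counting used in Figure 3.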
Answer to RQ3: For Maven, the most common smells are related to traceability and missing code signatures, smells that were mostly rated as high or critical by practitioners, suggesting that Maven would benefit from stronger registry-level guarantees. For NPM, all smells remain relatively rare, demonstrating strong ecosystem awareness of dependency problems.

C. Threats to Validity

We discuss how we address threats to validity as outlined by Wohlin et al. [30] and Runeson and Höst [31].

a) Internal Validity: A first threat concerns the use of the 50 most depended-upon projects per ecosystem as a proxy for popularity. While this excludes less common packages, we are confident that our analysis still provides a reasonable view of typical supply chain practices because these projects anchor large parts of the ecosystem. Tool failures (nine Maven, nine NPM) present another threat, but since our goal is to capture ecosystem-level tendencies rather than evaluate specific projects, we do not expect these missing cases to affect the overall patterns. Dependency resolution and repository mapping may introduce inaccuracies, especially in Maven multi-module projects, where we analyze the full repository. These effects apply broadly and are unlikely to skew comparative results. Reliance on metadata fields also creates a risk of misclassification, as metadata may be incomplete or inconsistent. However, since such inconsistencies are themselves supply chain concerns, detecting them aligns with the purpose of our analysis.

b) External Validity: Our results may not generalize to the full ecosystems, as we sampled only the most depended-upon packages, which are typically better maintained. However, because these projects influence many downstream consumers, they still offer valuable insight into common dependency patterns.
A further limitation is the requirement for a lockfile or standard Maven setup, and the restriction to GitHub-hosted repositories, which excludes some projects. Given the prevalence of GitHub and standard build practices in widely used packages, we expect this to have a limited effect on generalizability.

VII. RELATED WORK

A. Software Supply Chain Attacks

Prior work has proposed taxonomies to classify software supply chain attacks and their attack vectors. Ohm et al. [7] derive a taxonomy from 174 malicious open-source packages. Ladisa et al. [16] present an ecosystem-agnostic attack taxonomy covering all stages of open-source supply chains. Gokkaya et al. [32] extend this line of work by linking attack vectors to mitigations and present a framework for software supply chain risk assessment.

Beyond taxonomies, several approaches aim to detect active software supply chain attacks. Tan et al. [33] mine anomalous runtime behavior to identify compromised dependencies. Zhen et al. [34] present OSCAR, which detects dynamic code poisoning by fully executing packages in sandboxed environments and monitoring behavior through fuzzing and API hooks. Sejfia et al. [35] apply machine learning to classify malicious packages based on observed characteristics, and Reyes et al. [36] study a specific attack technique in Maven exploiting class resolution order at runtime. In contrast, our work does not detect concrete attacks. We focus on structural indicators – software supply chain smells – that undermine trust and transparency in dependencies and increase susceptibility to the attack vectors identified in prior work.

B. Vulnerability-Centric Supply Chain Analysis

Much prior work focuses on identifying known vulnerabilities in software supply chains.
Software Composition Analysis (SCA) tools generate inventories of third-party components and check them against vulnerability databases [37], while static analysis techniques aim to detect vulnerable code patterns directly in source code [38]. Comparative studies show that SCA tools vary significantly in their effectiveness, largely due to differences in the accuracy of the queried vulnerability databases [37]. Several studies highlight the limitations of vulnerability-centric approaches. Imtiaz et al. [39] report a median delay of 17 days between the release of a security fix and the publication of a corresponding security advisory. Mir et al. [40] apply reachability analysis to Maven dependencies and find that less than 1% of packages have reachable call paths to vulnerable code, indicating that unreachable vulnerable code is a major cause of false positives. Other work analyzes vulnerable transitive dependencies at scale using CVE data [41]–[43]. While these approaches center on known vulnerabilities and disclosed security issues, our work targets properties of packages and their metadata that affect trust and auditability independently of known vulnerabilities, enabling risk assessment even in the absence of CVEs or security advisories.

C. Metadata- and Repository-Based Risk Indicators

Prior work has explored metadata- and repository-based indicators to assess risks in software supply chains, including repository accessibility [44], malware injection in artifacts [45], maintainer activity [46], repository popularity [47], and malware in repository forks [19]. These studies typically analyze individual signals in isolation and do not provide a unified view that combines information from package registries and source code repositories. Several approaches focus on registry metadata. Zahan et al. [46] propose security-relevant signals for npm packages with a strong emphasis on maintenance practices, but do not consider repository-level indicators as we do.
PyRadar [48] addresses the quality of registry metadata by mapping packages to their source repositories, but does not assess broader trust or security implications. Other tools, such as Socket [49] and the OpenSSF Scorecard [50], aggregate multiple metadata- and repository-based signals into risk scores or alerts. Empirical evaluations using Scorecard report generally low security scores across projects [51]. Related tooling, such as Heisenberg [52], integrates selected metadata, such as time since release and OpenSSF Scorecard scores, into development workflows. While such risk scoring approaches provide coarse-grained assessments, they often conflate different signals and emphasize development practices. They do not support individual decisions by developers to add or update a dependency based on authenticity and traceability.

Complementary efforts focus on defining or enforcing integrity and provenance guarantees. SLSA [15] specifies security levels and criteria for supply chain artifacts, particularly around provenance, but does not provide automated, ecosystem-wide assessment. Macaron [53] operationalizes parts of SLSA by discovering source repositories, checking for provenance, and detecting regressions over time, yet remains focused on compliance with specific standards rather than a broader set of risk indicators. At the ecosystem level, package managers such as PNPM provide policy mechanisms to prevent security regressions, for example, by blocking updates that remove previously available provenance information [54]. Work on reproducible builds proposes strong integrity signals for software artifacts in ecosystems such as Python and Maven [26], [55], but typically treats reproducibility as a standalone property rather than as part of a broader taxonomy.
In contrast, our work focuses on identifying and systematizing individual risk indicators derived jointly from the package registry, source code repository, and dependency files, providing a unified and actionable framework rather than a single aggregated score.

D. Dependency Smells and Structural Dependency Issues

Prior work on dependency smells focuses primarily on maintainability issues rather than supply chain trust or security. Existing studies identify problems such as bloated dependencies, missing dependencies, and erroneous version constraints [56], [57]. Other work aims to reduce the attack surface by debloating dependency trees [58] or improving dependency resolution reliability [59]. While these approaches address important challenges in dependency management, they do not consider trust, authenticity, or traceability of dependencies as we do in this paper. The concept of smells itself originates from code quality research, where code smells are used as heuristic indicators of technical debt affecting maintenance [10]. Prior work has proposed tools to detect code smells [60] and provide recommendations to developers [61], and more recent studies extend smell-based thinking to build scripts and infrastructure [62]. These approaches demonstrate the usefulness of lightweight, heuristic signals, but focus on internal code and build quality rather than supply chain security. In contrast, our work adapts the smell abstraction to the software supply chain domain, positioning software supply chain smells as security-oriented signals complementary to maintainability-focused analyses.

VIII. CONCLUSION

In this paper, we introduced the novel concept of software supply chain smells: structural indicators for security issues in modern software supply chains. We presented and evaluated DIRTY-WATERS, an open-source tool for detecting supply chain smells across dependency trees.
Our interviews with practitioners showed that the proposed smells align with real-world concerns and provide signals practitioners consider meaningful for assessing supply chain risks. A quantitative study on packages from Maven and NPM shows the prevalence of smells in both ecosystems, with clear differences between them: traceability and signing issues are prevalent in Maven, while most smells are rare in NPM, partly due to stronger registry-level guarantees. To conclude, software supply chain smells are a pragmatic, grounded, and actionable way to reason about software supply chain security.

ACKNOWLEDGMENTS

This work was partially supported by the WASP program funded by the Knut and Alice Wallenberg Foundation, and by the Swedish Foundation for Strategic Research (SSF). Some computation was enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS). The authors acknowledge the use of ChatGPT (OpenAI) to improve the language and grammar in the manuscript.

REFERENCES

[1] R. Cox, "Surviving software dependencies," Commun. ACM, vol. 62, no. 9, pp. 36–43, Aug. 2019. https://doi.org/10.1145/3347446
[2] E. Wittern, P. Suter, and S. Rajagopalan, "A look at the dynamics of the JavaScript package ecosystem," in Proceedings of the 13th International Conference on Mining Software Repositories, ser. MSR '16. New York, NY, USA: Association for Computing Machinery, 2016, pp. 351–361. https://doi.org/10.1145/2901739.2901743
[3] M. Ohm and C. Stuke, "SoK: Practical Detection of Software Supply Chain Attacks," in Proceedings of the 18th International Conference on Availability, Reliability and Security, ser. ARES '23. New York, NY, USA: Association for Computing Machinery, Aug. 2023, pp. 1–11. https://dl.acm.org/doi/10.1145/3600160.3600162
[4] A. Gkortzis, D. Feitosa, and D. Spinellis, "Software reuse cuts both ways: An empirical analysis of its relationship with security vulnerabilities," Journal of Systems and Software, vol. 172, p. 110653, 2021. https://www.sciencedirect.com/science/article/pii/S0164121220301199
[5] M. Zimmermann, C.-A. Staicu, C. Tenny, and M. Pradel, "Small world with high risks: a study of security threats in the npm ecosystem," in Proceedings of the 28th USENIX Conference on Security Symposium, ser. SEC'19. USA: USENIX Association, Aug. 2019, pp. 995–1010.
[6] G.-P. Drosos, T. Sotiropoulos, D. Spinellis, and D. Mitropoulos, "Bloat beneath Python's Scales: A Fine-Grained Inter-Project Dependency Analysis," Proc. ACM Softw. Eng., vol. 1, no. FSE, pp. 114:2584–114:2607, Jul. 2024. https://dl.acm.org/doi/10.1145/3660821
[7] M. Ohm, H. Plate, A. Sykosch, and M. Meier, "Backstabber's Knife Collection: A Review of Open Source Software Supply Chain Attacks," in Detection of Intrusions and Malware, and Vulnerability Assessment, C. Maurice, L. Bilge, G. Stringhini, and N. Neves, Eds. Cham: Springer International Publishing, 2020, pp. 23–43.
[8] Y. Gamage, D. Tiwari, M. Monperrus, and B. Baudry, "The design space of lockfiles across package managers," 2025.
[9] D.-L. Vu, F. Massacci, I. Pashchenko, H. Plate, and A. Sabetta, "LastPyMile: identifying the discrepancy between sources and packages," in Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ser. ESEC/FSE 2021. New York, NY, USA: Association for Computing Machinery, Aug. 2021, pp. 780–792. https://dl.acm.org/doi/10.1145/3468264.3468592
[10] G. Lacerda, F. Petrillo, M. Pimenta, and Y. G. Gueheneuc, "Code Smells and Refactoring: A Tertiary Systematic Review of Challenges and Observations," Journal of Systems and Software, vol. 167, p. 110610, Sep. 2020, arXiv:2004.10777 [cs]. http://arxiv.org/abs/2004.10777
[11] R. Liu, S. Bobadilla, B. Baudry, and M. Monperrus, "Dirty-Waters: Detecting software supply chain smells," in Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering, ser. FSE Companion '25. New York, NY, USA: Association for Computing Machinery, 2025, pp. 1045–1049. https://doi.org/10.1145/3696630.3728578
[12] J. Dietrich, S. Rasheed, A. Jordan, and T. White, "On the Security Blind Spots of Software Composition Analysis," Oct. 2023, arXiv:2306.05534 [cs]. http://arxiv.org/abs/2306.05534
[13] "Cybersecurity and Infrastructure Security Agency: Software bill of materials (SBOM)." https://www.cisa.gov/sbom
[14] S. Torres-Arias, H. Afzali, T. K. Kuppusamy, R. Curtmola, and J. Cappos, "in-toto: Providing farm-to-table guarantees for bits and bytes," 2019, pp. 1393–1410. https://www.usenix.org/conference/usenixsecurity19/presentation/torres-arias
[15] "SLSA: Supply-chain Levels for Software Artifacts." https://slsa.dev/
[16] P. Ladisa, H. Plate, M. Martinez, and O. Barais, "SoK: Taxonomy of Attacks on Open-Source Software Supply Chains," in 2023 IEEE Symposium on Security and Privacy (SP), May 2023, pp. 1509–1526, ISSN: 2375-1207. https://ieeexplore.ieee.org/document/10179304/
[17] L. Williams, G. Benedetti, S. Hamer, R. Paramitha, I. Rahman, M. Tamanna, G. Tystahl, N. Zahan, P. Morrison, Y. Acar, M. Cukier, C. Kästner, A. Kapravelos, D. Wermke, and W. Enck, "Research Directions in Software Supply Chain Security," ACM Transactions on Software Engineering and Methodology, p. 3714464, Jan. 2025. https://dl.acm.org/doi/10.1145/3714464
[18] "Malicious Python library ctx removed from PyPI repo." https://portswigger.net/daily-swig/malicious-python-library-ctx-removed-from-pypi-repo
[19] A. Cao and B. Dolan-Gavitt, "What the Fork? Finding and Analyzing Malware in GitHub Forks," in Proceedings 2022 Workshop on Measurements, Attacks, and Defenses for the Web. San Diego, CA, USA: Internet Society, 2022. https://www.ndss-symposium.org/wp-content/uploads/madweb2022_23001_paper.pdf
[20] "Greedy cybercriminals host malware on GitHub." https://blog.avast.com/greedy-cybercriminals-host-malware-on-github
[21] "Progress on CCleaner investigation." https://blog.avast.com/progress-on-ccleaner-investigation
[22] "Generating provenance statements — npm Docs." https://docs.npmjs.com/generating-provenance-statements
[23] Z. Newman, J. S. Meyers, and S. Torres-Arias, "Sigstore: Software Signing for Everybody," in Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security. Los Angeles, CA, USA: ACM, Nov. 2022, pp. 2353–2367. https://dl.acm.org/doi/10.1145/3548606.3560596
[24] M. Souppaya, K. Scarfone, and D. Dodson, "Secure Software Development Framework (SSDF) version 1.1: Recommendations for mitigating the risk of software vulnerabilities," 2022.
[25] "s1ngularity: Popular Nx build system package compromised with data-stealing malware." https://www.stepsecurity.io/blog/supply-chain-security-alert-popular-nx-build-system-package-compromised-with-data-stealing-malware#indicators-of-compromise-iocs
[26] M. Keshani, T.-G. Velican, G. Bot, and S. Proksch, "Aroma: Automatic reproduction of Maven artifacts," Proc. ACM Softw. Eng., vol. 1, no. FSE, Jul. 2024. https://doi.org/10.1145/3643764
[27] J. Smith, L. N. Q. Do, and E. Murphy-Hill, "Why Can't Johnny Fix Vulnerabilities: A Usability Evaluation of Static Analysis Tools for Security," 2020, pp. 221–238. https://www.usenix.org/conference/soups2020/presentation/smith
[28] "Typosquatting: Detect potential typosquatting packages across package ecosystems." https://github.com/andrew/typosquatting
[29] "How to generate PGP signatures with Maven." https://www.sonatype.com/blog/2010/01/how-to-generate-pgp-signatures-with-maven
[30] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén, Experimentation in Software Engineering. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. https://link.springer.com/content/pdf/bfm:978-3-642-29044-2/1?pdf=chapter%20toc
[31] P. Runeson and M. Höst, "Guidelines for conducting and reporting case study research in software engineering," Empirical Software Engineering, vol. 14, no. 2, pp. 131–164, 2008. https://link.springer.com/content/pdf/10.1007/s10664-008-9102-8.pdf
[32] B. Gokkaya, L. Aniello, and B. Halak, "Software supply chain: A taxonomy of attacks, mitigations and risk assessment strategies," Journal of Information Security and Applications, vol. 97, p. 104324, 2026. https://www.sciencedirect.com/science/article/pii/S2214212625003606
[33] Z. Tan, K. Xiao, J. Singer, and C. Anagnostopoulos, "Operational runtime behavior mining for open-source supply chain security," 2026. https://arxiv.org/abs/2601.06948
[34] X. Zheng, C. Wei, S. Wang, Y. Zhao, P. Gao, Y. Zhang, K. Wang, and H. Wang, "Towards robust detection of open source software supply chain poisoning attacks in industry environments," in Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, ser. ASE '24. New York, NY, USA: Association for Computing Machinery, 2024, pp. 1990–2001. https://doi.org/10.1145/3691620.3695262
[35] A. Sejfia and M. Schäfer, "Practical automated detection of malicious npm packages," in Proceedings of the 44th International Conference on Software Engineering, ser. ICSE '22. New York, NY, USA: Association for Computing Machinery, 2022, pp. 1681–1692. https://doi.org/10.1145/3510003.3510104
[36] F. Reyes, F. Bono, A. Sharma, B. Baudry, and M. Monperrus, "Maven-Hijack: Software supply chain attack exploiting packaging order," 2025. https://arxiv.org/abs/2407.18760
[37] N. Imtiaz, S. Thorn, and L. Williams, "A comparative study of vulnerability reporting by software composition analysis tools," in Proceedings of the 15th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), ser. ESEM '21. New York, NY, USA: Association for Computing Machinery, 2021. https://doi.org/10.1145/3475716.3475769
[38] T. Hastings and K. R. Walcott, "Continuous Verification of Open Source Components in a World of Weak Links," in 2022 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), Oct. 2022, pp. 201–207. https://ieeexplore.ieee.org/abstract/document/9985184
[39] N. Imtiaz, A. Khanom, and L. Williams, "Open or sneaky? fast or slow? light or heavy?: Investigating security releases of open source packages," IEEE Transactions on Software Engineering, vol. 49, no. 4, pp. 1540–1560, 2023.
[40] A. M. Mir, M. Keshani, and S. Proksch, "On the effect of transitivity and granularity on vulnerability propagation in the Maven ecosystem," in 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2023, pp. 201–211.
[41] J. Mahon, C. Hou, and Z. Yao, "PyPitfall: Dependency chaos and software supply chain vulnerabilities in Python," arXiv preprint, 2025.
[42] A. G. Márquez, A. J. Varela-Vaca, and M. T. Gómez López, "A dataset on vulnerabilities affecting dependencies in software package managers," Data in Brief, vol. 62, p. 111903, Oct. 2025. https://www.sciencedirect.com/science/article/pii/S2352340925006274
[43] A. Germán Márquez, A. J. Varela-Vaca, M. T. Gómez López, J. A. Galindo, and D. Benavides, "Vulnerability impact analysis in software project dependencies based on Satisfiability Modulo Theories (SMT)," Computers & Security, vol. 139, p. 103669, Apr. 2024. https://www.sciencedirect.com/science/article/pii/S0167404823005795
[44] A. Tsakpinis and A. Pretschner, "Analyzing the Accessibility of GitHub Repositories for PyPI and NPM Libraries," in Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, Jun. 2024, pp. 345–350, arXiv:2404.17403 [cs]. http://arxiv.org/abs/2404.17403
[45] D. L. Vu, I. Pashchenko, F. Massacci, H. Plate, and A. Sabetta, "Towards Using Source Code Repositories to Identify Software Supply Chain Attacks," in Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS '20. New York, NY, USA: Association for Computing Machinery, Nov. 2020, pp. 2093–2095. https://dl.acm.org/doi/10.1145/3372297.3420015
[46] N. Zahan, T. Zimmermann, P. Godefroid, B. Murphy, C. Maddila, and L. Williams, "What are weak links in the npm supply chain?" in Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice, ser. ICSE-SEIP '22. New York, NY, USA: Association for Computing Machinery, Oct. 2022, pp. 331–340. https://dl.acm.org/doi/10.1145/3510457.3513044
[47] H. Borges, A. Hora, and M. T. Valente, "Understanding the Factors That Impact the Popularity of GitHub Repositories," in 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME), Oct. 2016, pp. 334–344. https://ieeexplore.ieee.org/document/7816479
[48] K. Gao, W. Xu, W. Yang, and M. Zhou, "PyRadar: Towards automatically retrieving and validating source code repository information for PyPI packages," Proc. ACM Softw. Eng., vol. 1, no. FSE, Jul. 2024. https://doi.org/10.1145/3660822
[49] "Socket." https://socket.dev
[50] "OpenSSF Scorecard." https://scorecard.dev/
[51] R. Hegewald and R. Beyer, "Evaluating Software Supply Chain Security in Research Software," Aug. 2025, arXiv:2508.03856 [cs]. http://arxiv.org/abs/2508.03856
[52] "Heisenberg: How we learned to stop worrying and love the SBOM." https://appomni.com/ao-labs/secure-pull-requests-heisenberg-open-source-security/
[53] B. Hassanshahi, T. N. Mai, A. Michael, B. Selwyn-Smith, S. Bates, and P. Krishnan, "Macaron: A Logic-based Framework for Software Supply Chain Security Assurance," in Proceedings of the 2023 Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses. Copenhagen, Denmark: ACM, Nov. 2023, pp. 29–37. https://dl.acm.org/doi/10.1145/3605770.3625213
[54] "Settings: trustPolicy." https://pnpm.io/settings#trustpolicy
[55] "Reproduce: Library to check if a package is reproducible." https://github.com/vltpkg/reproduce
[56] A. J. Jafari, D. E. Costa, R. Abdalkareem, E. Shihab, and N. Tsantalis, "Dependency Smells in JavaScript Projects," IEEE Transactions on Software Engineering, vol. 48, no. 10, pp. 3790–3807, Oct. 2022. https://ieeexplore.ieee.org/document/9519532
[57] Y. Cao, L. Chen, W. Ma, Y. Li, Y. Zhou, and L. Wang, "Towards Better Dependency Management: A First Look at Dependency Smells in Python Projects," IEEE Transactions on Software Engineering, vol. 49, no. 4, pp. 1741–1765, Apr. 2023. https://ieeexplore.ieee.org/document/9832512
[58] P. Pashakhanloo, A. Machiry, H. Choi, A. Canino, K. Heo, I. Lee, and M. Naik, "PacJam: Securing dependencies continuously via package-oriented debloating," in Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, ser. ASIA CCS '22. New York, NY, USA: Association for Computing Machinery, 2022, pp. 903–916. https://doi.org/10.1145/3488932.3524054
[59] C. Paulsen and S. Proksch, "Marco: Compatible version ranges in Maven," in 2025 IEEE International Conference on Software Maintenance and Evolution (ICSME), 2025, pp. 910–914.
[60] B. Cherry, C. Nagy, M. Lanza, and A. Cleve, "SMEAGOL: A Static Code Smell Detector for MongoDB." IEEE Computer Society, Mar. 2024, pp. 816–820. https://www.computer.org/csdl/proceedings-article/saner/2024/306600a816/1YCRoPI4vQI
[61] R. de Mello, R. Oliveira, A. Uchôa, W. Oizumi, A. Garcia, B. Fonseca, and F. de Mello, "Recommendations for Developers Identifying Code Smells," IEEE Software, vol. 40, no. 2, pp. 90–98, Mar. 2023. https://ieeexplore.ieee.org/document/9904005
[62] M. Tamanna, Y. Chandrani, M. Burrows, B. Wroblewski, L. Williams, and D. Wermke, "Your Build Scripts Stink: The State of Code Smells in Build Scripts," Oct. 2025, arXiv:2506.17948 [cs]. http://arxiv.org/abs/2506.17948
