Improper artifact integrity validation is a CI/CD security oversight that allows attackers to inject malicious code into the software delivery pipeline through the artifacts it produces and consumes. Tampering opportunities arise from the blend of internal and third-party resources within CI/CD systems. Failing to verify the integrity of artifacts permits undetected tampering, which can lead to harmful code execution in the pipeline or in production. This oversight results from several factors: weak validation processes, inadequate security controls, and a lack of awareness of the importance of artifact integrity.
CICD-SEC-9, as identified in the OWASP Top 10 CI/CD Security Risks, stems from the risk that an attacker with access to any system within the CI/CD pipeline can push malicious code or artifacts downstream. This risk is exacerbated by insufficient mechanisms for validating the authenticity and integrity of code and artifacts.
As CI/CD processes combine internal resources with third-party packages fetched from assorted locations, the resulting mix creates multiple entry points susceptible to tampering. If a compromised resource infiltrates the delivery process undetected, it can flow through the pipeline, masquerading as a legitimate resource, and potentially reach production environments. Such a breach can lead to the execution of malicious code within CI/CD systems or, more concerning, in live production environments.
An integral part of CI/CD security, artifact integrity validation provides assurance that digital artifacts, such as software packages, containers, and configuration files, remain authentic and unaltered from their original state. The process uses cryptographic methods, digital signatures, and checksums to confirm each artifact's origin and to ensure that the artifact hasn't been tampered with during transit or storage.
By properly validating artifact integrity, organizations can trust that the artifacts they deploy are authentic and free of unauthorized modifications.
Key components of effective artifact integrity validation — in addition to cryptographic checksums and digital signatures — include secure artifact storage, secure transport protocols, and secure key management practices. Each component plays a role in safeguarding the integrity and authenticity of artifacts at different stages of the CI/CD pipeline, from artifact creation and artifact transfer between stages to artifact deployment.
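As a minimal illustration of the checksum component, artifact verification can be sketched with Python's standard-library `hashlib`; the file path and expected digest a pipeline would pass in are hypothetical:

```python
import hashlib


def sha256_digest(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_artifact(path: str, expected_digest: str) -> bool:
    """Return True only if the artifact's digest matches the expected value."""
    return sha256_digest(path) == expected_digest.lower()
```

A pipeline stage would call `verify_artifact` before consuming an artifact and fail the build on a mismatch, rather than letting the artifact proceed silently.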
To understand how improper artifact integrity validation exposes organizations to risk, let's look at a hypothetical attack scenario.
Initial Entry
A seasoned attacker discerns vulnerabilities in a prominent software company's CI/CD pipeline. Recognizing the potential to exploit lapses in artifact integrity validation, the attacker devises a plan to introduce a tampered artifact.
Reconnaissance
The attacker meticulously studies the company's CI/CD process. Noting the expected blend of internal resources with third-party packages, the attacker identifies potential weak points where the integrity of artifacts might lack rigorous validation.
Exploitation
Crafting a malicious library that mimics a widely used third-party package, the attacker infiltrates a mirror repository. By replacing the legitimate library with the tampered version, the attacker sets the stage for the company's CI/CD pipeline to inadvertently pull in the malicious code.
Bypassing Security Gates
The company's CI/CD system fetches the latest version of all dependencies. Due to lapses in secure artifact storage practices, the system unknowingly retrieves the tampered library from the compromised mirror repository.
Although the company employs checksum validation, the attacker, having manipulated the mirror repository, updates the checksum file to match the tampered library's hash. The absence of a multisource validation mechanism allows the malicious library to pass unchecked.
Deployment and Execution
Once the tampered library is fetched and linked during the build process, the resulting application, now tainted with the malicious code from the library, progresses through the pipeline. Upon deployment in the production environment, the concealed malicious code activates, leading to system compromise.
The trustworthiness of artifacts is critical to cloud-native application development. By ensuring that only trustworthy artifacts are deployed, proper artifact integrity validation reduces the possibility of malicious code making it into production environments.
Without such measures, organizations open themselves to security breaches, data leaks, and operational disruptions caused by tampered artifacts that proper validation would have detected before deployment.
Case Study 1: Webmin Falls Victim to Stealthy Server Exploit
Attackers compromised Webmin's development build server in April 2018, introducing a vulnerability into the password_change.cgi script. To conceal the malicious modification, they altered the file's timestamp, and the compromised file shipped in Webmin version 1.890. Although developers reverted the file using the version on GitHub, attackers altered it again by July 2018, impacting versions 1.900 to 1.920. The exploit was active only when a specific feature was enabled. After receiving a zero-day exploit report in August 2019, Webmin promptly removed the exploit and released version 1.930.
Case Study 2: PHP's Internal Security Breach
In early 2021, PHP's git.php.net server suffered a malicious attack in which two malicious commits were pushed under the names of prominent PHP contributors. The incident was initially considered an individual account compromise, but a deeper investigation revealed that the commits bypassed the standard gitolite infrastructure, hinting at a server compromise. The commits were pushed using HTTPS and password-based authentication, raising suspicions of a potential leak of the master.php.net user database. The attacker's ability to authenticate after only a few username guesses further intensified these concerns.
Understanding the risks associated with artifacts highlights the importance of implementing rigorous checks to ensure their integrity. To mitigate these risks, consider the following strategies:
Implement processes and technologies that validate resource integrity throughout the software delivery chain. As developers generate a resource, they should sign it using an external resource signing infrastructure. Before consuming a resource in subsequent pipeline stages, cross-check its integrity against the signing authority. Key measures include:
Code Signing
Source code management (SCM) solutions offer the capability to sign commits with a unique key for each contributor, preventing unsigned commits from progressing through the pipeline.
Artifact Verification Software
Tools designed for signing and verifying code and artifacts, such as the Linux Foundation's Sigstore, can thwart unverified software from advancing down the pipeline.
Configuration Drift Detection
Implement measures to detect configuration drift, such as resources in cloud environments that are not managed through a signed infrastructure as code (IaC) template. Such drift could indicate deployments from untrusted sources or processes.
Third-party resources incorporated into build and deploy pipelines, like scripts executed during the build process, should undergo rigorous validation. Before utilizing these resources, compute their hash and compare it against the official hash provided by the resource provider.
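The hash comparison described above can be expressed as a hard gate that refuses to proceed on a mismatch. This is a sketch in Python; the script path and the vendor-published digest a caller would supply are hypothetical:

```python
import hashlib
import hmac


def assert_matches_official_hash(path: str, official_sha256: str) -> None:
    """Refuse to proceed unless the local file matches the provider's digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        h.update(f.read())
    # hmac.compare_digest performs a constant-time comparison
    if not hmac.compare_digest(h.hexdigest(), official_sha256.lower()):
        raise RuntimeError(f"Integrity check failed for {path}: hash mismatch")
```

A build step would call this immediately before executing the third-party script, so a tampered copy aborts the build instead of running.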
The industry has established standards and guidelines for artifact integrity validation. Examples include the use of cryptographic algorithms like SHA-256 for checksums, X.509 certificates for digital signatures, and secure transport protocols such as HTTPS for artifact transfer. Organizations should align their practices with these standards to maintain a reliable and secure software delivery pipeline.
To ensure proper artifact integrity validation, organizations should establish clear policies that define validation processes. Once these are established, regularly audit compliance with internal policies to identify and address weaknesses and areas of noncompliance. Continuous monitoring and analysis will help detect anomalies and unauthorized activities.
Use public key infrastructure (PKI) to cryptographically sign artifacts at each stage of the CI/CD pipeline. This practice validates signatures against a trusted certificate authority before consumption. Configure your CI/CD pipeline to reject artifacts with invalid or missing signatures to reduce risks of deploying tampered resources or unauthorized changes.
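In a real pipeline, this signing would use an asymmetric key pair managed by a PKI, with signatures validated against a trusted certificate authority (for example via the `cryptography` package or a tool like Sigstore). As a dependency-free sketch of the sign-then-verify gate itself, the example below substitutes a shared HMAC key for the key pair; the key material and artifact bytes are placeholders:

```python
import hashlib
import hmac

# Stand-in for PKI-managed key material; in practice the signing key lives in
# a key management system and verification uses the corresponding public key.
SIGNING_KEY = b"replace-with-managed-key-material"


def sign_artifact(data: bytes) -> str:
    """Produce a signature for the artifact bytes at build time."""
    return hmac.new(SIGNING_KEY, data, hashlib.sha256).hexdigest()


def admit_artifact(data: bytes, signature: str) -> bytes:
    """Admit an artifact to the next stage only if its signature verifies."""
    expected = hmac.new(SIGNING_KEY, data, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise PermissionError("artifact rejected: invalid or missing signature")
    return data
```

The key design point is that `admit_artifact` is the only path into the next stage, so an artifact with a missing or invalid signature can never progress.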
Establish a secure, tamper-proof repository to store artifacts, and enforce strict access controls to prevent unauthorized modifications. Enable versioning to maintain a historical record of artifact changes, and implement real-time monitoring to track and alert on suspicious activity. In case an artifact is compromised, configure the system to facilitate rollback to a previous, known-good version.
Adopt a multisource validation strategy that verifies the integrity of artifacts using various sources, such as checksums, digital signatures, and secure hash algorithms, as well as trusted repositories. Keep the cryptographic algorithms and keys up to date to maintain their effectiveness.
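A multisource check can be sketched as requiring a quorum of independent sources to agree with the locally computed digest before an artifact is trusted; the source names and digests below are illustrative:

```python
import hashlib


def multisource_validate(artifact: bytes, reported_digests: dict,
                         quorum: int = 2) -> bool:
    """Trust the artifact only if at least `quorum` independent sources
    report a SHA-256 digest matching the locally computed one."""
    local = hashlib.sha256(artifact).hexdigest()
    agreeing = sum(1 for d in reported_digests.values() if d.lower() == local)
    return agreeing >= quorum
```

This is exactly the control missing in the attack scenario above: a single compromised mirror can rewrite its own checksum file, but it cannot make the vendor's site and an independent registry agree with the tampered digest.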
Incorporate vulnerability scanning tools and static application security testing (SAST) into the CI/CD pipeline to identify potential security issues in artifacts — including third-party dependencies — before deployment. Taking a proactive approach allows DevOps teams to address vulnerabilities early in the development process, reducing the risk of security incidents and maintaining a high level of code quality.
Educate and train development teams about the importance of artifact integrity validation and the potential risks associated with improper validation. Encourage adherence to secure coding practices and emphasize the role each individual plays in maintaining a secure CI/CD environment.
Hash functions are cryptographic algorithms that take inputs of any length and generate fixed-size outputs called hashes or digests. Designed to be deterministic, hash functions ensure the same input consistently produces the same hash value. Additionally, their one-way nature makes it computationally infeasible to deduce the input from the hash value. Common applications for hash functions include data integrity verification, digital signature creation, and secure password storage.
Well-known hash functions include SHA-256, MD5, and SHA-1, although MD5 and SHA-1 are considered cryptographically broken and should no longer be used for integrity validation.
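These properties are easy to observe with Python's `hashlib`: identical inputs always yield the same fixed-size digest, while even a one-byte change produces an unrelated one (the input strings here are arbitrary):

```python
import hashlib

digest_a = hashlib.sha256(b"release-1.0.tar.gz contents").hexdigest()
digest_b = hashlib.sha256(b"release-1.0.tar.gz contents").hexdigest()
digest_c = hashlib.sha256(b"release-1.0.tar.gz content!").hexdigest()

print(digest_a == digest_b)  # deterministic: same input, same digest -> True
print(digest_a == digest_c)  # a one-byte change gives an unrelated digest -> False
print(len(digest_a))         # fixed-size output: 64 hex characters (256 bits)
```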
A source code management (SCM) solution is a tool or system that manages and tracks changes made to software projects throughout their development lifecycle. By controlling modifications to source code, files, and documentation, it helps maintain consistency, traceability, and accountability across the development process.
SCM solutions enable developers to collaborate efficiently, prevent conflicting changes, and easily revert changes to previous versions. They also facilitate branching and merging, allowing simultaneous development of multiple features or bug fixes in isolated environments. SCM solutions streamline the build and deployment processes, ensuring the right versions of software components are combined and released.
Popular SCM tools include Git, Subversion, and Mercurial.