Infrastructure as Code (IaC) shifted how teams think about building and securing infrastructure. Now, your entire application architecture can be defined in code, version controlled, and secured before being deployed. Combined with GitOps, expanding and iterating on cloud infrastructure is a few lines of code and a git commit away. Everything else, from integration to deployment, can be automated.
Leaving the entire release process up to automation takes a leap of faith. Security must be involved at every stage, from design to code to build to deploy. It is necessary to include security checks, feedback, and guardrails at every step of the process. Additionally, proper controls should be in place to ensure code integrity. If accomplished, a secure IaC supply chain leads to more rapid innovation and improved security.
Infrastructure as code (IaC) is quickly becoming the standard method for configuration management at tech companies, and why shouldn't it be? It's repeatable, maintainable and auditable while improving productivity and speeding up the development-to-production pipeline.
What's not to love? Well, to misquote Uncle Ben, "With great DevOps tooling comes great security implications." We are firmly into the 21st century now, and the internet is no longer the “Wild West” it once was. Security is not as optional as it once seemed (it was never actually optional), which means that as great as IaC and GitOps (more on that below) are, it's important to configure a process and pipeline that take the possibility of a breach into account.
Thankfully, this isn't an unsolved problem. Security is much more of a process than it is an objective, and by introducing checks and balances into your DevOps workflow, you can ensure that the powerful automation enabled by IaC and GitOps can run as smoothly – and safely – as possible.
As the name implies, GitOps is the process of applying DevOps practices like version control, collaboration and automation to the practice of configuration management. The key principle behind GitOps is the use of an underlying version control repository (typically using git; however, other version control systems like Mercurial can be used) as the central management point of all IaC definitions.
By using a version control repository as the source of truth, all infrastructure changes can be tracked over time, allowing DevOps teams to quickly roll back entire changes to the infrastructure with a few simple commands, which can drastically reduce the mean time to recovery (MTTR) for incidents. When paired with a modern source code hosting service like GitHub or GitLab, the ability to review code configuration and collaborate on infrastructure changes can provide a much-needed level of transparency, while at the same time speeding up the rate at which changes get released to a production environment, thanks to continuous integration and deployment (CI/CD) processes.
With all that in mind, how exactly do we secure our IaC supply chain? What checks do we need to have in place between the first line of code being written and the final deployment being triggered?
While there are several things that can be done to secure your IaC supply chain, let's break the underlying supply chain down into four distinct phases: Development, Collaboration, Build and Deployment.
The Development phase, as the name implies, is the time when the code is first written. When it comes to supply chain security, this phase is the single most important for preventing the introduction of vulnerabilities into your application infrastructure because the best kinds of defects are the ones that never get encoded in the first place.
When it comes to writing secure code, static analysis and automated testing are your friends. Static analysis tools, as the name implies, analyze code without executing it, often as it is being written, to identify vulnerable patterns. This can be incredibly useful for catching configuration errors in your IaC definitions, such as overly permissive access rules or misconfigured security settings.
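As a rough illustration of the idea, the sketch below walks a list of resource definitions and flags two common misconfigurations: sensitive ports open to the whole internet and storage that is public or unencrypted. The resource schema, the field names and the `SENSITIVE_PORTS` set are all hypothetical, invented for this example; real IaC scanners work on real template formats and ship far larger rule sets.

```python
from typing import Any

# Hypothetical, simplified resource definitions; real IaC formats differ.
RESOURCES: list[dict[str, Any]] = [
    {"name": "web_sg", "type": "security_group",
     "ingress": [{"port": 443, "cidr": "0.0.0.0/0"},
                 {"port": 22, "cidr": "0.0.0.0/0"}]},
    {"name": "logs", "type": "storage_bucket",
     "public_read": True, "encrypted": False},
    {"name": "db_sg", "type": "security_group",
     "ingress": [{"port": 5432, "cidr": "10.0.0.0/16"}]},
]

# Assumed set of ports that should never face the public internet.
SENSITIVE_PORTS = {22, 3389, 5432}


def find_misconfigurations(resources: list[dict[str, Any]]) -> list[str]:
    """Return a human-readable finding for each risky setting."""
    findings = []
    for res in resources:
        if res["type"] == "security_group":
            for rule in res.get("ingress", []):
                if rule["cidr"] == "0.0.0.0/0" and rule["port"] in SENSITIVE_PORTS:
                    findings.append(
                        f"{res['name']}: port {rule['port']} open to the internet")
        elif res["type"] == "storage_bucket":
            if res.get("public_read"):
                findings.append(f"{res['name']}: bucket is publicly readable")
            if not res.get("encrypted", False):
                findings.append(f"{res['name']}: encryption at rest is disabled")
    return findings
```

Because nothing is deployed or executed, a check like this can run in an editor or on every commit, which is what makes it cheap enough to use this early in the process.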
Automated testing is another solution that, while requiring a little more overhead, will allow engineers to review the impact of their code changes in a simulated environment long before they get released onto live infrastructure. For example, a well-designed test suite can verify that a set of configuration changes results in the expected behavior.
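To make that concrete, here is a minimal sketch of a behavioral test for a firewall change. Everything in it is an assumption made for illustration: the rule format, the `apply_change` and `is_reachable` helpers, and the example addresses. The tests are plain functions with assertions, so any runner that collects `test_*` functions could pick them up.

```python
import ipaddress


def apply_change(rules: list[dict], change: dict) -> list[dict]:
    """Return a new rule list with the proposed change applied (no mutation)."""
    kept = [rule for rule in rules if rule not in change.get("remove", [])]
    return kept + change.get("add", [])


def is_reachable(rules: list[dict], port: int, source_ip: str) -> bool:
    """Return True if any ingress rule admits source_ip on the given port."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        rule["port"] == port and addr in ipaddress.ip_network(rule["cidr"])
        for rule in rules
    )


# Hypothetical baseline and proposed change: restrict SSH to an internal range.
BASELINE = [{"port": 443, "cidr": "0.0.0.0/0"},
            {"port": 22, "cidr": "0.0.0.0/0"}]
CHANGE = {"remove": [{"port": 22, "cidr": "0.0.0.0/0"}],
          "add": [{"port": 22, "cidr": "10.0.0.0/8"}]}


def test_ssh_closed_to_internet():
    rules = apply_change(BASELINE, CHANGE)
    assert not is_reachable(rules, 22, "203.0.113.7")


def test_ssh_open_to_internal_hosts():
    rules = apply_change(BASELINE, CHANGE)
    assert is_reachable(rules, 22, "10.1.2.3")


def test_https_still_public():
    rules = apply_change(BASELINE, CHANGE)
    assert is_reachable(rules, 443, "203.0.113.7")
```

The tests assert on behavior (who can reach what) rather than on the text of the configuration, so they keep passing if the rules are later restructured without changing their effect.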
It's important to mention that, while static analysis and automated testing are best practices for any code project, there are several great tools that focus specifically on IaC security. A particularly helpful guardrail to implement is one that identifies and prevents sensitive information like security keys and passwords from being committed directly to the version control repository.
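A secret-detection guardrail of this kind can be sketched in a few lines. The patterns below are illustrative assumptions rather than a complete rule set: one matches the widely documented shape of an AWS access key ID, one matches a private key header, and one is a loose heuristic for quoted values assigned to names like `password`. Dedicated scanners use many more patterns and typically add entropy checks to reduce false negatives.

```python
import re

# Illustrative patterns only; a production scanner would carry many more.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "hardcoded_credential": re.compile(
        r"\b(?:password|secret|token)\b\s*[:=]\s*[\"'][^\"']{4,}[\"']",
        re.IGNORECASE),
}


def scan_for_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Run against the staged changes before a commit is accepted, a non-empty result would be grounds to reject the commit, keeping the secret out of the repository history entirely.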
Many of these tools can (and should) be deployed during all phases of the IaC supply chain; however, the earlier in the process that issues can be caught, the lower their overall impact will be on the organization.
Once the code is written, it needs to be reviewed. The reason why GitOps is such an important evolution of IaC is that it encourages the use of highly collaborative tooling like merge/pull requests, code reviews and second-phase testing and analysis.
Establishing a formal code review process is an important step in supply chain security, and while getting a second set of eyes on any piece of code helps prevent bugs and share knowledge, it is also an opportunity for continuous improvement of the automated tooling that is deployed during the Development phase.
Automatically catching any type of defect, let alone vulnerabilities, can be extremely difficult without a pattern to search for. An active and engaged code review process, coupled with security training and education, can help identify gaps in the automated tooling that can be plugged up and prevented in future implementations.
The more effort that gets placed on reviewing the design and implementation choices of a proposed infrastructure change, the more effective the DevOps team will become at identifying and remediating flaws in those choices. It is a process that, while potentially expensive at first, will pay dividends over time.
After the code has been reviewed and approved, it then gets merged into the target branch and built. What exactly does "built" mean in this context? While the exact process will likely be dependent on the organization, this is generally the phase when the merged code is last tested and prepared to be released into a production environment.
The Build phase is often when any final manual testing and security audits may be performed, depending on the risk profile of the organization. It is also where all of the pending changes may be deployed to a production-like environment for more realistic verification and security testing within the context of the larger application infrastructure.
In organizations with complex infrastructure layouts, it is important to verify that code and configuration changes that passed muster in earlier phases don't have unforeseen impacts on other services or their underlying dependencies.
The final phase of the IaC supply chain is Deployment. This is when the changes specified in the committed code and configuration take effect. At this point, all of the automated checks and balances have passed, and the infrastructure changes will go "live." As you can imagine, there are very few guardrails that can be placed into the Deployment phase, as this is the point where confidence in the release is at its highest.
That said, there are a few processes that can be implemented to reduce the impact of any vulnerabilities that may have slipped through the cracks. For one, phased rollouts are a great way to limit the impact of changes to smaller subsets of your infrastructure. Additionally, incorporating a "shift-right" process into your post-deployment workflow can provide a final verification phase by running integrity checks against any new builds in a production environment and initiating automatic rollbacks in the event of a failure.
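The combination of a phased rollout and an automatic rollback can be sketched as a single loop. This is a simplified model under stated assumptions: `deploy`, `health_check` and `rollback` are hypothetical callables supplied by the caller, hosts are plain strings, and a failed check rolls back every host touched so far, not just the failing batch.

```python
from typing import Callable, Sequence


def phased_rollout(
    hosts: Sequence[str],
    batch_size: int,
    deploy: Callable[[str], None],
    health_check: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Deploy in batches; on a failed check, undo every host updated so far."""
    updated: list[str] = []
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]
        for host in batch:
            deploy(host)
            updated.append(host)
        # Post-deployment integrity check on the batch that just went live.
        if not all(health_check(host) for host in batch):
            for host in reversed(updated):
                rollback(host)
            return False
    return True
```

With a batch size of two and four hosts, a failure in the second batch stops the rollout before any further hosts are touched; a failure in the first batch would leave the other half of the fleet unchanged, which is the point of phasing the release.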
Ultimately, all of this work is only as secure as the underlying infrastructure that supports it. After all, it doesn’t matter how strong your front door is if you never lock it. To put it simply, securing the pipeline can be thought of as locking your doors and practicing good security hygiene. While the extent to which you apply this process is fairly dependent on the tooling you rely on, there are a handful of best practices that are recommended to effectively secure your IaC pipeline.
Required Reviewers
Many source code hosting services allow you not only to require that code reviews take place but also to require a minimum number of reviewers. This is helpful, as it ensures that multiple sets of eyes are placed on every piece of code before it gets released. Additionally, for particularly sensitive code, specific people can be flagged as “code owners,” ensuring that the appropriate subject-matter experts are looped into relevant reviews wherever necessary.
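The logic behind such a rule is simple enough to sketch. The example below is not any hosting service's actual implementation; the ownership map, the path patterns, the names and the threshold of two approvals are all made up for illustration. It returns the reasons a change cannot yet be merged.

```python
from fnmatch import fnmatch

# Hypothetical policy: two approvals, plus an owner for sensitive paths.
MIN_APPROVALS = 2
CODE_OWNERS = {
    "network/*": {"alice"},
    "iam/*": {"bob", "carol"},
}


def merge_blockers(changed_files: list[str], approvers: list[str],
                   author: str) -> list[str]:
    """Return the reasons a change cannot be merged; empty means it can."""
    blockers = []
    # An author approving their own change does not count as a review.
    valid = set(approvers) - {author}
    if len(valid) < MIN_APPROVALS:
        blockers.append(f"needs {MIN_APPROVALS} approvals, has {len(valid)}")
    for pattern, owners in CODE_OWNERS.items():
        touched = any(fnmatch(path, pattern) for path in changed_files)
        if touched and not (owners & valid):
            blockers.append(f"needs approval from an owner of {pattern}")
    return blockers
```

For instance, a change to `iam/roles.tf` approved by two people who are not listed as owners of `iam/*` would still be blocked until one of the owners signs off.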
Branch Protection
It is considered good practice to prevent direct contributions to the main branch in any version controlled project. By implementing branch protection rules, all code is forced through a pull request process, ensuring that no unexpected code can reach a production environment without first being subjected to the appropriate level of tests and code reviews.
Enforcing Multi-Factor Authentication
Multi-factor authentication (MFA) is table stakes in any security-conscious environment, but beyond simply encouraging it, it needs to be enforced across the organization. Some of the most devastating security breaches in history have happened as a result of simple mistakes like password reuse. While MFA isn’t a silver bullet, it can make credential stuffing a far less effective attack vector.
Implement Least-Privileged Permissions
Role-based access control (RBAC) is an important feature to take advantage of within any tool, let alone the IaC pipeline. Both users and automated tools should be given the least access necessary to perform their given function. While this often means that production access is limited to a privileged few, the same care should be taken for development and testing environments.
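One way to keep permissions honest is to periodically compare what each principal has been granted against what it actually needs. The sketch below assumes both sides can be expressed as sets of permission strings; the principal names and the permission format are hypothetical.

```python
def excess_permissions(
    granted: dict[str, set[str]],
    required: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Map each principal to the permissions it holds but does not need."""
    report = {}
    for principal, perms in granted.items():
        # A principal with no stated requirements should hold nothing.
        extra = perms - required.get(principal, set())
        if extra:
            report[principal] = sorted(extra)
    return report
```

Applied to human users and automation accounts alike, an audit like this surfaces the leftover access that tends to accumulate as roles change, in test environments as much as in production.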
Restricting CI/CD Resources
Despite taking all of the precautions above, accidents can still happen. Third-party tools can still be compromised, and taking a zero-trust stance can go a long way toward protecting your pipeline. It is crucial to restrict what the jobs and tools in your CI/CD pipeline can do and what they can access, for example by preventing them from executing remotely supplied code.
Automation can be scary, especially when the outcome of that automation can have a drastic effect on an organization's infrastructure. That is why it is important to introduce guardrails and other checks and balances into your change management process. But automated security scanning and code reviews can only go so far. They must be counterbalanced with a well-trained security culture.
Security isn't an activity that you can check off, or a button you can press. It's a process, and one that must be nurtured and developed. Training, education, seminars and other activities can increase the security awareness of an organization and ensure that those automated tools are understood and supported.