Separating CI from CD

The CI/CD tool sits at an interesting position in a modern web application stack: it is one of the few places, if not the only one, with controlling access to both production and non-production environments. As a result, it bears much of the responsibility for enforcing the boundary between production and non-production systems.

As an example, here is a typical CI/CD setup in a software startup: source code is managed in an online git repository, which is connected to a combined CI/CD tool that is either self-hosted or consumed as a service. Development work is performed on ephemeral development branches, which anyone on the team can create and push to; hooks are set up on these branches so that unit tests run each time code is pushed. Once the code is peer reviewed and approved, and the unit tests pass, the development branch can be merged into one of the special branches (e.g. 'release'), where more tests are run and, if they all pass, a deployment occurs to the corresponding environment. These special branches are usually protected against direct changes, and can only be pushed or merged into after unit tests pass and a human approves.

I want to discuss two potential security concerns that result from combining CI (non-production) and CD (production). The first is the ability to influence tests and what counts as 'passing', and the second is access control of secrets. Both are possible paths for an insider, who normally does not have permission to deploy directly to production, to bypass the controls that gate production.

In the first case, the attack scenario works as follows: a developer wishes to commit a piece of code that would not pass the necessary tests (e.g. code that would fail a security-related unit test), and so removes the offending test cases in the same commit. When the CI tool runs its tests on the development branch, if its behavior is to run the tests from the version that was just committed, then it will see that all tests pass. There are legitimate needs to modify or remove tests in the course of software development, so a casual reviewer may find such behavior hard to notice. As a result, a developer on the unprivileged side (CI) is able to modify the gate that is placed upon them and escalate into the privileged side (CD).

There are no good countermeasures against this attack, primarily because there are so many legitimate reasons to modify or remove test cases that it is hard to tell when an attack is happening. The only consolation is that the human review step is not bypassed and can still catch that something is off. Also, all commits are logged and tied to individuals, so post hoc investigation is possible (there is a handwaving assumption here that the git hosting system authenticates users correctly).
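Although this does not prevent the attack, the review step can at least be pointed at the right files. The following is a minimal sketch, not a complete defense: it assumes the input is text in the format printed by `git diff --name-status`, and it assumes a hypothetical naming convention in which tests live under `tests/` or are named `test_*.py` or `*_test.py`. The function name and the convention are illustrative only.

```python
import re

# Assumed convention (hypothetical): test files live under test/ or tests/,
# or are named test_*.py or *_test.py.
TEST_PATH = re.compile(r"(^|/)(tests?/|test_[^/]*\.py$|[^/]*_test\.py$)")


def flag_test_changes(name_status: str) -> list[str]:
    """Return reviewer warnings for deleted or modified test files.

    `name_status` is text in the format of `git diff --name-status`:
    one line per file, holding a status letter, a tab, then the path.
    Added test files are not flagged, and renames are not handled.
    """
    warnings = []
    for line in name_status.splitlines():
        if not line.strip():
            continue
        status, _, path = line.partition("\t")
        if not TEST_PATH.search(path):
            continue
        if status.startswith("D"):
            warnings.append(f"test file deleted: {path}")
        elif status.startswith("M"):
            warnings.append(f"test file modified: {path}")
    return warnings
```

A flag like this only draws the reviewer's attention to the change; deciding whether a removed test was legitimate remains a human judgment.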

The second security concern arises if the CI/CD system treats all secrets stored with it equally. A CD system naturally needs production secrets in order to perform a deploy. If those secrets are also accessible to a not-yet-approved pull request, a development branch can use production secrets directly or exfiltrate them to a developer. While secrets can be hidden from developers trying to access them through normal means, the fact that they must be usable during a deploy makes it impossible to hide them completely: a job that can use a secret can also leak it, for example by printing it one character at a time to get around any masking of log output. If a developer, who does not and should not have access to production secrets, is able to retrieve them through a development branch, they can probably use those secrets to escalate their privileges into the production environment.
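To illustrate why hiding is not enough, here is a small sketch of a naive log masker and the character-at-a-time trick that defeats it. The `mask_secrets` function is a hypothetical stand-in for whatever masking a CI tool applies, not the behavior of any particular product, and the secret value is a placeholder.

```python
def mask_secrets(log_line: str, secrets: list[str]) -> str:
    """Replace every exact occurrence of a known secret with '***'."""
    for secret in secrets:
        log_line = log_line.replace(secret, "***")
    return log_line


secret = "hunter2"  # placeholder value, not a real credential

# Printing the secret directly is caught by the masker.
direct = mask_secrets(f"token={secret}", [secret])

# Printing it with a separator between characters is not, because the
# full secret string never appears in the log line.
spaced = mask_secrets("token=" + " ".join(secret), [secret])

print(direct)  # token=***
print(spaced)  # token=h u n t e r 2
```

Any reversible encoding of the value has the same effect, which is why masking should be treated as protection against accidents rather than against a determined insider.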

The solution for this case is quite simple: control which branches (or apply some similar rule) can access which secrets. To do this properly, secrets should not be the only thing separated between non-production CI and production CD. The whole execution environment should be separate, or ideally ephemeral, with a new environment provisioned from a golden image for each run.
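The branch-to-secret rule can be sketched as a simple lookup. This is an illustration under stated assumptions, not the configuration format of any real CI/CD tool: the secret names, the branch names, and the `secrets_for_branch` function are all hypothetical, and branch patterns are matched with shell-style wildcards.

```python
from fnmatch import fnmatchcase

# Hypothetical policy: each secret lists the branch patterns allowed to read it.
SECRET_SCOPES = {
    "PROD_DEPLOY_KEY": ["release"],
    "STAGING_DEPLOY_KEY": ["release", "staging"],
    "TEST_DB_PASSWORD": ["*"],  # non-production secret, available everywhere
}


def secrets_for_branch(branch: str, scopes: dict[str, list[str]] = SECRET_SCOPES) -> list[str]:
    """Return the sorted names of secrets a job on `branch` may receive."""
    return sorted(
        name
        for name, patterns in scopes.items()
        if any(fnmatchcase(branch, pattern) for pattern in patterns)
    )


print(secrets_for_branch("feature/login"))  # ['TEST_DB_PASSWORD']
print(secrets_for_branch("release"))
```

For this to hold, the policy and the branch name must come from somewhere a development branch cannot edit; if the mapping lived in a file that is read from the branch under test, it would suffer from the same problem as the tests in the first scenario.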