Effective Application Security Testing in DevOps Pipelines



Businesses and development teams are rushing to embrace DevOps so they can be more agile, deploy code more quickly, and provide more value to their customers. Hallmarks of DevOps initiatives are support for significant automation, flexible provisioning, and cultural support for shared responsibilities. This often makes security teams uncomfortable, and they find themselves on the receiving end of this trend with little power to stop or even slow these changes. But the shift to DevOps does open a window of opportunity for security teams to exert influence and improve the security of applications.

Before considering what it means to have application security testing integrated into the DevOps Continuous Integration/Continuous Delivery (CI/CD) pipeline, it is worth asking why it is valuable to integrate application security testing into these pipelines in the first place.  A fundamental tenet of DevOps and the reason for having CI/CD pipelines for software builds is to allow teams to have up-to-the-minute feedback on the status of their development efforts so that they know if a build is ready to push to production. This involves testing quality, performance and other characteristics of the system. And it should include security as well.

By integrating security into the CI/CD pipeline, security vulnerabilities are found quickly and reported to developers in the tools they’re already using. This removes friction from the remediation process. Instead of relying on an ornate change management process, security vulnerabilities are quickly reported as software bugs to be addressed – preferably by the developer who recently introduced them into the codebase. Security moves beyond something handled on a quarterly or annual basis to being just another check before developers can feel that they are “code complete” and move on to another task.

Conceptually, there is no reason why security testing should not be included alongside other CI/CD testing concerns. In practice, however, there are issues that can make integrating application security testing into CI/CD pipelines challenging. Many developers do have some knowledge of application security, but struggle with specifics. If a pipeline build fails due to unit tests or functional tests failing, developers can consult user stories or apply some common sense to identify and diagnose the issue. However, for developers without a strong background in secure coding, security issues identified during pipeline builds can be arcane and challenging to address.

In addition, most security tools are not well suited out of the box for integration into CI/CD pipelines. They are built for security teams with expertise in application security, and their results are meant to be consumed by people with similar backgrounds. Their run times can also be long when viewed against the desire to rapidly approve builds for delivery. Many security tools are designed to be exhaustive, identifying all risks so as not to miss minor details. That is not the best characteristic for security tools in a CI/CD pipeline. Finally, most application security testing tools were originally intended to be run interactively by an analyst. Fortunately, many popular application security testing tools like OWASP ZAP are starting to expose APIs that help support the type of automation required for CI/CD integration.

So, what should the success criteria be as we look at application security testing within CI/CD pipelines? The first question to ask is “are we getting value from the testing we are doing?” This means determining if the development team is being notified of important vulnerabilities quickly after their introduction so they are easy to fix by the developer who only recently introduced them. It is also critical to make sure that the application security testing activities are not too expensive. Security testing gets expensive when:

  • Tests take too long to run and delay builds being passed or failed
  • Testing identifies too many false positives requiring manual filtering
  • Developers receive vulnerability reports they do not understand

All these issues require resources to address, and if the cost of application security testing is too great then it does not make sense for development teams to integrate this testing into their pipeline.

What Is In an Effective Application Security Testing Policy?


When looking to integrate an application security testing policy into a developer’s CI/CD pipeline, there are three phases that need to be specified:

  • Testing
  • Decision
  • Reporting

The right policy for a given application will depend on several factors including the risk profile of the application in question and the risk tolerance level of the organization. Mission critical applications that manage valuable data subject to compliance requirements should be treated differently than less critical applications managing public data. In addition, different policies can be applied at different times – it may make sense to apply one policy on every check-in or on a nightly build, whereas another more stringent policy might be applied to a weekend build or a build that is run at the end of an iteration. Developing these policies is a collaborative effort between security teams and development teams.
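As a rough illustration of how the three phases and the idea of different policies for different build types might be written down, here is a minimal Python sketch. Every name in it (SecurityTestPolicy, the ruleset labels, the triggers, and the numbers) is a hypothetical placeholder rather than the format of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityTestPolicy:
    """One policy bundles testing, decision, and reporting settings (hypothetical)."""
    name: str
    # Testing phase: which ruleset to run and how long the build may wait.
    ruleset: str
    max_runtime_minutes: int
    # Decision phase: severities that fail the build.
    blocking_severities: frozenset
    # Reporting phase: how findings are bundled into defects.
    bundle_by: str  # "type", "location", or "severity"


# Assumed example policies; triggers and numbers are illustrative only.
POLICIES = {
    "check-in": SecurityTestPolicy(
        "fast", "high-confidence", 10, frozenset({"critical"}), "type"
    ),
    "weekend": SecurityTestPolicy(
        "deep", "full", 480, frozenset({"critical", "high"}), "severity"
    ),
}


def policy_for(trigger: str) -> SecurityTestPolicy:
    """Pick the policy for a build trigger, defaulting to the fast one."""
    return POLICIES.get(trigger, POLICIES["check-in"])
```

A check-in build would then call policy_for("check-in") and get the lenient, fast policy, while the weekend build gets the stricter one.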

Testing Phase

Application security testing approaches for CI/CD pipelines are fundamentally different than the monolithic point-in-time testing approaches often practiced by security teams. For CI/CD integration, the focus must be on the optimizations needed to do security testing frequently, rather than the goal of exhaustive security testing. This requires a testing configuration that:

  • Runs (relatively) fast
  • Produces high-value non-false-positive results

Application security testing in CI/CD pipelines also requires a mindset change away from one that tries to avoid ever passing a build that contains a vulnerability, to one focused on the “window of exposure” and “mean time to fix” for vulnerabilities. Teams can risk deploying something with vulnerabilities into production if they can correct identified issues quickly.

So, what does that mean for security teams integrating application security testing into CI/CD pipelines? First, teams must trim down rulesets to reduce false positives and reduce run times. The default behavior of most application security testing tools is to run an exhaustive set of rules geared at producing the most findings. This results in long run times and more false positives. Tuning CI/CD-based testing to only run high-confidence tests that are going to find the most important vulnerabilities reduces both testing run times as well as false positive rates. The focus for application security testing in CI/CD is on early identification of obvious and serious vulnerabilities and quick communication of these to the development team. This means that these issues can be addressed quickly and the build can be fixed. This focus on the vulnerabilities that are easy to identify with automation makes additional sense because those are the types of vulnerabilities that many attackers are also going to be able to identify and exploit using similar automation but on live environments.
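Trimming a ruleset can be thought of as a simple filter. The sketch below assumes each rule carries a severity and a confidence label; real tools expose this information in their own vendor-specific ways, so the Rule type and its fields are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str
    severity: str    # "critical", "high", "medium", or "low"
    confidence: str  # "high", "medium", or "low"


def trim_ruleset(rules, severities=("critical", "high"), confidences=("high",)):
    """Keep only the rules that are both serious and high-confidence."""
    return [
        r for r in rules
        if r.severity in severities and r.confidence in confidences
    ]


# Hypothetical full ruleset as a tool might ship it.
FULL_RULESET = [
    Rule("sql-injection", "critical", "high"),
    Rule("reflected-xss", "high", "high"),
    Rule("weak-hash", "medium", "high"),
    Rule("possible-path-traversal", "high", "low"),
]

CI_RULESET = trim_ruleset(FULL_RULESET)
```

Here only the first two rules survive, which is the intended trade: fewer checks, shorter runs, and fewer false positives in exchange for less coverage.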

A critical concern is determining how to make the tests run as fast as is reasonable. In general, there are a couple of ways to approach this. Controlling the rulesets to limit checks can help to reduce application security tool run times. In addition, doing differential or incremental scans can help to reduce the scope of the testing being performed, with associated time saved. For static application security testing (SAST), tests can be run only on the portions of the codebase that have been changed since the last round of testing. The ability to do this type of differential testing is typically vendor-dependent. Checkmarx, for example, provides this capability. For dynamic application security testing (DAST), you can have the scanner look only at new URLs or ones that have been modified based on the changes to the codebase since the last set of tests was run. See this presentation looking at attack surface calculations for more information on tracking application attack surface changes over time.
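For teams whose tool does not offer incremental scanning, the scoping step can be approximated outside the tool. This sketch assumes the pipeline stores a content hash per file after each scan and compares against it on the next run. The function names are hypothetical and this is not how any specific vendor implements differential scanning.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Content hash used to detect that a file changed since the last scan."""
    return hashlib.sha256(content).hexdigest()


def files_to_scan(current: dict, previous_fingerprints: dict) -> list:
    """Return paths that are new or modified since the last scan.

    current maps path -> file bytes for this build;
    previous_fingerprints maps path -> hex digest saved by the last scan.
    """
    return sorted(
        path
        for path, content in current.items()
        if previous_fingerprints.get(path) != fingerprint(content)
    )
```

Only the returned paths would be handed to the SAST tool. The same idea applies to DAST if URLs or routes are used in place of file paths.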

Synchronous tests are those that are started with the intention that they complete in a reasonable amount of time, so that their results can be used to decide whether to break the build. These tools or tests run to completion. Where possible, it is preferable to use synchronous tests because we can make go/no-go decisions based on the outcomes. But this requires that these tests run in a short enough time window that they are not unduly holding up the completion of the build process.

Asynchronous testing tasks are those that are initiated as part of the CI/CD pipeline, but that are not expected to complete before a decision is made to “break the build.” It is simply a reality that for large applications or certain testing technologies testing will not complete within an acceptable time window.
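One simple way to decide which tests run synchronously is to give the build a time budget and fill it with the shortest tests first. The sketch below assumes tests run one after another and that rough runtime estimates are available; both are simplifying assumptions, and the test names are made up.

```python
def plan_tests(estimated_minutes: dict, budget_minutes: float):
    """Split tests into (synchronous, asynchronous) lists.

    Shortest tests are scheduled first until the time budget is used up;
    everything that does not fit is started but not waited on.
    """
    synchronous, asynchronous = [], []
    remaining = budget_minutes
    for name, minutes in sorted(estimated_minutes.items(), key=lambda kv: kv[1]):
        if minutes <= remaining:
            synchronous.append(name)
            remaining -= minutes
        else:
            asynchronous.append(name)
    return synchronous, asynchronous


# Hypothetical tests with estimated runtimes in minutes.
sync_tests, async_tests = plan_tests(
    {"sast-incremental": 3, "dependency-check": 2, "dast-full-crawl": 120},
    budget_minutes=10,
)
```

With a ten-minute budget, the dependency check and incremental SAST gate the build, while the full DAST crawl is kicked off and its results are reported later.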

Decision Phase

The decision phase is where a go/no-go decision is made based on the results of the synchronous tests and where the build fails if the results of security testing are not satisfactory. Organizations would not go live with a build if it had serious quality errors based on the testing done by unit and functional tests, but many organizations will go live with security vulnerabilities in their applications. Teams do it every day and it is important to acknowledge that this is the current state of practice in the industry. Teams have to make a decision about security. A challenge with this decision is that it is less clear cut than one that would be made if functional test results were known to be deficient because in this case the team is approving a build that works but that will expose the organization to risks if it is deployed.

What criteria are used to make these risk decisions? First it is the severity and type of vulnerabilities identified. From a severity standpoint, automated scanners are going to assign severities to vulnerabilities and these severities can be used to approximate the riskiness of deploying the build currently being tested. A build is allowed a certain amount of perceived risk before it is considered unacceptable to pass. In addition, there can be value in examining the types of vulnerabilities identified. Certain types of vulnerabilities like SQL injection may be considered unacceptable for a build because of their potential impact, regardless of the scanner’s perceived severity of a specific vulnerability.
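The two criteria above, a budget of perceived risk and a list of vulnerability types that are never acceptable, can be combined in a small gate function. The severity weights, the budget, and the banned list below are assumed values for illustration; each organization would set its own.

```python
# Assumed weights for scanner-assigned severities.
SEVERITY_WEIGHTS = {"critical": 10, "high": 5, "medium": 2, "low": 1}
# Types that fail the build regardless of the severity the scanner assigns.
BANNED_TYPES = {"SQL Injection"}


def decide(findings, risk_budget=10):
    """Return (passed, reasons) for a list of (vuln_type, severity) tuples."""
    reasons = []
    banned = sorted({vuln_type for vuln_type, _ in findings if vuln_type in BANNED_TYPES})
    if banned:
        reasons.append("banned vulnerability types: " + ", ".join(banned))
    score = sum(SEVERITY_WEIGHTS.get(severity, 0) for _, severity in findings)
    if score > risk_budget:
        reasons.append(f"risk score {score} exceeds budget {risk_budget}")
    return (not reasons, reasons)
```

A single low-severity SQL injection finding fails this gate even though its score is well within budget, which mirrors the point that type can override the scanner's severity.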

A valuable concept when implementing application security testing in CI/CD pipelines is the “newness” of vulnerabilities. In a perfect world, security teams could make policies such as “no critical or high vulnerabilities in production.” In the real world, and in dealing with applications that have been under development for a time without security testing, this may not be politically feasible. “No critical or high vulnerabilities” may not work, but “no new critical or high vulnerabilities” may be defensible. After all – the developers shouldn’t be introducing more vulnerabilities now that everyone agrees that they are a problem and there is testing in place. In many situations, this is a more acceptable approach. As we have seen above, when integrating application security testing into CI/CD pipelines, pragmatism is a primary driver.
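A "no new critical or high vulnerabilities" rule needs a way to tell new findings from known ones. The sketch below assumes a finding can be identified across builds by its type, file, and function; that identity key is an assumption, and real tools match findings in their own ways.

```python
def finding_key(finding: dict) -> tuple:
    """Identity of a finding across builds.

    Assumption: type + file + function is stable enough. Line numbers are
    left out because they shift whenever nearby code is edited.
    """
    return (finding["type"], finding["file"], finding["function"])


def new_findings(current: list, baseline: list) -> list:
    """Findings in the current scan that were not in the baseline scan."""
    known = {finding_key(f) for f in baseline}
    return [f for f in current if finding_key(f) not in known]


def build_passes(current: list, baseline: list, blocking=("critical", "high")) -> bool:
    """Pass unless a NEW finding has a blocking severity."""
    return not any(
        f["severity"] in blocking for f in new_findings(current, baseline)
    )
```

An application with a backlog of known high-severity issues still passes under this rule, but the first check-in that adds another one breaks the build.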

Reporting Phase

Unlike most security testing, testing in a CI/CD pipeline produces results that development teams consume directly. So, the development teams must be able to consume these results without intervention from the security team. This means that outputs from the testing need to be delivered to the tools the development team is using for managing bugs. Teams have often made a significant investment in both deploying tools and crafting processes. Any security testing done in the CI/CD pipeline needs to have its results slipstreamed into these systems and processes to be actionable. Otherwise, the security team has just slowed down the build process to serve its own needs. Historical vulnerabilities must be tracked so they are only reported to developers once, and because testing is being run frequently on incremental code changes, the count of new vulnerabilities identified per run should be small. Finally, reporting needs to package the vulnerabilities in the way that is going to be most useful and most consumable by the development teams. This means providing appropriate context as well as supporting materials so that developers can self-serve the information they need to fix the issues.

There are several common strategies for bundling vulnerabilities into software defects:

  • Bundle by type
  • Bundle by code location
  • Bundle by severity

Bundling vulnerabilities by type makes sense in many cases because the code-level changes for remediation are often the same. They use the same encoding function, same coding pattern, etc. Developers can fix many vulnerabilities quickly if they are making the same kind of changes to code.

Bundling by code location makes sense when one developer is responsible for a specific part of the codebase, and perhaps they are the only one who can easily maintain that part of the codebase. From an agile standpoint, this might not be ideal, but it does reflect the reality of many development teams.

Bundling by severity makes sense in situations where the application has its security “under control” – i.e. it has been cleared of major vulnerabilities. In cases like this, bundling by severity after a particularly bad check-in may make sense. This highlights all the new important vulnerabilities and allows a developer to go in and address the new issues that have been added.
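All three bundling strategies come down to grouping findings by a different field, with each group becoming one defect in the bug tracker. A minimal sketch, assuming findings are plain dictionaries with hypothetical "type", "file", and "severity" fields:

```python
from collections import defaultdict

BUNDLE_FIELDS = {"type", "file", "severity"}


def bundle(findings: list, by: str) -> dict:
    """Group findings so that each group can be filed as a single defect."""
    if by not in BUNDLE_FIELDS:
        raise ValueError(f"cannot bundle by {by!r}")
    groups = defaultdict(list)
    for finding in findings:
        groups[finding[by]].append(finding)
    return dict(groups)


# Hypothetical scan results.
FINDINGS = [
    {"type": "XSS", "file": "views/profile.py", "severity": "high"},
    {"type": "XSS", "file": "views/search.py", "severity": "medium"},
    {"type": "SQL Injection", "file": "views/search.py", "severity": "high"},
]
```

Calling bundle(FINDINGS, "type") yields one defect for both XSS findings, while bundle(FINDINGS, "file") yields one defect per file for whoever owns it.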

Onboarding and Maintaining Application Security Testing in CI/CD

For application security testing in CI/CD pipelines to be successful, there must be onboarding and maintenance processes in place. Onboarding an application for CI/CD involves running an initial scan with the target ruleset, culling out false positives, and possibly further tuning the ruleset. This is typically done by the application security team because it often requires a lot of skill with the testing tools. The onboarding process is also a great opportunity for the security team to learn more about the development team and the specific characteristics of the application being brought under management.

Maintaining the testing policies over time is also required. There must be a process in place for development and security teams to flag false positives and return builds to passing status. Over time, analysis of these false positive reports can provide data on how to either alter the overall testing policy or to further evolve the rules being used for testing. This feedback loop allows the security and development teams to work together to further the goals of application security testing in CI/CD pipelines: find important vulnerabilities quickly and report them to development teams for resolution without slowing the process down with a lot of false positive “noise.”
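The feedback loop can start with something as simple as counting, per rule, how often triaged findings turn out to be false positives. The thresholds and rule names in this sketch are assumptions; the point is only that triage data can drive ruleset tuning.

```python
from collections import Counter


def noisy_rules(triaged, threshold=0.5, min_reports=5):
    """Return rule IDs whose false-positive rate suggests they need tuning.

    triaged is an iterable of (rule_id, is_false_positive) pairs collected
    when developers or the security team review findings.
    """
    totals, false_positives = Counter(), Counter()
    for rule_id, is_false_positive in triaged:
        totals[rule_id] += 1
        if is_false_positive:
            false_positives[rule_id] += 1
    return sorted(
        rule_id
        for rule_id, count in totals.items()
        if count >= min_reports and false_positives[rule_id] / count >= threshold
    )
```

Rules returned by this function are candidates for tuning or for removal from the CI ruleset. Rules with too few reports are ignored until more data is available.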


There are many benefits to incorporating application security testing in developers’ CI/CD pipelines. This testing allows development teams to be informed quickly about serious security vulnerabilities that have been introduced to their codebase so that those vulnerabilities can be fixed. It also gives development teams confidence that they are ready for “continuous delivery” because security aspects of code correctness have been addressed along with more traditional functional aspects. However, to successfully introduce application security testing to CI/CD pipelines, security teams must accept some risk and make concessions involving the depth and breadth of testing with the belief that shallow testing done more frequently provides value. Understanding political tradeoffs within the organization as well as understanding how to best tune application security testing tools to meet these somewhat esoteric goals will allow security managers to reduce risk via tighter integration with development team efforts.

We have been doing quite a bit of work helping organizations integrate application security testing into their CI/CD pipelines and we are going to be distilling a lot of those experiences into ThreadFix to make it even easier for teams to reap the benefits. Contact us if you would like to know more about staying secure during your transition to DevOps.

Several folks looked at drafts of this blog post and provided feedback. Any good ideas are likely stolen from them and any bad ones that remain are my own. Thanks to Bryan Beverly, John Dickson, Cap Diebel, Matt Konda, Greg Leeds, Andrew Montz, Kyle Pippin, David Rook, Matt Snider, and Ben Tomhave.

About Dan Cornell

A globally recognized application security expert, Dan Cornell holds over 15 years of experience architecting, developing and securing web-based software systems. As the Chief Technology Officer and a Principal at Denim Group, Ltd., he leads the technology team to help Fortune 500 companies and government organizations integrate security throughout the development process. He is also the original creator of ThreadFix, Denim Group's industry leading application vulnerability management platform.