The cost of underperformance in delivering quality software is steep. In addition to interrupted in-app experiences, bugs often contribute to high rates of customer churn and can damage a brand's image. Users expect highly functional apps with no bugs or issues, and apps that fail to deliver are quickly deemed irrelevant and forgotten. As such, software testing has become increasingly important in the release process.
Since software testing can be a bottleneck in organizations striving for rapid release cycles, teams need objective measures to know when they have tested enough to ensure quality. Coverage is a key measure that many teams use. Today, there are two widely used coverage measures: code coverage and test coverage. These two measures have similar goals—to measure the completeness of testing—but they are different measures with different owners in the software development lifecycle (SDLC) and different implications.
Code Coverage is a measure of how much of your source code was exercised during the execution of your test suite. It is a white-box testing approach, which means you need to have access to the code itself to complete your testing.
While the definition of code coverage may seem fairly straightforward, you need to go a level deeper to truly understand how it can be measured. Wikipedia lists four main types of coverage measures:
- Function coverage – has each function in the code been called?
- Statement coverage – has each statement in the code been executed?
- Branch coverage – has each branch of the software's control structures been executed? This includes “if” statements being tested for both their true and false outcomes.
- Condition coverage – has each Boolean subexpression been evaluated as both true and false? (Beware, this is not the same as branch coverage!)
Today, code coverage measurement tools typically go one step further and also report line coverage: the proportion of your lines of code that were executed during a run of your test suite.
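The distinction between branch and condition coverage is easiest to see in code. The snippet below uses a hypothetical guard function of our own invention to show how two tests can achieve full branch coverage while leaving a Boolean subexpression untested:

```python
def grant_access(is_admin, has_token):
    """Hypothetical access check used only to illustrate coverage criteria."""
    if is_admin or has_token:
        return "allowed"
    return "denied"

# Two tests yield 100% branch coverage: the "if" is both taken and not taken.
assert grant_access(True, False) == "allowed"   # condition true, branch taken
assert grant_access(False, False) == "denied"   # condition false, branch skipped

# But condition coverage is incomplete: has_token is never True, so a bug
# that ignored has_token entirely would slip past this test suite.
```

This is why condition coverage is the stricter criterion: it demands that every subexpression, not just the overall decision, be exercised both ways.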
So how much code coverage is enough? The Google Test team has shared its guidelines: 60% coverage is “acceptable,” 75% is “commendable,” and 90% is “exemplary.” But they also caution that very high code coverage (above 95%) does not necessarily indicate high-quality testing; that depends on how the tests are designed and structured.
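These thresholds are easy to turn into a quick check. Here is a minimal sketch that maps a line-coverage figure onto the Google Test team's labels; the “needs work” label for anything under 60% is our own addition, not part of their guidance:

```python
def coverage_grade(covered_lines, total_lines):
    """Classify line coverage per the Google Test team's guideline thresholds."""
    pct = 100.0 * covered_lines / total_lines
    if pct >= 90:
        return "exemplary"
    if pct >= 75:
        return "commendable"
    if pct >= 60:
        return "acceptable"
    return "needs work"  # our own label; below Google's lowest threshold

assert coverage_grade(92, 100) == "exemplary"
assert coverage_grade(80, 100) == "commendable"
assert coverage_grade(60, 100) == "acceptable"
```

A team could run a check like this in CI against the summary their coverage tool emits, failing the build below whatever threshold they have agreed on.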
As the name and the specific measurement types indicate, code coverage is typically a measure tracked by the development team. Some teams include reviewing code coverage and test results as part of their code review process.
Test Coverage, on the other hand, measures how many requirements are tested during the execution of a test suite. It is considered a black-box testing approach because no knowledge of the code base is required to complete the tests.
This measure is the domain of the business-focused teams, like product management, who are responsible for specifying the requirements in the first place. QA teams often measure their performance using this statistic as well.
While test coverage also sounds straightforward at first glance, it too has a second level of detail to consider. First, there are different types of application requirements that should, or could, be documented and tested: functional requirements, security and performance requirements, and other types of user requirements. Teams often focus primarily on functional requirements and leave the other types out of their coverage calculations.
The second piece of the calculation takes into account the design of the test suite itself. For example, if there are 20 functional requirements documented for an application, there might be 200 test cases written that cover these requirements. The ultimate assessment of test coverage is often measured against this portfolio of tests.
Coverage is then calculated when the test suite is executed; for example, were all 200 test cases executed as part of the test regimen? If only 160 tests were performed, the team might measure test coverage as 80%. But some teams also take into account the requirements covered by each test case. So if the 160 tests executed actually included testing for 17 requirements, the team might assess test coverage at 85% (17 out of 20 requirements).
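The two ways of calculating test coverage described above can be sketched in a few lines, using the same illustrative figures (200 test cases, 160 executed, 17 of 20 requirements exercised):

```python
def execution_coverage(tests_executed, total_tests):
    """Test coverage measured against the test suite itself."""
    return 100.0 * tests_executed / total_tests

def requirement_coverage(requirements_tested, total_requirements):
    """Test coverage measured against the documented requirements."""
    return 100.0 * requirements_tested / total_requirements

# 160 of 200 test cases executed -> 80% coverage of the test portfolio.
assert execution_coverage(160, 200) == 80.0

# Those tests exercised 17 of 20 requirements -> 85% requirement coverage.
assert requirement_coverage(17, 20) == 85.0
```

Note that the two numbers can move independently: skipping 40 low-value tests may barely dent requirement coverage, while skipping one critical test can drop it sharply.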
In our opinion, this measure has several drawbacks. First, it depends on the upfront definition of the requirements, so it is inherently circular. Second, as long-lived and often complex applications grow and change over time, actual user behavior can diverge significantly from the requirements defined by product managers. When that happens, the measured test coverage may bear little relationship to what end users actually require from the application.
So how can organizations, particularly those business-focused teams, move beyond these inherent drawbacks? We propose a third measure—Application Coverage—that is a more complete measure of an application’s functionality.
Application Coverage measures whether every unique page and state was reached, and whether every possible action in the application was executed, during testing. This measure may be more important than code coverage, since it covers all of the possible ways a user can use the application. In the past, it would have been time-consuming to write and maintain scripts to achieve high Application Coverage (a concern the Google team raises even for code coverage), but new script-creation capabilities open up the possibility of tracking and achieving it.
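One way to make this concrete is to track which (page, action) pairs a test run actually reached against a map of everything the application offers. The tracker below is a simplified sketch of our own, not a description of any particular tool; the page and action names are invented for illustration:

```python
class AppCoverageTracker:
    """Hypothetical sketch: track which (page, action) pairs a test run reached."""

    def __init__(self, known_pairs):
        self.known = set(known_pairs)  # every (page, action) the app supports
        self.seen = set()              # pairs actually exercised in testing

    def record(self, page, action):
        self.seen.add((page, action))

    def coverage(self):
        """Application Coverage as a percentage of known (page, action) pairs."""
        return 100.0 * len(self.seen & self.known) / len(self.known)

# Illustrative application map with four possible user actions.
app_map = {("login", "submit"), ("login", "reset_password"),
           ("cart", "checkout"), ("cart", "remove_item")}

tracker = AppCoverageTracker(app_map)
tracker.record("login", "submit")
tracker.record("cart", "checkout")
assert tracker.coverage() == 50.0  # 2 of 4 possible actions exercised
```

The hard part in practice is not this arithmetic but building the map of known pages and actions in the first place, which is exactly what the AI-driven approaches discussed below aim to automate.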
In general, we have seen teams achieve only 10% to 30% Application Coverage with their current test tools. The result is releases that disappoint users, because bugs lurk outside the defined test coverage.
How to Achieve 95+% Application Coverage
We mentioned that achieving high Application Coverage would have been time-consuming and problematic, as prior scripting technologies would have led to an unwieldy level of maintenance for test cases. But those limitations are no longer germane with the advent of AI-written scripts, not just AI-assisted script creation. When AI is deployed to understand an application completely, it can identify all the page states and execute all the possible actions, while also writing the tests as it progresses through the application. That means that as soon as a new build is ready for testing, or pushed to production, the complete breadth and depth of the application can be assessed. What about maintenance, you might be thinking? When AI is deployed to write the tests, it writes them anew each time it traverses the application, meaning that script maintenance is effectively eliminated. Now it has become possible to fully test an application in a continuous testing motion.
This is where AIQ’s AI Blueprint comes in. The AI Blueprint creates thousands of user stories itself, following the training and validations, to achieve near 100% Application Coverage each time. Each AI Blueprint execution can also be evaluated with your code coverage tool, and a high coverage percentage should be expected.
If you want to achieve more, now might be the time to consider AI script generation to augment your efforts.