Every day I hear the same question: “How can I improve my coverage with little effort?” Of course, this is a loaded question. What kind of coverage do you mean?
There are at least four ways to think about coverage:
- Code coverage
- Application (Actions, Pages, States) coverage
- Real user flow coverage
- Test coverage
In the past we defined them this way:
- Code Coverage: Did we activate every measurable part of the code? This was generally difficult to get near 100%, as it would require thousands of scripts. In general we see teams achieve between 5% and 30% code coverage. To measure it, you must instrument your code with one of a variety of tools that watch whether each section actually gets activated during testing.
- Application Coverage: Did we get to every unique page and state and execute every possible action in the application? This may be more important than code coverage, since it tests all of the possible ways a user can use the application, no more and no less. While the scripts would be time consuming to write and maintain, a few thousand of them may be able to achieve this. In general we have seen teams achieve 10% to 30% application coverage today. This results in releases that disappoint users with bugs the defined test coverage did not find. Many strive for 100%, but of course the more coverage you have, the more scripts you must maintain.
- Real User Flow Coverage: Do we execute every user story against the new build that real users actually followed in production? This has been an impossible dream (the tech did not exist), but it is the ultimate test. In general, teams achieve only a small percentage of this today, and the actual figure is unknown because they do not know how people really use the application.
- Test coverage: This is the metric QA has often used, but it is a bit pointless when you dig into it. Did we test what a business analyst (BA) told us to? We don’t know if those user stories will ever be used…but that’s what BAs want, so we will do that. And often we say we have close to 100% test coverage. But that essentially means implementing tests that are a guess at user stories and have no relation to actions, code or anything else. So test coverage is only as good as the guess of the business analyst or product manager. Are they 20% correct? 80%? Who knows. How does it relate to everything the app can do? It doesn’t. What about what real users do? No relation. It’s just a guess, and that’s not a very scientific method of measuring results.
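All four definitions come down to the same arithmetic: the share of some universe of items (code sections, application actions, production flows, BA stories) that the tests actually exercised. Here is a minimal Python sketch of that calculation for application coverage. The `coverage` helper and the page and action names are hypothetical, made up for illustration and not taken from any particular tool.

```python
def coverage(exercised, universe):
    """Return the percentage of `universe` items that appear in `exercised`."""
    universe = set(universe)
    if not universe:
        return 0.0  # nothing to cover; avoid dividing by zero
    covered = universe & set(exercised)  # ignore items outside the universe
    return 100.0 * len(covered) / len(universe)


# Hypothetical inventory of (page, action) pairs the application exposes.
app_actions = {
    ("login", "submit"),
    ("login", "forgot_password"),
    ("cart", "add_item"),
    ("cart", "remove_item"),
    ("checkout", "pay"),
}

# Hypothetical (page, action) pairs the current scripts actually trigger.
tested_actions = {("login", "submit"), ("cart", "add_item")}

print(f"Application coverage: {coverage(tested_actions, app_actions):.0f}%")
# prints "Application coverage: 40%"
```

Swapping in a different universe (instrumented code sections, flows seen in production, BA stories) gives the other three metrics. The hard part is never the division; it is obtaining an honest denominator, which is why real user flow coverage has been unknowable without production data.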
So that brings us to modern testing, which has made use of AI-generated scripts for a few years now (no recording, scripting, or writing; that is, fully machine generated).
And the new question is: “How do we know we are achieving the desired coverage with AI-generated scripts?”
Again, we have to go back to our four definitions above to better answer this.
With Appvance IQ, we have seen the following typical results:
- Code Coverage: Did we activate every measurable part of the code? The AI Blueprint creates thousands of user stories by itself, fully machine generated with no code or recording, following the training and validations, to achieve close to 100% each time, or as close as the UI will allow.
- Application Coverage: Did we get to every unique page and execute every possible action in the application? The AI Blueprint creates thousands of user stories itself, following the training and validations, to achieve near 100% each time. In fact, by definition, AI blueprinting must execute each and every action and get to each unique page.
- Real User Flow Coverage: Do we execute every user story against the new build that real users actually followed in production? Since the AI regression test generator recreates the actual user flows from production (with no additions to the application’s code) and applies them to the new build, this is now the gold standard of testing. Will the new build work for our users the way they have been using it? The answer is yes. Typically 5000 or so data-driven regression tests are created, providing near 100% coverage of actual user flows. In fact, one can set up the generation to create every user story it sees from production until it runs out of unique flows, achieving regression coverage above 99% each time.
- Test coverage: Did we test what a BA told us to? Given the near 100% results of the categories above, this category becomes less critical, since these user stories are simply created by the BA. However, it’s good practice to create these stories with scripts (say in AIQ Test Designer), which is 20X faster than creating them in Selenium, if for no other reason than to please the BA. Of course, there is no “black box” in any of the above tests. One can open any machine-generated script to review it, replay it, or re-use it. But in the time it would take to describe a specific user flow or compare it to others, you could have written it in Test Designer (which simply records the QA engineer’s actions and creates the editable script). So the combination of Test Designer scripts to satisfy your BA, with thousands of AI-generated user stories, seems to be the killer combo.
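To make “until it runs out of unique flows” concrete, here is a rough Python sketch of the underlying idea: collapse production sessions into distinct action sequences, then check how many of them a regression suite replays. This is an illustration only, not how Appvance IQ’s generator works internally. The event format `(session_id, timestamp, action)`, the function names, and the sample data are all assumptions.

```python
from collections import defaultdict


def unique_flows(events):
    """Group (session_id, timestamp, action) events into per-session flows
    and return the set of distinct action sequences."""
    sessions = defaultdict(list)
    for session_id, timestamp, action in events:
        sessions[session_id].append((timestamp, action))

    flows = set()
    for steps in sessions.values():
        steps.sort()  # order each session's steps by timestamp
        flows.add(tuple(action for _, action in steps))
    return flows


def flow_coverage(production_flows, regression_flows):
    """Percentage of distinct production flows that the regression suite replays."""
    if not production_flows:
        return 0.0
    replayed = production_flows & regression_flows
    return 100.0 * len(replayed) / len(production_flows)


# Hypothetical production log: three sessions, two of them identical.
events = [
    ("s1", 1, "login"), ("s1", 2, "search"), ("s1", 3, "checkout"),
    ("s2", 1, "login"), ("s2", 2, "search"), ("s2", 3, "checkout"),
    ("s3", 1, "login"), ("s3", 2, "view_orders"),
]

production = unique_flows(events)               # two distinct flows
regression = {("login", "search", "checkout")}  # flows the suite replays

print(f"Real user flow coverage: {flow_coverage(production, regression):.0f}%")
# prints "Real user flow coverage: 50%"
```

Deduplicating first matters: thousands of sessions usually collapse into far fewer distinct flows, and the count of distinct flows is the denominator for this metric.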
What coverage do you want to achieve? Ask yourself: Code? Application? Real regression? BA test coverage? Maybe some combination?
I would argue that application coverage is the key metric to attain: it leaves no stone unturned, it tests every combination of actions a user could actually perform, and we know that low application coverage leads to major bugs getting released.
The technology exists today to help you achieve all of these with less effort than achieving just one using legacy scripts. If you want to achieve more, including true regression, now might be the time to consider AI script generation to augment your efforts.