In this article, I will explain several software testing metrics and KPIs, why we need them, and how we should use them. This article is based on my experiences and understanding. I will also use several quotes from various books and articles; they are listed in the references section of this article.
I want to start with metrics. Metrics can be very useful as well as very harmful to your development and testing lifecycle; it depends on how you interpret and use them. In any kind of organization, people (managers, testers, developers, etc.) talk about metrics and how to do measurements in the right way. Some of them use very dangerous metrics to assess team members’ performance; others use relevant, meaningful, and insightful metrics to improve their processes, efficiency, knowledge, productivity, communication, collaboration, and even psychology. If you measure the correct metrics in the right way and transparently, they will guide you to understand the team’s progress toward certain goals and show the team’s successes and deficiencies.
In lean software development, we should focus on using concise metrics that drive our continuous improvement. One of the references quotes Implementing Lean Software Development: From Concept to Cash, by Mary and Tom Poppendieck: the fundamental lean measurement is the time it takes to go “from concept to cash,” from a customer’s feature request to delivered software. They call this measurement “cycle time.” The focus is on the team’s ability to “repeatedly and reliably” deliver new business value. The team then tries to continuously improve its process and reduce the cycle time.
These kinds of metrics need a “whole team” approach because all the team members’ efforts influence the results. Time-based metrics are critical to improving our speed, so we should ask ourselves questions about speed, latency, and velocity. In the agile world, one of the most widely used metrics is Team Velocity. It shows how many story points (SP) the team completes during a single sprint, and it is one of the fundamental metrics of Scrum. It is calculated at the end of each sprint by summing up the completed user story points.
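As a minimal sketch (the stories, story points, and “done” statuses below are made-up illustration, not a real backlog), velocity is just the sum of completed story points:

```python
# Team velocity: sum of story points for user stories completed in the sprint.
# The story list and status values are illustrative assumptions.
stories = [
    {"title": "Login page", "points": 5, "status": "done"},
    {"title": "Search filter", "points": 3, "status": "done"},
    {"title": "Export to PDF", "points": 8, "status": "in progress"},
]

velocity = sum(s["points"] for s in stories if s["status"] == "done")
print(velocity)  # 8
```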
Lean development focuses on delighting end users (customers), which should be the goal of the whole team. Thus, we also need to measure business metrics such as Return on Investment (ROI) and business value metrics. If the main business goal of the software is to reach and gain more customers, we need to deliver software that serves this purpose.
In another reference, Sarialioğlu states that “without any metrics and measures, software testing becomes a meaningless and unreal activity. Imagine, in any project, that you are not aware of the total effort you have expended and the total number of defects you have found. Under these circumstances, how can you identify the weakest points and bottlenecks?” This approach is more metric-based, and with it you talk with numbers such as:
- Total Test Effort and Test Efficiency (with schedule and cost variances)
- Test Execution Metrics (passed/failed/in progress/blocked etc.)
- Test Effectiveness (number of defects found in system/total number of defects found in all phases)
- Total number of Defects and their Priorities, Severities, Root Causes (dev, test, UAT, stage, live)
- Defect Turnaround Time (Developer’s defect fixing time)
- Defect Rejection Ratio (Ratio of rejected/invalid/false/duplicate defects)
- Defect Reopen Ratio (ratio of fixed defects that are reopened)
- Defect Density (per development day or line of code)
- Defect Detection Ratio (per day or per testing effort)
- Test Coverage, Requirement Coverage, and so on…
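To make one of these concrete, here is a small sketch of Test Effectiveness as defined above: defects found in system testing divided by defects found in all phases. The phase counts are made-up illustration, not real project data:

```python
# Test Effectiveness = defects found in system test / total defects in all phases * 100.
# The counts below are illustrative assumptions.
defects_by_phase = {"dev": 12, "system_test": 45, "uat": 8, "live": 5}

total_defects = sum(defects_by_phase.values())  # 70
test_effectiveness = defects_by_phase["system_test"] / total_defects * 100
print(round(test_effectiveness, 1))  # 64.3
```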
In the “Mobile Testing Tips” book, the importance of metrics is stated as follows: “without the knowledge you would obtain through proper test metrics, you would not know how good/bad your testing was and which part of the development life cycle was underperforming.” The whole team approach is also critical for the metrics that you measure and report. Some of the tips are listed below; for the rest, I suggest you read the book.
- Tell people why metrics are necessary.
- Explain each metric that you gather to all team members and stakeholders, not only the test team.
- Make people believe in metrics.
- Try to be informative, gentle, and simple in your reports.
- Try to evaluate/monitor/measure processes and products rather than individuals.
- Try to report point in time and trend metrics.
- Try to add your comments and interpretation with your metrics.
- Try to be 100% objective.
- Metrics should be accessible 24/7.
The book also categorizes metrics into three sections: test resources, test processes, and defects. Resource metrics are about time, budget, people, effort, efficiency, etc. Process metrics are about test case numbers, statuses, requirement coverage, etc. Defect metrics are about the number of defects, defect statuses, defect rejection ratio, defect reopen ratio, defect root causes, defect platforms, defect types, etc. Finally, the book states that metrics make the test process transparent, visible, countable, and controllable, and in this way allow you to repair your weak areas and manage your team more effectively.
Some approaches, such as Rapid Software Testing (RST) by Bach & Bolton, state that you should use discussion rather than KPIs and objective metrics. Bach emphasized that you need to gather relevant evidence through testing and other means, and then discuss that evidence.
Bach also wrote, in Lesson 65 of the “Lessons Learned in Software Testing” book: “Never use the bug-tracking system to monitor programmers’ performance”. If you report that a programmer has a huge number of defects, he will give his all to fixing his bugs and postpone all his other tasks. Another crucial mistake is to attack and embarrass a developer for his bugs. This causes big problems for team collaboration and the whole team approach. The other developers will also respond very defensively; they will start to argue over each bug and reject most of them. They will typically say, “It works on my machine”, “Have you tried it after clearing the browser cache with CTRL+F5?”, “This is a duplicate bug”, “It is not written in the requirements”, and so on. Worst of all, they may start to attack your testing methods, approaches, strategies, and skills. This creates a terrific mess in the team, serious communication problems among team members, and reduced team efficiency.
Lesson 66 tells us, “Never use the bug-tracking system to monitor testers’ performance”. If you start to evaluate testers by the number of bugs they find, they may start to behave in unintended ways. They will focus on easy bugs, such as cosmetic defects, and on bug counts rather than questioning the requirements and examining edge cases. They may report the same bugs several times, which irritates the developers and wastes time. Testers will be less likely to spend their time coaching other testers or on self-improvement activities, and it affects their psychology in a bad way. For example, in team A, if the developers wrote unit tests with 99% coverage, ran the main business flow tests before the testing phase, and the test environment, test data, network, etc. are very stable, then tester X may not find many defects. On the other hand, in team B, if the developers did not do the necessary things before the testing phase, tester Y may find a great many defects. Under these conditions, assessing tester X and tester Y by bug counts would be very unfair. You need to understand the reasons for the bugs and question them, not just take the bug counts into consideration.
I totally agree with Bach on lessons 65 and 66. Assessing developers and testers by bug counts leads to too many problems. You need to focus on the reasons for the bugs and improve your system, process, methodologies, strategies, plans, etc. There are several lessons related to bugs in the “Lessons Learned in Software Testing” book; it is worth reading.
In 2004, Cem Kaner and Walter P. Bond published an article on metrics. In its first section, they state that some companies established metrics programs to conform to the criteria established in the CMMi and TMMi models, and that few of them succeeded. These metrics programs are very costly; Fenton estimates a cost of 4% of the development budget. Robert Austin wrote about measurement distortions and dysfunctions: the main idea is that if a company is managed by measurement results, and those measurements (metrics) are inadequately validated, insufficiently understood, and not tightly linked to the attributes they are intended to measure, then measurement distortion follows. Kaner and Bond proposed a new approach: the use of multidimensional evaluation to obtain the measurement of an attribute of interest. They conclude their paper this way: “There are too many simplistic metrics that don’t capture the essence of whatever it is that they are supposed to measure. There are too many uses of simplistic measures that don’t even recognize what attributes are supposedly being measured. Starting from a detailed analysis of the task or attribute under study might lead to more complex, and more qualitative, metrics, but we believe that it will also lead to more meaningful and therefore more useful data.”
In my opinion, this way you may get very useful data to help and improve the testers. In practice, however, it takes a lot of time, testers may perceive it as “micromanaging the details of the tester’s job,” and if the team practices an agile development framework, it is much harder to follow this approach.
In another article, the following suggestions are provided:
- When using metrics, we should go beyond metrics and seek qualitative explanations or “stories” being told about metrics.
- When you measure humans and their work in numbers, use the metrics to gain more information; treat them as clues to uncover and solve deeper issues.
- Do not use metrics to judge the work of a human; do not reward or punish an individual’s work. Programmer/tester productivity metrics are something you need to avoid.
- Create an environment where metrics misuse can be minimized.
- Metrics are meant to help you think, not to do the thinking for you. – Alberto Savoia
I agree with the above suggestions, and from this point I want to share some metrics with you. Metrics may be endless; you may need to choose them based on your software development life cycle, development framework, company culture, goals, etc. I will present some software testing metrics below.
Number of User Stories (If you use Agile Scrum)
The number of stories in each sprint.
Number of Test Cases
Number of test cases per project/phase/product/tester/story, etc. You can generate many metrics based on test case counts.
Test Case Execution Progress
It is the test case execution progress of a sprint, project, phase, or custom period.
- Passed Test Cases
- Failed Test Cases
- Blocked Test Cases
- In Progress Test Cases
- Retested Test Cases
- Postponed Test Cases
Test Tasks Distribution per Tester
It is the distribution of test tasks in your team. If you use JIRA, you can easily gather this information with a JQL query and a pie chart.
Test Tasks Status
You can monitor this metric per sprint, project, or a defined time period.
Test Tasks per Projects
It shows how many test tasks there are in each project in a given period.
Test Case Writing Tasks Distribution per Tester
If you are a classical testing guy and write strict test cases in your project, you can measure them per tester.
Test Case Writing Tasks Status
It is the status of test case writing tasks for a given period.
Test Case Writing Tasks per Projects
It shows how many test case writing tasks you have for each project in a given period of time.
Created Defects Distribution per Tester
It shows the created defects per tester in a given time interval.
Defects Status
It shows the statuses of defects in a given time interval.
Defects Root Causes
It shows root causes of the defects in a given time interval.
Defect Tasks Priority (Such as Minor – Major – Critical – Blocked)
It shows business priorities of defects.
Defect Tasks Severity (Such as Minor – Major – Critical – Blocked)
It shows system level severities of defects.
Defects Environment Distribution (Such as Test – UAT – Staging – Live)
It shows the distribution of defects per environment.
Defects per Project
It shows the distribution of defects per project.
Resolved/Closed Defects Distribution per Developer
It shows the resolved defects distribution per developer.
Regression Defects Count
It shows how many regression defects you have in a given time period.
Resolved/Closed Defects Distribution per Project Hour
It shows how much time the team spent on defects per project in a given time period.
Worklog Distribution of Testers per Tasks (Such as Test Execution, Test Case Writing, etc.)
It shows how the testers’ logged work is distributed across task types.
Test Case – Requirement Coverage
This metric shows how many requirements you covered with your test cases.
Number of Defects Found in Production
This metric shows the defects found in production. It reflects the quality of your development, system, network, etc. You should do a Pareto analysis to find and prioritize your major problems.
Cumulative Defect Graph
This graph shows the defects count cumulatively in a given time period.
Test Case Execution Activity
This shows the activity of your test case execution statuses in a given period.
Test Case Activity per Day
This metric shows how many test cases were created and updated per day.
Test Case Distribution per Tester
This shows how many test cases are written by each tester in a team.
Cost per Detected Defect
It is the ratio of test effort and bug count in a given period such as sprint.
Example: 4 test engineers performed 2 days of testing and found 10 defects.
4 people * 2 days * 8 hours = 64 hours
64 / 10 = 6.4 hours per defect
The test engineers spent 6.4 hours per defect found.
I think this is a very low-level metric, and it may be hard to measure, too. We should focus on delivering valuable, defect-free, high-quality, fast, secure, usable products.
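The arithmetic in the example above can be wrapped in a small helper; the 8-hour workday is an assumption carried over from the example:

```python
def cost_per_detected_defect(engineers, days, defects, hours_per_day=8):
    """Testing hours spent per detected defect."""
    return engineers * days * hours_per_day / defects

# The numbers from the example: 4 engineers, 2 days, 10 defects.
print(cost_per_detected_defect(4, 2, 10))  # 6.4
```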
Defects per Development Effort
This is the ratio between development effort and the defects in a given time period such as sprint, week, month, etc.
Example: 5 programmers performed 5 days of software development activity, and a total of 100 defects were found.
5 people * 5 days * 8 hours = 200 hours
200 / 100 = 2 hours
This shows that one defect arises for every 2 hours of development effort.
Defect per User Story
This metric shows defect counts per user story.
Defect Fixation Time
This metric shows how much time developers spent fixing defects.
Defects per Module
It is a report that shows how many defects have been found in each module of the product in a given time period.
Example: 20% of the defects are found in the Job Search module and 30% are found in the Admin screens.
Successful Sprint Count Ratio
Successful Sprint Count Ratio = (Successful Sprint # / Total Sprint #) * 100
The sprint goal must be met and approved by the PO at the end of each sprint. All successful sprint statistics are collected, and the percentage of successful sprints is calculated. Some agile teams use this metric as a KPI, but most agile enthusiasts are against this practice.
Quality Ratio
Quality Ratio = (Successful Test Cases / Total Test Cases) * 100
It is a metric based on the passed/failed rates of all tests run in a project over a determined period. Please do not use this metric to assess an individual; focus on problems and try to solve them. It should lead you to ask questions. If you use it as a KPI, it may cause problems between testers and developers in the software development life cycle and may damage transparency: developers will want the testers to open fewer defects, and testers may tend to open fewer defects to keep the “Quality Ratio” higher. Be careful with this metric.
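As a minimal sketch (the pass/total counts are made up for illustration), the calculation itself is a simple percentage:

```python
def quality_ratio(passed_cases, total_cases):
    """Quality Ratio = (successful test cases / total test cases) * 100."""
    return passed_cases / total_cases * 100

# Illustrative counts, not real data: 180 of 200 executed cases passed.
print(round(quality_ratio(180, 200), 1))  # 90.0
```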
Test Case Quality
Written test cases are evaluated and scored according to defined criteria. If it is not possible to examine all the test cases, they are evaluated by sampling. Our goal should be to produce quality test cases by paying attention to the defined criteria in all the tests we write. You may use this metric as a KPI.
Test Case Writing Criteria:
- Test cases must be written for fault finding.
- Application requirements must be understood correctly.
- Areas to be affected should be identified.
- The test data should be accurate and cover all possible situations.
- Success and also failure scenarios should be covered.
- The expected results should be written in the correct and clear format.
- Test & Requirement coverage must be fully established.
- When the same tests are repeated over and over again, they eventually stop finding new defects (the “pesticide paradox”). To overcome this, test cases should be regularly reviewed, and new tests should be written for potential defects in different parts of the system.
- Each test scenario must absolutely cover a requirement.
- Requirements and test scenarios should not be written with words like “possibly” or “maybe”; exact expected results should be given.
UAT Defect Leakage/Slippage Ratio
Defect Leakage = (Total Number of UAT Defects) / ((Total Number of Valid Test Defects) + (Total Number of UAT Defects))*100
UAT Defects: defects found by the PO in the UAT environment, after the requirements have been coded, unit tests have passed, and test execution has been finished by the test engineers.
Both the DEV and TEST teams are responsible for sending a faultless application to UAT. The number of faults found during UAT is expected to be lower than the number of valid faults found during the testing and development processes; if the UAT defect count exceeds the TEST defect count, we can say there is a significant problem in the development and testing phases. The UAT Defect Leakage is obtained by dividing the defects found in UAT by the sum of the UAT and TEST defects and multiplying the result by 100. Some teams use this metric as a KPI for their testers and developers.
1) Valid test defects in the test environment = 211
2) Defects in the UAT environment = 13
3) UAT Leakage Ratio = (13 / (211 + 13)) * 100 = 5.8%
Note: This measurement can be taken one step further by multiplying each defect by its priority and severity weight. With this method, defect scores weighted by priority and severity are obtained.
Trivial: 0 Point, Minor: 1 Point, Major: 2 Point, Critical: 3 Point, Blocker: 4 Point
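Here is a sketch of the plain ratio and the severity-weighted variant, using the numbers and point scale above (the severity labels in the weighted example are made up):

```python
SEVERITY_POINTS = {"trivial": 0, "minor": 1, "major": 2, "critical": 3, "blocker": 4}

def uat_leakage_ratio(uat_defects, valid_test_defects):
    """UAT Defect Leakage = UAT defects / (valid test defects + UAT defects) * 100."""
    return uat_defects / (valid_test_defects + uat_defects) * 100

# Example numbers from the text: 211 valid test defects, 13 UAT defects.
print(round(uat_leakage_ratio(13, 211), 1))  # 5.8

def weighted_defect_score(severities):
    """Sum severity points over a list of defect severity labels."""
    return sum(SEVERITY_POINTS[s] for s in severities)

# Illustrative severity labels, not real data:
print(weighted_defect_score(["major", "critical", "minor"]))  # 6
```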
Defect Removal Efficiency Before Staging
Defect Removal Efficiency = (Total Number of (Test + UAT) Defects / Total Number of (Test + UAT + Staging) Defects) * 100
It shows how successfully defects are caught before moving to the staging environment. Ideally, Test Defects > UAT Defects > Staging Defects. With this metric, we measure our defect detection and removal efficiency prior to staging environment testing.
For example, if 20 Test, 10 UAT, and 5 Staging defects are captured, the defect removal efficiency is calculated as (30 / 35) * 100 = 85.7%. Some teams use this metric as a KPI.
You can also measure your defect removal efficiency before Production.
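The calculation, sketched with the example numbers above (20 Test, 10 UAT, 5 Staging defects):

```python
def defect_removal_efficiency(defects_before, defects_after):
    """Percentage of defects caught before reaching the later environment."""
    return defects_before / (defects_before + defects_after) * 100

# Before staging: Test + UAT defects; leaked to staging: 5.
print(round(defect_removal_efficiency(20 + 10, 5), 1))  # 85.7
```

The same helper works for the pre-production variant: pass Test + UAT + Staging defects as the first argument and production defects as the second.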
Defect Resolution Success Ratio
Defect Resolution Success Ratio = ((Total Number of Resolved Defects – Total Number of Reopened Defects) / Total Number of Resolved Defects) * 100
It is a KPI that shows how many of the defects that are resolved are reopened. Ideally, if all defects do not reopen, 100% success is achieved in terms of resolution. If 3 out of 10 defects are reopened, the resolution success will be ((10-3)/10)*100 = 70%.
Conversely, subtracting this ratio from 100 gives the “Defect Fix Rejection Ratio”.
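Both ratios in this section can be sketched together, using the numbers from the example above:

```python
def resolution_success_ratio(resolved, reopened):
    """((resolved - reopened) / resolved) * 100."""
    return (resolved - reopened) / resolved * 100

# 3 of 10 resolved defects were reopened.
success = resolution_success_ratio(10, 3)
print(success)        # 70.0
print(100 - success)  # 30.0 -> Defect Fix Rejection Ratio
```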
Retest Ratio
Retest Ratio = (Total Count of Reopened Defects / (Total Number of Resolved Defects + Total Count of Reopened Defects)) * 100
It is a metric that shows how often defects are REOPENED and RETESTED. Every reopened defect has to be retested, and this metric shows the share of effort we lose retesting resolved defects.
Total Count of Reopened Defects = Total Number of Retests
For example, if none of the 10 defects are Reopened:
(0 / (10 + 0)) * 100 = 0
A 0% retest rate indicates that we have not spent any effort on retesting; we are very efficient on this metric.
For example, if 10 defects are Reopened 30 times:
(30 / (10 + 30)) * 100 = 75
A 75% retest ratio indicates that 75% of our defect testing effort goes to retesting.
Some organizations use this metric as KPI to assess development teams.
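A sketch covering both cases worked in the text (no reopens, and 10 defects reopened 30 times in total):

```python
def retest_ratio(resolved, reopen_count):
    """Reopens / (resolved + reopens) * 100."""
    return reopen_count / (resolved + reopen_count) * 100

print(retest_ratio(10, 0))   # 0.0
print(retest_ratio(10, 30))  # 75.0
```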
Rejected Defect Ratio
Rejected Defect Ratio = (Number of (Test + Staging) Rejected Defects/ Total Number of (Test + Staging) Defects) * 100
It is a metric that measures the rate of invalid defects a test engineer has opened. It can be used to measure defect reporting efficiency: too many rejected defects indicate inefficiency and time loss in the development life cycle. Some organizations use this metric as a KPI for their test teams.
Total Number of (Test + Staging) Defects: 217
Rejected (Test + Staging) Defects: 3
Rejected Defect Ratio = (3 / 217) * 100 = 1.38%
I did not take UAT defects into consideration because they are found by the PO in the UAT environment. If you want, you can also add UAT defects to this measurement, but in that case you should not use this metric as a tester KPI.
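The same example, sketched in code:

```python
def rejected_defect_ratio(rejected, total):
    """Rejected (Test + Staging) defects / total (Test + Staging) defects * 100."""
    return rejected / total * 100

# From the example: 3 rejected out of 217 defects.
print(round(rejected_defect_ratio(3, 217), 2))  # 1.38
```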
Test Case Defect Density
Test Case Defect Density = (Number of Defects/ Total Number of Test Cases) * 100
It is the density of defects found by the executed test cases. For example, if 200 of 250 test cases are run and 20 of them reveal defects, the defect density is (20 / 200) * 100 = 10%.
My comment on this metric/KPI: while this metric may look like a success for the development team, it can be interpreted as a failure for the test team, or vice versa. If we use it this way, we will create serious conflicts between the development and testing teams, so it is more logical not to treat it as a KPI for either team. The metric can vary with the size of the development work, the experience of developers and testers, time pressure, lack of documentation, a poor SDLC process, and so on. Too many variables affect it; it is better not to use it as a KPI.
You can also use metrics to prepare your sprint test status reports. Below is a sample set of test status report fields; you can modify it as you wish.
Sprint Status Report
- Obstacles and Emergency Support Needed Problems
- Test Case Execution Status
- Test Case – Requirement Coverage
- Test Tasks Status
- UAT Status
- Defect Reports per Statuses/Types/Root Causes/Environments etc.
- Sprint Burn-Down Chart
- Sprint Progress Chart
Process adherence and improvement (KPI)
This can be a KPI for your team members. If a team member comes up with unique and innovative ideas, and these ideas let you perform your test efforts faster and with better quality, you should congratulate and reward that team member. Make them happy and passionate.
Obtain an International Test Certification (KPI)
This can be a KPI for your testers. Especially if there are junior testers in your team, you can set a goal for them to obtain an international test certification; this will also improve their testing knowledge. If your testers are experienced, you can set advanced-level certification goals. However, if you believe in test philosophies such as RST (Rapid Software Testing), you should not spend time on this kind of certification. This KPI depends on your software testing philosophy.
Make a Presentation at Meetups or Conferences (KPI)
Making technical presentations is a good goal for your testers. It improves their passion for and vision of software testing. They can also widen their networks, and they will act as ambassadors of your company and team. At each meetup or conference they learn new things and can share their knowledge. Support your testers in promoting themselves and your team, sharing their knowledge, and becoming international software testers.
Complete Online Training Programs (KPI)
There are several online learning websites available, so you can assign relevant courses to your team members and monitor their completion. In this way, you can improve your team’s skills and competences.
Soft Skill KPIs
There are a lot of soft skill KPIs that you can set as goals for your team, such as:
- Initiative and Dynamism
- Problem Solving
- Technical Competence
- And so on…
Also, according to one of the references, an agile tester should follow these principles:
- Provide continuous feedback (Proactive)
- Deliver value to customer (Result oriented)
- Enable face to face communication (communicative)
- Have Courage (Brave)
- Keep it Simple (Lean)
- Practice Continuous Improvement (Visionary)
- Respond to Change (Flexible)
- Self-Organize (Motivated)
- Focus on People (Synergistic)
- Enjoy (Humoristic)
Also, at SeleniumCamp 2017, Alper Mermer talked about software testing metrics and listed them as follows:
- Production Monitoring and Metrics
- Performance Measurement
- Security Warnings
- Code Quality Metrics (Static Code Analysis Metrics)
- Test Coverage Metrics
- API + GUI Test Metrics
- Defects by Priority and Severity
- Production Bugs / Incidents
- Build Failures and so on…
As you have seen throughout this article, metrics are limitless. Using some of them as KPIs may be dangerous. Use them to see where you stand and to question your system, methodology, and deficiencies for continuous improvement. If you use them as a tool for rewarding or punishing, they will create enormous pressure and stress on your team. Try to use them in the right way and effectively, based on your beliefs, philosophy, organization culture, etc. But whatever you do, please be fair! Seek first to understand, then act! Empathize! Create trust and love! Help your team! Believe in them! This is my advice for being a “good” maestro.
You can also read my Q&A on Agile Testing Mindset article here: http://www.swtestacademy.com/agile-testing/
This was a long article, but I hope you enjoyed it. Happy Testing! :)
References
- Agile Testing: A Practical Guide for Testers and Agile Teams – Lisa Crispin, Janet Gregory, Addison-Wesley Signature Series
- Software Testing Tips: Experiences & Realities – Baris Sarialioglu, Keytorc Inspiring Series
- Lessons Learned in Software Testing – Cem Kaner, James Bach, Bret Pettichord
- Mobile Testing Tips: Experiences & Realities – Baris Sarialioglu, Keytorc Inspiring Series
- N. E. Fenton, “Software Metrics: Successes, Failures & New Directions,” presented at ASM 99: Applications of Software Measurement, San Jose, CA, 1999.
- R. D. Austin, Measuring and Managing Performance in Organizations. New York: Dorset House Publishing, 1996.
Onur Baskirt is a senior IT professional with 15+ years of experience. Now, he is working as a Senior Technical Consultant at Emirates Airlines in Dubai.