
Useful metrics for project evaluation

Back in October I already talked about how to evaluate testing; anyone interested can watch the recording here. Today I want to touch on project metrics in general, and not "punitive" metrics, but metrics that are friendly to people and actually improve the project. That is why, instead of dry formulas and a list of metrics, I will tell three stories about introducing and using specific metrics under specific conditions, and about the results achieved with their help.

Why measure anything at all?


You have a project. Your favorite, dear project, which you want to see grow and flourish.
But how do you judge whether it is flourishing if you have no criteria for that prosperity?
How do you react to problems quickly, before they become unrecoverable, if you have no early warning sensor for impending trouble?
And how do you know what to improve if you do not know where the problems come from?

In short, metrics are needed to manage the project effectively: to diagnose problems, localize them, fix them, and check whether the solutions you chose actually help.
I will share several types of metrics, each of which has been tried in practice and brought considerable benefit. Every time you introduce them, the team finds it tedious and uncomfortable: you have to record extra information, measure things, and add bureaucracy. But as soon as a metric brings its first benefit, laziness gives way to discipline and a real understanding of why that particular metric matters.

And if they never do, the metric can safely be thrown away ;)

Story 1: Who let this one in here??


At one large company, management complained about "poor product quality" and blamed it on testing. My task was to analyze the reasons for this unfortunate situation and fix them by any means necessary, preferably yesterday.

Task #1 was obvious to me: estimate the percentage of missed bugs. Is it really true that the testers miss a lot? To do this, we added a "reported by customer" field to the bug tracker, marked the old bugs accordingly, and counted. The result was slightly over 5%, and not all of those were critical.

Is that a lot or a little? In my experience it is a pretty good figure. Where, then, does the opinion come from that the testers miss a lot?

We added another field: "reproduced on the release version." Each time a new bug was registered from the test environment, the testers checked whether it was also present in the latest version shipped to users: maybe users simply do not report the bugs they run into? The result for the first month: about 40% of the bugs registered in the bug tracker were reproducible in the release version.

So we really do miss a lot, but users do not report specific bugs; they simply form the opinion that "your software sucks!". That is how we got our "sensor" metric that tells us something is wrong: the share of registered bugs that are reproducible in the release version.
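
If you are curious how such numbers can be pulled out of a bug tracker, here is a minimal sketch in Python that counts both percentages from a CSV export. The column names ("reported_by_customer", "reproduced_on_release") are my assumptions: substitute whatever custom fields your own tracker actually exports.

```python
import csv

# A minimal sketch: the column names below are hypothetical placeholders for the
# two custom fields described in the story; adjust them to your own export.
def leak_metrics(path):
    total = from_customers = on_release = 0
    with open(path, newline="", encoding="utf-8") as f:
        for bug in csv.DictReader(f):
            total += 1
            if (bug.get("reported_by_customer") or "").strip().lower() == "yes":
                from_customers += 1
            if (bug.get("reproduced_on_release") or "").strip().lower() == "yes":
                on_release += 1
    if total == 0:
        return 0.0, 0.0
    return 100.0 * from_customers / total, 100.0 * on_release / total

customer_pct, release_pct = leak_metrics("bugs.csv")
print(f"Reported by customers:             {customer_pct:.1f}%")
print(f"Reproducible on the release build: {release_pct:.1f}%")
```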


We set a goal (otherwise, why measure anything at all?): no more than 10% of bugs should be reproducible in the release version. But how do we get there? Throw more people at the problem? Extend the schedule?

To answer that, we need to dig further and look for new metrics that will answer it.

In this case, we added one more field for every missed bug: "reason it was missed." The selected value indicates why the bug had not been caught earlier.


I have since investigated the reasons for missed bugs at many companies using this approach, and the results are always different. In the case at hand, more than 60% of the bugs were missed because the testers had not designed a test for that situation, that is, they had not even thought it needed to be tested. Of course, you have to work on all fronts, but we started with that 60%, following the Pareto principle.
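
For the Pareto part, a short sketch like the one below is enough: count how many missed bugs fall under each "reason missed" value and look at the cumulative share to see which few causes account for most of the misses. Again, the column name is an assumption about the export format, not a field from any specific tracker.

```python
import csv
from collections import Counter

# Count the "reason missed" values and print each reason with its share and the
# running cumulative share; the most frequent reasons come first.
def pareto(path, column="reason_missed"):
    with open(path, newline="", encoding="utf-8") as f:
        counts = Counter(row[column] for row in csv.DictReader(f) if row.get(column))
    total = sum(counts.values())
    cumulative = 0
    for reason, n in counts.most_common():
        cumulative += n
        print(f"{reason:<40} {100 * n / total:5.1f}%  "
              f"(cumulative {100 * cumulative / total:5.1f}%)")

pareto("missed_bugs.csv")
```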

A brainstorm on how to crack this produced various solutions: a weekly review of missed defects within the testing group, reviewing test designs with analysts and developers, talking directly to users to study their environments and conditions, and so on. Introducing these new practices bit by bit, in just two months we brought the percentage of missed bugs down to 20%. Without expanding the team and without extending the schedule.

We have not reached 10% yet, but in July it was 14%: we are already very close to the goal, and judging by what the implementation engineers tell us, customers have already noticed the change in quality. Not bad, huh?

Story 2: Where does the time go?


This story is about one of my own projects. We are developing a terribly necessary and useful service, and the development timelines did not exactly warm my soul. Naturally, testing on my own project is in great shape, so why is development barely crawling along?

I started by trying to measure my subjective feeling of "too slow." What does that even mean? What do you compare? KLOC per month? Features per iteration? Average schedule slippage against the plan? The first two metrics will clearly not tell us anything useful, so I began tracking schedule slippage per feature (our iterations do not have a fixed feature set, so they cannot seriously be late: whatever we manage to develop and test in two weeks goes in). But features can be late!

It turned out that we overrun feature deadlines by a factor of 1.5 to 2 on average! I will not go into what it took me to extract this information from Redmine, but there it is. Now I want to dig further, using the "five whys" principle. Why is this happening? Are we planning badly? Do I want results too fast? Are qualifications too low? Where does the time actually go?
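
For those who want to repeat the trick, here is a rough sketch of how such slippage could be computed through the Redmine REST API. It assumes your Redmine version exposes start_date, due_date and closed_on on issues (older versions lack closed_on) and that features are ordinary tracker issues; the URL and key are placeholders, and you will probably want extra filters for your setup.

```python
from datetime import date
import requests

# Placeholders: point these at your own Redmine instance and API key.
REDMINE = "https://redmine.example.com"
API_KEY = "your-api-key"

def closed_issues(limit=100):
    # Fetch recently closed issues; add tracker/project filters as needed.
    resp = requests.get(f"{REDMINE}/issues.json",
                        params={"status_id": "closed", "limit": limit, "key": API_KEY})
    resp.raise_for_status()
    return resp.json()["issues"]

ratios = []
for issue in closed_issues():
    start, due, closed = issue.get("start_date"), issue.get("due_date"), issue.get("closed_on")
    if not (start and due and closed):
        continue  # skip issues without complete planning data
    planned = (date.fromisoformat(due) - date.fromisoformat(start)).days or 1
    actual = (date.fromisoformat(closed[:10]) - date.fromisoformat(start)).days
    ratios.append(actual / planned)

if ratios:
    print(f"Average actual/planned feature duration: {sum(ratios) / len(ratios):.2f}x")
```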

I started analyzing: on average, a single small feature accounts for 15 to 40 bugs, and fixing them takes longer than developing the feature itself. Why? Is that a lot or a little? The developers complain that there are many requests to change functionality that has already been built: is that true, or is it a subjective impression?

We dig further. Into the poor bug tracker, already swollen with extra fields, I add one more: "cause of the bug." Not the cause of missing it, as in Story 1, but the cause of its appearance. The developer fills in this field at commit time, when he already knows exactly what he fixed and how. The answer options are: an error in the code, a change in the requirements, or a misunderstanding of the requirements.


Errors in the code accounted for about 30%. Changes in the requirements, less than 5% (the developers were surprised, but had to accept it: they are the ones filling in the reason!). And almost 70% of the bugs were caused by misunderstanding the requirements. In our case, where fixing bugs takes longer than developing the feature itself, that is HALF OF ALL THE TIME SPENT ON THE FEATURE.

What to do?

We came up with many possible solutions, from hiring a technical writer who would elicit the requirements from the product owner and document in detail everything we currently describe in a couple of lines, to demoting the product owner to a secretary who documents new features day and night. We did not like any of these options: they are too bureaucratic for a team of four developers sitting in one office. So we did something much simpler: before work on a feature starts, the two or three people involved sit down and talk the feature through until everyone understands the requirements the same way.


We now hold about 3 to 7 such roughly hour-long talk-throughs a week, each involving 2-3 people. The number of bug fixes has dropped, and code errors now make up more than 50% of them, so our next task will be to introduce code review: we have a new "main problem" now.

And going back from the analyzer metric to the sensor metric: since spring we have not once overrun a feature deadline by more than 50%, whereas before that the average overrun was 50% to 100%, and sometimes even more.

And this is just the beginning! ;-)

Story 3: Who is slowing down the developers?


This story is about my very recent experience at an outside company. Real Agile, weekly iterations... and weekly blown deadlines!

The reason, as stated by the company's management: "The developers make too many bugs."

I started analyzing how this happens: I simply took part in the process and observed from the side, much as described in Masaaki Imai's book Gemba Kaizen. Here is what I saw: releases are on Thursdays, and Friday is the preparation day for the new iteration. On Tuesday-Wednesday a build is handed over for testing. On Wednesday-Thursday defects are filed. And on Friday, instead of preparing for the new iteration, the developers urgently fix bugs: every single week.

I asked the team to record feature statuses in the task tracker (where features from the board are duplicated): "accepted for development," "handed over for testing," "tested and sent back for rework," "tested and accepted for release."

And what do you think the average time is between "handed over for testing" and "tested and sent back for rework"? A day and a half!

And sometimes with a SINGLE blocking defect.
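
Measuring this does not require anything fancy: if you can export status-change events from the tracker, a sketch like the one below computes the average time a feature spends in testing and the share of features sent back for rework. The event format and the sample data here are made up for illustration; the status names mirror the ones introduced above.

```python
from datetime import datetime, timedelta

# Made-up sample of (feature_id, new_status, timestamp) status-change events.
events = [
    ("FEATURE-1", "handed over for testing",          "2012-05-15 10:00"),
    ("FEATURE-1", "tested and sent back for rework",  "2012-05-16 16:30"),
    ("FEATURE-2", "handed over for testing",          "2012-05-15 11:00"),
    ("FEATURE-2", "tested and accepted for release",  "2012-05-16 12:00"),
]

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")

handed_over = {}   # feature -> time it was handed over for testing
durations = []     # time spent in testing before a verdict
sent_back = 0      # how many features came back for rework

for feature, status, ts in sorted(events, key=lambda e: e[2]):
    if status == "handed over for testing":
        handed_over[feature] = parse(ts)
    elif feature in handed_over:
        if status == "tested and sent back for rework":
            sent_back += 1
        durations.append(parse(ts) - handed_over.pop(feature))

if durations:
    average = sum(durations, timedelta()) / len(durations)
    print(f"Average time in testing: {average}")
    print(f"Sent back for rework: {100 * sent_back / len(durations):.0f}% of features")
```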

The developers at this company complained that the testers were slowing them down, while the testers and management pushed back on the developers: "you should test it yourselves and not hand over a half-baked product." But render unto Caesar what is Caesar's: everyone should do their own job!

So, we have a metric: 1.5 days is unacceptably long, and we want to cut it at least threefold, which should speed up releases by a day. How? Another brainstorm, another pile of ideas, with 90% of the participants insisting that "the developers must test it themselves."

In the end we decided to try something different: as soon as a feature is ready in the developer's opinion, a tester sits down with him at one computer, takes a notebook and a pen, and starts checking, commenting, and jotting down the issues found, without wasting time on the bug tracker. More than half of the bugs get fixed by the developers on the fly in this mode! After all, the feature has only just been written; it is still fresh in their heads!

We cut the interval from 1.5 to 0.5 days very quickly, but in practice we achieved another, more important change: the percentage of features moved to the "sent back for rework" status dropped from almost 80 to about 20. Four times lower! In other words, 80% of features are now accepted right after being moved to the "testing" status, because on-the-fly testing happened shortly before that, which drastically reduces the time spent registering bugs and the cost of fixing them.

Incidentally, Story 3 is the only one where we reached the goal right away. Iterations still slip sometimes, but that is now the exception; almost every Thursday the development team goes home on time, and on Friday preparation for the next iteration actually begins.

Bingo!

Conclusions


I deliberately avoided dry formulas, philosophizing, and theorizing. Instead I told specific stories from fresh (2012!) experience: stories in which we shortened timelines and improved quality without changing the budget.

Still not convinced that metrics can be put to good use?

Then we are coming to you! :)

Source: https://habr.com/ru/post/141671/

