
A saga about how Java developers should test their applications. Part 1

If you still think that conference talks are of no use, we invite you to read an article based on the talk by Nikolai xpinjection Alimenkov (EPAM) at JPoint 2016. In his nearly two-hour talk, Nikolai covers in detail various aspects of testing (and developing) Java applications: from approaches to testing business logic to TDD, BDD and UI testing, demonstrated on practical examples from a real project.



The post turned out to be huge, so we broke it into two parts. You are now reading the first part; the second is available at the link.



Video of the report:






Briefly about myself



My name is Nikolai Alimenkov, and I came to you from sunny Kiev. I have a Twitter account. If you have one too and you read anything about Java and development processes, join me there.





I have been working with agile methodologies and engineering practices for a long time. The main focus of my rather long career has been Java and the development of large distributed systems of various kinds. Most of this time I have worked as a Java Technical Lead and Delivery Manager.



In parallel, I am the founder of and a trainer at the XP Injection training center, under whose banner we run many interesting activities, including big conferences.



Currently I work for EPAM as a Senior Delivery Manager, leading a large product project. The team is now more than 150 people, we are growing, and we plan to grow further. What we will talk about today we use not only on this project but on others as well. All of it scales to both small and big teams without any problems.



However, everything I will tell you is based on my personal experience and my vision of how this should be done. Your experience and vision may differ from what is stated here. If you have a different opinion, I will be happy to discuss it when the occasion arises.



Business Logic Testing



Let's start with something fairly simple: business logic. There are two main schools here.



The first school says that any piece of logic should be tested in complete isolation from any other logic. In other words, you must create a completely isolated environment for each piece of your business logic. If you are testing a service, then any other classes it uses to pull in additional pieces of logic, transformations and so on should be isolated. That is, this school insists that mocks should be used everywhere and everything tested in complete isolation.



This approach has a remarkable advantage: in the test, you can emulate any condition you can think of. You need some message sender to throw an exception of any kind (even one that is very hard to reproduce in real life)? No problem. For example, you can find out how your code responds to an out-of-memory error. Will it swallow the OutOfMemoryError? Or will it log it and carry on? Such things can be tested too.
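
For instance, here is a minimal sketch of how such a condition can be emulated, using EasyMock (the framework used in the examples later in this article). The MessageSender interface and Notifier class are hypothetical, invented purely for illustration:

  import static org.easymock.EasyMock.*;
  import static org.junit.Assert.assertFalse;
  import org.junit.Test;

  public class NotifierTest {
      // Hypothetical collaborator and service, only for illustration
      interface MessageSender { void send(String msg); }

      static class Notifier {
          private final MessageSender sender;
          Notifier(MessageSender sender) { this.sender = sender; }
          boolean notifyUser(String msg) {
              try { sender.send(msg); return true; }
              catch (Throwable t) { return false; } // swallows even an Error and reports failure
          }
      }

      @Test
      public void reportsFailureWhenSenderRunsOutOfMemory() {
          MessageSender sender = createMock(MessageSender.class);
          sender.send("hello");
          // teach the mock to throw an error that is very hard to reproduce for real
          expectLastCall().andThrow(new OutOfMemoryError("simulated"));
          replay(sender);
          assertFalse(new Notifier(sender).notifyUser("hello"));
          verify(sender);
      }
  }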

This gives you a certain freedom. The advantages of the approach are clearly visible: tests run very quickly, and they are simple to work with. And there is a huge number of libraries for this.



The second school says the opposite: if we test the logic, we must test it together with its whole environment, with the exception of genuinely complex pieces, such as external systems, which do need to be isolated somehow. This approach also has certain advantages. One of them is that you simultaneously test not only the logic itself but also its integration with the environment. This means the tests are more revealing and give more information about the situation if something goes wrong. After all, it often happens that all unit tests pass, but the application does not even start. And the reason is very simple: each element individually honored some protocol of its own, but when they came together, it turned out that you had emulated the behavior using an old protocol, and, voila, nothing works. This approach protects against that scenario.



But what happens if your business logic has some branching (logical conditions, loops and so on), and the nested pieces of logic that are delegated to other components also contain loops and nesting? In this case, you get a huge number of tests, because you need to cover every path. You can end up in a situation where, to test one service method, you would have to write about fifty tests. Understandably, no one will write those 50 tests. You write 10 or 15 and stop there, drifting away from the original goal of providing coverage that gives you an accurate understanding that everything works. But nobody writes all 50.

Such tests also run more slowly. So this approach is harder, but it avoids the common problem with mocks (when you mock out anything and everything, and the result is not very reliable in the end).



I am a supporter of the first approach (where everything is mocked), but with certain nuances. And those nuances are exactly the most interesting thing I want to talk about.

There are many libraries for mocks: EasyMock, Mockito, JMock, PowerMock, Spock and others. Each has its pros and cons.

For example, one of the downsides of Mockito is that to use it you need to sit down and really figure it out. And our developers (by "our" I mean developers from the post-Soviet space) are used to charging ahead. They see Mockito, look at an example on Stack Overflow or in the Mockito documentation ("Everything is clear: create a mock, call verify at the end") and rush off to use it. But Mockito works on the spy principle, i.e. it records all interactions and checks only the ones you explicitly verify. Without understanding this logic, you can go down a completely wrong path. I have seen it in real life: the guys worked for half a year and did not write a single correct verify in any test. As a result, they had a lot of tests that ran and were always green but did not test anything (they tested that Mockito recorded everything, yet in the end checked nothing, because in their opinion everything was bound to be recorded). In other words, understanding how the framework works is essential.
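
A minimal sketch of that trap (the UserDao interface is hypothetical): the first test is always green because it verifies nothing, and only the second one actually checks the interaction:

  import static org.mockito.Mockito.*;
  import org.junit.Test;

  public class MockitoVerifyPitfallTest {
      interface UserDao { void save(String name); } // hypothetical collaborator

      @Test
      public void alwaysGreenButChecksNothing() {
          UserDao dao = mock(UserDao.class);
          // the code under test is never even called, yet the test passes:
          // Mockito just silently records interactions (the spy principle)
      }

      @Test
      public void actuallyChecksTheInteraction() {
          UserDao dao = mock(UserDao.class);
          dao.save("alice");         // imagine this happens inside the code under test
          verify(dao).save("alice"); // fails if save() was never called
      }
  }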



EasyMock behaves more like a strict mock: by default, it checks everything. A strict mock, when you did not teach it something (a method was called that it was not warned about, or something promised to be called once was suddenly called twice), says: "I don't know how to handle this, so I fail." But, again, you need to know how to use it, because there is logic that allows you to turn such a mock into a stub or a more lightweight construct.
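
A sketch of that strictness (the Registry interface is hypothetical): the strict mock fails as soon as a call arrives that it was not taught:

  import static org.easymock.EasyMock.*;
  import org.junit.Test;

  public class StrictMockSketchTest {
      interface Registry { void register(String id); void log(String msg); } // hypothetical

      @Test(expected = AssertionError.class)
      public void strictMockFailsOnUnexpectedCall() {
          Registry registry = createStrictMock(Registry.class);
          registry.register("42"); // the only call the mock was taught
          replay(registry);
          registry.log("oops");    // never expected, so the mock "falls" right here
      }
  }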



The biggest problem with mocks arises when you have a service that is actually an integrator: it contains no logic of its own but pulls in others (for example, go to the database and fetch the user by the user ID we pass in, then register the user somewhere and get back boolean true, then send the user an e-mail saying they are registered, and return boolean true or false). Within flat logic (without branching ifs), we call A, then B, and so on. What will your test look like if we use mocks? Exactly the same: expect A to be called, then B, and so on. And then an interesting question arises: aren't such tests too fragile? They simply repeat the implementation. If I now change something in the implementation (not conceptually, I just move something around), I have to do the same in the test.
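
Here is a sketch of such an integrator and its mock-based test (all names are hypothetical); notice how the test mirrors the implementation call for call:

  import static org.easymock.EasyMock.*;
  import static org.junit.Assert.assertTrue;
  import org.junit.Test;

  public class RegistrationServiceTest {
      // Hypothetical types, only to illustrate the shape of the problem
      static class User { }
      interface UserDao { User findById(long id); }
      interface Registrar { boolean register(User user); }
      interface Mailer { void sendRegisteredEmail(User user); }

      // An "integrator": flat logic that merely pulls A, then B, then C
      static class RegistrationService {
          private final UserDao dao;
          private final Registrar registrar;
          private final Mailer mailer;
          RegistrationService(UserDao dao, Registrar registrar, Mailer mailer) {
              this.dao = dao; this.registrar = registrar; this.mailer = mailer;
          }
          boolean register(long userId) {
              User user = dao.findById(userId);         // A
              boolean ok = registrar.register(user);    // B
              if (ok) mailer.sendRegisteredEmail(user); // C
              return ok;
          }
      }

      @Test
      public void registersAndNotifiesUser() {
          UserDao dao = createMock(UserDao.class);
          Registrar registrar = createMock(Registrar.class);
          Mailer mailer = createMock(Mailer.class);
          User user = new User();
          // the expectations repeat the implementation step by step: A, then B, then C
          expect(dao.findById(42L)).andReturn(user);
          expect(registrar.register(user)).andReturn(true);
          mailer.sendRegisteredEmail(user);
          replay(dao, registrar, mailer);
          assertTrue(new RegistrationService(dao, registrar, mailer).register(42L));
          verify(dao, registrar, mailer);
      }
  }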



There are three ways to solve this problem.





Which of these options to choose is up to you.



Examples from practice



Let's look at some practical examples.



  @RunWith(UnitilsJUnit4TestClassRunner.class)
  public class ProjectIndexStorageTest {
      private ProjectIndexStorage indexStorage;

      @RegularMock
      private DirectoryCompressor compressor;
      @RegularMock
      private FileIndexMaintainer fileIndexMaintainer;
      @RegularMock
      private ArchiveStorage storage;




Here I have ProjectIndexStorage and the mocks: DirectoryCompressor, FileIndexMaintainer and ArchiveStorage. I marked all of them with @RegularMock, which denotes a regular mock.

What my test looks like:



  private File baseDir = new File("project-index-storage-test");
  private File unpackedIndexTempDir = new File("download-file");
  private File indexActualDir = new File("page-index");




  @Before
  public void init() throws URISyntaxException, IOException {
      indexStorage = new ProjectIndexStorage(fileIndexMaintainer, storage);
      FileUtils.forceMkdir(unpackedIndexTempDir);
  }




In the init method, marked with @Before, I pass my mocks into the ProjectIndexStorage constructor (no magic happens here).

At the end, I delete the base directory:



  @After
  public void done() {
      FileUtils.deleteQuietly(baseDir);
  }




The test is called “archive is created and saved”:



  @Test
  public void archiveIsCreatedAndStored() throws Exception {
      expectGettingIndexPath();
      storage.store("11", indexActualDir);
      replay(fileIndexMaintainer, storage);
      indexStorage.uploadIndex(11);
      assertProjectIdFileExists(11);
  }




Here is a reusable piece:



  public void expectGettingIndexPath() {
      expect(fileIndexMaintainer.getIndexPath(11)).andReturn(indexActualDir);
  }




Here I essentially say: "FileIndexMaintainer, you will be asked for the index path, and you return this directory (a real directory: I created it above)."

After that: "Storage, you will be called, and you store this real directory." I have taught them all, and then I say replay, i.e. they now know this expected behavior.

I want to note that this is a strict mock. If I now change some parameter, for example pass 12 instead of 11, storage.store will say: "Sorry, I did not expect 12, I was only taught 11."

After that, I call my upload and check on the file system that a certain archive has appeared in a certain folder via assertProjectIdFileExists(11) (although this no longer concerns mocks):



  public void assertProjectIdFileExists(long projectId) {
      assertTrue(new File(baseDir, String.valueOf(projectId)).exists());
  }




What is worth paying attention to? A construction like storage.store("11", indexActualDir); creates a certain confusion. If you forget that this is a regular mock, the Java syntax gives you the feeling that you are actually calling this storage. To avoid this feeling, many frameworks, Mockito among them, have their own fluent API. EasyMock, when you don't need to return anything (as you can see, the store method returns nothing, just void), does not make you wrap the call in expect; you just call it. This is sometimes confusing.
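
For contrast, a small sketch (with a hypothetical Storage interface) of how the two kinds of expectation look in EasyMock; expectLastCall() is the optional way to make the recording of a void call explicit:

  import static org.easymock.EasyMock.*;
  import org.junit.Test;
  import java.io.File;

  public class VoidExpectationSketchTest {
      interface Storage { void store(String id, File dir); } // hypothetical

      @Test
      public void recordingAVoidCall() {
          Storage storage = createMock(Storage.class);
          // looks like a real call, but in the recording phase it only registers an expectation
          storage.store("11", new File("page-index"));
          expectLastCall(); // optional: makes the recorded expectation explicit
          replay(storage);
          storage.store("11", new File("page-index")); // the real call under test
          verify(storage);
      }
  }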



I gave this example, wrapping it in an expectation helper, to show that this is in fact not a real call. I just need to explain what it means. For this, I use the expectGettingIndexPath method:



  public void expectGettingIndexPath() {
      expect(fileIndexMaintainer.getIndexPath(11)).andReturn(indexActualDir);
  }




This is much better than writing comments. Do this, and everything will be cool.



Description through testing



Business logic is tested for several reasons:





And I would like to talk about the last point in more detail.



Some people write descriptions in Javadoc. But how often do you update it?



Do we update the Javadoc as soon as we make a change? This approach is not very reliable. Therefore, as an additional bonus from tests on business logic, I would like to have a description of what a particular class can do.

It all depends on you. For a class to be described, it has to be described somewhere. And there is a great place for this: the names of the test scenarios.



You have probably seen people name their test scenarios, say, test_success_1, test_success_2, and so on. If I later look at what my class can do, I will see that it is very "success". Maybe somewhere it will fail. And, unfortunately, that is all that is left to me as a legacy from the person who dug through this class while writing tests for it (and, possibly, was the author of this class).



If I act differently and try to name each implemented test scenario on the business logic after what I want to test, it results in a kind of living specification.



I will see: archiveIsCreatedAndStored()

Perhaps this is the success path: nothing fails, everything is fine.



Here is another example:



  @Test(expected = IllegalStateException.class)
  public void failIndexDownloadIfTempDirectoryRenameHasFailed() throws Exception {
      FileUtils.touch(indexActualDir);
      expectGettingIndexPath();
      expectRetrievalIndexFromStorage(unpackedIndexTempDir);
      replay(fileIndexMaintainer, storage, compressor);
      indexStorage.downloadIndex(11);
  }




Here it is failIndexDownloadIfTempDirectoryRenameHasFailed(), i.e. if renaming the temp directory fails, the whole operation fails. This is a business rule that we want to test. And you have a direct connection with this business rule, since the specification (or the task, wherever you store the original requirement for this business logic) says: "If renaming the temp directory fails, throw an exception."



There is a wonderful plugin for IDEA (and Eclipse too) called TestDox, which does a good deed for you: it breaks up all the CamelCase test method names into words, providing for each test a short description of the specific behavior being carried out.



ProjectIndexStorage:

>> archive is created and stored

>> skip downloading index if directory already exists

>> index is downloaded from storage if target dir abscent

>> index is downloaded from storage if maintainer thrown expected exception

>> skip index removal for non project events

>> skip index removal for wrong event type

>> index removal is delegated to storage



TestDox allows you to conveniently navigate between these tests. Imagine that I have never seen this ProjectIndexStorage. Here I can not only read what it can do (perhaps you would have written this in Javadoc too). The biggest bonus is that I can pick a behavior I am interested in, double-click on it, see what is expected of the mocks it interacts with, and run this behavior. I can stop in the debugger and see how it works.



  @Test
  public void indexDownloadedFromStorageIfMaintainerThrownExpectedException() {
      expectGettingIndexPathFailure();
      expectRetrievalIndexFromStorage(unpackedIndexTempDir);
      replay(fileIndexMaintainer, storage, compressor);
      indexStorage.downloadIndex(11);
      assertFalse(unpackedIndexTempDir.exists());
      assertTrue(indexActualDir.exists());
      assertProjectIdFileExists(11);
  }






There is no need to bring up the entire system: you can work with one class, right in the test. Notice that we are now covering several goals at once.





But what about logging?



Many people do heavy logging in unit tests. What for? The unit test report has everything you need: if you named the test correctly, you understand which piece of functionality does not work. The second thing you have is the stack trace. I hope everyone can work with a stack trace: copy it, go to IDEA, say Analyze Stacktrace, and see what went wrong. This is enough for a unit test. Additional logging is practically useless. Strictly speaking, even the debugger is not needed.



Here I start from the idea that we are testing only business logic. In this case, we have one single goal: to verify that the business logic works correctly. In general, there is no need to add a second goal there. As a one-off (for unexpected situations), no problem. But I maintain that in the normal mode there should be no logging.



Of course, there are different situations. For example, you may have flaky tests, and this is quite normal, because running tests from the IDE and from the console (or with a local infrastructure setup) is different. In that case, you have to dig in. But I am not sure the debugger will help you. More likely, running the same command line that your Continuous Integration server runs will help. A log gives you nothing. Timings, for that matter, come automatically: the unit testing framework provides them.



If a test fails differently in different environments, there is clearly a problem with external dependencies. Everyone knows that unit tests should be independent. But some developers introduce such dependencies implicitly. Their tests may pass on the local machine because they wrote them in a certain order, test by test, and locally they run in the order in which they were written. But if you turn on the "run in random order" checkbox, a test may or may not fail. Most likely this means it depends on something, on some other test. Perhaps you did not clear a directory or a variable. This happens, and you find it out by enabling that checkbox.
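
A sketch of such a hidden dependency (hypothetical example): the second test passes only if the first one ran before it and left shared state behind:

  import static org.junit.Assert.assertEquals;
  import org.junit.Test;
  import java.util.ArrayList;
  import java.util.List;

  public class OrderDependentSketchTest {
      // shared mutable state: the root of the order dependency
      private static final List<String> USERS = new ArrayList<>();

      @Test
      public void registersUser() {
          USERS.add("alice");
          assertEquals(1, USERS.size());
      }

      @Test
      public void listsUsers() {
          // passes only when registersUser() ran first; may fail under random order
          assertEquals(1, USERS.size());
      }
  }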



Sometimes you are confronted with a memory leak in unit tests. The garbage collector starts to work more and more aggressively, and it sometimes pauses. Say we used a reactive model and wait on promises until the result returns (we set a timeout of three seconds, which is more than enough for a unit test), but it never completes. You look at the garbage collector profile, and it turns out there is a pause. As a result, tests pass locally because we, for example, gave them more memory, and somewhere else they do not. This also happens; you have to deal with such cases individually. But by default there should be only correct, good test names, and nothing else is needed here.



It should be noted that we do not test hashCode here. hashCode and equals are defined when you want to use objects in collections that are based on them: HashMap, HashSet and so on. But if you do not use hashing anywhere, you may not need a hashCode at all. So if we know there is some decision behind it (we want to store objects somewhere, or we decide to cache a field so it is precomputed), then we test it. But we try to write as little code of this kind as possible.
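
When a class is meant to live in hash-based collections, this is roughly what such a test checks (the ProjectKey value object is hypothetical):

  import static org.junit.Assert.*;
  import org.junit.Test;
  import java.util.HashSet;
  import java.util.Set;

  public class ProjectKeyTest {
      // hypothetical value object used as a HashMap/HashSet key
      static final class ProjectKey {
          private final long id;
          ProjectKey(long id) { this.id = id; }
          @Override public boolean equals(Object o) {
              return o instanceof ProjectKey && ((ProjectKey) o).id == id;
          }
          @Override public int hashCode() { return Long.hashCode(id); }
      }

      @Test
      public void equalKeysCollapseInHashBasedCollections() {
          Set<ProjectKey> keys = new HashSet<>();
          keys.add(new ProjectKey(11));
          keys.add(new ProjectKey(11)); // an equal key must not create a duplicate
          assertEquals(1, keys.size());
          assertTrue(keys.contains(new ProjectKey(11)));
      }
  }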



About business logic coverage and the time tests take



Usually, business logic coverage on our projects is in most cases close to 100%, since most developers write in TDD style (people do not write code if they cannot formulate what it should do). We are talking about all the code except the synthetic parts: getters, setters, frameworks, or the above-mentioned integrators, i.e. things we do not want to cover.



In the project that I am running now, we strive to increase the coverage, and at the moment it is about 60%.



Now let's talk about mutation tests. They are used to check how well your tests cover the logic. In a nutshell, they work like this: the tool finds some if and inverts it, then looks at whether the tests still pass. If the tests stay green, then this if is not checked very well. They do the same with other rearrangements, for example, assigning a different value to a local variable. The most popular tool for this in Java is PIT.
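
A sketch of what one such mutation looks like (the Discount class is hypothetical): PIT's conditionals-boundary mutator would turn >= into >, and only a test that hits the boundary value kills that mutant:

  import static org.junit.Assert.assertEquals;
  import org.junit.Test;

  public class DiscountTest {
      static class Discount { // hypothetical logic under test
          static int percentFor(int orderTotal) {
              if (orderTotal >= 1000) { // a boundary mutant turns this into "> 1000"
                  return 10;
              }
              return 0;
          }
      }

      @Test
      public void boundaryValueKillsTheConditionalMutant() {
          assertEquals(10, Discount.percentFor(1000)); // exactly on the boundary
          assertEquals(0, Discount.percentFor(999));
          // a test that only checked, say, 5000 and 0 would leave the mutant alive
      }
  }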



On a past project (whose code I showed) we experimented and found roughly 3 to 5 tests that were "so-so" (and even that is debatable), so we did not enable mutation testing on an ongoing basis: by that time we had been developing for 5 years. If in 5 years we accumulated only 3 such tests, running the check all the time is too much overhead. But on some projects, I assure you, you will find a lot of interesting things. Some tests do not check anything at all.



But coverage must not be a goal in itself. If the goal is coverage, you can write any rubbish, as long as it calls methods. You can write not a single assert, make a million tests that simply call methods with different parameters, and you will have super-coverage. But it makes no sense. You should fight not for coverage but for tests that give you confidence that the functionality works correctly and that break from time to time (because when a test breaks, it shows its usefulness, except when tests break because they were written incorrectly).



How long does this take? Here I will say something that perhaps not everyone will agree with: I do not want to count how long it takes. And the reason is very simple: everyone has their own approach to writing code. You cannot walk up to a developer and say: "You produce high-quality solutions, but how does your productivity change depending on whether you program with one hand or two? Maybe we will pay you half the wages, and you can do a good job with one hand too." Measuring test-writing time is an attempt to isolate a piece of the standard process by which you arrive at a quality solution. And it hints that you would like to get rid of that piece ("optimize" it away).







The same question can be asked about code review. How much time does code review take? This is a good question for the purpose of optimizing the code review process itself (having it as a metric and making review take less time), but asking it with any other purpose is dangerous.



I can share a purely personal feeling: I spend much longer writing tests than the code itself (and here we smoothly flow into the next topic, TDD). Most of the code is written not by me but by the generator. I never type public class and so on, constructors, getters, setters, etc. For the past 8 years I have not created a single method from scratch by hand. I just write in the test how it should look, and after that the IDE generates everything for me. It turns out that most of the time I am writing a test. But this does not mean that without the tests I would immediately speed up greatly. On the contrary, I would slow down, and the low-level design would get worse.



Evolution of tests



Tests are more or less point-like. Personally, I adhere to this approach: I try not to put extra effort into a test. If I have some new functionality that makes me modify a test (and change its name), I often leave the old one, write a new one, make sure the new one works, and then remove the old one. When necessary, I act in the good old copy-paste way; copy-paste-driven development helps me a lot here. When I write a new test, I try to formulate how the behavior should look now. After that, I copy-paste the insides of the old test and can then get rid of it.



This solves the problem of renaming. With Javadoc, the opposite is true: it does not constrain you at all; you can change it after you finish your work (but that "after" usually flies out of your head, and no one has any desire to do it).



Tests can be written either all at once or as the code takes shape. Usually, I immediately write down those test scenarios that I am afraid to forget.



Why is it good to write a test scenario before the code? While writing it, you start to think about how the API will be used and about possible outcomes (for example, returning false instead of the expected true in certain situations). And then you either note down that you need to test the scenario with false, or you write that scenario right away; this depends on what kind of memory a person has. Some focus on one thing and then forget everything they met along the way. And then it is a shame that the thinking was wasted. Therefore, when I meet something non-obvious (which was not originally in my head), I prefer to immediately formulate it as a test.
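
For instance (a hypothetical validator, sketched test-first): writing the first scenario immediately suggests the second, less obvious one that returns false:

  import static org.junit.Assert.*;
  import org.junit.Test;

  public class EmailValidatorTest {
      // hypothetical API whose shape was dictated by the tests, not the other way around
      static class EmailValidator {
          boolean isValid(String email) {
              return email != null && email.contains("@");
          }
      }

      @Test
      public void acceptsPlainAddress() {
          assertTrue(new EmailValidator().isValid("user@example.com"));
      }

      @Test
      public void rejectsAddressWithoutAtSign() {
          // the false scenario, written down right away rather than "later"
          assertFalse(new EmailValidator().isValid("user.example.com"));
      }
  }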



But I do not write all the tests at once. Why not? Here is an example: I start writing some test, and it seems to me that the API is good. But it is not a given that it will stay that way. It is not a given that when I start implementing it, I will not see some drawbacks in it. And it would be a shame to formulate all the tests against the same API and then have to change everything. Therefore, it is better to move in small pieces.



If tests are no longer relevant, delete them. Do not comment them out, do not skip them, just delete them. You have a version control system, from which you can easily restore them if necessary. If you are afraid of losing them, introduce a rule: when you remove some tests, mark the commit with a special tag in your version control system (for example, unit test removal: utr). That's it: if utr is there, a test was deleted, and it can easily be found. But I think you will never need it.



The insides of a test: the division of responsibility



There are a few typical questions about the insides of a test concerning the division of responsibility in its different variations.



First, the question of the number of assertions in a test. I do not agree with the rule "always do one assert". The only real constraint is that the asserts should not cover several scenarios at once.



When we talk about unit tests for business logic, I want to cover one specific scenario. Sometimes it is tempting to save effort and check a bunch of things along the way, but that turns into more of an integration test. Its problem is fragility: if just one piece of functionality changes (not all of them, just one of those you assert on), it fails. Therefore, I lean toward the rule that the asserts should clearly come from the business scenario you are testing. How many of them there are physically (one checks the folder, another the mocks, a third the initial parameters) does not matter. As in the example we discussed above:



  indexStorage.downloadIndex(11);
  assertFalse(unpackedIndexTempDir.exists());
  assertTrue(indexActualDir.exists());
  assertProjectIdFileExists(11);


Some say you cannot do this, and there should be only one assert line at the end. Here there are three asserts, but they are all interconnected. It would be possible to join them, give the combination a meaning, and physically turn them into one, but I have never done that: I don't see the point. In my opinion, that is perfectionism.



Secondly, the concept of single responsibility helps you decide about testing protected, private and similar methods. When do you want to make a method protected? When you realize that its logic is quite complex and you would like to test it independently. But when the complexity of a class exceeds a certain level, you should really split it into several classes and delegate this logic to someone else (perhaps in the same package, package-visible, without additional ceremony). And then build separate tests for the separate classes.
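
A sketch of that refactoring (all names hypothetical): instead of a complex protected method, the logic lives in a package-visible helper with its own focused test:

  // RowFormatter.java - a package-visible helper extracted from the service
  class RowFormatter {
      String format(String name, long bytes) {
          return name + " (" + (bytes / 1024) + " KB)";
      }
  }

  // ReportService.java - no protected method any more, just delegation
  class ReportService {
      private final RowFormatter formatter = new RowFormatter();
      String describe(String name, long bytes) {
          return formatter.format(name, bytes);
      }
  }

  // RowFormatterTest.java - lives in the same package and tests the helper directly
  class RowFormatterTest {
      @org.junit.Test
      public void appendsSizeInKilobytes() {
          org.junit.Assert.assertEquals("index (2 KB)", new RowFormatter().format("index", 2048));
      }
  }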



The benefit is that the classes become simpler and you uphold the concept of single responsibility.



Imagine that you have a class with 5 different responsibilities. There is a high probability that people working on different tasks will come to it and work on its code at the same time.






If you still end up with a protected method you want to test, the test can live in the same package and call it directly. Private methods can technically be reached via reflection, but the very desire to do so is a signal that the logic deserves to be extracted into its own class.

As for package-default visibility, it is a reasonable compromise: the test sits in the same package and can call the method without it being made public, so nothing extra leaks into the public API.







Source: https://habr.com/ru/post/323920/


