Java job interview. Collections

Recently, I had the insistent idea that professional development has slowed down a lot and I want to fix it somehow. Yes, I read books, listen to courses, but at the same time comes the understanding that it is possible that the time has come to change jobs, here everything seems to be learned, we are gradually going into a routine. This thought led me to send my resume to several companies that are market leaders. After passing the interview in 3 of them, I decided, as usual, to contribute my 5 kopecks to the coverage of the extensive topic of the interview, namely, technical issues on Java collections that I have to face. Yes, I know, the reader will say: “the collection is a hackneyed topic, as much as possible,” but I asked some of the questions below to my familiar developers, who occupy exactly the positions of the developers (“strong middle peasants”, by the standards of remote backwoods from Moscow, who confidently they cope with their work in practice, but in theory we say there are gaps, because the work does not require solving some non-trivial tasks, and because not everyone is interested in studying how the data structure works inside) caused confusion. I think that the reviewed material will not be very interesting for developers above the Junior level (I will ask them to comment, supplement and criticize the material presented here), but Juniors are sure to find this article interesting for themselves.

I admit honestly, during the interview I didn’t know the answers to some of the questions set out below, although it seems I had already passed the junior stage. This is doubly insulting, given the fact that positions in those companies where everyone was sympathetic, starting from communicating with HR and ending with a possible future activity, could not get an offer and there were questions about collections that I could not cope with (I'm sure they made their negative contribution). But where everything went quite well from the point of view of the interview, the proposed field of activity and communication in general with future colleagues left a negative, so the law of "meanness" in all its glory. As a result, I want to fill this gap in my head with this topic + systematize this knowledge on paper.

In the article I will consider not only the questions that have caused me difficulties at the last interviews, but also the questions that I have been asked for throughout my entire practice of interviewing. Well, I think it's time to move on to the questions:

1. How does ArrayList differ from LinkedList?
')
In my rating this is one of the two most popular questions about the collection, asked in 90% of cases. I caused a problem at my first Junior Developer interview. In short, the answer to this question comes down to the following: ArrayList is a list implemented on an array basis, and LinkedList is a classic linked list based on objects with links between them.

Advantages of ArrayList: the ability to access an arbitrary item by index for a constant time (as it is an array), minimum overhead for storing such a list, inserting it at the end of the list on average is also done for a constant time. On average , because the array has a certain initial size n (in the code it is the capacity parameter), by default n = 10, when writing n + 1 elements, a new array of (n * 3) / 2 + 1 will be created, it will be All elements from the old array + new, added element are placed. As a result, we find that when an element is added, if it is necessary to expand the array, the time to add will be significantly longer than when writing an element to a ready-made empty cell. However, on average, the time to insert an item at the end of the list is constant. Deletion of the last element occurs in constant time. The disadvantages of an ArrayList appear when an element is inserted / deleted in the middle of the list - it causes overwriting of all elements placed “to the right” in the list one position to the left, besides, when deleting elements, the size of the array does not decrease until an explicit call to the trimToSize () method.

LinkedList, on the contrary, can perform insertion / deletion of elements in the list in constant time (insertion and deletion, searching for insertion and deletion positions is not included here). Access to an arbitrary element is carried out in linear time (but access to the first and last element of the list is always carried out in constant time - links are constantly stored on the first and last, so adding an element to the end of the list does not mean that you have to go through the entire list in search of the last element). In general, LinkedList in absolute values loses to ArrayList both in terms of memory consumption and speed of operations. LinkedList is preferable to apply when active work (insert / delete) occurs with the middle of the list or in cases when it is necessary to guarantee the time of adding an item to the list.

For in-depth and express training at the same time, I highly recommend reading the excellent tarzan82 articles about ArrayList and LinkedList for reading. I would also recommend an article from lany about memory consumption by collections - very informative .

2. What do you usually use (ArrayList or LinkedList)? Why?

This question is a slightly disguised version of the previous one, since the answer to this question will lead to a gradual statement of the answer to the previous question. In 90% of cases, ArrayList will be faster and more economical than LinkedList, so ArrayList is usually used, but nevertheless there are always 10% of cases for LinkedList. I say that I usually use ArrayList, referring to the tests and the last paragraph from the previous question, but I do not forget about LinkedList (in which cases it also helps the last paragraph of the previous question).

3. What is faster ArrayList or LinkedList?

Another disguised version of the first question. The more cunning of the above options is that the question implies a monosyllabic answer with a choice of one of the suggested options, which, as I understand it, the author of the question, should immediately identify a person with shallow knowledge in the collections. The correct action will be the counter question of what actions will be performed on the structure. As a result, the dialogue goes smoothly to the answer to the first question.

4. You need to add 1 million. element which structure do you use?

Also quite popular hidden version of the first question. The same formulation implies the choice of one of the proposed options, although in fact there is no information for a single choice. You need to ask additional questions: in which part of the list is the addition of elements? Is there any information about what will happen to the list items? What are the memory or speed limits? In general, this is the same first question, but a little on the other side: through additional questions, you show the depth of understanding of the work of Array and Linked List.
Once I “pecked” on this hook, thinking to myself what to add - this is “insert” at the end of the list and strongly promoted ArrayList, although I did not know anything (and did not try to find out) about the further action with this list and possible limitations.

5. How are items removed from an ArrayList? How does the size of the ArrayList change in this case?

Again, the answer to question 1 contains the answer to this question. When removing an arbitrary element from the list, all elements that are “to the right” are shifted one cell to the left and the actual size of the array (its capacity, capacity) does not change in any way. The mechanism of automatic “expansion” of the array exists, but there is no automatic “compression”; you can only explicitly perform “compression” with the trimToSize () command.

6. Offer an efficient algorithm for removing several adjacent elements from the middle of the list, implemented by ArrayList.

Unbroken, by my standards, the question I met only once, when I did not know the mechanism for removing elements from an ArrayList. As a result, caused me serious difficulties. In fact, everything is quite simple and obvious when you know how to delete a single element. Suppose you need to remove n items from the position m in the list. Instead of performing deleting one element n times (each time shifting the items to the “right” in the list by 1 position), you need to offset all the items that are “to the right” n + m position and n items to the left to the top of the list. Thus, instead of performing n iterations of moving list items, everything is done in 1 pass.

7. How does the HashMap work?

This is the second of the list of the most popular questions on the collections. I don’t even remember if there was a case when they didn’t ask me this question.

In short, HashMap consists of “baskets” (buckets). From a technical point of view, “baskets” are elements of an array that store references to element lists. When a new key-value pair is added, it calculates the key's hash code, based on which the basket number is calculated (the cell number of the array) into which the new item will fall. If the basket is empty, then the link to the newly added item is saved in it; if there is already an item, then a consecutive link follows the links between the items in the chain, searching for the last item, from which a link to the newly added item is placed. If an item with the same key was found in the list, it is replaced. Adding, searching and deleting items is done in constant time. It seems everything is great, with one reservation, the hash functions should evenly distribute the elements across the baskets, in this case the time complexity for these 3 operations will not be lower than lg N, and in the average case just the constant time.

In general, this answer is quite enough for the question posed, then the dialogue on HashMap, with an in-depth understanding of the processes and subtleties, is likely to start.

Again, I recommend reading the article tarzan82 by HashMap .

8. What is the initial number of baskets in a HashMap?

Quite an unexpected question, again, he once made me guess the number of baskets when using the default constructor.

The answer here is 16. In response, it is worth noting that you can use constructors with parameters: you can set your initial quantity of baskets using the capacity parameter.

9. What is the estimate of the time complexity of selecting an item from a HashMap? Does HashMap guarantee the specified difficulty of selecting an item?

The answer to the first part of the question can be found in the answer to question 7 - the constant time is necessary for selecting the element. Here on the second part of the question, I was recently taken aback. I knew the HashMap device and also knew about the hash function, but I was not ready for such a question, I rushed in another direction in my mind and concentrated on the HashMap structure, discarding the hash code, which I always used to consider to be a hash code with uniform distribution. In fact, the answer is quite simple and follows from the answer to question 7.

If you take a hash function that constantly returns the same value, then HashMap will turn into a linked list, with disgusting performance. Then even if you use the hash function with a uniform distribution, in the limiting case only the time complexity lg N will be guaranteed. So, the answer to the second part of the question is no, it is not guaranteed.

10. The role of equals and hashCode in HashMap?

The answer to this question follows from the answer to question 7, although it is clearly not registered there. hashCode allows you to define a basket to search for an element, and equals is used to compare the keys of the elements in the list inside the basket and the desired key.

11. Maximum number of hashCode () values?

Everything is pretty simple here, just remember the method signature: int hashCode (). That is, the number of values is equal to the range of type int - 2 ^ 32 (they never asked the exact range, such an answer was enough).

12. How and when does the increase in the number of baskets in the HashMap?

This is a rather subtle question. As my mini-survey showed, if many people are more or less clear about the essence of the HashMap device, then this question often put the interlocutor in a dead end.

In addition to the capacity in the HashMap, there is also the loadFactor parameter, on the basis of which the limit number of used baskets is calculated (capacity * loadFactor). The default loadFactor = 0.75. Upon reaching the limit value, the number of baskets increases by 2 times. For all stored items, a new “location” is calculated based on the new number of baskets.

13. In what case can an item be lost in a HashMap?

This interesting question was sent to me by LeoCcoder , they didn’t ask me anything and honestly, after reading it, I couldn’t think of a script for losing an item. Again, everything turned out to be quite simple, though not so explicitly: let's say, as a key, not a primitive is used, but an object with several fields. After adding an item to the HashMap, the object that serves as the key changes one field that participates in the calculation of the hash code. As a result, when you try to find this element by the original key, the correct basket will be accessed, but equals (after all, equals and hashCode must work with the same set of fields) will not find the specified key in the list of elements. However, even if equals is implemented in such a way that changing this object field does not affect the result, then after increasing the size of the baskets and recalculating the hash codes of the elements, the specified element with the changed field value is more likely to fall into a completely different basket. and then he will be completely lost.

14. Why can't you use byte [] as a key in a HashMap?

Another question from LeoCcoder . As usual, everything turned out to be quite simple - the array hash code does not depend on the elements stored in it, but is assigned when creating the array (the method of calculating the array hash code is not redefined and calculated using the standard Object.hashCode () based on the array address). Also, the arrays are not redefined by equals and perform a comparison of pointers. This leads to the fact that accessing an element stored with an array key will fail when using another array of the same size and with the same elements, access can only be done in one case — using the same array reference that was used to save an item. For the answer to this question, special thanks go to the user @dark_dimius.

15. What are the differences between TreeSet and HashSet?

To begin with, Set is a set (also called a “set”). Set does not allow storage of two identical elements. Formally speaking, the term "set" and so designates a set of different elements, it is very important that it is different elements, since this is the main property of Set. Given this definition, an explanation about storing the same elements is not required, but in everyday life, the notion of “set” has lost its strict meaning regarding the uniqueness of the elements included in it, so still specify this property of the set separately.

TreeSet provides an orderly storage of items in the form of a red-black tree. The complexity of performing basic operations in TreeSet lg N. HashSet uses the same approach for storing elements as HashMap, except that the element itself is the key in HashSet, and HashSet (like HashMap) does not support ordered storage of elements and provides the time complexity of operations similar to HashMap.

16. Device TreeSet?

This question is asked instead of question 14, and here the short answer is sufficient that the TreeSet is based on a red and black tree. As a rule, this is enough and the interlocutor immediately proceeds to the next question, I have never been asked a tree balancing mechanism or other details of its implementation.

For the express deepening of knowledge on the red-ebony, I recommend this article .

17. What happens if you add items to the TreeSet in ascending order?

Usually, the interlocutor prefaces this question with a phrase that the TreeSet is based on a binary tree and if you add elements in ascending order, how they will be distributed throughout the tree.

If there is no exact idea about the TreeSet device, but there is a general understanding that this is a binary tree (which the interlocutor assists us in addition), then this question can lead to an interesting result: all the elements after adding to the usual binary tree will be in one branch the length of N elements, which negates all the advantages of such a structure as a tree (in fact, a list is obtained). In fact, as mentioned above, in the basis of the TreeSet there is a red-ebony tree that is able to balance itself. As a result, the TreeSet is still in what order you add elements to it, the advantages of this data structure will be preserved.

Conclusion

I hope the questions discussed will be useful for habrauser. I also ask you to forgive me for some possible naivety in that the above questions require such detailed consideration, but in due time such an article would seriously help me. I am sure that there are inaccuracies in the article - I ask in the commentary, besides, I hope that the more experienced comrades in the comments will actively share questions from their practice and, if the article is favorably accepted by the habrasoobshchestvom, it is quite possible to continue the review of technical issues for Java interviews.

PS A bit of mercantile interest: the search for a new job continues and if someone from Habrayuzer in the process of finding a Java developer in the company with a modern approach to development and interesting tasks, or may simply recommend to look at any suitable vacancy - I will be grateful, please lichku.

Source: https://habr.com/ru/post/162017/

All Articles

Java job interview. Collections

More articles: