
Understanding hashCode() and equals()


I started programming recently, and much in this field is still new to me. This article is aimed at beginner Java programmers and, I hope, will help them master a favorite interview topic: hashCode() and equals().
I should say up front that I am not an expert on this topic and may have misunderstood something, so if you find a mistake or inaccuracy, please contact me.

What is a hash code?


Put very simply, a hash code is a number. Really simple, isn't it? More precisely, it is a fixed-length bit string obtained from an array of arbitrary length (Wikipedia).

Example 1
Execute the following code:
 public class Main {
     public static void main(String[] args) {
         Object object = new Object();
         int hCode;
         hCode = object.hashCode();
         System.out.println(hCode);
     }
 }

Running the program prints an integer to the console (often, though not always, up to 10 digits long). This number is our fixed-length bit string. In Java it is represented by the primitive type int, which occupies 4 bytes and can hold values from -2,147,483,648 to 2,147,483,647. At this stage it is important to understand that a hash code is a number with a limited range, which in Java is the range of the primitive type int.

The second part of the definition reads:
“obtained from an array of arbitrary length”.
Here the array of arbitrary length is an object. In example 1, an object of type Object plays the role of the array of arbitrary length.
So, in Java terms, a hash code is the integer returned by a method that takes an object as its input.
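To make this concrete, here is a small sketch that feeds inputs of very different sizes to hashCode() and gets an int back each time. The class name HashIsAnInt and the helper hashOf are my own illustrative names, not part of any library.

```java
public class HashIsAnInt {
    // Hypothetical helper: returns the hash code of any object as an int.
    static int hashOf(Object input) {
        return input.hashCode();
    }

    public static void main(String[] args) {
        // Build a long input: 10,000 characters.
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            builder.append('a');
        }
        String longText = builder.toString();

        System.out.println(hashOf("a"));       // 97, the code of the character 'a'
        System.out.println(hashOf(longText));  // some int, whatever the input length
        System.out.println(Integer.MIN_VALUE); // -2147483648
        System.out.println(Integer.MAX_VALUE); // 2147483647
        System.out.println(Integer.SIZE);      // 32 bits, i.e. 4 bytes
    }
}
```

Whatever the size of the input, the result always fits into the same 32 bits.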

This method is implemented so that, for the same input object, the hash code is always the same within one run of the program. Keep in mind that the set of possible hash codes is limited by the primitive type int, whereas the set of possible objects is limited only by our imagination. Hence the statement: “The set of objects has greater cardinality than the set of hash codes”. Because of this limitation, different objects may well have the same hash code.

The main thing here is to understand that:

The situation when different objects have the same hash codes is called a collision. The likelihood of a collision depends on the hash code generation algorithm used.
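A collision is easy to reproduce with the standard String class. The sketch below uses the strings "Aa" and "BB", a well-known pair that happens to collide under the String hash formula; the class name CollisionDemo is just an illustrative name.

```java
public class CollisionDemo {
    public static void main(String[] args) {
        String first = "Aa";
        String second = "BB";

        // The strings are different...
        System.out.println(first.equals(second));                  // false
        // ...but their hash codes coincide: 65 * 31 + 97 == 66 * 31 + 66 == 2112.
        System.out.println(first.hashCode());                      // 2112
        System.out.println(second.hashCode());                     // 2112
        System.out.println(first.hashCode() == second.hashCode()); // true
    }
}
```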

To summarize:

First, to avoid confusion, let's define the terminology. “Same” objects here means objects of the same class with the same field contents.

  1. for the same object, the hash code will always be the same;
  2. if the objects are the same, then the hash codes are the same (but not vice versa, see rule 3);
  3. if the hash codes are equal, then the input objects are not always equal (collision);
  4. if the hash codes are different, then the objects are guaranteed to be different.
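The rules above can be checked on standard classes that already follow them, such as String and Integer. This is only a sketch: the class HashRules and the helper consistent are names I made up for the illustration.

```java
public class HashRules {
    // Hypothetical helper: true if the pair does not violate rule 2,
    // i.e. equal objects have equal hash codes.
    static boolean consistent(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String s1 = new String("hello");
        String s2 = new String("hello");

        // Rule 1: the same object gives the same hash code.
        System.out.println(s1.hashCode() == s1.hashCode()); // true

        // Rule 2: same contents, therefore same hash codes.
        System.out.println(s1.equals(s2) && s1.hashCode() == s2.hashCode()); // true

        // Rule 3: equal hash codes do not imply equal objects,
        // yet the pair is still consistent with rule 2.
        System.out.println(consistent("Aa", "BB")); // true

        // Rule 4: different hash codes, therefore different objects.
        // For Integer the hash code is the value itself.
        Integer x = 1;
        Integer y = 2;
        System.out.println(x.hashCode() != y.hashCode() && !x.equals(y)); // true
    }
}
```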

The notion of equivalence. equals() method


To begin with, in Java every call to the new operator creates a new object in memory. To illustrate this, let's create a class and call it “BlackBox”.

Example 2
Execute the following code:
 public class BlackBox {
     int varA;
     int varB;

     BlackBox(int varA, int varB) {
         this.varA = varA;
         this.varB = varB;
     }
 }

Create a class to demonstrate BlackBox:
 public class DemoBlackBox {
     public static void main(String[] args) {
         BlackBox object1 = new BlackBox(5, 10);
         BlackBox object2 = new BlackBox(5, 10);
     }
 }

In the second example, two objects will be created in memory.



But, as you have already noticed, the contents of these objects are the same, that is, they are equivalent. To check equivalence, the Object class has an equals() method, which is meant to compare the contents of objects and return the boolean value true if the contents are equivalent and false if not.

 object1.equals(object2); // must be true, since the contents of the objects are equivalent

Equivalence and the hash code are closely related, since the hash code is supposed to be calculated from the object's contents (field values), and if two objects of the same class have the same contents, then their hash codes must be the same (see rule 2).

In other words:
 object1.equals(object2)                   // must be true
 object1.hashCode() == object2.hashCode()  // must be true

I wrote “must be” because if you actually run the previous example, both operations will in fact return false (the hash code comparison almost certainly so). To understand why, let's look at the source code of the Object class.
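Before looking at the source, the claim can be checked directly. The sketch below uses PlainBox, a hypothetical stand-in for BlackBox nested inside a demo class so that the example is self-contained; it deliberately overrides nothing.

```java
public class DefaultEqualsDemo {
    // Hypothetical stand-in for BlackBox: no hashCode() or equals() overrides.
    static class PlainBox {
        int varA;
        int varB;

        PlainBox(int varA, int varB) {
            this.varA = varA;
            this.varB = varB;
        }
    }

    public static void main(String[] args) {
        PlainBox object1 = new PlainBox(5, 10);
        PlainBox object2 = new PlainBox(5, 10);

        // Same field values, but two distinct objects in memory.
        System.out.println(object1.equals(object2)); // false
        System.out.println(object1.equals(object1)); // true

        // Normally false as well; a coincidence is possible but very unlikely.
        System.out.println(object1.hashCode() == object2.hashCode());
    }
}
```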

The Object class


As you know, all Java classes inherit from the class Object. The hashCode() and equals() methods are already defined in this class.
When you define your own class, you automatically inherit all the methods of Object. If your class does not override (@Override) hashCode() and equals(), their implementations from Object are used.

Consider the source code of the equals() method in the Object class.
 public boolean equals(Object obj) {
     return (this == obj);
 }

When comparing objects, the “==” operator returns true in only one case: when both references point to the same object. The contents of the fields are not taken into account.

In the code below, equals() returns true:
 public class DemoBlackBox {
     public static void main(String[] args) {
         BlackBox object3 = new BlackBox(5, 10);
         // the variable object4 refers to
         // the same object as object3
         BlackBox object4 = object3;
         System.out.println(object3.equals(object4)); // true
     }
 }




Now it is clear why Object.equals() does not do what we need: it compares references, not the contents of objects.
Next in line is hashCode(), which also does not do what we need.
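The difference between comparing references and comparing contents is easy to see on String, a standard class that does override equals(). A minimal sketch (the class name ReferenceVsContent is only illustrative):

```java
public class ReferenceVsContent {
    public static void main(String[] args) {
        // Two separate String objects with the same characters.
        String a = new String("box");
        String b = new String("box");

        System.out.println(a == b);      // false: different objects in memory
        System.out.println(a.equals(b)); // true: String.equals() compares contents

        // A second reference to the same object.
        String c = a;
        System.out.println(a == c);      // true: both references point to one object
    }
}
```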

Let's look at the source code of the hashCode() method in the Object class:
 public native int hashCode(); 

That is in fact the whole implementation. The keyword native means that the method is implemented in another language, for example C, C++ or assembler. This particular native int hashCode() is implemented in C++; the source code is at http://hg.openjdk.java.net/jdk7/jdk7/hotspot/file/tip/src/share/vm/runtime/synchronizer.cpp in the function get_next_hash.

In the JDK 7 sources linked above, the hash code for objects of the Object class is computed by default with the Park-Miller RNG algorithm (later HotSpot versions may use a different default generator). The algorithm is based on a random number generator, so the value has nothing to do with the object's contents, and the object may get a different hash code each time the program is started.

It turns out that with the hashCode() implementation from the Object class we get a different hash code (collisions aside) each time we create an object with new BlackBox(). Moreover, after restarting the program we may get completely different values, since this is just a random number.
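This behavior can be observed with a short sketch. It relies on System.identityHashCode(), a standard method documented to return the same value the default Object.hashCode() would return; the class name IdentityHashDemo is my own.

```java
public class IdentityHashDemo {
    public static void main(String[] args) {
        Object first = new Object();
        Object second = new Object();

        // Within one run the value for a given object does not change.
        System.out.println(first.hashCode() == first.hashCode()); // true

        // identityHashCode() gives what the default hashCode() gives.
        System.out.println(System.identityHashCode(first) == first.hashCode()); // true

        // The concrete numbers are not tied to any contents
        // and may change from one run of the program to the next.
        System.out.println(first.hashCode());
        System.out.println(second.hashCode());
    }
}
```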

But, as we remember, the rule must hold: “if two objects of the same class have the same contents, then their hash codes must be the same”. Therefore, when creating a custom class, it is customary to override the hashCode() and equals() methods so that the object's fields are taken into account.
This can be done by hand or with the source code generation tools in an IDE. For example, in Eclipse this is Source → Generate hashCode() and equals()...

As a result, the BlackBox class takes the form:
 public class BlackBox {
     int varA;
     int varB;

     BlackBox(int varA, int varB) {
         this.varA = varA;
         this.varB = varB;
     }

     @Override
     public int hashCode() {
         final int prime = 31;
         int result = 1;
         result = prime * result + varA;
         result = prime * result + varB;
         return result;
     }

     @Override
     public boolean equals(Object obj) {
         if (this == obj)
             return true;
         if (obj == null)
             return false;
         if (getClass() != obj.getClass())
             return false;
         BlackBox other = (BlackBox) obj;
         if (varA != other.varA)
             return false;
         if (varB != other.varB)
             return false;
         return true;
     }
 }


Now the hashCode() and equals() methods work correctly and take the contents of the object's fields into account:

 System.out.println(object1.equals(object2)); // true
 System.out.println(object1.hashCode() == object2.hashCode()); // true
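For comparison, here is a shorter variant of the same idea written with java.util.Objects.hash() (available since Java 7). This is not what Eclipse generates, just an alternative sketch; the nested class Box is a hypothetical stand-in for BlackBox. For two int fields, Objects.hash() performs the same "multiply by 31 and add" arithmetic as the generated hashCode().

```java
import java.util.Objects;

public class ObjectsHashDemo {
    // Hypothetical stand-in for BlackBox.
    static class Box {
        final int varA;
        final int varB;

        Box(int varA, int varB) {
            this.varA = varA;
            this.varB = varB;
        }

        @Override
        public int hashCode() {
            // Combines the fields in order, starting from 1 and multiplying by 31.
            return Objects.hash(varA, varB);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || getClass() != obj.getClass()) return false;
            Box other = (Box) obj;
            return varA == other.varA && varB == other.varB;
        }
    }

    public static void main(String[] args) {
        Box object1 = new Box(5, 10);
        Box object2 = new Box(5, 10);

        System.out.println(object1.equals(object2)); // true
        System.out.println(object1.hashCode());      // 1126 = 31 * (31 * 1 + 5) + 10
        System.out.println(object2.hashCode());      // 1126
    }
}
```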


If you are interested in overriding these methods by hand, see Effective Java by Joshua Bloch, chapter 3, items 8 and 9.

Results


When creating a custom class, you need to override the hashCode() and equals() methods so that they work correctly and take the object's data into account. In addition, if you keep the implementations from Object, using java.util.HashMap will cause problems, since HashMap actively uses hashCode() and equals() in its work; this is well described by tarzan82 in the post “Data structures in pictures. HashMap”.
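The HashMap problem can be shown with a small sketch. PlainKey and GoodKey are hypothetical classes invented for this illustration: the first keeps the implementations from Object, the second overrides both methods.

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapLookupDemo {
    // Hypothetical key class that keeps hashCode() and equals() from Object.
    static class PlainKey {
        final int varA;
        final int varB;

        PlainKey(int varA, int varB) {
            this.varA = varA;
            this.varB = varB;
        }
    }

    // Hypothetical key class that overrides both methods.
    static class GoodKey {
        final int varA;
        final int varB;

        GoodKey(int varA, int varB) {
            this.varA = varA;
            this.varB = varB;
        }

        @Override
        public int hashCode() {
            return 31 * (31 + varA) + varB;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || getClass() != obj.getClass()) return false;
            GoodKey other = (GoodKey) obj;
            return varA == other.varA && varB == other.varB;
        }
    }

    public static void main(String[] args) {
        Map<PlainKey, String> plain = new HashMap<>();
        plain.put(new PlainKey(5, 10), "value");
        // A new key with the same fields is a different object, so it is not found.
        System.out.println(plain.get(new PlainKey(5, 10))); // null

        Map<GoodKey, String> good = new HashMap<>();
        good.put(new GoodKey(5, 10), "value");
        // Equal contents give an equal hash code and equals() == true, so it is found.
        System.out.println(good.get(new GoodKey(5, 10))); // value
    }
}
```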

References:


Source: https://habr.com/ru/post/168195/

