
Some subtleties of GetHashCode

While reading "Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries", I came across this sentence:

"Ensure that GetHashCode returns exactly the same value regardless of any changes."

Hmm, I thought, what are they talking about? The standard implementation generated by ReSharper came to mind, and I realized that the value it produces does not stay constant over the life of the object: it changes whenever the fields it is computed from change.

I decided to sketch a sample to see how serious the problem is. Suppose we have a class that represents a person, and we identify each person uniquely by their SNILS number:
    public class Employee
    {
        public string FirstName { get; set; }
        public string SecondName { get; set; }
        public string Snils { get; set; }

        protected bool Equals(Employee other)
        {
            return string.Equals(Snils, other.Snils);
        }

        public override bool Equals(object obj)
        {
            if (ReferenceEquals(null, obj)) return false;
            if (ReferenceEquals(this, obj)) return true;
            if (obj.GetType() != this.GetType()) return false;
            return Equals((Employee) obj);
        }

        public override int GetHashCode()
        {
            return (Snils != null ? Snils.GetHashCode() : 0);
        }
    }

The overridden methods were generated by ReSharper. At first glance, everything is fine: the field used in the equality test is also the field used to compute the hash, so equal objects have equal hash codes. It all looks great.
Let's add some business logic:

    var employees = new HashSet<Employee>();
    var employee = new Employee()
    {
        FirstName = "Sergei",
        SecondName = "Popov",
        Snils = "123456"
    };
    employees.Add(employee);
    Console.WriteLine(employees.Contains(employee));

The output is "True".
Now, what if at some point I decide to change my SNILS number?

    var employees = new HashSet<Employee>();
    var employee = new Employee()
    {
        FirstName = "Sergei",
        SecondName = "Popov",
        Snils = "123456"
    };
    employees.Add(employee);
    // Change the SNILS number after the object has been added to the set
    employee.Snils = "654321";
    Console.WriteLine(employees.Contains(employee));

This time the output is "False".

What happened?
Internally, a HashSet consists of a number of buckets. The bucket for an object is chosen from the value returned by GetHashCode. As soon as we changed the SNILS number, the value returned by GetHashCode changed as well. The HashSet then chose a bucket to search based on the new hash code and, naturally, the object is not there (with a very small probability it could have been found anyway, if the old and new hash codes happened to coincide). The HashSet does not look in the other buckets, because equal objects must return equal values from GetHashCode. That's it: the object is not found, even though it is still stored in the set.
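The effect described above can be reproduced in a small self-contained sketch. It uses a trimmed-down Employee (only the Snils property, with a simplified Equals), and the names LookupDemo and Run are made up for this illustration. It compares a hash-based lookup with a plain linear scan over the same set after the key has been changed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down version of the article's class: only the key field is kept.
public class Employee
{
    public string Snils { get; set; }

    public override bool Equals(object obj)
    {
        return obj is Employee other && string.Equals(Snils, other.Snils);
    }

    public override int GetHashCode()
    {
        return Snils != null ? Snils.GetHashCode() : 0;
    }
}

// Hypothetical demo class, not part of the original article.
public static class LookupDemo
{
    public static (bool ByHash, bool ByScan) Run()
    {
        var employee = new Employee { Snils = "123456" };
        var employees = new HashSet<Employee> { employee };

        // The hash code changes while the object is already stored in the set.
        employee.Snils = "654321";

        // Hash-based lookup: searches using the NEW hash code.
        bool byHash = employees.Contains(employee);

        // Linear scan: enumerates every stored element, ignoring hash codes.
        bool byScan = employees.Any(e => ReferenceEquals(e, employee));

        return (byHash, byScan);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine($"Contains: {result.ByHash}, found by scan: {result.ByScan}");
    }
}
```

Barring an accidental collision between the old and new hash codes, Contains should report False while the scan still finds the object, which confirms that the element was never removed and only the hash-based lookup path stopped reaching it.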

So how is this supposed to work?
If you have not overridden Equals and GetHashCode, your object returns the same GetHashCode value throughout its life, regardless of any changes you make to its fields. But if you override these methods, you have to either use only immutable fields in the hash algorithm, or never change the fields used in it, or invent your own workaround (for example, you can fall back on the approach used by the default implementation in the Object class).
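One way to follow the first option is to make the field that feeds the hash read-only. The following is only a sketch under that assumption, not the ReSharper-generated code: the constructor, the sealed modifier, the IEquatable implementation and the ImmutableKeyDemo class are all introduced here for illustration.

```csharp
using System;
using System.Collections.Generic;

public sealed class Employee : IEquatable<Employee>
{
    public Employee(string snils)
    {
        Snils = snils ?? throw new ArgumentNullException(nameof(snils));
    }

    // Read-only: the only field used by Equals and GetHashCode can never change.
    public string Snils { get; }

    // Mutable fields are fine because they do not take part in hashing.
    public string FirstName { get; set; }
    public string SecondName { get; set; }

    public bool Equals(Employee other) => other != null && Snils == other.Snils;

    public override bool Equals(object obj) => Equals(obj as Employee);

    public override int GetHashCode() => Snils.GetHashCode();
}

// Hypothetical demo class, not part of the original article.
public static class ImmutableKeyDemo
{
    public static bool Run()
    {
        var employee = new Employee("123456") { FirstName = "Sergei", SecondName = "Popov" };
        var employees = new HashSet<Employee> { employee };

        // Changing a field that is not hashed leaves the lookup intact.
        employee.SecondName = "Ivanov";

        return employees.Contains(employee);
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "True"
    }
}
```

With this design a changed SNILS number has to be modelled as a new Employee instance, which makes the cost of changing the identity explicit instead of silently breaking collections.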

Hence the moral:
The hash code must stay constant throughout the life of the object, or, if it can change in your case, you must be fully aware of what you are doing.
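If the hashed field really does have to change, one possible way of "being aware of what you are doing" is to take the object out of every hash-based collection before the change and put it back afterwards. The sketch below assumes a trimmed-down mutable Employee; the helper ChangeSnils and the class RekeyDemo are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down mutable version of the article's class.
public class Employee
{
    public string Snils { get; set; }

    public override bool Equals(object obj)
    {
        return obj is Employee other && string.Equals(Snils, other.Snils);
    }

    public override int GetHashCode()
    {
        return Snils != null ? Snils.GetHashCode() : 0;
    }
}

// Hypothetical helper, not part of the original article.
public static class RekeyDemo
{
    public static void ChangeSnils(HashSet<Employee> set, Employee employee, string newSnils)
    {
        // Remove while the old hash code is still valid, so the lookup succeeds.
        bool wasPresent = set.Remove(employee);

        employee.Snils = newSnils;

        // Re-insert under the new hash code. Add returns false (and stores nothing)
        // if an equal employee with the new SNILS is already in the set.
        if (wasPresent)
        {
            set.Add(employee);
        }
    }

    public static bool Run()
    {
        var employee = new Employee { Snils = "123456" };
        var employees = new HashSet<Employee> { employee };

        ChangeSnils(employees, employee, "654321");

        return employees.Contains(employee);
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "True"
    }
}
```

This only works when the code that mutates the object knows about every collection that holds it, which is exactly why the immutable approach is usually easier to reason about.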

P.S. I realize this is hardly rocket science. Everything written here is obvious and follows from Microsoft's requirements for these methods, and Eric Lippert has a good description of them. Still, when I first ran into this, I did not believe that HashSet would return False. I hope it will no longer surprise you either.

Source: https://habr.com/ru/post/235101/

