It is well known that in the .NET object model, as in many other software platforms, you can compare objects by reference and by value.
By default, two objects are considered equal if the corresponding variables contain the same reference. Otherwise, objects are considered unequal.
However, a situation may arise when it is necessary to consider objects of a certain class as equal if they are in some way the same in content.
Suppose there is a class Person containing personal data: the first name, last name, and date of birth of a person.
Using this class as an example, we will consider:
- the minimum required set of class modifications so that objects of this class are compared by value using the standard .NET infrastructure;
- the minimum necessary and sufficient set of improvements so that objects of this class are always compared by value using the standard .NET infrastructure - unless it is explicitly stated that the comparison should be made by reference.
For each case, we will consider how best to implement comparison of objects by value so that the resulting code is consistent and, as far as possible, compact, free of copy-paste, and performant.
The task is not as trivial as it may seem at first glance.
We will also consider what improvements could be made to the platform to simplify the implementation of this task.
Person class:
    using System;

    namespace HelloEquatable
    {
        public class Person
        {
            protected static string NormalizeName(string name) => name?.Trim() ?? string.Empty;

            protected static DateTime? NormalizeDate(DateTime? date) => date?.Date;

            public string FirstName { get; }

            public string LastName { get; }

            public DateTime? BirthDate { get; }

            public Person(string firstName, string lastName, DateTime? birthDate)
            {
                this.FirstName = NormalizeName(firstName);
                this.LastName = NormalizeName(lastName);
                this.BirthDate = NormalizeDate(birthDate);
            }
        }
    }
If two objects of class Person are compared in any of the standard ways, the objects will be considered equal only if the variables pointing to them contain the same reference.
When placed in hash sets (hash maps) and dictionaries, objects will also be considered equal only if the references match.
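The default behaviour described above can be observed with a minimal sketch. It uses a simplified stand-in for the Person class (same properties, no overrides); the ReferenceEqualityDemo class and its CountDistinct method are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the article's Person class: no Equals/GetHashCode overrides.
public class Person
{
    public string FirstName { get; }
    public string LastName { get; }
    public DateTime? BirthDate { get; }

    public Person(string firstName, string lastName, DateTime? birthDate)
    {
        FirstName = firstName?.Trim() ?? string.Empty;
        LastName = lastName?.Trim() ?? string.Empty;
        BirthDate = birthDate?.Date;
    }
}

// Hypothetical demo class, not part of the original article.
public static class ReferenceEqualityDemo
{
    // Adds two distinct instances with identical data to a HashSet
    // and reports how many entries the set ends up holding.
    public static int CountDistinct()
    {
        var p1 = new Person("John", "Smith", new DateTime(1990, 1, 1));
        var p2 = new Person("John", "Smith", new DateTime(1990, 1, 1));
        var set = new HashSet<Person> { p1, p2 };
        return set.Count;
    }

    public static void Main()
    {
        var p1 = new Person("John", "Smith", new DateTime(1990, 1, 1));
        var p2 = new Person("John", "Smith", new DateTime(1990, 1, 1));
        var p3 = p1; // same reference as p1

        Console.WriteLine(p1.Equals(p2));   // False: different references
        Console.WriteLine(p1 == p2);        // False: different references
        Console.WriteLine(p1.Equals(p3));   // True: same reference
        Console.WriteLine(CountDistinct()); // 2: the set treats p1 and p2 as different
    }
}
```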
To compare objects by value in the client code, you need to write lines like this:
    var p1 = new Person("John", "Smith", new DateTime(1990, 1, 1));
    var p2 = new Person("John", "Smith", new DateTime(1990, 1, 1));

    bool isSamePerson =
        p1.BirthDate == p2.BirthDate &&
        p1.FirstName == p2.FirstName &&
        p1.LastName == p2.LastName;
Notes:
- The Person class is implemented in such a way that the FirstName and LastName string properties are never null.
If FirstName or LastName is unknown (not set), an empty string serves as the sign that the value is absent.
This avoids a NullReferenceException when accessing the properties and methods of FirstName and LastName, and also avoids ambiguity when comparing null with an empty string (are the FirstName values of two objects equal if one is null and the other is an empty string?).
- The BirthDate property, in contrast, is implemented as a Nullable<T> structure, because if the date of birth is unknown (not specified), it is better to store exactly an undefined value in the property rather than a special value such as 01/01/1900, 01/01/1970, 01/01/0001 or MinValue.
- When comparing objects by value, the dates are compared first, since comparing date-time values is generally faster than comparing strings.
- Dates and strings are compared with the equality operator: for DateTime (and, through lifting, for DateTime?) the equality operator is overloaded to compare by value, and for strings the equality operator is likewise overloaded and compares by value.
So that objects of the Person class can be compared by value using the standard .NET infrastructure, the Person class must override the Object.Equals(Object) and Object.GetHashCode() methods as follows:
- The Equals(Object) method compares those class fields whose combination of values forms the value of the object.
- The GetHashCode() method must return the same hash code for equal objects (i.e., for objects whose comparison with Equals(Object) returns true).
It follows that if objects have different hash codes, they are not equal; at the same time, unequal objects can have the same hash code.
(The hash code is usually computed as the "exclusive or" of the GetHashCode() values of the fields that Equals uses to compare objects by value;
if a field is a 32-bit integer, the field value itself can be used instead of its hash code;
various optimizations are also possible to reduce the likelihood of collisions, when two unequal objects have the same hash code.)
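To illustrate why such optimizations may be worth considering, here is a small sketch showing that a plain XOR combination is symmetric, so swapping two field values always produces the same hash code. The HashCombineDemo class, both method names, and the constants 17 and 31 are assumptions made for this example; the multiply-and-add variant is one common alternative, not something the article prescribes.

```csharp
using System;

// Hypothetical demo class, not part of the original article.
public static class HashCombineDemo
{
    // XOR-based combination, as described in the text above.
    public static int XorHash(string first, string last) =>
        first.GetHashCode() ^ last.GetHashCode();

    // Order-sensitive alternative: multiply-and-add with odd constants.
    // The constants 17 and 31 are arbitrary choices for this sketch.
    public static int OrderedHash(string first, string last)
    {
        unchecked // let the arithmetic wrap around instead of throwing
        {
            int hash = 17;
            hash = hash * 31 + first.GetHashCode();
            hash = hash * 31 + last.GetHashCode();
            return hash;
        }
    }

    public static void Main()
    {
        // XOR is commutative, so swapped values always collide.
        Console.WriteLine(XorHash("John", "Smith") == XorHash("Smith", "John")); // True

        // XOR of a value with itself is zero, so equal fields cancel out.
        Console.WriteLine(XorHash("John", "John")); // 0

        // The ordered variant distinguishes the two orders in almost all cases.
        Console.WriteLine(OrderedHash("John", "Smith") == OrderedHash("Smith", "John"));
    }
}
```

Both variants still satisfy the basic rule that equal inputs give equal hash codes; they differ only in how often unequal inputs collide.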
It is worth paying special attention to the fact that the documentation for the Equals(Object) method contains special requirements:
- x.Equals(y) returns the same value as y.Equals(x).
- If (x.Equals(y) && y.Equals(z)) returns true, then x.Equals(z) returns true.
- x.Equals(null) returns false.
- Successive calls to x.Equals(y) return the same value as long as the objects referenced by x and y are not modified.
- And a number of others, in particular, concerning the rules for comparing floating-point values.
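The requirements listed above can be spot-checked mechanically for any three objects. The following is a minimal sketch, not an exhaustive test: the EqualsContractDemo class, the SatisfiesContract method, and the deliberately broken AlwaysEqual type are all hypothetical names introduced for this illustration. It also checks reflexivity (x.Equals(x) returns true), which is part of the documented contract as well.

```csharp
using System;

// Hypothetical type whose Equals deliberately violates the contract:
// it reports equality even with null.
public class AlwaysEqual
{
    public override bool Equals(object obj) => true;
    public override int GetHashCode() => 0;
}

// Hypothetical demo class, not part of the original article.
public static class EqualsContractDemo
{
    // Returns true if x, y and z pass a few basic checks derived from the
    // documented Equals(Object) requirements.
    public static bool SatisfiesContract(object x, object y, object z)
    {
        bool reflexive = x.Equals(x);
        bool symmetric = x.Equals(y) == y.Equals(x);
        bool transitive = !(x.Equals(y) && y.Equals(z)) || x.Equals(z);
        bool nullIsUnequal = !x.Equals(null);
        bool repeatable = x.Equals(y) == x.Equals(y);
        return reflexive && symmetric && transitive && nullIsUnequal && repeatable;
    }

    public static void Main()
    {
        // Three distinct string instances with the same content.
        string a = new string('a', 3), b = new string('a', 3), c = new string('a', 3);
        Console.WriteLine(SatisfiesContract(a, b, c)); // True

        // Plain objects use reference equality, which also satisfies the contract.
        Console.WriteLine(SatisfiesContract(new object(), new object(), new object())); // True

        // The broken type fails the x.Equals(null) requirement.
        Console.WriteLine(SatisfiesContract(new AlwaysEqual(), new AlwaysEqual(), new AlwaysEqual())); // False
    }
}
```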
It is also worth noting that the documentation for the GetHashCode() method warns that the value returned by the method is not permanent and therefore should not be saved to disk or to a database or used as a key, and that it should not be used to compare objects for equality (unequal objects can have the same hash code), etc.
Person class with overridden Equals(Object) and GetHashCode() methods:
    using System;

    namespace HelloEquatable
    {
        public class Person
        {
            protected static string NormalizeName(string name) => name?.Trim() ?? string.Empty;

            protected static DateTime? NormalizeDate(DateTime? date) => date?.Date;

            public string FirstName { get; }

            public string LastName { get; }

            public DateTime? BirthDate { get; }

            public Person(string firstName, string lastName, DateTime? birthDate)
            {
                this.FirstName = NormalizeName(firstName);
                this.LastName = NormalizeName(lastName);
                this.BirthDate = NormalizeDate(birthDate);
            }

            public override int GetHashCode() =>
                this.FirstName.GetHashCode() ^
                this.LastName.GetHashCode() ^
                this.BirthDate.GetHashCode();

            protected static bool EqualsHelper(Person first, Person second) =>
                first.BirthDate == second.BirthDate &&
                first.FirstName == second.FirstName &&
                first.LastName == second.LastName;

            public override bool Equals(object obj)
            {
                if ((object)this == obj)
                    return true;

                var other = obj as Person;

                if ((object)other == null)
                    return false;

                return EqualsHelper(this, other);
            }
        }
    }
Notes on the GetHashCode() method:
- If any of the fields used contains null, zero is usually used instead of its GetHashCode() value.
- The Person class is implemented in such a way that the FirstName and LastName reference fields cannot be null, and the BirthDate field is a Nullable<T> structure whose GetHashCode() returns zero when the value is undefined, so no NullReferenceException arises when GetHashCode() is called.
- If the fields of the Person class could contain null, the GetHashCode() method would be implemented as follows:
    public override int GetHashCode() =>
        (this.FirstName?.GetHashCode() ?? 0) ^
        (this.LastName?.GetHashCode() ?? 0) ^
        (this.BirthDate?.GetHashCode() ?? 0);
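A note on operator precedence for expressions of this kind: in C#, the ?? operator binds more loosely than ^, so each "x ?? 0" term has to be wrapped in parentheses for the XOR to combine all the fields. The sketch below uses hypothetical integer inputs in place of real hash codes; the CoalescePrecedenceDemo class and its method names are introduced only for this illustration.

```csharp
using System;

// Hypothetical demo class, not part of the original article.
public static class CoalescePrecedenceDemo
{
    // Parsed as: a ?? ((0 ^ b) ?? 0), so b is ignored whenever a has a value.
    public static int WithoutParentheses(int? a, int? b) => a ?? 0 ^ b ?? 0;

    // Each term is coalesced first, then the results are XOR-ed.
    public static int WithParentheses(int? a, int? b) => (a ?? 0) ^ (b ?? 0);

    public static void Main()
    {
        Console.WriteLine(WithoutParentheses(5, 3)); // 5: the second value is never used
        Console.WriteLine(WithParentheses(5, 3));    // 6: 5 XOR 3
        Console.WriteLine(WithParentheses(null, 3)); // 3: null is treated as zero
    }
}
```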
Let us consider in detail exactly how the Equals(Object) method is implemented:
- First, the reference to the current object (this) is compared with the reference to the incoming object, and if the references are equal, true is returned (this is the same object, so comparing by value makes no sense, not least for performance reasons).
- Then, the incoming object is cast to the Person type using the as operator. If the result of the cast is null, false is returned (either the incoming reference was null to begin with, or the incoming object is incompatible with the Person class and is therefore certainly not equal to the current object).
- Then the fields of the two Person objects are compared by value, and the result is returned.
For readability and possible reuse, the comparison by value itself is moved into the auxiliary EqualsHelper method.
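The effect of the two overrides on the standard infrastructure can be sketched as follows. The Person class here is a compact copy of the one above (the helper methods are inlined to keep the sample short); the ValueEqualityDemo class and its method names are hypothetical. Note that the == operator is not affected by overriding Equals(Object) and still compares references.

```csharp
using System;
using System.Collections.Generic;

// Compact copy of the Person class with Equals/GetHashCode overridden.
public class Person
{
    public string FirstName { get; }
    public string LastName { get; }
    public DateTime? BirthDate { get; }

    public Person(string firstName, string lastName, DateTime? birthDate)
    {
        FirstName = firstName?.Trim() ?? string.Empty;
        LastName = lastName?.Trim() ?? string.Empty;
        BirthDate = birthDate?.Date;
    }

    public override int GetHashCode() =>
        FirstName.GetHashCode() ^ LastName.GetHashCode() ^ BirthDate.GetHashCode();

    public override bool Equals(object obj)
    {
        if ((object)this == obj) return true;     // same reference
        var other = obj as Person;
        if ((object)other == null) return false;  // null or incompatible type
        return BirthDate == other.BirthDate
            && FirstName == other.FirstName
            && LastName == other.LastName;
    }
}

// Hypothetical demo class, not part of the original article.
public static class ValueEqualityDemo
{
    private static Person NewJohn() => new Person("John", "Smith", new DateTime(1990, 1, 1));

    // Two equal-by-value instances collapse into one set entry.
    public static int CountDistinct() => new HashSet<Person> { NewJohn(), NewJohn() }.Count;

    // A dictionary lookup succeeds with a different but equal key instance.
    public static bool DictionaryFindsEqualKey()
    {
        var notes = new Dictionary<Person, string> { [NewJohn()] = "found" };
        return notes.ContainsKey(NewJohn());
    }

    // The == operator is not overloaded here, so it still compares references.
    public static bool OperatorComparesByValue() => NewJohn() == NewJohn();

    public static void Main()
    {
        Console.WriteLine(NewJohn().Equals(NewJohn()));       // True
        Console.WriteLine(Equals(NewJohn(), NewJohn()));      // True: static Object.Equals(Object, Object)
        Console.WriteLine(CountDistinct());                   // 1
        Console.WriteLine(DictionaryFindsEqualKey());         // True
        Console.WriteLine(OperatorComparesByValue());         // False
    }
}
```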
So far we have implemented only the minimum necessary functionality for comparing objects by value, but questions are already emerging.
The first question is more theoretical.
Pay attention to the requirement for the Equals(Object) method:
x.Equals(null) returns false.
Once I wondered why some instance methods in the standard .NET library check this for null. For example, the String.Equals(Object) method is implemented like this:
    public override bool Equals(Object obj)
    {
        //this is necessary to guard against reverse-pinvokes and
        //other callers who do not use the callvirt instruction
        if (this == null)
            throw new NullReferenceException();

        String str = obj as String;
        if (str == null)
            return false;

        if (Object.ReferenceEquals(this, obj))
            return true;

        if (this.Length != str.Length)
            return false;

        return EqualsHelper(this, str);
    }
First of all, the method checks this for null and, if the check succeeds, throws a NullReferenceException.
The comment indicates in which cases this may be null.
(By the way, this is compared to null using the == operator, which is overloaded in the String class, so from a performance point of view it would be better to perform the check by explicitly casting this to object, as in (object)this == null, or to use the Object.ReferenceEquals(Object, Object) method, as is done in the second comparison in the same method.)
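The difference between the overloaded == operator of String and a plain reference comparison can be seen in a short sketch. The StringComparisonDemo class and its method names are hypothetical; the second string is built from a char array so that it is a separate instance with the same content.

```csharp
using System;

// Hypothetical demo class, not part of the original article.
public static class StringComparisonDemo
{
    // Builds a new string instance at run time with the given content.
    private static string Copy(string s) => new string(s.ToCharArray());

    // Uses String's overloaded == operator: compares content.
    public static bool OperatorEquals(string a, string b) => a == b;

    // Casting to object selects the built-in reference comparison.
    public static bool CastReferenceEquals(string a, string b) => (object)a == (object)b;

    public static void Main()
    {
        string s1 = "abc";
        string s2 = Copy(s1); // same content, different instance

        Console.WriteLine(OperatorEquals(s1, s2));         // True: compared by value
        Console.WriteLine(CastReferenceEquals(s1, s2));    // False: different references
        Console.WriteLine(ReferenceEquals(s1, s2));        // False: different references
        Console.WriteLine(CastReferenceEquals(s1, s1));    // True: same reference
    }
}
```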
Later an article appeared where you can read more about this:
When this == null: a true story from the world of the CLR.
However, if the overridden Person.Equals(Object) method is called with a null this reference (that is, without an instance) and null is passed as the argument, then the very first line of the method, if ((object)this == obj) return true;, returns true, which is in fact correct but formally contradicts the requirements for implementing the method.
At the same time, the documentation for the method does not say that the implementation must first check this for null and throw an exception if it is.
Otherwise, every instance method of every class would have to begin with a null check of this, which is absurd.
Therefore, it seems that the official requirements for the implementation of the Equals(Object) method should be clarified as follows:
- (for classes, not structures) if the references to the current and incoming objects are equal, true is returned;
- and only then, as the second requirement, if the reference to the incoming object is null, false is returned.
The second question about the implementation of the Equals(Object) method is more interesting and has practical significance.
It concerns how best to implement the requirement:
x.Equals(y) returns the same value as y.Equals(x).
It also concerns whether the documentation sets out the requirements and examples for this part of the implementation fully and consistently, and whether there are alternative approaches to meeting this requirement.
We will talk about this, as well as about implementing the complete set of class improvements needed to compare its objects by value, in the following publications.