When All Things Are Not Equal

This blog was created by Arafat Tanin, Software Security Engineer, OpenRefactory and edited by Charlie Bedard.

When working with Java and its extensive standard library, you may often find yourself dealing with collections like HashSet and HashMap. These collections rely on the equals(Object) and hashCode() methods of the objects they store. But do you know why these methods are crucial, and how they work together to ensure consistency in collections? Let’s dive into the world of equals(Object) and hashCode().

The Contract

The Java API documentation for java.lang.Object defines a contract between the equals(Object) and hashCode() methods of an object. This contract is vital for the proper functioning of hash-based data structures like HashSet and HashMap. The part of the contract that matters here can be summarized as follows:

If two objects are equal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce the same integer result.

In simpler terms, if two objects are considered equal, they should have the same hash code.
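As a quick illustration of the rule, the sketch below uses java.lang.String, whose equals(Object) and hashCode() already honor the contract. The class name ContractDemo and the helper honorsContract are made up for this example. It also shows that the rule only runs in one direction: unequal objects are allowed to share a hash code.

```java
class ContractDemo {
    // Hypothetical helper: true when the pair does not violate the rule
    // "equal objects must have equal hash codes".
    static boolean honorsContract(Object x, Object y) {
        return !x.equals(y) || x.hashCode() == y.hashCode();
    }

    public static void main(String[] args) {
        // Two distinct String instances with the same characters.
        String a = new String("hello");
        String b = new String("hello");

        System.out.println(a == b);                 // false: different instances
        System.out.println(a.equals(b));            // true: same contents
        System.out.println(honorsContract(a, b));   // true: hash codes match

        // The reverse is not required: "Aa" and "BB" are not equal,
        // yet both hash to 2112.
        System.out.println("Aa".equals("BB"));                   // false
        System.out.println("Aa".hashCode() == "BB".hashCode());  // true
    }
}
```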

Why Is This Contract Important?

Imagine that we are using a HashSet to store a collection of objects. The HashSet uses hash codes to efficiently organize and retrieve its elements. When we add an object to a HashSet, the set calculates the object’s hash code and places the object in the bucket that corresponds to that code. Later, when we want to check whether an object exists in the set, the set first calculates the hash code of the object we are looking for, checks the bucket associated with that hash code, and then uses the equals(Object) method to verify that an element in that bucket is truly the one we are looking for.
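The two-step lookup described above can be sketched with a toy set. TinyHashSet is a hypothetical class written for this post; it is not how java.util.HashSet is actually implemented (the real class resizes, handles null, and more), but it follows the same idea: hashCode() picks the bucket, and equals(Object) is only consulted inside that bucket.

```java
import java.util.ArrayList;
import java.util.List;

// Toy set for illustration only. Assumes non-null elements and a fixed bucket count.
class TinyHashSet<E> {
    private final List<List<E>> buckets = new ArrayList<>();

    TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: the hash code selects the bucket.
    private List<E> bucketFor(Object o) {
        int index = Math.floorMod(o.hashCode(), buckets.size());
        return buckets.get(index);
    }

    boolean add(E element) {
        List<E> bucket = bucketFor(element);
        if (bucket.contains(element)) {   // List.contains relies on equals(Object)
            return false;                 // an equal element is already stored
        }
        bucket.add(element);
        return true;
    }

    // Step 2: equals(Object) confirms the match, but only within that one bucket.
    boolean contains(Object o) {
        for (E candidate : bucketFor(o)) {
            if (candidate.equals(o)) {
                return true;
            }
        }
        return false;
    }
}

class TinyHashSetDemo {
    public static void main(String[] args) {
        TinyHashSet<String> set = new TinyHashSet<>(16);
        set.add("apple");
        System.out.println(set.contains(new String("apple"))); // true
        System.out.println(set.contains("pear"));              // false
    }
}
```

If two equal objects returned different hash codes, bucketFor could send the lookup to a bucket the stored element was never placed in, and equals(Object) would never get the chance to compare them.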

Now, let’s consider what would happen if the equals(Object) and hashCode() contract were violated. Suppose we have two objects, obj1 and obj2, which are considered equal based on our custom equals(Object) implementation, but their hashCode() values are different. If we try to use these objects in a HashSet, the set might look for them in different buckets because of their inconsistent hash codes. As a result, we’d encounter unexpected behavior: after adding obj1, contains(obj2) might return false even though an equal object is in the set.

Bad Implementation

To illustrate the importance of a good implementation, consider a bad implementation of equals(Object) and hashCode():

import java.util.HashSet;
import java.util.Objects;

class BadImplementation {
    private final String firstName;
    private final String lastName;
    private final int age;

    public BadImplementation(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        BadImplementation person = (BadImplementation) o;
        return age == person.age &&
                Objects.equals(firstName, person.firstName) &&
                Objects.equals(lastName, person.lastName);
    }

    // hashCode() was not overridden here
}

In this implementation, the contract between equals(Object) and hashCode() is broken. Because hashCode() is inherited from Object, two distinct instances will typically return different hash codes even when equals(Object) considers them equal:

public static void main(String[] args) {
    HashSet<BadImplementation> badImplementationHashSet = new HashSet<>();

    BadImplementation person1 = new BadImplementation("John", "Doe", 30);
    BadImplementation person2 = new BadImplementation("Jane", "Smith", 25);
    BadImplementation person3 = new BadImplementation("John", "Doe", 30);

    // Testing equality
    System.out.println("Person1.equals(person2): " + person1.equals(person2)); // false
    System.out.println("Person1.equals(person3): " + person1.equals(person3)); // true

    // Testing hash codes
    System.out.println("Person1.hashCode(): " + person1.hashCode());
    System.out.println("Person2.hashCode(): " + person2.hashCode());
    System.out.println("Person3.hashCode(): " + person3.hashCode());

    // Testing hash set: person3 equals person1, which is in the set
    badImplementationHashSet.add(person1);
    badImplementationHashSet.add(person2);

    boolean searchValue = badImplementationHashSet.contains(person3);

    if (searchValue) {
        System.out.println("Found");
    } else {
        System.out.println("Not found");
    }
}

The result will look like this (the exact hash code values can differ from run to run):

Person1.equals(person2): false
Person1.equals(person3): true
Person1.hashCode(): 135721597
Person2.hashCode(): 142257191
Person3.hashCode(): 1044036744
Not found

See the demonstration here and source code here

Proper Implementation

Now, let’s see a proper implementation:

import java.util.Objects;

class ProperImplementation {
    private final String firstName;
    private final String lastName;
    private final int age;

    public ProperImplementation(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        ProperImplementation person = (ProperImplementation) o;
        return age == person.age &&
                Objects.equals(firstName, person.firstName) &&
                Objects.equals(lastName, person.lastName);
    }

    // hashCode() is overridden here, using the same fields that equals(Object) compares
    @Override
    public int hashCode() {
        return Objects.hash(firstName, lastName, age);
    }
}

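Here, the contract is honored: hashCode() is computed from the same fields that equals(Object) compares. Running the same test against ProperImplementation therefore produces the output the developer expects: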

Person1.equals(person2): false
Person1.equals(person3): true
Person1.hashCode(): -2068529906
Person2.hashCode(): 396701379
Person3.hashCode(): -2068529906
Found

See the demonstration here and find the source code here.
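The same reasoning applies to HashMap keys. The sketch below is not part of the original demonstration: PersonKey and HashMapLookupDemo are hypothetical names, and the key class simply mirrors the fields used in this post. It shows that a freshly constructed key with the same field values retrieves the entry stored under an equal key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key class with consistent equals(Object) and hashCode().
final class PersonKey {
    private final String firstName;
    private final String lastName;
    private final int age;

    PersonKey(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        PersonKey other = (PersonKey) o;
        return age == other.age
                && Objects.equals(firstName, other.firstName)
                && Objects.equals(lastName, other.lastName);
    }

    @Override
    public int hashCode() {
        // Same fields as equals(Object), so equal keys hash alike.
        return Objects.hash(firstName, lastName, age);
    }
}

class HashMapLookupDemo {
    public static void main(String[] args) {
        Map<PersonKey, String> departments = new HashMap<>();
        departments.put(new PersonKey("John", "Doe", 30), "Engineering");

        // A different instance with the same field values finds the entry.
        System.out.println(departments.get(new PersonKey("John", "Doe", 30)));   // Engineering
        // A key that is not equal to any stored key yields null.
        System.out.println(departments.get(new PersonKey("Jane", "Smith", 25))); // null
    }
}
```

Without the hashCode() override, the first lookup would most likely print null, for the same reason the HashSet example above printed "Not found".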

Conclusion

In the world of Java, understanding the contract between equals(Object) and hashCode() is crucial when working with hash-based data structures. Violating this contract can lead to unpredictable and erroneous behavior, such as a HashSet failing to find an element that is equal to one it already contains. Implementing both methods consistently, based on the same fields, ensures that collections behave reliably.

End Note:
The work done by OpenRefactory has been supported by a grant from the Alpha Omega project. 
