Monday, 20 July 2015

3 Ways to Find Duplicate Elements in an Array - Java

There are multiple ways to find duplicate elements in an array in Java and we will see three of them in this program. Solution and logic shown in this article is generic and applies to array of any type e.g. String array or integer array or array of any object. One of the most common way to find duplicates is by using brute force method, which compare each element of array to every other element. This solution has time complexity of O(n^2) and only exists for academic purpose. You shouldn't be using this solution in real world. Standard way to find duplicate elements from array is by using HashSet data structure. If you remember, Set abstract data type doesn't allow duplicates. You can take advantage of this property to filter duplicate elements. This solution has time complexity of O(n), as you only need to iterate over array once, but also has space complexity of O(n) as you need to store unique elements in array. Our third solution to find duplicate elements in array is actually similar to our second solution but instead of using Set data structure we will use hash table data structure. This is a pretty good solution because you can extend it to found count of duplicates as well. In this solution, we iterate over array and build the map which stores array elements and their count. Once the table is build, you can iterate over hash table and print out all the elements, who has count greater than one, those are your duplicates. This question is also very popular on programming interviews and if you are preparing for them, I also suggest you to solves problems from Cracking the Coding Interview: 150 Programming Questions and Solutions. One of the best book to prepare for software developer interviews.


How to find duplicates in Java array?

In first paragraph, I have given you brief overview of thee ways to find duplicate elements from Java array. Now, let's understand logic behind each of those solution in little more detail.

Solution 1 :

Our first solution is very simple. All we are doing here is to looping over array and comparing each element to every other element. For doing this, we are using two loops, inner loop and outer loop. We are also making sure that we are ignoring comparing of elements to itself by checking for i != j before printing duplicates. Since we are comparing every element to every other element, this solution has quadratic time complexity i.e. O(n^2). This solution has the worst complexity in all three solutions.
 for (int i = 0; i < names.length; i++) {
     for (int j = i + 1 ; j < names.length; j++) {
          if (names[i].equals(names[j])) {
                   // got the duplicate element
          }
     }
 }


Solution 2 :

Second solution is even simpler than this. All you need to know is that Set doesn't allow duplicates in Java. Which means if you have added an element into Set and trying to insert duplicate element again, it will not be allowed. In Java, you can use HashSet class to solve this problem. Just loop over array elements, insert them into HashSet using add() method and check return value. If add() returns false it means that element is not allowed in the Set and that is your duplicate. Here is the code sample to do this :
 for (String name : names) {
     if (set.add(name) == false) {
        // your duplicate element
     }
}
Complexity of this solution is O(n) because you are only going through array one time, but it also has space complexity of O(n) because of HashSet data structure, which contains your unique elements. So if an array contains 1 million elements, in worst case you would need an HashSet to store those 1 million elements.


Solution 3 :

Our third solution take advantage of another useful data structure, hash table. All you need to do is loop over array using foreach and insert each element and its count into hash table. You can use HashMap class of JDK to solve this problem. It is the general purpose hash table implementation in Java. In order to build table, you check if hash table contains the elements or not, if it is then increment the count otherwise insert element with count 1. Once you have this table ready, you can iterate over hashtable and print all those keys which has values greater than one. These are your duplicate elements. This is in fact a very good solution because you can extend it to found count of duplicates as well. If you remember, I have used this approach to find duplicate characters in String earlier. Here is how you code will look like :
// build hash table with count
        for (String name : names) {
            Integer count = nameAndCount.get(name);
            if (count == null) {
                nameAndCount.put(name, 1);
            } else {
                nameAndCount.put(name, ++count);
            }
        }

        // Print duplicate elements from array in Java
        Set<Entry<String, Integer>> entrySet = nameAndCount.entrySet();
        for (Entry<String, Integer> entry : entrySet) {
            if (entry.getValue() > 1) {
                System.out.printf("duplicate element '%s' and count '%d' :", entry.getKey(), entry.getValue());
            }
        }
Time complexity of this solution is O(2n) because we are iterating over array twice and space complexity is same as previous solution O(n). In worst case you would need a hash table with size of array itself.

How to find duplicate elements in array in Java



Java Program to find duplicate elements in array

Here is our three solutions packed into a Java program to find duplicate elements in array. You can run this example from command line or Eclipse IDE, whatever suits you. Just make sure that name of your Java source file should be same as your public class e.g. "DuplicatesInArray". I have left bit of exercise for you, of course if you would like to do. Can you refactor this code into methods, you can do that easily by using extract method feature of IDE like Eclipse and Netbans and write unit test to check the logic of each approach. This would give you some practice on refactoring code and writing JUnit tests, two important attributes of a professional programmer.
package dto;

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

/**
 * Java Program to find duplicate elements in an array. There are two straight
 * forward solution of this problem first, brute force way and second by using
 * HashSet data structure. A third solution, similar to second one is by using
 * hash table data structure e.g. HashMap to store count of each element and
 * print element with count 1.
 * 
 * @author java67
 */

public class DuplicatesInArray{

    public static void main(String args[]) {

        String[] names = { "Java", "JavaScript", "Python", "C", "Ruby", "Java" };

        // First solution : finding duplicates using brute force method
        System.out.println("Finding duplicate elements in array using brute force method");
        for (int i = 0; i < names.length; i++) {
            for (int j = i + 1; j < names.length; j++) {
                if (names[i].equals(names[j]) ) {
                   // got the duplicate element
                }
            }
        }

        // Second solution : use HashSet data structure to find duplicates
        System.out.println("Duplicate elements from array using HashSet data structure");
        Set<String> store = new HashSet<>();

        for (String name : names) {
            if (store.add(name) == false) {
                System.out.println("found a duplicate element in array : "
                        + name);
            }
        }

        // Third solution : using Hash table data structure to find duplicates
        System.out.println("Duplicate elements from array using hash table");
        Map<String, Integer> nameAndCount = new HashMap<>();

        // build hash table with count
        for (String name : names) {
            Integer count = nameAndCount.get(name);
            if (count == null) {
                nameAndCount.put(name, 1);
            } else {
                nameAndCount.put(name, ++count);
            }
        }

        // Print duplicate elements from array in Java
        Set<Entry<String, Integer>> entrySet = nameAndCount.entrySet();
        for (Entry<String, Integer> entry : entrySet) {

            if (entry.getValue() > 1) {
                System.out.println("Duplicate element from array : "
                        + entry.getKey());
            }
        }
    }
}
Output :
Finding duplicate elements in array using brute force method
Duplicate elements from array using HashSet data structure
found a duplicate element in array : Java
Duplicate elements from array using hash table
Duplicate element from array : Java
From the output you can see that the only duplicate element from our String array, which is "Java" has been found by all of our three solution.


That's all about how to find duplicate elements in an array in Java. In this tutorial, you have learned three ways to solve this problem. The brute force way require you to compare each element from array to another, hence has quadratic time complexity. You can optimize performance by using HashSet data structure, which doesn't allow duplicates. So a duplicate element is the one for which add() method of HashSet return false. Our third solution uses hash table data structure to make a table of elements and their count. Once you build that table, iterate over it and print elements whose count is greater than one. This is a very good coding problem and frequently asked in Java Interview. It also shows how use of a right data structure can improve performance of algorithm significantly.

If you have like this coding problem, you may also like to solve following coding problems from Java Interviews :
  • How to find all pairs on integer array whose sum is equal to given number? [solution]
  • How to remove duplicates from array in Java? [solution]
  • How to sort an array in place using QuickSort algorithm? [solution]
  • Write a program to find top two numbers from an integer array? [solution]
  • How do you remove duplicates from array in place? [solution]
  • How do you reverse array in place in Java? [solution]
  • Write a program to find missing number in integer array of 1 to 100? [solution]
  • How to check if array contains a number in Java? [solution]
  • How to find maximum and minimum number in unsorted array? [solution]


Recommended books to Prepare for Software Engineer Interviews

If you are solving these coding problems to prepare for software engineer job interviews, you can also take a look at following books. They contain wealth of knowledge and several frequently asked coding problems from Java and C++ interviews :
  • Programming Interviews Exposed: Secrets to Landing Your Next Job (book)
  • Coding Puzzles: Thinking in code By codingtmd (book)
  • Cracking the Coding Interview: 150 Programming Questions and Solutions (book)

No comments:

Post a Comment