# [Solved] Find Duplicate in Array in O(n) Linear Time

Problem Statement:

```
An array contains n numbers ranging from 0 to n-1. There are some numbers duplicated in the array.

It is not clear how many numbers are duplicated or how many times a number gets duplicated.

How do you find a duplicated number in the array?
```

Example:

If an array of length 7 contains the numbers {2, 3, 1, 0, 2, 5, 3}, the implemented function (or method) should return either 2 or 3, since both are duplicated.

#### Method 1: Using Sorting

The simplest solution is to sort the array. After sorting, equal values sit next to each other, so any element that equals its right-hand neighbor is a duplicate.

Python Program:

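```python
def findDup(liArr):
    # Sort in place so that equal values become adjacent.
    liArr.sort()

    liDuplicate = []
    for i in range(len(liArr) - 1):
        # A value equal to its right-hand neighbor is a duplicate.
        if liArr[i] == liArr[i + 1]:
            liDuplicate.append(liArr[i])

    return liDuplicate

print(findDup([2, 3, 1, 0, 2, 5, 3]))
```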

Output:

`[2, 3]`

Complexity:

A comparison-based sort such as merge sort takes `O(n log n)` time to sort n elements. Python's `list.sort()` uses Timsort, which has the same worst-case bound. After sorting, one more pass over the sorted array takes `O(n)` time.

So the total complexity of this algorithm is `O(n log n + n)`, i.e. `O(n log n)`.

Let’s look at a faster solution with lower time complexity.

#### Method 2: Using Hashing

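This method uses a hash table of size n, with one entry for each possible value from 0 to n-1. Each entry is either 0 (value not seen yet) or 1 (value already seen). Because the values themselves can serve as indices, a plain list works as the table.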

Algorithm:

• Take a hash table of size n (say `hashIndex`) and initialize each value in it to zero.
• Traverse over each element in the array.
• For each element `i` in the array:
  • if `hashIndex[i] == 0`, set `hashIndex[i] = 1`;
  • otherwise (`hashIndex[i] == 1`), the element has been seen before, so it is a duplicate.

Let’s implement this logic in code.

Python Program:

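```python
def findDuplicate(arr):
    liDuplicate = []
    # One slot per possible value 0..n-1; 0 means "not seen yet".
    hashIndex = [0] * len(arr)
    for i in arr:
        if hashIndex[i] == 0:
            # First time we see this value: mark it.
            hashIndex[i] = 1
        elif hashIndex[i] == 1:
            # Already marked, so this value is a duplicate.
            liDuplicate.append(i)

    return liDuplicate

arr = [4, 5, 2, 1, 4, 6, 6]
print(findDuplicate(arr))
```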

Output:

`[4, 6]`

Complexity:

Here we are using the hashing technique. `hashIndex` acts as a simple hash table where the key is an element from the input array and the value is 0 or 1.

Each element in the array is visited exactly once, and each table lookup or update takes constant time, so the time complexity of this algorithm is `O(n)`. The table itself needs `O(n)` extra space.
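The list-based table above relies on every value lying between 0 and n-1. As a hedged sketch of how the same idea might look with a general-purpose hash table, the snippet below uses `collections.Counter` from the Python standard library. The function name `find_duplicates_counter` is hypothetical and not part of the original solution. Unlike `findDuplicate`, it reports each duplicated value only once.

```python
from collections import Counter


def find_duplicates_counter(items):
    """Return each value that occurs more than once (hypothetical helper)."""
    # Counter is a dict subclass, so counting takes O(n) time on average.
    counts = Counter(items)
    # Keys keep their first-seen order (dicts are ordered in Python 3.7+).
    return [value for value, count in counts.items() if count > 1]


print(find_duplicates_counter([4, 5, 2, 1, 4, 6, 6]))  # [4, 6]
print(find_duplicates_counter([10, -3, 10, 10, 99]))   # [10]
```

The trade-off is the same as before: linear time on average in exchange for `O(n)` extra space, but the values no longer have to be valid list indices.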

This question about finding duplicates in an array was asked in an NVIDIA interview coding round. You can solve this problem in any programming language, such as Python, C/C++ or Java.

#### FAQ (MCQ) question:

Vijay is given a problem to solve. He is given an array of names and asked to find duplicates among the names. Vijay builds a hash table with all the names and uses it to find duplicates. Which of the following statements are true?

(Unless otherwise stated, assume that the hash function and hash table are working well and doing a good job.) Pick all that apply:

1. The program will run in `O(n)` time in most cases.
2. The program will run in `O(n log n)` time in most cases.
3. The program will run in `O(n^2)` time.
4. If the hash function is doing a bad job and there are lots of collisions, the program will run in `O(n log n)` time.
5. If the hash function is doing a bad job and there are lots of collisions, the program will run in `O(n^2)` time.
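
One way to get a feel for these options is to try the approach on a small input. The following is a minimal sketch, not Vijay's actual program: `find_duplicate_names` and the `CollidingName` wrapper are hypothetical names introduced here, and Python's built-in `set` stands in for the hash table. `CollidingName` deliberately returns the same hash for every key, so the effect of heavy collisions can be observed by counting equality comparisons.

```python
class CollidingName:
    """Hypothetical wrapper whose hash is constant, forcing every key to collide."""

    eq_calls = 0  # counts equality comparisons made by the hash table

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 0  # deliberately bad hash function: all keys land together

    def __eq__(self, other):
        CollidingName.eq_calls += 1
        return self.name == other.name


def find_duplicate_names(names):
    """Return names seen more than once, using a set as the hash table."""
    seen = set()
    duplicates = []
    for name in names:
        if name in seen:  # average O(1) with a good hash function
            duplicates.append(name)
        else:
            seen.add(name)
    return duplicates


# Normal strings use Python's built-in string hash.
print(find_duplicate_names(["asha", "ravi", "asha", "meena", "ravi"]))
# ['asha', 'ravi']

# Distinct names that all collide: count the comparisons for two sizes.
for n in (100, 200):
    CollidingName.eq_calls = 0
    find_duplicate_names([CollidingName(str(i)) for i in range(n)])
    print(n, "names ->", CollidingName.eq_calls, "comparisons")
```

With a well-behaved hash function, each membership test takes constant time on average, so the loop is linear overall. When every key collides, each lookup has to be compared against the keys already stored, so the comparison count grows roughly with the square of the input size.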