Arrays are a fundamental way of organizing data in programming that provides a systematic way to store and organize elements of the same data type. What if there are duplicate elements in an array? Without considering duplicates, storing an array of (n) elements requires (O(n)) space, accounting for individual element storage. Do you know that accessing an element in an unsorted array takes (O(n)) time due to full-array scans? However, removing duplicates from a sorted array improves efficiency: the space complexity remains (O(1)) as the operation is performed in place and the time complexity reduces to (O(n)) by iteratively removing duplicates in a single pass.
Don’t worry, as we’ll be discussing the top ways to remove duplicate elements from the array in this blog.
Explore the world of C programming with this captivating YouTube video—your gateway to a comprehensive learning experience!
What is an Array?
An array is a basic data structure in computer science that stores elements with the same data type, which means if we want 5 elements in an array, we should all have them as integers, characters, booleans, etc. They provide a sequential memory layout, meaning elements are stored one after the other in contiguous memory locations.
Arrays allow you to quickly access any item using its position, called an index. The index starts at 0 for the first element, 1 for the second, and so on. The ability to access elements by index helps for easy access to elements from the data.
7 Ways to Remove Duplicate Elements from Array
Removing duplicates from an array ensures each element is unique, preventing inaccurate results in algorithms and applications. Let’s explore five different ways to remove duplicate elements from an array.
Using Temporary Array
One common method to remove duplicate elements from an array is by utilizing a temporary array. This approach involves iterating through the original array and copying elements to a new array only if they haven’t been encountered before. By doing so, we create a new array that contains only unique elements.
C Programming
#include <stdio.h>
void removeDuplicates(int arr[], int *size) {
// Check if the array is empty or has only one element
if (*size <= 1) {
return;
}
// Temporary array to store unique elements
int temp[*size];
int j = 0;
// Traverse the array and store unique elements in the temporary array
for (int i = 0; i < *size - 1; i++) {
if (arr[i] != arr[i + 1]) {
temp[j++] = arr[I];
}
}
// Store the last element of the original array
temp[j++] = arr[*size - 1];
// Copy elements from the temporary array back to the original array
for (int i = 0; i < j; i++) {
arr[i] = temp[I];
}
// Update the size of the array after removing duplicates
*size = j;
}
int main() {
int arr[] = {1, 2, 2, 3, 4, 4, 5};
int size = sizeof(arr) / sizeof(arr[0]);
// Display original array
printf("Original Array: ");
for (int i = 0; i < size; i++) {
printf("%d ", arr[I]);
}
// Remove duplicates
removeDuplicates(arr, &size);
// Display array after removing duplicates
printf("\nArray after removing duplicates: ");
for (int i = 0; i < size; i++) {
printf("%d ", arr[I]);
}
return 0;
}
Output:
Original Array: 1 2 2 3 4 4 5
Array after removing duplicates: 1 2 3 4 5
Python Programming
def remove_duplicates_temp_array(arr):
temp_array = []
for item in arr:
if item not in temp_array:
temp_array.append(item)
return temp_array
#Example usage
arr = [1, 2, 2, 3, 4, 4, 5]
print(f"Original Array: {arr}")
print(f"Array after removing duplicates: {remove_duplicates_temp_array(arr)}")
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Java Programming
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class RemoveDuplicatesTempArray {
public static List removeDuplicates(int[] arr) {
List tempArray = new ArrayList<>();
for (int item : arr) {
if (!tempArray.contains(item)) {
tempArray.add(item);
}
}
return tempArray;
}
public static void main(String[] args) {
int[] arr = {1, 2, 2, 3, 4, 4, 5};
System.out.println("Original Array: " + Arrays.toString(arr));
System.out.println("Array after removing duplicates: " + removeDuplicates(arr));
}
}
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Another way to solve the problem is by using more space, but instead of creating a temporary array, we use something called a “set.” A set is like a special box that helps us remember which elements are unique. This is handy when the list we’re working with might have items that are not easy to put in order.
C Programming
#include <stdio.h>
void removeDuplicates(int arr[], int *n) {
if (*n <= 1) {
// Array with 0 or 1 element has no duplicates
return;
}
// Create a temporary array to store unique elements
int temp[*n];
// Initialize the first element in the temporary array
temp[0] = arr[0];
// Initialize variables to keep track of the unique element count
int j = 1;
// Loop through the original array to check for duplicates
for (int i = 1; i < *n; i++) {
int duplicate = 0;
// Check if the current element is a duplicate
for (int k = 0; k < j; k++) {
if (arr[i] == temp[k]) {
duplicate = 1;
break;
}
}
// If the current element is not a duplicate, add it to the temporary array
if (!duplicate) {
temp[j++] = arr[i];
}
}
// Update the original size after removing duplicates
*n = j;
// Copy the unique elements back to the original array
for (int i = 0; i < j; i++) {
arr[i] = temp[i];
}
}
int main() {
int arr[] = {1, 2, 3, 4, 2, 5, 6, 1, 7};
int n = sizeof(arr) / sizeof(arr[0]);
printf("Original Array: ");
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
removeDuplicates(arr, &n);
printf("\nArray after removing duplicates: ");
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
return 0;
}
Output:
Original Array: 1 2 3 4 2 5 6 1 7
Array after removing duplicates: 1 2 3 4 5 6 7
Python Programming
def remove_duplicates_extra_space(arr):
return list(set(arr))
#Example Usage
arr = [1, 2, 2, 3, 4, 4, 5]
print(f"Original Array: {arr}")
print(f"Array after removing duplicates: {remove_duplicates_extra_space(arr)}")
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Java Programming
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
public class RemoveDuplicatesExtraSpace {
public static List removeDuplicates(int[] arr) {
Set set = new HashSet<>();
for (int item : arr) {
set.add(item);
}
return set.stream().collect(Collectors.toList());
}
public static void main(String[] args) {
int[] arr = {1, 2, 2, 3, 4, 4, 5};
List<Integer> uniqueList = removeDuplicates(arr);
System.out.println("Original Array: " + Arrays.toString(arr));
System.out.println("Array after removing duplicates: " + uniqueList);
}
}
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
While it can be difficult, it is sometimes useful to remove duplicates without consuming additional space. We can compare each element of the array with all subsequent elements to achieve this.
C Programming
#include <stdio.h>
// Function to remove duplicates from an array in-place
int removeDuplicates(int arr[], int n) {
if (n == 0 || n == 1)
return n;
// Initialize index to keep track of the non-duplicate elements
int index = 0;
// Traverse the array
for (int i = 0; i < n - 1; i++) {
// If the current element is not equal to the next element, store it at the next index
if (arr[i] != arr[i + 1]) {
arr[index++] = arr[i];
}
}
// Store the last element as it will always be part of the result
arr[index++] = arr[n - 1];
// Return the new length of the array without duplicates
return index;
}
int main() {
int arr[] = {1, 2, 2, 3, 4, 4, 5};
int n = sizeof(arr) / sizeof(arr[0]);
printf("Original Array: ");
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
n = removeDuplicates(arr, n);
printf("\nArray after removing duplicates: ");
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
return 0;
}
Output:
Original Array: 1 2 2 3 4 4 5
Array after removing duplicates: 1 2 3 4 5
Python Programming
def remove_duplicates_in_place(arr):
if not arr:
return arr
arr.sort()
write_index = 1
for read_index in range(1, len(arr)):
if arr[read_index] != arr[read_index - 1]:
arr[write_index] = arr[read_index]
write_index += 1
return arr[:write_index]
Example usage
arr = [1, 2, 2, 3, 4, 4, 5]
print(f"Original Array: {arr}")
print(f“Array after removing duplicates: {remove_duplicates_in_place(arr)}")
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Java Programming
import java.util.Arrays;
public class RemoveDuplicatesInPlace {
public static int[] removeDuplicates(int[] arr) {
if (arr.length == 0) {
return arr;
}
Arrays.sort(arr);
int writeIndex = 1;
for (int readIndex = 1; readIndex < arr.length; readIndex++) {
if (arr[readIndex] != arr[readIndex - 1]) {
arr[writeIndex++] = arr[readIndex];
}
}
return Arrays.copyOf(arr, writeIndex);
}
public static void main(String[] args) {
int[] arr = {1, 2, 2, 3, 4, 4, 5};
System.out.println("Original Array: " + Arrays.toString(arr));
System.out.println("Array after removing duplicates: " + Arrays.toString (removeDuplicates(arr)));
}
}
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Using Pointers
In the following code, we’ve used pointers to access array elements and modify them. The remove duplicates function takes a pointer to the array and a pointer to its size (n). The key role of pointers is to navigate through the array, access its elements, and modify them during the process of removing duplicates. Here’s how pointers are utilized:
C Programming
#include <stdio.h>
// Function to remove duplicate elements from an array
void removeDuplicates(int *arr, int *n);
int main() {
// Example array with duplicates
int arr[] = {1, 2, 3, 4, 2, 5, 6, 1, 7, 8, 9, 9, 6};
int n = sizeof(arr) / sizeof(arr[0]);
// Display the original array
printf("Original Array: ");
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
// Call the function to remove duplicates
removeDuplicates(arr, &n);
// Display the array after removing duplicates
printf("\nArray after removing duplicates: ");
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
return 0;
}
// Function to remove duplicate elements from an array
void removeDuplicates(int *arr, int *n) {
// Iterate through each element in the array
for (int i = 0; i < *n; i++) {
// Compare the current element with all subsequent elements
for (int j = i + 1; j < *n;) {
// If a duplicate is found
if (arr[i] == arr[j]) {
// Shift elements to remove the duplicate
for (int k = j; k < *n - 1; k++) {
arr[k] = arr[k + 1];
}
(*n)--; // Decrease the size of the array
} else {
j++; // Move to the next element
}
}
}
}
Output:
Original Array: 1 2 3 4 2 5 6 1 7 8 9 9 6
Array after removing duplicates: 1 2 3 4 5 6 7 8 9
Python Programming
def remove_duplicates_pointers(arr):
if not arr:
return arr
write_index = 1
for read_index in range(1, len(arr)):
if arr[read_index] != arr[write_index - 1]:
arr[write_index] = arr[read_index]
write_index += 1
return arr[:write_index]
arr = [1, 2, 2, 3, 4, 4, 5]
print(f"Original Array: {arr}")
print(f"Array after removing duplicates: {remove_duplicates_pointers(arr)}")
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Java Programming
import java.util.Arrays;
public class RemoveDuplicatesPointers {
public static int[] removeDuplicates(int[] arr) {
if (arr.length == 0) {
return arr;
}
int writeIndex = 1;
for (int readIndex = 1; readIndex < arr.length; readIndex++) {
if (arr[readIndex] != arr[writeIndex - 1]) {
arr[writeIndex++] = arr[readIndex];
}
}
return Arrays.copyOf(arr, writeIndex);
}
public static void main(String[] args) {
int[] arr = {1, 2, 2, 3, 4, 4, 5};
System.out.println("Original Array: " + Arrays.toString(arr));
System.out.println("Array after removing duplicates: " + Arrays.toString (removeDuplicates(arr)));
}
}
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Using HashMaps
A more efficient approach for unsorted arrays involves using a HashMap (or dictionary in Python) to keep track of the frequency of each element. A HashMap is like a magic box that lets you quickly find things (values) by their special labels (keys). It’s like a super-fast dictionary for computer programs. This approach has a time complexity of O(n).
C Programming
#include <stdio.h>
#include <stdlib.h>
#define MAX_SIZE 100
// Function to remove duplicates from an array using a hash map
void removeDuplicates(int arr[], int *size) {
int hash[MAX_SIZE] = {0}; // Initialize hash map
int newSize = 0; // Variable to store the size of the a
ray after removing duplicates
for (int i = 0; i < *size; i++) {
if (hash[arr[i]] == 0) {
// If the element is not in the hash map, add it to the hash map and update the array
hash[arr[i]] = 1;
arr[newSize] = arr[I];
newSize++;
}
}
*size = newSize; // Update the size of the array
}
// Function to display the elements of an array
void displayArray(int arr[], int size) {
for (int i = 0; i < size; i++) {
printf("%d ", arr[I]);
}
printf("\n");
}
int main() {
int arr[MAX_SIZE];
int size;
// Input size of the array
printf("Enter the size of the array: ");
scanf("%d", &size);
// Input array elements
printf("Enter the elements of the array:\n");
for (int i = 0; i < size; i++) {
scanf("%d", &arr[I]);
}
// Remove duplicates using hash map
removeDuplicates(arr, &size);
// Display the array after removing duplicates
printf("Array after removing duplicates: ");
displayArray(arr, size);
return 0;
}
Output:
Enter the size of the array: 6
Enter the elements of the array:
5 4 5 1 2 1
Array after removing duplicates: 5 4 1 2
Python Programming
def remove_duplicates_hashmap(arr):
return list(dict.fromkeys(arr))
Example usage
arr = [1, 2, 2, 3, 4, 4, 5]
print(f”Original Array: {arr}”)
print(f”Array after removing duplicates: {remove_duplicates_hashmap(arr)}”)
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Java Programming
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Arrays;
public class RemoveDuplicatesHashMap {
public static List removeDuplicates(int[] arr) {
Map map = new HashMap<>();
for (int item : arr) {
map.put(item, true);
}
return new ArrayList<>(map.keySet());
}
public static void main(String[] args) {
int[] arr = {1, 2, 2, 3, 4, 4, 5};
System.out.println("Original Array: " + Arrays.toString(arr));
System.out.println("Array after removing duplicates: " + removeDuplicates(arr));
}
}
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Using List Comprehension with Set
Python Programming
This approach is to utilize the set to keep track of what has already been seen and the list comprehension for building the result list. The returned value of set.add is None, thus placing it inside the list comprehension will guarantee that each element only gets added once, hence preserving its original order.
def remove_duplicates_list_comprehension(arr):
seen = set()
return [x for x in arr if x not in seen and not seen.add(x)]
Example usage
arr = [1, 2, 2, 3, 4, 4, 5]
print(f"Original Array: {arr}")
print(f"Array after removing duplicates: {remove_duplicates_list_comprehension(arr)}")
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Java Programming
The algorithm is implemented by creating an empty HashSet; when going through every integer in nums[], add those integers that have not been added before into such a HashSet.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Arrays; // Import Arrays for array printing
public class RemoveDuplicatesListComprehension {
public static List removeDuplicates(int[] arr) {
Set seen = new HashSet<>();
List result = new ArrayList<>();
for (int x : arr) {
if (seen.add(x)) {
result.add(x);
}
}
return result;
}
public static void main(String[] args) {
int[] arr = {1, 2, 2, 3, 4, 4, 5};
System.out.println("Original Array: " + Arrays.toString(arr));
System.out.println("Array after removing duplicates: " + removeDuplicates(arr));
}
}
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Python Programming
This technique employs itertools.groupby to group elements in an array. First, groups are generated by sorting the array where groupby groups of consecutive identical elements together. Choosing the first element from each group guarantees only uniques are left behind.
from itertools import groupby def remove_duplicates_groupby(arr):
arr.sort()
return [key for key, _ in groupby(arr)]
Example usage
arr = [1, 2, 2, 3, 4, 4, 5]
print(f"Original Array: {arr}")
print(f"Array after removing duplicates: {remove_duplicates_groupby(arr)}")
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Java Programming
This method groups the array elements after sorting them and picks the first element of each group.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class RemoveDuplicatesGroupby {
public static List removeDuplicates(int[] arr) {
Arrays.sort(arr);
List result = new ArrayList<>();
for (int i = 0; i < arr.length; i++) {
if (i == 0 || arr[i] != arr[i - 1]) {
result.add(arr[i]);
}
}
return result;
}
public static void main(String[] args) {
int[] arr = {1, 2, 2, 3, 4, 4, 5};
System.out.println("Original Array: " + Arrays.toString(arr));
System.out.println("Array after removing duplicates: " + removeDuplicates(arr));
}
}
Output:
Original Array: [1, 2, 2, 3, 4, 4, 5]
Array after removing duplicates: [1, 2, 3, 4, 5]
Concluding Thoughts
Choosing the right method to remove duplicates from an array depends on various factors, such as whether the array is sorted, the allowed time complexity, and the available memory. Each approach has its trade-offs, and understanding the characteristics of the input data can help in selecting the most appropriate solution. For example, for a sorted array, using pointers with less time complexity may be advantageous. Using extra space like a set works well for unsorted arrays, preventing duplicate elements efficiently. HashMaps offers a versatile solution for diverse scenarios.
FAQs
Why is removing duplicates important?
Removing duplicates is crucial for various reasons, including efficient storage utilization, improving algorithm efficiency, and ensuring accurate data analysis. Duplicate elements can lead to incorrect results in computations and may waste valuable memory space.
Are there other methods to remove duplicates from an array?
Yes, there are several other methods, such as sorting the array and then removing duplicates or using advanced data structures like Bloom filters. The choice depends on the specific requirements and constraints of the problem at hand.
Can these methods be applied to multidimensional arrays?
Yes, the basic principles can be applied to multidimensional arrays. However, the implementation details may vary based on the programming language and the specific characteristics of the multidimensional array.
How do these methods perform on large datasets?
The performance of these methods depends on factors like the size of the dataset, the chosen algorithm, and the underlying hardware. Generally, methods that utilize extra space (such as HashMaps) tend to have better time complexity compared to methods without extra space, especially for large datasets.