Bucket Sort Algorithm

Bucket-Sort-Algorithm-feature.jpg

Bucket Sort is a powerful distribution-based sorting algorithm that works by dividing the input into several groups or “buckets”, sorting each bucket individually, and then combining the results to form a fully sorted array. In this comprehensive guide, we will explore Bucket Sort from the ground up. You’ll learn the step-by-step working of the algorithm, along with its pseudocode, flowchart, and detailed explanation. To help you implement it in practice, we’ll also provide Bucket Sort code examples in Python, Java, JavaScript, and C++, the most popular languages for data structures and algorithms.

Table of Contents:

What is Bucket Sort?

Bucket Sort distributes elements into buckets based on their value range, sorts each bucket, and then merges them to produce the sorted result. It is a sorting algorithm that works by dividing the input elements into several groups called buckets, then sorting each bucket individually (often using another sorting algorithm like Insertion Sort), and finally combining all the sorted buckets to get the final sorted array.

Key Characteristics of Bucket Sort:

  • Requires Extra Space: It is not an in-place algorithm, i.e., it uses O(n + k) extra space to store the buckets and their contents.
  • Simple and Intuitive: It follows a straightforward approach; it divides the data into buckets, sorts each bucket, and then merges them. It’s easy to implement, especially when the input data is within a known and limited range.
  • Not Comparison-Based: It does not rely on comparisons between elements; it distributes elements into buckets based on their value range (like hashing) and making it a non-comparison-based sorting algorithm.
  • Stable: It can be stable if the algorithm used to sort individual buckets is stable.
  • Linear Time Complexity in Best Case: It achieves O(n + k) time complexity in the best case (n is the number of elements and k is the number of buckets). This is possible when the elements are uniformly distributed, each bucket contains a small number of elements, and a fast sorting method (like insertion sort) is used on each bucket.

When and Why to Use Bucket Sort?

This can be used when, 

  • The input is uniformly distributed over a known range
  • You need to sort real numbers (like decimals) efficiently.
  • The dataset is not too large, and you cannot use extra memory for buckets.

The bucket  sort is used because it is, 

  • Faster than comparison-based sorts (like QuickSort) in the best-case scenario.
  • Stable, if the internal sort used is stable.
  • Handles non-integer values, unlike counting sort or radix sort.

It is useful where the data is uniformly distributed or can be naturally grouped, and can be used when sorting floating-point numbers, distributing student scores, and so on.

Get Interview-Ready with In-Depth Data Structures Training!
Join Now and level up your problem-solving skills!
quiz-icon

Understanding the Bucket Sort Algorithm

Now, let us understand the working of the Bucket Sort Algorithm.

Bucket Sort Pseudocode

Below is the Pseudocode of the bucket sort algorithm.

 BUCKET_SORT(A)
max_val ← maximum value in a
k ← number of buckets (e.g., k = n)

create k empty buckets: bucket[0...k-1]

for i = 0 to n - 1 do:
bi ← floor(a[i] * k / (max_val + 1))
insert a[i] into bucket[bi]

for j = 0 to k - 1 do:
if bucket[j] is not empty:
sort bucket[j]

sorted_array_index ← 0

for j = 0 to k - 1 do:
for each element e in bucket[j] do:
a[sorted_array_index] ← e
sorted_array_index ← sorted_array_index + 1
return a
   

Working Principle

Let us take the above example of an array [44, 7, 22, 2, 27, 38, 32, 16]. Using this example, we will understand the step-by-step working of Bucket Sort. 

Step 1: Define the Number of Buckets

First, you have to choose the number of buckets, k. This is based on the size and distribution of the input data, that is, the array. We take a look at the max element to decide. In the above example, the maximum element is 44. Therefore, we can divide the range into 5 buckets, each covering a range of 10 units.

If the value of k is already given, calculate the range of each bucket. 

Step 2: Scatter the elements in their respective buckets

When you are not given the value of k, you can scatter the element in two ways. Either you decide the element’s place using the range, or you mathematically calculate it using a formula.

1. Using a mathematical formula

We will calculate the bucket each should be assigned to using the formula

Formula-to-calculate-bucket-index.

2. Using the range

We assign a range for the bucket and assign an element to the bucket if it lies in that range. In this example, the range assigned to each bucket is:

Bucket 0: 0-10
Bucket 1: 10-20
Bucket 2: 20-30
Bucket 3: 30-40
Bucket 4: 40-50

Now assigning elements to their respective buckets, we have: 

Unsorted Buckets

When the value of k is given, you will have to calculate the range for each bucket. Once you have that, you can assign each element according to which range it lies in. The range of the bucket can be calculated using the following formula.

Formula to calculate range

In our example, the range of each bucket is 

Range = (44 – 2) / 5 = 8.4 ≈ 8

Bucket 0: 0-8
Bucket 1: 9-16
Bucket 2: 17-24
Bucket 3: 25-32
Bucket 4: 33-40

Step 3: Sort individual Buckets

Now you have to sort each bucket individually. The buckets will look like this after sorting:

Sorted Buclets

Step 4: Gather all the elements

This is the final step. Once you have sorted all the buckets individually, gather them together to get the final sorted array. 

Final Sorted Array

Flowchart of Bucket Sort Algorithm 

Below is the flowchart of the bucket sort algorithm.

Bucket Sort Algorithm Flowchart

Bucket Sort Code Implementations

In this section, we will look at the implementation of the Bucket Sort algorithm in different programming languages like Java, C++, Python, and JavaScript

Remember, to sort each bucket, you can use any sorting algorithm like quicksort, merge sort, or any other algorithm of your choice.

Bucket Sort Algorithm in C++

Code:

Cpp

Output: 

Output - Implementation is CPP

Explanation: The array is split into buckets using value range and a bucket size of 10. Each bucket is sorted using std::sort(), which typically uses Introsort (a hybrid of QuickSort, HeapSort, and Insertion Sort).

Get 100% Hike!

Master Most in Demand Skills Now!

Bucket Sort Algorithm in Java

Code: 

Java

Output:

Output - Implementation is Java

Explanation: The implementation divides the input into buckets based on their value range using a bucket size of 10. Each bucket is sorted using Java’s Collections.sort(), which uses TimSort

Bucket Sort Algorithm in Python

Code: 

Python

Output:

Output - Implementation is Python

Explanation: The elements are grouped into buckets based on the range of values and bucket size. Each bucket is sorted using Python’s built-in sorted(), which implements TimSort.

Bucket Sort Algorithm in JavaScript

Code: 

Javascript

Output:

Output - Implementation is Javascript

Explanation: The elements are distributed into buckets by range, using a bucket size of 10. Each bucket is sorted using Array.prototype.sort() with a comparator, which in V8 (used by Chrome/Node.js) is TimSort.

Time and Space complexity of Bucket Sort

In this section, we will analyze the bucket sort time complexity.

Time Complexity

The time complexity of Bucket Sort depends on three main steps:

  1. Distributing elements into buckets
  2. Sorting individual buckets
  3. Concatenating all sorted buckets

Let,

  • n is the total number of input elements
  • k is the number of buckets
  • m is the average number of elements per bucket

1. Best Case: O(n + k)

The best case occurs when the elements are evenly distributed across buckets and each bucket contains only a few elements. The buckets are sorted using a fast algorithm like insertion sort, which performs well on small or nearly sorted data. Bucket distribution takes O(n) time and sorting each small bucket takes O(1) on average, resulting in O(k) for all, then merging buckets takes O(n), hence, 
Total: O(n + k)

2. Average Case: O(n + n²/k + k)

In the average case, we are assuming uniform distribution and use of insertion sort, i.e., let bucket distribution be O(n), sorting each bucket with n/k elements is O(k × (n/k)²) = O(n² / k)
And concatenating all buckets is O(k), hence,
Total: O(n + n²/k + k) = O(n+k)
This becomes O(n) when the number of buckets is proportional to the number of elements.

3. Worst Case: O(n²)

The worst case occurs when all the elements fall into a single bucket, and a quadratic sorting algorithm like insertion sort is used internally. In this case, one bucket contains all elements; hence, sorting that one bucket becomes O(n²), 

Total: O(n²)

Case Time Complexity
Best Case O(1)
Average Case O(log n)
Worst Case O(n)

Space Complexity

The space used for the input array is (O(n)), and the extra space used for k buckets, each holding up to n elements in the worst case, hence, 

Total Space Complexity is O(n + k)

What Affects the Performance of Bucket Sort?

The performance of the Bucket Sort is not fixed; it heavily depends on how the algorithm is applied and the nature of the input. Below are the key factors that affect its efficiency:

  • Internal Sorting Algorithm: It uses another sorting algorithm to sort the elements within each bucket; hence, the overall efficiency depends on this choice. If some buckets are large (due to uneven distribution), insertion sort’s O(n²) performance becomes a bottleneck. In such cases, using Merge Sort or QuickSort inside each bucket may provide better performance, especially for buckets with many elements.
  • Input Data Distribution: The most important factor influencing the performance is the distribution of input data. It assumes that the data is uniformly distributed across a range; if it uses this assumption, then elements are evenly spread among the buckets, keeping each bucket small and easy to sort. However, if the data is skewed (i.e., many elements fall into a small range), some buckets will become overloaded while others remain mostly empty. This results in unbalanced buckets and can degrade performance significantly, even leading to a worst-case time complexity of O(n²) if a large bucket is sorted with a slow algorithm like insertion sort.
  • Number of Buckets (k): If fewer buckets are used, they will contain many elements, and sorting them becomes more expensive, and if too many buckets are used, this will increase memory usage and add overhead, especially if many buckets remain empty.
  • Range and Type of Input Values: This sorting technique works best when the range of values is known and limited, especially for floating-point numbers between 0 and 1. The algorithm uses the value itself to decide the appropriate bucket. If the range is too wide or not known in advance, it becomes difficult to determine bucket boundaries. This can result in poor bucket assignments or require complex logic, hurting both simplicity and performance.

 Use Cases of Bucket Sort

  • Sorting Floating Point Numbers: Useful for numbers with the range [0 to 1]. In these cases, Bucket Sort Algorithm splits them into intervals and sorts the input efficiently.
  • Sorting Integers with a Known Range: Works well if you are aware of the range, like when you are sorting students’ scores on exams ranging from 0 to 1000.
  • When the data is Uniformly Distributed: When the data is uniformly distributed, the Bucket Sort Algorithm works better because it will fill up the buckets uniformly.
  • If you know how the data is distributed (uniform, Gaussian, etc.), you can use the Bucket Sort algorithm for improved performance.
  • As a technique in Radix Sort: Bucket Sort is sometimes used as a step in Radix Sort, where the numbers are grouped by their digits.
  • When you have to maintain the order: A stable sort may be necessary to maintain the relative order of equal elements, such as sorting students by marks and names. Bucket Sort can be a stable algorithm if the individual buckets are sorted using a stable sorting algorithm, such as insertion sort.

Comparison with Other Sorting Algorithms

Below is a comparison of the bucket sort with Counting, Radix, and Quick sort.

Feature Bucket Sort Counting Sort Radix Sort QuickSort
Time Complexity (Best) O(n + k); when input is uniformly distributed O(n + k); when the input elements are within a small, known range. O(n × d); when the number of digits (d) is small, and the input size (n) is large. O(n log n); it performs well on most random data, especially when balanced partitions occur.
Space Complexity Uses extra space for buckets and the final sorted list. Needs extra space. Needs space for buckets per digit pass. Needs space for recursive stack calls only.
Stability Maintains the order of equal elements. Maintains the original order of duplicates. Maintains the original order if a stable sort is used at each pass Doesn’t guarantee the relative order of equal items.
In-place Needs extra buckets or lists. Needs an extra array. Needs extra space for digit-wise buckets. Performs sorting within the input array.
Handles Decimals Yes, it can be adapted to float/decimal values by customizing bucket logic. No, it only works with non-negative integers. No, it works only with integers. Yes, it can handle any comparable data type (integers, floats, strings).

Now, let us discuss the bucket sort advantages and disadvantages.

Advantages of Bucket Sort

Below are the advantages of the bucket sort. 

  • Linear Time in Best Case: When the data is uniformly distributed, Bucket Sort can achieve O(n + k) time complexity.
  • Efficient for Floating-Point Numbers: It works well for sorting real numbers with a fixed range like [0, 1).
  • Stable Sorting: It can be made stable if the sorting algorithm used inside the buckets is stable (e.g., Insertion Sort).
  • Simple and Intuitive: It is easy to understand and implement by using the divide, sort, and merge approach.
  • Good for Known Data Ranges: It is very effective when the range of values and distribution is known.

Disadvantages of Bucket Sort

Below are the disadvantages of the bucket sort. 

  • Not Suitable for All Distributions: The performance drops if the data is not uniformly distributed.
  • Requires Extra Space: It uses O(n + k) space to store buckets; hence, not an in-place algorithm.
  • Limited Use Cases: It is best suited for specific types of data (e.g., float values or known ranges), and not general-purpose.
  • Poor Worst-Case Time: If all elements are present in one bucket, sorting takes O(n²) time.
  • Sensitive to Input Range and Scaling: If the input values are not spread over a very wide or uneven range, it becomes difficult to find an appropriate bucket boundary. For example, if your data contains both very small (e.g., 0.001) and very large values (e.g., 1,000,000), then most values may fall into one or two buckets, which will lead to imbalanced buckets and low performance.

When does the bucket sort excel? 

It excels when the data is uniformly distributed across a known range, sorting floating-point numbers, and when the range of values is small and known (e.g., student scores between 0 and 1000).

When does the bucket sort fail? 

It does not work well when the range of input values is unknown or very large, which makes the bucket assignment difficult or inefficient.

Variations of Bucket Sort

Below are the variations of Bucket Sort.

1. Adaptive Bucket Sort

Adaptive Bucket Sort is a variation of the standard Bucket Sort that dynamically adjusts the number and range of buckets based on the input characteristics, i.e., its distribution and size. It aims to improve performance and resource usage rather than relying on fixed bucket parameters. In this sorting technique, the buckets are created based on input value range, spread, or clustering instead of being fixed in size or number, and it uses heuristics (e.g., variance, skewness) to allocate bucket ranges more efficiently.

2. Flash Sort

Flash Sort is a high-performance variation of Bucket Sort that works especially well with uniformly distributed data. Instead of simply dividing elements into buckets, it uses a classification step to estimate where elements should go and then permutes them into their correct regions. After this phase, a sorting algorithm like Insertion Sort is applied to finalize the ordering within those regions. Flash Sort is known for its speed and efficiency in specific scenarios, particularly when dealing with large datasets with predictable distribution.

3. Histogram Sort

Histogram Sort is a variant of Bucket Sort that focuses on counting the number of elements falling into each bucket rather than storing the elements themselves. It builds a histogram by scanning the input and recording frequency counts, then uses that information to determine where each element belongs in the final sorted array. This method is memory-efficient and well-suited for situations where the actual content of the buckets is less important than their distribution or range positioning.

4. Postman’s Sort

Postman’s Sort is a distribution-based sorting algorithm that works similarly to how a postman delivers letters by distributing them into mailboxes (buckets) based on postal codes. Like Bucket Sort, it involves dividing the input into several categories (buckets) and then sorting each category individually. However, Postman’s Sort often emphasizes multi-pass distribution, where elements are distributed based on more than one key (like radix sort). It’s particularly useful when sorting large datasets with a known key structure and works efficiently when combined with other sorting strategies for sorting within buckets.

5. Proxmap Sort

Proxmap Sort is a variant of Bucket Sort that uses a proximity map to place elements directly into their correct or near-correct positions in an auxiliary array. It improves on bucket sort by avoiding multiple passes and reducing the need for sorting within buckets. It uses a hash-like function to map each key to an index based on its value, and stores the start positions for each key range in the final sorted array.

Why is Bucket Sort not Used Frequently in Interviews?

This sorting technique is rarely asked in interviews because it has a very targeted use case, i.e., it works best when the input data is uniformly distributed, especially with floating-point numbers, which isn’t common in most problems. Unlike comparison-based sorting algorithms like QuickSort or MergeSort, which are more versatile and frequently used, Bucket Sort depends on extra memory for buckets and is not an in-place algorithm, making it less space-efficient. It’s also harder to implement in a generic, reusable way since its performance heavily depends on how buckets are initialized and how the data is distributed. Also, it’s not commonly seen in real-world applications or standard libraries, so interviewers tend to focus on more practical and widely applicable algorithms.

Common Mistakes and Debugging Tips

Below are some of the common tips that should be used in the bucket sort algorithm. 

1. Incorrect Bucket Range Selection: One of the mistakes in bucket sort is choosing bucket ranges that do not evenly distribute the elements. This can cause some buckets to become overloaded while others remain underutilized, leading to inefficient sorting. To solve this, analyze the input data range beforehand and ensure buckets cover the full range with uniform width.

2. Using an Inappropriate Sorting Algorithm Inside Buckets: Another common issue is selecting an algorithm to sort individual buckets. For example, using quicksort on small buckets can be overkill and introduce unnecessary overhead. To solve this, use insertion sort for small buckets since it’s efficient for nearly sorted or small datasets, and use more complex algorithms like merge sort if the bucket sizes are expected to be large.

3. Forgetting to Combine Sorted Buckets: Do not sort the buckets in the wrong order, as it can result in incorrect final output. Instead of this, after sorting each bucket, ensure they are merged in the correct sequence to maintain overall ordering.

4. Not Handling Duplicate or Boundary Values Correctly: Improper handling of duplicate values can result in elements being placed in the wrong bucket or lost entirely. Hence, ensure that the bucket mapping function includes boundary conditions clearly (e.g., use <= for upper limits carefully), and test the algorithm with data containing duplicate values and edge cases.

Master the Basics of Data Structures – Free of Cost
Start Learning for Free and build a solid foundation in programming!
quiz-icon

Conclusion

As we discussed in this article, the Bucket Sort algorithm is a fast and efficient sorting algorithm, generally ideal for uniformly distributed data over a known range. The basic idea is to divide the input into multiple buckets, then sort each bucket individually and combine the results. You could use various internal sorting algorithms to sort each bucket, such as Insertion Sort. In comparison with other sorting algorithms, under correct conditions, Bucket Sort can perform better. Now that you have a solid understanding of how Bucket Sort works, implementations in various languages, time and space complexity, and its applications, it is now possible for you to utilize Bucket Sort confidently. 

Useful Resources: 

Bucket Sort Algorithm – FAQs

Q1. What is Bucket Sort in simple terms?

Bucket Sort divides elements into multiple buckets based on value ranges. Each bucket is sorted individually, and the results are combined.

Q2. What is the time complexity of Bucket Sort?

Its average and best-case time complexity is O(n + k), but in the worst case, it can degrade to O(n²) if all elements land in one bucket.

Q3. Is Bucket Sort stable or in-place?

Bucket Sort can be stable if the inner sort is stable. It is not in place since it requires extra space for buckets.

Q4. What is an example of Bucket Sort?

Sorting decimal numbers like [0.25, 0.36, 0.58, 0.41] using Bucket Sort involves placing them into 0–1 buckets, sorting each, and merging them in order.

Q5. Where is Bucket Sort best used?

It works best when data is uniformly distributed over a known range, such as floating-point numbers between 0 and 1.

About the Author

Technical Research Analyst - Full Stack Development

Kislay is a Technical Research Analyst and Full Stack Developer with expertise in crafting Mobile applications from inception to deployment. Proficient in Android development, IOS development, HTML, CSS, JavaScript, React, Angular, MySQL, and MongoDB, he’s committed to enhancing user experiences through intuitive websites and advanced mobile applications.

Full Stack Developer Course Banner