Bucket Sort Algorithm: Use and Comparison
Bucket sort refers to a sorting algorithm that works by distributing elements of an array into several buckets. Every bucket is sorted individually, either using a different sorting algorithm or by applying recursively the bucket sort algorithm.
Bucket sort is a type of comparison sort algorithm that is efficient for certain types of input, such as uniformly distributed data. Let us learn about this algorithm in more detail.
Table of Contents
- What is bucket sort algorithm?
- How does Bucket Sort Algorithm Work?
- Example
- Pseudocode Implementation of Bucket Sort Algorithm
- Comparison of Bucket Sort with other sorting algorithms
- When to use Bucket Sort?
What is bucket sort algorithm?
The basic idea behind bucket sort is to divide an array into several smaller arrays, or buckets, each of which contains a range of elements. For example, in case we have an array of integers between 0 and 100, we might divide the array into 10 buckets, each containing elements between 0 and 9, 10 and 19, and so on. We then sort each bucket individually, using another sorting algorithm or recursively applying bucket sort.
The time complexity of bucket sort is dependent on the sorting algorithm used to sort each bucket. If we use a comparison-based sorting algorithm, such as quicksort or mergesort, then the time complexity of bucket sort is O(n log n), where n is the number of elements in the array. However, if the elements in the array are uniformly distributed and the number of buckets is proportional to the number of elements in the array, then the time complexity of bucket sort can be reduced to O(n).
One of the advantages of bucket sort is its relatively ease to implement and can be used with any type of data that can be sorted. However, bucket sort has a few limitations that make it less suitable for some types of data. For example, suppose the data is not uniformly distributed. In that case, some buckets may contain a large number of elements while others contain very few, leading to inefficiencies in the sorting process. Additionally, if the data contains a large number of duplicates, then bucket sort may require a lot of memory to store all of the duplicate elements in the same bucket.
How does Bucket Sort Algorithm Work?
Bucket sort is a sorting algorithm that works by dividing an array into several smaller arrays or buckets. Each bucket is sorted individually, either using another sorting algorithm or by recursively applying the bucket sort algorithm. Here are the steps of the bucket sort algorithm:
- Initialize an array of empty buckets. The number of buckets is typically chosen based on the size and range of the input data.
- Traverse the input array and distribute the elements into their respective buckets based on a key function that maps each element to its bucket index.
- Sort each bucket individually using a different sorting algorithm or by recursively applying the bucket sort algorithm. The sorting algorithm used for each bucket may depend on the characteristics of the elements in the bucket.
- Concatenate the sorted buckets to obtain the sorted output array.
Best-suited Java courses for you
Learn Java with these high-rated online courses
Example
Here is an example of how to sort an array of integers using bucket sort:
Suppose we have an input array of integers, [6, 3, 1, 8, 9, 2, 4, 7, 5], and we want to sort them using bucket sort. We will use 3 buckets, each of size 3, to store the integers.
- Initialize the buckets. We will create 3 empty buckets: bucket0, bucket1, and bucket2.
- Distribute the elements into their respective buckets based on their values. We will use a key function that maps the integers to their bucket index as follows:
integers 1-3 go in bucket0
integers 4-6 go in bucket1
integers 7-9 go in bucket2
After distribution, the buckets will look like this:
bucket0: [3, 1, 2]
bucket1: [6, 4, 5]
bucket2: [8, 9, 7]
- Sort each bucket individually. We will use insertion sort to sort each bucket in this example. After sorting, the buckets will look like this:
bucket0: [1, 2, 3]
bucket1: [4, 5, 6]
bucket2: [7, 8, 9]
- Concatenate the sorted buckets to obtain the sorted output array. We concatenate the buckets in order: bucket0, bucket1, and bucket2. The sorted output array will be:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Bucket sort can be a very efficient sorting algorithm for certain types of input data, especially when the input data is uniformly distributed over a range. However, it may require more memory and overhead compared to other sorting algorithms when the input data is not uniformly distributed.
Pseudocode Implementation of Bucket Sort Algorithm
Here is a pseudocode implementation of the bucket sort algorithm:
function bucketSort(array, bucketSize) // Determine minimum and maximum values in the array minValue = array[0] maxValue = array[0] for i = 1 to length(array) do if array[i] < minValue then minValue = array[i] else if array[i] > maxValue then maxValue = array[i]
// Determine number of buckets and initialize empty buckets bucketCount = floor((maxValue - minValue) / bucketSize) + 1 buckets = new array of bucketCount empty arrays
// Distribute the elements into the buckets for i = 0 to length(array) - 1 do bucketIndex = floor((array[i] - minValue) / bucketSize) append array[i] to buckets[bucketIndex]
// Sort each bucket and concatenate the sorted buckets sortedArray = new array of length(array) index = 0 for i = 0 to bucketCount - 1 do bucket = sort(buckets[i]) for j = 0 to length(bucket) - 1 do sortedArray[index] = bucket[j] index = index + 1
return sortedArray
In this implementation, the function bucketSort takes an input array and a bucket size as parameters. The function first determines the minimum and maximum values in the array and uses them to determine the number of buckets needed. It then initializes an array of empty buckets and distributes the elements from the input array into the buckets based on their values. After that, the function sorts each bucket individually and concatenates them to obtain the final sorted output array. The sort function used in the implementation can be any sorting algorithm, such as insertion sort or quicksort.
Here is a Java implementation of the bucket sort algorithm:
import java.util.ArrayList;import java.util.Collections;
public class BucketSort { public static void bucketSort(int[] array, int bucketSize) { // Determine minimum and maximum values in the array int minValue = array[0]; int maxValue = array[0]; for (int i = 1; i < array.length; i++) { if (array[i] < minValue) { minValue = array[i]; } else if (array[i] > maxValue) { maxValue = array[i]; } }
// Determine number of buckets and initialize empty buckets int bucketCount = (maxValue - minValue) / bucketSize + 1; ArrayList<arraylist> buckets = new ArrayList<>(bucketCount); for (int i = 0; i < bucketCount; i++) { buckets.add(new ArrayList()); }
// Distribute the elements into the buckets for (int i = 0; i < array.length; i++) { int bucketIndex = (array[i] - minValue) / bucketSize; buckets.get(bucketIndex).add(array[i]); }
// Sort each bucket and concatenate the sorted buckets int index = 0; for (int i = 0; i < bucketCount; i++) { ArrayList bucket = buckets.get(i); Collections.sort(bucket); for (int j = 0; j < bucket.size(); j++) { array[index] = bucket.get(j); index++; } } }
public static void main(String[] args) { int[] array = {6, 3, 1, 8, 9, 2, 4, 7, 5}; int bucketSize = 3; bucketSort(array, bucketSize); for (int i = 0; i < array.length; i++) { System.out.print(array[i] + " "); } }}</arraylist
In this implementation, the bucketSort method takes an input array and a bucket size as parameters. The method first determines the minimum and maximum values in the array and uses them to determine the number of buckets needed. It then initializes an ArrayList of empty buckets and distributes the elements from the input array into the buckets based on their values. After that, the method sorts each bucket individually using the Collections.sort() method and concatenates them to obtain the final sorted output array. The main method creates an input array and a bucket size, calls the bucketSort method to sort the array, and prints the sorted array to the console.
Comparison of Bucket Sort with Other Sorting Algorithms
Here is a comparison of bucket sort with other sorting algorithms in a tabular form:
Algorithm | Best Case Time Complexity | Average Case Time Complexity | Worst Case Time Complexity | Space Complexity | Stable? |
Bucket Sort | ฮฉ(n+k) | ฮ(n+k) | O(n^2) | O(n+k) | Yes |
Insertion Sort | ฮฉ(n) | ฮ(n^2) | O(n^2) | O(1) | Yes |
Selection Sort | ฮฉ(n^2) | ฮ(n^2) | O(n^2) | O(1) | No |
Bubble Sort | ฮฉ(n) | ฮ(n^2) | O(n^2) | O(1) | Yes |
Merge Sort | ฮฉ(n log(n)) | ฮ(n log(n)) | O(n log(n)) | O(n) | Yes |
Quick Sort | ฮฉ(n log(n)) | ฮ(n log(n)) | O(n^2) | O(log(n)) | No |
In general, bucket sort performs well when the input array is uniformly distributed over a range. It has a linear time complexity in the best case and average case, but can have a quadratic time complexity in the worst case when the elements in the array are not uniformly distributed. It also requires additional space for the buckets.
Insertion sort, selection sort, and bubble sort are all comparison-based sorting algorithms with a worst-case time complexity of O(n^2). They are generally not as efficient as merge sort and quicksort, which have an average-case time complexity of O(n log(n)). Merge sort is stable and has a space complexity of O(n), while quick sort is not stable and has a space complexity of O(log(n)).
When to use Bucket Sort?
Bucket sort is a good choice of algorithm when the input array is uniformly distributed over a range. This is because bucket sort divides the input array into smaller subarrays, or buckets, which are then sorted individually using another sorting algorithm or recursively using bucket sort. If the elements in the input array are uniformly distributed, then the distribution of elements among the buckets will also be roughly uniform, and sorting the individual buckets will be efficient.
Bucket sort can also be used when the range of elements in the input array is known in advance, as this information is needed to determine the size and number of buckets needed. It is commonly used for sorting non-negative integers, where the range of values is limited and known in advance.
Another advantage of bucket sort is that it is a stable sorting algorithm, meaning that it preserves the relative order of elements with equal values. This can be important in some applications, such as sorting an array of objects by a specific attribute.
Overall, bucket sort is a good choice when the input array is uniformly distributed over a range and the range of values is known in advance. It can be an efficient algorithm for sorting non-negative integers and has the added benefit of being a stable sorting algorithm.
Conclusion
In summary, bucket sort is a sorting algorithm that works by dividing an array into several smaller arrays, or buckets, each of which contains a range of elements. Bucket sort is efficient for certain types of input, such as uniformly distributed data, and can be used with any type of data that can be sorted. However, bucket sort has some limitations that make it less suitable for certain types of data. The choice of sorting algorithm depends on the characteristics of the input array and the desired performance trade-offs. Bucket sort is a good choice when the input array is uniformly distributed over a range, while merge sort and quick sort are generally more efficient for general inputs.
This is a collection of insightful articles from domain experts in the fields of Cloud Computing, DevOps, AWS, Data Science, Machine Learning, AI, and Natural Language Processing. The range of topics caters to upski... Read Full Bio