Turbocharging Java With Parallel Array Sorting

JDK enhancement proposal 103 brings a new way of array sorting which uses the Fork/Join parallelism technique to provide sorting of arrays in parallel. This new feature is going to be added in the forthcoming Java 9 release.

Currently, we have two sorting implementations, one sorts arrays and another sorts collections. These two implementations perform sort in a single thread. The parallel array sorting enhancement will allow us to sort things in parallel which has a significant performance benefit.

I have tested this new feature with following code :


import java.util.Arrays;
import java.util.Random;

public class Main {

	public static void main(String[] args) {
		Random random = new Random();
		int[] ints1K = random.ints(1, 1_000).boxed().limit(1_000).mapToInt(x -> x).toArray();
		int[] ints10K = random.ints(1, 10_000).boxed().limit(10_000).mapToInt(x -> x).toArray();
		int[] ints100K = random.ints(1, 100_000).boxed().limit(100_000).mapToInt(x -> x).toArray();
		int[] ints1M = random.ints(1, 1_000_000).boxed().limit(1_000_000).mapToInt(x -> x).toArray();
		int[] ints10M = random.ints(1, 10_000_000).boxed().limit(10_000_000).mapToInt(x -> x).toArray();
		int[] ints100M = random.ints(1, 100_000_000).boxed().limit(100_000_000).mapToInt(x -> x).toArray();

		System.out.println("Sorting using Arrays.sort()");
		sort(ints1K);
		sort(ints10K);
		sort(ints100K);
		sort(ints1M);
		sort(ints10M);
		sort(ints100M);

		System.out.println("\nSorting using Arrays.parallelSort()");
		parallelSort(ints1K);
		parallelSort(ints10K);
		parallelSort(ints100K);
		parallelSort(ints1M);
		parallelSort(ints10M);
		parallelSort(ints100M);
	}

	public static void sort(int[] ints) {
		long start = System.currentTimeMillis();
		Arrays.sort(ints);
		System.out.println("Time took for: "+ ints.length+" ints: " + (System.currentTimeMillis() - start));
	}

	public static void parallelSort(int[] ints) {
		long start = System.currentTimeMillis();
		Arrays.parallelSort(ints);
		System.out.println("Time took for: "+ ints.length+" ints : " + (System.currentTimeMillis() - start));
	}
}

The output of the aforementioned code spinet :


Sorting using Arrays.sort()
Time took for: 1000 ints: 1
Time took for: 10000 ints: 3
Time took for: 100000 ints: 12
Time took for: 1000000 ints: 93
Time took for: 10000000 ints: 847
Time took for: 100000000 ints: 9434

Sorting using Arrays.parallelSort()
Time took for: 1000 ints : 0
Time took for: 10000 ints : 6
Time took for: 100000 ints : 4
Time took for: 1000000 ints : 29
Time took for: 10000000 ints : 61
Time took for: 100000000 ints : 376

The performance benefit for the smaller sized array may seem insignificant, however, when the array grows, the outcome becomes quite impactful.

The machine I have used for this test:
OS: Ubuntu 14.04 LTS
Memory: 16 GB
Processor: Intel® Core™ i5-4570 CPU @ 3.20GHz × 4

This new feature is defined for all primitive array types (except boolean), Comparable object types and any arbitrary Object types using a supplied Comparator.
For more details: http://openjdk.java.net/jeps/103

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s