Merge K sorted arrays of size n using merge algorithm from mergeSort

merge k sorted arrays using heap in java
merge k sorted arrays divide and conquer
merge k sorted arrays leetcode
k-way merge sort python
merge k sorted lists divide and conquer java
merge k sorted arrays python heap
k-way merge sort java
merge 2 sorted arrays

Problem: Given K sorted arrays of size N each, merge them and print the sorted output.

Sample Input-1:

K = 3, N =  4

arr[][] = { {1, 3, 5, 7},

            {2, 4, 6, 8},

            {0, 9, 10, 11}} ;


Sample Output-1: 

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

I know there is a way to do this problem using a priority queue/min heap, but I want to do it using the merge procedure from mergeSort. The idea seems straightforward enough...at each iteration, merge the remaining arrays in groups of two, such that the number of arrays gets halved at each iteration.

However, whenever halving leads to an odd number, this becomes problematic. My idea is that whenever halving leads to an odd number, we take care of the extra array by merging it with the array formed from the last merge.

The code I have so far is below. This only works on one out of 30 test cases, however:

static int[] mergeArrays(int[][] arr) {
        int k = arr.length;
        int n = arr[0].length;
        if(k < 2){
            return arr[0];
        }

        boolean odd_k;
        if(k%2){
            odd_k = false;
        }
        else{
            odd_k = true;
        }

        while(k > 1){
            int o;
            if(odd_k){
                o = (k/2) + 1;
            }
            else{
                o = k/2;
            }
            int[][] out = new int[o][];

            for(int i=0; i < k; i = i + 2){
                int[] a;
                int[] b;
                if(odd_k && i == (k-1)){
                    b = arr[i];
                    b = out[i-1];
                }
                else{
                    a = arr[i];
                    b = arr[i+1];
                }
                out[i] = mergeTwo(a, b);
            }
            k = k/2;
            if(k % 2 == 0){
                odd_k = false;
            }
            else{
                odd_k = true;
            }

            arr = out;
        }
        return arr[0];

    }

    static int[] mergeTwo(int[] a, int[] b){
        int[] c = new int[a.length + b.length];
        int i, j, k;
        i = j = k = 0;
       while(i < a.length && j < b.length){
           if(a[i] < b[j]){
               c[k] = a[i];
               i++;
               k++;
           }
           else{
               c[k] = b[j];
               j++; k++;
            }
       }
       if(i < a.length){
           while(i < a.length){
               c[k] = a[i];
               i++; k++;
           }
       }
       if(j < b.length){
           while(j < b.length){
               c[k] = b[j];
               j++; k++;
           }
       }
       return c;
    }

We can shorten your mergeTwo implementation,

static int[] mergeTwo(int[] a, int[] b) {
    int[] c = new int[a.length + b.length];
    int i = 0, j = 0, k = 0; // declare and initialize on one line
    while (i < a.length && j < b.length) {
        if (a[i] <= b[j]) {
            c[k++] = a[i++]; // increment and assign
        } else {
            c[k++] = b[j++]; // increment and assign
        }
    }
    // No need for extra if(s)
    while (i < a.length) {
        c[k++] = a[i++];
    }
    while (j < b.length) {
        c[k++] = b[j++];
    }
    return c;
}

And we can then fix your mergeArrays and shorten it by starting with the first row from the int[][] and then using mergeTwo to concatenate the arrays iteratively. Like,

static int[] mergeArrays(int[][] arr) {
    int[] t = arr[0];
    for (int i = 1; i < arr.length; i++) {
        t = mergeTwo(t, arr[i]);
    }
    return t;
}

I then tested it with

public static void main(String[] args) {
    int arr[][] = { { 1, 3, 5, 7 }, { 2, 4, 6, 8 }, { 0, 9, 10, 11 } };
    System.out.println(Arrays.toString(mergeArrays(arr)));
}

And I get (as expected)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

Merge two sorted arrays in Java, . The code snippet that demonstrates this is given as follows. Create an result [] of size n*k. Copy all the elements from k arrays into result array. This will take O (nk). Sort the result [] using Merge Sort.

As you say you have merged two arrays at a time. As it is inefficient you can merge all subarrays same time. What you have to do is to find the minimum from every subarray and remember the position of that element.


To do that we can use another array (say curPos) to remember the current position

 private int[] merge(int[][] arr) 
{
    int K = arr.length;
    int N = arr[0].length;

    /** array to keep track of non considered positions in subarrays **/
    int[] curPos = new int[K];

    /** final merged array **/
    int[] mergedArray = new int[K * N];
    int p = 0;

    while (p < K * N)
    {
        int min = Integer.MAX_VALUE;
        int minPos = -1;
        /** search for least element **/
        for (int i = 0; i < K; i++)
        {
            if (curPos[i] < N)
            {
                if (arr[i][curPos[i]] < min)
                {
                    min = arr[i][curPos[i]];
                    minPos = i;
                }
            }                
        }
        curPos[minPos]++;            
        mergedArray[p++] = min;
    }
    return mergedArray;

why should we use n-way merge? what are its advantages over 2 , breaks down the arrays to subarrays of size one third. The k-way merge problem consists of merging k sorted arrays to produce a single sorted array with the same elements. Denote by n the total number of elements. n is equal to the size of the output array and the sum of the sizes of the k input arrays. For simplicity, we assume that none of the input arrays is empty.

Probably the easiest way to handle this is to use a queue of arrays. Initially, add all the arrays to the queue. Then, remove the first two arrays from the queue, merge them, and add the resulting array to the queue. Continue doing that until there is only one array in the queue. Something like:

for each array in list of arrays
    queue.push(array)

while queue.length > 1
    a1 = queue.pop()
    a2 = queue.pop()
    a3 = merge(a1, a2)
    queue.push(a3)

result = queue.pop()

That simplifies things quite a bit, and the problem of "halving" goes away.

3-way Merge Sort, Giving k sorted arrays, each of size N, the task is to merge them into a single start looking at the k arrays as the intermediate state of the merge sort algorithm. As others have said, using the min heap to hold the next items is the optimal way. It's called an N-way merge. Its complexity is O(n log k). You can use a 2-way merge algorithm to sort k arrays. Perhaps the easiest way is to modify the standard merge sort so that it uses non-constant partition sizes.

Merge K sorted arrays, Merge K sorted arrays of size n using merge algorithm from mergeSort. Problem: Given K sorted arrays of size N each, merge them and print the sorted output. An efficient solution is to use heap data structure. The time complexity of heap based solution is O(N Log k). 1. Create an output array. 2. Create a min heap of size k and insert 1st element in all the arrays into the heap

Merge K sorted arrays of size n using merge algorithm from mergeSort, In computer science, k-way merge algorithms or multiway merges are a specific type of sequence merge algorithms that specialize in taking in k sorted lists and merging them Denote by A[1..p] and B[1..q] two arrays sorted in increasing order. The k-way merge problem consists of merging k sorted arrays to produce a  Merge Sort works similar to quick Sort where one uses a divide and conquer algorithm to sort the array of elements. It uses a key process Merge(myarr, left,m, right) to combine the sub-arrays that were divided using m position element.

k-way merge algorithm, Merge K Sorted Arrays - Min Heap Algorithm ("Merge K Sorted Lists" on LeetCode) know Duration: 11:25 Posted: 9 Jan 2019 Count inversions in an array | Set 2 (Using Self-Balancing BST) Count inversions of size k in a given array; Count inversions in an array | Set 4 ( Using Trie ) Find array with k number of merge sort calls; Merge Sort with O(1) extra space merge and O(n lg n) time; Why Quick Sort preferred for Arrays and Merge Sort for Linked Lists?

Comments
  • if(k%2) and b = out[i-1] are probably what you did wrong.
  • You can merge more than two arrays at a time if you want. Of course it will require an index into each array and a loop to find the next element to put into the merged result.
  • @OleV.V. - For a non-external sort, there is little benefit from doing more than a 2 way merge. See my comment to Tharaka Ratnayake answer.
  • That was my initial implementation but if you look at that closely it is not efficient. It merges the output of a merge with the next, so it would be O(k2n). Trying to do it in groups of two, so its O(kn log k).
  • @ohbrobig - despite the better time complexity, ignoring loop overhead, either approach is about the same number of compares and moves. For your suggested approach, for each pass, you could create a matrix[(arr.length+1)/2][arr[0].length*2], merge even and odd rows (or copy a row if there is no row to merge it with (the odd number of rows case)) into that matrix, then repeat using that matrix as input for the next pass .
  • For a non-external sort, there is little benefit from doing more than a 2 way merge. The number of operations is about the same. For example, in the typical inner loop, 2 way merge does 1 compare and 1 move for each element moved, while 4 way merge does 3 compares and 1 move for each element moved, but only does 1/2 the number of moves, so 4 way uses 1/2 times the number of moves, but 3/2 times the number of compares, about the same number of operations. If using a heap for a k-way merge, the overhead of the heap slows it down.
  • Your technique of searching for the smallest element is very inefficient. You're doing a linear search for each element. The result is that your algorithm is O(k * n * k), whereas the OP's algorithm is O(k * n * log(k)). If you want to merge all of the lists at the same time, then you need to use a more efficient priority queue. But as @rcgldr pointed out in his comment, the heap overhead slows things down. For internal merges, the only time the heap-based k-way merge outperforms pairwise merging is when some lists are much (orders of magnitude) longer than others.