Back

Merge K Sorted Arrays: Efficient Approaches and Code Examples

7 May 2025
6 min read

Merge K Sorted Arrays is a fundamental problem in computer science, frequently encountered in technical interviews and real-world applications. This problem arises in scenarios such as database management, data compression, and information retrieval, where efficiently handling sorted data is crucial for performance. 

In this guide, we will explore various approaches to solving the problem of merging K sorted arrays. We will discuss the brute-force method, the use of min heaps, and the divide-and-conquer technique, analyzing their time complexity, space requirements, and practical applications. 

Understanding the Problem of Merge K Sorted Arrays

Problem Statement

The task is to merge K sorted arrays, each containing N elements, into a single sorted array. The key challenge lies in efficiently sorting elements from multiple arrays while keeping track of their positions.

When merging multiple sorted arrays, we need to ensure that the final merged array maintains a sorted order without performing unnecessary computations. The naive approach of concatenating all arrays and sorting them is inefficient, especially for large values of K and N. Therefore, an optimized approach is required to handle the merging process efficiently.

Constraints

  • Each of the K arrays is already sorted in non-decreasing order.
  • The total number of elements in the final array will be K × N.
  • The merging process should be optimized for performance, ideally achieving better time complexity than brute-force sorting.

Input Format

  • K: The number of sorted arrays.
  • N: The number of elements in each array.
  • K sorted arrays of size N each.

Output Format

  • A single sorted array containing all elements from the K input arrays.

Example

Input

K = 3, N = 4  
arr1 = [1, 3, 5, 7]  
arr2 = [2, 4, 6, 8]  
arr3 = [0, 9, 10, 11]  

Process

Start with the three sorted arrays:

[1, 3, 5, 7]
[2, 4, 6, 8]
[0, 9, 10, 11]

Merge the first two arrays:

[1, 3, 5, 7] + [2, 4, 6, 8] → [1, 2, 3, 4, 5, 6, 7, 8]

Merge the result with the third array:

[1, 2, 3, 4, 5, 6, 7, 8] + [0, 9, 10, 11] → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

Output

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  

Challenges and Considerations

  • Efficient merging strategy: A naive approach (concatenation + sorting) takes O(KN log(KN)) time, which is inefficient for large values of K and N.
  • Memory usage: Extra space might be needed for storing intermediate results, but in-place merging techniques can help reduce the memory footprint.
  • Optimal merging: Using a min-heap (priority queue) or divide-and-conquer merge strategy can improve performance to O(KN log K).

Approaches to Solve the Problem

When merging multiple sorted arrays into a single sorted array, we can follow different strategies to achieve the goal efficiently. One of the simplest approaches is the Brute-Force Approach, which uses basic list operations to solve the problem.

1. Brute-Force Approach

The Brute-Force Approach is the simplest method for multiple merge K sorted arrays. It follows these steps:

  1. Collect all elements from K sorted arrays into a single list.
  2. Sort the combined list using an inbuilt sorting function.
  3. Return the sorted merged array.

This approach ensures correctness but is not the most efficient, as sorting takes additional time.

Algorithm

  1. Initialize an empty list to store all elements from the K sorted arrays.
  2. Iterate through each sorted array and append all elements to the result list.
  3. Sort the result list using Python’s built-in sorting method (.sort()).
  4. Return the sorted merged array as the final output.

Code Implementation for Merge K Sorted Arrays Python

def merge_k_sorted_arrays(arrays):
    result = []  # Step 1: Initialize an empty list
    for array in arrays:
        result.extend(array)  # Step 2: Add all elements from each array
    result.sort()  # Step 3: Sort the combined list
    return result  # Step 4: Return the sorted list

# Example usage
arrays = [[1, 3, 5], [2, 4], [0]]
print(merge_k_sorted_arrays(arrays))

Explanation of the Code

  • We start with an empty list result.
  • We iterate through each sorted array and use extend() to append all elements to the result.
  • After collecting all elements, we use sort() to order them in non-decreasing order.
  • Finally, we return the sorted merged array as the output.

Output

[0, 1, 2, 3, 4, 5]

Time Complexity Analysis

  • Copying all elements into a new list takes O(K × N) time, where K is the number of arrays and N is the size of each array.
  • Sorting the combined array takes O((K × N) log(K × N)) time using Python’s built-in sort(), which is based on Timsort.
  • Overall time complexity: O(K × N log(K × N)).

Space Complexity Analysis

O(K × N) additional space is required to store all elements in the result. Sorting is done in-place, so no extra space is used for sorting. Overall space complexity: O(K × N).

2. Merge Using Min Heap

The Min Heap Approach efficiently merge K sorted Arrays by maintaining a min heap (priority queue) to extract the smallest element at each step. This significantly improves performance over the brute-force method.

Algorithm

  1. Initialize a min heap (priority queue).
  2. Insert the first element of each array into the heap along with its array index and element index.
  3. Extract the smallest element from the heap and add it to the result list.
  4. Insert the next element from the same array into the heap.
  5. Repeat steps 3 and 4 until all elements are processed.
  6. Return the merged sorted array.

Code Implementation (Python) of Merge Using Min Heap

import heapq

def merge_k_sorted_arrays(arrays):
    min_heap = []
    result = []

    # Step 1: Insert the first element of each array into the heap
    for i in range(len(arrays)):
        if arrays[i]:  # Ensure array is not empty
            heapq.heappush(min_heap, (arrays[i][0], i, 0))  # (value, array index, element index)

    # Step 2: Extract min and push the next element from the same array
    while min_heap:
        val, arr_idx, elem_idx = heapq.heappop(min_heap)  # Get the smallest element
        result.append(val)

        # Step 3: Insert the next element from the same array into the heap
        if elem_idx + 1 < len(arrays[arr_idx]):
            heapq.heappush(min_heap, (arrays[arr_idx][elem_idx + 1], arr_idx, elem_idx + 1))

    return result

# Example usage
arrays = [[1, 3, 5], [2, 4, 6], [0, 9, 10]]
print(merge_k_sorted_arrays(arrays)) 

Explanation of the Code

  1. Initialize a min heap (min_heap) to keep track of the smallest elements from each array.
  2. Insert the first element of each array into the heap, along with the array index and element index.
  3. Extract the smallest element (heappop), add it to the result list, and then push the next element from the same array into the heap.
  4. Repeat until the heap is empty, ensuring all elements are merged in sorted order.
  5. Return the final merged, sorted array.

Output

[0, 1, 2, 3, 4, 5, 6, 9, 10]

Time Complexity Analysis

  • Heap Initialization: Inserting the first elements of K arrays takes O(K log K).
  • Processing Elements: Each of the K × N elements is inserted and removed from the heap once, each operation taking O(log K).
  • Total Time Complexity: O(KN log K) (much better than O(KN log(KN)) in the brute-force approach).

Space Complexity Analysis

  • Heap Stores Up to K Elements: O(K) space is used for the heap.
  • Result Array Stores All Elements: O(K × N) space is required for the final result.
  • Overall Space Complexity: O(K × N) (output) + O(K) (heap) → O(K × N).

3. Divide and Conquer Approach

The Divide and Conquer Approach efficiently merge K sorted arrays by recursively splitting them into smaller groups, merging them pairwise, and combining them step by step. This approach leverages the efficiency of Merge Sort to achieve optimal performance.

Algorithm

  1. Base Case: If there is only one array, return it.
  2. Divide: Recursively split the K arrays into two halves.
  3. Conquer: Merge the two halves using a merging function.
  4. Repeat Until Only One Sorted Array Remains.

Code Implementation (Python) of Divide and Conquer Approach

def merge_two_sorted_arrays(arr1, arr2):
    """ Helper function to merge two sorted arrays """
    i, j = 0, 0
    merged = []
    
    while i < len(arr1) and j < len(arr2):
        if arr1[i] < arr2[j]:
            merged.append(arr1[i])
            i += 1
        else:
            merged.append(arr2[j])
            j += 1

    # Append remaining elements
    while i < len(arr1):
        merged.append(arr1[i])
        i += 1
    while j < len(arr2):
        merged.append(arr2[j])
        j += 1

    return merged

def merge_k_sorted_arrays(arrays, left, right):
    """ Recursively merges K sorted arrays using divide and conquer """
    if left == right:
        return arrays[left]
    
    mid = (left + right) // 2
    left_half = merge_k_sorted_arrays(arrays, left, mid)
    right_half = merge_k_sorted_arrays(arrays, mid + 1, right)

    return merge_two_sorted_arrays(left_half, right_half)

# Wrapper function
def merge_k_arrays(arrays):
    if not arrays:
        return []
    return merge_k_sorted_arrays(arrays, 0, len(arrays) - 1)

# Example usage
arrays = [[1, 3, 5], [2, 4, 6], [0, 9, 10], [8, 12, 15]]
print(merge_k_arrays(arrays))

Explanation of the Code

The function first merges two sorted arrays using a two-pointer approach. It then recursively divides the list of arrays into two halves until only one array remains. The left and right halves are merged step by step until all arrays are combined. 

A wrapper function handles the input and starts the merging process. The approach ensures efficient sorting with O(KN log K) time complexity.

Output

[0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15]

Time Complexity Analysis

  • Splitting the arrays recursively takes O(log K) time.
  • Merging two arrays of total size N takes O(N) time.
  • Since each level of recursion merges O(K) arrays in O(N) time, the total time complexity is: O(KN log K), which is similar to the Min Heap Approach but often performs better in practice.

Space Complexity Analysis

  • Recursive call stack takes O(log K) space.
  • Merging requires O(K × N) space to store results.
  • Total Space Complexity: O(K × N) (output) + O(log K) (recursion stack) → O(K × N).

Conclusion

Merge K Sorted Arrays is a common problem in computer science with applications in databases, information retrieval, and big data processing. Various approaches can be used to solve this problem, each with different efficiency levels. 

The Brute-Force Approach is simple but inefficient, while the Min Heap Approach improves performance by using a priority queue. The Divide and Conquer Approach further optimizes the process by reducing the number of comparisons. 

Choosing the right approach depends on the constraints, such as the number of arrays and elements per array. Understanding these methods of merge K sorted arrays helps in solving similar problems efficiently in real-world scenarios.

Frequently Asked Questions

1. What is the most efficient way to merge K sorted arrays?

The Min Heap Approach and Divide and Conquer Approach both achieve O(KN log K) time complexity, making them the most efficient methods compared to the brute-force approach.

2. Why is the brute-force method not ideal for large datasets?

The brute-force method involves copying all elements and sorting them, resulting in O(KN log(KN)) time complexity. This is inefficient for large values of K and N due to excessive sorting operations.

3. How does a min heap improve performance?

A min heap allows efficient extraction of the smallest element from multiple arrays, reducing unnecessary sorting. Each insertion and deletion operation takes O(log K) time, leading to an overall time complexity of O(KN log K).

4. When should we use the Divide and Conquer Approach?

The Divide and Conquer method is useful when merging very large datasets, as it efficiently merges pairs of arrays step by step, maintaining optimal performance with O(KN log K) complexity.

5. What is the space complexity of these approaches?

  • Brute-force: O(KN) (stores all elements)
  • Min Heap: O(K + KN) (heap storage + result array)
  • Divide and Conquer: O(KN) (result storage) with O(log K) recursion stack overhead

6. Can these approaches be used in real-world applications?

Yes! These methods are used in database indexing, log merging, data streaming, and search engines where efficient merging of sorted data is required.

7. Which approach should I use for small datasets?

If K and N are small, the brute-force method may be sufficient. However, for larger datasets, the Min Heap or Divide and Conquer Approach is recommended to optimize performance.

Read More Articles

Chat with us
Chat with us
Talk to career expert