Partition an array into K subarrays with minimal difference

DISCLAIMER:

The problem described here looks like a task from a competition. I'm not participating in one, and I'm not aware of any ongoing competition that might involve it. If there is one, I'll close the question to keep things fair!

I have a problem: given an array A of values and an integer K, split A into exactly K non-overlapping contiguous subarrays such that the difference between the largest and the smallest subarray sums is minimal. A may first be rotated by any number of positions in either direction.

Consider an example:

Input: A = [5 1 1 1 3 2], K = 3

Output: [5][1 1 1][3 2], maximum sum = 5, minimum sum = 3, result = 2
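
For checking a faster solution on small inputs, a brute-force reference can help. The following is only a sketch under my own assumptions: "rotation" means any cyclic shift, every subarray must be non-empty, and n is small enough for exponential search. The names place_cuts and brute_force are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical helper: splits a[start..n-1] into parts_left non-empty parts,
// tracking the lowest (lo) and highest (hi) part sum chosen so far.
static void place_cuts(const std::vector<int>& a, int start, int parts_left,
                       int lo, int hi, int& best) {
  int n = (int)a.size();
  int s = 0;
  if (parts_left == 1) {
    // The last part takes everything that remains.
    for (int i = start; i < n; ++i) s += a[i];
    lo = std::min(lo, s);
    hi = std::max(hi, s);
    best = std::min(best, hi - lo);
    return;
  }
  // Leave at least one element for each of the remaining parts.
  for (int end = start; end <= n - parts_left; ++end) {
    s += a[end];
    place_cuts(a, end + 1, parts_left - 1,
               std::min(lo, s), std::max(hi, s), best);
  }
}

// Hypothetical reference solver: tries every rotation and every set of cuts.
int brute_force(std::vector<int> a, int k) {
  int n = (int)a.size();
  assert(k >= 1 && k <= n);
  int best = INT_MAX;
  for (int r = 0; r < n; ++r) {
    place_cuts(a, 0, k, INT_MAX, INT_MIN, best);
    std::rotate(a.begin(), a.begin() + 1, a.end());  // shift left by one
  }
  return best;
}
```

Calling brute_force({5, 1, 1, 1, 3, 2}, 3) reproduces the result 2 from the example above.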

I have partially working code (terribly ugly, my bad, but it is not meant to be production quality):

#include <climits>
#include <cstdio>
#include <cstring>

const int max_n = 50;
const int max_k = 20;

int deps[max_n];

int max (int x, int y) {
  return x > y ? x : y;
}

int min (int x, int y) {
  return x < y ? x : y;
}

int sum (int a[], int start, int end) {
  int res = 0;
  for (int i = start; i <= end; ++i) res += a[i];

  return res;
}

int k_partitioning(int k, int n, int deps[]) {
  int res = INT_MAX;
  // consider all possible rotations/shifts
  for(int offset = 0; offset < n; ++offset) {
    for(int l_min = 0; l_min < n; ++l_min) {
      for(int r_min = l_min; r_min < n; ++r_min) {
        // check minimal sum subarray
        int min_sum = sum (deps, l_min, r_min);

        int dp[k][n];
        for (int s = 0; s < k; ++s) {
          for (int q = 0; q < n; ++q) {
            dp[s][q] = 0;
          }
        }
        // assuming that current sum is a target sum
        dp[0][r_min-l_min] = min_sum;

        for(int p = 1; p < k; ++p) {
          for(int l_max = 0; l_max < n; ++l_max) {
            for(int r_max = 0; r_max < n; ++r_max) {
              int max_sum = sum(deps, l_max, r_max);

              if (max_sum >= min_sum) dp[p][r_max] = max(dp[p-1][l_max], max_sum);
            } // l_maxs
          } // r_maxs
        } // partitions
        // printing dp

        // skip incorrect partitioning, when not all K partitions were used
        if (dp[k-1][n-1] == 0) continue;

        // update difference
        res = min (res, dp[k-1][n-1] - min_sum);
      } // end min sum seg
    } // start min sum seg
    //break;
  } // cuts
  return res;
}

int main(int argc, char* argv[]) {
  int k = 0;
  scanf("%d", &k);

  int n = 0;
  scanf("%d", &n);

  for (int i = 0; i < n; ++i) {
    scanf("%d", &deps[i]);
  }

  printf ("%d\n", k_partitioning(k, n, deps));

  return 0;
}

The idea is simple: assume that the current subarray has the minimal sum, enumerate all candidates for the maximal subarray, and set up a dynamic program that finds the smallest achievable maximum sum; then check the difference. Total complexity, by my estimate: O(K*N^4).

My problem is that it fails some tests and I'm stuck troubleshooting it. Could someone help me with it?

An example of a failing test:

N = 4, K = 2, A = [6 13 10 2]

UPDATE

This version should fix some of the earlier issues. First, it removes the wasteful loop over offsets and instead rotates the array at the end of the l_min loop. Second, I noticed that dp can't be initialized with 0: this is a minimization task, so it should be initialized with some large value (which depends on the problem's constants; max_value here is already outside the value domain). Finally, intervals should no longer overlap: each sum excludes the left end of its interval. However, it still does not produce the expected results.

#include <climits>
#include <cstdio>
#include <cstring>

const int max_value = 200000;
const int max_n = 50;
const int max_k = 20;

int deps[max_n];

int max (int x, int y) {
  return x > y ? x : y;
}

int min (int x, int y) {
  return x < y ? x : y;
}

int sum (int a[], int start, int end) {
  int res = 0;
  for (int i = start; i <= end; ++i) res += a[i];

  return res;
}

int k_partitioning(int k, int n, int deps[]) {
  int res = max_value;

  for(int l_min = 0; l_min < n; ++l_min) {
    for(int r_min = l_min; r_min < n; ++r_min) {
      int min_sum = sum (deps, l_min+1, r_min);

      int dp[k][n];
      for (int s = 0; s < k; ++s) {
        for (int q = 0; q < n; ++q) {
          dp[s][q] = max_value;
        }
      }
      // assuming that current sum is a target sum
      dp[0][r_min-l_min] = min_sum;

      for(int p = 1; p < k; ++p) {
        for(int l_max = 0; l_max < n; ++l_max) {
          for(int r_max = l_max; r_max < n; ++r_max) {
            int max_sum = sum(deps, l_max+1, r_max);

            if (max_sum >= min_sum) dp[p][r_max] = max(dp[p-1][l_max], max_sum);
          } // l_maxs
        } // r_maxs
      } // partitions

      // skip incorrect partitioning, when not all K partitions were used
      if (dp[k-1][n-1] == max_value) continue;

      // update difference
      res = min (res, dp[k-1][n-1] - min_sum);
    } // end min sum seg

    // rotate an array to consider different starting points
    int tmp[n];
    for (int i = 0; i < n; ++i) {
      int new_idx = i + n + 1;

      tmp[new_idx % n] = deps[i];
    }

    for(int i = 0; i < n; ++i) deps[i] = tmp[i];
  } // start min sum seg

  return res;
}

int main(int argc, char* argv[]) {
  int k = 0;
  scanf("%d", &k);

  int n = 0;
  scanf("%d", &n);

  for (int i = 0; i < n; ++i) {
    scanf("%d", &deps[i]);
  }

  printf ("%d\n", k_partitioning(k, n, deps));

  return 0;
}

Ok, I think I did it!

The idea is as follows: assume that the minimum-sum interval always starts at index 0. Then enumerate the maximum-sum intervals, starting from the right boundary of the minimum interval. For the current maximum interval, a DP determines the smallest achievable maximum sum. After that, update the result and rotate the array by one.

My code is not perfect in that it recomputes the subarray sums on every iteration. One can pre-compute them and just look them up each time.

This code might have some bugs, but it passes all the tests that I have.

#include <climits>
#include <cstdio>
#include <cstring>

const int max_value = 200000;
const int max_n = 50;
const int max_k = 20;

int deps[max_n];

int max (int x, int y) {
  return x > y ? x : y;
}

int min (int x, int y) {
  return x < y ? x : y;
}

int sum (int a[], int start, int end) {
  int res = 0;

  for (int i = start; i <= end; ++i) res += a[i];

  return res;
}

int k_partitioning(int k, int n, int deps[]) {
  int res = max_value;
  for(int offset = 0; offset < n; ++offset) {
    int l_min = 0;
    for(int r_min = l_min; r_min < n; ++r_min) {
      int min_sum = sum (deps, l_min, r_min);

      int dp[k][n];
      for (int s = 0; s < k; ++s) {
        for (int q = 0; q < n; ++q) {
          dp[s][q] = max_value;
        }
      }
      // assuming that current sum is a target sum
      dp[0][r_min-l_min] = min_sum;

      for(int p = 1; p < k; ++p) {
        for(int l_max = r_min; l_max < n; ++l_max) {
          for(int r_max = l_max; r_max < n; ++r_max) {
            int max_sum = sum(deps, l_max+1, r_max);

            if (max_sum >= min_sum) {
              dp[p][r_max] = min(dp[p][r_max], max(dp[p-1][l_max], max_sum));
            }

          } // l_maxs
        } // r_maxs
      } // partitions

      // skip incorrect partitioning, when not all K partitions were used
      if (dp[k-1][n-1] == max_value) continue;

      // update difference
      res = min (res, dp[k-1][n-1] - min_sum);
    } // end min sum seg
    int tmp[n];
    for (int i = 0; i < n; ++i) {
      int new_idx = i + n - 1;

      tmp[new_idx % n] = deps[i];
    }

    for(int i = 0; i < n; ++i) deps[i] = tmp[i];

  } // start min sum seg
  return res;
}

int main(int argc, char* argv[]) {
  int k = 0;
  scanf("%d", &k);

  int n = 0;
  scanf("%d", &n);

  for (int i = 0; i < n; ++i) {
    scanf("%d", &deps[i]);
  }

  printf ("%d\n", k_partitioning(k, n, deps));

  return 0;
}
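
Regarding the remark about pre-computing the sums: one common way is a prefix-sum array, which turns every range sum into a single subtraction. This is a minimal sketch, not part of the code above; build_prefix and range_sum are names I made up, and range_sum is meant to follow the same inclusive-ends contract as sum.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// prefix[i] holds a[0] + ... + a[i-1], so prefix[0] == 0.
std::vector<int> build_prefix(const std::vector<int>& a) {
  std::vector<int> prefix(a.size() + 1, 0);
  for (std::size_t i = 0; i < a.size(); ++i) prefix[i + 1] = prefix[i] + a[i];
  return prefix;
}

// Same contract as sum(a, start, end): both ends inclusive,
// and an empty range (start > end) yields 0.
int range_sum(const std::vector<int>& prefix, int start, int end) {
  if (start > end) return 0;
  assert(start >= 0 && end + 1 < (int)prefix.size());
  return prefix[end + 1] - prefix[start];
}
```

With the table built once per rotation, each call inside the DP loops costs O(1) instead of O(N).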

Now that you've got your code working, here's an alternative method :)

Let f(k, i) hold the (lowest sum, highest sum) intervals of the ways to split A[0..i] into k parts. For each k, we can pair a sum growing leftward from A[i] (the sum of A[i-j..i]) with all the intervals recorded for f(k-1, i-j-1) and update them. For each interval (low, high): if the sum is greater than high, then new_interval = (low, sum); if the sum is lower than low, then new_interval = (sum, high); otherwise, the interval stays the same. For example,

i:  0 1 2 3 4 5
A: [5 1 1 1 3 2]

k = 3
i = 3, j = 0
The ordered intervals available for f(3-1, 3-0-1) = f(2,2) are:
  (2,5), (1,6) // These were the sums, (A[1..2], A[0]) and (A[2], A[0..1])
Sum = A[3..3-0] = 1
Update intervals: (2,5) -> (1,5)
                  (1,6) -> (1,6) no change
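
The update rule boils down to taking a min and a max. A tiny sketch, where Interval and extend are names chosen just for this illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

using Interval = std::pair<int, int>;  // (low, high)

// Widen the interval just enough to cover the new part sum;
// a sum already inside [low, high] leaves it unchanged.
Interval extend(Interval iv, int sum) {
  assert(iv.first <= iv.second);
  return Interval(std::min(iv.first, sum), std::max(iv.second, sum));
}
```

Here extend(Interval(2, 5), 1) gives (1, 5) and extend(Interval(1, 6), 1) leaves (1, 6) as it is, matching the two updates in the example.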

Now, we can make this iteration much more efficient by recognizing and pruning intervals during the previous k round.

Watch:

A: [5 1 1 1 3 2]

K = 1:

  N = 0..5; Intervals: (5,5), (6,6), (7,7), (8,8), (11,11), (13,13)

K = 2:

  N = 0: Intervals: N/A

  N = 1: Intervals: (1,5)

  N = 2: (1,6), (2,5)

    Prune: remove (1,6) since any sum <= 1 would be better paired with (2,5)
           and any sum >= 6 would be better paired with (2,5)

  N = 3: (1,7), (2,6), (3,5)

    Prune: remove (2,6) and (1,7)

  N = 4: (3,8), (4,7), (5,6), (5,6)

    Prune: remove (3,8) and (4,7)

  N = 5: (2,11), (5,8), (6,7)

    Prune: remove (2,11) and (5,8)

For k = 2, we are now left with the following pruned record:

{
  k: 2,
  n: {
    1: (1,5),
    2: (2,5),
    3: (3,5),
    4: (5,6),
    5: (6,7)
  }
}

We've cut down the iteration of k = 3 from a list of n choose 2 possible splits to n relevant splits!

The general algorithm applied to k = 3:

for k' = 1 to k
  for sum A[i-j..i], for i <- [k'-1..n], j <- [0..i-k'+1]:
    for interval in record[k'-1][i-j-1]: // records are for [k'][n']
      update interval
  prune intervals in k'

k' = 3
  i = 2
    sum = 1, record[2][1] = (1,5) -> no change

  i = 3
    // sums are accumulating right to left starting from A[i]
    sum = 1, record[2][2] = (2,5) -> (1,5)
    sum = 2, record[2][1] = (1,5) -> no change

  i = 4
    sum = 3, record[2][3] = (3,5) -> no change
    sum = 4, record[2][2] = (2,5) -> no change
    sum = 5, record[2][1] = (1,5) -> no change

  i = 5
    sum = 2, record[2][4] = (5,6) -> (2,6)
    sum = 5, record[2][3] = (3,5) -> no change
    sum = 6, record[2][2] = (2,5) -> (2,6)
    sum = 7, record[2][1] = (1,5) -> (1,7)

The answer is the sum 5 paired with record[2][3] = (3,5), yielding the updated interval (3,5), i.e. a difference of 5 - 3 = 2. I'll leave the pruning logic for the reader to work out. If we wanted to continue, here's the pruned list for k = 3:

{
  k: 3,
  n: {
    2: (1,5), 
    3: (1,5),
    4: (3,5),
    5: (3,5)
  }
}
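
For anyone who wants to experiment, here is one possible way to fill in the pruning and run the whole thing. It is a sketch under my own assumptions, not necessarily the pruning intended above: I drop every interval that contains another one, since after absorbing any further sum the inner interval stays inside the outer one and is never wider. More than one interval per position may survive in general. The names prune, min_spread and min_spread_rotated are made up, and rotation is handled by simply trying all n shifts.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <utility>
#include <vector>

using Interval = std::pair<int, int>;  // (lowest part sum, highest part sum)

// Keep only intervals that do not contain another interval (duplicates go too).
static void prune(std::vector<Interval>& v) {
  // Highest low first; among equal lows, smallest high first.
  std::sort(v.begin(), v.end(), [](const Interval& x, const Interval& y) {
    return x.first != y.first ? x.first > y.first : x.second < y.second;
  });
  std::vector<Interval> kept;
  int best_high = INT_MAX;
  for (const Interval& iv : v) {
    if (iv.second < best_high) {  // contains none of the intervals kept so far
      kept.push_back(iv);
      best_high = iv.second;
    }
  }
  v.swap(kept);
}

// Minimal (max sum - min sum) over splits of a into exactly k non-empty
// contiguous parts, without rotation.
int min_spread(const std::vector<int>& a, int k) {
  int n = (int)a.size();
  assert(k >= 1 && k <= n);
  // rec[i]: pruned intervals for a[0..i] split into the current number of parts.
  std::vector<std::vector<Interval>> rec(n);
  int total = 0;
  for (int i = 0; i < n; ++i) {  // one part: the whole prefix
    total += a[i];
    rec[i].emplace_back(total, total);
  }
  for (int parts = 2; parts <= k; ++parts) {
    std::vector<std::vector<Interval>> next(n);
    for (int i = parts - 1; i < n; ++i) {
      int sum = 0;
      // The last part is a[i-j..i]; a[0..i-j-1] must hold parts-1 parts.
      for (int j = 0; i - j - 1 >= parts - 2; ++j) {
        sum += a[i - j];
        for (const Interval& iv : rec[i - j - 1])
          next[i].emplace_back(std::min(iv.first, sum), std::max(iv.second, sum));
      }
      prune(next[i]);
    }
    rec.swap(next);
  }
  int best = INT_MAX;
  for (const Interval& iv : rec[n - 1]) best = std::min(best, iv.second - iv.first);
  return best;
}

// Tries every cyclic shift of a; returns INT_MAX for an empty array.
int min_spread_rotated(std::vector<int> a, int k) {
  int best = INT_MAX;
  for (std::size_t r = 0; r < a.size(); ++r) {
    best = std::min(best, min_spread(a, k));
    std::rotate(a.begin(), a.begin() + 1, a.end());
  }
  return best;
}
```

For A = [5 1 1 1 3 2] and k = 3, min_spread ends with the single interval (3,5) at the last position and returns 2, the same as the trace above.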

Comments
  • The idea is sound but the code has some issues. Off the top of my head, you're not using the outer loop (offset) really, so you're definitely not getting the rotations right. The sum function is inclusive on both ends, so you're effectively looking at subarrays that overlap at their endpoints. Your complexity estimate is wrong: I count 5 nested loops that go up to n and one that goes up to k. Plus the sum function loops, making it closer to O(KN^6) in total. Otherwise, it doesn't look too far from correct (reaching O(KN^4) might require some work though).
  • @gus Thanks! I've resolved some of the issues, see the updated post. However, it still does not give the expected results.
  • Hi @captaintrunky, could you explain the logic behind the solution? I worked through it but kept getting lost. Thanks.