10 minute read

Split array largest sum.

Problem description

Taken directly from LeetCode.

Given an integer array nums and an integer k, split nums into k non-empty subarrays such that the largest sum of any subarray is minimized.

Return the minimized largest sum of the split.

A subarray is a contiguous part of the array.

Example 1:

Input: nums = [7,2,5,10,8], k = 2
Output: 18
Explanation: There are four ways to split nums into two subarrays.
The best way is to split it into [7,2,5] and [10,8], where the largest sum among the two subarrays is only 18.

Example 2:

Input: nums = [1,2,3,4,5], k = 2
Output: 9
Explanation: There are four ways to split nums into two subarrays.
The best way is to split it into [1,2,3] and [4,5], where the largest sum among the two subarrays is only 9.

Constraints:

  • 1 <= nums.length <= 1000
  • 0 <= nums[i] <= 10^6
  • 1 <= k <= min(50, nums.length)

Solution type

Algorithmic - binary search / parametric search.

Solution

The first question I asked is “what is the minimum largest sum”? It’s at least sum(nums) / k; the idea is that if you split the array so that every chunk has exactly this sum, then any attempt to move “mass” from one chunk to another to shrink the largest sum just creates a new, bigger largest sum by adding the mass of the one array chunk to the other. It feels to me that there’s supposed to be an \(O(n)\) solution that relies on this.
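
To make that lower bound concrete, here’s a tiny exploratory helper (the name lowerBound is just something I made up for this post, not part of any submission). Both ceil(sum / k) and the largest single element bound the answer from below; note that on example 1 it returns 16 while the real answer is 18, so the bound alone isn’t the whole story.

int lowerBound(vector<int>& nums, int k) {
    // both ceil(sum / k) and max(nums) are lower bounds on the answer:
    // no split can do better than either of them
    int total = accumulate(nums.begin(), nums.end(), 0);
    int avgCeil = (total + k - 1) / k;
    return max(avgCeil, *max_element(nums.begin(), nums.end()));
}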

The brute force solution is to try all possible chunks. Programmatically, I would think of this as “insert \(k - 1\) dividers into the array”. For \(n\) = len(nums), there are fewer than \(\binom{n + k - 1}{k - 1}\) possible choices for the dividers (I think??), which is going to end up being a blatantly unacceptable \(O((n + k - 1)!)\) time complexity [2]. We need to take advantage of the contiguous subarray part of the problem in order to reduce our time.
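
For concreteness, here’s a rough sketch of that brute force: recursively choose where each chunk ends and keep the best split. The function name and structure are just illustrative (and it’s exponential, so not something to submit).

int bruteForce(vector<int>& nums, int start, int k) {
    int n = nums.size();
    if (k == 1) {
        // one chunk left, so it has to take everything from start to the end
        return accumulate(nums.begin() + start, nums.end(), 0);
    }
    int best = INT_MAX, chunkSum = 0;
    // the current chunk ends at index end; leave at least k - 1 elements
    // for the remaining chunks
    for (int end = start; end <= n - k; ++end) {
        chunkSum += nums[end];
        best = min(best, max(chunkSum, bruteForce(nums, end + 1, k - 1)));
    }
    return best; // answer for the whole problem is bruteForce(nums, 0, k)
}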

The question is “how do we insert our dividers?”. Here’s a starter algorithm that works on the two example test cases:

int splitArray(vector<int>& nums, int k) {
    int average = reduce(nums.begin(), nums.end()) / k;
    int currentSum = 0;
    vector<int> possibleSums;
    for (auto elt : nums) {
        currentSum += elt;
        if (currentSum > average) {
            // went over the average: record this candidate sum and restart the
            // running sum at elt (so elt gets counted in both this sum and the next)
            possibleSums.push_back(currentSum);
            currentSum = elt;
        }
    }
    // remember that this returns an iterator lol
    return *min_element(possibleSums.begin(), possibleSums.end());
}

The idea is that we traverse the array and only consider inserting a divider when we go above the average value. We store all possible sums created by these dividers and return the minimum.

This fails if we reverse the chunks, and it also fails on higher values of \(k\). Which… reading this again, this feels really weird, so I’m not surprised.

What if we just do a greedy algorithm once going forward and once going backward? Like so:

int splitArray(vector<int>& nums, int k) {
    int average = reduce(nums.begin(), nums.end()) / k;
    cout << average;
    int currentSum = 0;
    vector<int> chunkSumsForward, chunkSumsBackward;
    // forward greedy pass: cut whenever the running sum goes over the average
    for (auto elt : nums) {
        currentSum += elt;
        if (currentSum > average) {
            chunkSumsForward.push_back(currentSum);
            currentSum = 0;
        }
    }
    currentSum = 0;
    // same greedy pass, but from right to left
    for (int i = nums.size() - 1; i >= 0; --i) {
        currentSum += nums[i];
        if (currentSum > average) {
            chunkSumsBackward.push_back(currentSum);
            currentSum = 0;
        }
    }

    // remember that this returns an iterator lol
    int maxForward = *max_element(chunkSumsForward.begin(), chunkSumsForward.end());
    int maxBackward = *max_element(chunkSumsBackward.begin(), chunkSumsBackward.end());

    return min(maxForward, maxBackward);
}

Seems encouraging at first but fails on [7,2,5,10,8], k = 4. The issue is that average = 8 in this scenario, and our algorithm will divide this array into [7,2], [5,10], [8] going forward or [8,10], [5,2,7] going backward. It doesn’t recognize that we have an additional divider and that we can split up the maximum even more.

One train of thought is to store indices of the array chunks’ starts, run this algorithm, then break up the chunk or chunks with the maximum sum - putting it that way, it soooooort of starts to sound like divide and conquer? I feel like this could get really hairy with the code though, because I’m sure you could craft an adversarial example where you have to keep breaking chunks down further and further until you use up all \(k\) dividers. So my instinct is to try and think more about the foundation of the problem.

How do we ensure that we use all \(k\) dividers? Our “average” heuristic is clearly not enough to ensure this on its own. I feel like you are supposed to use the fact that the minimum possible max sum is bounded below by the average, and I feel like…hmm. Maybe what you’re supposed to do is something similar to the first idea I had: when your current sum goes over the average, you have to choose whether to add the current value to that sum or to start a new chunk from there. In that sense it starts to sound similar to Kadane’s algorithm…

The problem is that we can’t really do that greedily, because whether or not you include that last element will affect the elements you include in the next chunks in the array. That feels wrong. I’m almost out of time, so I think I’m going to have to call it on this one and look up the solution later (unless I magically figure it out right now…).

int splitArray(vector<int>& nums, int k) {
    // (note: this is still integer division before the conversion to double)
    double average = reduce(nums.begin(), nums.end()) / k;
    cout << average;
    int currentSum = 0;
    vector<int> chunkSums;
    for (auto elt : nums) {
        currentSum += elt;
        if (currentSum > average) {
            // should we add elt or not add elt?
            if (currentSum > *max_element(chunkSums.begin(), chunkSums.end())) {
                chunkSums.push_back(currentSum - elt);
                currentSum = elt;
            }
            else {
                chunkSums.push_back(currentSum);
                currentSum = 0;
            }
        }
    }
    return *max_element(chunkSums.begin(), chunkSums.end());
}

This was as far as I got before time ran out – basically, if currentSum would become the maximum element of chunkSums, then push back currentSum - elt and start a new chunk at elt instead. But this just errors out calling max_element on the initially empty chunkSums, and as written it would basically never add an element to chunkSums anyway.

Okay, looking at the solution now. Somehow this is binary search? So it looks like it’s correct to focus on the array values instead of the indices. It also looks like it’s not possible to get it down to \(O(n)\); the fastest solutions are \(O(n \log (\textrm{sum(\texttt{nums})}))\). Let’s actually write a working solution to make sense of this…

int splitArray(vector<int>& nums, int k) {
    // the answer must lie somewhere in [max(nums), sum(nums)]
    int lowest = *max_element(nums.begin(), nums.end());
    int highest = accumulate(nums.begin(), nums.end(), 0);

    while (lowest < highest) {
        int mid = (lowest + highest) / 2;
        // count how many chunks we need if no chunk is allowed to sum above mid
        int num_splits = 1, running_sum = 0;
        for (int elt : nums) {
            if (running_sum + elt > mid) {
                num_splits++;
                running_sum = 0;
            }
            running_sum += elt;
        }
        if (num_splits > k) {
            // mid is too small to be achievable; only values strictly above it can work
            lowest = mid + 1;
        }
        else {
            // mid is achievable; keep it in the candidate range
            highest = mid;
        }
    }

    return highest;
}

Here’s the key: instead of binary searching on indices, binary search on values. Your array isn’t sorted, but the range of possible maximum subarray sums is [max(nums), sum(nums)], and it is perfectly valid to perform a binary search on that contiguous range of values. It’s also important how you set lowest and highest in the final if-else block: lowest is the boundary that gets set exclusively (mid + 1), because if mid is not a suitable candidate (i.e. it requires too many splits), then you should only examine values strictly greater than mid. Conversely, if mid is a suitable candidate, it should stay inside the range you are examining, in case later iterations of the binary search don’t turn up any smaller valid value.
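
As a quick sanity check, here’s a throwaway driver (assuming the splitArray above is pasted into the same file); on the two example inputs it should print 18 and 9.

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>
using namespace std;

// ... splitArray from above goes here ...

int main() {
    vector<int> a = {7, 2, 5, 10, 8};
    vector<int> b = {1, 2, 3, 4, 5};
    cout << splitArray(a, 2) << "\n";  // expect 18
    cout << splitArray(b, 2) << "\n";  // expect 9
}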

Each iteration of binary search takes \(O(n)\) time because that’s the time it takes to break the array down into subarrays. We do \(O(\log (\textrm{sum(\texttt{nums})}))\) iterations of binary search, so our final time complexity is this wonky \(O(n \log (\textrm{sum(\texttt{nums})}))\) term. Glancing through different solutions online suggests that this is optimal.
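
To put actual numbers on that: the constraints give \(n \le 1000\) and \(\textrm{sum(\texttt{nums})} \le 1000 \cdot 10^6 = 10^9\), and \(\log_2(10^9) \approx 30\), so the whole thing does on the order of \(30 \cdot 1000 = 30{,}000\) element visits in total. Cheap.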

We get a bit more context when we do some research on algorithm design. Parametric search takes a decision algorithm (e.g. “can we split the array into \(k\) chunks so that no chunk sums to more than \(x\)?”) and turns it into an optimization algorithm (e.g. “minimize the largest chunk sum”). This is more common in computational geometry, apparently. Parametric search has more interesting extensions and applications in parallel algorithms, but a basic sequential version works like this: you have a test algorithm and a decision algorithm. The decision algorithm can tell, for a concrete parameter value, whether that value is below, at, or above the optimum; the test algorithm is an algorithm whose comparisons depend on the parameter, and you simulate it on the unknown optimal value, using the decision algorithm to resolve each comparison it makes. I’ll take an example directly from Wikipedia here:

An example considered both by Megiddo (1983) and van Oostrum & Veltkamp (2002) concerns a system of an odd number of particles, all moving rightward at different constant speeds along the same line. The median of the particles, also, will have a rightward motion, but one that is piecewise linear rather than having constant speed, because different particles will be the median at different times. Initially the particles, and their median, are to the left of the origin of the line, and eventually they and their median will all be to the right of the origin. The problem is to compute the time \(t_{0}\) at which the median lies exactly on the origin. A linear time decision algorithm is easy to define: for any time \(t\), one can compute the position of each particle at time \(t\) and count the number of particles on each side of the origin. If there are more particles on the left than on the right, then \(t < t_0\), and if there are more particles on the right than on the left, then \(t > t_0\). Each step of this decision algorithm compares the input parameter \(t\) to the time that one of the particles crosses the origin. Using this decision algorithm as both the test algorithm and the decision algorithm of a parametric search leads to an algorithm for finding the optimal time \(t_0\) in quadratic total time. To simulate the decision algorithm for parameter \(t_0\), the simulation must determine, for each particle, whether its crossing time is before or after \(t_0\), and therefore whether it is to the left or right of the origin at time \(t_0\). Answering this question for a single particle – comparing the crossing time for the particle with the unknown optimal crossing time \(t_{0}\) – can be performed by running the same decision algorithm with the crossing time for the particle as its parameter. Thus, the simulation ends up running the decision algorithm on each of the particle crossing times, one of which must be the optimal crossing time. Running the decision algorithm once takes linear time, so running it separately on each crossing time takes quadratic time.
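
To make sure I actually follow that, here’s a quick sketch of the sequential quadratic version in C++. The particle data is made up: each particle starts at x[i] < 0 and moves right at speed v[i] > 0, so its crossing time is -x[i] / v[i].

#include <iostream>
#include <vector>
using namespace std;

int main() {
    // made-up example: an odd number of particles, all starting left of the
    // origin and moving right at different constant speeds
    vector<double> x = {-9.0, -4.0, -6.0, -3.0, -8.0};
    vector<double> v = { 1.0,  2.0,  1.5,  0.5,  8.0};
    int n = (int)x.size();

    double t0 = -1;
    for (int i = 0; i < n; ++i) {
        double t = -x[i] / v[i];  // candidate: particle i's crossing time
        // decision algorithm at time t: count particles strictly left of and
        // strictly right of the origin
        int left = 0, right = 0;
        for (int j = 0; j < n; ++j) {
            double pos = x[j] + v[j] * t;
            if (pos < 0) ++left;
            else if (pos > 0) ++right;
        }
        // the median sits on the origin when neither side holds a majority
        if (left <= n / 2 && right <= n / 2) t0 = t;
    }
    cout << "median crosses the origin at t = " << t0 << "\n";  // 4 for this data
}

Running the decision algorithm once is linear, and we run it once per crossing time, which is exactly the quadratic total the quote describes.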

The Wikipedia page itself has a comparison with binary search; basically, if you are searching an interval of numbers known to contain the optimum, then binary search can stand in for parametric search, with each step running the decision algorithm on the midpoint of the current interval. (The decision algorithm itself, of course, depends on the specific problem.) If you only want an integer output and you have a discrete range of integers (and not, e.g., a real interval), then binary search is perfectly suitable, which is exactly our situation here. Parametric search is more interesting for problems that aren’t discrete, because it gives strongly polynomial algorithms instead of weakly polynomial ones (in this case specifically because binary search’s complexity depends on the desired precision of the output [1]), but even then, binary search is much simpler and faster in most applications, and you need some justification to use parametric search.

I’ve done this one before, failed it, seen the solution, and forgotten it again. I don’t blame myself for that; it’s a heck of a technique to catch on your own. I think there is also a DP solution that I didn’t consider at all, which probably comes from a bias on my part, since this is classified in Blind 75 as an array problem.
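
For the record, here’s roughly what I think that DP looks like: dp[i][j] is the minimum possible largest sum when splitting the first i elements into j chunks, computed over prefix sums. This is a sketch written after the fact (the name splitArrayDP is mine) and I haven’t run it against LeetCode.

int splitArrayDP(vector<int>& nums, int k) {
    int n = nums.size();
    vector<long long> prefix(n + 1, 0);
    for (int i = 0; i < n; ++i) prefix[i + 1] = prefix[i] + nums[i];

    // dp[i][j] = minimum largest sum splitting the first i elements into j chunks
    vector<vector<long long>> dp(n + 1, vector<long long>(k + 1, LLONG_MAX));
    dp[0][0] = 0;
    for (int j = 1; j <= k; ++j) {
        for (int i = j; i <= n; ++i) {
            // the last chunk covers elements (p, i]; the first p elements form j - 1 chunks
            for (int p = j - 1; p < i; ++p) {
                dp[i][j] = min(dp[i][j], max(dp[p][j - 1], prefix[i] - prefix[p]));
            }
        }
    }
    return (int)dp[n][k];
}

That’s \(O(n^2 k)\) time, which should still be fine under these constraints; it’s just a completely different angle from the binary search.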

  1. Technically, strongly vs. weakly polynomial is decided by the length of the input, but the precision of the output can be treated as an input, and we can consider it being bounded by said input.

  2. Extreme overestimate but I’m too lazy to reduce this to something nice. Or it reduces to something nice but I forget how. The important part is that this is too slow. 
