8 minute read

Three sum.

Problem description

Taken directly from LeetCode.

Given an integer array nums, return all the triplets [nums[i], nums[j], nums[k]] such that i != j, i != k, and j != k, and nums[i] + nums[j] + nums[k] == 0.

Notice that the solution set must not contain duplicate triplets.

Example 1:

Input: nums = [-1,0,1,2,-1,-4]
Output: [[-1,-1,2],[-1,0,1]]
Explanation: nums[0] + nums[1] + nums[2] = (-1) + 0 + 1 = 0.
nums[1] + nums[2] + nums[4] = 0 + 1 + (-1) = 0.
nums[0] + nums[3] + nums[4] = (-1) + 2 + (-1) = 0.
The distinct triplets are [-1,0,1] and [-1,-1,2].
Notice that the order of the output and the order of the triplets does not matter.

Example 2:

Input: nums = [0,1,1]
Output: []
Explanation: The only possible triplet does not sum up to 0.

Example 3:

Input: nums = [0,0,0]
Output: [[0,0,0]]
Explanation: The only possible triplet sums up to 0.

Constraints:

  • 3 <= nums.length <= 3000
  • \(-10^5\) <= nums[i] <= \(10^5\)

Solution type

Went for a caching type solution, but the actual solution is more about clever algorithm design than anything.

Solution

Missed the mark again here. You can see here how I wandered in the wrong direction with my solution: I realized we could sort, but then I latched onto a solution similar to Two Sum’s solution and got too attracted to it. I think technically

This is definitely not an \(O(n)\) problem, so one thing we can immediately consider is tracking all of our triplets in a set, then returning a vector constructed from that set. It will reduce the logic needed to avoid duplicates in the output.

The naive \(O(n^3)\) solution is to consider all triplets.

vector<vector<int>> threeSum(vector<int>& nums) {
    set<vector<int>> s;
    for (int i = 0; i < nums.size() - 2; ++i) {
        for (int j = i + 1; j < nums.size() - 1; ++j) {
            for (int k = j + 1; k < nums.size(); ++k) {
                if (nums[i] + nums[j] + nums[k] == 0) {
                    vector<int> v = {nums[i], nums[j], nums[k]};
                    // to avoid out-of-order duplicates in output
                    sort(v.begin(), v.end());
                    s.insert(v);
                }
            }
        }
    }
    return vector<vector<int>>(s.begin(), s.end());
}

What feels wrong with this solution is that we are definitely reconsidering pairs of indices. Every triplet of indices is unique, but some of those unique triplets contain the same indices in different positions. Consider, for example, i=0, j=5, k=6 and i=5, j=6, k=8. We already computed nums[j] + nums[k] for the first triplet, and we should be able to reuse that computation to determine nums[i] = nums[j] in the second triplet.

I think there’s a couple ways to do this. Notably, since we are asked to return the elements and not the indices, we can reorder nums and still get the correct solution. My first thought is whether sorting can help us.

By using \(O(n \log (n))\) time to sort and then doing \(O(\log n)\) work for \(O(n^2)\) pairs, we can reduce our time complexity to \(O(n^2 \log (n))\). This still runs into the same issue as above, though, where we are recomputing values we shouldn’t need to recompute.

We can get an \(O(n^2)\) solution by doing this:

  1. Place all distinct pairs of elts in nums into a hash map, where the key is the sum and the value is a pair of indices; construction is \(O(n^2)\) time
  2. Make one pass through nums and for each element, check if the negation of said element is a key in the map with a value that does not include its own index; pass is \(O(n)\) time
  3. Return a vector created from the set of elements

I store the indices and not the values to avoid confusion with e.g. [-1,2,0] and [-1,-1,2]; I don’t want the -1 to be counted twice in the first case, but I do want it counted twice in the second. I feel like there’s a nicer way to do this though.

So something like this:

vector<vector<int>> threeSum(vector<int>& nums) {
    set<vector<int>> s;
    unordered_map<int, vector<pair<int, int>>> m;
    for (int i = 0; i < nums.size() - 2; ++i) {
        for (int j = i + 1; j < nums.size() - 1; ++j) {
            m[nums[i] + nums[j]].push_back(pair<int,int> {i, j});
        }
    }

    for (int i = 0; i < nums.size(); ++i ) {
        if (m.find(-nums[i]) != m.end()) {
            for (pair<int, int> p : m[-nums[i]]) {
                if (p.first != i && p.second != i) {
                    vector<int> v = {nums[i], nums[p.first], nums[p.second]};
                    sort(v.begin(), v.end());
                    s.insert(v);
                }
            }
        }
    }

    return vector<vector<int>>(s.begin(), s.end());
}

(Using unordered_set versus set for s shouldn’t actually change the problem’s overall time complexity, but it is a better time complexity for all the set operations if it’s unordered. Shouldn’t make a huge difference unless there are proportionally a lot of duplicates. I am using set because an unordered_set of vector objects would require me to override std::hash for vector. In Python this would not be a problem, as Python’s default set uses hashing.)

I wasn’t surprised to see that this times out on a large array of all zeroes. The problem is that we need to store all possible ways to get a partiuclar sum. If we refer to this number as \(k\) for a particular sum, our algorithm is actually \(O(n^2 + nk)\), since we might need to check all \(k\) ways to get that sum against \(O(n)\) elements. If \(k\) is very large - it can go up to \(O(n^2)\) - this term dominates and we don’t get a satisfactory result. (It doesn’t help that I don’t have a way to hash vector or pair in C++, but oh well…)

Close, though. Just need a way to eliminate this \(k\) term.

Here’s a thought. Instead of doing one last pass through nums, we could do one last pass through the keys of m and check whether the complement to each key is in nums. If we find a match, then construct our output set for each element in . This reduces the work we do for each pair stored in a value of m down to \(O(1)\) from \(O(n)\) - if we start viewing the work done from the perspective of the values in m isntead of the values in nums.

Looking up an item in nums is \(O(n)\), but we can construct an unordered set or map in \(O(n)\) time and then look up any desired element in \(O(1)\) time. This solution is starting to get messy in terms of number of objects, but that’s fine.

The tricky bit is again dealing with [-1,2,0] and [-1,-1,2], i.e. determining whether we have duplicates. [0,0,0] versus [0,0,1] also becomes a problem now.

Barely ran out of time! Bugger.

vector<vector<int>> threeSum(vector<int>& nums) {
    set<vector<int>> s;
    unordered_map<int, vector<pair<int, int>>> m;
    for (int i = 0; i < nums.size() - 1; ++i) {
        for (int j = i + 1; j < nums.size(); ++j) {
            m[nums[i] + nums[j]].push_back(pair<int,int> {nums[i], nums[j]});
        }
    }

    unordered_map<int, int> nums_map;
    for (int elt : nums) {
        nums_map[elt]++;
    }
    for (auto& it : m) {
        int key = it.first;
        if (nums_map.find(-key) != nums_map.end()) {
            for (auto& p : it.second) {
                if (p.first != -key && p.second != -key
                ||  p.first == 0 && p.second == 0 && nums_map[0] >= 2
                ||  nums_map[-key] >= 1) {
                    vector<int> v = {nums[-key], nums[p.first], nums[p.second]};
                    sort(v.begin(), v.end());
                    s.insert(v);
                }
            }
        }
    }

    return vector<vector<int>>(s.begin(), s.end());
}

Looking at the solution, it looks like I was pretty far off. Here’s a solution I wrote back in the day:

vector<vector<int>> threeSum(vector<int>& nums) {
    vector<vector<int>> result;
    
    sort(nums.begin(), nums.end());
    
    for (int i = 0; i < nums.size(); ++i) {
        int target = -nums[i];
        int left = i + 1;
        int right = nums.size() - 1;
        while (left < right) {
            if (nums[left] + nums[right] == target) {
                result.push_back({nums[i], nums[left], nums[right]});
                while (left + 1 < right && nums[left + 1] == nums[left]) {
                    ++left;
                }
                while (right - 1 > left && nums[right - 1] == nums[right]) {
                    --right;
                }
                
                ++left;
                --right;
            }
            else if (nums[left] + nums[right] < target) {
                ++left;
            }
            else {
                --right;
            }
        }
        
        while (i + 1 < nums.size() && nums[i] == nums[i + 1]) {
            ++i;
        }
    }
    
    return result;
}

The idea here is this: sort, then work your way up from the start of the array to the end of it. Create two pointers: one at the current index + 1, one at the end of the array. Within this range, you’re looking for two numbers that sum up to the complement of your current element. If your current sum between the two numbers is too high, then decrease the larger index (to give you a smaller number). Vice versa. If you come upon two elements that sum up to the complement, you’re golden - add that to the output. Because you’re looking at \(O(n)\) elements for each “base” element, you easily fit under an \(O(n^2)\) bound. To skip duplicates, just repeatedly increment and decrement values when you are doing your increment / decrement until you arrive at a new element.

Really, I think the correct way to approach this problem would’ve been “for each element, solve Two Sum with a given target”. This is bascially using the naive sorting solution to Two Sum, except it’s not a bad solution anymore because sorting is cheap enough to be affordable.

Much simpler and much less messy code. I can’t give a good reason as to why I got so tunnel-visioned on a complicated set / map approach except that it’s been a while.

This problem also makes me understand why Python is preferred over C++ for interview problems. I don’t think most of this would’ve taken as much time in Python, and I might’ve had a chance at getting through it (even with the clumsy approach).

The fact that I’m missing problems I got back in the day tells me that I haven’t been challenging myself enough with these algorithmic problems. I think this is fair because I’ve become more interested in systems and “real coding”; still, it’s a sign that this is good practice for me and that I shouldn’t get too assured with myself. Someone else could give a grumbly ramble about how coding interviews are testing for competition skills instead of real, meaningful technical knowledge…but I’ll save that rant for someone else and focus on studying.

Leave a comment

Your email address will not be published. Required fields are marked. *

Loading...