
Two sum.

A quick introduction

This is meant to be the start of a daily 1-hour practice regimen for my future coding interviews. Blind 75 is a popular list of practice problems, and I am planning on going through one problem a day. Since we’re starting today on Jan 1, 2024¹, that puts the end of this series around March 16, 2024.

The main purpose of blogging this is to keep myself accountable on the practice. I’d also like to show my students and other young learners how I make mistakes and improve on them. I think in tech, we’re often pressured not to admit to making any mistakes, and I’ve seen this create tons of pressure on my peers and students in university. (Additionally, any company that filters me out for showing my mistakes is probably not a company I want to work for. If the mistakes are consistent and extremely stupid then that’s another matter, but I don’t want to work somewhere that’s excessively perfectionist.)

I might not do this for every problem, but I’d also like to reflect on each problem a little more than I usually would. How does the solution actually work, and why is it actually better? When I took operating systems at my university, the professor told us that there were five solutions to any problem in software system design: indirection, caching, hashing, replication, and encryption. A lot of this won’t be applicable for LeetCode because LeetCode is not systems; these problems are way too simple for that. But I’ll use them² as a starting point to think about why some solutions are better than others.

I do tend to over-explain and spend too much time documenting things, so I will try to deliberately make these a little more brain-barfy and scratchwork-y to help cut down on that. Some documentation (esp docs being presented to non-technical audiences) should be better polished, but I don’t think the potential audience here would lose out if I had slightly less elegant prose than usual.

Odd-numbered problems will be in Python, even-numbered problems in C++. C++ is standard in games, while Python is normal for most software interviews. I think it’ll be good to do both.

None of this, of course, is great practice for real software engineering and real projects. More on that in other blog posts, although for the time being it looks like the only thing I reasonably have time for is keeping my website sharp.

Problem description

All problem descriptions will be taken directly from LeetCode.

Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target.

You may assume that each input would have exactly one solution, and you may not use the same element twice.

You can return the answer in any order.

Example 1:

Input: nums = [2,7,11,15], target = 9
Output: [0,1]
Explanation: Because nums[0] + nums[1] == 9, we return [0, 1].

Example 2:

Input: nums = [3,2,4], target = 6
Output: [1,2]

Example 3:

Input: nums = [3,3], target = 6
Output: [0,1]

Constraints:

  • 2 <= nums.length <= \(10^4\)
  • -\(10^9\) <= nums[i] <= \(10^9\)
  • -\(10^9\) <= target <= \(10^9\)
  • Only one valid answer exists.

Follow-up: Can you come up with an algorithm that is less than \(O(n^2)\) time complexity?

Solution type

This section answers “why is the optimal solution better?”.

For this problem, the answer is caching.

Solution

I’ve done this problem a few times now. This solution took <5 minutes to write and passes all test cases. I got mixed up on .keys() being a member variable versus a member function, but other than that it was smooth sailing.

def twoSum(self, nums: List[int], target: int) -> List[int]:
    # maps elt to idx
    d = {}
    for i in range(len(nums)):
        ith = nums[i]
        if (target - ith) in d.keys():
            return [d[target - ith], i]
        d[ith] = i

This solution is \(O(n)\) time and space.
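
For a quick sanity check outside the LeetCode harness, here’s a minimal standalone version (the same method, just wrapped in the usual Solution class with the typing import) run against the three examples from the problem statement:

from typing import List

class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        # maps elt to idx
        d = {}
        for i in range(len(nums)):
            ith = nums[i]
            if (target - ith) in d.keys():
                return [d[target - ith], i]
            d[ith] = i

s = Solution()
print(s.twoSum([2, 7, 11, 15], 9))  # [0, 1]
print(s.twoSum([3, 2, 4], 6))       # [1, 2]
print(s.twoSum([3, 3], 6))          # [0, 1]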

Let’s backtrack to the simple solution. The “easy”³ solution is to check every pair of numbers in the list to see if they add up to target:

def twoSum(self, nums: List[int], target: int) -> List[int]:
    # check every pair (i, j) with i < j
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]

This is \(O(n^2)\) time, \(O(1)\) space. Time is usually the more important resource to conserve, and we can save some time here if we cache. “Cache” here just means that we reference a hash map like in the optimal solution above. I count this as caching because the idea is that we make it cheaper to check whether a number we care about (target - ith) has already been seen in our one pass of the array. In particular, we reduce the cost of that check from \(O(n)\) (look through the array) to \(O(1)\) (lookup in hash map).
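
To make that concrete, here’s a rough trace of the dictionary growing during Example 1 (nums = [2,7,11,15], target = 9). The prints are just for illustration, and two_sum_traced is a throwaway name, not part of the submitted solution:

from typing import List

def two_sum_traced(nums: List[int], target: int) -> List[int]:
    # same logic as the solution above, with prints showing the cache at each step
    d = {}
    for i, ith in enumerate(nums):
        need = target - ith
        print(f"i={i}, nums[i]={ith}, need {need}, cache so far: {d}")
        if need in d:
            return [d[need], i]
        d[ith] = i

two_sum_traced([2, 7, 11, 15], 9)
# i=0, nums[i]=2, need 7, cache so far: {}
# i=1, nums[i]=7, need 2, cache so far: {2: 0}
# -> returns [0, 1]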

I think the important idea in this problem is identifying the expensive operation (looking up the complement to the current number) and making it cheaper.

I remember this problem becoming more interesting if you do 3-sum, 4-sum and so on. Caching doesn’t work as nicely (although I think you can force it to work) because you need to somehow track whether numbers are being reused, not just whether a single complement exists.
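
For what it’s worth, here’s a rough sketch of how you might force it for a 3-sum variant that only wants one triple of indices (not the standard LeetCode 3Sum, which asks for all unique value triplets): fix the first index, then run the same hash-map trick over the remainder. three_sum_indices is just an illustrative name, nothing standard:

from typing import List, Optional

def three_sum_indices(nums: List[int], target: int) -> Optional[List[int]]:
    # fix the first index, then reuse the two-sum hash map on the rest;
    # O(n^2) time instead of the O(n^3) triple loop
    for i in range(len(nums)):
        seen = {}  # element -> index, only for positions between i and j
        for j in range(i + 1, len(nums)):
            need = target - nums[i] - nums[j]
            if need in seen:
                return [i, seen[need], j]
            seen[nums[j]] = j
    return None

print(three_sum_indices([1, 2, 3, 4, 5], 12))  # [2, 3, 4], since 3 + 4 + 5 == 12

Resetting seen inside the outer loop is what keeps the same element from being counted twice, and that bookkeeping is exactly the part that gets fiddlier as you add more numbers to the sum.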

  1. Small lie here; I’m doing this on the 27th of December because it works better with my vacation schedule. 

  2. Minus encryption, I don’t imagine a standard SWE interview asking about encryption unless you are specifically interviewing for a security position. 

  3. I say easy, but especially when you’re starting out I feel like it’s easy to run into the problem of “what the frick am I doing?”, particularly if you’re like me and you haven’t been regularly coding since the age of 12. 
