5 minute read

Course schedule.

Problem description

Taken directly from LeetCode.

There are a total of numCourses courses you have to take, labeled from 0 to numCourses - 1. You are given an array prerequisites where prerequisites[i] = [ai, bi] indicates that you must take course bi first if you want to take course ai.

  • For example, the pair [0, 1], indicates that to take course 0 you have to first take course 1.

Return true if you can finish all courses. Otherwise, return false.

Example 1:

Input: numCourses = 2, prerequisites = [[1,0]]
Output: true
Explanation: There are a total of 2 courses to take.
To take course 1 you should have finished course 0. So it is possible.

Example 2:

Input: numCourses = 2, prerequisites = [[1,0],[0,1]]
Output: false
Explanation: There are a total of 2 courses to take.
To take course 1 you should have finished course 0, and to take course 0 you should also have finished course 1. So it is impossible.

Constraints:

  • 1 <= numCourses <= 2000
  • 0 <= prerequisites.length <= 5000
  • prerequisites[i].length == 2
  • 0 <= ai, bi < numCourses
  • All the pairs prerequisites[i] are unique.

Solution type

Caching, kinda. Really it’s just application of graph algorithms.

Solution

I got unreasonably stuck on this one because I declined to examine the obvious solution. See my ramblings below…

The problem description is code for “check for cycles in the graph”. The prerequisites are directed edges in the graph, and the idea should be to follow all those directed edges to see if you get a chain of length greater than numCourses.

You can probably get a valid solution just by constructing the graph as an adjacency list? I think you could save on memory by just using the input list, but I’m not sure you could do so without sacrificing time. Or use DFS / BFS again.

Okay - I think what we do is construct an adjlist, then run DFS to search over the entire graph and create chains of nodes (I guess technically subtrees?). We keep track of the current “chain” length, and if we detect that the chain reaches size numCourses, we know we have a cycle (a subtree of \(n\) nodes can have at most \(n - 1\) edges, else it is not a tree anymore). As we “walk back” the DFS we decrease the value of our chain variable. If we have a non-tree graph then we have a cycle, so there is no need to keep track of a set or map that shows whether a node has been seen or not.

Okay, so in other words do DFS and keep track of the depth. Easy enough. We may have a graph with multiple components, so to be safe we’ll loop over the entire adjlist - basically check every possible starting node. Hmm…just to be safe, I’ll actually go ahead and make a vector of bools to check if I’ve seen a node already.

Noting that I need to switch from DFS to BFS to correctly track depth. BFS goes one layer deeper on each iteration; with DFS you can’t track the depth nearly as easily.

Wait, no. Consider the adjlist [[1, 2, 3], [2, 3], [3]]. This is acyclic but not a tree. Trees…are not directed. There is a concept of a polytree, which is a directed acyclic graph whose underlying undirected graph is a tree (thanks Wikipedia), but we can’t just check for polytrees because other DAGs with more edges exist. This might not be a big deal actually; we just need to ask “what is the maximum number of directed edges a DAG can contain?” and we should be okay.

Hmm…maximum size of a DAG wrt \(n\). I think this is just the \(n-1\)th triangle number? From your first node, draw \(n - 1\) edges; you can then draw \(n - 2\) edges from your second node without inducing a cycle, then \(n - 3\) edges from your third node without inducing a cycle…so on and so forth. If you want, you can think of it recursively in terms of how many edges you can add after adding a new node such that you avoid inducing a cycle. Okay, so our solution is still fine - we just need to adjust our upper limit.

What’s the \(n\)th triangle number? It’s \(\sum_{i = 1}^n i\), which is \(\frac{(n+1)(n)}{2}\). If it sounds like I’m being pedantic, yes, but I’m also just walking through the steps again to try and refresh my brain. It’s been a second since my math courses from school.

Nope, still not getting it. This is a little frustrating, I evidently don’t remember how to properly check for cycles. (Also i didn’t rename my dfs function lol)

bool dfs(vector<vector<int>>& adjs, vector<bool>& seen, int i, int limit) {
    queue<int> q;
    q.push(i);
    int length = 0;
    while (!q.empty()) {
        int current = q.front();
        q.pop();
        seen[current] = true;
        if (length++ == limit) {
            return false;
        }
        for (auto index : adjs[current]) {
            q.push(index);
        }
    }
    return true;
}

bool canFinish(int numCourses, vector<vector<int>>& prerequisites) {
    vector<vector<int>> adjs(numCourses);
    vector<bool> seen(numCourses, false);
    for (auto& edge : prerequisites) {
        adjs[edge[0]].push_back(edge[1]);
    }
    bool result = true;
    int limit = numCourses * (numCourses + 1) / 2;
    for (int i = 0; i < numCourses; ++i) {
        if (!seen[i]) {
            result = result && dfs(adjs, seen, i, limit);
        }
    }
    return result;
}

Looking at the solution, I was making it way more complicated than it had to be. Really, I got turned off by the idea of keeping track of a visited array (which is dumb - it doesn’t change the asymptotic memory complexity) and chased a very incorrect rabbit hole as a result. Here’s the correct solution - DFS, track seen, return false if you ever find something in the DFS stack and true otherwise.

bool dfs(vector<vector<int>>& adjs, vector<bool>& seen, vector<bool>& currentlyInStack, int i) {
    if (!seen[i]) {
        seen[i] = true;
        currentlyInStack[i] = true;
    
        for (auto adjIndex : adjs[i]) {
            if (currentlyInStack[adjIndex]) {
                return false;
            }
            if (!dfs(adjs, seen, currentlyInStack, adjIndex)) {
                return false;
            }
        }
    }
    currentlyInStack[i] = false;
    return true;
}

bool canFinish(int numCourses, vector<vector<int>>& prerequisites) {
    vector<vector<int>> adjs(numCourses);
    vector<bool> seen(numCourses, false);
    vector<bool> currentlyInStack(numCourses, false);
    for (auto& edge : prerequisites) {
        adjs[edge[0]].push_back(edge[1]);
    }
    bool result = true;
    for (int i = 0; i < numCourses; ++i) {
        if (!seen[i]) {
            result = result && dfs(adjs, seen, currentlyInStack, i);
        }
    }
    return result;
}

At this point I’m just referencing the internet…which I guess is what you’re supposed to do.

What we’re doing here is a form of topological sort. Typically with topological sort we are trying to output an ordering of vertices, but since we just care about the cycle detection we don’t bother with accumulating that output. This is a DFS-based variant of topological sort; you can cut down on memory consumption in the topological sort itself by using Kahn’s algorithm, but the asymptotic memory consumption is still $$O( V + E )$$ for the adjlist. Both variants are referenceable on Wikipedia.

I won’t bother writing yet another solution for this writeup; I think there’s a pretty clear explanation at https://leetcode.com/problems/course-schedule/solutions/4646958/video-topological-sort-visualization-to-detect-a-cycle/ if you are curious about the more memory-efficient way.

Leave a comment

Your email address will not be published. Required fields are marked. *

Loading...