Can I pet your DAG?

Posted on July 29, 2025

Directed acyclic graphs appear quite often when analyzing problems. One such problem is one I’ve recently come across under the moniker of workflow orchestration. The key problem here is that you have a certain set of tasks with dependencies with one another, which you need to schedule in a correct ordering. Let’s look at some ways you can phrase (a severely restricted variant of) the problem using some nifty Haskell.

Just another day in the life

Let’s assume you have a set of predefined tasks, say the things you were to do on a particular day. Role-playing as a high-schooler, you’d go to school and maybe take an exam. Going home, maybe your dad asked you to get some groceries. Maybe in the afternoon do some homework or watch TV.
data Task
  = GoOutside
  | GoToSchool | TakePhysics | TakeExam | GoHome
  | GetGroceries | UnpackGroceries
  | HaveLunch | DoHomework
  | HaveDinner | WatchTV | PlayVideoGames
  | GoToBed

There’s a notion of dependencies here. You can only take an exam, if you’re at school. Dinner, TV or video games can only be in the afternoon. Going to bed has to occur at the end of the day. Drawing an example subset of these dependencies out, shows that it can already get quite convoluted. I’ll use the following in the rest of this text. Of course the example is contrived, but it’s merely for exhibition purposes anyway.

These describe the dependencies between tasks. Now say we have Billy and Bobby. Billy has to go to school, go to Physics class, and take a Math exam today. While Bobby doesn’t have school today and only has to do some grocery shopping.

While Billy and Bobby could take their actions independently, let’s say we have to determine some global ordering between the actions they take, consistent with the dependencies between their actions. One could say Billy does all of his tasks first according to the above, and subsequently Bobby does all of his. But that would mean that Bobby is waiting at home, all the while Billy is productively at school. Instead, let’s force common nodes as synchronization points, i.e. Bobby will definitely have done the groceries while Billy is at school, and they’ll leave and get home at approximately the same time
It’s never taken me 6 hours to do groceries, but maybe Bobby is doing groceries for an orphanage.
Maybe they both hold one piece of a key, and the door to their house only opens if both of them are there. I personally wouldn’t want this lock, but I’ve heard it sells like hotcakes in BillyBobbyLand.
Did I mention that this is all extremely contrived? Like all great decisions, the idea for this blog post came from a wrong solution to a problem I had, now fitted to a nonexistent one.
.

While the previous convoluted graph phrased dependencies between tasks, the following shows a concrete set of given tasks that have to occur. Our main goal now is to find a way to operate on schedules of them that obey the dependency relationships between tasks.

Get me my baton

We can phrase dependencies as a Map Task [Task] along with a dependsOn function that emits whether two tasks are transitively dependent, as the following.
dependsOn :: Task -> Task -> Bool
dependsOn t1 t2 = case Map.lookup t1 dependencies of
  Nothing -> False
  Just deps -> t2 `elem` deps || any (\d -> d `dependsOn` t2) deps

If we wanted to find a global schedule between Billy and Bobby, one naive approach would be to do comparisons using this dependsOn function, adding an action to the schedule only if it has no dependencies remaining.
data Actor = Billy | Bobby
type Action task = (task, Actor)

billyTodo, bobbyTodo, allTodos :: [Action Task]
billyTodo = map (,Billy) [GoOutside, GoToSchool, TakePhysics, TakeExam, GoHome]
bobbyTodo = map (,Bobby) [GoOutside, GetGroceries, GoHome, UnpackGroceries]
allTodos = billyTodo <> bobbyTodo

scheduleNaive :: [Action Task]
scheduleNaive = reverse $ go [] allTodo
 where
  go acc (act@(t, _) : actions)
    | not (any (dependsOn t . fst) actions) =
        go (act : acc) actions
    | otherwise = go acc (actions <> [act])
  go acc _ = acc

This approach does a quadratic number of comparisons between the tasks in the set, however, as we compare each action to every other action. In this case, we can do a bit better.

Dependency relationships as we’ve phrased them here form a poset, some tasks can be transitively inferred to have to occur prior to others, but other combinations of tasks cannot, i.e. GetGroceries and TakeExam for example. We can turn this poset into a total ordering using a topological sort.

First we define a graph from our same adjacency Map, using the containersData.Graph. We can use the int-mapping associated with Enums to get the vertices of our graph.
dependencyDAG :: Array.Array Int [Int]
dependencyDAG =
  Array.array (fromEnum (minBound :: Task), fromEnum (maxBound :: Task)) $
    map (bimap fromEnum (map fromEnum)) $
      Map.toList dependencies

From the relative positions within a topological sort, we can then derive a total ordering of all tasks within our domain.
topologicSortedTasks = map toEnum $ Graph.reverseTopSort dependencyDAG
totallyOrderedTasks = Map.fromList $ zip topologicSortedTasks [0 ..]

newtype OrderedTask = OrderedTask Task deriving newtype (Eq, Show)
instance Ord OrderedTask where
  compare (OrderedTask t1) (OrderedTask t2) =
    (totallyOrderedTasks Map.! t1) `compare` (totallyOrderedTasks Map.! t2)

We can check this works on our example day in the lives of Billy and Bobby.
allTodosLinear = map (first OrderedTask) allTodos
scheduleLinear = sort allTodosLinear
-- > scheduleLinear == scheduleNaive ==
-- >  [ (GoOutside, Billy)
-- >  , (GoOutside, Bobby)
-- >  , (GoToSchool, Billy)
-- >  , (TakePhysics, Billy)
-- >  , (TakeExam, Billy)
-- >  , (GetGroceries, Bobby)
-- >  , (GoHome, Billy)
-- >  , (GoHome, Bobby)
-- >  , (UnpackGroceries, Bobby)
-- >  ]

Using OrderedTask, we can even maintain a priority queue on the next action to occur by turning it into a Set. Keeping with that idea, we can even interrupt the global schedule and push new sets of actions onto the queue, like Billy helping Bobby unpack groceries.
actionQueue = Set.fromList allTodosLinear

nextAction :: Set.Set a -> Maybe (a, Set.Set a)
nextAction = Set.minView

newActionQueue = Set.insert (UnpackGroceries, Billy) actionQueue
-- > Set.toList newActionQueue ==
-- >  [ ...
-- >  , (GetGroceries, Bobby)
-- >  , (GoHome, Billy)
-- >  , (GoHome, Bobby)
-- >  , (UnpackGroceries, Billy)
-- >  , (UnpackGroceries, Bobby)
-- >  ]

Wait, there’s two of them!

While nifty, using Set as a priority queue for a poset isn’t particularly useful. By turning the poset formed by the dependencies between tasks into a total ordering, we essentially force that no two actions can occur at the same time. Billy can’t have his exam, exactly while Bobby is doing groceries. Even better, using the above queue, we’d only have Bobby start doing groceries after Billy is done with his exam.

So we exactly lose what makes a poset so useful in phrasing parallelizeable tasks. In the nomenclature, Billy and Bobby’s tasklists are also knows as chains. Every element in a chain can be compared with one another. Antichains are the inverse. They are (sub)sets that contain elements of which no two elements can be compared with one another. Antichains are precisely the tasks that can be executed in parallel.

Less nifty is what I now end up with if I do want to preserve the poset, and want to maintain the minimal antichains in the current set of to-be-scheduled tasks.
data NextTask = NextTask
  { dependencyCounts :: HashMap.HashMap Task Int
  -- ^ Maintain how many tasks block each one currently in the schedule.
  , roots :: HashSet.HashSet Task
  -- ^ Keep track of tasks with no current dependencies.
  }

I’ll leave how to keep this datatype up to date to keep queries reasonably fast as an exercise to the reader. For a solution, check the appendix. It would’ve been pretty cool to have a use case where losing the poset-iness of a poset doesn’t hamper functionality, in which case turning it into a priority-queue via Set would have been very nice.

I personally really like the various Ord-specific functionalities Map and Set provide. They’re not that common in other languages as far as I know, and If you phrase your key types just right, you can do very nifty things and query some sets of facts quite efficiently.