Part 1 — Path Finding

The fundamental problem of pathfinding is finding the shortest path between two points. Specifically, in pathfinding, we assume that we want to solve the problem for a single agent. Also, we assume that:

1. Time is discretized and actions can only be taken one at a time.
2. Actions are deterministic and durationless, i.e. applying an action is immediate and it would change the state by a deterministic function.
3. We have full knowledge about the world, i.e. where obstacles are located, how many points there are, etc.

Surely, there are other assumptions we make (i.e., the agents are cooperative and not self-interested). However, I aim to only explicitly write the assumptions that we will soon try to drop.

A classical algorithm in AI for solving the pathfinding problem is A*. Given a weighted graph, A* maintains a tree of paths originating at the start node and extends those paths according to a heuristic one edge at a time until reaching the goal. If graphs, heuristics, and A* are new to you, follow the links for a great introduction for graphs and A*.

Generalizing to Multi-Agent Path Finding (MAPF)

A* solves the problem of single-agent pathfinding. But, in many real-world applications, we deal with multiple agents. Therefore, MAPF illustrates the problem of finding a shortest path for all agents to reach their goals without collisions (i.e., no couple of agents can be in the same location at the same time).

What do we mean by “shortest”?

For a single agent, we looked for the minimal number of actions it needs to perform to reach its goal. For multiple agents, there are two main metrics for path length in use today:

1. Sum Of Costs — As it sounds, simply sum the cost of all agents’ paths.
2. Makespan — Take the maximal single-agent cost.

Clearly, optimizing for each metric will result in different shortest path solutions. I will focus on algorithms that optimize for the Sum Of Costs.

Exponential search-space

Fortunately, A* time complexity is polynomial (given a good heuristic). Generally, in computer science, a polynomial running time is considered efficient. Can we generalize it to multiple agents and keep the polynomial time complexity?

First, we need to remember about A* branching factor. The branching factor of a graph is the number of children at each node. For A* with a single agent, where the agent performs one action (out of b actions) at each time step, the branching factor is b. For A* with…