Path Finding Tutorial

Path finding is a useful tool which can be applied to many types of games in a variety of situations. The tutorial covers a Java implementation of the A* ("A Star") path finding algorithm suitable for use in game gardens games.

 The paths discussed in this article are sets of coordinates defining movement through a coordinate space or graph. This is unrelated to the Path class used in Sprite movement. 

What does path finding do?
A graph is a common data structure consisting of a set of nodes containing data and edges connecting the nodes to each other. Path finding allows us to determine if and how we can reach one node from another. Additionally we can consider not only if two nodes are connected but also assign some form of cost to traveling between them. We can then search for not only the shortest path but also the cheapest, fastest, or safest.

Shortest path:
Usually we are interested in finding the shortest or most efficient path between two nodes, such as the shortest path between two tiles on a map. A board game may need to know if a piece can reach some tile and how many moves it would require to get there.

Flood fill:
If we ask a path finding algorithm to search for a path to an unreachable destination we won't get a result but we still get useful information. The set of nodes the algorithm explored trying to find a path gives us all the nodes that are reachable from our starting location. If our graph represents a map we can use this to identify if two land masses are connected or find all the locations which are part of a lake.

Decision making:
Our graph does not need to represent a set of physical locations. Instead suppose each node represents some form of technology in our game's tech tree. We can use a path finding algorithm as part of our AI to determine the cheapest series of upgrades requires to reach a specific technology level.

The A* Algorithm
The A* ("A star") algorithm has three important properties:
 * It will always return the least expensive path if a path exists to the destination, other algorithms may find a path faster but it is not necessarily the "best" path we can take.
 * A* uses a heuristic (a "guess") to search nodes considered more likely to lead to the destination first, allowing us to often find the best path without having to search the entire map and making the algorithm much faster.
 * A* is based on the idea that each node has some cost associated with it. If the costs for all nodes are the same then the best path returned by A* will also be the shortest path but A* can easily allow us to add different costs to moving through each node.

A* creates two lists of nodes; a closed list containing all the nodes we have fully explored, and an open list containing all the nodes we are currently working on (the perimeter of our search). Each node will have 3 values associated with it; F, G, and H. Each node will also need to be aware of its parent so we can establish how we reached that node.
 * : the exact cost to reach this node from the starting node.
 * : the estimated(heuristic) cost to reach the destination from here.
 * : As the algorithm runs the F value of a node tells us how expensive we think it will be to reach our goal by way of that node.

Heuristics
Selecting an appropriate heuristic is critical in determining the performance A* can achieve.

Ideally we would select a value of H exactly equal to the cost of reaching our destination. If we can do so then A* will only follow the best path and never waste time exploring extra nodes. Of course we don't normally know the exact cost to reach our goal, finding it is the reason we are running a path finder in the first place. We can choose a method which will give us the exact value some of the time, such as when traveling in a straight line with no obstacles, and A* will be perfectly efficient in such cases.

If we choose a value for H greater than the actual cost of reaching our goal we will allow A* to search faster but less accurately and we can no longer be certain of finding a path to the goal. Therefore we normally want to make certain that H is never accidentally greater than the real cost.

If we select a value of H less than the actual cost A* will always find the best possible path. However the lower our value of H the longer A* will take to complete its search. In the worst case of H = 0 our A* will give the same performance as Dijkstra's algorithm

A* Examples
Suppose we have a simple graph represented as a grid. Each node is connected to the four nodes immediately surrounding it, diagonal movement is not allowed. We want to use A* to find a path between two nodes on this graph. For our heuristic we can compute the exact distance we need to travel, assuming we do not hit an obstacle.

Initial heuristic, the Manhattan distance:
Initially we define starting and ending positions for our path. We begin with only the starting node in the open list so we move it to the closed list and add its neighbors to the open list.

The node immediately to the right of our starting node is closest to the goal and therefore has the lowest F value of the 4 nodes in our open list so we will consider it next. We repeat the same process as before, moving it to the closed list and its neighbors to the open list.

The process repeats until eventually the first node in our open list is our destination.

We then trace a path back along each node's parent to construct our final path.

In this case our heuristic was exact so our search never deviated from the path but what happens if our path does not follow a straight line?

Any combination of 6 moves to the right and 4 moves down will reach the goal so our path finder just too the first one it found. Technically this path is as efficient as any we can hope to find but it doesn't look quite right. We might prefer a path that tends to follow a straight line towards the goal. We will choose a new heuristic and see if we can get better performance.

Improving the heuristic, Euclidean distance:
If we try running our first test case again we get the exact same path, a straight line without any unnecessary searching. When we try a diagonal path we get something new.

Now our path finder favors a path along a straight line but it has to explore a larger area to find that path so it will run a little slower. For now this is good enough.

Dealing with terrain:
What happens when we run into an obstacle? So far every node on our graph had a cost of 1 to move into it. Let's add some nodes with a much higher movement cost to the middle of our graph. A* has to explore around obstacles to find the best path but notice that our final path never doubles back.

It is also worth mentioning that while the path will go around obstacles they will not stop the it entirely. If we leave no other option, or if going around an obstacle is even more expensive than crossing it, A* will create a path through high cost terrain.

If we want to add strictly impassable terrain we will need to prevent A* from considering that node entirely. Depending on our game it might also be helpful to stop our path finder if it cannot find a path less than some maximum cost. This would both prevent the path finder from running for an extended period of time and from returning incredibly long or expensive paths.

Further heuristic improvements; efficiency, tie breaking:
Using the Euclidian distance as our heuristic looks better than the Manhattan distance but it also explores unnecessary nodes.

Our path finder is exploring some nodes above and to the left of our starting node, in the opposite direction of our goal. Our heuristic is underestimating the cost to reach the goal to such an extent that nodes near the start have a lower F value than nodes near the goal.

We have a second concern as well. If our path finder reaches a position where two equal paths exist to the goal it will explore both of them. To save time we would prefer that A* explore only one of these paths if possible.

Implementing A*
The A* Tutorial game provides a set of abstract classes and interfaces to support A* path finding. Using the provided code it should be possible to add path finding to most games with minimal effort. See the section below for some cases where modification of the provided code may be necessary.

Interface GenericMap
Defines global attributes of the graph used for path finding. An instance of GenericMap need not be aware of the contents of its nodes but is responsible of the overall dimensions of the graph, determining edges between nodes, and calculating distances between nodes for use in the A* heuristic. GenericMap was designed with regular graphs in mind, where edges follow a consistent pattern, and as such is not a distributed object. If your map needs to create asymmetrical edges between nodes or needs to create/destroy edges dynamically you will need to create a distributed object to synchronize these changes between clients.

Class AStarNode
A storage class for the A* algorithm. Contains the F, G, H values, location, parent, and cost for a node. Each instance of AStar will create an AStarNode for each MapNode, allowing multiple instances of AStar to search for paths across the same graph.

Interface MapNode
A storage class for a single node of the graph used for path finding. Each node need not be aware of its neighbors, only of its own attributes which determine path finding costs and its unique location. Each node on the graph should be represented by a single MapNode which in turn will produce an AStarNode for each instance of AStar operating on the graph. MapNode implements DSet.Entry to allow the graph to be easily shared with the game's clients.

Class AStar
Handles the A* algorithm itself. Given a GenericMap, a collection of MapNodes defining the graph, start and end MapNode an instance of AStar will compute the appropriate A* path either as a single operation or incrementally. AStar allows read access to the open and closed lists during path finding and to the final path and closed list once a path has been found.

Using the A* Tutorial Classes

 * 1) Create a class implementing MapNode to store and synchronize the state of your game's map. See TutorialMapNode in the A* Tutorial for an example.
 * 2) Create a sprite class to display each MapNode if needed.
 * 3) Create a class implementing GenericMap to define your game's map attributes. See GridMap in the A* Tutorial for an example.
 * 4) At run time, create an instance of AStar for each concurrent path finder you want to run.
 * 5) Use one of these methods to calculate a path:
 * 6) Call AStar.findPath once to search for a path as a single (potentially time consuming operation).
 * 7) Call AStar.setPath to set up your search. Call AStar.processPath until AStar.pathComplete indicates a final path has been found. Changes to MapNodes, specifically to nodes appearing in either the open or closed list necessitate restarting the path finder. Changes to the GenericMap will be ignored by instances of AStar, recreate the path finder if the game map's global attributes change.
 * 8) Use Astar.getClosedList and AStar.getPath to obtain whichever results are appropriate for your game.

Further Improvements
Given that this requires modifying the A* algorithm it was omitted from the tutorial in order to keep the provided A* class as generic as possible. Should the need for such a tie breaker prove to be commonplace a tie breaking method should probably be added to GenericMap.
 * The heuristic used in the tutorial is not necessarily the best choice for a Cartesian grid. Most games using such a grid should would do better using the Manhattan distance as a heuristic and add an additional tie breaker in the A* algorithm itself. Of the nodes tied for the lowest F value in the open list select the node with the lowest Euclidean distance.


 * The A* Tutorial game does not include support for impassable nodes. To implement impassible nodes either remove the corresponding MapNode from your map(so it can not be added to the open list) or modify A* to ignore MapNodes flagged as impassable.


 * A* is in now way limited to Cartesian grids. See Blockade Runner for an example of A* path finding on a hexagonal grid.