When it comes to pathfinding in Unity, you can roll your own, or use one of two publicly available resources. The most drool-worthy is AngryAnt’s Path. Having gotten way too familiar with the problem of pathfinding myself, and wanting a better workflow and richer capabilities than my home-grown solution affords me, I decided to poke around with AngryAnt’s code last night.
What I found was that there were definitely a few spots that could be improved pretty considerably. Some of the changes are easy, so I went ahead and made them. This being GPLed code, I've made my improvements available.
The first few commits on the branch are just tweaks and adjustments related to changes in Unity 2.6 plus integration of the bits and pieces of work done by others. Then I proceed to triage some editor UI issues. This commit makes some aesthetic improvements including telling Unity to render the in-editor UI in a way that is a bit less buggy. This one lets you edit the position values for a waypoint instead of having to drag the scene-view handle. These two fix some edit-time leaks where Texture2D and PathEditor objects get leaked into the scene. (This isn’t a problem as Unity will helpfully clean these up for you, but it’s annoying to get the info notices every time you save your scene for the remainder of the editor session.)
That’s all well and good, but really what I care about is the performance of the runtime itself. For this, I set up a test scene that has 72 “seekers” randomly pathfinding across a navmesh of roughly 1,800 cells.
So the real fun starts with this commit. The performance of an A* implementation is dependent upon several key things, most notably the performance characteristics of the open list and the closed list. The naive approach to A* uses some sort of list structure for the closed list, which involves an O(n) traversal to determine whether a node is in fact on that list. A better solution is to use a hash table, which offers O(1) average-case search time. This one simple change resulted in a speedup of ~2.5x in my test scenario. In practice, this improvement will help even more with graphs that have more obstructions and thus a higher branching factor (the typical performance of A* itself being essentially O(b^d), where "b" is the branching factor and "d" is the depth of the solution).
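The gist of that change, sketched here in plain Java rather than Path's actual C# (the node values are invented for illustration), is simply swapping the collection type behind the membership test:

```java
import java.util.*;

public class ClosedSetDemo {
    public static void main(String[] args) {
        // Naive approach: a list, where contains() is an O(n) scan.
        List<String> closedList = new ArrayList<>(List.of("a", "b", "c"));
        // Hash table: contains() is O(1) on average.
        Set<String> closedSet = new HashSet<>(closedList);

        // Both answer the same question; only the cost per query differs,
        // and this query runs once per node expansion in the A* inner loop.
        System.out.println(closedList.contains("b")); // O(n): true
        System.out.println(closedSet.contains("b"));  // O(1): true
    }
}
```

Because the closed-list check sits in the hottest part of the search, this is about as cheap a win as optimization gets.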
The next easy win comes in this commit. A key part of the A* search is finding the element on the open list that has the lowest cost. I didn't bother going to the effort of implementing one of the more sophisticated approaches (such as using a red-black tree) but instead simply reduced AngryAnt's current approach to something a bit less costly albeit fundamentally similar. The existing code takes all the PathData objects, puts them into a list, sorts this list, then pulls out the first element. The work of sorting the list is not saved and reused across multiple loop iterations, so effectively the code doesn't "care" about anything EXCEPT the lowest-cost element. Saving and reusing the results of the sort across iterations is a more complicated change and not worth the effort (versus, say, using a better open list representation), but at a minimum we can skip the sort entirely and just do a linear search of the list. This reduces us from a typical-case performance of O(n log n), and a worst case of O(n²), to a guaranteed O(n). Not great, but much, much better. The end result is a further 4x speedup in my test scenario.
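The replacement is just a single linear pass over the open list. Here's a sketch in plain Java (the PathData stand-in and its totalCost field are invented for illustration, not Path's actual types):

```java
import java.util.*;

public class OpenListScan {
    // Stand-in for Path's PathData: only the cost field matters here.
    static class PathData {
        final String node;
        final double totalCost;
        PathData(String node, double totalCost) {
            this.node = node;
            this.totalCost = totalCost;
        }
    }

    // Instead of sorting the whole open list (O(n log n) typical case)
    // just to take its first element, one O(n) pass finds the cheapest.
    static PathData cheapest(List<PathData> open) {
        PathData best = open.get(0);
        for (int i = 1; i < open.size(); i++) {
            if (open.get(i).totalCost < best.totalCost) {
                best = open.get(i);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<PathData> open = new ArrayList<>(List.of(
            new PathData("a", 5.0),
            new PathData("b", 2.5),
            new PathData("c", 4.0)));
        System.out.println(cheapest(open).node); // b
    }
}
```

A binary heap would do better still, but as the text says, the point here was the cheapest change that removes the sort entirely.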
I would still call the A* implementation here quite naive but with a net 10x improvement in performance for large search graphs / many seekers, it’s certainly much more usable.
The next issue to address is memory. AngryAnt’s code is quite fond of allocating temporary objects – notably small arrays, and collection objects. In Microsoft’s implementation of .Net, or recent iterations of Java, this would be a complete non-issue as both Sun and Microsoft have invested heavily in multi-generational garbage collection systems. A key feature of both is the “Eden” or “Young” generation, which is implemented as a copying collector over a small pair of memory pools. They take advantage of the observation that the younger an object is, the more likely it is to be garbage. The end result is that scratch objects become very nearly “free”. Mono does not have such a collector.
In my tests, the 72 seekers typically allocate around 38MB of RAM during their initial path computation. Now, in practice this may not be nearly as bad as it sounds – you’re usually not going to have nearly so many seekers all hitting the pathfinder at once but the point of this test is to gauge the worst-case performance of Path for the sake of shoring it up.
I have made a slew of experimental changes aimed at reducing this (all on the liposuction branch), although I’m less confident about the correctness of these changes so I don’t recommend them for Real Use just yet. The end result of these changes is a reduction in memory usage of around 70% in my test scenario, although that’s still 11MB for 72 seekers which seems quite high.
These changes boil down to three categories:
- Do less work by caching the results of complex operations that do a lot of allocations. It’s these changes that are most likely to introduce bugs and cause me to consider this “experimental”.
- Avoid using foreach against collection objects (where possible; not all collections can be subscripted), as this allocates an enumerator. Doing foreach against simple arrays is fine. There's theoretically a slight cost in CPU time to using subscripting to traverse some collections, but in practice this seems to be outweighed by less time spent allocating / garbage collecting.
- Change the PathData structure to not keep a full copy of the path but instead just keep a reference to the next PathData object in the path. This is a fairly standard way of doing things in A*. The gotcha is that because this results in a linked list from destination to origin, you still need to build a list at the very end that's in the proper/useful order. For the sake of expedience, I used Stack<T> to do this, but that really ought not to be necessary. Plus, though I don't recall for certain whether Stack<T> does so, referring to Stack winds up causing Unity to need to pull a lot more of Mono into your builds, which makes this much less desirable.
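The linked-path idea in the last bullet looks roughly like this, sketched in plain Java rather than Path's C# (the PathData stand-in and node names are invented; a Deque plays the role of Stack<T>):

```java
import java.util.*;

public class PathReconstruction {
    // Stand-in for Path's PathData: rather than carrying a full copy of
    // the path so far, each node just links back toward the origin.
    static class PathData {
        final String node;
        final PathData previous;   // link toward the origin
        PathData(String node, PathData previous) {
            this.node = node;
            this.previous = previous;
        }
    }

    public static void main(String[] args) {
        // Simulate a finished search: the links run destination -> origin.
        PathData origin = new PathData("start", null);
        PathData mid    = new PathData("mid", origin);
        PathData goal   = new PathData("goal", mid);

        // Walk the links, pushing onto a stack so the path pops out in
        // origin -> destination order.
        Deque<String> stack = new ArrayDeque<>();
        for (PathData p = goal; p != null; p = p.previous) {
            stack.push(p.node);
        }
        System.out.println(stack); // [start, mid, goal]
    }
}
```

The stack is only needed because the links run the "wrong" way for the caller; nodes no longer duplicate the path prefix at every expansion.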
There’s still a ton more that could be done on just the runtime. Some things I observed while nosing around the code:
- Too much data being serialized, resulting in .asset files that are larger than they need to be. Notably, TriangleAsset serializes a reference to its containing CellAsset, but this could be synthesized when deserializing the CellAsset. There's also a LOT of memory allocation happening for deserialization in general, but I don't know if there's much to be done about that.
- TriangleAsset uses an array of Vector3. Since it's unlikely that triangles will suddenly have a fourth vertex in the foreseeable future, hard-coding this to three local Vector3 members would eliminate a bunch of small array allocations in operations such as creating a navmesh and, more importantly, in finding the nearest navmesh node when pathfinding.
- Modifying PathData to be bidirectionally linked and simply populating the forward direction at the end of the search would obviate the need for a Stack<T>, eliminate a List<T> allocation per (successful) search, and slightly reduce CPU time spent preparing the solution to be usable.
The next area of inquiry is the editor side of Path, which I haven't even begun to dig into yet. One thing I did observe was that Browser quickly becomes a massive performance bottleneck once one has a non-trivial number of nodes in a network (say, ~1,800). This appears to be due to it doing a LOT of memory allocations and copies in all GUI phases (layout, input [if an input has happened], and render). Unfortunately, the way the code is written does not make this terribly amenable to an elegant fix. The only opportunity I see is aggressive caching and corresponding cache invalidation, which I just don't have time to implement properly right now. Perhaps if I wind up using Path for anything "real", I'll revisit this. In the meantime, I leave it as an Exercise For The Reader.