May 01, 2005

Ending the HFSM Tyranny

HFSMs (hierarchical finite state machines) are way popular these days -- I'm guessing that most character-based games that come out today use them in some form or another for controlling behavior. And why not? HFSMs are a great way to think about and organize behavior. They're a great way to modularize your decision-making process.


The problem is that when you read the average AI post-mortem, the HFSM is generally talked about as BEING the AI, as if intelligence IS behavior. Which is, of course, nonsense. HFSMs are great, don't get me wrong. But I also think they're close to reaching the limits of their applicability. In Halo 2 we used an HFSM, and by the end, with over 100 behaviors teeming and crowding in one tree, all competing for exclusive expression on one meager channel (the agent's body), man, that thing was unmanageable.


I think one of the problems is that right now, we're leaning on behavior to accomplish far more than is actually appropriate, leading to trees that are huge and massively complicated. Worse still, we find all sorts of ways to cheat and bypass the beautiful modularity that is implied by any diagram you might see or draw. We do this because getting the kind of behavior we ultimately need is impossible without it.


My theory is that most of the time that we cheat, it's because we don't have the appropriate representations at hand. These representations are ones that might be constructed and maintained by quite another system than behavior, a world- or cognitive-modeling system.


That's another problem we have right now: our AIs suffer from massively impoverished mental representations. In many of the systems I've worked with, the cognitive model has consisted of little more than a list of objects the agent knows about, each object having a few simple parameters associated with it (such as current position, current speed, etc.). They contain no relational information (is-next-to), no grouping information (is-part-of, or is-made-up-of) and no real memory.
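
To make the contrast concrete, here is a minimal sketch of what a slightly richer model might look like. Everything in it (the `KnownObject` and `WorldModel` names, the triple format for relations, the history list) is my own illustrative assumption, not code from any shipped system. The flat object list is still there; relations and a crude memory sit alongside it.

```python
from dataclasses import dataclass, field


@dataclass
class KnownObject:
    # The "impoverished" part: one entry per object, a few simple parameters.
    name: str
    position: tuple
    speed: float = 0.0


@dataclass
class WorldModel:
    # Hypothetical richer model: the same object list, plus relations and memory.
    objects: dict = field(default_factory=dict)
    relations: set = field(default_factory=set)   # (subject, relation, other) triples
    history: list = field(default_factory=list)   # (time, name, position) sightings

    def observe(self, time, obj):
        """Record the latest state of an object and remember the sighting."""
        self.objects[obj.name] = obj
        self.history.append((time, obj.name, obj.position))

    def relate(self, subject, relation, other):
        """Store a relation such as is-next-to or is-part-of."""
        self.relations.add((subject, relation, other))

    def related(self, relation, other):
        """Names of everything that stands in `relation` to `other`."""
        return sorted(s for (s, r, o) in self.relations
                      if r == relation and o == other)

    def last_seen(self, name):
        """Most recent (time, position) sighting of `name`, or None."""
        for time, seen_name, position in reversed(self.history):
            if seen_name == name:
                return time, position
        return None
```

With something like this, a behavior can ask "who is part of the flock?" or "where did I last see that sheep?" instead of re-deriving the answer itself.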


It might sound like I'm suggesting a move towards more of a semantic network type knowledge model. In fact, my suggestion is less specific than that: I simply want to move towards richer, more useful knowledge models (plus, I don't think I've ever heard of work on real-time semantic networks, ones where knowledge is constantly updated and frequently changes. But it's also not a field I know much about, so if someone has a good reference, I'd love to see it).


What I don't think we should underestimate is how much easier BEHAVIOR will be with a good knowledge model. When I was at the Media Lab, we were working on a system where tracked object positions were represented with position-distributions over space rather than xyz points. The idea was that as the agent looked around (assuming the tracked object was not currently visible) he was invalidating (zeroing out) the bits of the distribution that he could see. As time passed, the probability tended to diffuse, spreading out in the areas that were hidden, reflecting the possibility that the target object could be moving (assuming it was that kind of target). The neat thing is that as the agent wandered about, this knowledge model was constantly updated, and so the agent's idea of the target's most likely place changed as certain hiding places were uncovered.

We gave the agent (in this case a simulated sheepdog, named Duncan) exactly two behaviors: approach a neutral home position (in this case, the simulated shepherd) and approach the target (in this case a sheep). What we got was a whole suite of search behaviors. In approaching the target, Duncan would approach the current most likely target position. Finding that position empty, he would try somewhere else (the NEXT most likely target position). In short, he conducted what seemed like a systematic search of his simulated world. He even displayed more interesting look-around behavior, as his eyes swept a simulated landscape, pushing bubbles of probability around with his eyes.
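
For the curious, here is a rough sketch of the idea on a small grid. This is not the Media Lab code: the grid representation, the `OccupancyMap` class, its method names and the four-neighbour diffusion rule are all assumptions I'm making for illustration.

```python
class OccupancyMap:
    """Grid of probabilities for where one unseen target might be (illustrative)."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        cells = width * height
        # No idea where the target is yet: spread probability evenly.
        self.p = [[1.0 / cells] * width for _ in range(height)]

    def _normalize(self):
        total = sum(map(sum, self.p))
        if total > 0.0:
            self.p = [[v / total for v in row] for row in self.p]

    def observe_empty(self, visible_cells):
        """Zero out cells the agent can see and that do not hold the target."""
        for x, y in visible_cells:
            self.p[y][x] = 0.0
        self._normalize()

    def diffuse(self, rate=0.2):
        """Leak a fraction of each cell's probability into its neighbours,
        reflecting that the target may have moved since we last looked."""
        new = [[0.0] * self.width for _ in range(self.height)]
        for y in range(self.height):
            for x in range(self.width):
                mass = self.p[y][x]
                near = [(x + dx, y + dy)
                        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                        if 0 <= x + dx < self.width and 0 <= y + dy < self.height]
                leak = mass * rate if near else 0.0
                new[y][x] += mass - leak
                for nx, ny in near:
                    new[ny][nx] += leak / len(near)
        self.p = new

    def most_likely(self):
        """Cell (x, y) with the highest probability; first one wins ties."""
        best = (0, 0)
        for y in range(self.height):
            for x in range(self.width):
                if self.p[y][x] > self.p[best[1]][best[0]]:
                    best = (x, y)
        return best
```

Each tick you would call `diffuse()`, then `observe_empty()` with whatever cells are currently in view, and the "approach target" behavior just heads for `most_likely()`. The behavior never needs to know that it is searching.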


This is really interesting in all kinds of ways, but the most interesting way is that the BEHAVIOR, per se, was extremely simple. It is satisfying, moreover, that this arrangement reflects the fact that "searching" is something that goes on in our conception of the world, not in our behavior. Whether Duncan wants to bite a sheep, eat a sheep, play with a sheep or bark at a sheep, he still always needs to FIND the sheep first -- the act of searching is merely an implementation detail. In the classical formulation however (where objects are xyz positions and the behavior tree is king) search would need to be listed explicitly as a strategy for all those things.
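
A toy illustration of that point, with made-up names (`most_likely`, `find_then`, and the place labels are all hypothetical): whatever Duncan wants to do to the sheep, the finding part is shared and falls out of the belief state.

```python
def most_likely(belief):
    """Place with the highest probability in a {place: probability} dict."""
    return max(belief, key=belief.get)


def find_then(action, belief, true_location):
    """Visit places in order of belief until the target turns up, then act.

    Returns the action taken (or None if the target was never found) and
    the list of places visited along the way.
    """
    belief = dict(belief)          # work on a copy of the agent's belief
    visited = []
    while belief:
        guess = most_likely(belief)
        visited.append(guess)
        if guess == true_location:
            return action, visited
        del belief[guess]          # looked here, nothing found: rule it out
    return None, visited
```

Calling `find_then("bark", ...)` and `find_then("bite", ...)` with the same belief produces the same search path. Only the final action differs, which is exactly why search doesn't need its own branch under every goal in the tree.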


In most of game AI (certainly FPS AI) we don't really feel the lack of such a general searching mechanism, because most of the time we're only interested in searching for one kind of thing for one kind of purpose: find an enemy so that we can shoot it. Still, I'm guessing that if such a representation were available, we'd find ways to use it -- like all good representations, it's good for solving a lot of kinds of problems.


But I shouldn't get too caught up in that. This position-distribution thing, something I called probabilistic occupancy maps, and which others have called particle filters, is really just an example (and a non-semantic-network model at that) of how a more powerful mental model leads to simpler behavior. And simple behavior is GREAT news for us, because let's face it, behavior is bloody complicated enough as it is.

Posted by naimad at 04:57 PM