November 08, 2005

Great Expectation-Formations

I’ve been thinking a lot lately (and by “lately”, I mean “for the past 5 years or so”) about AI and expectation-formation. One of my original posts (on the FSM tyranny) dealt peripherally with the subject, discussing “Occupancy Maps”, which I spun at the time I was writing my thesis as a form of location-expectation (and which is the subject of my article for AIGPW3).

Clearly I talk a lot about the Media Lab’s Synthetic Characters group, mostly because I think we were onto a couple of really cool/important ideas there that, unfortunately, never got explored as much as they could/should have been (damn you, graduation). Expectation-formation was one of those subjects.

(For those who don’t want to wade through the nonsense below, there is an important question for game developers at the bottom of this post.)

Forming expectations is a problem that is relatively easy to concretize: AIs have an internal knowledge model of the world. If the AI is able to provide an estimate for what the state of that model is going to look like n ticks in the future, that’s an expectation – and naturally we’re going to want the AI to react not (or not just) to current world-state, but also to its expectation. React to its own expectation. I think that’s a neat idea, and architecturally it makes a lot of sense. Assuming we have a reactive behavior system of some sort already, we don’t, as a first pass, need to modify IT at all. All we need to do is provide future instead of current data. Great!
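
To make that concrete, here is a minimal sketch of the "feed it future data" idea. Everything in it is an assumption made up for illustration (the `WorldState` fields, the constant-velocity `predict`, the one-rule `react`); it is not taken from any real behavior system. The point is only that `react` does not change, it just gets handed a different state.

```python
from dataclasses import dataclass


@dataclass
class WorldState:
    player_x: float   # hypothetical 1-D player position
    player_vx: float  # hypothetical velocity, in units per tick


def predict(state: WorldState, n_ticks: int) -> WorldState:
    """Naive expectation: assume the player keeps its current velocity."""
    return WorldState(state.player_x + state.player_vx * n_ticks, state.player_vx)


def react(state: WorldState, guard_x: float = 10.0) -> str:
    """An existing reactive rule. It knows nothing about prediction."""
    return "raise_alarm" if abs(state.player_x - guard_x) < 2.0 else "idle"


now = WorldState(player_x=0.0, player_vx=1.0)
reaction_to_present = react(now)                  # player is far away right now
reaction_to_expectation = react(predict(now, 9))  # same rule, fed the expected state
print(reaction_to_present, reaction_to_expectation)
```

The constant-velocity guess is about the dumbest predictor possible; any better one could be swapped in behind `predict` without the reactive layer noticing.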

So where does the complication come in? Part of it – an oft-overlooked part of it – is in what I would call expectation-management. I can make a prediction about the world, but how is that prediction maintained over time in the face of new, potentially contradictory information? If our prediction is proved false (I expected to see the player at spot X, but I don’t see them there), what do we fall back on? (This is, again, what I was primarily dealing with in Occupancy Maps.)
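
One hedged way to picture the "what do we fall back on" problem: keep a belief over several candidate locations instead of a single guess, and when a location is seen to be empty, rule it out and redistribute. The sketch below is a stand-in written for this post, not the actual Occupancy Map method; the cell names and the `rule_out` helper are hypothetical.

```python
def rule_out(belief: dict, visible: set) -> dict:
    """Zero out cells we can see are empty, then renormalize what is left."""
    remaining = {cell: (0.0 if cell in visible else p) for cell, p in belief.items()}
    total = sum(remaining.values())
    if total == 0.0:
        # Every candidate was ruled out: fall back to knowing nothing.
        return {cell: 1.0 / len(belief) for cell in belief}
    return {cell: p / total for cell, p in remaining.items()}


# The agent expected the player at X, looked there, and saw nothing.
belief = {"X": 0.5, "Y": 0.375, "Z": 0.125}
belief = rule_out(belief, visible={"X"})
next_best_guess = max(belief, key=belief.get)
print(belief, next_best_guess)
```

The falsified prediction does not leave the agent empty-handed: the probability mass that sat on X moves to the cells it has not checked yet, and the next-best guess drops out for free.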

The larger and more obvious question is, how do we generate predictions in the first place?

I’m going to be Media Lab-centric again in pimping the work of my friend and former colleague Rob Burke, who did an amazing thesis on this topic called It’s About Time: Temporal Representations for Synthetic Characters. In it, Rob systematized a lot of our ideas about expectations and learning, blurring in a great way the boundary between the two, and also with the very title pointed out what the real deficit is in current approaches: the lack of representations of time.

(If you don’t want to read Rob’s entire thesis, even though you should, check out a shorter version he wrote as a chapter for LifeLike Characters.)

HIS approach, in true SC fashion, was taken directly from the animal learning/psychology literature, in particular the work of Gallistel (check out Time, Rate and Conditioning). Here’s the basic basic basic idea: experience is a soup of stimuli, some inherently salient, some subjectively salient (because they’ve been harbingers of good or bad things in the past) and some irrelevant. Learning is about finding correlations between these stimuli -- correlations that are temporal in nature. I.e. not just “A follows B” but “A follows B after 3 seconds”. This gives us a basis for rejecting hypotheses, which we might do if, say, A still has not happened 10 seconds after B. AIs in Rob’s world would therefore keep around a list of hypotheses about temporal correlations, and those hypotheses would compete to explain the ongoing state of the world. Hypotheses that were not successful long-term would be culled; those that were would be reinforced.
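
A very rough sketch of what "a list of temporal hypotheses that get reinforced or culled" could look like in code. To be loud about it: this is plain hit/miss counting invented for illustration, not the statistical model from Rob's thesis or from Gallistel. The class, its fields, and the thresholds in `cull` are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TemporalHypothesis:
    cue: str          # the stimulus that starts the clock
    outcome: str      # the stimulus expected to follow it
    delay: float      # expected seconds between cue and outcome
    tolerance: float  # how far off the timing may be and still count
    hits: int = 0
    misses: int = 0

    def update(self, observed_delay: Optional[float]) -> None:
        """Record one trial. None means the outcome never showed up."""
        on_time = (observed_delay is not None
                   and abs(observed_delay - self.delay) <= self.tolerance)
        if on_time:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def reliability(self) -> float:
        trials = self.hits + self.misses
        return self.hits / trials if trials else 0.0


def cull(hyps: List[TemporalHypothesis], min_reliability: float = 0.5,
         min_trials: int = 3) -> List[TemporalHypothesis]:
    """Drop hypotheses that have had a fair number of trials and keep failing."""
    return [h for h in hyps
            if h.hits + h.misses < min_trials or h.reliability >= min_reliability]


b_then_a = TemporalHypothesis(cue="B", outcome="A", delay=3.0, tolerance=1.0)
for observed in (3.2, None, 10.0):   # one hit, one no-show, one far too late
    b_then_a.update(observed)

lever_food = TemporalHypothesis(cue="lever", outcome="food", delay=2.0, tolerance=0.5)
for observed in (2.1, 1.9, 2.0):     # food keeps arriving on schedule
    lever_food.update(observed)

survivors = cull([b_then_a, lever_food])
print([h.cue for h in survivors])
```

Note that the timing is doing real work here: an outcome that arrives 10 seconds after the cue counts against a "3 seconds" hypothesis just as much as one that never arrives.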

(Somewhere out there Rob Burke’s ears are burning as I misrepresent or dramatically oversimplify his work. Rob, jump in with any corrections/clarifications/comments!)

This is, of course, basically an implementation of conditioning, and the Gallistel/Burke model also accounts for some of the secondary effects of conditioning, like blocking and overshadowing. They also make the strong argument that classical conditioning (a tone is followed by an electric shock) and operant conditioning (if the AI DOES X, the result is Y) are the same thing (hence the blurring of the line between learning and action-selection). I think lots of game-AI developers will be familiar with the basics of the latter (learning is almost always taken to involve some action on the part of the agent), but I suspect that the former is a little less familiar (why learn about correlations between events that don’t involve me?).

One of my favorite parts of Rob’s thesis was his discussion of a “curiosity drive” – basically the idea that the desire to figure out which correlation-hypotheses are correct can be an end unto itself. Thus if my agent has a recently-formed rule (based on one observation) that says “if I pull the lever, food appears 2 seconds later”, the agent, having eaten its fill (and therefore no longer being driven by hunger) might well pull the lever a few more times, just to see whether its hypothesis about the relationship between the lever and the food really holds. Awesome idea, huh?

(An aside: I think a common theme that runs through many good works on expectations is that it is never enough to simply annotate a hypothesis with “amount of expected reward” – we also have to provide some degree of confidence that the hypothesis we’re considering is valid at all! To combine our confidence with expected reward is a fatal oversimplification that conflates two very different concerns. The curiosity drive discussed above would, for example, specifically look for high-reward, low-confidence hypotheses. If we can turn that into high-reward high-confidence, we’re golden!)
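
Here is a small sketch of why keeping reward and confidence as two separate numbers matters. The `Hypothesis` fields and both scoring functions are made up for this example; in particular, `reward * (1 - confidence)` is just one plausible stand-in for a curiosity score, not the formulation from Rob's thesis.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    expected_reward: float  # payoff if the hypothesis turns out to be true
    confidence: float       # 0..1, how sure we are that it is true at all


def exploit_value(h: Hypothesis) -> float:
    """What a hungry agent cares about: payoff discounted by doubt."""
    return h.expected_reward * h.confidence


def curiosity_value(h: Hypothesis) -> float:
    """What a curious agent cares about: big payoff that is still in doubt."""
    return h.expected_reward * (1.0 - h.confidence)


lever = Hypothesis("pull lever -> food", expected_reward=10.0, confidence=0.2)
berries = Hypothesis("shake bush -> berries", expected_reward=2.5, confidence=0.8)

# Collapsed into one number, the two hypotheses look identical...
same_when_collapsed = abs(exploit_value(lever) - exploit_value(berries)) < 1e-9

# ...but the curiosity drive can tell them apart and goes for the lever.
most_interesting = max([lever, berries], key=curiosity_value)
print(same_when_collapsed, most_interesting.name)
```

If the agent had stored only the pre-multiplied number, the once-observed lever rule and the well-tested berry rule would be indistinguishable, and there would be nothing for a curiosity drive to latch onto.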

Of course, this being a blog for game developers, we come back to the 23.2 billion dollar question: how do we work the idea of expectation-formation (through whatever mechanism) into games? And like all such pricey questions, my answer is “I dunno, but I’m thinking about it.”

One way I like to tackle such problems is to pull up the Halo source code, squint my eyes and tilt my head and ask myself, what are some examples of the phenomenon that we’ve ALREADY implemented (probably in an informal way)? Having made a short list, can we generalize on them? Maybe make a separate system to handle them? Extend the use of that system to other arenas? Call it revisionism if you want, but I think it’s a useful way to catalog the examples of problems (in this case expectation-related problems) that we HAD to solve, and therefore avoid wandering into an academic-style wouldn’t-it-be-cool-if la-la land.

Well immediately a few examples come to mind (and they’re probably going to sound stupid, but I think they still count): how about leading a target when firing a burst of slow-moving projectiles? Or, the fact that target positions are cached in the brain of each individual agent, thereby providing an “expectation” about the target’s position even when the agent is no longer observing it?
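
For the first of those, leading a target really is a tiny prediction: "where will the target be by the time my projectile gets there?" The sketch below is the textbook constant-velocity intercept calculation in 2-D, written for this post. It is not Halo code, and the function name and argument layout are assumptions.

```python
import math
from typing import Optional, Tuple

Vec = Tuple[float, float]


def lead_target(shooter: Vec, target: Vec, target_vel: Vec,
                projectile_speed: float) -> Optional[Vec]:
    """Return the point to aim at, or None if the projectile can never catch up.

    Solves |p + v*t| = s*t for the smallest t > 0, where p is the target's
    position relative to the shooter, v its velocity, s the projectile speed.
    """
    px, py = target[0] - shooter[0], target[1] - shooter[1]
    vx, vy = target_vel
    a = vx * vx + vy * vy - projectile_speed * projectile_speed
    b = 2.0 * (px * vx + py * vy)
    c = px * px + py * py

    if abs(a) < 1e-9:
        # Target and projectile have equal speed: the equation is linear.
        if abs(b) < 1e-9:
            return None
        t = -c / b
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return None
        root = math.sqrt(disc)
        times = [t for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)) if t > 0.0]
        if not times:
            return None
        t = min(times)

    if t <= 0.0:
        return None
    # The expectation: the target's position t seconds from now.
    return (target[0] + vx * t, target[1] + vy * t)


# Target 10 units to the right, strafing upward at 5 units/s; projectile speed 10.
aim = lead_target((0.0, 0.0), (10.0, 0.0), (0.0, 5.0), 10.0)
# Target running straight away at twice the projectile speed: no solution.
hopeless = lead_target((0.0, 0.0), (10.0, 0.0), (20.0, 0.0), 10.0)
print(aim, hopeless)
```

The "expectation" is the assumption baked into the last line of the function: that the target keeps doing what it is doing right now. When the player jukes, that expectation is wrong, which is exactly the kind of miss a player can read and exploit.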

So I promised an important question and this is it: What are some examples of expectations or predictions made by agents in your game, or games you’ve played? I’m interested to know, because I think, as I’ve said before, that expectation-formation is important, perhaps even essential to our own intelligence – and I think it’s also REALLY intuitively perceived and understood by a player, since the player is making the same kinds of predictions – perhaps even about the same things – as they run around the world killing things.

Posted by naimad at 10:41 PM | Comments (22)