Introduction. 
In this paper we link dynamic programming, game-theoretic and
behavioral simulation approaches to the same problem of economic
exchange.
The size and complexity of the strategy sets for even a simple infinite
horizon exchange economy are so overwhelmingly large that it is
reasonably clear that individuals do not indulge in exhaustive search over
even a large subset of the potential strategies. Furthermore, unless one
restricts the unadorned definition of a noncooperative equilibrium to a
special form such as a perfect noncooperative equilibrium, almost any
outcome can be enforced as an equilibrium by a sufficiently ingenious
selection of strategies. In essence, almost anything goes, unless the
concept of what constitutes a satisfactory solution to the game places
limits on permitted or expected behavior. Such limits presume that the
players follow the same introspective process as the game theorist. As
these refinements may be hard to justify, it is interesting to complement
this introspective approach with a study of whether interactive market
processes provide enough structure to tie down the set of strategies
played.
Karatzas, Shubik and Sudderth [1992] formulated a simple infinite
horizon economic exchange model involving a continuum of agents as
a set of parallel dynamic programs, and were able to establish the
existence of a stationary noncooperative equilibrium. In order to obtain
an explicit closed form solution for the optimal policy and equilibrium
wealth distribution, they rely on a particular utility function. To
match these analytical results with a behavioral approach, we first
develop simulation models of market processes with agents learning
through reinforcement. Second, we consider more general classes of
utility functions.
J.E.L. classification codes. C61, C63, C72, C73, D83, D91
Keywords. Market game, dynamic programming, classifier
system, adaptive behavior