Cooperation and coordination between fuzzy reinforcement learning agents in continuous state partially observable Markov decision processes
01 August 1999
We consider a pseudo-realistic world in which one or more opportunities appear and disappear in random locations. Agents use fuzzy reinforcement learning to learn which opportunities are most worthy of pursuing based on their promised rewards, expected lifetimes, path lengths and expected path costs. We show that this world is partially observable because the history of an agent influences the distribution of its future states. We implement a coordination mechanism for allocating opportunities to different agents in the same world. Our results show that optimal team performance results when agents behave in a partially selfish way. We also implement a cooperation mechanism in which agents share experience by using and updating one joint behavior policy. Our results demonstrate that K cooperative agents each learning in a separate world for N time steps outperform K independent agents each learning in a separate world for K*N time steps, with this result becoming more pronounced as the degree of partial observability in the environment increases.
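The cooperation mechanism above (several agents reading and writing one joint behavior policy, rather than each maintaining its own) can be illustrated with a minimal sketch. This is not the paper's implementation: it uses a plain epsilon-greedy bandit with sample-average value updates instead of fuzzy reinforcement learning, and the function name `run_agents`, the reward means, and all parameters are illustrative assumptions. It shows only the structural difference between K agents pooling experience in one value table and K agents learning independently.

```python
import random

def run_agents(n_agents, steps, shared, seed=0):
    """Epsilon-greedy agents choosing among fixed 'opportunities'.

    If shared is True, all agents update ONE joint value table
    (the cooperation mechanism); otherwise each agent keeps its own.
    Returns the average per-step reward over all agents.
    """
    rng = random.Random(seed)
    true_means = [0.1, 0.5, 0.9]        # hypothetical opportunity payoffs
    n_arms = len(true_means)
    n_tables = 1 if shared else n_agents
    values = [[0.0] * n_arms for _ in range(n_tables)]
    counts = [[0] * n_arms for _ in range(n_tables)]
    total = 0.0
    for _ in range(steps):
        for a in range(n_agents):
            idx = 0 if shared else a    # cooperative agents share table 0
            q, c = values[idx], counts[idx]
            if rng.random() < 0.1:      # explore
                arm = rng.randrange(n_arms)
            else:                       # exploit current estimates
                arm = max(range(n_arms), key=lambda i: q[i])
            reward = true_means[arm] + rng.gauss(0.0, 0.1)
            c[arm] += 1
            q[arm] += (reward - q[arm]) / c[arm]  # sample-average update
            total += reward
    return total / (n_agents * steps)

# Cooperative agents pool all experience into one table, so each agent's
# estimates improve from everyone's trials; independent agents must each
# rediscover the best opportunity on their own.
coop = run_agents(n_agents=4, steps=200, shared=True)
indep = run_agents(n_agents=4, steps=200, shared=False)
```

The sketch mirrors the abstract's experimental contrast only in structure; the paper's actual agents learn over continuous, partially observable states with fuzzy rule bases.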