This paper presents a general framework for reconfiguring a distributed system
so as to maximize its long-term business value, defined as the discounted sum of all
future rewards and penalties. The problem of dynamic resource allocation among multiple
entities sharing a common set of resources is used as an example.
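
As a minimal illustration of the objective above, the discounted sum of future rewards and penalties can be computed as follows. The discount factor `gamma` and the sample reward stream are assumptions chosen for illustration only; penalties are represented here as negative rewards.

```python
def discounted_value(rewards, gamma=0.9):
    """Return sum over t of gamma**t * rewards[t].

    `rewards` is a finite list of per-step rewards; penalties are
    entered as negative numbers. `gamma` is an assumed discount factor.
    """
    value = 0.0
    # Accumulate backwards using V_t = r_t + gamma * V_{t+1}.
    for r in reversed(rewards):
        value = r + gamma * value
    return value


# Hypothetical stream: a reward, a penalty, then a larger reward.
stream = [1.0, -2.0, 4.0]
print(discounted_value(stream, gamma=0.5))
```

A smaller `gamma` makes the measure more short-sighted, while a value close to 1 weights distant rewards and penalties almost as heavily as immediate ones.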
A specific architecture (DRA-FRL) is presented, which uses the emerging methodology of
reinforcement learning in conjunction with fuzzy rulebases to achieve the desired objective.
This architecture can work in the context of existing resource allocation policies and learn the
values of the states that the system encounters under these policies. Once the learning process
begins to converge, the user can allow the DRA-FRL architecture to make some additional
resource allocation decisions or override the ones suggested by the existing policies so
as to improve the long-term business value of the system. The DRA-FRL architecture can also
be deployed in an environment without any existing resource allocation policies.
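
The idea of learning state values with a fuzzy rulebase can be sketched as below. This is not the DRA-FRL implementation; it is a generic temporal-difference (TD(0)) update over a small fuzzy rulebase, under the following assumptions: each state variable is scaled to [0, 1], each variable has two hypothetical fuzzy labels (SMALL and LARGE), and the names `rule_weights`, `state_value`, and `td0_update` as well as the step size `alpha` and discount `gamma` are illustrative.

```python
import itertools


def rule_weights(state):
    """Firing strength of every rule for a state with components in [0, 1].

    Each variable x has two assumed labels: SMALL with membership 1 - x
    and LARGE with membership x. A rule's strength is the product of the
    memberships it uses, so the strengths sum to 1.
    """
    memberships = [(1.0 - x, x) for x in state]
    weights = []
    for combo in itertools.product(*memberships):
        w = 1.0
        for m in combo:
            w *= m
        weights.append(w)
    return weights


def state_value(state, params):
    """Estimated value: rule outputs `params` weighted by firing strengths."""
    return sum(p * w for p, w in zip(params, rule_weights(state)))


def td0_update(params, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Return new rule outputs after one TD(0) step on an observed transition."""
    td_error = (reward
                + gamma * state_value(next_state, params)
                - state_value(state, params))
    # Each rule output moves in proportion to how strongly the rule fired.
    return [p + alpha * td_error * w
            for p, w in zip(params, rule_weights(state))]


# Hypothetical state: (CPU utilization, memory utilization) of one partition.
params = [0.0, 0.0, 0.0, 0.0]
params = td0_update(params, (0.0, 0.0), reward=1.0, next_state=(1.0, 1.0))
print(params)
```

Because the update only needs observed transitions and rewards, a learner of this kind can be run while some other policy is making the allocation decisions, which is consistent with the deployment mode described above.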
An implementation of the DRA-FRL architecture in Solaris™ 10 demonstrated a robust performance
improvement in the problem of dynamically migrating CPUs and memory blocks
between three resource partitions so as to match the stochastically changing workload in each
partition, both in the presence and in the absence of resource migration costs.
*This material is based upon work supported by DARPA under Contract No. NBCH3039002.