TY - JOUR
T1 - Convergence analysis for pure stationary strategies in repeated potential games
T2 - Nash, Lyapunov and correlated equilibria
AU - Clempner, Julio B.
AU - Poznyak, Alexander S.
N1 - Publisher Copyright:
© 2015 Elsevier Ltd. All rights reserved.
PY - 2016/3/15
Y1 - 2016/3/15
N2 - In game theory the interaction among players obligates each player to develop a belief about the possible strategies of the other players, to choose a best-reply given those beliefs, and to look for an adjustment of the best-reply and the beliefs using a learning mechanism until they reach an equilibrium point. Usually, the behavior of an individual cost-function, when such best-reply strategies are applied, turns out to be non-monotonic and concluding that such strategies lead to some equilibrium point is a non-trivial task. Even in repeated games the convergence to a stationary equilibrium is not always guaranteed. The best-reply strategies analyzed in this paper represent the most frequent type of behavior applied in practice in problems of bounded rationality of agents considered within the Artificial Intelligence research area. They are naturally related with the, so-called, fixed-local-optimal actions or, in other words, with one step-ahead optimization algorithms widely used in the modern Intelligent Systems theory. This paper shows that for an ergodic class of finite controllable Markov games the best-reply strategies lead necessarily to a Lyapunov/Nash equilibrium point. One of the most interesting properties of this approach is that an expedient (or absolutely expedient) behavior of an ergodic system (repeated game) can be represented by a Lyapunov-like function non-decreasing in time. We present a method for constructing a Lyapunov-like function: the Lyapunov-like function replaces the recursive mechanism with the elements of the ergodic system that model how players are likely to behave in one-shot games. To show our statement, we first propose a non-converging state-value function that fluctuates (increases and decreases) between states of the Markov game. Then, we prove that it is possible to represent that function in a recursive format using a one-step-ahead fixed-local-optimal strategy. As a result, we prove that a Lyapunov-like function can be built using the previous recursive expression for the Markov game, i.e., the resulting Lyapunov-like function is a monotonic function which can only decrease (or remain the same) over time, whatever the initial distribution of probabilities. As a result, a new concept called Lyapunov games is suggested for a class of repeated games. Lyapunov games allow to conclude during the game whether the applied strategy provides the convergence to an equilibrium point (or not). The time for constructing a Potential (Lyapunov-like) function is exponential. Our algorithm tractably computes the Nash, Lyapunov and the correlated equilibria: a Lyapunov equilibrium is a Nash equilibrium, as well it is also a correlated equilibrium. Validity of the proposed method is successfully demonstrated both theoretically and practically by a simulated experiment related to the Duel game.
AB - In game theory the interaction among players obligates each player to develop a belief about the possible strategies of the other players, to choose a best-reply given those beliefs, and to look for an adjustment of the best-reply and the beliefs using a learning mechanism until they reach an equilibrium point. Usually, the behavior of an individual cost-function, when such best-reply strategies are applied, turns out to be non-monotonic and concluding that such strategies lead to some equilibrium point is a non-trivial task. Even in repeated games the convergence to a stationary equilibrium is not always guaranteed. The best-reply strategies analyzed in this paper represent the most frequent type of behavior applied in practice in problems of bounded rationality of agents considered within the Artificial Intelligence research area. They are naturally related with the, so-called, fixed-local-optimal actions or, in other words, with one step-ahead optimization algorithms widely used in the modern Intelligent Systems theory. This paper shows that for an ergodic class of finite controllable Markov games the best-reply strategies lead necessarily to a Lyapunov/Nash equilibrium point. One of the most interesting properties of this approach is that an expedient (or absolutely expedient) behavior of an ergodic system (repeated game) can be represented by a Lyapunov-like function non-decreasing in time. We present a method for constructing a Lyapunov-like function: the Lyapunov-like function replaces the recursive mechanism with the elements of the ergodic system that model how players are likely to behave in one-shot games. To show our statement, we first propose a non-converging state-value function that fluctuates (increases and decreases) between states of the Markov game. Then, we prove that it is possible to represent that function in a recursive format using a one-step-ahead fixed-local-optimal strategy. As a result, we prove that a Lyapunov-like function can be built using the previous recursive expression for the Markov game, i.e., the resulting Lyapunov-like function is a monotonic function which can only decrease (or remain the same) over time, whatever the initial distribution of probabilities. As a result, a new concept called Lyapunov games is suggested for a class of repeated games. Lyapunov games allow to conclude during the game whether the applied strategy provides the convergence to an equilibrium point (or not). The time for constructing a Potential (Lyapunov-like) function is exponential. Our algorithm tractably computes the Nash, Lyapunov and the correlated equilibria: a Lyapunov equilibrium is a Nash equilibrium, as well it is also a correlated equilibrium. Validity of the proposed method is successfully demonstrated both theoretically and practically by a simulated experiment related to the Duel game.
KW - Best-reply analysis
KW - Complexity analysis
KW - Convergence
KW - Correlated equilibrium
KW - Lyapunov equilibrium
KW - Nash equilibrium
KW - Repeated Markov games
UR - http://www.scopus.com/inward/record.url?scp=84948135531&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2015.11.006
DO - 10.1016/j.eswa.2015.11.006
M3 - Artículo
SN - 0957-4174
VL - 46
SP - 474
EP - 484
JO - Expert Systems with Applications
JF - Expert Systems with Applications
ER -