TY - JOUR
T1 - A Markovian Stackelberg game approach for computing an optimal dynamic mechanism
AU - Clempner, Julio B.
N1 - Publisher Copyright:
© 2021, SBMAC - Sociedade Brasileira de Matemática Aplicada e Computacional.
PY - 2021/9
Y1 - 2021/9
N2 - This paper presents a dynamic Bayesian–Stackelberg incentive-compatible mechanism, in which multiple agents observe private information and learn their behavior through a sequence of interactions in a repeated game, for a class of controllable homogeneous Markov games. We assume that the leaders can ex ante commit to their disclosure strategy and mechanism, and thereby affect the followers’ actions. Throughout the paper, leaders possess and benefit from commitment power, which is the distinctive feature of a Stackelberg game. In this dynamic, leaders and followers together play a Stackelberg game in which actions are taken sequentially across the two layers of the hierarchy, while, independently, the leaders and the followers each play a non-cooperative (Nash) game in which actions are taken simultaneously. The game considers an ex ante incentive-compatible mechanism that, in equilibrium, maximizes the reward while the agents learn their actions over a countable number of periods. The problem is formulated as a Bayesian–Stackelberg equilibrium in the context of reinforcement learning. We propose an algorithm based on the extraproximal method and show that it converges. Tikhonov’s regularization technique is employed to ensure the existence and uniqueness of the Bayesian–Stackelberg equilibrium, and we guarantee the convergence of the method to a single incentive-compatible mechanism. We derive the analytical expressions for computing the mechanism in a Stackelberg game, which is one of the main results of this work. We demonstrate the efficiency of the method with an experiment drawn from an electric power problem represented by an oligopolistic market structure dominated by a small number of large sellers (oligopolists).
KW - Bayesian equilibrium
KW - Dynamic mechanism design
KW - Incentive-compatible mechanisms
KW - Markov games
KW - Stackelberg games with private information
UR - http://www.scopus.com/inward/record.url?scp=85110350483&partnerID=8YFLogxK
DO - 10.1007/s40314-021-01578-4
M3 - Article
AN - SCOPUS:85110350483
SN - 2238-3603
VL - 40
JO - Computational and Applied Mathematics
JF - Computational and Applied Mathematics
IS - 6
M1 - 186
ER -