Identifying and tracking switching, non-stationary opponents: a Bayesian approach

dc.contributor.author Hernandez-Leal, P
dc.contributor.author Taylor, ME
dc.contributor.author Rosman, Benjamin S
dc.contributor.author Sucar, LE
dc.contributor.author Munoz de Cote, E
dc.date.accessioned 2017-05-17T07:18:30Z
dc.date.available 2017-05-17T07:18:30Z
dc.date.issued 2016-02
dc.identifier.citation Hernandez-Leal, P., Taylor, M.E., Rosman, B.S., Sucar, L.E. and Munoz de Cote, E. 2016. Identifying and tracking switching, non-stationary opponents: a Bayesian approach. Workshop on Multiagent Interaction without Prior Coordination (MIPC) at AAAI-16, 13 February 2016, Phoenix, Arizona, USA, pp. 560-566 en_US
dc.identifier.uri https://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/view/12584/12424
dc.identifier.uri http://mipc.inf.ed.ac.uk/2016/papers/mipc2016_hernandezleal_etal.pdf
dc.identifier.uri http://hdl.handle.net/10204/9091
dc.description Workshop on Multiagent Interaction without Prior Coordination (MIPC) at AAAI-16, 13 February 2016, Phoenix, Arizona, USA en_US
dc.description.abstract In many situations, agents are required to use a set of strategies (behaviors) and switch among them during the course of an interaction. This work focuses on the problem of recognizing the strategy used by an agent within a small number of interactions. We propose using a Bayesian framework to address this problem. Bayesian policy reuse (BPR) has been empirically shown to efficiently detect the best policy to use from a library in sequential decision tasks. In this paper we extend BPR to adversarial settings, in particular, to opponents that switch from one stationary strategy to another. Our proposed extension enables learning new models in an online fashion when the learning agent detects that the current policies are not performing optimally. Experiments in repeated games show that our approach efficiently detects opponent strategies and reacts quickly to behavior switches, thereby yielding better average rewards than state-of-the-art approaches. en_US
dc.language.iso en en_US
dc.publisher Association for the Advancement of Artificial Intelligence (AAAI) en_US
dc.relation.ispartofseries Worklist;16648
dc.subject Policy reuse en_US
dc.subject Non-stationary opponents en_US
dc.subject Repeated games en_US
dc.title Identifying and tracking switching, non-stationary opponents: a Bayesian approach en_US
dc.type Conference Presentation en_US
dc.identifier.apacitation Hernandez-Leal, P., Taylor, M. E., Rosman, B. S., Sucar, L. E., & Munoz de Cote, E. (2016). Identifying and tracking switching, non-stationary opponents: a Bayesian approach. Association for the Advancement of Artificial Intelligence (AAAI). http://hdl.handle.net/10204/9091 en_ZA
dc.identifier.chicagocitation Hernandez-Leal, P, ME Taylor, Benjamin S Rosman, LE Sucar, and E Munoz de Cote. "Identifying and tracking switching, non-stationary opponents: a Bayesian approach." Paper presented at the Workshop on Multiagent Interaction without Prior Coordination (MIPC) at AAAI-16, Phoenix, Arizona, February 2016. http://hdl.handle.net/10204/9091 en_ZA
dc.identifier.vancouvercitation Hernandez-Leal P, Taylor ME, Rosman BS, Sucar LE, Munoz de Cote E. Identifying and tracking switching, non-stationary opponents: a Bayesian approach. Association for the Advancement of Artificial Intelligence (AAAI); 2016. http://hdl.handle.net/10204/9091. en_ZA
dc.identifier.ris
TY - Conference Presentation
AU - Hernandez-Leal, P
AU - Taylor, ME
AU - Rosman, Benjamin S
AU - Sucar, LE
AU - Munoz de Cote, E
AB - In many situations, agents are required to use a set of strategies (behaviors) and switch among them during the course of an interaction. This work focuses on the problem of recognizing the strategy used by an agent within a small number of interactions. We propose using a Bayesian framework to address this problem. Bayesian policy reuse (BPR) has been empirically shown to efficiently detect the best policy to use from a library in sequential decision tasks. In this paper we extend BPR to adversarial settings, in particular, to opponents that switch from one stationary strategy to another. Our proposed extension enables learning new models in an online fashion when the learning agent detects that the current policies are not performing optimally. Experiments in repeated games show that our approach efficiently detects opponent strategies and reacts quickly to behavior switches, thereby yielding better average rewards than state-of-the-art approaches.
DA - 2016-02
DB - ResearchSpace
DP - CSIR
KW - Policy reuse
KW - Non-stationary opponents
KW - Repeated games
LK - https://researchspace.csir.co.za
PY - 2016
T1 - Identifying and tracking switching, non-stationary opponents: a Bayesian approach
TI - Identifying and tracking switching, non-stationary opponents: a Bayesian approach
UR - http://hdl.handle.net/10204/9091
ER -
en_ZA
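
The abstract describes the core loop of the approach: maintain a Bayesian belief over a library of known opponent models, update it from the rewards observed in each interaction, reuse the best-response policy for the most likely model, and learn a new model online when no existing one explains the observations. The following minimal Python sketch illustrates that loop under simplifying assumptions: a discrete model library with known expected payoffs and a Gaussian performance likelihood. All names, the BPRAgent class, and the novelty threshold are illustrative assumptions, not code or notation taken from the paper.

import numpy as np

class BPRAgent:
    def __init__(self, policies, perf_means, perf_std=1.0):
        # policies: labels, one best-response policy per known opponent model
        # perf_means[i][j]: expected reward of policy j against opponent model i
        self.policies = policies
        self.perf_means = np.asarray(perf_means, dtype=float)
        self.perf_std = perf_std
        n_models = self.perf_means.shape[0]
        # Uniform prior belief over which opponent model is currently active.
        self.belief = np.full(n_models, 1.0 / n_models)

    def select_policy(self):
        # Reuse the policy with the highest expected reward under the belief.
        expected = self.belief @ self.perf_means
        return int(np.argmax(expected))

    def update(self, policy_idx, reward):
        # Bayes rule: P(model | r) is proportional to
        # P(r | model, policy) * P(model), with a Gaussian likelihood
        # centered on each model's expected reward for the played policy.
        mu = self.perf_means[:, policy_idx]
        lik = np.exp(-0.5 * ((reward - mu) / self.perf_std) ** 2)
        post = lik * self.belief
        if post.sum() <= 0:  # degenerate case: reset to uniform
            post = np.ones_like(post)
        self.belief = post / post.sum()

    def poorly_explained(self, policy_idx, reward, threshold=3.0):
        # Novelty/switch heuristic: if the observed reward is far from what
        # every known model predicts, a new opponent model may be needed.
        mu = self.perf_means[:, policy_idx]
        return bool(np.all(np.abs(reward - mu) > threshold * self.perf_std))

# Example: two hypothetical opponent models and two best-response policies.
agent = BPRAgent(policies=["bully", "fair"],
                 perf_means=[[3.0, 1.0],
                             [0.0, 2.0]])
p = agent.select_policy()
agent.update(p, reward=2.9)  # evidence favoring opponent model 0
print(agent.belief)          # belief shifts sharply toward model 0

In the paper's full setting the performance models are themselves learned and the belief also informs exploration; this sketch only shows the detection-and-reuse core that the abstract summarizes.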

