Output details
15 - General Engineering
University of Sheffield
Simple learning rules to cope with changing environments
This paper studies how agents can maximise their profit by learning the outcomes of their actions in randomly changing worlds. According to an article published in Science (http://dx.doi.org/10.1126/science.1184719), this problem is "extremely difficult, perhaps impossible, to optimize analytically". We prove mathematically that, in the long run, established algorithms are no better than random sampling, and we provide state-of-the-art algorithms that perform near-optimally. The results are relevant to a range of disciplines; the Science paper cites our work in support of its statement that "multiarmed bandits have been widely deployed to study learning across biology, economics, artificial intelligence research, and computer science".