The Application of Reinforcement Learning to the Simple Soccer Game

Application of RL to Simple Soccer

THE APPLICATION OF REINFORCEMENT LEARNING TO THE SIMPLE SOCCER GAME

Title: The Application of Reinforcement Learning to the Simple Soccer Game
For Qualification of: Master of Science
Author: Eamon Costello
Date: August 2007
Supervisor: Dr. Michael Madden
Department: Department of Information Technology
Head of Department: Professor Gerard Lyons
Institution: National University of Ireland, Galway

Abstract

This thesis looks at Reinforcement Learning (RL) and its application to computer-simulated soccer games. RL itself is explained and its main points detailed. A summary of research into computer-simulated soccer is then outlined. The Simple Soccer game is introduced and an RL technique (tabular Q-learning) is applied to it. Aspects of the research literature are reviewed in the context of Simple Soccer, such as the issues involved in designing and constructing RL experiments using games. Finally the results of the application are presented and an analysis of the findings is made.

Contents

Abstract
1 Reinforcement Learning (RL)
  1.1 RL Introductory Literature
  1.2 The RL Problem
    1.2.1 Definition
    1.2.2 Rewards
    1.2.3 Reward Discounting
    1.2.4 Balancing Exploitation with Exploration
    1.2.5 Dynamic Programming (DP)
    1.2.6 Temporal Difference (TD) Learning
  1.3 Advances in RL and Current Research
    1.3.1 Function Approximation
    1.3.2 Hierarchical RL
    1.3.3 Least Squares
    1.3.4 Relational Reinforcement Learning (RRL)
2 Application of RL to Games
  2.1 Types of Games
  2.2 TD-Backgammon
  2.3 Soccer and RL
    2.3.1 Approaches to Simulated Soccer
    2.3.2 Keepaway
    2.3.3 Half Field Offense
    2.3.4 Progress in Keepaway
  2.4 Summary
3 Implementation and Verification of C++ Learner
  3.1 Design of the Implemented Learner
  3.2 Testing Against CliffWorld
  3.3 GridWorld
    3.3.1 Overview
    3.3.2 Comparison Tests
    3.3.3 Findings and Implementation Revision
4 Application of RL to Simple Soccer
  4.1 Simple Soccer
    4.1.1 Overview
    4.1.2 States
    4.1.3 Actions
  4.2 Application of RL to Simple Soccer
    4.2.1 Differences with GridWorlds
      4.2.1.1 Adversarial
      4.2.1.2 Complex Object Environment
      4.2.1.3 Real-Time Processing
    4.2.2 Benchmarking Results
    4.2.3 Simple State/Action Structure for Novel Policy Discovery
      4.2.3.1 States
      4.2.3.2 Actions
      4.2.3.3 Rewards
    4.2.4 Architecture
    4.2.5 Results
      4.2.5.1 Performance
      4.2.5.2 State/Action Representation Findings
    4.2.6 Testing with Reduced Action Space
5 Conclusions
  5.1 Evaluation of Simple Soccer
  5.2 Application of Tabular Q-Learning to Games
References
Appendix A. Simple Soccer Class Diagrams
  SoccerTeam class from Buckland (2005)
  PlayerBase Class from Buckland (2005)
  StateMachine Classes from Buckland (2005)
Appendix B. Functions of the Learner Specific to Simple Soccer
  Functions added to Simple Soccer Code to implement RL
Appendix C. Original Project Plan
  Thesis Statement
  Need for the Project
  Research methodology
  Project Completion
  Project Plan
    Project Phases
    Gantt Chart

1 Reinforcement Learning (RL)

Reinforcement Learning (RL) refers most generally to computational models of learning from interaction.
Kaelbling et al. (1996) define the RL problem as that “faced by an agent that must learn behaviour through trial-and-error interactions with a dynamic environment”. The concept of trial-and-error learning has intuitive appeal and has analogues in psychology and the behavioural sciences. However, it is the field of computer science that has seen much research into RL as an idealized form of learning, which can be studied with mathematical models or computational experiments.

1.1 RL Introductory Literature

Sutton and Barto (1998) give one of the most comprehensive reviews of RL, while Kaelbling, Littman and Moore (1996) and Russell and Norvig (2003) provide good overviews. Sutton and Barto’s approach is meticulous in grounding RL within the older areas of research to which it is indebted. In general, Sutton and Barto’s book is accessible and detailed.

1.2 The RL Problem

In an RL system consisting of an agent and its environment, there are four main elements: a policy; a reward function; a value function; and, in some cases, a model of the environment. As the agent interacts with its environment it receives a state signal s. The agent then takes an action a, which effects a transition to a successor state s'. This transition is non-deterministic, which means here that the same successor state might not be reached even if the same action is taken in the same preceding state. The agent receives a reward value r for this transition. The agent’s aim, or goal, is to maximize the sum of these rewards over its lifetime. As we shall see, this summing of rewards can be formulated in different ways.

1.2.1 Definition

In formal terms the problem can be stated as one that happens over a series of discrete stages $t = 1, 2, 3, \ldots$. At each $t$ the agent receives a representation of its state $s_t \in S$, where $S$ is the set of all possible states. The agent selects an action $a \in A(s_t)$, where $A(s_t)$ is the set of actions available in state $s_t$.
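
The interaction described in Section 1.2 (state signal, action, non-deterministic transition, reward) can be sketched as a short simulation loop. The Python sketch below is illustrative only and is not taken from the thesis: the two-state environment, its transition probabilities, the reward values and all function names are assumptions invented for the example. It shows that the same action taken in the same state can lead to different successor states, and it sums the rewards over a fixed number of steps, which is only one of the possible ways of formulating the total reward.

```python
import random

# Hypothetical two-state environment, invented purely for illustration.
# TRANSITIONS[state][action] is a list of (probability, next_state, reward).
TRANSITIONS = {
    "s0": {
        "stay": [(1.0, "s0", 0.0)],
        "go": [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],
    },
    "s1": {
        "stay": [(1.0, "s1", 0.0)],
        "go": [(0.8, "s0", 1.0), (0.2, "s1", 0.0)],
    },
}


def available_actions(state):
    """A(s): the actions the agent may select in the given state."""
    return sorted(TRANSITIONS[state])


def step(state, action, rng):
    """Sample a successor state s' and a reward r for one transition.

    The outcome is drawn at random, so repeating the same action in the
    same state does not always give the same successor state.
    """
    outcomes = TRANSITIONS[state][action]
    weights = [probability for probability, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


def random_policy(state, rng):
    """A trivial policy: pick uniformly among the available actions."""
    return rng.choice(available_actions(state))


def run_episode(policy, start_state="s0", num_steps=20, seed=0):
    """Run the agent-environment loop and return the plain sum of rewards."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    state = start_state
    total_reward = 0.0
    for _ in range(num_steps):
        action = policy(state, rng)               # agent selects a in A(s)
        state, reward = step(state, action, rng)  # environment returns s', r
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    print(run_episode(random_policy))
```

Passing a different policy function to `run_episode` changes only how actions are chosen; the environment and the reward bookkeeping stay the same, which mirrors the separation between agent and environment in the text.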