Movement Skill Acquisition using Imitation and Reinforcement Learning

AFFAN PERVEZ
Master of Science Thesis
Stockholm, Sweden 2013

DD221X, Master's Thesis in Computer Science (30 ECTS credits)
Master Programme in Machine Learning, 120 credits
Royal Institute of Technology, year 2013
Supervisor at CSC was John Folkesson
Examiner was Danica Kragic

TRITA-CSC-E 2013:012
ISRN-KTH/CSC/E--13/012--SE
ISSN-1653-5715

Royal Institute of Technology
School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.kth.se/csc

Abstract

This thesis focuses on having a robot learn to play the game of darts. Playing darts involves multiple tasks, for example how to throw a dart and where to aim on the board. It is shown that, with a few demonstrations by the user, the robot can learn to produce trajectories for hitting a given point on the board, with accuracy improving with experience. We also show how a robot can discover regions of the board to aim at so that it maximizes its expected score.

Contents

List of Figures
1 Introduction
    1.1 Report outline
2 Background
3 Robot Programming by Demonstration
    3.1 PbD by using dynamical systems
    3.2 PbD by using Gaussian Mixture Model (GMM)
        3.2.1 Expectation Maximization (EM) for GMM
        3.2.2 Gaussian Mixture Regression (GMR)
4 A brief survey/overview of reinforcement learning
    4.1 Discrete domain RL
    4.2 Continuous domain RL
        4.2.1 Expectation Maximization (EM) based RL
5 Multi Optima search with adaptive Gaussian Mixture Model
    5.1 Policy learning by Weighting Exploration with Returns
    5.2 Exploiting covariance information in RL
        5.2.1 Different ways of modeling exploration noise
    5.3 Multi Optima search using GMM
        5.3.1 Adapting the number of components
        5.3.2 Alternate approach for Multi Optima search
6 Dart game experiment
    6.1 Darts
    6.2 Generating heat map from dart board
    6.3 Finding targeting points in darts
    6.4 Average log-likelihood of weighted data
    6.5 Advantage of discovering/tracking multiple optimas
7 Movement skill acquisition