Title: Nonparametric Bayesian reward segmentation for skill discovery using inverse reinforcement learning
Authors: Ranchod, P.; Rosman, Benjamin S.; Konidaris, G.
Type: Conference Presentation
Date issued: October 2015 (record accessioned 2015-11-16)
Conference: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, September-October 2015
Full text: http://irl.cs.duke.edu/pubs/npbrs.pdf
Handle: http://hdl.handle.net/10204/8290
Language: en
Keywords: Inverse reinforcement learning; Nonparametric Bayesian methods; Skill discovery; Imitation learning

Abstract: We present a method for segmenting a set of unstructured demonstration trajectories to discover reusable skills using inverse reinforcement learning (IRL). Each skill is characterised by a latent reward function which the demonstrator is assumed to be optimizing. The skill boundaries and the number of skills making up each demonstration are unknown. We use a Bayesian nonparametric approach to propose skill segmentations and maximum entropy inverse reinforcement learning to infer reward functions from the segments. This method produces a set of Markov Decision Processes (MDPs) that best describe the input trajectories. We evaluate this approach in a car driving domain and a simulated quadcopter obstacle course, showing that it is able to recover demonstrated skills more effectively than existing methods.

Citation: Ranchod, P., Rosman, B. S., & Konidaris, G. (2015). Nonparametric Bayesian reward segmentation for skill discovery using inverse reinforcement learning. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany. http://hdl.handle.net/10204/8290
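The abstract's inner loop — inferring a reward function from a (segment of a) demonstration via maximum entropy IRL — can be sketched in miniature. The code below is an illustrative assumption, not the paper's implementation: it uses a toy 5-state chain MDP with one-hot state features and hand-made "move right" demonstrations in place of the paper's car-driving and quadcopter domains, and omits the Bayesian nonparametric segmentation layer. It runs soft (MaxEnt) value iteration to get a stochastic policy, then follows the standard MaxEnt gradient: demonstrated feature counts minus expected feature counts under the current reward estimate.

```python
import numpy as np

n_states, n_actions, horizon = 5, 2, 6

# Deterministic chain: action 0 steps left, action 1 steps right (clipped at ends).
P = np.zeros((n_states, n_actions, n_states))
for s in range(n_states):
    P[s, 0, max(s - 1, 0)] = 1.0
    P[s, 1, min(s + 1, n_states - 1)] = 1.0

phi = np.eye(n_states)  # one-hot state features, so reward r(s) = w[s]

# Hypothetical demonstrations: always move right from state 0 (stand-in for real data).
demos = [[0, 1, 2, 3, 4, 4, 4]] * 3

def demo_feature_counts(demos):
    """Average per-trajectory feature (state-visit) counts in the demonstrations."""
    mu = np.zeros(n_states)
    for traj in demos:
        for s in traj:
            mu += phi[s]
    return mu / len(demos)

def soft_policies(w):
    """Finite-horizon soft value iteration; returns a time-indexed stochastic policy."""
    r = phi @ w
    V = np.zeros(n_states)
    pis = []
    for _ in range(horizon):
        Q = r[:, None] + P @ V                      # (S, A) soft Q backup
        m = Q.max(axis=1)
        V = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))  # stable logsumexp
        pis.append(np.exp(Q - V[:, None]))          # softmax policy at this step
    return pis[::-1]                                # reverse into forward time order

def expected_feature_counts(pis, start=0):
    """Forward pass: expected state-visit counts under the soft policy."""
    d = np.zeros(n_states)
    d[start] = 1.0
    mu = phi.T @ d
    for pi in pis:
        d = np.einsum('s,sa,san->n', d, pi, P)      # propagate state distribution
        mu += phi.T @ d
    return mu

# MaxEnt IRL gradient ascent on the reward weights.
w = np.zeros(n_states)
mu_demo = demo_feature_counts(demos)
for _ in range(200):
    grad = mu_demo - expected_feature_counts(soft_policies(w))
    w += 0.1 * grad
```

After training, the recovered reward concentrates on the goal end of the chain (`w[4]` exceeds `w[0]`), since the demonstrator lingers in state 4. In the paper's setting this inference would be run per proposed segment, with the nonparametric prior deciding how many such reward functions (skills) the demonstrations contain.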