
Hayato Kobayashi, Tsugutoyo Osaki, Tetsuro Okuyama, Akira Ishino, and Ayumi Shinohara - PowerPoint PPT Presentation



  1. Hayato Kobayashi, Tsugutoyo Osaki, Tetsuro Okuyama, Akira Ishino, and Ayumi Shinohara. Tohoku University, Japan (Team Jolly Pochie). 2008/7/17, RoboCup Symposium 2008 in Suzhou, China.

  2. We need to check the screen and the field at the same time. Labels: Robot position, Ball position. https://youtu.be/mB5MuDy9GFw

  3. AR can alleviate the difficulty. Labels: Augmented environment, Camera. https://youtu.be/yGzA6hC9YY8

  4. Augmented environment: intermediate role between the simulated environment and the real environment. Diagram labels: Real robot, Virtual ball, Virtual robot; Perceptible, Touchable. Caption: in "Haribote", developed by team ARAIBO.

  5. Augmented Soccer Field System. Components: Projector, Camera, Recognition Program, Virtual Application.

  6. Extraction of contours by a background subtraction method. Identification of robots’ orientation by a template matching method. Labels: Robot, Other object.
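The background subtraction step named on this slide can be illustrated with a minimal sketch. It is not the authors' recognition program: the tiny grayscale grids, the `threshold` value, and the function names `foreground_mask` and `contour_pixels` are all assumptions made for illustration, and the template matching step for orientation is not covered.

```python
from typing import List, Tuple

Image = List[List[int]]   # grayscale image: rows of 0-255 intensities
Mask = List[List[bool]]   # True where a pixel is treated as foreground


def foreground_mask(frame: Image, background: Image, threshold: int = 30) -> Mask:
    """Mark pixels that differ from the stored background by more than `threshold`."""
    return [
        [abs(f - b) > threshold for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def contour_pixels(mask: Mask) -> List[Tuple[int, int]]:
    """Return foreground pixels that touch the background or the image border."""
    height = len(mask)
    width = len(mask[0]) if mask else 0
    contour = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            on_edge = any(
                not (0 <= ny < height and 0 <= nx < width) or not mask[ny][nx]
                for ny, nx in neighbours
            )
            if on_edge:
                contour.append((y, x))
    return contour


# Hypothetical 5x5 scene: empty field plus a bright 3x3 object in the middle.
background = [[10] * 5 for _ in range(5)]
frame = [row[:] for row in background]
for y in range(1, 4):
    for x in range(1, 4):
        frame[y][x] = 200

mask = foreground_mask(frame, background)
contour = contour_pixels(mask)
```

In this toy scene the contour is the ring of eight pixels around the object's centre; a real system would run on camera frames and would typically use an image-processing library instead of nested lists.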

  7. Real environment to virtual application: positions of real objects (e.g., real robots). Virtual application to real environment: positions of virtual objects (e.g., virtual ball and robots).

  8. Robots can interact with the virtual ball.

  9. • Essential for robot soccer
         - No lost point, no lost game
     • Learning has been difficult so far

  10. • Human intervention
      • Time consuming
      • Motor failure
      Example: learning of goal-saving skills in the real environment. Spank its head for a failed save; stroke its head for a successful save.
      https://youtu.be/9oHA-GH9JT8 https://youtu.be/3Pluuk20xqs

  11. • Gap from real environments
          - Serious, especially for legged movements
      Diagram: simple simulator (without any difficulties), separated by a gap from the real environment (with human intervention, a time-consuming process, and fear of motor failure).

  12. • To bridge the gap
          - Using the movements of real robots
      • To allow autonomous learning
          - Using the convenience of virtual balls
      Easy mode: simple simulator, without any difficulties. Normal mode: augmented environment, without human intervention. Hard mode: real environment, with many difficulties.

  13. Acquire the map from states to actions maximizing the sum of rewards.
      Diagram: the agent (AIBO) sends action a_t to the environment and receives state s_t and reward r_{t+1}.
      • Sarsa(λ) [Rummery and Niranjan 1994; Sutton 1996]
      • Tile-coding (aka CMACs [Albus 1975])
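As a rough illustration of the Sarsa(λ) update named on this slide, the sketch below uses linear function approximation over binary features with replacing eligibility traces. The class name `SarsaLambda`, the hyperparameter values, and the use of replacing (rather than accumulating) traces are assumptions for illustration; the slides do not state these details.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Key = Tuple[Hashable, Hashable]  # (feature, action)


class SarsaLambda:
    """Sarsa(lambda) with binary features and replacing traces (illustrative only)."""

    def __init__(self, alpha: float = 0.1, gamma: float = 0.9, lam: float = 0.8):
        self.alpha = alpha      # step size (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.lam = lam          # trace decay (assumed value)
        self.weights: Dict[Key, float] = defaultdict(float)
        self.traces: Dict[Key, float] = defaultdict(float)

    def q(self, features: Iterable[Hashable], action: Hashable) -> float:
        """Q(s, a) is the sum of the weights of the active features for this action."""
        return sum(self.weights[(f, action)] for f in features)

    def start_episode(self) -> None:
        """Eligibility traces are reset at the start of every episode."""
        self.traces.clear()

    def update(self, features, action, reward, next_features, next_action, done) -> float:
        """Apply one on-policy update and return the TD error."""
        features = list(features)
        target = reward
        if not done:
            target += self.gamma * self.q(next_features, next_action)
        delta = target - self.q(features, action)
        for f in features:
            self.traces[(f, action)] = 1.0          # replacing traces
        for key in list(self.traces):
            self.weights[key] += self.alpha * delta * self.traces[key]
            self.traces[key] *= self.gamma * self.lam
        return delta


# Two-step toy episode: a "stay" step followed by a rewarded "save".
agent = SarsaLambda(alpha=0.5, gamma=0.9, lam=0.8)
agent.start_episode()
agent.update([0], "stay", 0.0, [1], "save", done=False)
agent.update([1], "save", 1.0, [], None, done=True)
```

After the second update, part of the reward is credited back to the earlier `stay` step through its decayed trace, which is the point of the λ mechanism. With tile coding, the step size is commonly divided by the number of tilings.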

  14. State representation: (r, φ, V_v, V_h, X, Y)
      • (r, φ): Ball position
      • (V_v, V_h): Ball velocity
      • (X, Y): Robot position
      (We removed the orientation of the robot by considering the PK situation only.)
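To show how a six-dimensional state such as (r, φ, V_v, V_h, X, Y) can be turned into binary features by tile coding, here is a minimal sketch. The value ranges in `LOWS` and `HIGHS`, the number of tilings, the tiles per dimension, and the function name `active_tiles` are all hypothetical; the slides give none of these numbers.

```python
import math
from typing import List, Sequence, Tuple

Tile = Tuple[int, Tuple[int, ...]]  # (tiling index, tile coordinates)

# Hypothetical ranges for (r, phi, V_v, V_h, X, Y); not taken from the slides.
LOWS = (0.0, -math.pi, -100.0, -100.0, -500.0, -500.0)
HIGHS = (4000.0, math.pi, 100.0, 100.0, 500.0, 500.0)


def active_tiles(
    state: Sequence[float],
    lows: Sequence[float] = LOWS,
    highs: Sequence[float] = HIGHS,
    n_tilings: int = 4,
    tiles_per_dim: int = 8,
) -> List[Tile]:
    """Return one active tile per tiling; each tiling is shifted by a fraction of a tile."""
    tiles = []
    for t in range(n_tilings):
        offset = t / n_tilings  # shift measured in tile widths
        coords = []
        for x, lo, hi in zip(state, lows, highs):
            scaled = (x - lo) / (hi - lo) * tiles_per_dim
            index = int(math.floor(scaled + offset))
            # Clamp out-of-range values; shifted tilings need one extra tile at the top.
            coords.append(min(max(index, 0), tiles_per_dim))
        tiles.append((t, tuple(coords)))
    return tiles


near_a = active_tiles((1000.0, 0.1, 0.0, 0.0, 0.0, 0.0))
near_b = active_tiles((1010.0, 0.1, 0.0, 0.0, 0.0, 0.0))
far = active_tiles((3500.0, 0.1, 0.0, 0.0, 0.0, 0.0))
shared_near = set(near_a) & set(near_b)   # nearby states share tiles
shared_far = set(near_a) & set(far)       # distant states do not
```

Nearby states activate overlapping tiles and therefore share learned values, while distant states do not; this is what lets a learner generalize across the continuous state space.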

  15. Actions (10 kinds)
      • 8 directional walks
      • stay: prepare for enemy’s kicks
      • save: interrupt enemy’s goals
      Figures: 8 directional walk actions, stay action, save action.

  16. Aim: To stop the ball using the save action as safely as possible
      Rewards (punishments)
      • save_reward: 0.5 when save is successful
      • save_punishment: -0.02 when save has failed
      • lost_punishment: -10 when a goal is scored
      • dist_reward: 1-|ydist|/112.5 when the game is over
      • passive_punishment: -0.0000001 when save is not selected
      Slide annotations: "For getting near to the ball"; "For accelerating the initial phase".
      One episode (game) is over when the ball is out, when a goal is scored, or when save is successful.
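The reward terms listed on this slide can be combined into a single function, sketched below. The numeric values come from the slide, but how the terms combine is an assumption: this sketch simply adds every term whose condition holds on a given step. The function name `step_reward` and its parameters are hypothetical.

```python
SAVE_REWARD = 0.5                 # save is successful
SAVE_PUNISHMENT = -0.02           # save has failed
LOST_PUNISHMENT = -10.0           # a goal is scored
PASSIVE_PUNISHMENT = -0.0000001   # save is not selected
YDIST_SCALE = 112.5               # divisor in dist_reward, as written on the slide


def step_reward(
    action: str,
    save_succeeded: bool,
    goal_scored: bool,
    episode_over: bool,
    ydist: float,
) -> float:
    """Sum the reward terms that apply to one step (additive combination is assumed)."""
    reward = 0.0
    if action == "save":
        reward += SAVE_REWARD if save_succeeded else SAVE_PUNISHMENT
    else:
        reward += PASSIVE_PUNISHMENT
    if goal_scored:
        reward += LOST_PUNISHMENT
    if episode_over:
        # dist_reward: larger when |ydist| is small at the end of the game
        reward += 1.0 - abs(ydist) / YDIST_SCALE
    return reward


successful_save = step_reward("save", True, False, True, ydist=0.0)
conceded_goal = step_reward("stay", False, True, True, ydist=112.5)
```

Under these assumptions a successful save with `ydist` of 0 yields 0.5 + 1.0, and a conceded goal is dominated by the -10 term.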

  17. Initial strategy and learned strategy. Success: blue screen. Failure: red screen. https://youtu.be/C2YTw6d7xPw https://youtu.be/xCoSGsQHkRY

  18. • (r, φ): Ball position
      • (V_v, V_h): Ball velocity
      Robot’s actions: save action, stay action, 8 walk actions

  19. • (r, φ): Ball position
      • (V_v, V_h): Ball velocity
      • (X, Y): Robot position
      Robot’s actions: save action, stay action, 8 walk actions

  20. • (r, φ): Ball position
      • (V_v, V_h): Ball velocity
      • (X, Y): Robot position
      Robot’s actions: save action, stay action, 8 walk actions

  21. Average success rate in past 100 episodes. [Plot: success rate (%) vs. number of episodes, 0 to 2000. Annotations: 95% success; 1200 episodes.]
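The metric on this slide, the average success rate over the past 100 episodes, can be tracked with a fixed-size window. The sketch below is one simple way to compute it; the class name `SuccessRateTracker` is hypothetical, and the handling of the first episodes (averaging over however many results exist so far) is an assumption.

```python
from collections import deque


class SuccessRateTracker:
    """Keep the most recent episode outcomes and report their success rate in percent."""

    def __init__(self, window: int = 100):
        self.results = deque(maxlen=window)  # old outcomes drop out automatically

    def record(self, success: bool) -> float:
        """Store one episode outcome and return the current windowed success rate."""
        self.results.append(1 if success else 0)
        return 100.0 * sum(self.results) / len(self.results)


tracker = SuccessRateTracker(window=100)
for outcome in (True, False, True, True):
    rate = tracker.record(outcome)
```

Calling `record` once per episode gives the value that would be plotted against the episode number.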

  22. Labels: the true positions of the virtual ball and robot; the action of the real robot. https://youtu.be/HVx6TlHkPgw

  23. https://youtu.be/F3-3o2oCP14

  24. [Plot: success rate (%) vs. number of episodes, 0 to 200. Annotations: 45% success; 75% success.] Starting from the result (95% success) in the simulator.

  25. [Plots: success rate (%) vs. number of episodes, one panel over 2000 episodes and one over 200 episodes. Annotation: gap of legged movements.]

  26. • Augmented soccer field system
          - Intermediate role between simulated environments and real environments
      • Autonomous learning of goalie strategies
          - Movements of real robots
          - Convenience of virtual balls

  27. Air hockey game using our system
