Hayato Kobayashi, Tsugutoyo Osaki, Tetsuro Okuyama, Akira Ishino, and Ayumi Shinohara. Tohoku University, Japan (Team Jolly Pochie). 2008/7/17 RoboCup Symposium 2008 in Suzhou, China
We need to check the screen and the field at the same time. (Figure labels: Robot position, Ball position.) https://youtu.be/mB5MuDy9GFw
AR can alleviate the difficulty. (Figure labels: Augmented environment, Camera.) https://youtu.be/yGzA6hC9YY8
Augmented environment: an intermediate role between the simulated environment and the real environment. (Figure labels: Perceptible, Touchable, Real robot, Virtual ball, Virtual robot. Image caption: in “Haribote” developed by team ARAIBO.)
Augmented Soccer Field System: Projector, Recognition Program, Camera, Virtual Application
Extraction of contours by a background subtraction method (figure labels: Robot, Other Object). Identification of robots’ orientation by a template matching method.
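The two recognition steps named on this slide can be illustrated with a minimal pure-Python sketch. It is not the team's recognition program: images are assumed to be small grayscale grids (lists of lists), a bounding box stands in for real contour extraction, the orientation search is limited to four 90-degree rotations, and all function names and the threshold value are hypothetical.

```python
def foreground_mask(frame, background, threshold=30):
    """Background subtraction: mark pixels that differ from the background."""
    return [
        [abs(f - b) > threshold for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the foreground, or None if empty.

    A bounding box is a simplified stand-in for contour extraction.
    """
    points = [(y, x) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not points:
        return None
    ys = [p[0] for p in points]
    xs = [p[1] for p in points]
    return min(ys), min(xs), max(ys), max(xs)


def rotate90(patch):
    """Rotate a square patch 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def best_orientation(patch, template):
    """Template matching: return the clockwise rotation (in degrees) of the
    template with the smallest sum of squared differences to the patch."""
    best_angle, best_cost = None, None
    rotated = template
    for angle in (0, 90, 180, 270):
        cost = sum(
            (p - t) ** 2
            for prow, trow in zip(patch, rotated)
            for p, t in zip(prow, trow)
        )
        if best_cost is None or cost < best_cost:
            best_angle, best_cost = angle, cost
        rotated = rotate90(rotated)
    return best_angle
```

In use, the mask from foreground_mask gives the region to crop, and the cropped patch is compared against a template of the robot seen from above at a known orientation.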
Virtual application to real environment: positions of virtual objects (e.g., virtual ball and robots). Real environment to virtual application: positions of real objects (e.g., real robots).
Robots can interact with the virtual ball
Essential for robot soccer: no lost point, no lost game. Learning has been difficult so far.
Difficulties: human intervention, time consumption, motor failure.
Ex. Learning of goal-saving skills in the real environment:
• Spank its head for a failed save
• Stroke its head for a successful save
https://youtu.be/9oHA-GH9JT8 https://youtu.be/3Pluuk20xqs
Gap from real environments: serious, especially for legged movements.
• Simple simulator: without any difficulties
• Real environment: with human intervention, a time-consuming process, and fear of motor failure
To bridge the gap: using the movements of real robots. To allow autonomous learning: using the convenience of virtual balls.
• Easy mode: simple simulator, without any difficulties
• Normal mode: augmented environment, without human intervention
• Hard mode: real environment, with many difficulties
Acquire the map from states to actions that maximizes the sum of rewards. The agent (AIBO) observes state s_t from the environment, takes action a_t, and receives reward r_{t+1}.
• Sarsa(λ) [Rummery and Niranjan 1994; Sutton 1996]
• Tile-coding (aka CMACs [Albus 1975])
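As an illustration of how Sarsa(λ) and tile-coding fit together, here is a minimal sketch with a linear action-value function. It is not the implementation used on the AIBO: the state is reduced to a single number for brevity (the talk uses a six-dimensional state), replacing traces are one common choice among several, and every class name, parameter name, and default value is an assumption.

```python
import random


class TileCoder:
    """Tile coding over a 1-D state in [low, high) with several offset tilings."""

    def __init__(self, low, high, tiles_per_tiling=8, num_tilings=4):
        self.low = low
        self.width = (high - low) / tiles_per_tiling
        self.tiles = tiles_per_tiling + 1  # one spare tile absorbs the offset
        self.num_tilings = num_tilings
        self.size = self.tiles * num_tilings

    def active(self, x):
        """Return one active feature index per tiling."""
        indices = []
        for k in range(self.num_tilings):
            offset = k * self.width / self.num_tilings
            tile = int((x - self.low + offset) / self.width)
            tile = min(max(tile, 0), self.tiles - 1)
            indices.append(k * self.tiles + tile)
        return indices


class SarsaLambda:
    """Linear Sarsa(lambda) with replacing eligibility traces."""

    def __init__(self, coder, num_actions, alpha=0.1, gamma=0.95,
                 lam=0.9, epsilon=0.1, seed=0):
        self.coder = coder
        self.num_actions = num_actions
        self.alpha = alpha / coder.num_tilings  # step size per active feature
        self.gamma, self.lam, self.epsilon = gamma, lam, epsilon
        self.w = [[0.0] * coder.size for _ in range(num_actions)]
        self.z = [[0.0] * coder.size for _ in range(num_actions)]
        self.rng = random.Random(seed)

    def q(self, x, a):
        """Action value: sum of the weights of the active tiles."""
        return sum(self.w[a][i] for i in self.coder.active(x))

    def choose(self, x):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_actions)
        return max(range(self.num_actions), key=lambda a: self.q(x, a))

    def start_episode(self):
        """Clear all eligibility traces."""
        self.z = [[0.0] * self.coder.size for _ in range(self.num_actions)]

    def update(self, x, a, reward, x_next, a_next, done):
        """One on-policy update from (s_t, a_t, r_{t+1}, s_{t+1}, a_{t+1})."""
        target = reward if done else reward + self.gamma * self.q(x_next, a_next)
        delta = target - self.q(x, a)
        for i in self.coder.active(x):
            self.z[a][i] = 1.0  # replacing trace for the visited features
        for b in range(self.num_actions):
            for i in range(self.coder.size):
                self.w[b][i] += self.alpha * delta * self.z[b][i]
                self.z[b][i] *= self.gamma * self.lam
```

A training loop would call start_episode once per game, then alternate choose and update at every time step, passing the action actually chosen for the next state as a_next.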
State representation: (r, φ, V_v, V_h, X, Y)
• (r, φ): Ball position
• (V_v, V_h): Ball velocity
• (X, Y): Robot position
(We removed the orientation of the robot by considering the PK situation only.)
Actions (10 kinds)
• 8 directional walks
• stay: prepare for the enemy’s kicks
• save: block the enemy’s shots on goal
Aim: to stop the ball using the save action as safely as possible
Rewards (punishments)
• save_reward: 0.5 when save is successful
• save_punishment: -0.02 when save fails
• lost_punishment: -10 when a goal is scored
• dist_reward: 1 - |ydist|/112.5 when the game is over
• passive_punishment: -0.0000001 when save is not selected
Slide callouts: "For getting near to the ball"; "For accelerating the initial phase".
One episode (game) is over when the ball is out, when a goal is scored, or when save is successful.
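The reward terms listed on this slide can be combined into a single per-step function, as in the sketch below. The numeric values come from the slide. The function name, the argument names, the action labels, and the choice to simply add the terms that apply in a step are assumptions, since the slide does not say how the terms are combined.

```python
SAVE_REWARD = 0.5
SAVE_PUNISHMENT = -0.02
LOST_PUNISHMENT = -10.0
PASSIVE_PUNISHMENT = -0.0000001
YDIST_SCALE = 112.5  # denominator from the slide; its unit is not stated


def step_reward(action, save_succeeded, goal_scored, ball_out, ydist):
    """Return (reward, episode_over) for one time step.

    Assumption: all terms that apply in a step are summed.
    """
    reward = 0.0
    if action == "save":
        reward += SAVE_REWARD if save_succeeded else SAVE_PUNISHMENT
    else:
        reward += PASSIVE_PUNISHMENT
    if goal_scored:
        reward += LOST_PUNISHMENT
    # Episode ends when the ball is out, a goal is scored, or a save succeeds.
    episode_over = ball_out or goal_scored or (action == "save" and save_succeeded)
    if episode_over:
        reward += 1.0 - abs(ydist) / YDIST_SCALE  # dist_reward
    return reward, episode_over
```

Under these assumptions, a successful save with ydist = 0 yields 0.5 + 1.0, while conceding a goal is dominated by the -10 term.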
Initial strategy and learned strategy (videos): https://youtu.be/C2YTw6d7xPw https://youtu.be/xCoSGsQHkRY Success: blue screen. Failure: red screen.
• (r, φ): Ball position
• (V_v, V_h): Ball velocity
Robot’s actions (figure legend): save action stay action × 8 walk actions
• (r, φ): Ball position
• (V_v, V_h): Ball velocity
• (X, Y): Robot position
Robot’s actions (figure legend): save action stay action × 8 walk actions
[Figure: average success rate in the past 100 episodes. x-axis: Number of Episodes (0 to 2000); y-axis: Success Rate (0 to 100). Annotations: 95 % success; 1200 episodes.]
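The plotted quantity, the success rate averaged over the past 100 episodes, can be computed with a sliding window. The sketch below is a generic moving average, not the authors' plotting code; the function name is hypothetical, and averaging over fewer episodes until the window fills is an assumption.

```python
from collections import deque


def rolling_success_rate(outcomes, window=100):
    """Return the success rate (in percent) over the last `window` episodes,
    one value per episode."""
    recent = deque(maxlen=window)  # old outcomes drop out automatically
    rates = []
    for success in outcomes:
        recent.append(1 if success else 0)
        rates.append(100.0 * sum(recent) / len(recent))
    return rates
```

Each element of outcomes is True for a successful save and False otherwise, in episode order.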
The true positions of the virtual ball and robot. The action of the real robot. https://youtu.be/HVx6TlHkPgw
https://youtu.be/F3-3o2oCP14
[Figure: success rate. x-axis: Number of Episodes (0 to 200); y-axis: Success Rate (0 to 100). Annotations: 45 % success; 75 % success; starting from the result (95 % success) in the simulator.]
[Figure: two success-rate plots. x-axes: Number of Episodes (0 to 200 and 0 to 2000); y-axes: Success Rate (0 to 100). Annotations: 200 episodes; 2000 episodes; gap of legged movements.]
Augmented soccer field system
• Intermediate role between simulated environments and real environments
Autonomous learning of goalie strategies
• Movements of real robots
• Convenience of virtual balls
Air hockey game using our system