
Hayato Kobayashi, Tsugutoyo Osaki, Tetsuro Okuyama, Akira Ishino, and Ayumi Shinohara
Tohoku University, Japan (Team Jolly Pochie)
2008/7/17 RoboCup Symposium 2008 in Suzhou, China

On-screen display: robot position, ball position
We need to check the screen and the field at the same time.
https://youtu.be/mB5MuDy9GFw

Augmented environment (with camera)
AR can alleviate the difficulty.
https://youtu.be/yGzA6hC9YY8

Augmented environment: intermediate role between the simulated environment and the real environment
Objects: real robot, virtual ball, virtual robot (diagram labels: perceptible, touchable)
Simulated environment shown in “Haribote”, developed by team ARAIBO

Augmented Soccer Field System
Components: projector, camera, recognition program, virtual application

Recognition of robots and other objects
• Extraction of contours by a background subtraction method
• Identification of robots’ orientation by a template matching method
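
The two recognition steps named on this slide can be illustrated with a minimal sketch. It is not the team's recognition program: the tiny grayscale grids, the threshold of 30, the sum-of-absolute-differences score, and the restriction to 90-degree rotations are all assumptions made for illustration. Background subtraction yields a foreground mask (from which contours could then be traced), and template matching picks the rotation of a robot template that best fits the mask.

```python
def background_subtraction(frame, background, threshold=30):
    """Return a binary mask of pixels that differ from the background image.

    The threshold value is a hypothetical choice for this sketch.
    """
    return [[1 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def rotate90(grid):
    """Rotate a square grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def match_orientation(patch, template):
    """Return the rotation (in degrees) of the template that best matches the patch.

    Uses the sum of absolute differences as the matching score (lower is better).
    Only 0/90/180/270 degrees are tried, which is a simplification.
    """
    best_angle, best_score = None, None
    candidate = template
    for angle in (0, 90, 180, 270):
        score = sum(abs(a - b)
                    for prow, crow in zip(patch, candidate)
                    for a, b in zip(prow, crow))
        if best_score is None or score < best_score:
            best_angle, best_score = angle, score
        candidate = rotate90(candidate)
    return best_angle


# Hypothetical 3x3 example: a "robot" template whose head points up.
TEMPLATE_UP = [[0, 1, 0],
               [0, 1, 0],
               [0, 0, 0]]
background = [[10, 10, 10] for _ in range(3)]
# Camera frame in which the robot appears rotated 90 degrees clockwise.
frame = [[10 + 200 * v for v in row] for row in rotate90(TEMPLATE_UP)]

mask = background_subtraction(frame, background)
orientation = match_orientation(mask, TEMPLATE_UP)
```

A real system would work on full camera images and a much finer set of candidate angles; the structure of the two steps stays the same.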

Data exchanged between the real environment and the virtual application
• Positions of virtual objects (e.g., virtual ball and robots)
• Positions of real objects (e.g., real robots)

Robots can interact with the virtual ball.

• Essential for robot soccer
  - No lost point, no lost game
• Learning has been difficult so far

• Human intervention
• Time consuming
• Motor failure
Ex. Learning of goal-saving skills in the real environment
Spank its head for a failed save; stroke its head for a successful save
https://youtu.be/9oHA-GH9JT8
https://youtu.be/3Pluuk20xqs

• Gap from real environments
  - Serious, especially for legged movements
Gap between the simple simulator (without any difficulties) and the real environment (with human intervention, a time-consuming process, and fear of motor failure)

• To bridge the gap
  - Using the movements of real robots
• To allow autonomous learning
  - Using the convenience of virtual balls
Easy mode: simple simulator (without any difficulties)
Normal mode: augmented environment (without human intervention)
Hard mode: real environment (with many difficulties)

• Acquire the map from states to actions that maximizes the sum of rewards
• Diagram: the agent (AIBO) sends action a_t to the environment; the environment returns state s_t and reward r_{t+1}
• Sarsa(λ) [Rummery and Niranjan 1994; Sutton 1996]
• Tile-coding (aka CMACs [Albus 1975])
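
As a rough illustration of the learning rule named on this slide, the following is a minimal sketch of one Sarsa(λ) update with linear function approximation over binary features (such as active tiles). The slides do not give the step size, discount factor, trace decay, or trace type, so `alpha`, `gamma`, `lam`, and the use of replacing traces are assumptions; the class and method names are hypothetical as well.

```python
from collections import defaultdict


class SarsaLambda:
    """Linear Sarsa(lambda) over binary features, with replacing traces."""

    def __init__(self, alpha=0.1, gamma=0.95, lam=0.9):
        self.alpha = alpha            # step size (assumed value)
        self.gamma = gamma            # discount factor (assumed value)
        self.lam = lam                # trace decay (assumed value)
        self.w = defaultdict(float)   # one weight per (feature, action)
        self.e = defaultdict(float)   # one eligibility trace per (feature, action)

    def q(self, features, action):
        """Action value: sum of the weights of the active features."""
        return sum(self.w[(f, action)] for f in features)

    def start_episode(self):
        """Traces are reset at the beginning of every episode."""
        self.e.clear()

    def update(self, features, action, reward, next_features, next_action, done):
        """Apply one on-policy update for (s_t, a_t, r_{t+1}, s_{t+1}, a_{t+1})."""
        # TD error: r + gamma * Q(s', a') - Q(s, a); no bootstrap at episode end.
        delta = reward - self.q(features, action)
        if not done:
            delta += self.gamma * self.q(next_features, next_action)
        # Replacing traces: active features of the taken action are set to 1.
        for f in features:
            self.e[(f, action)] = 1.0
        # Move every eligible weight toward the target, then decay its trace.
        for key, trace in list(self.e.items()):
            self.w[key] += self.alpha * delta * trace
            self.e[key] = self.gamma * self.lam * trace
        return delta


# Hypothetical single step: features 0 and 1 active, action 0, terminal reward 1.
agent = SarsaLambda(alpha=0.5, gamma=0.9, lam=0.8)
agent.start_episode()
td_error = agent.update([0, 1], 0, 1.0, [2], 1, done=True)
```

With several active features per state, the step size is commonly divided by the number of active features so that a single update does not overshoot the target.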

State representation: (r, φ, V_v, V_h, X, Y)
• (r, φ): Ball position
• (V_v, V_h): Ball velocity
• (X, Y): Robot position
(We removed the orientation of the robot by considering the penalty-kick (PK) situation only.)
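
To show how a continuous six-dimensional state like this one can be turned into a few binary features, here is a minimal tile-coding sketch. The number of tilings, the tiles per dimension, the uniform diagonal offsets, and the value ranges in the example are all assumptions; the slides do not state the actual tiling parameters or units.

```python
def tile_features(state, lows, highs, num_tilings=4, tiles_per_dim=4):
    """Return one active tile per tiling for a continuous state.

    Each feature is a hashable (tiling, coordinates) pair. Tiling t is shifted
    by t / num_tilings of a tile width along every dimension.
    """
    features = []
    for t in range(num_tilings):
        coords = []
        for x, lo, hi in zip(state, lows, highs):
            scaled = (x - lo) / (hi - lo) * tiles_per_dim
            shifted = scaled + t / num_tilings
            # Clamp so that out-of-range inputs still map to a valid tile.
            coords.append(max(0, min(int(shifted), tiles_per_dim)))
        features.append((t, tuple(coords)))
    return features


# Hypothetical normalized ranges for (r, phi, V_v, V_h, X, Y).
LOWS = [0.0] * 6
HIGHS = [1.0] * 6

f_a = tile_features([0.50] * 6, LOWS, HIGHS)
f_b = tile_features([0.51] * 6, LOWS, HIGHS)  # very close to a
f_c = tile_features([0.70] * 6, LOWS, HIGHS)  # somewhat close to a
f_d = tile_features([0.90] * 6, LOWS, HIGHS)  # far from a

shared_ab = len(set(f_a) & set(f_b))
shared_ac = len(set(f_a) & set(f_c))
shared_ad = len(set(f_a) & set(f_d))
```

Nearby states share most of their tiles and distant states share none, which is what lets a linear learner generalize locally across the continuous state space.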

Actions (10 kinds)
• 8 directional walks
• stay: prepare for the enemy’s kicks
• save: block the enemy’s shots on goal
Figures: 8 directional walk actions, stay action, save action

Aim: To stop the ball using the save action as safely as possible
Rewards (punishments)
• save_reward: 0.5 when save is successful
• save_punishment: -0.02 when save has failed
• lost_punishment: -10 when a goal is scored
• dist_reward: 1 - |ydist|/112.5 when the game is over
• passive_punishment: -0.0000001 when save is not selected
Slide callouts: for getting near to the ball; for accelerating the initial phase
One episode (game) is over when the ball is out, when a goal is scored, or when save is successful.
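
The reward terms listed on this slide can be written as a small function. This is a sketch under stated assumptions: the slide gives the values but not how the terms combine, so summing all terms that apply in a step is an assumption, as are the function name and its boolean arguments.

```python
SAVE_REWARD = 0.5
SAVE_PUNISHMENT = -0.02
LOST_PUNISHMENT = -10.0
PASSIVE_PUNISHMENT = -0.0000001
YDIST_SCALE = 112.5  # denominator of dist_reward on the slide


def step_reward(action_is_save, save_succeeded, goal_scored, ball_out, ydist):
    """Return (reward, episode_over) for one step.

    Assumption: all terms that apply in the step are added together.
    """
    reward = 0.0
    if action_is_save:
        reward += SAVE_REWARD if save_succeeded else SAVE_PUNISHMENT
    else:
        reward += PASSIVE_PUNISHMENT
    if goal_scored:
        reward += LOST_PUNISHMENT
    # The episode ends when the ball is out, a goal is scored, or a save succeeds.
    episode_over = goal_scored or ball_out or (action_is_save and save_succeeded)
    if episode_over:
        reward += 1.0 - abs(ydist) / YDIST_SCALE  # dist_reward
    return reward, episode_over


# Successful save with the ball directly in front of the robot (ydist = 0).
r_save, over_save = step_reward(True, True, False, False, 0.0)
# Failed save attempt while the ball is still in play.
r_miss, over_miss = step_reward(True, False, False, False, 50.0)
# Goal conceded while walking, with the ball far to the side.
r_lost, over_lost = step_reward(False, False, True, False, 112.5)
```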

Initial strategy: https://youtu.be/C2YTw6d7xPw
Learned strategy: https://youtu.be/xCoSGsQHkRY
Success: blue screen; failure: red screen

• (r, φ): Ball position
• (V_v, V_h): Ball velocity
Robot’s actions (legend): save action, stay action, 8 walk actions

• (r, φ): Ball position
• (V_v, V_h): Ball velocity
• (X, Y): Robot position
Robot’s actions (legend): save action, stay action, 8 walk actions


[Figure: average success rate over the past 100 episodes. Y-axis: success rate (0 to 100%); x-axis: number of episodes (0 to 2000). Annotations: 95% success; 1200 episodes.]
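
The plotted quantity, the average success rate over the past 100 episodes, can be computed as a rolling mean. A minimal sketch follows; how the first 99 episodes are handled is not stated on the slide, so averaging over however many episodes are available so far is an assumption, and the function name is hypothetical.

```python
from collections import deque


def rolling_success_rate(outcomes, window=100):
    """Return the success rate (%) over the last `window` episodes, per episode."""
    recent = deque(maxlen=window)
    successes = 0
    rates = []
    for ok in outcomes:
        if len(recent) == window:
            successes -= recent[0]   # this outcome is about to drop out
        recent.append(1 if ok else 0)
        successes += recent[-1]
        rates.append(100.0 * successes / len(recent))
    return rates


# Small example with a window of 2 episodes instead of 100.
example_rates = rolling_success_rate([True, False, True, True], window=2)
```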

Shown: the true positions of the virtual ball and robot; the action of the real robot
https://youtu.be/HVx6TlHkPgw

https://youtu.be/F3-3o2oCP14

[Figure: success rate (0 to 100%) vs. number of episodes (0 to 200), starting from the result (95% success) in the simulator. Annotations: 45% success; 75% success.]

[Figure: two plots of success rate (0 to 100%) vs. number of episodes, one over 2000 episodes and one over 200 episodes. Annotation: gap of legged movements.]

• Augmented soccer field system
  - Intermediate role between simulated environments and real environments
• Autonomous learning of goalie strategies
  - Movements of real robots
  - Convenience of virtual balls

Air hockey game using our system
