osu!mania Reinforcement Learning Agent
Χρυσομάλλης Ιάσων
ichrysomallis@isc.tuc.gr
2014030078
Contents
o Introduction
o osu!mania game
o Graphical User Interface customization
o Agent’s environment
o Approach and variable definition
o Q-learning
o Deep reinforcement learning
o Future plans
Introduction
Topic: develop an agent able to learn how to play the video game osu!mania through reinforcement learning.
Two agents:
o Q-learning agent
o Deep reinforcement learning agent
osu!mania game
Rhythm game with falling notes. Numbered screen elements:
1. Single-tap notes
2. Hold notes
3. Judgment bar
4. Player keys
5. Combo
6. Hitburst
7. Overall accuracy
8. Score
Graphical User Interface customization
o Fully customizable environment: all elements can be changed
o Each element is painted with a solid color RGB = [X, 100, 100], where X depends on the element’s identity (see the numbered elements of the game screen)
Agent’s environment
o Record screenshots and decode the information from the known RGB values
o Only a small fraction of the screen contains relevant information, so only specific boxes are recorded
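The color coding above makes pixel decoding a simple lookup. The sketch below is illustrative only: it assumes that the red value X is exactly the element number from the game screen, which the slides do not state, and the names `ELEMENTS` and `identify` are hypothetical.

```python
from typing import Optional, Tuple

# ASSUMPTION: X (the red value) equals the element number shown on the
# game screen. The actual values used in the skin are not given.
ELEMENTS = {
    1: "single-tap note",
    2: "hold note",
    3: "judgment bar",
    4: "player keys",
    5: "combo",
    6: "hitburst",
    7: "overall accuracy",
    8: "score",
}


def identify(pixel: Tuple[int, int, int]) -> Optional[str]:
    """Return the element painted at this pixel, or None for background."""
    red, green, blue = pixel
    # Every customized element shares G = 100 and B = 100.
    if (green, blue) != (100, 100):
        return None
    return ELEMENTS.get(red)
```

Because only the red value varies between elements, the agent can discard the green and blue layers once a pixel is known to lie inside a recorded box.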
Approach and variable definition (1)
o Behavior is identical in every column, so the problem can be narrowed down to single-column learning
o Agent’s actions:
  1. Instantaneous key tap
  2. Key press (no release)
  3. Key release
  4. Do nothing
Approach and variable definition (2)
Rewards:
Epsilon:
o Initial value = 1
o Decay value = 0.9977
o Minimum value = 0.01
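The epsilon values above can be read as a multiplicative decay with a floor, combined with epsilon-greedy action selection. The following is a minimal sketch under that reading. The slides do not say whether the decay is applied per step or per episode, and the function names are hypothetical.

```python
import random
from typing import Optional, Sequence

EPSILON_START = 1.0     # initial value from the slide
EPSILON_DECAY = 0.9977  # decay value from the slide
EPSILON_MIN = 0.01      # minimum value from the slide


def decay_epsilon(epsilon: float) -> float:
    """Apply one multiplicative decay, never going below the minimum."""
    return max(EPSILON_MIN, epsilon * EPSILON_DECAY)


def choose_action(q_values: Sequence[float], epsilon: float,
                  rng: Optional[random.Random] = None) -> int:
    """Epsilon-greedy choice over the action indices of q_values."""
    rng = rng or random
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def steps_until_minimum() -> int:
    """Count how many decays it takes to reach the minimum epsilon."""
    epsilon, steps = EPSILON_START, 0
    while epsilon > EPSILON_MIN:
        epsilon = decay_epsilon(epsilon)
        steps += 1
    return steps
```

With these values epsilon reaches its minimum after about 2000 decays, so the agent moves from almost pure exploration to almost pure exploitation over that span.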
Approach and variable definition (3)
State:
o One column of 200 pixels
o Only the red (R) layer
o Three possible values (no note, single-tap note, hold note)
Deep reinforcement learning:
o Raw input of the column
Q-learning:
o Only 8 pixels due to state complexity, taking one pixel every 15 pixels of the recorded column
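To illustrate why the Q-learning state is reduced, the sketch below subsamples a column and encodes it as a tuple of three-valued entries. With 8 sampled pixels there are 3^8 = 6561 possible states, against 3^200 for the raw column. The red values used for each note type and the choice of the first 8 samples at stride 15 are assumptions, since the slides do not specify either.

```python
from typing import Sequence, Tuple

NO_NOTE, SINGLE_TAP, HOLD = 0, 1, 2  # hypothetical class labels

# ASSUMPTION: red values that mark each note type in the recorded column.
RED_SINGLE_TAP = 1
RED_HOLD = 2


def classify(red: int) -> int:
    """Map a red-layer value to one of the three possible pixel values."""
    if red == RED_SINGLE_TAP:
        return SINGLE_TAP
    if red == RED_HOLD:
        return HOLD
    return NO_NOTE


def q_state(column_red: Sequence[int], stride: int = 15,
            n_samples: int = 8) -> Tuple[int, ...]:
    """Take one pixel every `stride` pixels and keep the first `n_samples`."""
    sampled = column_red[::stride][:n_samples]
    return tuple(classify(red) for red in sampled)


def state_index(state: Sequence[int]) -> int:
    """Encode the state as a base-3 number, usable as a Q-table row index."""
    index = 0
    for value in state:
        index = index * 3 + value
    return index
```

For example, a 200-pixel column with a single-tap note at pixel 15 and a hold note at pixel 30 gives the state `(0, 1, 2, 0, 0, 0, 0, 0)`.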
Q-learning
Algorithm:
Steps:
1. Receive current state
2. Choose an action based on epsilon
3. Execute the action
4. Receive new state
5. Check if song is over
6. Update Q-table
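The final step, updating the Q-table, can be sketched with the standard tabular Q-learning rule Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)]. The learning rate and discount factor below are placeholder assumptions, as the slides do not list them, and the helper names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, List

N_ACTIONS = 4  # tap, press, release, do nothing
ALPHA = 0.1    # ASSUMPTION: learning rate, not given in the slides
GAMMA = 0.95   # ASSUMPTION: discount factor, not given in the slides

QTable = DefaultDict[Hashable, List[float]]


def make_q_table() -> QTable:
    """Create a Q-table whose unseen states start with all-zero values."""
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def q_update(q: QTable, state: Hashable, action: int, reward: float,
             next_state: Hashable, done: bool) -> None:
    """Apply one standard Q-learning update in place."""
    # No future value is bootstrapped once the song is over.
    best_next = 0.0 if done else max(q[next_state])
    target = reward + GAMMA * best_next
    q[state][action] += ALPHA * (target - q[state][action])
```

The `done` flag corresponds to the "check if song is over" step: on the last transition of a song the target is the reward alone.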
Deep reinforcement learning
Neural network model (Keras):
Steps:
o Identical to the Q-learning steps, apart from the last one
o Instead of updating a Q-table, save transitions in a temporary memory and train the model on a smaller, randomly selected sample (batch)
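The temporary memory and random batch described above are commonly implemented as an experience replay buffer. The sketch below shows that idea in plain Python. The capacity, the discount factor, and all names are assumptions, and the Keras training call itself is left out because the network architecture is not given.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Sequence


class Transition(NamedTuple):
    state: Sequence[int]
    action: int
    reward: float
    next_state: Sequence[int]
    done: bool


class ReplayMemory:
    """Fixed-size memory that drops the oldest transitions when full."""

    def __init__(self, capacity: int = 10_000) -> None:  # capacity is assumed
        self.buffer: Deque[Transition] = deque(maxlen=capacity)

    def push(self, transition: Transition) -> None:
        self.buffer.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        """Return a random batch without replacement."""
        size = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), size)

    def __len__(self) -> int:
        return len(self.buffer)


def td_target(reward: float, next_q_values: Sequence[float], done: bool,
              gamma: float = 0.95) -> float:
    """Training target for the taken action; gamma is an assumed value."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a training loop, `next_q_values` would come from the model's prediction for `next_state`, and the resulting targets would be passed to the model's fit step for each sampled batch.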
Results
o Q-learning agent
o DQN agent
Future plans
o Try different combinations of neural network model layers
o Design the neural network model in TensorFlow
o Run the agent on a GPU instead of the CPU
o Make use of a high-end computer