TacTex’13: A Champion Adaptive Power Trading Agent
Daniel Urieli, Peter Stone
Department of Computer Science, The University of Texas at Austin
{urieli,pstone}@cs.utexas.edu
AAAI 2014
The Smart Grid Vision
- “Grid 2030”: a vision for a smart grid
  [Cover image: “Grid 2030”: A National Vision for Electricity’s Second 100 Years. Transforming the Grid to Revolutionize Electric Power in North America. United States Department of Energy, Office of Electric Transmission and Distribution, July 2003]
- Major challenge: aligning supply and demand in the presence of renewable, intermittent generation
- AI: a main building block
- Smart grid: new challenges for AI [Ramchurn et al. 2012]
The Power Trading Agent Competition (Power TAC)
- Grid 2030 milestone: “Customer participation in power markets through demand-side management and distributed generation”
- Power TAC (Power Trading Agent Competition):
  - Uses a rich smart grid simulation platform
  - Focuses on the structure and operation of retail power markets
  - Competitors: autonomous broker agents
Approach
- Application domain: autonomous energy trading
- In this domain:
  - An agent is deployed into an unknown environment
  - The agent is expected to make robust, real-time decisions
  - The environment is realistic ⇒ complex
- To perform robustly, the agent needs to:
  - Learn
  - Predict
  - Plan
  - Adapt
- A natural approach: Reinforcement Learning
Reinforcement Learning in the Smart Grid
- Reinforcement Learning (RL):
  [Figure: agent-environment loop, with labels: Agent, Environment, State s, Action a, Reward r]
- Our domains require an RL agent to provide:
  - Sample efficiency
  - Computational efficiency
  - Handling of high-dimensional continuous state
  - Handling of continuous actions and/or delayed actions
  - Handling of possible non-stationarity
- This combination was not addressed by past RL algorithms
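The agent-environment loop named on the slide above (state s, action a, reward r) can be sketched in a few lines of Python. This is only an illustration of the interaction pattern: `ToyEnvironment`, `run_episode`, and `proportional_policy` are hypothetical names invented here, the environment is a one-dimensional toy rather than the Power TAC simulator, and the fixed policy does no learning, so it is not TacTex'13's algorithm.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ToyEnvironment:
    """Hypothetical 1-D environment (an assumption, not Power TAC)."""
    state: float = 0.0

    def step(self, action: float) -> Tuple[float, float]:
        # The action shifts the state; the reward is the negative
        # distance of the new state from a target value of 1.0.
        self.state += action
        reward = -abs(self.state - 1.0)
        return self.state, reward


def run_episode(env: ToyEnvironment,
                policy: Callable[[float], float],
                num_steps: int) -> List[Tuple[float, float, float]]:
    """Run the loop: observe s, choose a, receive r and the next state."""
    trajectory = []
    s = env.state
    for _ in range(num_steps):
        a = policy(s)              # agent picks an action from the state
        s_next, r = env.step(a)    # environment returns next state and reward
        trajectory.append((s, a, r))
        s = s_next
    return trajectory


def proportional_policy(s: float) -> float:
    # Fixed, hand-written policy: move halfway toward the target of 1.0.
    return 0.5 * (1.0 - s)


trajectory = run_episode(ToyEnvironment(), proportional_policy, num_steps=5)
for s, a, r in trajectory:
    print(f"s={s:.4f}  a={a:.4f}  r={r:.4f}")
```

A learning agent would replace `proportional_policy` with one that is updated from the observed (s, a, r) triples; the requirements listed on the slide (sample efficiency, continuous states and actions, non-stationarity) constrain how that update can be done.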
Power TAC: Game Description
[Figure: Power TAC game structure, with labels: national grid, renewables production, electricity generation companies, Balancing Market, Wholesale Market, Tariff Market, Electricity Grid, commercial/residential consumers, competing broker agents]
Power TAC: Broker Operation Cycle
Power TAC Game State
[Figure: game state, with labels: cash, weather forecast, day/time]
Power TAC 2013 Competition Results
Our agent, TacTex’13, won the Power TAC 2013 finals:

Broker            7-broker      4-broker      2-broker   Total (not normalized)
TacTex             -705248      13493825      17853189       30641766
cwiBroker           647400      12197772      13476434       26321606
MLLBroker             8533       3305131       9482400       12796064
CrocodileAgent     -361939       1592764       7105236        8336061
AstonTAC            345300       5977354       5484780       11807435
Mertacor           -621040       1279380       4919087        5577427
INAOEBroker02    -76112159    -497131383     -70255037     -643498580
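As a quick arithmetic check on the results above, the short Python sketch below assumes that the "Total (not normalized)" column is the sum of the 7-broker, 4-broker, and 2-broker columns. The slide does not state this explicitly; the variable names are illustrative. Under that assumption, every reported total matches the column sum to within 1, which is consistent with rounding of the underlying scores.

```python
# Scores copied from the results table: (7-broker, 4-broker, 2-broker, reported total).
results = {
    "TacTex":         (-705248, 13493825, 17853189, 30641766),
    "cwiBroker":      (647400, 12197772, 13476434, 26321606),
    "MLLBroker":      (8533, 3305131, 9482400, 12796064),
    "CrocodileAgent": (-361939, 1592764, 7105236, 8336061),
    "AstonTAC":       (345300, 5977354, 5484780, 11807435),
    "Mertacor":       (-621040, 1279380, 4919087, 5577427),
    "INAOEBroker02":  (-76112159, -497131383, -70255037, -643498580),
}

# Difference between the reported total and the sum of the three columns
# (assumption: the total is a plain sum of the per-game-size scores).
differences = {
    name: reported - (seven + four + two)
    for name, (seven, four, two, reported) in results.items()
}

# Brokers ordered by reported (non-normalized) total, highest first.
by_total = sorted(results, key=lambda name: results[name][3], reverse=True)

for name in by_total:
    print(f"{name:15s} total={results[name][3]:>12d}  diff={differences[name]:+d}")
```

TacTex has the highest non-normalized total even though its 7-broker score is negative; its lead comes from the 4-broker and 2-broker games.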
TacTex’13: Approach
TacTex’13: Approach
[Figure: full Power TAC game structure, with labels: national grid, renewables production, electricity generation companies, Balancing Market, Wholesale Market, Tariff Market, Electricity Grid, commercial/residential consumers, competing broker agents]
TacTex’13: Approach
[Figure: reduced view of the game structure, with labels: electricity generation companies, Wholesale Market, Tariff Market, Electricity Grid, commercial/residential consumers]