Intelligent Agents
Chapter 2

Outline
♦ Agents and environments
♦ Rationality
♦ PEAS (Performance measure, Environment, Actuators, Sensors)
♦ Environment types
♦ Agent types

Agents and environments
[Figure: an agent perceives the environment through sensors (percepts) and acts on it through actuators (actions)]
Agents include humans, robots, softbots, thermostats, etc.
The agent function maps from percept histories to actions:
    f : P* → A
The agent program runs on the physical architecture to produce f

Vacuum-cleaner world
[Figure: two squares, A and B]
Percepts: location and contents, e.g., [A, Dirty]
Actions: Left, Right, Suck, NoOp
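The distinction between the agent function f and the agent program can be sketched in a few lines of Python. This is only an illustration: the names `make_agent_program` and `last_percept_only` are hypothetical, and percepts are assumed to be (location, status) pairs as in the vacuum-cleaner world.

```python
from typing import Callable, List, Tuple

Percept = Tuple[str, str]   # (location, status), e.g. ("A", "Dirty")
Action = str                # "Left", "Right", "Suck" or "NoOp"
AgentFunction = Callable[[List[Percept]], Action]   # f : P* -> A


def make_agent_program(f: AgentFunction) -> Callable[[Percept], Action]:
    """Wrap an agent function f as a program fed one percept at a time."""
    history: List[Percept] = []

    def program(percept: Percept) -> Action:
        history.append(percept)      # the program accumulates the history
        return f(list(history))      # f sees the whole percept sequence

    return program


def last_percept_only(history: List[Percept]) -> Action:
    """Illustrative f that ignores everything but the latest percept."""
    location, status = history[-1]
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"


program = make_agent_program(last_percept_only)
print(program(("A", "Dirty")))
print(program(("A", "Clean")))
```

The wrapper keeps the full history only to mirror the definition of f; a practical program would usually keep a compact summary instead.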
A vacuum-cleaner agent
    Percept sequence            Action
    [A, Clean]                  Right
    [A, Dirty]                  Suck
    [B, Clean]                  Left
    [B, Dirty]                  Suck
    [A, Clean], [A, Clean]      Right
    [A, Clean], [A, Dirty]      Suck
    ...                         ...

function Reflex-Vacuum-Agent([location, status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left

What is the "right/correct" function?
Can it be implemented in a small agent program?

Rationality
Fixed performance measure evaluates the environment sequence
– one point per square cleaned up in time T?
– one point per clean square per time step, minus one per move?
– penalize for > k dirty squares?
A rational agent chooses whichever action maximizes the expected value of the performance measure given the percept sequence to date
Rational ≠ omniscient
– percepts may not supply all relevant information
Rational ≠ clairvoyant
– action outcomes may not be as expected
Hence, rational ≠ successful
Rational ⇒ exploration, learning, autonomy

PEAS
To design a rational agent, we must specify the task environment
Consider, e.g., the task of designing an automated taxi:
Performance measure?? safety, destination, profits, legality, comfort, ...
Environment?? US streets/freeways, traffic, pedestrians, weather, ...
Actuators?? steering, accelerator, brake, horn, speaker/display, ...
Sensors?? video, accelerometers, gauges, engine sensors, keyboard, GPS, ...
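The Reflex-Vacuum-Agent pseudocode and one of the candidate performance measures from the Rationality slide (one point per clean square per time step, minus one per move) can be tried out in a small simulation. The sketch below assumes a deterministic two-square world in which Suck always works and squares never get dirty again; the `run` helper and its scoring loop are illustrative, not part of the slides.

```python
def reflex_vacuum_agent(percept):
    """Direct transcription of the Reflex-Vacuum-Agent pseudocode."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    if location == "A":
        return "Right"
    return "Left"


def run(agent, world, location, steps):
    """Simulate the two-square world and return the total score.

    Scoring: +1 per clean square at each time step, -1 per Left/Right move.
    """
    world = dict(world)          # copy so the caller's dict is not mutated
    score = 0
    for _ in range(steps):
        action = agent((location, world[location]))
        if action == "Suck":
            world[location] = "Clean"
        elif action == "Right":
            location = "B"
            score -= 1
        elif action == "Left":
            location = "A"
            score -= 1
        score += sum(1 for status in world.values() if status == "Clean")
    return score


print(run(reflex_vacuum_agent, {"A": "Dirty", "B": "Dirty"}, "A", 4))  # 4
```

Under this measure the reflex agent keeps shuttling between the squares after both are clean and pays the move penalty every step, which is one way to see why the choice of performance measure matters for what counts as rational.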
Internet shopping agent
Performance measure?? price, quality, appropriateness, efficiency
Environment?? current and future WWW sites, vendors, shippers
Actuators?? display to user, follow URL, fill in form
Sensors?? HTML pages (text, graphics, scripts)

Environment types
Fully observable vs. partially observable
– Can the agent observe/know everything in a state?
Deterministic vs. stochastic
– Do the current state and action fully determine the next state?
Episodic vs. sequential
– Does the current action affect future actions?
– Going to class does not affect doing homework in the future.
– How you make a move in a chess game affects your moves later.
Static vs. dynamic
– Can the environment change while the agent is thinking?
Discrete vs. continuous
– Finitely many distinct values, or a continuous range?
Single agent vs. multi-agent
– Does the agent deal with other agents?
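The six dimensions above can be recorded as a simple record type. The sketch below is one possible encoding, not something the slides prescribe: the class name `EnvironmentType` and the method `harder_cases` are hypothetical, each field is True for the easier case, and the Solitaire and Taxi values are copied from the deck's own comparison table. Booleans cannot express the table's "Partly" and "Semi" entries, so only rows with plain Yes/No answers are shown.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EnvironmentType:
    fully_observable: bool   # False means partially observable
    deterministic: bool      # False means stochastic
    episodic: bool           # False means sequential
    static: bool             # False means dynamic
    discrete: bool           # False means continuous
    single_agent: bool       # False means multi-agent

    def harder_cases(self):
        """Names of the dimensions on which this environment is the harder case."""
        return [name for name, easy in asdict(self).items() if not easy]


solitaire = EnvironmentType(
    fully_observable=True, deterministic=True, episodic=False,
    static=True, discrete=True, single_agent=True,
)
taxi = EnvironmentType(
    fully_observable=False, deterministic=False, episodic=False,
    static=False, discrete=False, single_agent=False,
)

print(solitaire.harder_cases())
print(taxi.harder_cases())
```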
Environment types
                    Solitaire    Backgammon    Internet shopping        Taxi
Observable??        Yes          Yes           No                       No
Deterministic??     Yes          No            Partly                   No
Episodic??          No           No            No                       No
Static??            Yes          Semi          Semi                     No
Discrete??          Yes          Yes           Yes                      No
Single-agent??      Yes          No            Yes (except auctions)    No

The environment type largely determines the agent design
The real world is (of course) partially observable, stochastic, sequential, dynamic, continuous, multi-agent

Agent types
Four basic types in order of increasing generality:
– simple reflex agents
– reflex agents with state
– goal-based agents
– utility-based agents
All these can be turned into learning agents

Simple reflex agents
[Figure: simple reflex agent. Sensors give "What the world is like now"; Condition-action rules give "What action I should do now"; Actuators act on the Environment]
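The simple reflex architecture, where the current percept is matched against condition-action rules, can be sketched generically. The helper `make_simple_reflex_agent` and the rule representation (a list of predicate/action pairs tried in order, with NoOp as the fallback) are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Percept = Tuple[str, str]                      # (location, status)
Rule = Tuple[Callable[[Percept], bool], str]   # (condition, action)


def make_simple_reflex_agent(rules: List[Rule], default: str = "NoOp"):
    """Return an agent that fires the first rule whose condition holds."""
    def agent(percept: Percept) -> str:
        for condition, action in rules:
            if condition(percept):
                return action
        return default                         # no rule matched
    return agent


# Condition-action rules for the vacuum-cleaner world, in priority order.
vacuum_rules: List[Rule] = [
    (lambda p: p[1] == "Dirty", "Suck"),
    (lambda p: p[0] == "A", "Right"),
    (lambda p: p[0] == "B", "Left"),
]

agent = make_simple_reflex_agent(vacuum_rules)
print(agent(("A", "Dirty")))
print(agent(("B", "Clean")))
```

Keeping the rules as data rather than as nested conditionals makes it clear that the agent consults only the current percept and holds no memory between calls.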
Example
function Reflex-Vacuum-Agent([location, status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left

Reflex agents with state
[Figure: reflex agent with state. Sensors, together with State, "How the world evolves" and "What my actions do", give "What the world is like now"; Condition-action rules give "What action I should do now"; Actuators act on the Environment]
Why add an internal model of how the environment evolves?

Example
function Reflex-Vacuum-Agent([location, status]) returns an action
    state ← Update-State(state, location, status)
    if state = ... and status = Dirty then ...

Goal-based agents
[Figure: goal-based agent. Sensors, together with State, "How the world evolves" and "What my actions do", give "What the world is like now" and "What it will be like if I do action A"; Goals give "What action I should do now"; Actuators act on the Environment]
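The reflex-with-state example above leaves its state update and rules elided. One possible way to fill them in, offered purely as an assumption and not as the slides' intended solution, is to remember the last known status of each square and stop moving once both are known to be clean. The sketch assumes Suck always succeeds and clean squares stay clean; `make_stateful_vacuum_agent` is a hypothetical name.

```python
def make_stateful_vacuum_agent():
    """Vacuum agent that remembers the last known status of each square."""
    known = {"A": None, "B": None}     # None means "not observed yet"

    def agent(percept):
        location, status = percept
        known[location] = status       # update state from the current percept
        if status == "Dirty":
            known[location] = "Clean"  # assumed model: Suck leaves it clean
            return "Suck"
        if all(s == "Clean" for s in known.values()):
            return "NoOp"              # nothing left to do, so avoid moving
        return "Right" if location == "A" else "Left"

    return agent


agent = make_stateful_vacuum_agent()
print(agent(("A", "Dirty")))
print(agent(("A", "Clean")))
print(agent(("B", "Clean")))
```

Unlike the simple reflex agent, this one can return NoOp, because its internal state lets it distinguish "the other square is clean" from "the other square has not been checked yet".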
Utility-based agents
[Figure: utility-based agent. Sensors, together with State, "How the world evolves" and "What my actions do", give "What the world is like now" and "What it will be like if I do action A"; Utility gives "How happy I will be in such a state", which determines "What action I should do now"; Actuators act on the Environment]

Learning agents
[Figure: learning agent. Components: Critic (with an external Performance standard), Learning element, Performance element, Problem generator, Sensors and Actuators, interacting with the Environment; arrows labelled feedback, changes, knowledge, learning goals]

Summary
Agents interact with environments through actuators and sensors
The agent function describes what the agent does in all circumstances
The performance measure evaluates the environment sequence
A perfectly rational agent maximizes expected performance
Agent programs implement (some) agent functions
PEAS descriptions define task environments
Environments are categorized along several dimensions:
    observable? deterministic? episodic? static? discrete? single-agent?
Several basic agent architectures exist:
    reflex, reflex with state, goal-based, utility-based
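The idea behind the utility-based design, and behind "maximizes expected performance" in the summary, can be sketched as one-step expected-utility maximization. Everything specific here is an assumption for illustration: the stochastic model in which Suck succeeds with probability 0.8, the utility that counts clean squares, and the function names.

```python
def outcomes(state, action):
    """Hypothetical model: list of (next_state, probability) pairs.

    A state is (location, status_of_A, status_of_B).
    """
    location, a, b = state
    if action == "Suck":
        if location == "A":
            cleaned = ("A", "Clean", b)
        else:
            cleaned = ("B", a, "Clean")
        return [(cleaned, 0.8), (state, 0.2)]   # Suck fails 20% of the time
    if action == "Right":
        return [(("B", a, b), 1.0)]
    if action == "Left":
        return [(("A", a, b), 1.0)]
    return [(state, 1.0)]                        # NoOp


def utility(state):
    """How happy the agent is in a state: number of clean squares."""
    return sum(1 for status in state[1:] if status == "Clean")


def expected_utility(state, action):
    return sum(p * utility(s) for s, p in outcomes(state, action))


def choose_action(state, actions=("Suck", "Left", "Right", "NoOp")):
    """Pick the action with the highest expected utility one step ahead."""
    return max(actions, key=lambda act: expected_utility(state, act))


print(choose_action(("A", "Dirty", "Dirty")))
```

A one-step lookahead like this cannot see the benefit of moving toward a dirty square, since moving does not change the utility of the next state; a longer planning horizon would be needed for that.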