Learning Significant Locations and Predicting User Movement with GPS Daniel Ashbrook and Thad Starner Contextual Computing Group http://www.cc.gatech.edu/ccg College of Computing, GVU Center Georgia Institute of Technology Atlanta, GA USA


SLIDE 1

Georgia Tech

Learning Significant Locations and Predicting User Movement with GPS

Daniel Ashbrook and Thad Starner

Contextual Computing Group http://www.cc.gatech.edu/ccg College of Computing, GVU Center Georgia Institute of Technology Atlanta, GA USA

SLIDE 2

Motivation

  • Location is a very common form of context

– easy to collect
– infer other pieces of context

  • Most applications rely only on user’s current location

SLIDE 3

Motivation

  • How can we improve location context?
  • Look for patterns of movement and learn user’s daily schedule

– predict where user is going based on where user has been

  • Goal: computer can act as agent

– offer suggestions at appropriate times
– enable collaboration between colleagues

SLIDE 4

Applications

  • Potential applications for location prediction
  • Single–user applications

– system only knows about one user’s movements

  • Multi–user applications

– system combines predictions for several people

SLIDE 5

Applications

  • Single user: Pre–emptive Reminders

– remind user at an appropriate time
– example: library book

  • try to determine if user will pass library today
  • only then remind user to take book before leaving home

SLIDE 6

Applications

  • Single user: Wireless caching

– wireless networks often unavailable

  • lack of infrastructure
  • radio shadows (buildings, subway)

– hide lack of connectivity by caching
– predict when caching will be insufficient

  • warn user
  • suggest alternative routes

SLIDE 7

Applications

  • Single user: Wireless caching

– cache even when network is available

  • transmission power can increase with 4th power of distance in complex environments (e.g., a city)
  • cost can vary with network used, time of day

– prediction can allow savings

  • of battery power
  • of money

SLIDE 8

Applications

  • Multi–user: Enabling collaboration

– “Will I see Bob today?”

  • compare the user’s and Bob’s schedules
  • give yes or no answer

– Scheduling many–person meetings

  • find when most people are free and suggest a time
  • also discover most convenient place to meet

SLIDE 9

Applications

  • Multi–user: Favor exchange

– remotely coordinate favor trading
– example: FedEx/UPS package trading

SLIDE 10

Related Work

  • Bhattacharya — cell phone prediction
  • Davis — prediction with ad–hoc networks
  • Kortuem — Walid
  • Marmasse — comMotion
  • Liu — predictively caching network architecture
  • Orwant — Doppelgänger
  • Sparacino — Museum Wearable
  • Wolf — travel diaries

SLIDE 11

Hardware

  • Garmin GPS model 35-LVS
  • GeoStats data logger

– 1 MPH recording limit

SLIDE 12

Hardware

  • Preliminary data collected in Atlanta Sep-Dec 2001
  • Data currently being collected from multiple users in Zürich, Switzerland

[Figure: Preliminary data, Atlanta, GA]

SLIDE 13

Software

  • Preliminary implementation

– finds points of possible significance
– creates probabilistic model of user’s movements

  • Markov model

– using model, simple queries are possible:

  • “The user is at home. Where will she go next?”
  • “How likely is the user to visit the grocery store today?”

SLIDES 14-17

Software

  • Markov model

– collection of nodes
– transitions between nodes
– each transition has a probability of occurring
– can also have self–transitions

SLIDE 18

Software

  • Our Markov model

– nodes are significant locations
– transitions are trips between those locations

SLIDE 19

Software

  • Significance

– how do we determine if a particular GPS coordinate might have some meaning to the user?

SLIDE 20

Software

  • Places

– logged GPS coordinates where the user rested for more than time t (“resting time”)
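
A minimal sketch of the place-finding idea above. It assumes the log is a time-ordered list of (timestamp, latitude, longitude) fixes and that a gap longer than t between consecutive fixes marks a rest, since the logger stops recording below 1 MPH. The function name `find_places`, the tuple layout, and the sample coordinates are illustrative assumptions, not the authors' implementation.

```python
from datetime import datetime, timedelta

def find_places(fixes, t=timedelta(minutes=10)):
    """Return the fixes that are followed by a gap longer than t.

    fixes: time-ordered list of (timestamp, lat, lon) tuples.
    Assumption: a gap between consecutive fixes means the user was
    resting at the last coordinate recorded before the gap.
    """
    places = []
    for prev, curr in zip(fixes, fixes[1:]):
        if curr[0] - prev[0] > t:
            places.append(prev)
    return places

# Hypothetical log: the 29-minute gap after the second fix marks a place.
log = [
    (datetime(2001, 9, 1, 8, 0), 33.7756, -84.3963),
    (datetime(2001, 9, 1, 8, 1), 33.7760, -84.3970),
    (datetime(2001, 9, 1, 8, 30), 33.7761, -84.3971),
    (datetime(2001, 9, 1, 8, 31), 33.7770, -84.3980),
]
places = find_places(log)
```

Calling `find_places` with several values of t and counting the results would give the data for the "number of places vs. t" graph discussed next.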

SLIDES 21-24

Software

  • How to pick t?

– try lots of values
– graph number of places found for each value
– but relationship is nearly linear!
– so we pick an arbitrary value: t = 10 minutes

SLIDES 25-26

Software

[Figure: two map panels, “All data” and “Only places, with t = 10m”]

SLIDES 27-28

Software

  • Locations

– problem: too many places

  • GPS inaccuracy
  • different exit points from buildings

– solution: cluster places to form locations

  • all places within a radius r of a particular place form a single location

SLIDES 29-30

Software

[Figure: three map panels, “All data”, “Only places, with t = 10m”, and “Only locations”]

SLIDES 31-33

Software

  • How to pick radius r?

– too large value

  • too few clusters
  • unrelated places together

– too small value

  • too many clusters

  • Solution:

– try various values for r
– find knee in graph
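
The slides do not say how the knee is located. One common heuristic, shown here only as an assumed stand-in, picks the point farthest from the straight line joining the first and last points of the curve. The function name `find_knee` and the sample radii and cluster counts are made up for illustration.

```python
def find_knee(xs, ys):
    """Return the index of the point farthest from the chord that joins
    the first and last points of the curve (a common knee heuristic,
    not necessarily the one used by the authors)."""
    x0, y0, x1, y1 = xs[0], ys[0], xs[-1], ys[-1]
    dx, dy = x1 - x0, y1 - y0
    best_i, best_d = 0, -1.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        # Perpendicular distance to the chord, up to a constant factor.
        d = abs(dy * (x - x0) - dx * (y - y0))
        if d > best_d:
            best_i, best_d = i, d
    return best_i

# Hypothetical sweep: number of locations found for each radius r.
radii = [0.1, 0.2, 0.3, 0.5, 1.0, 2.0, 4.0]
counts = [60, 40, 25, 20, 18, 16, 15]
knee = find_knee(radii, counts)
```

The result depends on how the two axes are scaled, so in practice the values would usually be normalised before the comparison.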

SLIDES 34-38

Software

  • Clustering places into locations

– pick one place (•)
– find all places within radius r (•)
– find the mean of those places (x)
– repeat with x as the new center
– continue until the mean stops changing
– start again with another place
– repeat until no more places
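
The clustering steps above can be sketched as follows. This is a rough illustration, assuming places are planar (x, y) points in the same unit as r (real GPS coordinates would need a projection or a great-circle distance). The name `cluster_places` and the sample points are hypothetical.

```python
import math

def cluster_places(places, r):
    """Group (x, y) places into locations by repeatedly moving a center
    to the mean of the places within radius r of it."""
    remaining = list(places)
    locations = []
    while remaining:
        center = remaining[0]              # pick one place
        members = []
        while True:
            # find all remaining places within radius r of the center
            new_members = [p for p in remaining if math.dist(p, center) <= r]
            # stop when the member set (and so the mean) stops changing
            if not new_members or new_members == members:
                break
            members = new_members
            center = (sum(x for x, _ in members) / len(members),
                      sum(y for _, y in members) / len(members))
        locations.append(center)
        # start again with the places not yet assigned to a location
        remaining = [p for p in remaining if p not in members]
    return locations

places = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
locations = cluster_places(places, r=2)
```

With these sample points the sketch yields two locations, one near the origin and one at (10.5, 10.0).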

SLIDES 39-40

Software

  • Sublocations

– problem: subsuming smaller-scale paths
– solution: create sublocations within larger clusters

SLIDES 41-42

Software

  • How to determine if sublocations exist?

– use same knee & graph algorithm on each location
– if no knee exists, not enough points to form sublocations

SLIDES 43-46

Software

  • Sublocations can have multiple scales

– Country level
– State level
– City level
– Campus level

SLIDE 47

Software

  • Prediction

– each location gets a unique ID

  • user may provide a unique name for each location such as “home” or “work”

– replace each place in original list with ID

  • result: list of locations that were visited, in the order that they were visited

SLIDES 48-51

Software

  • For each location

– count number of visits to each other location
– count total number of visits to other locations
– divide to get probability of transition
– result: Markov model for each location
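
The counting-and-dividing procedure above can be sketched in a few lines. The input is assumed to be the ordered list of visited location IDs; the name `transition_probabilities` and the sample visit sequence are hypothetical.

```python
from collections import Counter, defaultdict

def transition_probabilities(visits):
    """Estimate first-order transition probabilities from an ordered
    list of visited location IDs."""
    counts = defaultdict(Counter)
    # count number of visits from each location to each next location
    for src, dst in zip(visits, visits[1:]):
        counts[src][dst] += 1
    probs = {}
    for src, dests in counts.items():
        total = sum(dests.values())        # total visits leaving src
        probs[src] = {dst: n / total for dst, n in dests.items()}
    return probs

visits = ["Home", "CRB", "Home", "CRB", "Grocery store", "Home", "CRB", "Home"]
model = transition_probabilities(visits)
```

Here `model["CRB"]` holds the estimated chances of going from CRB to each location seen next, which is the kind of ratio reported on the following slides.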

SLIDE 52

Software

  • People don’t move randomly!

– 23 locations total, so chance of A→? = 1/22 = 4.5%
– measured ratio CRB→Home = 16/77 = 21%

SLIDE 53

Software

  • Orders of Markov model

– 1st order A → ?

  • a given state’s transition probabilities only depend on that state

– 2nd order B → A → ?

  • a given state’s transition probabilities depend on that state and the previous state

– and so on…
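
To illustrate how the order changes the model, the sketch below keys each transition on the last `order` locations instead of only the current one. The function name `nth_order_model` and the letter sequence are made up for the example.

```python
from collections import Counter, defaultdict

def nth_order_model(visits, order):
    """Map each history of `order` consecutive locations (as a tuple)
    to the estimated probabilities of the location visited next."""
    counts = defaultdict(Counter)
    for i in range(len(visits) - order):
        history = tuple(visits[i:i + order])
        counts[history][visits[i + order]] += 1
    return {h: {nxt: n / sum(c.values()) for nxt, n in c.items()}
            for h, c in counts.items()}

visits = ["B", "A", "C", "B", "A", "C", "D", "A", "C", "B", "A", "D"]
first = nth_order_model(visits, 1)    # A -> ?
second = nth_order_model(visits, 2)   # B -> A -> ?
```

In this toy sequence, `first[("A",)]` gives C a probability of 0.75, while `second[("D", "A")]` gives C a probability of 1.0, showing how the previous state can sharpen a prediction.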

SLIDE 54

Software

  • First order predictions

Movement                  Probability  % Chance
CRB → Home                16/77        21%
CRB → Hardware store      10/77        13%
CRB → Jake’s Ice Cream    10/77        13%
CRB → Grocery store       8/77         10%
CRB → GA400               5/77         6%
CRB → 10th/14th St.       5/77         6%
CRB → Taco Bell           3/77         4%

  • Random chance: 1/22 = 4.5%

SLIDE 55

Software

  • Second order predictions

Movement                          Probability  % Chance
Home → CRB                        14/20        70%
Home → CRB → Home                 3/14         21%
Home → CRB → Grocery store        3/14         21%
Home → CRB → Jake’s Ice Cream     2/14         14%
Home → CRB → GA400                1/14         7%
Home → CRB → 10th/14th St.        1/14         7%
Home → CRB → Hardware store       0/14         0%

  • Random chance: 1/22 = 4.5%

SLIDE 56

Software

  • How many orders to use?

– sequence of 141 locations visited
– 23 total unique locations

Order  Permutations             Approx. expected unique paths  Observed unique paths
1      23 * 22^1 = 506          140                            56
2      23 * 22^2 = 11,132       139                            73
3      23 * 22^3 = 244,904      138                            82
4      23 * 22^4 = 5,387,888    137                            86
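
The figures on this slide can be checked with a short calculation. The permutation counts follow 23 * 22^n, and the values 140 down to 137 match 141 minus the order, which is the number of order-n paths a 141-visit sequence can contain (that reading is an inference from the numbers, not stated on the slide). The helper names are hypothetical.

```python
def permutations(n_locations, order):
    """Possible paths of a given order: any start location, then `order`
    moves, each to one of the other locations."""
    return n_locations * (n_locations - 1) ** order

def paths_in_sequence(seq_len, order):
    """Number of order-n paths (windows of order + 1 visits) in a sequence."""
    return seq_len - order

def observed_unique_paths(visits, order):
    """Count the distinct order-n paths that actually occur in `visits`."""
    return len({tuple(visits[i:i + order + 1])
                for i in range(len(visits) - order)})

table = [(n, permutations(23, n), paths_in_sequence(141, n)) for n in range(1, 5)]
```

Because the possible paths grow by a factor of 22 per order while the data only supplies about 140 samples, higher orders quickly leave most paths unobserved.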

SLIDE 57

Future Work

  • Collect more data

– Georgia Tech students in Zürich & Atlanta

  • Investigate other sensors for smaller scales

– RF/IR beacons

  • Consider privacy policies
  • Add time of day to Markov model

– predict when a user will leave as well as where they’re going

SLIDE 58

Future Work

  • Schedule “sharpness”

– always on time = important?
– example: work at 8AM vs. grocery store

  • Speed of model update vs. accuracy

– new schedule for college students every term
– weight new events more heavily?

  • how to avoid unduly weighting one–time trips?
  • use confidence intervals to determine schedule changes

SLIDE 59

Future Work

  • Real–time update of models

– currently, data is post–processed
– need full wearable computers for real–time

  • User interface

– visualize location model
– allow user to influence model

  • Favor trading implementation

SLIDE 60

Thank You

Questions?