Analysis of wide area user mobility patterns
Kevin Simler*, Steven E. Czerwinski†, Anthony Joseph
UC Berkeley
2004/12/02
* Now at MIT
† Now at Google

Motivation
• We want to understand user behavior
  • In order to design better systems
  • In order to generate synthetic traces
  • In order to model user behavior
• How can we capture user presence in the wide area?
  • web, IM, …, e-mail

Why e-mail?
• E-mail is a widely used service
• Users typically check e-mail first
• Berkeley provides IMAP + web front end
• Any Internet connection → e-mail access
• E-mail reflects users’ Internet presence

Outline
• Background
• Analysis and results
• User modeling
• Future work
• Summary

Trace characteristics
• 31 days (May 2003)
• Server from UC Berkeley EECS dept.
• Regular IMAP plus web front-end
• 1004 active users, primarily:
  • Professors
  • Graduate students
  • Support staff
• Tracked across different service providers

Building on previous work
• Wireless Campus Studies
  • Mobility on a campus
  • Single service provider with homogeneous users
  • Tang & Baker, Kotz & Essien, Balazinska & Castro
• Metricom WLAN
  • Mobility across/between cities
  • Single service provider with more diverse users
  • Tang & Baker

Trace data
• Each entry in the trace includes:
  • Timestamp (seconds)
  • Request type (login, close, select, etc.)
  • Username
  • IP address
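
The four fields listed above map onto a small record type. The following is a minimal sketch only: the slides do not give the on-disk trace format, so the whitespace-separated line layout, the `TraceEntry` type, and the `parse_entry` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    timestamp: int      # seconds
    request_type: str   # e.g. "login", "close", "select"
    username: str
    ip_address: str


def parse_entry(line: str) -> TraceEntry:
    """Parse one line of an ASSUMED format: '<timestamp> <request> <user> <ip>'."""
    timestamp, request_type, username, ip_address = line.split()
    return TraceEntry(int(timestamp), request_type, username, ip_address)


# Hypothetical example line; the address is from a documentation-only range.
entry = parse_entry("1051772400 login alice 192.0.2.17")
```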

Preprocessing
• We want user behavior
• Trace records client application behavior
  • Outlook, Eudora, Thunderbird, etc.
• Primary difference:
  • Client polls for new e-mail at regular intervals
  • Fixed period per client, variable across clients

We filter client polling using a Fourier transform
[Figure: timeline of client connections (login to logout) from a single user]
• Start with the client connections from a single user
• Use a Fourier transform to identify the polling period p
• Identify sequences of connections separated by p. Remove all but the first connection.
• Clump connections into user sessions, splitting wherever there is a > 15 minute gap
• Now we have (roughly) a trace of user behavior
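
The filtering steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: connections are treated as single timestamps measured from the start of the observation window, the signal is a count of connections per 60-second bin, the transform is a naive discrete Fourier transform, and the "lowest frequency with at least half the peak magnitude" rule and the matching tolerance are choices made here. The function names are hypothetical.

```python
import cmath
from bisect import bisect_left, bisect_right


def polling_period(times, window, bin_size=60):
    """Estimate the dominant polling period (seconds) with a naive DFT."""
    n = window // bin_size
    signal = [0] * n
    for t in times:
        signal[min(t // bin_size, n - 1)] += 1
    # Magnitude spectrum for k = 1 .. n/2 (k = 0 is just the mean, so skip it).
    mags = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(signal))
        mags[k] = abs(coeff)
    peak = max(mags.values())
    # A periodic impulse train has equally strong harmonics, so take the
    # lowest frequency whose magnitude is at least half the peak (assumption).
    k0 = min(k for k, m in mags.items() if m >= 0.5 * peak)
    return n * bin_size / k0


def remove_polling(times, period, tolerance):
    """Drop every connection that follows another one by about `period` seconds."""
    times = sorted(times)
    kept = []
    for t in times:
        lo = bisect_left(times, t - period - tolerance)
        hi = bisect_right(times, t - period + tolerance)
        if lo == hi:  # no connection one period earlier, so keep this one
            kept.append(t)
    return kept


def sessions(times, max_gap=15 * 60):
    """Clump connections into (start, end) sessions, split at gaps > max_gap."""
    result = []
    for t in sorted(times):
        if result and t - result[-1][1] <= max_gap:
            result[-1] = (result[-1][0], t)
        else:
            result.append((t, t))
    return result


polls = list(range(0, 3600, 300))           # a client polling every 5 minutes
connections = sorted(polls + [100, 2000])   # plus two user-initiated connections
period = polling_period(connections, window=3600)
kept = remove_polling(connections, period, tolerance=30)
user_sessions = sessions(kept)
```

In this toy input only the first poll and the two user-initiated connections survive the filter, and the 15-minute rule then splits them into two sessions.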

Outline
• Background
• Trace analysis
  • Defining location
  • Daily mobility
  • Monthly mobility
  • Session activity
• User modeling
• Future work
• Summary

Defining network location
• Connection used to access the Internet
  • E.g. a dialup ISP, campus wireless network
• Approximated by a combination of
  • Authoritative DNS server
  • AS number
  • Subnet
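
One way to represent such a location is as a three-part key. The sketch below is illustrative: the `Location` type and helper names are hypothetical, the /24 prefix length is an assumption (the slides do not say how subnets were delimited), and the DNS server and AS number are passed in as if they came from external lookups that are not shown.

```python
import ipaddress
from typing import NamedTuple


class Location(NamedTuple):
    dns_server: str   # authoritative DNS server for the client's address
    as_number: int
    subnet: str


def subnet_of(ip: str, prefix_len: int = 24) -> str:
    """Return the enclosing subnet; the /24 prefix length is an assumption."""
    return str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))


def location_of(ip: str, dns_server: str, as_number: int) -> Location:
    # dns_server and as_number would come from external lookups (not shown).
    return Location(dns_server, as_number, subnet_of(ip))


# Two addresses on the same subnet, behind the same (made-up) provider.
home = location_of("192.0.2.17", "ns.example.net", 64500)
same = location_of("192.0.2.200", "ns.example.net", 64500)
```

Because `Location` is a tuple, two logins compare equal exactly when all three components match, which makes it usable directly as a dictionary key or set member.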

How mobile are users each day?
[Figure: fraction of user-days (y-axis, 0 to 0.6) against number of locations (x-axis, 0 to 3)]
• 50% of user-days involve logging in from only 1 location
• 15% of user-days involve logging in from 2 locations
• Upshot: On any given day, users are not highly mobile
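
A distribution like the one plotted can be computed by counting distinct locations per (user, day) pair. This is a sketch with made-up data; the function name and the `n_user_days` parameter (used to account for user-days with no login at all) are assumptions, not part of the original analysis.

```python
from collections import Counter, defaultdict


def locations_per_user_day(sessions, n_user_days=None):
    """sessions: iterable of (user, day, location) tuples.

    Returns {number of locations: fraction of user-days}. If n_user_days is
    given, user-days with no sessions are counted under 0 locations.
    """
    seen = defaultdict(set)
    for user, day, location in sessions:
        seen[(user, day)].add(location)
    counts = Counter(len(locs) for locs in seen.values())
    total = len(seen) if n_user_days is None else n_user_days
    counts[0] = total - len(seen)
    return {n: counts[n] / total for n in sorted(counts) if counts[n]}


# Hypothetical data: 2 users observed over 2 days (4 user-days in total).
example = [
    ("alice", 1, "office"), ("alice", 1, "home"),
    ("alice", 2, "office"),
    ("bob", 1, "home"), ("bob", 1, "home"),
]
fractions = locations_per_user_day(example, n_user_days=4)
```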

How mobile are users in 31 days?
• How many unique subnets do they visit?
• How many unique AS #s do they visit?
Let’s look at a graph…

How mobile are users in 31 days?
[Figure: cumulative fraction of users (y-axis, 0 to 1) against # clusters (x-axis, 0 to 14), with one curve for subnets and one for AS #s]
• 80% of users log in from 8 or fewer unique subnets
• 90% of users log in from 3 or fewer unique AS numbers
• Upshot: Again, most users are not highly mobile

User activity at a location
[Figure: fraction of visits (y-axis, 0 to 0.7) against # sessions (x-axis: 1, 2, 3, 4+)]
• 60% of visits to a location result in only 1 session
• 20% of visits to a location result in exactly 2 sessions
• Upshot: Users access their e-mail once or twice per visit.

Outline
• Background
• Trace analysis
• User modeling
  • Categorizing users
  • Model structure
  • Training and testing
• Future work
• Summary

Categorizing users
• Based on number of primary locations
• For a given user, a primary location is:
  • One where the user spends >5% of the time
• Categories
  • Users with 1 primary location
  • Users with 2 primary locations
  • Users with 3+ primary locations
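
The categorization rule can be sketched as follows. How "time spent" at a location is measured is not stated in the slides, so the per-location totals in seconds, the function names, and the example figures are assumptions for illustration.

```python
def primary_locations(time_at, threshold=0.05):
    """time_at: {location: seconds spent}. Returns locations above the threshold share."""
    total = sum(time_at.values())
    return sorted(loc for loc, t in time_at.items() if t / total > threshold)


def category(time_at):
    """Map a user to category 1, 2, or 3 (where 3 means 3 or more primary locations)."""
    # Note: a user whose time is spread thinly enough could have no primary
    # location at all; this sketch would then return 0.
    return min(len(primary_locations(time_at)), 3)


# Hypothetical user: the cafe accounts for only 4% of the time, so it is not primary.
time_at = {"office": 7000, "home": 2600, "cafe": 400}
primaries = primary_locations(time_at)
user_category = category(time_at)
```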

Structure of our models
• One model for each category
• Two-tiered Markov model
  • High-level states represent user’s location
  • Low-level states represent user’s activity
  • Both MMs are 1st order

Model structure for category 2
• 2 primary locations + 1 traveling state
• High-level (location) states: primary 1, primary 2, traveling
• Low-level (session) states: i.e. Logged-In and Logged-Out

Training
• We have all the information
  • Which locations are primary
  • Where the user is, at any time
  • When the user is logged in/out
• Simple to compute transition probabilities
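
With fully observed states, estimating first-order transition probabilities amounts to counting consecutive state pairs and normalizing each row. The sketch below shows this for the high-level (location) tier only; the function name and the example sequence are hypothetical, and the same counting would apply to the Logged-In/Logged-Out tier.

```python
from collections import Counter, defaultdict


def transition_probabilities(states):
    """Estimate first-order Markov transition probabilities from a state sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        s: {t: c / sum(row.values()) for t, c in row.items()}
        for s, row in counts.items()
    }


# Hypothetical observed location sequence for one category-2 user.
observed = ["primary1", "primary1", "primary2", "primary1", "traveling", "primary1"]
probs = transition_probabilities(observed)
```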

Testing methodology
• Create synthetic trace
• Choose metrics to measure a trace
• Compare real trace with synthetic trace
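
Creating a synthetic trace from a trained first-order model is a matter of repeatedly sampling the next state from the current state's transition row. The sketch below covers only the location tier and uses invented probabilities; the function name, the fixed seed, and the numbers are assumptions, not values from the study.

```python
import random


def synthetic_trace(probs, start, length, seed=0):
    """Sample a state sequence from first-order transition probabilities."""
    rng = random.Random(seed)
    state, trace = start, [start]
    for _ in range(length - 1):
        next_states, weights = zip(*probs[state].items())
        state = rng.choices(next_states, weights=weights)[0]
        trace.append(state)
    return trace


# Hypothetical location-level transition probabilities for a category-2 user.
probs = {
    "primary1": {"primary1": 0.7, "primary2": 0.2, "traveling": 0.1},
    "primary2": {"primary1": 0.3, "primary2": 0.7},
    "traveling": {"primary1": 1.0},
}
trace = synthetic_trace(probs, "primary1", 50)
```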

Testing one metric
• # of sessions between visits to primary
• Each user visits his primary
  • leaves to visit other locations
  • then comes back to his primary
• Every time this happens, record the number of other locations
• There will be a CDF for the entire trace (real or synthetic)
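
This metric can be sketched as a single pass over a user's sequence of visited locations. The code below assumes each visit to a non-primary location counts once (repeat visits are not deduplicated) and that an excursion only counts if the user returns to the primary; both are interpretations, and the function names and example data are hypothetical.

```python
def excursion_lengths(visits, primary):
    """Count the other locations visited between consecutive visits to `primary`."""
    lengths, away, seen_primary = [], 0, False
    for loc in visits:
        if loc == primary:
            if seen_primary and away > 0:
                lengths.append(away)   # the user left and has now come back
            seen_primary, away = True, 0
        elif seen_primary:
            away += 1
    return lengths


def empirical_cdf(values):
    """Return sorted (value, cumulative fraction) pairs."""
    n = len(values)
    return [(v, sum(1 for x in values if x <= v) / n) for v in sorted(set(values))]


# Hypothetical visit sequence; the final trip to "home" never returns, so it is ignored.
visits = ["office", "home", "office", "cafe", "hotel", "office", "office", "home"]
lengths = excursion_lengths(visits, "office")
cdf = empirical_cdf(lengths)
```

Computing the same CDF over the real and the synthetic trace gives two curves that can be compared directly.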

Testing results

Outline
• Background
• Trace analysis
• User modeling
• Future work
• Summary

Using the results
• Synthetic traces can help test systems
• User behavior has implications for design
  • E.g. focus resources on primary locations
• Model can predict user behavior on-the-fly
  • E.g. to cache, or not to cache?

As technology changes…
• Blackberries
  • More physical locations
  • Shorter, more frequent sessions
  • Still, primary locations will be important
• Wireless LAN hotspots
  • More network locations

Outline
• Background
• Trace analysis
• User modeling
• Future work
• Summary

Summary – what we’ve done
• Obtained a trace from an e-mail server
• Filtered out client polling
• Analyzed trace of user behavior
• Modeled categories of users with tiered MM
• Generated synthetic traces

Summary – user behavior
• Most users log in from 1 or 2 locations
• But a few users are highly mobile
• Users access e-mail infrequently, but for long periods of time

Thank you
• Quick clarifying questions?
