social mining big data analytics

Social Mining & Big Data Analytics Big Data, Human Mobility & - PowerPoint PPT Presentation

Social Mining & Big Data Analytics Big Data, Human Mobility & Migration Roberto Trasarti roberto.trasarti@isti.cnr.it http://www.sobigdata.eu/ Moving Object data (Vehicles) Call Detail Records (Phone data) Social Networks geo-located data


  1. Social Mining & Big Data Analytics Big Data, Human Mobility & Migration Roberto Trasarti roberto.trasarti@isti.cnr.it http://www.sobigdata.eu/

  2. Moving Object data (Vehicles)

  3. Call Detail Records (Phone data)

  4. Social Networks geo-located data

  5. Mobility Data Mining: Applications Moving object and trajectory data mining has many important, real-world applications driven by real needs: • Ecological analysis (e.g., animal scientists) • Weather forecast • Traffic control • Location-based services • Homeland security (e.g., border monitoring) • Law enforcement (e.g., video surveillance) • …

  6. Mobility data mining landscape: Algorithms, Models • Individual behaviors: semantics (Hospital, Gym, Restaurant), derived models • Collective mobility: basic trajectory patterns • Global models: $T_{ij} \propto \frac{pop_i^2 \, pop_j}{(pop_i + s_{ij})(pop_i + s_{ij} + pop_j)}$ [Plot: $P_{>}(a)$ versus $a$] • Raw trajectory data
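
The global-model formula on this slide has the form of the radiation model, T_ij ∝ pop_i² · pop_j / ((pop_i + s_ij)(pop_i + s_ij + pop_j)). A minimal sketch of how it could be evaluated follows. The slide does not define s_ij, so the code assumes the usual reading: the population located strictly closer to i than j is, excluding i and j. The function name and the planar coordinates are illustrative only.

```python
import math

def radiation_flows(pops, coords):
    """Unnormalized flows T[i][j] between locations.

    pops:   population of each location
    coords: (x, y) position of each location (planar, an assumption)
    """
    n = len(pops)
    dist = [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # s_ij (assumed definition): population strictly closer to i than j,
            # excluding the origin i and the destination j themselves.
            s_ij = sum(
                pops[k]
                for k in range(n)
                if k not in (i, j) and dist[i][k] < dist[i][j]
            )
            flows[i][j] = (pops[i] ** 2 * pops[j]) / (
                (pops[i] + s_ij) * (pops[i] + s_ij + pops[j])
            )
    return flows
```

The result is only proportional to the flow; a normalization constant would have to be fitted against observed trips.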

  7. Analyzing the collective behavior URBAN MOBILITY ATLAS

  8. Urban Mobility Atlas http://kdd.isti.cnr.it/uma2/

  9. Flows of vehicles to the two airports Vehicles of the sample have been re-scaled using ACI data of the circulating vehicle fleet.

  10. Discovering new borders 1 2 O/D 3 4 Community Discovery

  11. A user-centric view of mobility data INDIVIDUAL MOBILITY NETWORKS

  12. How to synthesize Individual Mobility? Mobility Data Mining methods automatically extract relevant episodes: locations and movements.

  13. Cluster & rank individual preferred locations • A key initial step is the study of the user’s personal locations, i.e., the places or areas where the user stops to perform any kind of activity. • The problem consists in discovering the set of observations defining the personal locations.
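
The slide does not say which clustering algorithm is used, so the following is only a sketch of the idea under stated assumptions: stop observations are planar (x, y) points, and a simple greedy centroid rule stands in for the actual method. `cluster_stops` and `radius` are hypothetical names.

```python
import math

def cluster_stops(stops, radius):
    """Group stop points into personal locations and rank them by visits.

    A stop joins the first cluster whose current centroid lies within
    `radius`; otherwise it opens a new cluster. Returns a list of
    ((centroid_x, centroid_y), n_stops), most visited first.
    """
    clusters = []  # each entry: [sum_x, sum_y, count]
    for x, y in stops:
        for c in clusters:
            cx, cy = c[0] / c[2], c[1] / c[2]
            if math.hypot(x - cx, y - cy) <= radius:
                c[0] += x
                c[1] += y
                c[2] += 1
                break
        else:
            clusters.append([x, y, 1])
    ranked = sorted(clusters, key=lambda c: c[2], reverse=True)
    return [((c[0] / c[2], c[1] / c[2]), c[2]) for c in ranked]
```

The greedy rule depends on the order of the observations; a density-based method would avoid that at the cost of more parameters.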

  14. Individual Mobility Profile • Besides locations, the mobility of a user is characterized by the trajectories that start and end in the user’s personal locations. • These trajectories can be clustered with respect to their similarity. • A representative trajectory, named routine, can be extracted from each cluster. • The set of routines, i.e., the individual mobility profile P_u, is an abstraction in time and space of the systematic movements of a user.

  15. How to synthesize Individual Mobility? Graph abstraction based on locations (nodes) and movements (edges) • Trip Features: length, duration, time interval, average speed • Network Features: centrality, clustering coefficient, average path length, predictability, entropy, hubbiness, degree, betweenness, volume, edge weight, flow per location
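
As a rough illustration of the graph abstraction, the sketch below builds a network from a user's trips and derives three of the listed features: edge weight, degree, and flow per location. The trip format (origin, destination) and all names are assumptions, and the entropy function is just one possible reading of "entropy" (Shannon entropy of the flow distribution over locations).

```python
import math
from collections import Counter, defaultdict

def mobility_network(trips):
    """Build an individual mobility network from (origin, destination) trips.

    Returns (edge_weight, degree, flow):
      edge_weight: number of trips per directed edge
      degree:      number of distinct neighbours per location
      flow:        total trips entering or leaving each location
    """
    edge_weight = Counter(trips)
    neighbors = defaultdict(set)
    flow = Counter()
    for (origin, dest), weight in edge_weight.items():
        neighbors[origin].add(dest)
        neighbors[dest].add(origin)
        flow[origin] += weight
        flow[dest] += weight
    degree = {node: len(adj) for node, adj in neighbors.items()}
    return edge_weight, degree, flow

def location_entropy(flow):
    """Shannon entropy (bits) of the flow distribution across locations."""
    total = sum(flow.values())
    return -sum((f / total) * math.log2(f / total) for f in flow.values())
```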

  16. Individual mobility indicator • Radius of gyration: the characteristic distance traveled by individuals • k-radius of gyration: the radius computed on the k most visited locations
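
The slide gives no formula, so this sketch uses the common visit-weighted definition: the root mean square distance of the visited locations from their center of mass. Planar coordinates and the function names are assumptions.

```python
import math

def radius_of_gyration(visits):
    """visits: list of ((x, y), n_visits). Returns the visit-weighted radius."""
    total = sum(n for _, n in visits)
    cx = sum(x * n for (x, _), n in visits) / total
    cy = sum(y * n for (_, y), n in visits) / total
    spread = sum(n * ((x - cx) ** 2 + (y - cy) ** 2) for (x, y), n in visits)
    return math.sqrt(spread / total)

def k_radius_of_gyration(visits, k):
    """Same indicator restricted to the k most visited locations."""
    top_k = sorted(visits, key=lambda v: v[1], reverse=True)[:k]
    return radius_of_gyration(top_k)
```

Comparing the two values for the same user indicates how much of the overall mobility is captured by the k most visited locations.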

  17. Two different and separated behaviors (K=4) • Explorers: the movements between the K locations cannot represent the user • Returners: all the mobility of the user can be represented by the movements between the K locations

  18. Individual Call Profile and Classification A condensed representation of the user’s phone calls: the Individual Call Profile (ICP). It is used to classify the user’s behavior. Time slots: t1 = [00:00-08:00), t2 = [08:00-19:00), t3 = [19:00-24:00). Sample records: 123643 Cell12 24/06/2015 14:05; 123643 Cell12 24/06/2015 18:13; 123643 Cell15 25/06/2015 11:05; 123643 Cell15 25/06/2015 20:42; 123643 Cell11 25/06/2015 21:05; 123643 Cell12 26/06/2015 10:01; …
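
One way to condense such records is to count a user's calls per day and time slot. The sketch below does that with the three slots from the slide; the record layout (user, cell, timestamp string) and the exact shape of the profile are assumptions, since the slide does not spell out how the ICP is aggregated.

```python
from collections import Counter
from datetime import datetime

# Time slots from the slide, as (name, start_hour, end_hour), end excluded.
SLOTS = [("t1", 0, 8), ("t2", 8, 19), ("t3", 19, 24)]

RECORDS = [
    ("123643", "Cell12", "24/06/2015 14:05"),
    ("123643", "Cell12", "24/06/2015 18:13"),
    ("123643", "Cell15", "25/06/2015 11:05"),
    ("123643", "Cell15", "25/06/2015 20:42"),
    ("123643", "Cell11", "25/06/2015 21:05"),
    ("123643", "Cell12", "26/06/2015 10:01"),
]

def time_slot(ts):
    """Map a datetime to the name of the slot containing its hour."""
    for name, start, end in SLOTS:
        if start <= ts.hour < end:
            return name

def call_profile(records, user):
    """Count the user's calls per (ISO date, time slot)."""
    profile = Counter()
    for uid, _cell, stamp in records:
        if uid != user:
            continue
        ts = datetime.strptime(stamp, "%d/%m/%Y %H:%M")
        profile[(ts.date().isoformat(), time_slot(ts))] += 1
    return profile
```

A classifier would then work on these counts, possibly after aggregating days into weekdays and weekends.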

  19. Modelling individuals to discover correlations and patterns among the collectivity INDIVIDUAL TO COLLECTIVE

  20. Big Data: Diversity and economic development

  21. Regular vs Occasional

  22. The ABC classifier

  23. Mobility prediction • Prediction of the individual mobility • Based on profiles: – If the user is following one of her routines, she will continue to do so – Otherwise, check if other users’ routines fit the current trip, and use them
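
The two-step rule can be sketched as follows. This is a simplification under stated assumptions: routines and trips are sequences of location identifiers, and "following a routine" means that the trip so far is an exact prefix of it. A real system would match trajectories with spatial and temporal tolerance. All names are hypothetical.

```python
def predict_next(trip, own_routines, other_routines):
    """Predict the rest of a trip from routines.

    The user's own routines are tried first, then those of other users.
    Returns (source, remaining_path), or (None, []) when nothing fits.
    """
    candidates = (("individual", own_routines), ("collective", other_routines))
    for source, routines in candidates:
        for routine in routines:
            # The trip fits if it is a proper prefix of the routine.
            if len(trip) < len(routine) and routine[: len(trip)] == trip:
                return source, routine[len(trip):]
    return None, []
```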

  24. Boosting Carpooling with Network Analysis • Match – Routine containment – A driver can pick up another user along her routine • Network – Nodes = users – Edges = pairs of users with matching routines [Figure labels: passenger, driver, PDE]
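
A minimal sketch of routine containment and the resulting network, assuming routines are ordered lists of location identifiers and ignoring time compatibility, which a real matching would also need. Edges here are directed (driver, passenger) pairs, which is one possible reading of "pairs of users with matching routines"; the function names are hypothetical.

```python
from itertools import permutations

def contains(driver_routine, passenger_routine):
    """True if the passenger's origin and destination lie, in that order,
    on the driver's routine."""
    origin, dest = passenger_routine[0], passenger_routine[-1]
    if origin not in driver_routine:
        return False
    start = driver_routine.index(origin)
    return dest in driver_routine[start + 1:]

def carpooling_network(routines):
    """routines: {user: [routine, ...]}.

    Returns the set of (driver, passenger) edges for which at least one
    passenger routine is contained in one driver routine.
    """
    edges = set()
    for driver, passenger in permutations(routines, 2):
        if any(
            contains(d, p)
            for d in routines[driver]
            for p in routines[passenger]
        ):
            edges.add((driver, passenger))
    return edges
```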

  25. Case Study Results < 5% SOV

  26. Detecting Events – Example of Piazza San Pietro

  27. St. Peter’s Square (Piazza San Pietro) Characterizing “Padre Pio’s” Event [Figure: outbound flows from areas N. 1 to 5; series: Event, Day after, Mean day - Week 5, Mean day - Week 6]

  28. MIGRATION STUDIES

  29. Big datasets • Social network and web data • Twitter Streaming data: various Twitter datasets from project partners, in various languages, with geolocation • GDELT Knowledge Graph database, a Big Data repertoire of online news articles. • Mobile phone data • Orange dataset: mobile calls between Senegal and the rest of the world (country to country, 2012). • Highly educated migrants • Company data (Estonia and Italy): members of the governing boards of companies (with place of birth). • Publication data: DBLP (computer science) and APS (physics)

  30. The story: Migration stages • GO: Understanding migration flows and stocks • Migration stocks • Brain-drain and scientific migration • Policy and illegal migration • STAY: Evaluating migrant integration • Sentiment related to migration topics • Migration and language • Multi-culturality and sentiment • Migrant start-uppers • RETURN: Return of migrants • Data journalism approach

  31. Sentiment on migration topics: Perception of the Mediterranean Refugee Crisis STAY

  32. Sentiment on migration topics: Perception of the Mediterranean Refugee Crisis • What is the evolution of the discussions about refugee migration on Twitter? • What is the sentiment of users across Europe in relation to the refugee crisis? • What is the evolution of the perception in countries affected by the phenomenon? • Are users more polarised in countries most impacted by the migration flow?

  33. Analytical Framework • An analytical framework to interpret social trends from large tweet collections by extracting and crossing information about the following three dimensions: – Time – Space • User location • Location mentions – Sentiment • Tweet sentiment • User sentiment • Perform multidimensional analyses considering content and locations in time

  34. Deriving Sentiment • Initial seeds: #refugeeswelcome, #refugessnotwelcome • Positive hashtags and negative hashtags: enrich hashtag seeds from hashtag co-occurrence • Positive tweets and negative tweets • Positive users and negative users
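
The enrichment step could look roughly like the sketch below. The selection rule is an assumption, not taken from the slide: a new hashtag is adopted for one polarity if it co-occurs with that side's seeds at least `min_count` times and never with the other side's. A tweet is then scored by the balance of positive and negative hashtags it contains. All names are hypothetical.

```python
from collections import Counter

def enrich_seeds(tweets, pos_seeds, neg_seeds, min_count=2):
    """One round of seed enrichment from hashtag co-occurrence.

    tweets: iterable of hashtag lists, one list per tweet.
    Returns the enlarged (positive, negative) hashtag sets.
    """
    pos_co, neg_co = Counter(), Counter()
    for tags in tweets:
        tags = set(tags)
        has_pos = bool(tags & pos_seeds)
        has_neg = bool(tags & neg_seeds)
        for tag in tags - pos_seeds - neg_seeds:
            if has_pos:
                pos_co[tag] += 1
            if has_neg:
                neg_co[tag] += 1
    # Keep only hashtags that side with exactly one polarity.
    new_pos = {t for t, c in pos_co.items() if c >= min_count and neg_co[t] == 0}
    new_neg = {t for t, c in neg_co.items() if c >= min_count and pos_co[t] == 0}
    return pos_seeds | new_pos, neg_seeds | new_neg

def tweet_sentiment(tags, pos, neg):
    """Return +1, -1 or 0 from the balance of positive and negative hashtags."""
    score = len(set(tags) & pos) - len(set(tags) & neg)
    return (score > 0) - (score < 0)
```

User sentiment could then be derived by aggregating the scores of each user's tweets, for example by majority.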

  35. Sentiment on migration topics: Perception of the Mediterranean Refugee Crisis Africa & Middle East country mentions European country mentions News about AT-HU border Terrorist attack in Syria opens Nigeria Flow shift to Croatia

  36. Sentiment Analysis in UK • Positive and negative users for different cities in the UK before and after September 4 (death of Alan Kurdi, borders between AT and HU, Germany welcomes refugees). Bars show the number of polarized positive and negative users by city.

  37. Thank you http://www.sobigdata.eu/
