
Interpretation of Dimensionally-Reduced Crime Data: A Study with Untrained Domain Experts



  1. Interpretation of Dimensionally-Reduced Crime Data A Study with Untrained Domain Experts Dominik Jäckle Florian Stoffel Sebastian Mittelstädt Daniel Keim Harald Reiterer

  2. Introduction to Domain Experts Data analysts of a Law Enforcement Agency (LEA) • Work with tabular data on a daily basis • Identification of patterns & suspects • Comparative case analysis (consider similarities & correlations) Jäckle et al. | Interpretation of Dimensionally-Reduced Crime Data

  3. Introduction to Domain Experts Data analysts of a Law Enforcement Agency (LEA) • Work with tabular data on a daily basis • Identification of patterns & suspects • Comparative case analysis (consider similarities & correlations) Challenge: consider multiple attributes simultaneously

  4. Planar Data Projections. Multidimensional Scaling (MDS) = distance-preserving projection. Overall goal: $\mathbb{R}^n \to \mathbb{R}^m$, $m < n$. [Diagram: data table with the records A, B, C (= crimes) as rows and n attributes as columns.]

  5. Planar Data Projections. Multidimensional Scaling (MDS) = distance-preserving projection. Overall goal: $\mathbb{R}^n \to \mathbb{R}^m$, $m < n$. [Diagram: data table (records A, B, C = crimes; n attributes) → compute distances → distance matrix over A, B, C with zeros on the diagonal.]

  6. Planar Data Projections. Multidimensional Scaling (MDS) = distance-preserving projection. Overall goal: $\mathbb{R}^n \to \mathbb{R}^m$, $m < n$. [Diagram: data table (records A, B, C = crimes; n attributes) → compute distances → distance matrix over A, B, C with zeros on the diagonal → projection → 2D scatterplot of A, B, C.]

  7. Planar Data Projections. Multidimensional Scaling (MDS) = distance-preserving projection. Overall goal: $\mathbb{R}^n \to \mathbb{R}^m$, $m < n$. [Diagram: data table (records A, B, C = crimes; n attributes) → compute distances → distance matrix over A, B, C with zeros on the diagonal → projection → 2D scatterplot of A, B, C.] Main problem: interpretation of the visual depiction.
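
The steps named on these slides (data records, distance matrix, 2D scatterplot) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which MDS variant the authors used, so the sketch uses the SMACOF iteration (Guttman transform), one common way to minimize stress. The names `pairwise_distances`, `stress`, and `mds_2d`, and the three records, are hypothetical.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length coordinate tuples."""
    return [[math.dist(p, q) for q in points] for p in points]


def stress(delta, layout):
    """Sum of squared differences between target and layout distances."""
    d = pairwise_distances(layout)
    n = len(delta)
    return sum((d[i][j] - delta[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def mds_2d(delta, iterations=300):
    """Place n items in the plane so their distances approximate `delta`.

    Uses the SMACOF update with unit weights (an assumption, not
    necessarily the solver used in the study).
    """
    n = len(delta)
    # Deterministic start: points evenly spaced on the unit circle.
    layout = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
              for i in range(n)]
    for _ in range(iterations):
        d = pairwise_distances(layout)
        updated = []
        for i in range(n):
            x = y = 0.0
            for j in range(n):
                if i != j and d[i][j] > 1e-12:
                    ratio = delta[i][j] / d[i][j]
                    x += ratio * (layout[i][0] - layout[j][0])
                    y += ratio * (layout[i][1] - layout[j][1])
            updated.append((x / n, y / n))
        layout = updated
    return layout


# Three hypothetical records A, B, C with four numeric attributes each.
records = [(0.0, 0.0, 0.0, 0.0), (3.0, 0.0, 0.0, 0.0), (0.0, 4.0, 0.0, 0.0)]
delta = pairwise_distances(records)   # distance matrix
layout = mds_2d(delta)                # 2D coordinates for a scatterplot
```

Only the distance matrix enters `mds_2d`, so the axes of the resulting scatterplot carry no direct attribute meaning. That is the interpretation problem the slide points to.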

  8. Previous Work (grouped by study type; axis label: Includes Domain Experts). No Study (any).

  9. Previous Work (grouped by study type; axis label: Includes Domain Experts). No Study (any): Ward & Martin (1995), Buja (1996).

  10. Previous Work (grouped by study type; axis label: Includes Domain Experts). No Study (any): Ward & Martin (1995), Buja (1996). Case Studies / Application Examples: Jeong et al. (2009), Seo & Shneiderman (2005), Johansson & Johansson (2009), Nam & Mueller (2013), Ingram et al. (2010), Krause et al. (2016), Turkay et al. (2011), Turkay et al. (2012), Fernstad et al. (2013), Yuan et al. (2013), Liu et al. (2014).

  11. Previous Work (grouped by study type; axis label: Includes Domain Experts). No Study (any): Ward & Martin (1995), Buja (1996). Case Studies / Application Examples: Jeong et al. (2009), Seo & Shneiderman (2005), Johansson & Johansson (2009), Nam & Mueller (2013), Ingram et al. (2010), Krause et al. (2016), Turkay et al. (2011), Turkay et al. (2012), Fernstad et al. (2013), Yuan et al. (2013), Liu et al. (2014). User Studies without Domain Experts: Yi et al. (2005), Brown et al. (2012), Sedlmair et al. (2013), Stahnke et al. (2016).

  12. Previous Work (grouped by study type; axis label: Includes Domain Experts). No Study (any): Ward & Martin (1995), Buja (1996). Case Studies / Application Examples: Jeong et al. (2009), Seo & Shneiderman (2005), Johansson & Johansson (2009), Nam & Mueller (2013), Ingram et al. (2010), Krause et al. (2016), Turkay et al. (2011), Turkay et al. (2012), Fernstad et al. (2013), Yuan et al. (2013), Liu et al. (2014). User Studies without Domain Experts: Yi et al. (2005), Brown et al. (2012), Sedlmair et al. (2013), Stahnke et al. (2016). Our Study.

  13. Can domain experts not trained in advanced statistics interpret the depiction of a data projection?

  14. Data: San Francisco Crimes. Attributes: Category, Date, PdDistrict, DayOfWeek, Description, Time, Resolution, Location, Address. Source: https://data.sfgov.org/

  15. Data: San Francisco Crimes. Example record: Category: DISORDERLY CONDUCT; Description: MAINTAINING A PUBLIC NUISANCE AFTER NOTIFICATION; DayOfWeek: Sunday; Date: 08/21/2016 12:00:00 AM; Time: 6:36; PdDistrict: TENDERLOIN; Resolution: ARREST, BOOKED; Address: 400 Block of LEAVENWORTH ST; Location: (37.7851373814889°, -122.414457162309°). Source: https://data.sfgov.org/

  16. Data Types. Categorical: DISORDERLY CONDUCT; numerical: 08/21/2016 00:06:36 AM; textual: MAINTAINING A PUBLIC NUISANCE AFTER NOTIFICATION.

  17. Data Types. Categorical: DISORDERLY CONDUCT; numerical: 08/21/2016 00:06:36 AM; textual: MAINTAINING A PUBLIC NUISANCE AFTER NOTIFICATION. Similarity between numerical values: $\mathrm{sim}(V_1, V_2) = |V_1 - V_2|$; textual attributes: $\mathrm{sim}(v_1, v_2) = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \cdot \lVert v_2 \rVert}$; categorical values: $\mathrm{sim}(V_1, V_2) = V_1 \neq V_2$. How to combine different data types?
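
The three per-type comparisons named on this slide can be sketched as small Python functions. This is a minimal illustration, not the authors' implementation. It assumes that the numerical difference is taken as an absolute value, that the textual measure is cosine similarity over whitespace-tokenized word counts, and that the categorical comparison yields 1 for differing values and 0 for equal ones. All function names are hypothetical.

```python
import math
from collections import Counter


def numerical_difference(v1, v2):
    """Absolute difference between two numerical values."""
    return abs(v1 - v2)


def categorical_mismatch(v1, v2):
    """1.0 if the two categorical values differ, 0.0 if they are equal."""
    return 1.0 if v1 != v2 else 0.0


def cosine_similarity(text1, text2):
    """Cosine of the angle between bag-of-words count vectors."""
    a = Counter(text1.lower().split())
    b = Counter(text2.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)
```

Note that the first two functions grow with dissimilarity while the cosine grows with similarity. A combined measure therefore has to bring them onto one scale and direction, for example by using `1 - cosine_similarity(...)`, which is an assumption here rather than something the slide states.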

  18. [Pipeline diagram. Dimension/Variable: $D_1, D_2, D_3, \dots, D_n$, each with a similarity function $\mathrm{sim}_i$ and a weight $w_i$. Further labels: Weighting & Similarity, Projection, Visual Data Exploration, Interactive Visualization, Steering.]

  19. Weighting and Similarity. Interactive weighting = impact of an attribute. Integration of diverse data types. Gower Metric: $\mathrm{dist}(A, B) = \frac{\sum_{i=1}^{|\mathit{dim}|} \mathrm{sim}_i(A_i, B_i) \cdot w_i}{|\mathit{dim}|}$

  20. Weighting and Similarity. $\mathrm{dist}(A, B) = \frac{\sum_{i=1}^{|\mathit{dim}|} \mathrm{sim}_i(A_i, B_i) \cdot w_i}{|\mathit{dim}|}$
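
A weighted, Gower-style distance of this form (per-attribute comparison times weight, summed and divided by the number of dimensions) can be sketched as follows. This is a hedged illustration: the function `gower_distance`, the comparators, the `Hour` attribute, and both records are hypothetical, and scaling every per-attribute value to [0, 1] is an assumption borrowed from the usual Gower formulation.

```python
def mismatch(v1, v2):
    """Categorical comparison: 1.0 if the values differ, else 0.0."""
    return 1.0 if v1 != v2 else 0.0


def hour_difference(h1, h2):
    """Numerical comparison, scaled to [0, 1] by the range of hours 0..23."""
    return abs(h1 - h2) / 23.0


def gower_distance(a, b, comparators, weights):
    """Sum of weighted per-attribute comparisons divided by the dimension count."""
    total = 0.0
    for name, compare in comparators.items():
        total += compare(a[name], b[name]) * weights[name]
    return total / len(comparators)


comparators = {"Category": mismatch, "PdDistrict": mismatch, "Hour": hour_difference}

# Two illustrative records (not taken from the study data).
record_a = {"Category": "DISORDERLY CONDUCT", "PdDistrict": "TENDERLOIN", "Hour": 6}
record_b = {"Category": "VEHICLE THEFT", "PdDistrict": "MISSION", "Hour": 18}

equal_weights = {"Category": 1.0, "PdDistrict": 1.0, "Hour": 1.0}
no_category = {"Category": 0.0, "PdDistrict": 1.0, "Hour": 1.0}

d_equal = gower_distance(record_a, record_b, comparators, equal_weights)
d_steered = gower_distance(record_a, record_b, comparators, no_category)
```

Setting the weight of `Category` to 0 removes that attribute's impact on the distance. This corresponds to the interactive weighting described on the slide: changing weights changes the distance matrix and therefore the projection.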

  21. Visual Data Exploration Overview Detail

  22. Visual Data Exploration Overview Detail Projection

  23. Visual Data Exploration Overview Detail Projection Content Lens

  24. Visual Data Exploration Overview Detail Projection Content Lens Tooltip Data View

  25. Visual Data Exploration. $\mathrm{dist}(A, B) = \frac{\sum_{i=1}^{|\mathit{dim}|} \mathrm{sim}_i(A_i, B_i) \cdot w_i}{|\mathit{dim}|}$

  26. Interpretation Study

  27. Study Design 3 LEA data analysts (1 female) • worked with data tables on a daily basis • not used to working with abstract data representations 4 consecutive tasks • Each analyst was given the tasks in the same order • Each task was introduced as a new, subsequent analysis question

  28. Study Design San Francisco Crime Data • Week from Monday, July 25, 2016 to Monday, August 1, 2016 • 13 dimensions • 36 different crime categories After the study, the analysts filled out a questionnaire regarding: • basic understanding • interaction concepts • extraction of knowledge

  29. Tasks

  30. Task 1 Is there a pattern among dimensions between days?

  31. Task 1: Model Solution

  32. Task 2 Why is the Monday separated from all other days of the week? What is special about the Date distribution?

  33. Task 2: Model Solution

  34. Task 3 Which distribution of dimension values can you find for the rest of the week?

  35. Task 3: Model Solution

  36. Task 4 Leaving the temporal aspect behind, is there a pattern based on places or crime types?

  37. Task 4: Model Solution

  38. Findings

  39. F1: The analysis starts with an already known hypothesis.

  40. Crime Routine Activity (L. E. Cohen, 1979). Place: District / Street / GPS. Time: Date / Time / Weekday. Occasion: Crime Opportunity.

  41. F1: The analysis starts with an already known hypothesis. F2: Analysts always consider adding or removing dimensions in the depiction to explain a cluster separation.

  42. F1: The analysis starts with an already known hypothesis. F2: Analysts always consider adding or removing dimensions in the depiction to explain a cluster separation. F3: Analysts do not add or remove dimensions to explain an anomaly they are unsure about.
