Protecting Privacy by Spying on Users
Andrew Patrick & Information Security Group (Larry Korba, Ronggong Song, George Yee, Scott Buffett, Yunli Wang, Liqiang Geng, Steve Marsh, Hongyu Liu)


  1. Protecting Privacy by Spying on Users. Andrew Patrick & Information Security Group (Larry Korba, Ronggong Song, George Yee, Scott Buffett, Yunli Wang, Liqiang Geng, Steve Marsh, Hongyu Liu)

  2. Data breaches caused by laptop theft and hacks.

  3. Insider activities.

  4. Privacy Laws in Canada. Privacy laws exist at both the federal and provincial levels. There are no laws yet that require informing the public about data breaches.

  5. U.S. Privacy Regulations. Numerous state laws require disclosure of data breaches. California just vetoed a law that would have made it an offense to retain credit card data. Minnesota is the only state that has such a law.

  6. Solutions: searches, traffic monitoring, encryption.

  7. Social Network Analysis. Research using Social Network Analysis: “A few years back, we were conducting a Social Network Analysis (SNA) in one of IBM's global operating units with the goal to improve overall collaboration among the geographically dispersed teams that made up a world-wide organization. Next we colored the nodes in the network according to their departmental membership and asked InFlow to arrange the network based on the actual links. We were looking for the emergent organization – how work was really done – what the real structure of the organization was. Figure 1 below shows us how work was really accomplished in the organization. Two nodes/people are linked if they both confirm that they exchange information and resources to get their jobs done. Each department involved in the study received a different color node.” See http://www.orgnet.com/emergent.html
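The mutual-confirmation rule in the quote (two people are linked only if both confirm the exchange) can be sketched in plain Python. The survey responses below are made-up illustrative data, not anything from the IBM study:

```python
# Build an undirected SNA edge set where an edge exists only when BOTH
# people confirm exchanging information/resources (mutual confirmation).
# The reported contacts below are hypothetical illustrative data.
reported = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"dave"},   # carol names dave, but dave does not name carol
    "dave": set(),
}

edges = set()
for person, named in reported.items():
    for other in named:
        # keep the link only if the other person also named this person
        if person in reported.get(other, set()):
            edges.add(frozenset((person, other)))

print(sorted(tuple(sorted(e)) for e in edges))  # [('alice', 'bob')]
```

Unconfirmed, one-sided reports (carol naming dave) are dropped, which is what makes the resulting network show how work is really done rather than how people claim it is done.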

  8. Social Network Analysis and Terrorism. Social Network Analysis has proven useful for understanding group behavior, communication patterns, etc. This example is a post-hoc analysis of relationships among the terrorists involved in the 9/11 attacks on the US. Uncloaking Terrorist Networks by Valdis E. Krebs, First Monday, volume 7, number 4 (April 2002), URL: http://firstmonday.org/issues/issue7_4/krebs/index.html Image from http://www.orgnet.com/tnet.html

  9. SNA & E-mail Anomalies. A. J. O'Donnell, W. C. Mankowski, and J. Abrahamson. Using E-Mail Social Network Analysis for Detecting Unauthorized Accounts. In Conference on Email and Anti-Spam (CEAS), Mountain View, CA, July 2006.

  10. Social Networks Applied to Privacy (SNAP). [Schematic of the SNAP architecture. Diagram labels include: data collection sources (databases, applications, files, search tools, users, clients); analysis (personally-identifiable data discovery, correlation, context/semantics, workflow, social network analysis, policy analysis); display (dashboard, meters, warnings, feedback, non-compliance highlighting); and audit of results against prescribed privacy and security policies.]

  11. SNAP Agents. [Schematic of the SNAP agents and their levels of operation: raw system calls and files are preprocessed, feeding PII data discovery, then local context discovery, then correlation discovery, up to the knowledge level across the network.]

  12. SNAP Prototype. Screen capture of the first crude interface prototypes. Notice the SNA diagram showing two people sharing some documents in common.

  13. While building SNAP, we also want to explore and demonstrate the value of SNA for privacy protection. So we looked for other examples of interesting social behavior related to security and privacy, and found Enron. Enron filed for bankruptcy protection in the Southern District of New York in late 2001 and selected Weil, Gotshal & Manges as its bankruptcy counsel. Enron employed around 21,000 people (McLean & Elkind, 2003) and was one of the world's leading electricity, natural gas, pulp and paper, and communications companies, with claimed revenues of $111 billion in 2000. Fortune named Enron "America's Most Innovative Company" for six consecutive years. It achieved infamy at the end of 2001, when it was revealed that its reported financial condition was sustained mostly by institutionalized, systematic, and creatively planned accounting fraud. Enron has since become a popular symbol of willful corporate fraud and corruption. (Wikipedia)

  14. The Enron email corpus is very popular for SNA studies. SNA is an exploratory tool whose goal is to detect and interpret patterns of social ties among people. http://jheer.org/enron/v1/

  15. Method
  • 517,431 email messages
  • headers parsed for From, To, Date …
  • alias substitution
  • cleaning duplicates left 250,641 unique messages from 31,718 email addresses
  • 63% of addresses appear only once
  • dataset scanned for password-related patterns: “password: *”, “password is *”
  • obvious non-passwords cleaned: “case” (case sensitive), “your” (your birthday)
  We developed our own methods to clean the data. The Enron Email Dataset (http://www.cs.cmu.edu/~enron/) was collected and prepared by the CALO Project (A Cognitive Assistant that Learns and Organizes). It contains data from about 150 users, mostly senior management of Enron, organized into folders. The corpus contains a total of about 0.5M messages. This data was originally made public, and posted to the web, by the Federal Energy Regulatory Commission during its investigation.
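The password scan and non-password cleaning described above can be approximated with a stdlib regex. The exact patterns and the sample messages here are illustrative assumptions, not the authors' actual cleaning code:

```python
import re

# Patterns modeled on the slide: "password: *" and "password is *".
PW_RE = re.compile(r"password(?:\s+is|:)\s+(\S+)", re.IGNORECASE)
# Obvious non-passwords to discard, per the slide's cleaning step:
# "password is case sensitive", "password is your birthday".
NON_PASSWORDS = {"case", "your"}

def extract_passwords(body):
    """Return candidate shared passwords found in an email body."""
    hits = PW_RE.findall(body)
    return [h for h in hits if h.lower() not in NON_PASSWORDS]

# Hypothetical message bodies for illustration.
msgs = [
    "Your new password is q#9M#npX",
    "Note: the password is case sensitive",
    "login: jsmith password: WELCOME!",
]
found = [pw for m in msgs for pw in extract_passwords(m)]
print(found)  # ['q#9M#npX', 'WELCOME!']
```

A real pass over half a million messages would need more patterns and more stop-words, but the two-step shape (broad match, then filter false positives) is the same.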

  16. Password Sharing
  • 642 messages from 500 different addresses contain passwords
  • 418 addresses appear only once
  • 500 different connections (arcs)
  • density = 0.2%
  We ended up with 642 unique instances of password sharing. The network has 500 addresses (people) and 500 connections. It is a very sparse network, containing only 0.2% of the connections that are possible.
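The 0.2% figure follows from the standard directed-graph density formula, arcs / (n · (n − 1)). A quick arithmetic check:

```python
# Density of a directed network: observed arcs over all possible arcs.
n_nodes = 500
n_arcs = 500
density = n_arcs / (n_nodes * (n_nodes - 1))
print(f"{density:.2%}")  # 0.20%
```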

  17. Internal and External. This diagram shows the overall structure of the password-sharing network. Blue nodes are internal, red nodes are external.

  18. Password Factions. This analysis breaks the network into factions based on the number of links between nodes. Nodes with strong links are placed within the same faction. Each color represents a different faction in the network. The black nodes form a residual faction of relatively unconnected nodes, often involving pairs of people. There is fairly good separation between the blue and red factions, suggesting two fairly distinct groups involved with password sharing that we should investigate.

  19. Core and Periphery. This analysis uses different colors to represent each connected portion of the network. It is clear that there is a main network of red nodes that is relatively well connected, plus dozens of small portions that are isolated. It is also clear, again, that there are two central areas in the core network.
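Separating the connected core from the isolated fragments amounts to finding connected components. A minimal breadth-first-search sketch over a toy edge list (made-up data, not the Enron arcs):

```python
from collections import deque

def connected_components(edges):
    """Group nodes into connected components using breadth-first search."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            if node in comp:
                continue
            comp.add(node)
            queue.extend(adj[node] - comp)
        seen |= comp
        components.append(comp)
    return components

# Toy password-sharing links: one small core plus an isolated pair.
toy_edges = [(12, 1), (12, 25), (1, 30), (117, 118)]
comps = connected_components(toy_edges)
print(sorted(len(c) for c in comps))  # [2, 4]
```

On the real network, the largest component is the red core and everything else is periphery.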

  20. This is an analysis of the core, connected network. There are two fairly distinct components centered around two key people.
  Node 12 is a senior person in EnronOnline responsible for creating new accounts:
  • sent out passwords 74 times
  • used only 20 different passwords
  • some passwords used frequently: q#9M#npX sent 30 times to 30 recipients; WELCOME! sent 20 times to 15 recipients
  Node 1 is a “performance management” system:
  • passwords sent 62 times
  • WELCOME used 39 times, in announcements about new rounds of evaluations
  • other passwords were random strings, e.g., KTDVWCCH, in automated reminder messages
  Node 25 is a senior manager of research:
  • sent passwords 28 times
  • 14 times sent to his AOL address, 5 times sent from his AOL address
  • mostly self-sharing of third-party account information, e.g., subscriptions
  • password sharing of a third-party account with an internal colleague
  • sent the password and install instructions for software to a colleague multiple times
  • July 16, 2001: web access to Outlook mailboxes announced; about half of the self-sharing occurred after this date
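Per-sender statistics like "74 messages, only 20 different passwords" come from grouping extracted (sender, password) pairs. A sketch with `collections.Counter` over hypothetical rows (the counts below echo but do not reproduce the slide's figures):

```python
from collections import Counter, defaultdict

# Hypothetical (sender, password) records, as might be extracted
# from the scanned corpus; illustrative data only.
records = (
    [("node12", "q#9M#npX")] * 30
    + [("node12", "WELCOME!")] * 20
    + [("node12", f"pw{i}") for i in range(18)]
)

by_sender = defaultdict(Counter)
for sender, pw in records:
    by_sender[sender][pw] += 1

c = by_sender["node12"]
print(sum(c.values()), len(c))  # 68 20  (messages sent, distinct passwords)
print(c.most_common(1))         # [('q#9M#npX', 30)]
```

Heavy reuse of a few passwords (a high count in `most_common`) is exactly the pattern that flags account-provisioning roles like node 12.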

  21. Sharing Anomalies. This display highlights partitions of the network that are isolated and larger than 3 nodes. These might represent anomalies in sharing behavior…

  22. Case 1: purple nodes at 11 o'clock. Edges: 117-118 (val=2.0000), 117-206 (val=1.0000), 117-246 (val=1.0000). Password-protected memos and documents shared internally.

  23. Case 2: red nodes at 8 o'clock. Edges: 465-68 (val=1.0000), 68-69 (val=3.0000), 68-315 (val=1.0000). A login and password shared to diagnose a system problem; sharing of accounts for airline reservations (3 times); informing a new employee of an id number and password (birth date in YYYYMMDD format).
