Analyzing the Great Firewall of China Over Space and Time

Roya Ensafi, Philipp Winter, Abdullah Mueen, Jed Crandall. June 30, 2015.


  1. Analyzing the Great Firewall of China Over Space and Time Roya Ensafi, Philipp Winter, Abdullah Mueen, Jed Crandall June 30, 2015

  2. The Battle Over Information Control On The Internet

  3. State of the Art ● Rent a control machine (VPS) ● Cooperate with volunteers ● Advantages ○ Root access ● Disadvantages ○ Not always possible to rent VPS in interesting area ○ Expensive ○ Could put volunteers in danger

  4. Motivation ● We cannot get access to every machine ● Machines follow RFC rules plus OS-specific implementation behavior ● Can we find ways to use them as vantage points to measure from?

  5. Solving the Vantage Point Problem ● Side channels turn ordinary machines into vantage points! ● Advantages ○ No root access required ○ No need for special software on any machine ● Disadvantages ○ Limited to TCP/IP layer

  6. Analyzing the GFW Over Space & Time ● Country-wide distributed NIDS ● Surprisingly sophisticated ○ Deep packet inspection ○ Active probing for unknown protocols ● Blocks Tor relays by dropping packets of TCP handshake

  7. Outline ● Discuss idle scans, a special kind of side channel ● Explain practical idle scans ● Use practical idle scans to provide a better understanding of the Great Firewall (GFW)

  8. Hybrid Idle Scan Idle port scanning uses side-channel techniques to bounce scans off a “server” host in order to stealthily scan a “client”. Hybrid idle scans (spooky scans) can detect the direction of blocking between a client and a server. The technique is simple, effective, and unobtrusive (Ensafi et al., PAM’14). Requirements: ● A client machine with a global IPID counter ● A server with an open port
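
The first requirement on slide 8 (a client with a global IPID counter) can be checked roughly from a handful of IPID values taken from the client's replies. The sketch below is only an illustration under stated assumptions: the helper names, the `max_step` threshold, and the sample values are hypothetical and do not come from the talk. It treats a host as usable when consecutive IPIDs grow by small positive steps, allowing for 16-bit wraparound.

```python
def ipid_deltas(samples):
    """Differences between consecutive 16-bit IPID samples, modulo 2**16."""
    return [(b - a) % 65536 for a, b in zip(samples, samples[1:])]


def looks_like_global_ipid(samples, max_step=100):
    """Heuristic: every step is a small positive increment.

    max_step is a hypothetical threshold, not a value from the talk.
    """
    deltas = ipid_deltas(samples)
    return bool(deltas) and all(0 < d <= max_step for d in deltas)


# Made-up IPIDs read from replies to probes sent in quick succession.
sequential = [65530, 65533, 65535, 2, 4]   # wraps around, still sequential
constant = [1000, 1000, 1000, 1000]        # counter never moves
scattered = [4211, 60112, 977, 30444]      # looks random

for name, samples in [("sequential", sequential),
                      ("constant", constant),
                      ("scattered", scattered)]:
    print(name, looks_like_global_ipid(samples))
```

Only the first sample set passes the check; a constant or random-looking IPID leaks nothing about how many other packets the client sent.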

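  9. Hybrid Idle Scan: No direction blocked (diagram, build 1) ● (1) MM (measurement machine) sends SYN/ACK to Client ● (2) Client replies, IPID: 1000 ● Server SYN backlog: 0 ● Client IPID: 1000

  10. Hybrid Idle Scan: No direction blocked (diagram, build 2) ● (1) MM sends SYN/ACK to Client ● (2) Client replies, IPID: 1000 ● (3) MM sends spoofed SYN to Server ● Server SYN backlog: 0, then 1 ● Client IPID: 1000

  11. Hybrid Idle Scan: No direction blocked (diagram, build 3) ● (1) MM sends SYN/ACK to Client ● (2) Client replies, IPID: 1000 ● (3) MM sends spoofed SYN to Server ● (4) Server sends SYN/ACK to Client ● (5) Client sends RST to Server, IPID: 1001 ● Server SYN backlog: 0, then 1, then 0 ● Client IPID: 1000, 1001

  12. Hybrid Idle Scan: No direction blocked (diagram, build 4) ● (1) MM sends SYN/ACK to Client ● (2) Client replies, IPID: 1000 ● (3) MM sends spoofed SYN to Server ● (4) Server sends SYN/ACK to Client ● (5) Client sends RST to Server, IPID: 1001 ● (6) MM sends SYN/ACK to Client ● (7) Client replies, IPID: 1002 ● Server SYN backlog: 0, then 1, then 0 ● Client IPID: 1000, 1001, 1002

  13. Hybrid Idle Scan: Server to Client Blocked (diagram) ● (1) MM sends SYN/ACK to Client ● (2) Client replies, IPID: 1000 ● (3) MM sends spoofed SYN to Server ● (4) Server sends SYN/ACK, which does not reach Client ● (6) MM sends SYN/ACK to Client ● (7) Client replies, IPID: 1001 ● Server SYN backlog: 0, then 1 ● Client IPID: 1000, 1001

  14. Hybrid Idle Scan: Server to Client Blocked and Client to Server Blocked (two diagrams side by side) ● Server to Client Blocked: as on slide 13; (4) SYN/ACK does not reach Client; (7) IPID: 1001; Server SYN backlog: 0, then 1; Client IPID: 1000, 1001 ● Client to Server Blocked: (1) MM sends SYN/ACK to Client; (2) Client replies, IPID: 1000; (3) MM sends spoofed SYN to Server; (4) Server sends SYN/ACK to Client; (5) Client sends RST, which does not reach Server; (6) MM sends SYN/ACK to Client; (7) Client replies, IPID: 1004; Server SYN backlog: 0, then 1; Client IPID: 1000, 1001, ..., 1004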
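
The three cases on slides 9 to 14 differ only in how far the client's IPID advances between the two probes from MM. A minimal sketch of that reasoning follows, assuming an otherwise idle client and a server that retransmits its SYN/ACK a fixed number of times when no RST arrives. The function names, the `retransmissions` default of 2 (chosen so the result matches the 1004 shown on slide 14), and the exact-match classification are illustrative assumptions, not the authors' implementation.

```python
def simulate_scan(blocked=None, retransmissions=2, start_ipid=1000):
    """Return the client IPIDs seen by MM at steps (2) and (7).

    blocked: None, "server_to_client", or "client_to_server".
    retransmissions: assumed number of SYN/ACK retransmissions by the
    server when its SYN/ACK is never answered (hypothetical value).
    """
    ipid = start_ipid
    first = ipid              # (2) client answers MM's SYN/ACK
    ipid += 1
    # (3) MM sends a SYN to the server, spoofed to come from the client.
    if blocked != "server_to_client":
        ipid += 1             # (4) SYN/ACK arrives, (5) client sends a RST
        if blocked == "client_to_server":
            # The RST is dropped, so each retransmitted SYN/ACK
            # triggers one more RST from the client.
            ipid += retransmissions
    second = ipid             # (7) client answers MM's second SYN/ACK
    return first, second


def classify(first, second):
    """Infer the blocking direction from the two observed IPIDs."""
    extra = (second - first) % 65536 - 1   # packets sent besides the reply to MM
    if extra == 0:
        return "server to client blocked"
    if extra == 1:
        return "no direction blocked"
    return "client to server blocked"


for case in (None, "server_to_client", "client_to_server"):
    first, second = simulate_scan(case)
    print(case, first, second, classify(first, second))
```

A real client also sends unrelated traffic, so a single pair of probes is noisy; repeated measurements, as on the "Real Data" slide, are needed before drawing a conclusion.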

  15. What Did We Want to Learn? ● Many open questions about the GFW and Tor ○ Does censorship of Tor differ for users in different regions? ○ Does filtering depend on when and where you are? ○ How good is the GFW at blocking Tor? ○ Is it always Server-to-Client blocking or also Client-to-Server blocking? ○ Does blocking change from one ISP to another? ● Revisit old beliefs about the GFW ○ Is filtering centralized?

  16. Methodology - Relays and Clients (map figure; map data © 2014 Google, INEGI)

  17. Methodology - Machines Under Our Control ● We ran hybrid idle scans for 27 days. ● Each client-server pair was tested hourly for a day. (Map figure showing clients and servers; map data © 2014 Google, INEGI)

  18. Results: No Obvious Geographical Pattern No geographical or topological pattern is visible. Instead, the distribution matches the geographic Internet penetration patterns of China. (Two map figures; map data © 2014 Google, INEGI)

  19. Analyzing the GFW Over Space & Time ● Mostly Server-to-Client blocking ● SYN/ACK dropping (IP and port) ● If a RST passes through the GFW, then a SYN will too ● CERNET clients could reach servers more often throughout the day ● Some relays were always reachable throughout the day

  20. Analyzing the GFW Over Space & Time ● Mostly Server-to-Client blocking ● SYN/ACK dropping (IP and port) ● If a RST passes through the GFW, then a SYN will too ● CERNET clients could reach servers more often throughout the day ● Some relays were always reachable throughout the day

  21. Take Away Messages ● Side channels are practical and enable broad coverage ● ...but they are not flexible, and care must be taken when using them ● CERNET is treated differently than the rest of the country ● Filtering is centralized and quite effective

  22. Questions / Comments? Thank You!

  23. Ethical Considerations ● Want to learn if two remote hosts can talk to each other ○ Different approaches have different issues ○ Rented VPS could cause trouble for VPS provider ● Deciding if a given measurement is ethical on a case-by-case basis ○ Technique perfectly fine in situation X ... ○ ... but irresponsible in situation Y ● Mitigations ○ Use routers instead of clients ○ Measure an entire (e.g.) /24

  24. Real Data (plots of IPID difference for three cases: no direction blocked, server to client blocked, client to server blocked) ● Phase 1: just query IPID ● Phase 2: send 5 spoofed SYN packets per second and query IPID for 120 seconds
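
The two phases on slide 24 suggest comparing how fast the client's IPID grows with and without the spoofed SYNs. The sketch below uses a plain comparison of mean per-second IPID differences, which is a simplification swapped in for illustration and is not claimed to be the analysis behind the plots. It assumes one IPID sample per second; the function names, the 0.5 and 1.5 thresholds, and all sample data are hypothetical. Only the rate of 5 SYNs per second comes from the slide.

```python
from statistics import mean

SYN_RATE = 5  # spoofed SYNs per second in phase 2, as stated on the slide


def per_second_differences(ipids):
    """IPID growth between consecutive one-second samples, modulo 2**16."""
    return [(b - a) % 65536 for a, b in zip(ipids, ipids[1:])]


def classify_phases(phase1_ipids, phase2_ipids, syn_rate=SYN_RATE):
    """Compare phase 2 growth against the phase 1 baseline.

    The 0.5 and 1.5 cut-offs are hypothetical thresholds on the number
    of extra client packets per spoofed SYN.
    """
    baseline = mean(per_second_differences(phase1_ipids))
    active = mean(per_second_differences(phase2_ipids))
    extra_per_syn = (active - baseline) / syn_rate
    if extra_per_syn < 0.5:
        return "server to client blocked"   # spoofed SYNs cause no extra packets
    if extra_per_syn < 1.5:
        return "no direction blocked"       # about one RST per spoofed SYN
    return "client to server blocked"       # several RSTs per spoofed SYN


# Synthetic data: the client sends 2 unrelated packets per second.
phase1 = [100 + 2 * i for i in range(10)]
open_path = [200 + 7 * i for i in range(10)]         # 2 + 5 * 1 per second
syn_ack_dropped = [200 + 2 * i for i in range(10)]   # baseline only
rst_dropped = [(65500 + 17 * i) % 65536 for i in range(10)]  # 2 + 5 * 3, wraps

for label, phase2 in [("open_path", open_path),
                      ("syn_ack_dropped", syn_ack_dropped),
                      ("rst_dropped", rst_dropped)]:
    print(label, "->", classify_phases(phase1, phase2))
```

Subtracting the phase 1 baseline is what lets the comparison tolerate a client that is not perfectly idle, as long as its background traffic stays roughly steady across both phases.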

  25. Censored Planet Use practical idle scans to provide a framework to globally measure censorship

  26. The Great Firewall's Active Probing ● Ran measurements and analyzed initial data: ○ 3 JavaScript-implemented Tor relays are accessible almost always ● Evidence of Active probing for Tor relays ○ Every 24+ h, GFW flushes blocked IPs ● Evidence of IP spoofing ○ GFW owns at least 248 netblocks that are used to spoof IPs
