HTTP Session Identification
Research project 2
Kevin de Kok, Marcus Bakker
30 June 2010
Agenda
● Introduction
● Research question
● Project scope
● Dataset
● Identification methods
● Conclusion
● Future work
● Questions?
30-06-10 HTTP Session Identification 2
Introduction (1)
● What is an HTTP session?
Introduction (2)
● The need to identify HTTP sessions [1]
● Not trivial to identify HTTP sessions
● HTTP is a sessionless protocol
● Request - Response
[1] T. Kinkhorst and M. van Kleij. Busting the ghost on the web: real time detection of drive-by-infections, June 2009. URL http://www.delaat.net/~cees/sne-2008-2009/p46/report.pdf.
Research question
● How can HTTP sessions be distinguished from each other?
Project scope
● RFC 2616
● The methods to identify an HTTP session will be developed for Web 1.0 (e.g. no Ajax)
● The HTTP session identification will be performed from a central point in the network (no host-based detection)
Dataset
● Labsite (bookmark)
● Opened three hyperlinks
● Security.nl (bookmark)
● Opened three hyperlinks
● 8 HTTP sessions (2 bookmarks + 6 hyperlinks)
Identification methods
● Two categories of methods:
● Start of an HTTP session
● HTTP message correlation
Start of an HTTP session
● Time between successive fetches
● Hyperlink present at GET header
● No referrer
Time between successive fetches (1)
● 10–600 ms [2]
● t > AOT
● Proof of Concept
[2] Y. Bhole and A. Popescu. Measurement and analysis of HTTP traffic, December 2005.
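The time-based heuristic can be sketched as follows. This is a minimal illustration, not the proof of concept from the slides: it assumes request timestamps (in seconds) from a single client are already available, and the function name `split_on_gaps` and the 0.6 s default threshold (the upper end of the 10–600 ms range, standing in for AOT) are assumptions made for the example.

```python
from typing import List

def split_on_gaps(timestamps: List[float], threshold: float = 0.6) -> List[List[float]]:
    """Group request timestamps into candidate sessions.

    A new group starts whenever the gap to the previous request
    exceeds `threshold` seconds (an assumed stand-in for AOT).
    """
    groups: List[List[float]] = []
    previous = None
    for t in sorted(timestamps):
        if previous is None or t - previous > threshold:
            groups.append([])  # gap too large: start a new candidate session
        groups[-1].append(t)
        previous = t
    return groups

# Hypothetical capture: two bursts of requests separated by 4.5 seconds.
bursts = split_on_gaps([0.00, 0.05, 0.30, 4.80, 4.95])
print(len(bursts))  # prints 2
```

A fixed threshold is the simplest choice; a client that fetches more slowly than the threshold would be split into too many candidate sessions.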
Time between successive fetches (2)
● “Slow” browsing (mobile phone?)
Hyperlink present at GET header (1)
● Hyperlink: 1/index.html
● GET header: /rp2/new_website/1/index.html
[Figure: a hyperlink in the HTML body matched to the GET header of an HTTP request message]
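One way to check whether a GET request corresponds to a hyperlink from an earlier response is sketched below, using only the Python standard library. The class `LinkCollector`, the function `is_followed_hyperlink`, and the sample HTML body are hypothetical; the sketch compares paths only and ignores the host, which a real monitor would also need to check.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

class LinkCollector(HTMLParser):
    """Collect the href values of <a> elements in an HTML body."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def is_followed_hyperlink(page_url, html_body, request_path):
    """Return True if request_path matches a hyperlink found in html_body."""
    collector = LinkCollector()
    collector.feed(html_body)
    # Resolve each (possibly relative) hyperlink against the page URL.
    paths = {urlsplit(urljoin(page_url, href)).path for href in collector.hrefs}
    return request_path in paths

# Sample values modelled on the slide; the HTML body itself is made up.
page = "http://bulbasaur.studlab.os3.nl/rp2/new_website/"
body = '<html><body><a href="1/index.html">Page 1</a></body></html>'
print(is_followed_hyperlink(page, body, "/rp2/new_website/1/index.html"))  # prints True
```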
Hyperlink present at GET header (2)
● 301 response message contains a hyperlink
No referrer (1)
● Address bar
● Bookmark
● Proof of Concept
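The check itself is small. The sketch below assumes the request headers have already been parsed into a dictionary; the function name `starts_session` and the sample header values are hypothetical. Note that the header is spelled `Referer` in HTTP.

```python
def starts_session(headers):
    """Return True if the request carries no (or an empty) Referer header.

    `headers` maps header names to values; names are compared
    case-insensitively, as HTTP header names are case-insensitive.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    return not normalized.get("referer", "").strip()

# Hypothetical requests: one typed into the address bar, one from a click.
typed_in = {"Host": "www.security.nl", "User-Agent": "example"}
clicked = {"Host": "www.security.nl", "Referer": "http://www.security.nl/"}
print(starts_session(typed_in), starts_session(clicked))  # prints True False
```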
No referrer (2)
● JavaScript removes the referrer
HTTP message correlation
● HTML body HTTP GET correlation
● Link the referrers
HTML body HTTP GET correlation (1)
● URI embedded object: lokaal_plaatje.png
● GET header: /rp2/new_website/lokaal_plaatje.png
[Figure: a picture referenced in the HTML body matched to the GET header of an HTTP request message]
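A sketch of this correlation for statically embedded objects, again with the standard library only. The `EMBED_ATTRS` table is an illustrative subset of tags, and the names `EmbeddedObjectCollector` and `correlate` as well as the sample HTML body are assumptions made for the example. Only paths are compared.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Tag -> attribute that references an embedded object (illustrative subset).
EMBED_ATTRS = {"img": "src", "script": "src", "link": "href"}

class EmbeddedObjectCollector(HTMLParser):
    """Collect the resolved paths of objects embedded in an HTML body."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        wanted = EMBED_ATTRS.get(tag)  # None for tags we do not track
        for name, value in attrs:
            if name == wanted and value:
                self.paths.add(urlsplit(urljoin(self.page_url, value)).path)

def correlate(page_url, html_body, request_paths):
    """Return the request paths that match an object embedded in html_body."""
    collector = EmbeddedObjectCollector(page_url)
    collector.feed(html_body)
    return [path for path in request_paths if path in collector.paths]

# Sample values modelled on the slide; the HTML body itself is made up.
page = "http://bulbasaur.studlab.os3.nl/rp2/new_website/"
body = '<html><body><img src="lokaal_plaatje.png"></body></html>'
seen = ["/rp2/new_website/lokaal_plaatje.png", "/favicon.ico"]
print(correlate(page, body, seen))
```

Objects that a script writes into the page at run time do not appear in the static HTML, so a parser like this one cannot see them.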
HTML body HTTP GET correlation (2)
● JavaScript:
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
Link the referrers (1)
● Host+GET header: bulbasaur.studlab.os3.nl/rp2/new_website/
● Referrer: http://bulbasaur.studlab.os3.nl/rp2/new_website/
[Figure: the referrer of an image request matched to the Host+GET header of the HTTP request message for the HTML body]
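Linking referrers can be sketched as a lookup from each request's Referer to the pages already identified as session starts. The function `link_referrers`, its input shape (a list of `(url, referer)` pairs), and the second sample request are assumptions made for the example.

```python
def link_referrers(session_starts, requests):
    """Attach each request to the session whose start URL equals its Referer.

    session_starts: URLs already identified as the start of a session.
    requests: (url, referer) pairs; referer may be None.
    Returns (sessions, unmatched), where sessions maps a start URL
    to the list of request URLs that named it as their Referer.
    """
    sessions = {start: [] for start in session_starts}
    unmatched = []
    for url, referer in requests:
        if referer in sessions:
            sessions[referer].append(url)
        else:
            unmatched.append(url)  # no known page claims this request
    return sessions, unmatched

# Sample values modelled on the slide; the second request is made up.
start = "http://bulbasaur.studlab.os3.nl/rp2/new_website/"
observed = [
    (start + "lokaal_plaatje.png", start),
    ("http://example.org/ad.js", "http://example.org/frame"),
]
sessions, unmatched = link_referrers([start], observed)
print(sessions[start])
print(unmatched)
```

Requests whose Referer does not match any known page end up in `unmatched`, which is where requests with a script-modified referrer would land.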
Link the referrers (2)
● JavaScript can change the referrer:
http://pagead2.googlesyndication.com/pagead/ads?client=<VERY LONG STRING>
Conclusion
● Start of an HTTP session
  ● Time between successive fetches
  ● Hyperlink present at GET header
  ● No referrer
● HTTP message correlation
  ● HTML body HTTP GET correlation
  ● Link the referrers
Future work
● Large scale testing
● Time between successive fetches for mobile phones
● Web 2.0
Questions?