1945: Vannevar Bush / The Internet End-to-End



  1. The Web (The Internet End-to-End)
     15-441, Spring 2018. Profs. Peter Steenkiste & Justine Sherry.
     Thanks to Scott Shenker, Sylvia Ratnasamy, Peter Steenkiste, and Srini Seshan for slides.

     1945: Vannevar Bush
     • "As we may think", Atlantic Monthly, July 1945.
     • Describes the idea of a distributed hypertext system
     • A "memex" that mimics the "web of trails" in our minds
     • Many other iterations before we got to the World Wide Web:
       • MINITEL in France. https://en.wikipedia.org/wiki/Minitel
       • Project Xanadu. https://en.wikipedia.org/wiki/Project_Xanadu
     • (Note that you don't need to know any of this history for exams; this is just for the curious...)

     Dec 9, 1968: "The Mother of All Demos"
     • First demonstration of a Memex-inspired system
     • Working prototype with hypertext, linking, use of a mouse...
     • https://www.youtube.com/watch?v=74c8LntW7fo

  2. 1989: Tim Berners-Lee
     • 1989: Tim Berners-Lee (CERN) writes an internal proposal to develop a distributed hypertext system
       • Connects "a web of notes with links"
       • Intended to help CERN physicists in large projects share and manage information
     • 1990: TBL writes a graphical browser for NeXT machines
     • 1992-1994: NCSA/Mosaic/Netscape browser releases

     Lots of Traffic!
     [Figure: growth of web traffic over time, from petabytes toward exabytes]

     What is an Exabyte?
     • Network prefixes are powers of 10: 1 exabyte = 1,000,000,000,000,000,000 bytes (10^18)
     • Storage prefixes are powers of 2: Exa = 2^60 bytes = 1,099,511,627,776 MByte

       Prefix   Network   Storage
       Kilo     10^3      2^10
       Mega     10^6      2^20
       Giga     10^9      2^30
       Tera     10^12     2^40
       Peta     10^15     2^50   (a few years ago)
       Exa      10^18     2^60   (today)
       Zetta    10^21     2^70   (in a few years)
       Yotta    10^24     2^80

     Hyper Text Transfer Protocol (HTTP)
     • Client-server architecture
     • Server is "always on" and "well known"
     • Clients initiate contact to the server
     • Synchronous request/reply protocol
     • Runs over TCP, Port 80
     • Stateless
     • ASCII format
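The HTTP properties listed above (client-initiated, synchronous request/reply, TCP port 80, plain ASCII messages) can be seen directly on the wire. The following is a minimal sketch, not part of the original slides, using only Python's standard socket module; the host example.com and the exact header set are illustrative assumptions.

```python
# Minimal sketch: one synchronous HTTP request/reply over TCP port 80.
# "example.com" and the headers below are illustrative, not taken from the slides.
import socket

HOST, PORT = "example.com", 80

with socket.create_connection((HOST, PORT)) as sock:   # client initiates contact
    # The request is plain ASCII: request line, header lines, blank line.
    request = (
        "GET / HTTP/1.1\r\n"
        f"Host: {HOST}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    sock.sendall(request.encode("ascii"))

    # The "always on" server replies on the same connection; read until it closes.
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)

reply = b"".join(chunks).decode("ascii", errors="replace")
print(reply.splitlines()[0])    # status line, e.g. "HTTP/1.1 200 OK"
```

Everything that crosses the wire here is readable text, which is exactly the "ASCII format" point on the slide.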

  3. Steps in HTTP Request/Response
     • Client establishes a TCP connection to the server
     • Client sends its request
     • Server sends its response
     • Connection is closed

     Client-to-Server Communication: HTTP Request Message
     • Request line: method, resource, and protocol version
     • Request headers: provide information or modify the request
     • Body: optional data (e.g., to "POST" data to the server)
     • Example (a blank line, i.e., carriage return + line feed, indicates end of message):

         GET /somedir/page.html HTTP/1.1        (request line)
         Host: www.someschool.edu                (header lines)
         User-agent: Mozilla/4.0
         Connection: close
         Accept-language: fr
         (blank line)

     Server-to-Client Communication: HTTP Response Message
     • Status line: protocol version, status code, status phrase
     • Response headers: provide information
     • Body: optional data
     • Example (a runnable sketch of one full exchange follows after this slide group):

         HTTP/1.1 200 OK                         (status line)
         Connection: close
         Date: Thu, 06 Aug 2006 12:00:15 GMT
         Server: Apache/1.3.0 (Unix)
         Last-Modified: Mon, 22 Jun 2006 ...
         Content-Length: 6821
         Content-Type: text/html
         (blank line)
         data data data data data data ...       (e.g., requested HTML file)

     HTTP is Stateless
     • Each request-response treated independently
     • Servers not required to retain state
     • Good: improves scalability on the server side
       • Failure handling is easier
       • Can handle a higher rate of requests
       • Order of requests doesn't matter
     • Bad: some applications need persistent state
       • Need to uniquely identify the user or store temporary info
       • e.g., shopping cart, user profiles, usage tracking, ...
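As a companion to the message formats above, here is a hedged sketch using Python's standard http.client module. The host and path copy the slide's example (www.someschool.edu/somedir/page.html), so they are placeholders rather than a reachable server; the printed fields correspond to the status line, response headers, and body described in these slides.

```python
# Sketch of one HTTP/1.1 request/response exchange with http.client.
# Host and path mirror the slide's example and are placeholders, not a real server.
import http.client

conn = http.client.HTTPConnection("www.someschool.edu", 80, timeout=10)

# Request line (method, resource, version) plus request headers; http.client
# terminates the header block with the blank line for us.
conn.request("GET", "/somedir/page.html", headers={
    "Host": "www.someschool.edu",
    "User-Agent": "Mozilla/4.0",
    "Connection": "close",
    "Accept-Language": "fr",
})

resp = conn.getresponse()
print(resp.version, resp.status, resp.reason)      # status line: version, code, phrase
for name, value in resp.getheaders():              # response headers (Date, Server, ...)
    print(f"{name}: {value}")
body = resp.read()                                 # optional body, e.g. the HTML file
conn.close()
```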

  4. How to Maintain State in a Stateless Protocol: Cookies
     • Client-side state maintenance
       • Client stores small state on behalf of the server
       • Client sends state in future requests to the server
       • Can provide authentication
     • Exchange: Request, then Response with "Set-Cookie: XYZ", then later Requests with "Cookie: XYZ"
       (a code sketch follows after this slide group)

     Performance Issues

     Performance Goals
     • User
       • fast downloads (not identical to low-latency communication!)
       • high availability
     • Content provider
       • happy users (hence, the above)
       • cost-effective infrastructure
     • Network (secondary)
       • avoid overload

     Solutions?
     • Improve HTTP to compensate for TCP's weak spots (addresses the user goals: fast downloads, high availability)
     • Content provider: happy users, cost-effective delivery infrastructure
     • Network (secondary): avoid overload
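The Set-Cookie / Cookie exchange sketched in the cookies slide looks roughly like this in code. This is an illustrative sketch only: the host shop.example.com, the paths, and the session cookie name are assumptions, and real deployments layer expiry, scoping, and security attributes on top.

```python
# Sketch of client-side state with cookies: the server hands back state in
# Set-Cookie, and the client echoes it in a Cookie header on later requests.
# Host, paths, and cookie name are illustrative assumptions.
import http.client

conn = http.client.HTTPConnection("shop.example.com", 80)

# First request: the client has no state yet.
conn.request("GET", "/login")
resp = conn.getresponse()
resp.read()                                    # drain the body before reusing the connection
cookie = resp.getheader("Set-Cookie")          # e.g. "session=XYZ; Path=/"

# Later request: the client stores the (small) state and sends it back.
if cookie:
    session = cookie.split(";", 1)[0]          # keep just "session=XYZ"
    conn.request("GET", "/cart", headers={"Cookie": session})
    print(conn.getresponse().status)

conn.close()
```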

  5. Solutions? (continued)
     The goals list is built up with one solution per stakeholder:
     • User (fast downloads, high availability): improve HTTP to compensate for TCP's weak spots
     • Content provider (happy users, cost-effective delivery infrastructure): caching and replication
     • Network (avoid overload): exploit economies of scale (web hosting, CDNs, datacenters)

     HTTP Performance
     • Most Web pages have multiple objects
       • e.g., an HTML file and a bunch of embedded images
     • How do you retrieve those objects (naively)?
       • One item at a time, i.e., one "GET" per TCP connection
       • Solution used in HTTP 0.9 and 1.0
       • New TCP connection per (small) object!
       • Lots of handshakes
       • Congestion control state lost across connections
       (a sketch of this naive approach follows after this slide group)

     Typical Workload (Web Pages)
     • Multiple (typically small) objects per page
     • File sizes
       • Heavy-tailed
       • Pareto distribution for the tail: Pr(X > x) = (x / x_m)^(-k)
       • Lognormal for the body of the distribution
     • Embedded references
       • Number of embedded objects is also Pareto
     • Lots of small objects versus TCP
       • 3-way handshake
       • Lots of slow starts
       • Extra connection state
     • This plays havoc with performance. Why?
     • Solutions?
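To make the "one GET per TCP connection" point concrete, here is a rough sketch of the naive HTTP/1.0-style retrieval loop: every embedded object pays for its own connection setup, and all TCP state is discarded afterwards. The host and object list are made up for illustration.

```python
# Naive retrieval: a brand-new TCP connection (3-way handshake, fresh slow start)
# for every small embedded object. Host and paths are illustrative.
import http.client

objects = ["/index.html", "/logo.png", "/style.css", "/photo1.jpg", "/photo2.jpg"]

for path in objects:
    conn = http.client.HTTPConnection("www.example.com", 80)   # new connection per object
    conn.request("GET", path, headers={"Connection": "close"})
    body = conn.getresponse().read()
    conn.close()            # handshake cost and congestion-control state thrown away here
    print(path, len(body), "bytes")
```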

  6. Improving HTTP Performance: Persistent Connections
     • Maintain the TCP connection across multiple requests
       • Including transfers subsequent to the current page
       • Client or server can tear down the connection
     • Performance advantages:
       • Avoid overhead of connection set-up and tear-down
       • Allow TCP to learn a more accurate RTT estimate
       • Allow the TCP congestion window to increase, i.e., leverage previously discovered bandwidth
     • Drawback? Head-of-line blocking: a "slow object" delays all later transfers
     • Default in HTTP/1.1
     (a persistent-connection sketch follows after this slide group)

     Improving HTTP Performance: Pipelined Requests & Responses
     • Batch requests and responses to reduce the number of packets
     • Multiple requests can be contained in one TCP segment
     • Head-of-line blocking issue remains: a delay in transfer 2 blocks retrieval of all later requests, including "fast" objects

     Improving HTTP Performance: Concurrent Requests & Responses
     • Use multiple connections in parallel
     • Speeds up retrieval by ~m (for m parallel connections)
     • Does not necessarily maintain the order of responses
     • Partially deals with head-of-line blocking
     • Costs? Client = ..., Content provider = ..., Network = ...
     [Figure: requests R1-R3 issued over parallel connections, with transfers T1-T3 overlapping]

     Scorecard: Getting n Small Objects
     Time dominated by latency
     • One-at-a-time: ~2n RTT
     • m concurrent: ~2 ⌈n/m⌉ RTT
     • Persistent: ~(n+1) RTT
     • Pipelined: ~2 RTT
     • Pipelined/Persistent: ~2 RTT first time, 1 RTT later
     • Why?
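For contrast with the naive loop above, the sketch below reuses a single connection for all objects, which is the persistent-connection idea (not pipelining: each request still waits for the previous response). Again, the host and object list are illustrative assumptions, not from the slides.

```python
# Persistent connection: one TCP setup, reused for every object, so TCP's RTT
# estimate and congestion window carry over between transfers.
import http.client

objects = ["/index.html", "/logo.png", "/style.css", "/photo1.jpg", "/photo2.jpg"]

conn = http.client.HTTPConnection("www.example.com", 80)    # set up once
for path in objects:
    conn.request("GET", path)            # HTTP/1.1 keeps the connection alive by default
    body = conn.getresponse().read()     # must drain each response before the next request
    print(path, len(body), "bytes")
conn.close()                             # torn down once, by client or server
```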

  7. Scorecard: Getting n Large Objects
     Time dominated by bandwidth
     • One-at-a-time: ~ n F/B
     • m concurrent: ~ [n/m] F/B
       • assuming each TCP connection gets the same bandwidth
     • Pipelined and/or persistent: ~ n F/B
     • The only thing that helps is getting more bandwidth...

     Improving HTTP Performance: Caching
     • Why does caching work?
       • Exploits locality of reference
     • How well does caching work?
       • Very well, up to a limit
       • Large overlap in content, assuming sharing with a large population of users
       • But many unique requests
     • Trend: increase in dynamic content
       • e.g., customizing of web pages
       • Reduces the benefits of caching
       • Some exceptions, e.g., video

     Improving HTTP Performance: Caching: Where?
     • Baseline: many clients transfer the same information
       • Generates unnecessary server and network load
       • Clients experience unnecessary latency
     • Cache everywhere!
       • Client
       • Forward proxies
       • Reverse proxies
       • Content Distribution Network
     [Figure: server, Tier-1 ISPs, ISP-1/ISP-2, and clients, showing possible cache placements]

     Improving HTTP Performance: Caching: Clients
     • Clients keep a local cache of recently accessed objects
     • Clients often have a small number of web pages they access frequently
     • Leads to reuse of logos, old content, java scripts, ...
     • Cheap: no additional infrastructure needed
     • But caching closer to the server can lead to higher hit rates!
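The client-side caching idea in the last two slides can be sketched as a tiny in-memory cache keyed by URL: repeated requests for the same logos, scripts, or frequently visited pages are answered locally instead of being re-fetched. This toy version ignores expiry and validation headers, which real caches must honor; the host and path are illustrative.

```python
# Toy client-side cache: exploit locality of reference by keeping recently
# fetched objects in memory, keyed by (host, path). Expiry/validation ignored.
import http.client

cache = {}

def fetch(host, path):
    key = (host, path)
    if key in cache:                         # cache hit: no server or network load
        return cache[key]
    conn = http.client.HTTPConnection(host, 80)
    conn.request("GET", path)
    body = conn.getresponse().read()
    conn.close()
    cache[key] = body                        # remember the object for later reuse
    return body

fetch("www.example.com", "/logo.png")        # first access goes to the network
fetch("www.example.com", "/logo.png")        # second access is served locally
```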
