The Domain Name Service, Etc.
Jeff Chase
Duke University, Department of Computer Science
CPS 212: Distributed Information Systems

Today
1. Domain Name Service (DNS) illustrates:
  • issues and structure for large-scale naming systems
      naming contexts
  • use of hierarchy for scalability
      decentralized administration of the name space
      hierarchical authority and trust
2. Role of DNS in wide-area request routing
  • DNS round robin
  • Content Distribution Networks: Akamai, Digital Island

DNS 101
Domain names are the basis for the Web’s global URL space.
    provides a symbolic veneer over the IP address space
    names for autonomous naming domains, e.g., cs.duke.edu
    names for specific nodes, e.g., fran.cs.duke.edu
    names for service aliases (e.g., www, mail servers)
  • Almost every Internet application uses domain names when it establishes a connection to another host.
The Domain Name System (DNS) is a planetary name service that translates Internet domain names.
    maps <node name> to <IP address>
    (mostly) independent of location, routing etc.

Domain Name Hierarchy
DNS name space is hierarchical:
  - fully qualified names are “little endian”
  - scalability
  - decentralized administration
  - domains are naming contexts
replaces primordial flat hosts.txt namespace
[Figure: domain name tree. Top-level domains (TLDs): generic TLDs com, gov, org, net, firm, shop, arts, web; country-code TLDs us, fr; .edu. Under .edu: duke, washington, unc. Subdomain labels: mc, cs, env, cs, cs. Leaf labels: www, whiteout (prophet).]
How is this different from hierarchical directories in distributed file systems?
Do we already know how to implement this?

DNS Implementation 101
DNS protocol/implementation:
  • UDP-based client/server
  • client-side resolvers
      typically in a library
      gethostbyname, gethostbyaddr
  • cooperating servers
      query-answer-referral model
      forward queries among servers
      server-to-server may use TCP (“zone transfers”)
  • common implementation: BIND
[Figure: a client's local DNS server sends “lookup www.nhc.noaa.gov” to the DNS server for nhc.noaa.gov, which answers “www.nhc.noaa.gov is 140.90.176.22”; the WWW server for nhc.noaa.gov is at IP 140.90.176.22.]

DNS Name Server Hierarchy
DNS servers are organized into a hierarchy that mirrors the name space.
Specific servers are designated as authoritative for portions of the name space.
Servers may delegate management of subdomains to child name servers.
Subdomains correspond to organizational (administrative) boundaries, which are not necessarily geographical.
Parents refer subdomain queries to their children.
Servers are bootstrapped with pointers to selected peer and parent servers.
Resolvers are bootstrapped with pointers to one or more local servers; they issue recursive queries.
[Figure: name server tree. Root servers list servers for every TLD (com, gov, org, net, firm, shop, arts, web, us, .edu, fr). Under .edu: unc, ..., duke. Under duke: cs, env, mc.]
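The query-answer-referral model on the slides above can be made concrete with a toy, in-memory walk down the server hierarchy. This is only a sketch under stated assumptions: the server labels ("root", "gov-server", "nhc-server") and the table layout are hypothetical, no network traffic is involved, and real name servers do considerably more. The one name-to-address pair is the example from the slides.

```python
# Toy model of the query-answer-referral walk; not a real DNS client.
# Server labels and table layout are hypothetical, chosen for illustration.
SERVERS = {
    "root": {"referrals": {"gov": "gov-server"}, "answers": {}},
    "gov-server": {"referrals": {"nhc.noaa.gov": "nhc-server"}, "answers": {}},
    "nhc-server": {
        "referrals": {},
        "answers": {"www.nhc.noaa.gov": "140.90.176.22"},
    },
}


def query(server, name):
    """Ask one server; return ("answer", ip), ("referral", child) or ("unknown", None)."""
    table = SERVERS[server]
    if name in table["answers"]:
        return ("answer", table["answers"][name])
    # Parents refer subdomain queries to the child with the longest matching suffix.
    best = None
    for suffix in table["referrals"]:
        if name == suffix or name.endswith("." + suffix):
            if best is None or len(suffix) > len(best):
                best = suffix
    if best is not None:
        return ("referral", table["referrals"][best])
    return ("unknown", None)


def resolve(name, start="root", max_hops=10):
    """Follow referrals from `start`; return (address or None, servers visited)."""
    server = start
    path = [server]
    for _ in range(max_hops):
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "referral":
            server = value
            path.append(server)
            continue
        return None, path
    raise RuntimeError("too many referrals")


if __name__ == "__main__":
    print(resolve("www.nhc.noaa.gov"))
```

In this sketch the caller follows each referral itself. A local server that accepts recursive queries from resolvers, as on the slide, would run a loop like `resolve` on the resolver's behalf and return only the final answer.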
DNS: The Politics
He who controls DNS controls the Internet.
  • TLD registry run by Network Solutions, Inc. until 9/98.
      US government (NSF) granted monopoly, regulated but not answerable to any US or international authority.
  • Registration is transitioning to a more open management structure involving an alphabet soup of organizations.
For companies, domain name == brand.
  • Squatters register/resell valuable domain name “real estate”.
  • Who has the right to register/use, e.g., coca-cola.com?

DNS: The Big Issues
1. Naming contexts
      I want to use short, unqualified names like whiteout instead of whiteout.cs.duke.edu when I’m in the cs.duke.edu domain.
2. What about trust?
      How can we know if a server is authoritative, or just an impostor?
      What happens if a server lies or behaves erratically?
      What denial-of-service attacks are possible?
      What about privacy?
3. What if an “upstream” server fails?
4. Is the hierarchical structure sufficient for scalability?
      more names vs. higher request rates

DNS Caching
Local server caches .edu, duke.edu, cs.duke.edu, and prophet.cs.duke.edu.
Caching of query responses allows subsequent queries to bypass the roots of the server hierarchy.
Each response is stamped with a time-to-live (TTL) to limit damage from stale cache entries.
What about negative caching: is it worthwhile to cache negative responses?
[Figure: a local server resolves prophet.cs.duke.edu through query/response exchanges with the TLD root, .edu, duke, and cs servers.]

DNS Replication
Every DNS domain has or should have at least one secondary name server replica.
  - configure peers to offload queries from primary
  - serve as authoritative backup
Secondary replicas keep themselves up to date by periodically fetching/refreshing the entire naming database via zone transfer (TCP).
The primary database is timestamped with a “serial number” to short-circuit if no updates have occurred since last zone transfer.
How to load-balance the secondaries?
What if primary is overloaded with too many secondaries requesting zone transfers?
[Figure: the domain admin updates the primary; the secondary (backup) pulls a zone transfer from the primary; queries may go to either. Server tree labels: .edu, duke, mc, cs.]

Reverse Translation
[Figure: address tree 152 → 3 (siblings ...2, 4...) → 140 → 5, ending at (prophet) 152.3.140.5.]

The Server Selection Problem
“Contact the weather service.”
Which network site? (server array A, server farm B)
Which server?
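Returning to the DNS Caching slide on this page: the TTL rule can be sketched as a small cache that stamps each entry with an expiry time and drops it once that time has passed. This is a minimal illustration, not how BIND or any particular server implements caching; the class name `TTLCache` and its methods are hypothetical, and the clock is injectable only so the expiry behaviour can be exercised without waiting.

```python
import time


class TTLCache:
    """Hypothetical name-to-address cache whose entries expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # name -> (address, expiry time)

    def put(self, name, address, ttl):
        """Cache an answer for `ttl` seconds, measured from now."""
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        """Return the cached address, or None if absent or stale."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires = entry
        if self._clock() >= expires:
            # Stale entry: drop it so the caller goes back to the servers.
            del self._entries[name]
            return None
        return address


if __name__ == "__main__":
    cache = TTLCache()
    cache.put("prophet.cs.duke.edu", "152.3.140.5", ttl=60)
    print(cache.get("prophet.cs.duke.edu"))
```

A short TTL bounds how long a stale answer can survive but sends more queries back up the hierarchy; a long TTL does the reverse. Negative caching, raised as a question on the slide, would amount to storing a "no such name" marker with its own TTL in the same structure.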
DNS Round Robin
Brisco (Rutgers), RFC 1794
What about DNS caching?
How to handle server failures?
How effective is the load-balancing?
Cisco DistributedDirector uses a more sophisticated DNS load balancing approach, based on its Director Response Protocol (DRP), and also incorporates HTTP redirection.
[Figure: servers a, b, c, d. The local DNS server sends “lookup www.nhc.noaa.gov” to the DNS server for nhc.noaa.gov, which answers “www.nhc.noaa.gov is IP address a” (or {b,c,d}).]

Generalized Cache/CDN (External View)
[Figure: Origin Servers exchange {push, request, reply} with Content Distribution Networks / Web Caches, which exchange {request, reply} with Clients.]

DNS-based Request Routing
How to apply the request routing function ƒ?
  • Some intermediary intercepts the request, and directs it to a selected site.
      Smart proxies or switches? E.g., look at URL or server IP address.
  • Or, interpose on the binding procedure, before the client sends the request itself.
      Smart clients, Active Names, RPC binding, or DNS lookup
Third-party CDNs are based on DNS servers that select the cache/replica site on DNS lookup for the request.
    Akamai, Digital Island, Web hosting providers (e.g., Exodus), etc.
    Like DNS-RR....but smarter...

Generalized Cache/CDN (Internal View)
[Figure: Interior Caches (root caches, reverse proxies, CDN caches); Request Routing Function ƒ; Leaf Caches (e.g., ISP proxies); bound client populations.]

Using DNS for Third-party CDNs
Intelligent DNS-based request routing has some tricky parts:
  • Third-party CDNs contract with content providers (e.g., Web sites such as cnn.com) to serve a subset of their content.
      Resource-rich content, e.g., images, audio, video.
  • To use DNS request routing, the CDN must assume DNS duties for the URLs that reference the content it serves.
  • The content provider does not want to designate the CDN as the authoritative DNS server for its domain (e.g., cnn.com).
Solution: make up new DNS domains for the client provider’s content served by the CDN.

Domain Granularity and “Akamaizing”
  • CDN (e.g., Akamai) creates new domain names for each client content provider.
      e.g., a128.g.akamai.net
  • The CDN’s DNS servers are authoritative for the new domains.
  • The client content provider modifies its content so that embedded URLs reference the new domains.
      “Akamaize” content, e.g.: http://www.cnn.com/image-of-the-day.gif becomes http://a128.g.akamai.net/image-of-the-day.gif.
  • Using multiple domain names for each client allows the CDN to further subdivide the content into groups.
      DNS sees only the requested domain name, but it can route requests for different domains independently.
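The URL rewrite described on the “Akamaizing” slide can be sketched in a few lines. This is an illustration only, not Akamai's actual tooling: the function name `akamaize` is hypothetical, and the sketch assumes the rewrite swaps only the host part of the URL and leaves the path untouched, as in the slide's example.

```python
from urllib.parse import urlsplit, urlunsplit


def akamaize(url, cdn_host):
    """Return `url` with its host replaced by `cdn_host`.

    Scheme, path, query and fragment are kept as they were, so only the
    domain name that DNS sees changes.
    """
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, cdn_host, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    # Example URL and CDN domain taken from the slide.
    print(akamaize("http://www.cnn.com/image-of-the-day.gif", "a128.g.akamai.net"))
```

Because only the host changes, the provider's own domain (cnn.com in the slide) keeps its existing authoritative servers, while lookups for the rewritten name go to the CDN's DNS servers, which can then pick a cache or replica site for that request.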