
NAS vs. SAN: Cutting through the BS



Storage moves into the net

[Slide figure: forces on network storage: network delays, network cost, network bandwidth, storage capacity/volume, administrative cost; distributed storage and consistency.]
• Shared storage with scalable bandwidth and capacity.
• Consolidate — multiplex — decentralize — replicate.
• Reconfigure to mix-and-match loads and resources.

Storage Abstractions

• relational database (IBM and Oracle)
  tables, transactions, query language
• file system
  hierarchical name space of files with ACLs
  Each file is a linear space of fixed-size blocks.
• block storage
  SAN, Petal, RAID-in-a-box (e.g., EMC)
  Each logical unit (LU) or volume is a linear space of fixed-size blocks.
• object storage
  object == file, with a flat name space: NASD, DDS, Porcupine
  Varying views of the object size: NASD/OSD/Slice objects may act as large-ish "buckets" that aggregate file system state.
• persistent objects
  pointer structures, requires transactions: OODB, ObjectStore

Storage as a service

• SSP: Storage Service Provider
• ASP: Application Service Provider
• Outsourcing: storage and/or applications as a service.
• For ASPs (e.g., Web services), storage is just a component.

Network Block Storage

One approach to scalable storage is to attach raw block storage to a network.
• Abstraction: the OS addresses storage by <volume, sector>.
  iSCSI, Petal, FibreChannel: access through a special device driver
• Dedicated Storage Area Network or general-purpose network.
  FibreChannel (FC) vs. Ethernet
• Volume-based administrative tools
  backup, volume replication, remote sharing
• Called "raw" or "block" storage, "storage volumes", or just "SAN".
• Least common denominator for any file system or database.

"NAS vs. SAN"

In the commercial sector there is a raging debate today about "NAS vs. SAN".
• Network-Attached Storage (NAS) has been the dominant approach to shared storage since NFS.
  NAS == NFS or CIFS: named files over Ethernet/Internet.
  E.g., Network Appliance "filers".
• Proponents of FibreChannel SANs market them as a fundamentally faster way to access shared storage.
  no "indirection through a file server" ("SAD")
  lower overhead on clients
  network is better/faster (if not cheaper) and dedicated/trusted
  Brocade, HP, Emulex are some big players.
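The <volume, sector> abstraction above is small enough to pin down in code. The following sketch is illustrative only (the names are invented, and an in-memory array stands in for an iSCSI/Petal/FibreChannel driver): it shows that a raw block volume exposes nothing but numbered fixed-size sectors, which is exactly why it is the least common denominator for any file system or database layered on top.

/* Hypothetical sketch of the raw block abstraction: the OS addresses storage
 * only by <volume, sector>; an in-memory array stands in for the real
 * iSCSI/Petal/FibreChannel transport behind a device driver. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512

struct volume {                 /* one logical unit (LU): a linear space of sectors */
    uint64_t nsectors;
    uint8_t *data;
};

static int blk_read(struct volume *v, uint64_t sector, void *buf) {
    if (sector >= v->nsectors) return -1;
    memcpy(buf, v->data + sector * SECTOR_SIZE, SECTOR_SIZE);
    return 0;
}

static int blk_write(struct volume *v, uint64_t sector, const void *buf) {
    if (sector >= v->nsectors) return -1;
    memcpy(v->data + sector * SECTOR_SIZE, buf, SECTOR_SIZE);
    return 0;
}

int main(void) {
    struct volume vol = { 1024, calloc(1024, SECTOR_SIZE) };
    uint8_t out[SECTOR_SIZE], in[SECTOR_SIZE] = "superblock goes here";

    /* A file system or database built on this LU sees only numbered sectors. */
    blk_write(&vol, 0, in);
    blk_read(&vol, 0, out);
    printf("sector 0: %s\n", out);
    free(vol.data);
    return 0;
}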

NAS vs. SAN: Cutting through the BS

• FibreChannel: a high-end technology incorporating NIC enhancements to reduce host overhead...
  ...but bogged down in interoperability problems.
• Ethernet is getting faster, faster than FibreChannel.
  gigabit, 10-gigabit, + smarter NICs, + smarter/faster switches
• The future battleground is Ethernet vs. Infiniband.
• The choice of network is fundamentally orthogonal to storage service design.
  Well, almost: flow control, RDMA, user-level access (DAFS/VI)
• The fundamental questions are really about abstractions.
  shared raw volume vs. shared file volume vs. private disks

Storage Architecture

Any of these abstractions can be built using any, some, or all of the others.
Use the "right" abstraction for your application.
Basic operations: create/remove, open/close, read/write.
The fundamental questions are:
• What is the best way to build the abstraction you want?
  division of function between device, network, server, and client
• What level of the system should implement the features and properties you want?

Duke Mass Storage Testbed

• Goal: managed storage on demand for cross-disciplinary research.
• Direct SAN access for "power clients" and NAS PoPs; other clients access through NAS.
[Slide figure: IBM Shark/HSM on a campus FC net; Brain Lab and Med Ctr clients attach over IP LANs.]

Problems

• poor interoperability
  Must have a common volume layout across heterogeneous SAN clients.
• poor sharing control
  The granularity of access control is an entire volume.
  SAN clients must be trusted.
  SAN clients must coordinate their access.
• $$$

Duke Storage Testbed, v2.0

• Each SAN volume is managed by a single NAS PoP.
• All access to each volume is mediated by its NAS PoP.
[Slide figure: IBM Shark/HSM on the campus FC net; NAS PoPs serve Brain Lab and Med Ctr clients over the campus IP net.]

Testbed v2.0: pro and con

• Supports resource sharing and data sharing.
• Does not leverage the Fibre Channel investment.
• Does not scale access to individual volumes.
• Prone to load imbalances.
• Data crosses the campus IP network in the clear.
• Identities and authentication must be centrally administered.
• It's only as good as the NAS clients, which tend to be fair at best.
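To make the v2.0 mediation rule concrete, here is a small illustrative sketch (the hostnames and the volume-to-PoP map are invented, not the real testbed configuration): every request for a volume is routed to that volume's single owning NAS PoP, so a hot volume sends all of its traffic to one PoP, which is the load-imbalance and per-volume scaling limitation noted above.

/* Hypothetical sketch of "each SAN volume is managed by a single NAS PoP":
 * a fixed volume-to-owner map routes every request to one mediator, so load
 * tracks volume popularity rather than balancing across PoPs. */
#include <stdio.h>

#define NPOPS 2

/* Illustrative data only; not the real testbed configuration. */
static const int volume_owner[] = { 0, 0, 1, 0, 1 };            /* volume id -> PoP */
static const char *pop_addr[NPOPS] = { "pop-a.example.duke.edu",
                                       "pop-b.example.duke.edu" };

static const char *route_request(int volume_id) {
    return pop_addr[volume_owner[volume_id]];    /* the volume's single mediator */
}

int main(void) {
    int load[NPOPS] = { 0, 0 };
    int trace[] = { 0, 0, 0, 0, 2, 0, 0 };       /* volume 0 is "hot" */
    for (unsigned i = 0; i < sizeof trace / sizeof trace[0]; i++) {
        printf("volume %d -> %s\n", trace[i], route_request(trace[i]));
        load[volume_owner[trace[i]]]++;
    }
    printf("requests per PoP: %d vs. %d\n", load[0], load[1]);
    return 0;
}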

Sharing Network Storage

How can we control sharing to a space of files or blocks?
• Access control, etc.
• Data model and storage abstraction
• Caching
• Optimistic replication
• Consistency
  One-copy consistency vs. weak consistency
  Read-only (immutable) files?
  Read-mostly files with weak consistency?
  Write-anywhere files?
• Distributed shared memory (software DSM)

File/Block Cache Consistency

• Basic write-ownership protocol.
• Timestamp validation (NFS).
  Timestamp each cache entry, and periodically query the server: "has this file changed since time t?"; invalidate the cache if stale.
• Callback invalidation (AFS, Sprite, Spritely NFS).
  Request notification (a callback) from the server if the file changes; invalidate the cache and/or disable caching on callback.
• Leases (NQ-NFS, NFSv4, DAFS)
  [Gray&Cheriton89, Macklem93]

Software DSM 101

Software-based distributed shared memory (DSM) provides an illusion of shared memory on a cluster.
• remote-fork the same program on each node
• data resides in a common virtual address space
• library/kernel collude to make the shared VAS appear consistent
• The Great War: shared memory vs. message passing
  for the full story, take CPS 221

Page-Based DSM (Shared Virtual Memory)

• The virtual address space is shared.
[Slide figure: one shared virtual address space backed by each node's physical DRAM across a switched interconnect.]

The Sequential Consistency Memory Model

• Sequential processors issue memory ops in program order.
• A switch is randomly set after each memory op, ensuring some serial order among all operations.
• Easily implemented with a shared bus.
[Slide figure: processors P1, P2, P3 connected through the switch to a single memory.]

Inside Page-Based DSM (SVM)

The page-based approach uses a write-ownership token protocol on virtual memory pages.
• Kai Li [Ivy SVM, 1986], Paul Leach [Apollo, 1982]
• Each node maintains a per-node, per-page access mode.
  {shared, exclusive, no-access}
  determines local accesses allowed
  For SVM, modes are enforced with VM page protection.

  mode        load (read)   store (write)
  shared      yes           no
  exclusive   yes           yes
  no-access   no            no

• For page-based DSM, weaker consistency models may be useful... but that's for later.
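The access-mode table above maps directly onto VM page protection. The sketch below is a hypothetical illustration, not Ivy or any production SVM: it protects one page as shared, lets a load through, traps the store in a SIGSEGV handler, and upgrades the page to exclusive at the point where a real system would first run the write-ownership token protocol (invalidating other copies and acquiring the token from the owner).

/* Hypothetical SVM sketch: enforce {shared, exclusive, no-access} page modes
 * with mprotect(). Loads on a shared page succeed; the first store faults, and
 * the fault handler is where the write-ownership protocol would run (stubbed). */
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

enum mode { NO_ACCESS, SHARED, EXCLUSIVE };

static void *page;                        /* one DSM page, for illustration */

static void set_mode(void *p, enum mode m) {
    static const int prot[] = { PROT_NONE, PROT_READ, PROT_READ | PROT_WRITE };
    mprotect(p, getpagesize(), prot[m]);  /* access mode -> VM page protection */
}

static void fault_handler(int sig, siginfo_t *si, void *ctx) {
    (void)sig; (void)ctx;
    /* A real SVM layer would contact the page's owner here, invalidate other
     * copies, and acquire the write token before upgrading the mode. */
    printf("write fault at %p: acquiring ownership (stub)\n", si->si_addr);
    set_mode(page, EXCLUSIVE);            /* the retried store now succeeds */
}

int main(void) {
    struct sigaction sa = { .sa_flags = SA_SIGINFO, .sa_sigaction = fault_handler };
    sigaction(SIGSEGV, &sa, NULL);

    page = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    set_mode(page, SHARED);               /* loads allowed, stores trapped */

    char c = *(volatile char *)page;      /* load: allowed in shared mode */
    *(volatile char *)page = c + 1;       /* store: faults, handler upgrades mode */
    printf("page now holds %d\n", *(volatile char *)page);
    return 0;
}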

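Looping back to the "File/Block Cache Consistency" list above, here is a hypothetical sketch of NFS-style timestamp validation; the RPC stub, the revalidation interval, and all names are illustrative assumptions. Each cache entry remembers the server modification time it saw; after a while the client asks "has this file changed since time t?" and drops the cached data on a mismatch.

/* Hypothetical sketch of timestamp validation: cache entries are trusted for a
 * short interval, then revalidated against the server's modification time. */
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

struct cache_entry {
    char   path[64];
    time_t mtime_seen;       /* server mtime when the data was cached */
    time_t last_checked;     /* when we last revalidated with the server */
    char   data[128];
    bool   valid;
};

/* Stand-in for a GETATTR-style RPC; a real client would ask the file server. */
static time_t server_mtime(const char *path) {
    (void)path;
    return 1000;             /* pretend the file was last written at t = 1000 */
}

/* Revalidate entries older than ttl seconds; invalidate if the file changed. */
static bool cache_entry_fresh(struct cache_entry *e, time_t now, time_t ttl) {
    if (!e->valid) return false;
    if (now - e->last_checked < ttl) return true;    /* trust it for a while  */
    e->last_checked = now;
    if (server_mtime(e->path) != e->mtime_seen) {    /* changed since time t? */
        e->valid = false;                            /* stale: drop cached data */
        return false;
    }
    return true;
}

int main(void) {
    struct cache_entry e = { "vol0/notes.txt", 900, 0, "cached bytes", true };
    printf("fresh at t=5?  %s\n", cache_entry_fresh(&e, 5, 30)  ? "yes" : "no");
    printf("fresh at t=60? %s\n", cache_entry_fresh(&e, 60, 30) ? "yes" : "no");
    return 0;
}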