Enabling Data-Intensive Science with Tactical Storage Systems
Prof. Douglas Thain
University of Notre Dame
http://www.cse.nd.edu/~dthain
The Cooperative Computing Lab
Our model of computer science research:
– Understand how users with complex, large-scale applications need to interact with computing systems.
– Design novel computing systems that can be applied by many different users == basic CS research.
– Deploy code in real systems with real users, suffer real bugs, and learn real lessons == applied CS.
Application Areas:
– Astronomy, Bioinformatics, Biometrics, Molecular Dynamics, Physics, Game Theory, ... ???
External Support: NSF, IBM, Sun
http://www.cse.nd.edu/~ccl
Abstract
Users of distributed systems encounter many practical barriers between their jobs and the data they wish to access.
Problem: Users have access to many resources (disks), but are stuck with the abstractions (cluster NFS) provided by administrators.
Solution: Tactical Storage Systems allow any user to create, reconfigure, and tear down abstractions without bugging the administrator.
The Standard Model
[Figure labels: shared disk; Transparent Distributed Filesystem]
The Standard Model
[Figure labels: private disk; shared disk; Transparent Distributed Filesystem; FTP, SCP, RSYNC, HTTP, ...]
Problems with the Standard Model
Users encounter partitions in the WAN.
– Easy to access data inside cluster, hard outside.
– Must use different mechanisms on different links.
– Difficult to combine resources together.
Different access modes for different purposes.
– File transfer: preparing system for intended use.
– File system: access to data for running jobs.
Resources go unused.
– Disks on each node of a cluster.
– Unorganized resources in a department/lab.
A global file system can't satisfy everyone!
What if...
Users could easily access any storage?
I could borrow an unused disk for NFS?
An entire cluster can be used as storage?
Multiple clusters could be combined?
I could reconfigure structures without root?
– (Or bugging the administrator daily.)
Solution: Tactical Storage System (TSS)
Outline
Problems with the Standard Model
Tactical Storage Systems
– File Servers, Catalogs, Abstractions, Adapters
Applications:
– Remote Database Access for BaBar Code
– Remote Dynamic Linking for CDF Code
– Logical Data Access for Bioinformatics Code
– Expandable Database for MD Simulation
Improving the OS for Grid Computing
Tactical Storage Systems (TSS)
A TSS allows any node to serve as a file server or as a file system client.
All components can be deployed without special privileges – but with security.
Users can build up complex structures.
– Filesystems, databases, caches, ...
Two Independent Concepts:
– Resources – The raw storage to be used.
– Abstractions – The organization of storage.
[Figure labels: App; Adapter; Central Filesystem; Distributed Filesystem Abstraction; Distributed Database Abstraction; 3PT file transfer; ???; a row of file servers, each on top of a UNIX file system.]
[Figure captions: "Cluster administrator controls policy on all storage in cluster." "Workstation owners control policy on each machine."]
Components of a TSS:
1 – File Servers
2 – Catalogs
3 – Abstractions
4 – Adapters
1 – File Servers
Unix-Like Interface
– open/close/read/write
– getfile/putfile to stream whole files
– opendir/stat/rename/unlink
Complete Independence
– choose friends
– limit bandwidth/space
– evict users?
Trivial to Deploy
– run server + setacl
– no privilege required
– can be thrown into a grid system
Flexible Access Control
[Figure labels: file server A and file server B, each on a file system, connected by the Chirp Protocol; "owner of server A"; "owner of server B".]
Related Work
Lots of file services for the Grid:
– GridFTP, NeST, SRB, RFIO, SRM, IBP, ...
– Adapter interfaces with many of these!
Why have another file server?
– Reason 1: Must have precise Unix semantics!
  Apps distinguish ENOENT vs EACCES vs EISDIR.
  FTP always returns error 550, regardless of error.
– Reason 2: TSS focused on easy deployment.
  No privilege required, no config files, no rebuilding, flexible access control, ...
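To make Reason 1 concrete, here is a minimal Python sketch of why precise error codes matter to an application. It assumes a POSIX system; `open_error_name` and `ftp_style_reply` are hypothetical helpers written for this illustration and are not part of any TSS or FTP library. The second function only mimics the behavior the slide describes, where every failure collapses to reply 550.

```python
import errno
import os
import tempfile


def open_error_name(path):
    """Return the symbolic errno name raised by opening `path`, or None on success."""
    try:
        with open(path, "rb"):
            return None
    except OSError as e:
        # errno.errorcode maps numeric codes to names such as "ENOENT".
        return errno.errorcode.get(e.errno, str(e.errno))


def ftp_style_reply(path):
    """Hypothetical FTP-like view: any failure becomes 550, success becomes 226."""
    return 226 if open_error_name(path) is None else 550


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        missing = os.path.join(d, "missing.dat")
        # The local interface tells the two failures apart...
        print(open_error_name(missing), open_error_name(d))
        # ...while the FTP-style view reports the same code for both.
        print(ftp_style_reply(missing), ftp_style_reply(d))
```

An application that retries on a missing file but aborts on a permission problem can make that decision from the first function, and cannot from the second.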
Access Control in File Servers
Unix Security is not Sufficient
– No global user database possible/desirable.
– Mapping external credentials to Unix gets messy.
Instead, Make External Names First-Class
– Perform access control on remote, not local, names.
– Types: Globus, Kerberos, Unix, Hostname, Address
Each directory has an ACL:
globus:/O=NotreDame/CN=DThain    RWLA
kerberos:dthain@nd.edu           RWL
hostname:*.cs.nd.edu             RL
address:192.168.1.*              RWLA
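The per-directory ACL above can be sketched as a small lookup in Python. This is an illustration only, not the file server's actual code: the function names are hypothetical, wildcard matching is done with the standard `fnmatch` module, and taking the union of rights across all matching entries is an assumption of this sketch. The rights letters are treated as opaque flags.

```python
from fnmatch import fnmatchcase

# ACL entries copied from the slide: (subject pattern, rights string).
ACL = [
    ("globus:/O=NotreDame/CN=DThain", "RWLA"),
    ("kerberos:dthain@nd.edu", "RWL"),
    ("hostname:*.cs.nd.edu", "RL"),
    ("address:192.168.1.*", "RWLA"),
]


def rights_for(subject, acl):
    """Collect the rights of every entry whose pattern matches `subject`.

    Using the union of all matching entries is an assumption of this sketch.
    """
    granted = set()
    for pattern, rights in acl:
        if fnmatchcase(subject, pattern):
            granted |= set(rights)
    return granted


def allowed(subject, acl, needed):
    """True if `subject` holds every right letter in `needed`."""
    return set(needed) <= rights_for(subject, acl)


if __name__ == "__main__":
    # The subjects below are hypothetical examples.
    print(allowed("hostname:ccl00.cs.nd.edu", ACL, "R"))
    print(allowed("hostname:ccl00.cs.nd.edu", ACL, "W"))
```

Note that the subject is the remote name exactly as authenticated (type prefix included), so no mapping to a local Unix account is needed.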
Problem: Shared Namespace
[Figure: a file server with a single directory whose ACL is "globus:/O=NotreDame/* RWLAX", containing test.c, test.dat, a.out, and cms.exe.]
Solution: Reservation (V) Right
[Figure: users /O=NotreDame/CN=Monk and /O=NotreDame/CN=Ted each issue mkdir against a file server whose top-level ACL is "/O=NotreDame/CN=* V(RWLA)" (mkdir only!). The resulting directories carry the ACLs "/O=NotreDame/CN=Monk RWLA" and "/O=NotreDame/CN=Ted RWLA", and each contains its own test.c and a.out.]
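The reservation right can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not the server's implementation: `reserve_mkdir` and the `acls` dictionary are hypothetical, subjects carry a `globus:` prefix as in the earlier ACL slide, and the only behavior modeled is the one in the figure, where a subject matching a `V(...)` entry may create a directory and the new directory's ACL grants that subject alone the rights inside the parentheses.

```python
from fnmatch import fnmatchcase


def reserve_mkdir(acls, parent, name, subject):
    """Create parent/name on behalf of `subject` using a V(...) entry.

    `acls` maps a directory path to {subject pattern: rights string}.
    """
    for pattern, rights in acls[parent].items():
        is_reservation = rights.startswith("V(") and rights.endswith(")")
        if is_reservation and fnmatchcase(subject, pattern):
            path = parent.rstrip("/") + "/" + name
            if path in acls:
                raise FileExistsError(path)
            # The new directory belongs to the caller only, with the
            # rights named inside V(...).
            acls[path] = {subject: rights[2:-1]}
            return path
    raise PermissionError(f"{subject} may not mkdir in {parent}")


if __name__ == "__main__":
    acls = {"/": {"globus:/O=NotreDame/CN=*": "V(RWLA)"}}
    reserve_mkdir(acls, "/", "monk", "globus:/O=NotreDame/CN=Monk")
    reserve_mkdir(acls, "/", "ted", "globus:/O=NotreDame/CN=Ted")
    print(acls["/monk"], acls["/ted"])
```

Because the top-level entry grants only V, Monk and Ted cannot read or overwrite files in the shared root or in each other's directories, which addresses the shared-namespace problem on the previous slide.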