THE POWER OF RED HAT CEPH STORAGE
And how it’s essential to your OpenStack environment
Jean-Charles Lopez, Sr. Technical Instructor, Global Storage Consulting Practice, Red Hat, Inc. (jcl@redhat.com)
May 2017, OpenStack Summit, Boston
STORAGE CONCEPTS
DIFFERENT KINDS OF STORAGE
• BLOCK STORAGE: Physical storage media appears to computers as a series of sequential blocks of a uniform size.
• FILE STORAGE: File systems allow users to organize data stored in blocks using hierarchical folders and files.
• OBJECT STORAGE: Object stores distribute data algorithmically throughout a cluster of media, without a rigid structure.
REPLICATION VS ERASURE CODING

REPLICATED POOL (CEPH STORAGE CLUSTER): full copies of stored objects
• Very high durability
• Quicker recovery
• Performance optimized

ERASURE CODED POOL (CEPH STORAGE CLUSTER): one copy plus parity
• Cost-effective durability
• Expensive recovery
• Capacity optimized
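A minimal sketch of creating each pool type from the admin node; the pool names, PG counts, and the k=4/m=2 erasure-code profile are illustrative assumptions, not values from this deck:

    # Replicated pool: full copies of each object
    ceph osd pool create rep_pool 128 128 replicated
    # Erasure-coded pool: define a profile (4 data chunks + 2 coding chunks), then create the pool
    ceph osd erasure-code-profile set ec-4-2 k=4 m=2
    ceph osd pool create ec_pool 128 128 erasure ec-4-2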
RED HAT CEPH STORAGE ARCHITECTURAL OVERVIEW
RED HAT CEPH STORAGE ARCHITECTURAL COMPONENTS
• RGW: A web services gateway for object storage, compatible with S3 and Swift
• RBD: A reliable, fully distributed block device with cloud platform integration
• CEPHFS*: A distributed file system with POSIX semantics and scale-out metadata
• LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby)
• RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors
* CephFS is Tech Preview in RHCS2
RED HAT CEPH STORAGE ARCHITECTURAL COMPONENTS (RADOS)
• RADOS: Reliable Autonomous Distributed Object Store. Software-based, comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors
* CephFS is Tech Preview in RHCS2
RADOS CLUSTER
OBJECT STORAGE DAEMONS (OSDs)
[Diagram: four OSDs, each running on a file system backed by its own disk]
• 10s to 10000s in a cluster
• One per disk (SSD, SAS, SATA, …)
• Serve stored objects to clients
• Minimum 3 per cluster
MONITORS (MONs)
• Maintain cluster membership and state
• Track the health of the cluster
• Provide consensus for distributed decision-making
• Deployed in small, odd numbers
• Do not serve stored objects to clients
• Minimum 3 per cluster
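To see the daemons described on the last two slides in a running cluster, the standard status commands can be run from any node with an admin keyring (illustrative, not specific to this deck):

    ceph -s            # overall cluster health, MON quorum, OSD counts, PG states
    ceph osd tree      # OSDs laid out against the CRUSH hierarchy, with weights and up/down status
    ceph mon stat      # monitor membership and current quorum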
WHERE DO OBJECTS LIVE?
[Diagram: an application with objects to store, facing a RADOS cluster]
INTRODUCTION TO CEPH DATA PLACEMENT: POOLS & PLACEMENT GROUPS
[Diagram: objects distributed into placement groups inside pools A, B, C and D]
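A few commands that expose the pool and placement-group layer; the pool name is a placeholder:

    ceph osd lspools                       # list existing pools
    ceph osd pool get {pool_name} pg_num   # number of placement groups in a pool
    ceph pg stat                           # summary of placement-group states across the cluster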
CONTROLLED REPLICATION UNDER SCALABLE HASHING
[Diagram: objects are hashed into placement groups, which CRUSH maps onto the cluster]
CRUSH IS A QUICK CALCULATION
[Diagram: objects mapped onto the cluster by calculation, with no lookup table]
CRUSH - DYNAMIC DATA PLACEMENT
• Pseudo-random placement algorithm
  • Fast calculation, no lookup
  • Repeatable, deterministic
• Statistically uniform distribution
• Stable mapping
  • Limited data migration on change
• Rule-based configuration
  • Infrastructure topology aware
  • Adjustable replication
  • Weighting
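Because placement is a calculation, the mapping for any object can be asked for directly; a hedged example with placeholder pool and object names:

    # Show which placement group and which OSDs CRUSH selects for a given object name
    ceph osd map {pool_name} {object_name}
    # The output lists the PG ID plus the up/acting set of OSDs; no central lookup table is consulted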
ACCESSING A RADOS CLUSTER
[Diagram: an application links against LIBRADOS and talks to the RADOS cluster over a socket to read and write objects]
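The rados command-line tool is itself a librados client, so it gives a quick way to exercise this path by hand; the pool, object, and file names below are placeholders:

    rados -p {pool_name} put my-object ./somefile        # store a file as an object
    rados -p {pool_name} ls                              # list objects in the pool
    rados -p {pool_name} get my-object ./somefile.copy   # read the object back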
THE RADOS GATEWAY (RGW)
[Diagram: REST clients talk to one or more RADOSGW instances, each built on LIBRADOS and connected to the RADOS cluster over a socket]
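A first S3-capable user can be created on the gateway node with radosgw-admin; the uid and display name here are illustrative:

    radosgw-admin user create --uid=demo --display-name="Demo User"
    # The command prints an access_key and secret_key that any S3-compatible client can use against the gateway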
STORING VIRTUAL DISKS
[Diagram: a VM's disk is served by its hypervisor through LIBRBD, which stores the image in the RADOS cluster]
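An RBD image that a hypervisor can attach through librbd is created with the rbd tool; the pool name, image name, and size are assumptions for illustration:

    rbd create {pool_name}/vm-disk-01 --size 10240   # size in MB, so a 10 GB image
    rbd info {pool_name}/vm-disk-01                  # show object size, striping and features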
SEPARATE COMPUTE FROM STORAGE
[Diagram: the same RBD image can be reattached from another hypervisor through LIBRBD, so a VM can move between compute nodes while its disk stays in the RADOS cluster]
KERNEL MODULE FOR MAX FLEXIBILITY
[Diagram: a Linux host maps the image with the KRBD kernel module and talks directly to the RADOS cluster]
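Mapping an image through the kernel module exposes it as an ordinary block device; the device name and mount point below are assumptions:

    rbd map {pool_name}/vm-disk-01   # prints the block device, e.g. /dev/rbd0
    mkfs.xfs /dev/rbd0               # format it like any other disk
    mount /dev/rbd0 /mnt/rbd-test
    rbd showmapped                   # list currently mapped images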
CEPHFS* - SEPARATE METADATA SERVER
[Diagram: a Linux host mounts CephFS with the kernel module; metadata is handled by the metadata server while file data goes straight to the RADOS cluster]
* CephFS is Tech Preview in RHCS2
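A kernel-client mount of CephFS looks like the following; the monitor address, secret file location, and mount point are placeholders, and CephFS remains Tech Preview in RHCS2:

    mount -t ceph {mon_host}:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
    df -h /mnt/cephfs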
RED HAT CEPH STORAGE OPENSTACK USE CASE
RED HAT CEPH STORAGE AND OPENSTACK
[Diagram: OpenStack services Keystone, Swift, Cinder, Glance, Nova and Manila consume Ceph through RADOSGW, LIBRBD (on the hypervisor) and CephFS*, all built on LIBRADOS in front of the RADOS cluster]
* CephFS is Tech Preview in RHCS2
RED HAT CEPH STORAGE DOING IT!
RED HAT CEPH STORAGE - RBD & GLANCE
On the Ceph admin node, run:
  ceph osd pool create {pool_name} {pg_num}                   <- pg_num should be a power of two
  ceph auth get-or-create {user_name} ... -o {keyring_file}
  scp {keyring_file} {unix_user}@{glance_node}:{path}         <- Provide read permission for Glance
  scp /etc/ceph/ceph.conf {unix_user}@{glance_node}:{path}    <- Provide read permission for Glance

Add the following to /etc/ceph/ceph.conf on the Glance node:
  [{user_name}]
  keyring = {path}

Edit /etc/glance/glance-api.conf on the Glance node:
  ...
  [glance_store]
  stores = rbd
  default_store = rbd
  show_image_direct_url = true
  rbd_store_user = {user_id}                                  <- If the user name is client.{id}, use {id}
  rbd_store_pool = {pool_name}
  rbd_store_ceph_conf = {Ceph configuration file path}
  rbd_store_chunk_size = {integer}                            <- Defaults to 8, i.e. 8 MB RBD objects

  [paste_deploy]
  flavor = keystone

Restart the Glance services
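As a concrete illustration only: the pool name "images", the user "client.glance", and the capabilities shown follow the upstream Ceph/OpenStack guide rather than this deck:

    ceph osd pool create images 128
    ceph auth get-or-create client.glance mon 'allow r' \
        osd 'allow class-read object_prefix rbd_children, allow rwx pool=images' \
        -o /etc/ceph/ceph.client.glance.keyring
    # After Glance is reconfigured and restarted, a raw image uploaded with
    #   openstack image create --disk-format raw --container-format bare --file rhel7.raw rhel7
    # should appear as an RBD image:  rbd -p images ls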
RED HAT CEPH STORAGE - RBD & CINDER
On the Ceph admin node, run:
  ceph osd pool create {pool_name} {pg_num}                   <- pg_num should be a power of two
  ceph auth get-or-create {user_name} ... -o {keyring_file}
  scp {keyring_file} {unix_user}@{cinder_node}:{path}         <- Provide read permission for Cinder
  scp /etc/ceph/ceph.conf {unix_user}@{cinder_node}:{path}    <- Provide read permission for Cinder

Add the following to /etc/ceph/ceph.conf on the Cinder node:
  [{user_name}]
  keyring = {path}

Edit /etc/cinder/cinder.conf on the Cinder node. Note that you can define multiple storage backends:
  ...
  [{cinder_backend_name}]
  volume_driver = cinder.volume.drivers.rbd.RBDDriver
  rbd_ceph_conf = {Ceph configuration file path}
  rbd_pool = {pool_name}
  rbd_secret_uuid = {UUID}
  rbd_user = {ceph_userid}

Restart the Cinder services
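Again as an illustration: the pool "volumes", the user "client.cinder", the simplified capabilities, and the backend name "ceph-rbd" are assumptions (production setups typically also grant access to the Glance and Nova pools), and enabled_backends is a standard cinder.conf option not shown on this slide:

    ceph osd pool create volumes 128
    ceph auth get-or-create client.cinder mon 'allow r' \
        osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes' \
        -o /etc/ceph/ceph.client.cinder.keyring
    # In cinder.conf, point the scheduler at the new backend:
    #   [DEFAULT]
    #   enabled_backends = ceph-rbd
    # After restarting Cinder:  openstack volume create --size 1 testvol
    # should create an RBD image visible with:  rbd -p volumes ls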
RED HAT CEPH STORAGE - RBD & LIBVIRT
Create a file (e.g. ceph.xml) on the compute node containing:
  <secret ephemeral="no" private="no">
    <uuid>{UUID}</uuid>
    <usage type="ceph">
      <name>{username} secret</name>
    </usage>
  </secret>

Run the commands:
  virsh secret-define --file ceph.xml
  virsh secret-set-value --secret {UUID} --base64 $(cat {ceph_user_name}.key)

Synchronize the libvirt secrets across compute nodes
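A hedged end-to-end sketch of the same steps with assumed names (user client.cinder, UUID generated locally):

    uuidgen > secret.uuid                                 # generate the UUID referenced in ceph.xml and cinder.conf
    ceph auth get-key client.cinder > client.cinder.key   # dump the Ceph key that libvirt will hold
    virsh secret-define --file ceph.xml
    virsh secret-set-value --secret $(cat secret.uuid) --base64 $(cat client.cinder.key)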
RED HAT CEPH STORAGE - RBD & NOVA
Edit /etc/nova/nova.conf on the Nova compute nodes:
  [libvirt]
  images_type = rbd
  images_rbd_pool = {pool_name}
  images_rbd_ceph_conf = {Ceph configuration file path}
  disk_cachemodes = "network=writeback"
  rbd_secret_uuid = {UUID}
  rbd_user = {ceph_userid}

Restart the Nova services
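To confirm that ephemeral disks really land in Ceph after the restart (the flavor, image, network, instance, and pool names are placeholders):

    openstack server create --flavor m1.small --image rhel7 --network private test-vm
    rbd -p {pool_name} ls    # the new instance's disk should appear as {instance_uuid}_disk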