
Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahara in Newton

Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahara in Newton. Vitaly Gridnev, Sahara PTL (Mirantis); Elise Gafford, Sahara Core (Red Hat); Nikita Konovalov, Sahara Core (Mirantis)


  1. Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahara in Newton
     Vitaly Gridnev, Sahara PTL (Mirantis)
     Elise Gafford, Sahara Core (Red Hat)
     Nikita Konovalov, Sahara Core (Mirantis)

  2. Agenda 1. Sahara overview 2. Health checks and management improvements 3. Kerberos integration for clusters 4. Image generation improvements 5. Bare metal clusters 6. What is NEW in NEWton 7. Q&A

  3. Agenda 1. Sahara overview 2. Health checks and management improvements 3. Kerberos integration for clusters 4. Image generation improvements 5. Bare metal clusters 6. What is NEW in NEWton 7. Q&A

  4. Sahara: The Use Cases
     ● Data Processing Cluster Management
       ○ On-demand, scalable, configurable, persistent clusters
       ○ Supports multiple plugins (Apache, Ambari, CDH, MapR...)
       ○ Integrates with Heat, Glance, Nova, Neutron, and Cinder
     ● EDP (Elastic Data Processing)
       ○ Supports multiple job types (Java, MR, Hive, Pig, Spark, Storm...)
       ○ Supports transient clusters (spin up, process, shut down) or persistent clusters
       ○ Integrates with Swift and/or Manila (optionally)

  5. Sahara: The API

  6. Sahara: The Project
     ● Cluster provisioning plugins:
       ○ Cloudera Distribution of Hadoop (using Cloudera Manager)
       ○ Hortonworks Data Platform (using Apache Ambari)
       ○ MapR
       ○ “Vanilla” Apache Hadoop, Spark, and Storm
     ● EDP job types:
       ○ MapReduce, Java, Hive, and Pig jobs (using Apache Oozie)
       ○ Spark, Spark Streaming, and Storm jobs (using Apache Spark and Apache Storm)
     ● Image packing repository (sahara-image-elements)
     ● Framework to validate Sahara installation (sahara-tests)
     ● UI plugin
     ● OpenStackClient plugin

  7. Agenda 1. Sahara overview 2. Health checks and management improvements 3. Kerberos integration for clusters 4. Image generation improvements 5. Bare metal clusters 6. What is NEW in NEWton 7. Q&A

  8. Event log for clusters
     ● Provisioning events for each cluster show the current provisioning status or the reason for a failure
     ● Available since Newton for clusters created with the Ambari plugin
     ● Supported in the CLI since Newton, with a full dump of all steps and events
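To illustrate how a dump of steps and events can answer "where is provisioning now, and why did it fail?", here is a minimal sketch. The `Step` and `Event` classes, their fields, and the step names are hypothetical stand-ins for illustration only; they are not Sahara's actual data model or CLI output.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    node: str          # hypothetical: name of the node the event refers to
    successful: bool
    info: str = ""     # hypothetical: failure reason, if any


@dataclass
class Step:
    name: str
    total: int         # number of events expected for this step
    events: list = field(default_factory=list)


def summarize(steps):
    """Return one human-readable status line per provisioning step."""
    lines = []
    for step in steps:
        failures = [e for e in step.events if not e.successful]
        done = sum(1 for e in step.events if e.successful)
        if failures:
            first = failures[0]
            lines.append(f"{step.name}: FAILED on {first.node}: {first.info}")
        elif done >= step.total:
            lines.append(f"{step.name}: completed ({done}/{step.total})")
        else:
            lines.append(f"{step.name}: in progress ({done}/{step.total})")
    return lines
```

Printing `summarize(steps)` line by line gives a compact view of the same information a full event dump contains.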

  9. Event log for clusters

  10. Event log for clusters

  11. Health checks for clusters
     ● Users want to monitor cluster state after provisioning; this is vital for long-lived clusters
     ● Sahara in Liberty had no monitoring of the health of cluster processes. A cluster could be broken or unavailable while Sahara still reported it as ACTIVE.

  12. Health checks for clusters
     ● Cluster health checks have been available since Mitaka
     ● Available for clusters deployed using Ambari and Cloudera Manager; coverage is more limited for vanilla clusters
     ● Since Newton, checks are also available for the MapR plugin
     ● Health results can be configured to send notifications to Ceilometer
     ● Easy to recheck health
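The idea of rechecking health and notifying on a change can be sketched roughly as follows. This is an illustration only: the `Health` levels, the `recheck` function, and the `notify` callback are assumptions made for the example, not Sahara's implementation or its notification format.

```python
from enum import IntEnum


class Health(IntEnum):
    # Assumed severity scale for this sketch; higher means worse.
    GREEN = 0
    YELLOW = 1
    RED = 2


def recheck(checks, previous, notify):
    """Run every check, report the worst result, notify only on a change.

    checks:   dict mapping a check name to a zero-argument callable
              returning a Health value (hypothetical interface)
    previous: the Health value from the last run
    notify:   callback taking (status, results); stands in for a
              notification sent to an external service
    """
    results = {name: check() for name, check in checks.items()}
    # With no checks configured, this sketch reports GREEN.
    status = max(results.values(), default=Health.GREEN)
    if status != previous:
        notify(status, results)
    return status
```

Taking the worst individual result as the cluster status keeps a single failing service from being hidden behind otherwise healthy ones.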

  13. Health checks for clusters

  14. Health checks for clusters

  15. Health checks for clusters

  16. Health checks for clusters
     Next steps are:
     ● More detailed health checks
       ○ Particular datanode/slave failure
       ○ Not enough space in HDFS
     ● Suggestions/actions to repair health:
       ○ Datanode replacement
       ○ New nodes
       ○ Restarting services
     ● More flexible configuration of health checks (advanced health checks, disabling/enabling specific health checks when needed)

  17. Agenda 1. Sahara overview 2. Health checks and management improvements 3. Kerberos integration for clusters 4. Image generation improvements 5. Bare metal clusters 6. What is NEW in NEWton 7. Q&A

  18. Security improvements
     ● Security is an important part of created clusters
     ● Previously, security could be enabled only by calling the managers (Ambari and Cloudera Manager) directly, but in that situation Sahara cannot perform auth operations and EDP does not work
     ● Security is important not just for clusters, but for Sahara itself

  19. Security improvements
     In Newton the following Kerberos security features were implemented:
     ● MIT KDC can be preconfigured (or an existing KDC can be used)
     ● Oozie client was re-implemented to support auth operations with Kerberos
     ● Spark job executions are also supported
     ● Keys are distributed on nodes for system users (hdfs, hadoop, spark)
     ● Supported for clusters deployed using Ambari and Cloudera Manager
     ● Note: Be sure that latest hadoop-swift jars are in place for Swift data sources!
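As a rough illustration of what "keys for system users on each node" involves, the sketch below lists one Kerberos principal per system user per host, using the standard `primary/instance@REALM` form. Whether Sahara creates per-host principals in exactly this shape is an assumption; the function name, host names, and realm are hypothetical.

```python
# System users named on the slide.
SYSTEM_USERS = ("hdfs", "hadoop", "spark")


def principals_for(hosts, realm, users=SYSTEM_USERS):
    """Map each host to the principals whose keys it would need.

    Assumption for this sketch: one service-style principal per user
    and host, written as user/host@REALM.
    """
    return {
        host: [f"{user}/{host}@{realm}" for user in users]
        for host in hosts
    }
```

For example, `principals_for(["master-0"], "EXAMPLE.ORG")` yields the three principals a node named `master-0` would need keys for under this assumption.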

  20. Security improvements

  21. Security improvements
     ● Bandit tests per commit
     ● Improved secret storage (using Barbican and Castellan) was implemented in the previous release

  22. Agenda 1. Sahara overview 2. Health checks and management improvements 3. Kerberos integration for clusters 4. Image generation improvements 5. Bare metal clusters 6. What is NEW in NEWton 7. Q&A

  23. Where we were
     Sahara had 2 flows that were relevant to image manipulation:
     ● Pre-Nova spawn image packing
       ○ Used sahara-image-elements repository to generate images (to store in Glance)
     ● Post-Nova spawn cluster generation from “clean” (OS-only) images
       ○ Logic maintained in Sahara process within plugins
     ● Pre-Configuration validation of images by plugins
       ○ Remember how I said we had 2 flows relevant to image manipulation?
       ○ We didn’t do this at all.

  24. Where We Were: Problems
     ● Duplication of logic
       ○ Steps required for packing images and “clean” image clusters were often identical, but had to be expressed separately (in DIB and in Python).
     ● Poor validation
       ○ Plugins did not validate that images provided to them met their needs.
       ○ Failures due to image contents were late and sometimes difficult to understand.
     ● Poor encapsulation
       ○ Image generation and cluster provisioning logic for any one plugin are really one application
       ○ Maintaining them in two places allows versionitis and dependency problems
       ○ Having one monolithic repo for all plugins makes them less pluggable

  25. Our Dream Implementation
     ● All flows share common logic:
       ○ Image packing
       ○ Image validation
       ○ Clean image cluster gen
     ● Image manipulation is stored and versioned within plugins
     ● The user can still generate images with a CLI...
     ● But they can also use an API to generate images in clean build environments
     ● ... And both dev test cycles and user retries are as quick and painless as possible

  26. The plan
     1. Build a validation engine that ensures that images meet a specification
        a. YAML-based spec definition
     2. Extend that engine to optionally modify images to spec
     3. Build a CLI to expose this functionality
     4. Create and test specifications for each plugin to support this method
     5. Deprecate sahara-image-elements (only when this method proves stable)
     6. Build an API to:
        a. Spawn a clean tenant-plane image build environment
        b. Download a base image from Glance and modify it to spec
        c. Push the new image back to Glance and register it for use by Sahara

  27. Where we are
     1. Build a validation engine that ensures that images meet a specification
        a. YAML-based spec definition
     2. Extend that engine to optionally modify images to spec
     3. Build a CLI to expose this functionality
     4. Create and test specifications for each plugin to support this method
     5. Deprecate sahara-image-elements (only when this method proves stable)
     6. Build an API to:
        a. Spawn a clean tenant-plane image build environment
        b. Download a base image from Glance and modify it to spec
        c. Push the new image back to Glance and register it for use by Sahara

  28. What it looks like: the specs
     ● YAML-based definitions
     ● Argument definitions for configurability
     ● Idempotent resource declarations
       ○ Scripts must be written idempotently, as always in resource declarations
     ● Logical control operators (any, all, os_case, etc.)
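How a spec with `any`, `all`, and `os_case` operators might be evaluated can be sketched as below. The spec is written as a Python dictionary rather than YAML to keep the example self-contained. The node layout, the `package` resource, the `env` dictionary, and the package names are all assumptions for illustration; this is not Sahara's spec schema or engine.

```python
def run(node, env, test_only=True):
    """Evaluate one spec node; return True if the image satisfies it.

    env is a stand-in for the image: {"distro": str, "installed": set}.
    With test_only=False, a missing package is "installed" by adding it
    to the set, so running twice changes nothing the second time.
    """
    (kind, body), = node.items()  # each node has exactly one key
    if kind == "all":
        return all(run(child, env, test_only) for child in body)
    if kind == "any":
        # Prefer an alternative that is already satisfied before modifying.
        if any(run(child, env, True) for child in body):
            return True
        return (not test_only) and any(run(c, env, False) for c in body)
    if kind == "os_case":
        for case in body:
            (distro, steps), = case.items()
            if distro == env["distro"]:
                return all(run(step, env, test_only) for step in steps)
        return False  # no branch for this distribution
    if kind == "package":
        if body in env["installed"]:
            return True
        if test_only:
            return False
        env["installed"].add(body)  # simulated install
        return True
    raise ValueError(f"unknown spec node: {kind}")


# Illustrative spec; the package names are examples only.
SPEC = {"all": [
    {"os_case": [
        {"centos": [{"package": "java-1.8.0-openjdk"}]},
        {"ubuntu": [{"package": "openjdk-8-jdk"}]},
    ]},
    {"any": [{"package": "wget"}, {"package": "curl"}]},
]}
```

Calling `run(SPEC, env)` only validates, while `run(SPEC, env, test_only=False)` brings the stand-in image up to spec using the same declarations.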

  29. What it looks like: the CLI
     Command structure:
       sahara-image-pack --image ./image.qcow2 PLUGIN VERSION [plugin arguments]
     Features:
     ● Auto-generates help text from arguments
     ● Idempotent and modifies images in-place
       ○ Very fast test cycles and retries
     ● Allows freeform bash scripts and more structured resources
       ○ Though it’s on you to make your scripts idempotent
     ● Test-only mode to validate without change
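Generating help text from argument definitions can be sketched with Python's standard `argparse` module. This is not the `sahara-image-pack` source: the program name `image-pack-sketch`, the `--test-only` flag name, and the `PLUGIN_ARGS` definitions are assumptions for illustration; only `--image`, `PLUGIN`, and `VERSION` come from the command structure above.

```python
import argparse

# Hypothetical argument definitions, as a plugin spec might declare them.
PLUGIN_ARGS = {
    "java_distro": {
        "description": "Java distribution to install",
        "default": "openjdk",
        "choices": ["openjdk", "oracle-java"],
    },
}


def build_parser(plugin_args):
    """Build a parser whose options and help come from the definitions."""
    parser = argparse.ArgumentParser(prog="image-pack-sketch")
    parser.add_argument("--image", required=True,
                        help="path to the image to modify in place")
    parser.add_argument("--test-only", action="store_true",
                        help="validate the image without changing it")
    parser.add_argument("plugin", metavar="PLUGIN")
    parser.add_argument("version", metavar="VERSION")
    for name, spec in plugin_args.items():
        parser.add_argument(
            "--" + name.replace("_", "-"),
            default=spec.get("default"),
            choices=spec.get("choices"),
            help=spec["description"],
        )
    return parser
```

Because every option is derived from the definitions, adding an argument to a spec is enough for it to show up in `--help`.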

  30. What it’s doing
     The images module runs a sequence of steps against a remote machine
     ● Validation uses the Sahara SSH remote in read-only mode
     ● Clean image gen uses the SSH remote
     ● Image packing uses a libguestfs Python API image handle
     All three use the same logic, contained in the appropriate plugin
     Plugin implementation targeting O!
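The point that all three flows share one set of steps and differ only in the machine handle can be sketched as follows. `FakeMachine` is an in-memory stand-in invented for this example; it does not use the Sahara SSH remote or the libguestfs API, and the step and file path are hypothetical.

```python
class FakeMachine:
    """In-memory stand-in for an SSH remote or an image handle."""

    def __init__(self, files=None, read_only=False):
        self.files = dict(files or {})
        self.read_only = read_only

    def read(self, path):
        return self.files.get(path)

    def write(self, path, content):
        if self.read_only:
            raise PermissionError(f"read-only handle, cannot write {path}")
        self.files[path] = content


def ensure_file(machine, path, content):
    """Idempotent step: succeed if the file matches, else fix it if allowed."""
    if machine.read(path) == content:
        return True
    if machine.read_only:
        return False  # validation mode: report, do not modify
    machine.write(path, content)
    return True


def run_steps(machine, steps):
    """Run the same step sequence against whichever handle is given."""
    return all(step(machine) for step in steps)


# Hypothetical step list that a plugin might own.
STEPS = [lambda m: ensure_file(m, "/etc/hadoop/marker", "ok")]
```

Passing a read-only handle turns the step list into a validator, while a writable handle makes the same list a packer, so the logic lives in one place.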

  31. Agenda 1. Sahara overview 2. Health checks and management improvements 3. Kerberos integration for clusters 4. Image generation improvements 5. Bare metal clusters 6. What is NEW in NEWton 7. Q&A

  32. Ironic integration
     Why should you run Bare Metal in OpenStack:
     ● Big Data workloads originate from Bare Metal installations
     ● Quick cluster scalability may have lower priority than long-running stability and persistence
     ● Best performance by design, no virtualization overhead
     ● The ability to manage a bare metal cluster with the OpenStack API
