
Managing Containers with Helix, by Kanak Biscuitwala and Jason Zhang

Kanak Biscuitwala and Jason Zhang, Apache Helix Committers @ LinkedIn (helix.apache.org, @apachehelix)

Intersection of Job Types [diagram labels: Oracle DB, Backup]


  1. Helix Controller: Rebalancer

         ResourceAssignment computeResourceMapping(
             RebalancerConfig rebalancerConfig,
             ResourceAssignment prevAssignment,
             Cluster cluster,
             ResourceCurrentState currentState);

     Based on the current nodes in the cluster and constraints, find an assignment of tasks to nodes.

  2. Helix Controller: Rebalancer

         ResourceAssignment computeResourceMapping(
             RebalancerConfig rebalancerConfig,
             ResourceAssignment prevAssignment,
             Cluster cluster,
             ResourceCurrentState currentState);

     Based on the current nodes in the cluster and constraints, find an assignment of tasks to nodes. What else do we need?
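To make "find an assignment of tasks to nodes" concrete, here is a minimal sketch of one possible strategy: round-robin placement under a per-node capacity constraint. It is not the Helix `computeResourceMapping` implementation. `SimpleRebalancer`, `assign`, and `maxTasksPerNode` are hypothetical names, and plain strings stand in for the cluster types on the slide.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a rebalancer; not the Helix API.
class SimpleRebalancer {
    // Assigns each task to a live node round-robin, never exceeding maxTasksPerNode.
    // Tasks that cannot be placed are left out of the returned mapping.
    static Map<String, String> assign(List<String> tasks, List<String> liveNodes, int maxTasksPerNode) {
        Map<String, String> taskToNode = new LinkedHashMap<>();
        if (liveNodes.isEmpty() || maxTasksPerNode <= 0) {
            return taskToNode;
        }
        Map<String, Integer> load = new LinkedHashMap<>();
        for (String node : liveNodes) {
            load.put(node, 0);
        }
        int cursor = 0;
        for (String task : tasks) {
            // Try each node at most once, starting from the round-robin cursor.
            for (int tried = 0; tried < liveNodes.size(); tried++) {
                String node = liveNodes.get((cursor + tried) % liveNodes.size());
                if (load.get(node) < maxTasksPerNode) {
                    taskToNode.put(task, node);
                    load.put(node, load.get(node) + 1);
                    cursor = (cursor + tried + 1) % liveNodes.size();
                    break;
                }
            }
        }
        return taskToNode;
    }
}
```

A real rebalancer would also take the previous assignment and current state into account, as the signature on the slide suggests, so that tasks are not moved unnecessarily.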

  3. Helix Controller: What is Missing?
     - Dynamic Container Allocation
     - Automated Service Deployment
     - Container Isolation
     - Resource Utilization Monitoring

  4. Helix Controller: Target Provider [diagram labels: Fixed, CPU, Memory, Bin Packing]

     Based on some constraints, determine how many containers are required in this system. We're working on integrating with monitoring systems in order to query for usage information.

  5. Helix Controller: Target Provider [diagram labels: Fixed, CPU, Memory, Bin Packing]

         TargetProviderResponse evaluateExistingContainers(
             Cluster cluster,
             ResourceId resourceId,
             Collection<Participant> participants);

         class TargetProviderResponse {
           List<ContainerSpec> containersToAcquire;
           List<Participant> containersToRelease;
           List<Participant> containersToStop;
           List<Participant> containersToStart;
         }

     Based on some constraints, determine how many containers are required in this system. We're working on integrating with monitoring systems in order to query for usage information.
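As an illustration of the "Fixed" strategy, the sketch below compares the number of existing participants with a fixed target and reports which containers to acquire or release. It is only a sketch under simplifying assumptions: `FixedTargetProvider` and `SimpleTargetResponse` are hypothetical names, strings stand in for `ContainerSpec` and `Participant`, and the stop/start lists from the slide are omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified mirror of the TargetProviderResponse on the slide:
// strings stand in for ContainerSpec and Participant.
class SimpleTargetResponse {
    final List<String> containersToAcquire = new ArrayList<>();
    final List<String> containersToRelease = new ArrayList<>();
}

// A "fixed" target provider sketch: aim for exactly targetCount containers.
class FixedTargetProvider {
    private final int targetCount;

    FixedTargetProvider(int targetCount) {
        if (targetCount < 0) {
            throw new IllegalArgumentException("targetCount must be >= 0");
        }
        this.targetCount = targetCount;
    }

    SimpleTargetResponse evaluateExistingContainers(String resourceId, List<String> participants) {
        SimpleTargetResponse response = new SimpleTargetResponse();
        int existing = participants.size();
        // Too few: request new containers (named here by resource and index).
        for (int i = existing; i < targetCount; i++) {
            response.containersToAcquire.add(resourceId + "_container_" + i);
        }
        // Too many: release the surplus from the end of the list.
        for (int i = targetCount; i < existing; i++) {
            response.containersToRelease.add(participants.get(i));
        }
        return response;
    }
}
```

A CPU- or memory-based provider would replace the fixed count with a value derived from usage data, which is where the monitoring integration mentioned on the slide would come in.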

  6. Helix Controller: Adding a Target Provider [diagram labels: Target Provider, Constraints, Nodes, Rebalancer, Task Assignment]

  7. Helix Controller: Adding a Target Provider [diagram labels: Target Provider, Constraints, Nodes, Rebalancer, Task Assignment]

     How do we use the target provider response?

  8. Helix Controller: Container Provider [diagram labels: YARN, Mesos, Local]

     Given the container requirements, ensure that the required number of containers is running.

  9. Helix Controller: Container Provider [diagram labels: YARN, Mesos, Local]

         ListenableFuture<ContainerId> allocateContainer(ContainerSpec spec);

         ListenableFuture<Boolean> deallocateContainer(ContainerId containerId);

         ListenableFuture<Boolean> startContainer(ContainerId containerId, Participant participant);

         ListenableFuture<Boolean> stopContainer(ContainerId containerId);

     Given the container requirements, ensure that the required number of containers is running.
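The four operations above can be sketched as an in-memory "Local" provider that only tracks container state. This is an assumption-laden illustration, not the Helix implementation: `LocalContainerProvider`, its `State` enum, and `runningCount` are hypothetical, strings stand in for `ContainerSpec`, `ContainerId`, and `Participant`, and the JDK's `CompletableFuture` is swapped in for Guava's `ListenableFuture` so the block has no external dependencies.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory "local" container provider; it tracks container state only.
class LocalContainerProvider {
    enum State { ALLOCATED, RUNNING, STOPPED }

    private final Map<String, State> containers = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    CompletableFuture<String> allocateContainer(String spec) {
        String id = spec + "-" + nextId.getAndIncrement();
        containers.put(id, State.ALLOCATED);
        return CompletableFuture.completedFuture(id);
    }

    // The participant argument is unused in this sketch; a real provider would launch it.
    CompletableFuture<Boolean> startContainer(String containerId, String participant) {
        // Only an allocated or stopped container can be started.
        boolean ok = containers.replace(containerId, State.ALLOCATED, State.RUNNING)
                || containers.replace(containerId, State.STOPPED, State.RUNNING);
        return CompletableFuture.completedFuture(ok);
    }

    CompletableFuture<Boolean> stopContainer(String containerId) {
        boolean ok = containers.replace(containerId, State.RUNNING, State.STOPPED);
        return CompletableFuture.completedFuture(ok);
    }

    CompletableFuture<Boolean> deallocateContainer(String containerId) {
        // Refuse to deallocate a container that is still running.
        boolean ok = containers.remove(containerId, State.ALLOCATED)
                || containers.remove(containerId, State.STOPPED);
        return CompletableFuture.completedFuture(ok);
    }

    long runningCount() {
        return containers.values().stream().filter(s -> s == State.RUNNING).count();
    }
}
```

Returning futures keeps the interface the same for backends such as YARN or Mesos, where allocation is asynchronous; here every future is already complete.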

  10. Helix Controller: Adding a Container Provider [diagram labels: Target Provider, Constraints, Nodes, Container Provider, Rebalancer, Task Assignment]

      Target Provider + Container Provider = Provisioner

  11. Application Lifecycle With Helix and the Task Abstraction
      - Capacity Planning: Target Provider
      - Provisioning: Container Provider
      - Fault Tolerance: Existing Helix Controller (enhanced by Provisioner)
      - State Management: Existing Helix Controller (enhanced by Provisioner)

  12. System Architecture

  13. System Architecture Resource Provider

  14. System Architecture [diagram components: Client, Resource Provider; edge labels: submit job]

  15. System Architecture [diagram components: Client, Resource Provider, App Launcher, Controller Container (Provisioner, Rebalancer); edge labels: submit job]

  16. System Architecture [diagram components: Client, Resource Provider, App Launcher, Controller Container (Provisioner, Rebalancer); edge labels: submit job, container request]

  17. System Architecture [diagram components: Client, Resource Provider, App Launcher, Participant Launcher, Controller Container (Provisioner, Rebalancer), Participant Container (Helix Participant, App); edge labels: submit job, container request]

  18. System Architecture [diagram components: Client, Resource Provider, App Launcher, Participant Launcher, Controller Container (Provisioner, Rebalancer), Participant Container (Helix Participant, App); edge labels: submit job, container request, assign tasks]

  19. Helix + YARN: YARN Architecture [diagram components: Client, Resource Manager, Node Manager (x2), Application Master, Container, App Package, HDFS/Common Area; edge labels: submit job, node status, container request, assign work, status, grab package]

  20. Helix + YARN: Helix + YARN Architecture [diagram components: Client, Resource Manager, Node Manager (x2), Application Master (Helix Controller, Rebalancer), Container (Helix Participant, App), App Package, HDFS/Common Area; edge labels: submit job, node status, container request, assign tasks, status, grab package]

  21. Helix + Mesos: Mesos Architecture [diagram components: Mesos Master, Scheduler, Mesos Slave, Mesos Executor, Slave Machine (x2), Executor Package, HDFS/Common Area; edge labels: offer resources, offer response, node status, grab executor]

  22. Helix + Mesos: Helix + Mesos Architecture [diagram components: Mesos Master, Scheduler (Helix Controller), Mesos Slave, Mesos Executor (Helix Participant/App), Slave Machine (x2), Helix Executor Package, HDFS/Common Area; edge labels: offer resources, offer response, node status, assign tasks, grab executor]

  23. Example

  24. Distributed Document Store: Overview [diagram labels: Master/Slave legend; Partition 0, Partition 1, Partition 2 (three replicas each); Oracle (x3); P0 Backup, P1 Backup, P2 Backup; ETL (x3); HDFS]

  25. Distributed Document Store: Overview [diagram labels: Master/Slave legend; Partition 0, Partition 1, Partition 2 (three replicas each); Oracle (x2); P0 Backup, P1 Backup, P2 Backup; ETL (x2); HDFS]

  26. Distributed Document Store: YARN Example [diagram components: Client, Resource Manager, Node Manager (x2), Application Master (Helix Controller, Rebalancer), Container (Helix Participant, Oracle, Partition 0, Partition 1, P1 Backup, ETL); edge labels: submit job, node status, container request, assign work, status]

  27. Distributed Document Store: YAML Specification

          appConfig: { config: { k1: v1 } }
          appPackageUri: 'file://path/to/myApp-pkg.tar'
          appName: myApp
          services: [DB, ETL] # the task containers
          serviceConfigMap: {DB: { num_containers: 3, memory: 1024 }, ... ETL: { time_to_complete: 5h, ... }, ...}
          servicePackageURIMap: { DB: 'file://path/to/db-service-pkg.tar', ... }
          ...

  28. Distributed Document Store: YAML Specification [annotation: TargetProvider specification]

          appConfig: { config: { k1: v1 } }
          appPackageUri: 'file://path/to/myApp-pkg.tar'
          appName: myApp
          services: [DB, ETL] # the task containers
          serviceConfigMap: {DB: { num_containers: 3, memory: 1024 }, ... ETL: { time_to_complete: 5h, ... }, ...}
          servicePackageURIMap: { DB: 'file://path/to/db-service-pkg.tar', ... }
          ...

  29. Distributed Document Store: Service/Container Implementation

          public class MyQueuerService extends StatelessParticipantService {
            @Override
            public void init() { ... }

            @Override
            public void onOnline() { ... }

            @Override
            public void onOffline() { ... }
          }

  30. Distributed Document Store: Task Implementation

          public class BackupTask extends Task {
            @Override
            public ListenableFuture<Status> start() { ... }

            @Override
            public ListenableFuture<Status> cancel() { ... }

            @Override
            public ListenableFuture<Status> pause() { ... }

            @Override
            public ListenableFuture<Status> resume() { ... }
          }

  31. Distributed Document Store: State Model-Style Callbacks

          public class StoreStateModel extends StateModel {
            public void onBecomeMasterFromSlave() { ... }

            public void onBecomeSlaveFromMaster() { ... }

            public void onBecomeSlaveFromOffline() { ... }

            public void onBecomeOfflineFromSlave() { ... }
          }
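The four callbacks on the state model slide imply a small state machine in which a replica moves between OFFLINE, SLAVE, and MASTER one step at a time. The sketch below encodes just those transitions and derives the callback name for each. `MasterSlaveTransitions`, `isLegal`, and `callbackName` are hypothetical helpers for illustration and are not part of Helix.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the transitions implied by the callbacks on the slide;
// not Helix's own state model definition classes.
class MasterSlaveTransitions {
    enum State { OFFLINE, SLAVE, MASTER }

    private static final Map<State, Set<State>> LEGAL = new EnumMap<>(State.class);
    static {
        LEGAL.put(State.OFFLINE, EnumSet.of(State.SLAVE));               // onBecomeSlaveFromOffline
        LEGAL.put(State.SLAVE, EnumSet.of(State.MASTER, State.OFFLINE)); // onBecomeMasterFromSlave, onBecomeOfflineFromSlave
        LEGAL.put(State.MASTER, EnumSet.of(State.SLAVE));                // onBecomeSlaveFromMaster
    }

    static boolean isLegal(State from, State to) {
        return LEGAL.get(from).contains(to);
    }

    // Name of the callback that would handle a legal transition.
    static String callbackName(State from, State to) {
        if (!isLegal(from, to)) {
            throw new IllegalArgumentException("No direct transition " + from + " -> " + to);
        }
        return "onBecome" + capitalize(to) + "From" + capitalize(from);
    }

    private static String capitalize(State s) {
        String n = s.name();
        return n.charAt(0) + n.substring(1).toLowerCase();
    }
}
```

Under this model an OFFLINE replica cannot become MASTER directly; it has to pass through SLAVE first, which is why the slide shows no `onBecomeMasterFromOffline` callback.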

  32. Distributed Document Store: Spectator (for Discovery)

          class RoutingLogic {
            public void write(Request request) {
              partition = getPartition(request.key);
              List<Participant> nodes =
                  routingTableProvider.getInstance(
                      partition, "MASTER");
              nodes.get(0).write(request);
            }

            public void read(Request request) {
              partition = getPartition(request.key);
              List<Participant> nodes =
                  routingTableProvider.getInstance(partition);
              random(nodes).read(request);
            }
          }
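The lookups used by the routing logic above can be illustrated with a small in-memory table keyed by partition and state. This is only a sketch: `SimpleRoutingTable`, `update`, and the hash-based `getPartition` are hypothetical stand-ins, strings stand in for `Participant`, and a real spectator would be kept up to date by cluster state changes rather than by manual calls.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory routing table, populated by hand for illustration.
class SimpleRoutingTable {
    // partition -> state -> instances currently in that state
    private final Map<Integer, Map<String, List<String>>> table = new HashMap<>();

    void update(int partition, String state, String instance) {
        table.computeIfAbsent(partition, p -> new HashMap<>())
             .computeIfAbsent(state, s -> new ArrayList<>())
             .add(instance);
    }

    // Instances serving the partition in the given state (for example "MASTER" for writes).
    List<String> getInstances(int partition, String state) {
        return new ArrayList<>(
                table.getOrDefault(partition, Map.of()).getOrDefault(state, List.of()));
    }

    // All instances serving the partition in any state (candidates for reads).
    List<String> getInstances(int partition) {
        List<String> all = new ArrayList<>();
        for (List<String> instances : table.getOrDefault(partition, Map.of()).values()) {
            all.addAll(instances);
        }
        return all;
    }

    // Maps a key to a partition; floorMod keeps the result non-negative.
    static int getPartition(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

A write would go to the single instance returned for "MASTER", while a read could pick any instance from the unfiltered list, mirroring the `write` and `read` methods on the slide.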

  33. Helix at LinkedIn

  34. Helix at LinkedIn: In Production [diagram labels: User Writes, Oracle DB, Backup/Restore, ETL, HDFS, Analytics, Data Replicator, Change Capture, Change Consumers, Index, Search Index]
