Unicorn: Unified Resource Orchestration for Multi-Domain, Geo-Distributed Data Analytics
Qiao Xiang (1, 2), Jace Liu (1), Harvey Newman (3), Tony Wang (1, 2), Y. Richard Yang (1, 2), Jensen Zhang (1)
(1) Tongji University, (2) Yale University, (3) California Institute of Technology
INDIS Workshop, Denver, CO, November 2017
Background
• Data-intensive applications rely on clusters of heterogeneous servers as the major computing platform.
• Missing: a unified framework to manage a large set of distributively-owned, heterogeneous resources for multi-domain data analytics.
• Members: a worldwide multi-organizational collaboration among Caltech, Tongji University, Tsinghua University, Yale University, the OpenDaylight ALTO team and the Kytos team.
Example Design Setting: Large Hadron Collider (LHC)
Figure source: cern.ch
The Compact Muon Solenoid (CMS) Computing Model
[Figure: CMS tiered computing model, with the labels 200 Hz-400 Hz, CERN Analysis Facility, RAW: ~1.7-1.1 MB/evt, Calibration, Tier-0, Tier-1 and Tier-2.]
• Large raw datasets from the LHC at the Tier-0 site.
  – RAW data: tens of PB per year.
  – RECO and AOD: multiple times the size of the RAW data, depending on analysis requirements.
• RECO and AOD datasets are distributed to Tier-1 sites.
• RECO, AOD and simulation datasets are transferred among Tier-1~3 sites for analysis.
Resource Orchestration in CMS: Challenges
• It is a multi-domain science network.
• Different domains (resource providers) provide heterogeneous resources.
• Different resource providers use different controllers, e.g., OpenDaylight and Kytos, to manage their resources, especially networking resources.
• Different jobs run in the network.
  – PB dataset transfers
  – Various HEP analytics
Figure source: cern.ch
Multi-Domain Resource Orchestration: Design Requirements
• Multi-controller coordination.
  – Resource providers, which use different network controllers (e.g., OpenDaylight, Kytos, ONOS, etc.), can communicate and coordinate the orchestration process through a unified interface.
• Consistent operation paradigm.
  – Efficient resource utilization without resource overloading.
  – Fast convergence.
• Autonomy and privacy of resource providers.
  – Resource providers can make and practice their own resource supply strategies with control of privacy.
Unicorn: a multi-domain, multi-controller (MDMC) resource orchestration system
Unicorn: An MDMC Resource Orchestration System
[Diagram: jobs enter the Global Resource Orchestrator, which exchanges reservation requests and reservation results with a Resource Reservation Server in each domain. The servers run on ODL and Kytos (multi-controller coordination).]
How does the orchestrator know how many resources to request for each job?
• Users send jobs to a logically centralized orchestrator.
• The orchestrator sends resource reservation requests to different domains.
• The reservation servers, running on top of different controllers, process the requests and return the result (success/fail).
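The request/result exchange in the bullets above can be sketched as follows. This is only an illustration, not Unicorn's actual interface: the `ReservationServer` class, its `reserve` and `release` methods, the in-memory capacity model and the roll-back-on-failure policy are all assumptions made for the example, and no real OpenDaylight or Kytos API is called.

```python
from abc import ABC, abstractmethod


class ReservationServer(ABC):
    """Hypothetical unified interface exposed by every domain."""

    @abstractmethod
    def reserve(self, job_id: str, mbps: int) -> bool:
        """Return True on success, False on failure."""

    @abstractmethod
    def release(self, job_id: str) -> None:
        """Give back whatever was reserved for job_id."""


class InMemoryServer(ReservationServer):
    """Stand-in for a server on top of a controller (assumption: one capacity pool)."""

    def __init__(self, controller: str, capacity_mbps: int):
        self.controller = controller
        self.free = capacity_mbps
        self.held = {}

    def reserve(self, job_id, mbps):
        if mbps > self.free:
            return False
        self.free -= mbps
        self.held[job_id] = mbps
        return True

    def release(self, job_id):
        self.free += self.held.pop(job_id, 0)


def reserve_everywhere(servers, job_id, mbps):
    """Orchestrator side: ask every domain, undo earlier grants if one fails."""
    granted = []
    for server in servers:
        if server.reserve(job_id, mbps):
            granted.append(server)
        else:
            for done in granted:
                done.release(job_id)
            return False
    return True


servers = [InMemoryServer("OpenDaylight", 100), InMemoryServer("Kytos", 50)]
first = reserve_everywhere(servers, "job1", 40)   # fits in both domains
second = reserve_everywhere(servers, "job2", 40)  # the 50 Mbps domain has 10 left
print(first, second, [s.free for s in servers])
```

The second request fails in one domain, so the sketch releases what the other domain had already granted. Blind requests of this kind are what the question on this slide is about: without resource information, the orchestrator can only find out by trying.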
Unicorn: An MDMC Resource Orchestration System
Consistent Operation Paradigm
[Diagram: jobs enter the Global Resource Orchestrator, which exchanges resource discovery queries/responses with a Resource Information Server and reservation requests/results with a Resource Reservation Server in each domain (OpenDaylight, Kytos).]
• Add resource information servers that provide the resource information of each domain.
• The orchestrator uses the queried resource information to compute the optimal resource reservation requests and sends them to the reservation servers.
Unicorn: An MDMC Resource Orchestration System
[Diagram: jobs enter the Global Resource Orchestrator, which exchanges resource discovery queries/responses with a Resource Information Server and reservation requests/results with a Resource Reservation Server in each domain (OpenDaylight, Kytos).]
How do resource information servers provide accurate resource information yet still ensure the autonomy and privacy of providers?
• Consistent operation paradigm: global orchestrator
• Multi-controller coordination: servers with unified interfaces
• Provider autonomy and privacy: ?
Resource Information Server: Related Work
• All-detail resource graph
  – Examples: HTCondor, Mesos, YARN, etc.
  – Nodes: computing/storage resources
  – Links: networking resources
  – Limitations:
    • Reveals all details of resources, compromising the privacy of clusters in the multi-domain setting.
    • Heterogeneity and dynamics of resource supply lead to high overhead.
• Alternative design: one-big-switch abstraction
  – Example: P4P/ALTO.
  – Limitation: cannot reveal the shared bottleneck resources between analytics tasks, leading to resource overloading and slow convergence.
Resource Information Server: Solution
What is the right abstraction for multi-domain science networks?
• All-detail resource graph: extremely detailed; compromised privacy; high overhead.
• One-big-switch abstraction: extremely abstract; cannot reveal shared bottleneck resources.
• Basic idea: instead of using the more limited graph model to represent resource availability, use mathematical programming constraints, such as linear constraints, which are a more general, abstract representation.
• We refer to this feasible region representation as resource state abstraction (ReSA).
Resource State Abstraction (ReSA): Example
[Figure: example topology with sources s1 and s2, destinations d1 and d2, switches sw1 to sw8, and links l1 to l12. Each link: 100 Mbps.]
• For each link, use a linear constraint to represent the bandwidth sharing among flows that use this link:
  $r_1 \le b_i, \quad \forall i \in \{1, 2, 5, 6\}$
  $r_2 \le b_i, \quad \forall i \in \{7, 8, 11, 12\}$
  $r_1 + r_2 \le b_i, \quad \forall i \in \{3, 4\}$
  where $r_1$ and $r_2$ are the flow rates and $b_i$ is the bandwidth of link $l_i$.
[Figure: feasible region in the ($r_1$, $r_2$) plane, with both axes marked at 100.]
• Geometrically, ReSA is the feasible region of flow rates defined by these linear constraints.
• However, some constraints are redundant, i.e., removing them does not change the feasible region of flow rates.
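The per-link constraints of this example can be generated mechanically from the links each flow traverses. The sketch below assumes a simple encoding chosen for illustration: the two flow rates are named `r1` and `r2`, and `ROUTES` lists the link indices each flow uses (the membership follows the constraints on the slide; the order along each path is assumed).

```python
from collections import defaultdict

# Hypothetical encoding of the example: link indices used by each flow.
ROUTES = {
    "r1": [1, 2, 3, 4, 5, 6],
    "r2": [7, 8, 3, 4, 11, 12],
}
CAPACITY_MBPS = 100  # every link in the example has 100 Mbps


def link_constraints(routes, capacity):
    """Return {link: (flows sharing the link, capacity)}, one inequality per link."""
    flows_on_link = defaultdict(set)
    for flow, links in routes.items():
        for link in links:
            flows_on_link[link].add(flow)
    return {
        link: (sorted(flows), capacity)
        for link, flows in sorted(flows_on_link.items())
    }


def render(flows, capacity):
    """Format one constraint as text, e.g. 'r1 + r2 <= 100'."""
    return " + ".join(flows) + f" <= {capacity}"


constraints = link_constraints(ROUTES, CAPACITY_MBPS)
for link, (flows, cap) in constraints.items():
    print(f"l{link}: {render(flows, cap)}")
```

Ten inequalities come out, one per link in use, and only links 3 and 4 couple the two flows. Most of the ten repeat each other, which is the redundancy the last bullet refers to.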
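Minimal, Equivalent ReSA: Example
[Figure: the same example topology (sources s1 and s2, destinations d1 and d2, switches sw1 to sw8, links l1 to l12, each link 100 Mbps) and the feasible region in the ($r_1$, $r_2$) plane, with both axes marked at 100.]
Original constraints:
  $r_1 \le b_i, \quad \forall i \in \{1, 2, 5, 6\}$
  $r_2 \le b_i, \quad \forall i \in \{7, 8, 11, 12\}$
  $r_1 + r_2 \le b_i, \quad \forall i \in \{3, 4\}$
Minimal, equivalent ReSA:
  $r_1 + r_2 \le 100\ \text{Mbps}$
• Minimal, equivalent ReSA reveals shared bottleneck resources.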
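A brute-force check illustrates what "equivalent" means here: the single shared-bottleneck inequality admits exactly the same rate pairs as the full constraint set. The sketch assumes non-negative rates and samples a 5 Mbps grid. The constraint tuples and the `PER_FLOW_ONLY` view (each flow limited to 100 Mbps, no shared constraint) are illustrative encodings, not taken from the Unicorn code.

```python
from itertools import product

# (coefficient of r1, coefficient of r2, bound in Mbps)
FULL = [(1, 0, 100)] * 4 + [(0, 1, 100)] * 4 + [(1, 1, 100)] * 2
MINIMAL = [(1, 1, 100)]
# A view that hides the shared links: each flow looks limited only by itself.
PER_FLOW_ONLY = [(1, 0, 100), (0, 1, 100)]


def feasible(constraints, r1, r2):
    """True if the non-negative rate pair satisfies every inequality."""
    return r1 >= 0 and r2 >= 0 and all(
        a * r1 + b * r2 <= bound for a, b, bound in constraints
    )


grid = range(0, 151, 5)
mismatches = [
    (r1, r2)
    for r1, r2 in product(grid, grid)
    if feasible(FULL, r1, r2) != feasible(MINIMAL, r1, r2)
]
print("points where FULL and MINIMAL disagree:", mismatches)
print("(100, 100) allowed without the shared constraint:",
      feasible(PER_FLOW_ONLY, 100, 100))
print("(100, 100) allowed by the full ReSA:", feasible(FULL, 100, 100))
```

The last two lines show why the shared bottleneck has to be revealed. A view without it would let both flows request 100 Mbps at once and overload links 3 and 4.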
ReSA for Multi-Domain Resource Discovery
• Accurate, efficient discovery process.
  – Two-phase discovery decomposition.
  – Path query: for each job, find all the domains it passes through.
  – Resource query: ReSA query for all jobs entering the same domain.
• Minimal information exposure of multiple resource providers.
  – Secure multi-party computation.
• Dynamic update of resource availability.
  – Server-side event.
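The two-phase decomposition can be sketched as follows. The job names, the domain names and the `path_query` lookup table are hypothetical stand-ins for the real path queries, and the sketch stops at the point where one ReSA query per domain would be sent.

```python
from collections import defaultdict

# Hypothetical path-query answers: job -> domains the job passes through.
PATHS = {
    "job1": ["A", "B"],
    "job2": ["A", "C"],
    "job3": ["C"],
}


def path_query(job):
    """Phase 1 (stand-in): return the domains the job passes through."""
    return PATHS[job]


def plan_resource_queries(jobs):
    """Phase 2 planning: group jobs by domain, one ReSA query per domain."""
    jobs_in_domain = defaultdict(list)
    for job in jobs:
        for domain in path_query(job):
            jobs_in_domain[domain].append(job)
    return dict(jobs_in_domain)


plan = plan_resource_queries(["job1", "job2", "job3"])
for domain, jobs in plan.items():
    print(f"ReSA query to domain {domain} for jobs {jobs}")
```

Grouping by domain means each ReSA server receives a single query covering all jobs that enter it, so it can express the sharing among those jobs in one set of inequalities.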
Minimal Information Exposure of Resource Providers
[Diagram: four ReSA servers (A, B, C and D) below the Global Resource Orchestrator. The servers' private constraint sets are $\{f_1 + f_2 \le 100\ \text{Gbps}\}$, $\{f_3 \le 200\ \text{Gbps}\}$, $\{f_2 + f_3 \le 100\ \text{Gbps}\}$ and $\{f_1 + f_2 + f_3 \le 100\ \text{Gbps}\}$. The orchestrator receives $\{f_1 + f_2 + f_3 \le 100\ \text{Gbps}\}$ from one server and $\emptyset$ from the other three.]
• Basic idea: a secure multi-party computational geometry protocol decides the redundancy of each linear inequality using vertex enumeration and a halfspace test.
• ReSA servers from different domains do not reveal their own sets of linear inequalities to others during the protocol.
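The redundancy test itself, vertex enumeration followed by a halfspace test, can be sketched in plain Python on the four inequalities of this slide. This is not the secure protocol: the sketch pools all inequalities in one place, which is exactly what the multi-party protocol avoids. The function names, the non-negativity assumption on flow rates and the large bounding box (`box`, used only to keep every region bounded) are assumptions of the example.

```python
from fractions import Fraction
from itertools import combinations


def solve(rows, rhs):
    """Solve a square linear system exactly; return None if it is singular."""
    n = len(rows)
    m = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        lead = m[col][col]
        m[col] = [v / lead for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def satisfies(point, constraint):
    """Halfspace test: does the point satisfy coeffs . x <= bound?"""
    coeffs, bound = constraint
    return sum(a * x for a, x in zip(coeffs, point)) <= bound


def vertices(constraints, dim):
    """Intersect every choice of `dim` boundaries and keep the feasible points."""
    found = set()
    for chosen in combinations(constraints, dim):
        point = solve([c for c, _ in chosen], [b for _, b in chosen])
        if point is not None and all(satisfies(point, c) for c in constraints):
            found.add(tuple(point))
    return found


def minimal_set(constraints, dim, box=10**6):
    """Drop each inequality that holds at every vertex of the remaining region."""
    base = []
    for i in range(dim):
        unit = [1 if j == i else 0 for j in range(dim)]
        base.append(([-u for u in unit], 0))  # x_i >= 0
        base.append((unit, box))              # x_i <= box (bounding assumption)
    kept = list(constraints)
    i = 0
    while i < len(kept):
        others = kept[:i] + kept[i + 1:]
        if all(satisfies(v, kept[i]) for v in vertices(others + base, dim)):
            kept = others  # redundant: removing it leaves the region unchanged
        else:
            i += 1
    return kept


# The four inequalities from the slide, over (f1, f2, f3), bounds in Gbps.
ALL_DOMAINS = [
    ([1, 1, 0], 100),
    ([0, 0, 1], 200),
    ([0, 1, 1], 100),
    ([1, 1, 1], 100),
]
result = minimal_set(ALL_DOMAINS, dim=3)
print(result)
```

Only the inequality on f1 + f2 + f3 survives, matching the single non-empty response the orchestrator receives in the diagram. Brute-force enumeration grows combinatorially with the number of inequalities, so the sketch is only suitable for small examples like this one.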
Putting Pieces Together
[Diagram: jobs enter the Global Resource Orchestrator. Step 1: resource discovery queries/responses with the ReSA server in each domain. Step 2: reservation requests/results with the Resource Reservation Server in each domain (OpenDaylight, Kytos).]
• Multi-domain orchestration: global orchestrator
• Multi-controller coordination: servers with unified interfaces
• Provider autonomy and privacy: resource state abstraction
Unicorn Implementation
• Orchestrator: ~2700 LoC Python code
• ReSA server: ~2500 LoC Java code
• Resource reservation server:
  – Fast Data Transfer (FDT), FireQoS, Open vSwitch, etc.
• Network controllers: OpenDaylight, Kytos
  – ONOS and Ryu are under development
Evaluation
[Figure: four plots comparing the intra-domain resource view with the cross-domain resource view. Two plots show the number of linear inequalities and the compression ratio for the Arpanet, Aarnet and Chinanet topologies. Two plots show the number of linear inequalities and the compression ratio for 5, 10, 20 and 30 jobs.]
Please refer to our paper for more details.