M. Carmen Ruiz, Diego Pérez and Damas Gruska. CPA 2017: The 39th Communicating Process Architectures, Malta, 20-23 August
Outline 1. Motivation 2. Our work 3. Formal Modelling of Map/Reduce 4. Performance Evaluation 5. Validation 6. Performance-Cost tradeoff 7. Conclusions
Motivation
Motivation • Such data give social scientists the opportunity to conduct a wide variety of research analyses and have proven to be of great interest.
Motivation • However, performing a longitudinal analysis of this huge amount of data becomes a Big-Data problem, since the data is produced continuously all around the world.
Motivation • This fact hampers data harvesting, storage and analysis by using traditional tools or processing infrastructures. • New processing paradigms and computational environments have arisen. • One of the main contributions to this matter has been the Map/Reduce paradigm and its open-source implementation (Hadoop).
Motivation • Simultaneously, the growing trend of the Cloud computing paradigm, due to its benefits in terms of storage, computing power and flexibility, offers a way to handle these massive amounts of data at reasonable cost. • Moreover, Cloud computing provides several features that become of interest in conjunction with Hadoop, such as high availability and distributed environment provisioning.
Motivation • Hadoop requires a distributed environment (a cluster or virtual cluster) in order to perform any Hadoop application execution. • The number of resources dedicated to this task (number of virtual machines dedicated to a Hadoop virtual cluster) determines the application performance.
Motivation • The Cloud pay-per-use model must be taken into account in order to minimise the cost, since the number of virtual machines hired for a certain study is related to operational expenses.
Our work We present a formalization of the Map/Reduce paradigm which is used to evaluate performance parameters and make a trade-off analysis on the number of workers versus processing time and resource cost. • Timed Process Algebra BTC • BAL Tool
Map/Reduce Paradigm The basis of Map/Reduce consists in splitting the input data into data chunks that are distributed to the worker nodes where they are processed. Later on, the results are combined and collected.
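The split, process and combine steps described above can be illustrated with a minimal single-machine sketch. The word-count task and the function names `split_into_chunks`, `map_chunk` and `reduce_counts` are hypothetical choices for illustration; they are not Hadoop's API, and a real deployment would run the map step on separate worker nodes.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def map_chunk(chunk):
    """Map step: count the words of one chunk (the work of one worker)."""
    return Counter(word for line in chunk for word in line.split())


def reduce_counts(partials):
    """Reduce step: combine the per-chunk counts into a single result."""
    return reduce(lambda a, b: a + b, partials, Counter())


lines = ["a b a", "b c", "a c c", "b"]
chunks = split_into_chunks(lines, 2)
# In a real cluster each chunk would be processed on a different worker.
partials = [map_chunk(chunk) for chunk in chunks]
total = reduce_counts(partials)
print(dict(total))
```

Here the list comprehension stands in for the distribution of chunks to workers; only the final `reduce_counts` call needs to see all partial results.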
BTC (Bounded True Concurrency) • Timed algebra • It takes into account that the available resources in a system must be shared by all the processes. This leads to two types of delays: • Delays related to the synchronization of processes • Delays related to the allocation of resources • True Concurrency → (a|b) ≠ (a.b + b.a) • Homogeneous / Heterogeneous Resources • Preemptable / Non-preemptable Resources
BTC (Bounded True Concurrency) Syntax
P ::= stop | a.P | <b, α>.P | P ⊕ P | P ||_A P | recX.P
Types of actions: • Timed actions (ActT) • Untimed actions (ActU) • Special actions (ActS)
N = {N1, N2, ..., Nm}: number of resources of each type
Z = {Z1, Z2, ..., Zm}, Zi = {b1, b2, ..., bi}: actions which need resources of type i
[[P]] Z, N
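The timing intuition behind (a|b) ≠ (a.b + b.a) and behind resource-allocation delays can be sketched as follows. This is an illustration only, not BTC's operational semantics: it assumes independent actions that each need one non-preemptable resource of a single type, and the greedy assignment in `parallel_time` is a simplification introduced here.

```python
def parallel_time(durations, n_resources):
    """Completion time of independent actions run truly in parallel.

    Assumption: every action needs exactly one non-preemptable resource,
    and each action is given to the resource that becomes free first.
    """
    if n_resources < 1:
        raise ValueError("at least one resource is required")
    free_at = [0] * n_resources
    for d in durations:
        i = free_at.index(min(free_at))  # earliest available resource
        free_at[i] += d                  # the action occupies it for d units
    return max(free_at)


def interleaved_time(durations):
    """Completion time when actions are executed one after another."""
    return sum(durations)


a, b = 3, 5
print(parallel_time([a, b], n_resources=2))  # enough resources for both
print(parallel_time([a, b], n_resources=1))  # b must wait for the resource
print(interleaved_time([a, b]))
```

With two resources the parallel composition finishes when the longer action does; with a single resource the second action is delayed until the resource is released, which is the kind of allocation delay the bounded setting has to account for.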
BAL TOOL
BAL TOOL [Figure: BAL tool architecture. Components shown: Specification Wizard, File System, BTC Specification, Syntax Analyser (BTC Syntax, Syntax Error), Graph Generator (BTC Operational Semantics), Performance Evaluator (branch-and-bound techniques, DBL scheme, parallel computing, grid computing), Results.]
Formal Modelling of Map/Reduce
BTC Specification
[[sys_Map_Red]] Z, N ≡ [[BLOCK || BLOCK || ... || BLOCK || OVERLAP || SYN_CLEANUP || SYN_SETUP]] {act_worker}, {n}
BLOCK ≡ SETUP . MAP . REDUCE . CLEAN
SETUP ≡ <setup, tS> . synR . synRR
MAP ≡ <act_worker> . synS . <recordReader, tRr> . <map, tM> . <act_worker> . synSS
REDUCE ≡ <act_worker> . <shuffle, tSh> . <sort, tSrt> . <reduce, tR> . <output, tOpt> . <act_worker>
CLEAN ≡ synC . synCC . <clean, tC>
OVERLAP ≡ synS . ... . synS . synSS . synSS . ... . synSS
SYN_CLEANUP ≡ synC . ... . synC . synCC . synCC . ... . synCC
SYN_SETUP ≡ synR . ... . synR . synRR . synRR . ... . synRR
PRECONDITIONS OF EACH TRANSITION
Formal Modelling of Map/Reduce
| Main Task | Sub-Task | Parameter |
| Setup | Setup | tS |
| Map | Record Reader | tRr |
| Map | Map | tM |
| Reduce | Shuffle | tSh |
| Reduce | Sort | tSrt |
| Reduce | Reduce | tR |
| Reduce | Output | tOpt |
| Clean Up | Clean Up | tC |
Performance Evaluation • The focus lies on studying the Hadoop framework to obtain the best performance with the minimum number of resources or at minimum cost. • A temporal analysis of Hadoop's behaviour has been performed. • The results show the performance in terms of the number of resources needed. • The results of this analysis allow: • users to know the configuration that best suits their requirements. • Cloud providers to establish their service catalogue.
Performance Evaluation • The number of resources needed depends mainly on the type of application and the volume of data to be processed. • In order to be able to carry out the performance evaluation, we chose a concrete application: H.265 encoding Hadoop application
H.265 encoding • The most widely used encoding standard nowadays is H.264. However, its successor, known as H.265 (or HEVC), has been shown to improve on H.264 and represents the future of video encoding. • This application runs the HEVC encoder within a Hadoop application, exploiting the distributed processing of video chunks across multiple computing resources. • In order to evaluate the performance of this application, the encoded video sequence used has been "BasketballDrill" (832x480).
H.265 encoding • Time that each block needs to execute the phases and sub-phases that make up the Map/Reduce model.
| Main Task | Sub-Task | Parameter | Time (ms) |
| Setup | Setup | tS | 125 |
| Map | Record Reader | tRr | 506 |
| Map | Map | tM | 64044 |
| Reduce | Shuffle | tSh | 16187 |
| Reduce | Sort | tSrt | 500 |
| Reduce | Reduce | tR | 75 |
| Reduce | Output | tOpt | 65 |
| Clean Up | Clean Up | tC | 125 |
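As a quick sanity check on these parameters, the sketch below adds up the per-phase times of one block and groups them by main task. The dictionary layout and variable names are illustrative; the sum assumes the phases of a single block run strictly one after another with no waiting for workers or synchronization, which the full BTC model does not assume.

```python
# Per-block phase times in milliseconds, taken from the slide above.
PHASE_TIMES_MS = {
    "Setup": {"tS": 125},
    "Map": {"tRr": 506, "tM": 64044},
    "Reduce": {"tSh": 16187, "tSrt": 500, "tR": 75, "tOpt": 65},
    "Clean Up": {"tC": 125},
}

# Time spent in each main task for one block.
task_totals_ms = {task: sum(p.values()) for task, p in PHASE_TIMES_MS.items()}

# Sequential time of one block, ignoring any waiting (an assumption).
block_total_ms = sum(task_totals_ms.values())

# Phase that dominates the block, and its share of the total.
all_phases = {k: v for p in PHASE_TIMES_MS.values() for k, v in p.items()}
dominant = max(all_phases, key=all_phases.get)
dominant_share = all_phases[dominant] / block_total_ms

print(task_totals_ms)
print(block_total_ms, "ms per block")
print(dominant, f"{dominant_share:.1%}")
```

The map sub-task (tM) accounts for most of each block's time, so the number of blocks that can be mapped simultaneously is what the worker count mainly influences.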
Performance Evaluation Inputs: • BTC models of the Map/Reduce behaviour • The application to be studied • The volume of data to be processed. Application (including this information in the model): • Replace the input parameters in the Map/Reduce model. • Provide the number of data blocks. • Establish the number of workers (variable n).
Performance Evaluation BAL tool • Checks the syntax of the specification • Builds the transition graph • Carries out the performance analysis The result is the time that the application takes to process this number of chunks.
Performance Evaluation
| Workers | Execution Time | Improvement |
| 2 | 10m 51s | |
| 3 | 07m 20s | 32.41% |
| 4 | 05m 25s | 26.14% |
| 5 | 04m 35s | 15.38% |
| 6 | 03m 47s | 17.45% |
| 7 | 03m 30s | 7.49% |
| 8 | 02m 43s | 22.38% |
| 9 | 02m 28s | 9.20% |
| 10 | 02m 26s | 1.35% |
| 11 | 02m 26s | 0% |
| 12 | 02m 26s | 0% |
| 13 | 02m 26s | 0% |
| 14 | 02m 26s | 0% |
| 15 | 02m 26s | 0% |
| 16 | 01m 21s | 44.52% |
Data has been obtained for different configurations (1 master VM + # workers VMs)
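The Improvement figures appear to be the relative reduction in execution time with respect to the configuration with one worker fewer. That reading is inferred from the numbers rather than stated on the slide; the sketch below recomputes the column under this assumption (the names `TIMES`, `seconds` and `improvement_pct` are illustrative).

```python
# Execution times from the table above as (minutes, seconds) per worker count.
TIMES = {
    2: (10, 51), 3: (7, 20), 4: (5, 25), 5: (4, 35), 6: (3, 47),
    7: (3, 30), 8: (2, 43), 9: (2, 28), 10: (2, 26), 11: (2, 26),
    12: (2, 26), 13: (2, 26), 14: (2, 26), 15: (2, 26), 16: (1, 21),
}


def seconds(mm_ss):
    """Convert a (minutes, seconds) pair to seconds."""
    minutes, secs = mm_ss
    return 60 * minutes + secs


def improvement_pct(prev_s, cur_s):
    """Relative time reduction versus the previous configuration, in percent."""
    return round(100.0 * (prev_s - cur_s) / prev_s, 2)


# Improvement of n workers over n - 1 workers (assumed definition).
improvements = {
    n: improvement_pct(seconds(TIMES[n - 1]), seconds(TIMES[n]))
    for n in sorted(TIMES)
    if n - 1 in TIMES
}

for n, pct in improvements.items():
    print(f"{n:2d} workers: {pct:5.2f}%")
```

Under this definition the plateau from 10 to 15 workers shows up as 0% rows, and the next gain only appears at 16 workers.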
Validation Cloud infrastructure deployed at the UCLM • 2 Xeon E5462 CPUs (4 cores) • 32 GB of main memory • 60 GB of storage • NFS • Gigabit Ethernet network • OS: CentOS 6.2 Linux • OpenNebula • Virtualization software: KVM • Headnode: + 1 TB of storage shared between compute nodes [Figure: comparison of the formal model with the real observation]
SentiStrength tool for Hadoop • The SentiStrength tool conducts longitudinal analysis of social media data. • This application is used by the COSMOS project, whose objective is to translate the underlying social observation and analysis mechanisms into an embedded research tool that supports the development and execution of social media research analysis. • For this study, the application has performed the sentiment analysis of: • ≈ 100 million tweets (≈ 15 GB of plain text), which have been split into 300 blocks of equal size.
Performance-Cost Tradeoff • Since the virtual infrastructure used to validate the formal model follows the specifications of Amazon EC2 M1.small instances, the hiring costs stated by Amazon have been considered. • We use the "Simple Monthly Calculator" tool, which provides an automated hiring cost in terms of the number of resources, time required and location. • Amazon EC2 evaluates the cost in hourly time slots.
Performance-Cost Tradeoff: 300 Blocks Results • Time and cost required to perform a longitudinal sentiment analysis of 100 million tweets (divided into 300 blocks) within the Amazon EC2 Cloud.
| Workers | Execution Time | Price |
| 2 | 7h 15m 02s | $43.92 |
| 3 | 4h 50m 05s | $36.60 |
| 4 | 3h 37m 33s | $36.60 |
| 5 | 2h 54m 03s | $32.94 |
| 6 | 2h 25m 15s | $38.43 |
| 7 | 2h 04m 45s | $43.92 |
| 8 | 1h 49m 02s | $32.94 |
| 9 | 1h 36m 59s | $36.60 |
Data has been obtained for different configurations (1 master VM + # workers VMs)
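The prices are consistent with billing every VM (workers plus one master) for each started hour. The sketch below reproduces them under that assumption. The rate of $1.83 per VM-hour is not stated in the slides; it is back-computed here from the listed prices and should be treated as an inferred value, not as Amazon's published tariff.

```python
# ASSUMPTION: rate inferred from the listed prices, in cents per VM-hour.
RATE_CENTS_PER_VM_HOUR = 183

# Execution times from the results above as (hours, minutes, seconds).
RUNS = {
    2: (7, 15, 2), 3: (4, 50, 5), 4: (3, 37, 33), 5: (2, 54, 3),
    6: (2, 25, 15), 7: (2, 4, 45), 8: (1, 49, 2), 9: (1, 36, 59),
}


def billed_hours(h, m, s):
    """Round the run time up to whole hours (hourly time slots)."""
    total_seconds = 3600 * h + 60 * m + s
    return -(-total_seconds // 3600)  # integer ceiling division


def cost_cents(workers, h, m, s):
    """Cost of one run: every VM (workers + 1 master) pays each started hour."""
    return billed_hours(h, m, s) * (workers + 1) * RATE_CENTS_PER_VM_HOUR


costs = {n: cost_cents(n, *t) for n, t in RUNS.items()}
cheapest = min(costs.values())
best = [n for n, c in costs.items() if c == cheapest]

for n, c in costs.items():
    print(f"{n} workers: ${c / 100:.2f}")
print("cheapest configurations:", best)
```

Because of the hourly rounding, adding workers only pays off when it pushes the run under the next hour boundary: 5 and 8 workers cost the same, but 8 workers finish about an hour earlier.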
Conclusions • We have developed a formal model that allows users and service managers to evaluate cost and performance for a given deployment strategy, or to choose the best deployment strategy for an expected cost/performance, with the aim of hiring Cloud resources in a way that better fits users' time and cost constraints.