
Evaluating the Impact of Transactional Characteristics on the Performance of Transactional Memory Applications


  1. Evaluating the Impact of Transactional Characteristics on the Performance of Transactional Memory Applications
     Fernando Rui (1), Márcio Castro (2), Dalvan Griebler (1), Luiz Gustavo Fernandes (1)
     Email: fernando.rui@acad.pucrs.br, mbcastro@inf.ufrgs.br, dalvan.griebler@acad.pucrs.br, luiz.fernandes@pucrs.br
     (1) Pontifícia Universidade Católica do Rio Grande do Sul - PUCRS - GMAP
     (2) Universidade Federal do Rio Grande do Sul - UFRGS - INF
     February 2014

  2. Summary
     1. Introduction
     2. Methodology
     3. Performance Evaluation
     4. Conclusions
     5. References

  3. Introduction: Motivation
     - Multi-core applications are not embarrassingly parallel.
     - Traditional synchronization structures (locks, mutexes and semaphores):
       - are low-level mechanisms;
       - cause blocking;
       - are hard to manage;
       - are vulnerable to failures and faults.

  4. Introduction: Transactional Memory (TM)
     - High-level abstraction.
     - Allows parallel code to be written as transactions.
     - Conflicts are detected and resolved at runtime.
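
The transaction model on this slide can be illustrated with a minimal sketch. It assumes a toy, single-variable software TM written only for illustration: `TVar`, `atomically` and `run_demo` are hypothetical names, not the API of any STM system evaluated in this deck. Each transaction works on a snapshot without holding a lock, commits only if no other transaction has committed in the meantime, and otherwise aborts and retries.

```python
import threading


class TVar:
    """A shared variable with a version number (hypothetical toy type)."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()  # held only for snapshot and commit


def atomically(tvar, update):
    """Run `update` as a transaction on `tvar`; return the number of aborts."""
    aborts = 0
    while True:
        # Take a consistent (version, value) snapshot.
        with tvar._lock:
            seen_version, seen_value = tvar.version, tvar.value
        # Speculative work happens without holding the lock.
        new_value = update(seen_value)
        with tvar._lock:
            if tvar.version == seen_version:
                # No conflicting commit happened: publish the result.
                tvar.value = new_value
                tvar.version += 1
                return aborts
        # Conflict detected at commit time: abort and retry.
        aborts += 1


def run_demo(num_threads=4, increments=1000):
    """Increment one shared counter from several threads using transactions."""
    counter = TVar(0)

    def worker():
        for _ in range(increments):
            atomically(counter, lambda v: v + 1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Under these assumptions, `run_demo(4, 1000)` returns 4000 regardless of how the threads interleave, because a transaction that loses a race is retried instead of overwriting the other update.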

  5. Introduction
     Challenges of TM systems
     - What kind of applications can really take advantage of TM?
     - Why do some TM applications perform poorly?
     Contributions of this research
     - Performance evaluation of state-of-the-art STM systems and applications.
     - Extension of the analysis in [1] to include the RSTM [2] system.
     - We identify characteristics that affect the performance of TM applications.
     - We identify bottlenecks of TM applications that limit their scalability.
     - We show possible improvements to achieve better performance.

  6. Methodology
     Comparative analysis
     1. Four state-of-the-art STM systems are compared using the Stanford Transactional Applications for Multi-Processing (STAMP) benchmark [3];
     2. The STM systems are evaluated using EigenBench [1];
     3. The impact of certain transactional characteristics is evaluated using EigenBench.
     Test environment
     - All experiments were performed on a Dell PowerEdge R610 machine with two quad-core Intel Xeon E5520 2.27 GHz processors with 8 MB of L2 cache and 16 GB of shared memory;
     - All results are arithmetic means of at least 30 runs to guarantee a confidence level of 95%.
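
The slide states that each result is the arithmetic mean of at least 30 runs at a 95% confidence level, but not how the interval was computed. The sketch below is one common way to do it, assuming a normal approximation (often used once there are about 30 samples); the function name `mean_with_ci` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(samples, confidence=0.95):
    """Return (mean, half_width) of a confidence interval for the mean.

    Assumption: normal approximation, i.e. half_width = z * s / sqrt(n),
    where s is the sample standard deviation.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * stdev(samples) / sqrt(n)
    return mean(samples), half_width


# Hypothetical execution times (seconds) of 30 runs of one configuration.
times = [10.0] * 15 + [12.0] * 15
avg, half = mean_with_ci(times)
print(f"{avg:.2f} s +/- {half:.2f} s")
```

A Student's t critical value would give a slightly wider interval for 30 runs; the normal quantile is used here only because it is available in the standard library.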

  7. STM Systems Using STAMP Benchmark: STM Systems
     - Transactional Locking II (TL2) [4]: the second version of the original TL;
     - TinySTM [5]: uses a shared counter as a clock to control conflicts between transactions, and locks to protect shared memory locations;
     - SwissTM [6]: its main innovation is a hybrid conflict detection scheme;
     - Rochester Software Transactional Memory (RSTM) [2]: reduces cache misses by employing a single level of indirection to access shared objects.
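
The "shared counter as clock" idea mentioned for TinySTM can be sketched as follows. This is a simplified, single-threaded simulation written for illustration, not code from TinySTM or TL2: all class names are hypothetical, and the per-location locks that real systems use are left out because nothing here runs concurrently. Every committed write is stamped with a new clock value, and a transaction aborts if it sees a location stamped later than the clock value it sampled at its start.

```python
class Conflict(Exception):
    """Raised when a transaction must abort."""


class Memory:
    """Shared memory: address -> (value, version), plus a global clock."""

    def __init__(self):
        self.clock = 0
        self.data = {}

    def load(self, addr):
        return self.data.get(addr, (0, 0))


class Transaction:
    def __init__(self, mem):
        self.mem = mem
        self.start = mem.clock  # clock value sampled when the transaction begins
        self.reads = set()
        self.writes = {}        # buffered writes, published at commit

    def read(self, addr):
        if addr in self.writes:
            return self.writes[addr]
        value, version = self.mem.load(addr)
        if version > self.start:
            raise Conflict(addr)  # location changed after we started
        self.reads.add(addr)
        return value

    def write(self, addr, value):
        self.writes[addr] = value

    def commit(self):
        # Revalidate every location that was read.
        for addr in self.reads:
            if self.mem.load(addr)[1] > self.start:
                raise Conflict(addr)
        self.mem.clock += 1  # advance the shared clock
        for addr, value in self.writes.items():
            self.mem.data[addr] = (value, self.mem.clock)


def demo():
    """Two overlapping transactions increment the same location."""
    mem = Memory()
    t1, t2 = Transaction(mem), Transaction(mem)
    t1.write("x", t1.read("x") + 1)
    t2.write("x", t2.read("x") + 1)
    t1.commit()
    try:
        t2.commit()
        outcome = "committed"
    except Conflict:
        outcome = "aborted"
    return mem.data["x"], outcome
```

In `demo()`, the first transaction commits and stamps `x` with clock value 1; the second one started at clock value 0, so its validation fails and it aborts rather than losing the first update.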

  8. STM Systems Using STAMP Benchmark: Performance Evaluation
     [Figure: speedups of the STAMP applications under each STM system, one panel per system (SwissTM, RSTM, TinySTM, TL2). x-axis: applications; y-axis: speedup (0 to 5); legend: 2, 4 and 8 cores.]

  9. SwissTM vs. RSTM using EigenBench
     Set-up
     - The STM systems that presented the best performance;
     - STAMP applications with poor (ssca2), medium (intruder and vacation) and good (labyrinth and genome) scalability;
     - The evaluation is based on speedup and aborts per commit (ApC).
     EigenBench input parameters
     Table: Application characteristics from the STAMP benchmark

     Characteristic         ssca2     intruder  vacation  labyrinth  genome
     Working-set Size       400 MB    20 MB     256 MB    16 MB      20 MB
     Transactional Length   3         24        226       357        88
     Pollution              33%       5%        2%        50%        5%
     Temporal Locality      0.33      0.52      0.59      0.77       0.58
     Contention             0.0005%   22%       0.2%      5%         0.5%
     Predominance           Low       Low       High      Low        High
     Density                High      High      High      Low        High
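
The two metrics named in the set-up are not defined on the slide. The sketch below assumes the usual definitions: speedup as the baseline execution time divided by the parallel execution time, and ApC as the number of aborted transactions divided by the number of committed ones. The function names and the numbers in the example are hypothetical, not measurements from this study.

```python
def speedup(baseline_time, parallel_time):
    """Speedup of a parallel run relative to a baseline run (assumed definition)."""
    if parallel_time <= 0:
        raise ValueError("parallel_time must be positive")
    return baseline_time / parallel_time


def aborts_per_commit(aborts, commits):
    """Aborts per commit (ApC): aborted transactions per committed transaction."""
    if commits <= 0:
        raise ValueError("commits must be positive")
    return aborts / commits


# Hypothetical run: 10 s baseline, 2.5 s on 8 cores, 150 aborts for 1000 commits.
print(f"speedup = {speedup(10.0, 2.5):.1f}")
print(f"ApC     = {aborts_per_commit(150, 1000):.1%}")
```

Reading the two numbers together is what matters: a high ApC means much of the parallel work is thrown away and re-executed, which is one way a run on more cores can still show a low speedup.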

  10. SwissTM vs. RSTM using EigenBench (Cont.): Performance Evaluation
      [Figure: two panels for each STM system (SwissTM, RSTM). Speedups panel: speedup (y-axis, 0 to 8) per application (genome, intruder, labyrinth, ssca2, vacation) on 2, 4 and 8 cores. Aborts per commit panel: ApC (y-axis, 0% to 16% for SwissTM, 0% to 6% for RSTM) against the number of cores (2, 4, 8), one series per application.]

  11. SwissTM vs. RSTM using EigenBench (Cont.): Findings
      - TM applications that use large amounts of memory did not perform well, since STM systems need to keep track of much more data to detect conflicts;
      - Variation in transaction length during execution is not handled well by most of the STM systems;
      - Low degrees of predominance and density help TM applications to perform better;
      - High levels of ApC generally limit the performance of TM applications.

  12. Evaluating the Impact of Transactional Characteristics
      [Figure: speedups (y-axis, 0 to 5) of the Original version and of versions V1 to V4 (x-axis) on 2, 4 and 8 cores. Panels: Genome - Transactional Length; Intruder - Temporal Locality; Ssca2 - Working-set Size; Vacation - Working-set Size.]

  13. Conclusions
      About this paper
      - Some characteristics drive the performance of TM applications;
      - Applications must be analysed carefully to identify the relevant characteristics.
      Future opportunities
      - We intend to extend this work using tracing mechanisms such as those proposed in [7];
      - We intend to study the impact of the TM characteristics on the performance of TM applications when executed on a real HTM processor such as the Intel Haswell.

  14. References I
      [1] Sungpack Hong et al. EigenBench: A Simple Exploration Tool for Orthogonal TM Characteristics. In IEEE International Symposium on Workload Characterization (IISWC), pages 1–11, Washington, USA, 2010. IEEE Computer Society.
      [2] Virendra J. Marathe, Michael F. Spear, Christopher Heriot, Athul Acharya, David Eisenstat, William N. Scherer III, and Michael L. Scott. Lowering the Overhead of Nonblocking Software Transactional Memory. In ACM SIGPLAN Workshop on Transactional Computing, June 2006.
      [3] Cao Minh et al. STAMP: Stanford Transactional Applications for Multi-Processing. In IEEE International Symposium on Workload Characterization (IISWC), pages 35–46, Seattle, USA, 2008. IEEE Computer Society.
      [4] Dave Dice et al. Transactional Locking II. In International Symposium on Distributed Computing (DISC), pages 194–208, 2006.
      [5] Pascal Felber, Christof Fetzer, and Torvald Riegel. Dynamic Performance Tuning of Word-based Software Transactional Memory. In Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 237–246, Salt Lake City, USA, 2008. ACM.
      [6] Aleksandar Dragojević, Rachid Guerraoui, and Michal Kapalka. Stretching Transactional Memory. In Programming Language Design and Implementation (PLDI), pages 155–165, 2009.

  15. References II
      [7] Márcio Castro et al. Analysis and Tracing of Applications Based on Software Transactional Memory on Multicore Architectures. In Euromicro International Conference on Parallel, Distributed and Network-Based Computing (PDP), pages 199–206. IEEE Computer Society, 2011.

