Oct 16, 2022

Marwa A. Al-Shandawely PDC/KTH
• Algorithm overview.
• Trivial parallelization.
• Problems.
• Sequential optimization
• Proposed solutions.
• Experimental results.
• Conclusions and future work.
for i=1 to n-1
  find pivotPos in column i
  if pivotPos ≠ i
    exchange rows(pivotPos,i)
  end if
  for j=i+1 to n
    A(i,j) = A(i,j)/A(i,i)
  end for j
  !$omp parallel do private(i,j)
  for j=i+1 to n+1
    for k=i+1 to n
      A(k,j)=A(k,j)-A(k,i)×A(i,j)
    end for k
  end for j
end for i
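The pseudocode above can be sketched sequentially in Python. This is an illustrative sketch, not the presenter's Fortran/OpenMP code: the function name `solve`, the augmented-matrix layout, the choice of the largest-magnitude entry as the pivot, the normalization of the right-hand-side column, and the back-substitution step are all assumptions added so that the example runs end to end. The matrix is assumed to be nonsingular.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | b]; copies keep the inputs untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for i in range(n - 1):
        # Find the pivot: largest magnitude in column i at or below row i.
        pivot_pos = max(range(i, n), key=lambda r: abs(m[r][i]))
        if pivot_pos != i:
            m[i], m[pivot_pos] = m[pivot_pos], m[i]
        # Normalize the pivot row (assumption: including the right-hand side).
        for j in range(i + 1, n + 1):
            m[i][j] /= m[i][i]
        # Column-wise elimination; the j loop is the one the slide parallelizes.
        for j in range(i + 1, n + 1):
            for k in range(i + 1, n):
                m[k][j] -= m[k][i] * m[i][j]
    # Back substitution; rows 0..n-2 carry an implicit unit diagonal.
    x = [0.0] * n
    x[n - 1] = m[n - 1][n] / m[n - 1][n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
    return x
```

Each iteration of the outer loop touches the whole trailing matrix, which is what makes the per-iteration parallel region both costly to start and poor in cache reuse.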
[Figure: y-axis 0 to 8, x-axis nThreads 2 to 8; series N=1000, N=2000, N=3000, N=4000, N=5000.]
• Poor data locality
• Pivoting is done by the master thread
• Overheads of creating and destroying threads at each iteration

Sequential optimization
• Replace division by the constant pivot with multiplication by its reciprocal
• Avoid loop-invariant access in the innermost loop
• Eliminate the check for the pivot changing position
• Make use of Fortran array notation

Loop with division:
  Do k=j+1,n
    A(k,j)=A(k,j)/A(j,j)
  End do

Reciprocal with array notation:
  C=1/A(j,j)
  A(j+1:n,j)=A(j+1:n,j)*C
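The first two points can be illustrated with a short Python sketch. The function names and the use of a plain list for one matrix column are assumptions made for the example. The slice assignment only stands in for Fortran array notation and is not claimed to behave the same way in terms of performance.

```python
def scale_with_division(col, j):
    """Divide the entries below the pivot col[j], re-reading the pivot each time."""
    out = list(col)
    for k in range(j + 1, len(out)):
        out[k] = out[k] / out[j]
    return out


def scale_with_reciprocal(col, j):
    """Hoist the loop-invariant pivot and multiply by its reciprocal."""
    out = list(col)
    c = 1.0 / out[j]
    # Slice assignment plays the role of Fortran array notation here.
    out[j + 1:] = [v * c for v in out[j + 1:]]
    return out
```

The two versions agree up to rounding. Multiplying by a reciprocal can differ from a true division in the last bit, which is usually an acceptable trade for replacing n divisions with one division and n multiplications.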
• Pivots array
• Locks array
• Pivot holder
  ◦ Eliminate (i) on column (i+1)
  ◦ Search (i+1)
  ◦ Store pivot (i+1) position
  ◦ Prepare column (i+1)
  ◦ Free lock (i+1)
  ◦ Eliminate (i) on rest of scope
[Diagram: threads P1 to P4 with the pivots array and the locks array.]
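The pivot-holder steps listed above can be sketched with Python threads. This is a sketch under several assumptions, not the presenter's implementation: columns are dealt to threads round-robin, `threading.Event` objects stand in for the locks array, the pivot is the largest-magnitude entry, and the names `pipelined_solve`, `apply_pivot`, and `prepare` are hypothetical. The matrix is assumed to be nonsingular. Because of Python's global interpreter lock, the sketch shows only the synchronization pattern, not a speedup.

```python
import threading


def pipelined_solve(a, b, n_threads=4):
    """Solve a x = b with column-owning threads and per-pivot ready flags."""
    n = len(a)
    # Column-major augmented matrix: cols[j][k] is row k of column j.
    cols = [[float(a[k][j]) for k in range(n)] for j in range(n)]
    cols.append([float(v) for v in b])
    pivots = [0] * n                               # pivots array
    ready = [threading.Event() for _ in range(n)]  # stand-in for the locks array

    def apply_pivot(i, j):
        # Eliminate (i) on column j: repeat the row swap, then update.
        col, piv, p = cols[j], cols[i], pivots[i]
        col[i], col[p] = col[p], col[i]
        for k in range(i + 1, n):
            col[k] -= piv[k] * col[i]

    def prepare(i):
        # Search pivot i, store its position, prepare column i, free "lock" i.
        col = cols[i]
        p = max(range(i, n), key=lambda r: abs(col[r]))
        pivots[i] = p
        col[i], col[p] = col[p], col[i]
        c = 1.0 / col[i]  # assumes a nonsingular matrix
        for k in range(i + 1, n):
            col[k] *= c
        ready[i].set()

    def worker(t):
        mine = [j for j in range(n + 1) if j % n_threads == t]
        if t == 0:
            prepare(0)  # the owner of column 0 starts the pipeline
        for i in range(n):
            ready[i].wait()
            nxt = i + 1
            if nxt < n and nxt % n_threads == t:
                # This thread is the pivot holder for column i+1.
                apply_pivot(i, nxt)
                prepare(nxt)
            for j in mine:  # eliminate (i) on the rest of this thread's scope
                if j > nxt:
                    apply_pivot(i, j)

    threads = [threading.Thread(target=worker, args=(t,), daemon=True)
               for t in range(n_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()

    # Back substitution on the upper triangle and the transformed right-hand side.
    x = [0.0] * n
    rhs = cols[n]
    for i in range(n - 1, -1, -1):
        s = rhs[i] - sum(cols[j][i] * x[j] for j in range(i + 1, n))
        x[i] = s / cols[i][i]
    return x
```

Once a pivot column has been released it is only read, so the other threads can use it without further locking. The pivot holder releases column (i+1) before finishing its own remaining columns, which lets the other threads start the next step early.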
[Figure: y-axis 0 to 12, x-axis nThreads 1 to 7; series N=1000, N=2000, N=3000, N=4000, N=5000.]
• The original algorithm requires pivot columns to be prepared in order, while the whole matrix is accessed for each pivot column.
• For large input sizes, the cache is evicted many times in each iteration, so data in the cache is not reused.
• False sharing on the pivots and locks arrays.
• Double elimination on pivot holders.
  ◦ Knowledge of two pivots allows data reuse.
• Each column is an accumulation of eliminations using previous columns!
  ◦ Make more pivots available at each step and eliminate each column using several pivots while it is in the cache.
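The idea of applying several pivots to a column in one visit can be sketched sequentially in Python. This is an illustration under assumptions, not the presenter's code: the name `blocked_solve`, the parameter `block` (the number of pivots made available per step), the column-major layout, and the largest-magnitude pivot choice are all hypothetical. With `block=2`, each remaining column receives two pivots per visit.

```python
def blocked_solve(a, b, block=2):
    """Solve a x = b, applying `block` pivots to each column per visit."""
    n = len(a)
    # Column-major augmented matrix: cols[j][k] is row k of column j.
    cols = [[float(a[k][j]) for k in range(n)] for j in range(n)]
    cols.append([float(v) for v in b])
    pivots = [0] * n

    def apply_pivot(i, j):
        # Eliminate pivot column i on column j: repeat the swap, then update.
        col, piv, p = cols[j], cols[i], pivots[i]
        col[i], col[p] = col[p], col[i]
        for k in range(i + 1, n):
            col[k] -= piv[k] * col[i]

    def prepare(i):
        # Search pivot i, record its position, and scale the column below it.
        col = cols[i]
        p = max(range(i, n), key=lambda r: abs(col[r]))
        pivots[i] = p
        col[i], col[p] = col[p], col[i]
        c = 1.0 / col[i]  # assumes a nonsingular matrix
        for k in range(i + 1, n):
            col[k] *= c

    for start in range(0, n, block):
        stop = min(start + block, n)
        # Prepare the block of pivots; each pivot column first receives
        # the earlier pivots of the same block.
        for i in range(start, stop):
            for prev in range(start, i):
                apply_pivot(prev, i)
            prepare(i)
        # Every remaining column gets all pivots of the block in one visit,
        # while that column is still likely to be in the cache.
        for j in range(stop, n + 1):
            for i in range(start, stop):
                apply_pivot(i, j)

    # Back substitution on the upper triangle and the transformed right-hand side.
    x = [0.0] * n
    rhs = cols[n]
    for i in range(n - 1, -1, -1):
        s = rhs[i] - sum(cols[j][i] * x[j] for j in range(i + 1, n))
        x[i] = s / cols[i][i]
    return x
```

Larger blocks mean fewer passes over the trailing columns and fewer synchronization points in a threaded version, at the price of more sequential work while a block is being prepared.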
• Block of pivots
• Increase work/iter.
• Increase locality
• Fewer locks
• Load balancing?!
[Diagram: threads P1 to P4 with the pivots array and locks.]
[Figure: two panels, N=2000 (y-axis 0 to 9) and N=5000 (y-axis 0 to 16), x-axis 2 to 8; series Original, C=1, C=2, C=3, C=4, C=5.]
[Figure: two panels, N=2000 and N=5000, x-axis 2 to 8; series Original, double elimination, C=25 with double elimination.]
• Scalable performance on multicores is highly dependent on application implementation, data layout, and access patterns.
• Cache and memory access optimization techniques are vital for performance, despite the loss of readability.
• Future work:
  ◦ Adaptive blocking scheme that changes the block size as a function of the matrix size, cache settings, and number of cores.
