SOLVING LARGE NUMERICAL OPTIMIZATION PROBLEMS IN HPC WITH PYTHON
Antonio Gómez-Iglesias
PyHPC 2015

CONCLUSIONS
▶ Python is great for optimization of large-scale problems on HPC resources
▶ Focus on the problem, not on the implementation
▶ Negligible impact on overall performance
▶ Python as a glue, but much more than that
▶ Large-scale problems → not your typical optimization problem

LARGE-SCALE OPTIMIZATION PROBLEMS
▶ Optimization problem: find the best feasible solution for a problem
▶ Large-scale problem → very complex problem
  ▶ Large number of parameters
  ▶ Long wall time
  ▶ High computational requirements
▶ Advances in software and hardware make it possible to solve problems that were previously infeasible
▶ Applications in science, society and industry
▶ Few people working on this, high demand for skills and knowledge

OPTIMIZING PLASMA CONFINEMENT

PART OF A MUCH BIGGER PROBLEM

A MULTI-OBJECTIVE PROBLEM

MATHEURISTICS
    maximize    f(x)
    subject to  Ax ≤ B
                x ≥ 0
▶ Metaheuristics + Mathematical Programming
▶ These problems are more common than you imagine
▶ Huge impact in everyday life
  ▶ Transportation, logistics, urban design, energy, ...
▶ First version using ACO and Lagrangian Relaxation used for the Hunter Valley Coal Chain

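The slide names Lagrangian Relaxation without saying which constraints were relaxed. As a sketch only, assuming the whole block Ax ≤ B is moved into the objective, the textbook relaxation of the program above reads as follows; the multiplier vector λ and the optimum f* are notation introduced here, not taken from the talk.

```latex
% Primal problem, as stated on the slide
\[
  f^{*} = \max_{x \ge 0} \; f(x) \quad \text{subject to} \quad Ax \le B
\]

% Lagrangian relaxation: penalize violations of Ax <= B with multipliers lambda >= 0
\[
  L(\lambda) = \max_{x \ge 0} \; \bigl[\, f(x) + \lambda^{\top} (B - Ax) \,\bigr],
  \qquad \lambda \ge 0
\]

% Every feasible x has B - Ax >= 0, so the relaxed value bounds the optimum from above
\[
  f^{*} \le L(\lambda) \quad \text{for all } \lambda \ge 0,
  \qquad \text{tightest bound: } \min_{\lambda \ge 0} L(\lambda)
\]
```

In a matheuristic of this kind, the metaheuristic (here ACO) would typically propose candidate solutions while the relaxed problem supplies bounds or guidance; how the two were actually coupled is not stated on the slide.
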
HUNTER VALLEY COAL CHAIN
▶ Optimize the coal supply chain in the Hunter Valley region (New South Wales, Australia)
▶ Newcastle: the main coal export port in the world
▶ Key player in the economy of the region
▶ Very large and complex problem, with millions of variables and more than 40,000 constraints

HOW TO SOLVE THESE PROBLEMS?
▶ Distributed solvers
▶ Metaheuristics
  ▶ DAB: distributed asynchronous bees
  ▶ ABC: artificial bee colony
  ▶ ACO: ant colony optimization
▶ Using large computational resources
  ▶ Euler: 144 Dual Xeon 4-core E5450
  ▶ Bragg: 128 Dual Xeon 8-core E5-2650
  ▶ Stampede: 6400 Dual Xeon 8-core E5-2680
▶ Extensible parallel framework
  ▶ Python + NumPy + MPI4py + minidom + …
  ▶ Originally grid-oriented
  ▶ Producer-consumer model

PRODUCER-CONSUMER MODEL
▶ Two main components in the framework:
  ▶ Producer: 1 process
    ▶ Implements the algorithm
    ▶ Creates solutions based on previously known solutions
    ▶ Decides when to finish
  ▶ Consumer: N processes
    ▶ Evaluates possible solutions for the problem
▶ Avoid reevaluating solutions → use queues of evaluated and pending solutions

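The producer-consumer split can be sketched in a few lines. This is a minimal illustration, not the framework's code: it swaps threads and standard-library queues in for MPI processes so that it runs anywhere, and the objective `evaluate`, the neighbourhood move and all parameter names are hypothetical. It also works in synchronous rounds, which a real asynchronous solver would not need to do.

```python
import queue
import random
import threading


def evaluate(solution):
    """Hypothetical objective (higher is better), with its maximum at (3, 3, 3)."""
    return -sum((x - 3) ** 2 for x in solution)


def consumer(pending, results):
    """Consumer: evaluate pending solutions until the producer sends None."""
    while True:
        solution = pending.get()
        if solution is None:
            break
        results.put((solution, evaluate(solution)))


def run_producer(n_consumers=4, n_rounds=30, batch=8, seed=0):
    """Producer: create candidates, hand them out, and decide when to stop."""
    rng = random.Random(seed)
    pending, results = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=consumer, args=(pending, results))
               for _ in range(n_consumers)]
    for worker in workers:
        worker.start()

    start = (0, 0, 0)
    evaluated = {start: evaluate(start)}  # solution -> fitness, never recomputed
    best = start
    for _ in range(n_rounds):
        # New candidates are small perturbations of the best known solution.
        candidates = {tuple(x + rng.choice((-1, 0, 1)) for x in best)
                      for _ in range(batch)}
        fresh = [c for c in candidates if c not in evaluated]
        for candidate in fresh:
            pending.put(candidate)
        for _ in fresh:
            solution, fitness = results.get()
            evaluated[solution] = fitness
        best = max(evaluated, key=evaluated.get)

    for _ in workers:  # one stop signal per consumer
        pending.put(None)
    for worker in workers:
        worker.join()
    return best, evaluated[best], len(evaluated)
```

Calling `run_producer()` returns the best solution found, its fitness and the number of distinct solutions evaluated. The `evaluated` dictionary plays the role of the queue of evaluated solutions: a candidate that was already scored is never sent to a consumer again.
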
NOT THE TYPICAL BLACK BOX MODEL
▶ Many optimization frameworks are designed to be used as black boxes
▶ If the optimization is easy, that's OK
▶ Large-scale problems are so challenging that specific tuning of the optimization algorithm is required
▶ Weeks of wall time → checkpointing is critical

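With runs that last weeks, the producer's state has to survive a job being killed. The slide does not describe how the framework checkpoints, so the following is only a sketch under assumptions: the state is a picklable dictionary, and `save_checkpoint` and `load_checkpoint` are hypothetical helper names. Writing to a temporary file and renaming it means an interrupted write never corrupts the previous checkpoint.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: dump to a temp file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes reach the disk
        os.replace(tmp_path, path)     # atomic when both are on one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path, default):
    """Return the saved state, or default when no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

A producer would call `load_checkpoint` once at start-up and `save_checkpoint` every few iterations, storing for example the iteration counter and the table of evaluated solutions.
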
WHY PYTHON?
▶ Popular
▶ Code is clear
▶ Short development time
▶ Functionality already provided by many modules
▶ Easy to interact with other applications

PYTHON AS GLUE AND MUCH MORE
▶ Main problem might be solved with external applications
  ▶ Python used to interact with those applications
▶ Optimization algorithms are already quite complex
  ▶ Simple implementation in Python
  ▶ Many modules that facilitate the development
▶ Simple functions can be directly implemented in Python

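The glue role can be illustrated with the standard `subprocess` module. This is a sketch, not the framework's interface: `run_external` and `evaluate_with_external` are hypothetical names, and a Python one-liner stands in for the real external application (for instance a physics code) so that the example runs anywhere.

```python
import subprocess
import sys


def run_external(command, stdin_text, timeout=60):
    """Run an external program, feed it input, and return its standard output."""
    result = subprocess.run(
        command,
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


def evaluate_with_external(solution):
    """Evaluate one solution by delegating the work to an external program."""
    # Stand-in for the real application: sums the squares of the numbers on stdin.
    stand_in = [
        sys.executable,
        "-c",
        "import sys; print(sum(float(v) ** 2 for v in sys.stdin.read().split()))",
    ]
    stdout = run_external(stand_in, " ".join(str(v) for v in solution))
    return float(stdout.strip())
```

The optimizer only sees `evaluate_with_external`; replacing the stand-in command with the real executable, and the parsing line with whatever its output format requires, leaves the rest of the code unchanged.
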
SOLVING DIFFERENT PROBLEMS
▶ XML file describing the parameters of the problem
▶ User needs to implement the function that evaluates one solution
  ▶ External applications
  ▶ Inline
▶ Auxiliary functions are needed
  ▶ Create input file from internal representation of the solution

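The two pieces named above, an XML description of the parameters and an auxiliary function that builds an input file, might look as follows with `minidom`. The XML schema (`problem`, `parameter`, `name`, `min`, `max`) and the `name = value` input format are assumptions made for this sketch; the talk does not show the real ones.

```python
from xml.dom import minidom

# Hypothetical problem description; the framework's actual schema is not shown.
PROBLEM_XML = """<problem name="demo">
  <parameter name="current" min="0.0" max="10.0"/>
  <parameter name="radius" min="1.0" max="2.0"/>
</problem>"""


def load_parameters(xml_text):
    """Read parameter names and bounds from the XML problem description."""
    document = minidom.parseString(xml_text)
    parameters = []
    for node in document.getElementsByTagName("parameter"):
        parameters.append({
            "name": node.getAttribute("name"),
            "min": float(node.getAttribute("min")),
            "max": float(node.getAttribute("max")),
        })
    return parameters


def render_input_file(parameters, solution):
    """Auxiliary function: turn the internal list of values into input text."""
    lines = []
    for parameter, value in zip(parameters, solution):
        if not parameter["min"] <= value <= parameter["max"]:
            raise ValueError(f"{parameter['name']} = {value} is out of bounds")
        lines.append(f"{parameter['name']} = {value}")
    return "\n".join(lines) + "\n"
```

The text returned by `render_input_file` would be written to disk and passed to the external application; for an inline evaluation the same parameter list can be used directly, without the file.
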
MPI4PY
▶ Great for developers with MPI experience
▶ Documentation is still lacking
▶ Initially (1.2.2) some functionality was missing → convoluted implementation
▶ 2.0.0 has been recently released: porting pending
▶ It might need the MPI-enabled Python interpreter (MVAPICH2) → python-mpi

PERFORMANCE
▶ For large-scale problems, most of the time is spent evaluating possible solutions
▶ The impact of Python on the overall execution time is negligible
▶ However, special care has been taken to make the Python code as efficient as possible
▶ The impact of communication (initialization of MPI and synchronization) is also negligible

CONCLUSIONS
▶ Python is great for optimization of large-scale problems on HPC resources
▶ Focus on the problem, not on the implementation
▶ Negligible impact on overall performance
▶ Python as a glue, but much more than that
▶ Large-scale problems → not your typical optimization problem