Ab Initio modelling of surfaces and surfactants
Outline
• Background
• System studied
• Hardware used
• MPI programming
Background
Flotation of minerals is a multi-billion-dollar industry. Much is already known, but much is still left to investigate.
Flotation principles
• Depressants
• Collectors
• Frothers
Background: adsorption
• In-situ adsorption
• Surfactants
Example of application: ore flotation
– ZnS, PbS, …
– Collectors, e.g. xanthates and dithiophosphates
Quantum chemical modelling
[Figure: modelled infrared intensity spectra of the HepX ion and adsorbed HepX, alongside the corresponding experimental absorbance spectra, plotted against wavenumber (1/cm)]
Aim of the work
• Investigate how collectors interact with surfaces
• Introduce the pseudopotential concept into chemistry
• Collaboration between experiments and modelling
Methods used
• SCF: Self-Consistent Field
• MP2: second-order Møller-Plesset perturbation theory
• DFT: Density Functional Theory (NWChem)
• DFT in combination with pseudopotentials (AIMPRO)
• Experimental IR/Raman spectroscopy
Geometrical optimization and vibrational mode calculations of ethyl xanthate
Execution time for the simulation

Method       Basis set   Functions   Time (s)
SCF          STO-3G           43       8406
SCF          6-311G*         143     100838
SCF          6-311G**        160     143921
MP2          STO-3G           43      25722
MP2          6-311G*         143     402577
MP2          6-311G**        160     455127
DFT (LDA)    DZVP            106      21408
DFT (B3LYP)  DZVP            106      70467

Timings using NWChem with different methods and basis sets.
Molecule in a box vs. cluster calculations

Geometrical optimization of ethyl xanthate:
              Cluster   Gamma   4 k-points   14 k-points
Timing (s)        233   32379        32154         32580
Speedup             1     139          137           139

Vibrational frequency calculations of ethyl xanthate:
              Cluster   Gamma   4 k-points   14 k-points
Timing (s)        561   73539        71911         73037
Speedup             1     131          128           130

• Box size 15x25x15 Å
• The k-point sampling converges to the gamma point
• Cluster calculations are about 18 times faster than NWChem using the DZVP basis set and LDA
• About 400 times faster than NWChem using MP2
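The speedup rows read as plain timing ratios relative to the cluster calculation: each supercell timing divided by the cluster timing, e.g. 32379 s / 233 s ≈ 139 for the gamma-point geometry optimization and 73539 s / 561 s ≈ 131 for the gamma-point frequency run.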
Excellent agreement with both all-electron DFT calculations and experimental results
• Less than 3.8% deviation from all-electron calculations
• Less than 4% deviation from experimental results
• Cluster calculations as accurate as supercell calculations
(Potassium) O,O-dibutyldithiophosphate
• Several different geometrical conformations
• Important for the mining industry (flotation process)
• Short-chained species vital in lubrication
Vibrational frequency calculations
Calculated vibrational spectra compared with experimental spectra
Adsorption of heptyl xanthate on a germanium surface
• Calculations of vibrational frequencies
• Good agreement with experiments
• ATR-FTIR experiments
• Bridging conformation on the surface
• 175 atoms in the supercell, 6 k-points, large basis set
Hardware used
The HPC2N facilities at Umeå University
• Sarek, 384 processors
• Seth, 256 nodes
The PDC facilities at KTH
• Lenngren, 886 processors
Sarek, the HPC2N Opteron cluster
• A total of 384 processors and 1.54 TB of memory
• 190 HP DL145 nodes with dual AMD Opteron 248 (2.2 GHz)
• 2 HP DL585 nodes with dual AMD Opteron 248 (2.2 GHz)
• 1.69 Tflops/s peak performance
• 1.33 Tflops/s HP Linpack
• 8 GB memory per node
• Myrinet 2000 high-speed interconnect
The network
Myrinet-2000 with MX-2G software
• MX or MPI latency: 3.2 µs
• MX or MPI unidirectional data rate:
  – 247 MBytes/s (one-port NICs)
  – 495 MBytes/s (two-port NICs)
• TCP/IP data rate (MX Ethernet emulation):
  – 1.98 Gbit/s (one-port NICs)
  – 3.95 Gbit/s (two-port NICs)
The nodes
• 384 CPUs
  – 64-bit AMD Opteron, 2.2 GHz
  – 64 kB + 64 kB L1 cache (2-way associative)
  – 1024 kB unified L2 cache (16-way associative)
• 192 nodes
• 11.2 GB/s memory bandwidth
• 8.8 Gflops/s peak performance per node
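The peak figure is per node: 2 CPUs x 2.2 GHz x 2 floating-point operations per clock cycle gives 8.8 Gflops/s, and 192 such nodes reproduce the 1.69 Tflops/s quoted for the whole machine.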
Software
• Ubuntu 6.06 LTS
• OpenAFS AFS client
• MX
• MPICH-MX
• GotoBLAS
• ScaLAPACK
• BLACS
• FFTW
• PGI compiler suite
• PathScale compiler suite
The Top 500 list, November 2006

Rank   Site                                        Computer                                               Processors   Year   Rmax (GFlops)   Rpeak (GFlops)
1      DOE/NNSA/LLNL, United States                BlueGene/L, eServer Blue Gene Solution (IBM)               131072   2005          280600           367000
128    KTH Royal Institute of Technology, Sweden   Lenngren, PowerEdge 1850, 3.4 GHz, Infiniband (Dell)          886   2005            4999             6025

Sarek is not even on the list any longer:
• 168 in June 2004
• 224 in November 2004
• 400 in June 2005
• Top 30 in 1993
MPI: an introduction
• Background
• Basics of MPI message passing
  – Fundamental concepts
  – Simple examples in C
• Point-to-point communication
• Collective communication
What is MPI?
• A message-passing library specification
  – not a language or compiler specification
  – not a specific implementation or product
• For parallel computers, clusters, and heterogeneous networks
• Designed to provide access to advanced parallel hardware for
  – end users
  – library writers
  – tool developers
Why MPI?
• Early vendor systems were not portable
• Early portable systems were mainly research efforts by individual groups
• MPI provides a portable way to express parallel programs
• The MPI Forum was organized in 1992 with broad participation
• MPI Standard (1.0) released in 1994
• MPI Standard (2.0) released in 1997
The MPI Architecture
• SPMD: Single Program, Multiple Data
  – given P processors, run the same program on each processor
• Datatypes
  – the standard way to describe data in MPI
• Communicators
  – an abstraction for selecting the participants in a set of communications
• Two-sided (pair-wise) communication
  – one party sends data and the other receives
• Collective communication
  – reductions, broadcasts, etc.
A minimal MPI Program
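A sketch in C of such a program, assuming only the standard MPI C bindings: initialization and finalization bracket everything else, and this already constitutes a complete, runnable MPI program.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    /* Initialize the MPI library; must precede all other MPI calls. */
    MPI_Init(&argc, &argv);

    printf("Hello, world!\n");

    /* Shut down the MPI library; no MPI calls are allowed afterwards. */
    MPI_Finalize();
    return 0;
}

Every process launched by mpirun executes this same program (SPMD), so with four processes the greeting is printed four times.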
A better version
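A sketch of the usual refinement: each process queries its own rank and the size of the communicator, so the processes can tell themselves apart.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    /* rank: this process's identity within the communicator, 0..size-1 */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* size: the total number of processes in the communicator */
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("I am process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}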
Message Passing
Processes with separate address spaces cooperate by explicitly sending and receiving messages: data moves only when one process performs a send and another performs a matching receive.
MPI identifications
• A process is identified by its rank in the group associated with a communicator
  – There is a default communicator whose group contains all initial processes, MPI_COMM_WORLD
  – New communicators can be created using MPI_Comm_split (see the sketch below)
• All communications are labeled with a datatype, e.g. MPI_INT
  – This supports communication between processes on machines with different memory representations and lengths of elementary datatypes (heterogeneous communication)
• The message tag assists the receiving process in identifying the message
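A sketch of creating new communicators with MPI_Comm_split; the grouping of ranks into rows of four is purely illustrative.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, row_rank;
    MPI_Comm row_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* Processes with the same color (rank / 4) join the same new
       communicator; the key (rank) orders them within it. */
    MPI_Comm_split(MPI_COMM_WORLD, rank / 4, rank, &row_comm);
    MPI_Comm_rank(row_comm, &row_rank);
    printf("global rank %d is rank %d in its row communicator\n",
           rank, row_rank);
    MPI_Comm_free(&row_comm);
    MPI_Finalize();
    return 0;
}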
Blocking send
MPI_SEND(start, count, datatype, dest, tag, comm)
• The message buffer is described by start, count, and datatype
• The target process is specified by dest, its rank in the communicator specified by comm
• When this function returns, the data has been delivered to the system and the buffer can be reused; the message may not yet have been received by the target process
Blocking receive
MPI_RECV(start, count, datatype, source, tag, comm, status)
• Waits until a matching (on source and tag) message is received from the system; then the buffer can be used
• source is the rank in the communicator specified by comm
• status contains further information
• Receiving fewer than count occurrences of datatype is OK, but receiving more is an error
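A sketch pairing the two calls: rank 0 sends one integer to rank 1; the tag value 99 is arbitrary but must match on both sides.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, value = 42;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        /* start = &value, count = 1, datatype = MPI_INT, dest = 1, tag = 99 */
        MPI_Send(&value, 1, MPI_INT, 1, 99, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Blocks until a message matching source 0 and tag 99 arrives. */
        MPI_Recv(&value, 1, MPI_INT, 0, 99, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}

Run with at least two processes, e.g. mpirun -np 2.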
MPI is simple
• Many parallel programs can be written using just these six functions, only two of which are non-trivial:
  – MPI_INIT
  – MPI_FINALIZE
  – MPI_COMM_SIZE
  – MPI_COMM_RANK
  – MPI_SEND
  – MPI_RECV
Collective Communication
• Several collective primitives exist in MPI, for example:
  – Broadcast: MPI_Bcast
  – Gather: MPI_Gather, MPI_Gatherv
  – Scatter: MPI_Scatter, MPI_Scatterv
  – All-to-all: MPI_Alltoall, MPI_Alltoallv
  – Reduction: MPI_Reduce, MPI_Allreduce
  – Barrier: MPI_Barrier
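A sketch combining a broadcast with a reduction; the payload, each process contributing its own rank to a global sum, is purely illustrative.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, n = 0;
    long local, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        n = 100;                    /* the root chooses a value...       */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* ...all receive it */
    local = rank;                   /* each process's contribution       */
    /* Sum every contribution onto the root, rank 0. */
    MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum of ranks = %ld, n = %d\n", total, n);
    MPI_Finalize();
    return 0;
}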
MPI Summary
• The parallel computing community has cooperated on the development of a standard for message-passing libraries
• There are many implementations, on nearly all platforms
• MPI subsets are easy to learn and use
• Lots of MPI material is available