Parallel Methods for Nano/Materials Science Applications
Andrew Canning
Computational Research Division LBNL & UC Davis, Applied Science Dept.
(Electronic Structure Calculations)

Outline
• Introduction to Nano/Materials science
• Electronic Structure Calculations (DFT)
• Parallelism for Plane Wave Approach
• Code performance on High Performance Parallel Computers
• New Methods and Applications

Milestones in Parallel Calculations
• 1991: Silicon surface reconstruction (7x7), Phys. Rev.
  (Stich, Payne, King-Smith, Lin, Clarke) Meiko I860, 64 processor Computing Surface
  (Brommer, Needels, Larson, Joannopoulos) Thinking Machines CM2, 16,384 bit processors
• 1998: FeMn alloys (exchange bias), Gordon Bell prize SC98 (Ujfalussy, Stocks, Canning, Y. Wang, Shelton et al.) Cray T3E, 1500 procs., first > 1 Tflop simulation
• 2006: 1000 atom Molybdenum simulation with Qbox, SC05 (Gordon Bell prize for peak performance) (F. Gygi et al.) BlueGene/L, 207 Tflops on IBM BG/L (LLNL)
• 2008: New Algorithm to Enable 400+ TFlop/s Sustained Performance in Simulations of Disorder Effects in High-Tc, SC08 Gordon Bell prize (Thomas C. Schulthess et al.)
Cray XT5 (ORNL), first > 1 Pflop simulation
• Linear Scaling Divide-and-Conquer Electronic Structure Calculations (L-W Wang et al.), SC08 Gordon Bell prize (algorithmic innovation)

Electronic Structure Calculations
• Accurate quantum mechanical treatment for the electrons
• Each electron is represented on a grid or with some basis functions (e.g. Fourier components)
• Compute intensive: each electron requires 1 million points/basis functions (need 100s of electrons)
• 70-80% of NERSC Materials Science computer time (first-principles electronic structure)
[Figure: BaYCl:Ce excited state (new scintillator gamma radiation detector)]

Motivation for Electronic Structure Calculations
• Most materials properties are only understood at a fundamental level from accurate electronic structure calculations (strength, cohesion etc.)
• Many properties are purely electronic, e.g. optical properties (lasers)
• Complements experiments
• Computer design of materials at the nanoscale

Materials Science Methods (increasing number of atoms down the list)
• Many Body Quantum Mechanical Approach (Quantum Monte Carlo): 20-30 atoms
• Single Particle QM (Density Functional Theory), no free parameters: 100-1000 atoms
• Empirical QM Models, e.g. Tight Binding: 1000-5000 atoms
• Empirical Classical Potential Methods: thousand-million atoms
• Continuum Methods
Single particle DFT methods are the largest user of supercomputer cycles of any scientific method in any discipline.

Quantum Mechanics for molecules and crystals
Many Body Schrodinger Equation for electrons (position $r_i$) in a molecule or crystal with nuclei of charge $Z$ and position $R_I$ (natural units $e = \hbar = 1$ etc.):

$$\left\{ -\frac{1}{2}\sum_i \nabla_i^2 + \sum_{i<j} \frac{1}{|r_i - r_j|} - \sum_{i,I} \frac{Z}{|r_i - R_I|} \right\} \Psi(r_1,..,r_N) = E\,\Psi(r_1,..,r_N)$$

The three terms are the kinetic energy, the electron-electron interaction and the electron-nuclei interaction.
• Solution of the above eigenfunction equation $H\Psi = E\Psi$ ($H$ is the Hamiltonian) for the wavefunction $\Psi$ gives complete information for the system.
• Observables take the form of operators, e.g.
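momentum $p$ (classical form $p = mv$): $p_{op} = -i\hbar\nabla$, and the observed $p$ is given by the solution of $p_{op}\Psi = p\Psi$
• The lowest eigenpair $(E_0, \Psi_0)$ gives the ground-state energy and wavefunction. Higher eigenpairs correspond to excited states of the system.

Ab initio Method: Density Functional Theory (Kohn, 1998 Nobel Prize)
Many Body Schrodinger Equation (exact but exponential scaling):

$$\left\{ -\frac{1}{2}\sum_i \nabla_i^2 + \sum_{i<j} \frac{1}{|r_i - r_j|} - \sum_{i,I} \frac{Z}{|r_i - R_I|} \right\} \Psi(r_1,..,r_N) = E\,\Psi(r_1,..,r_N)$$

Kohn-Sham Equation (1965): the many body ground state problem can be mapped onto a single particle non-linear problem with the same electron density and a different effective potential (cubic scaling):

$$\left\{ -\frac{1}{2}\nabla^2 + \int \frac{\rho(r')}{|r - r'|}\,dr' - \sum_I \frac{Z}{|r - R_I|} + V_{XC} \right\} \psi_i(r) = E_i\,\psi_i(r)$$

$$\rho(r) = \sum_i |\psi_i(r)|^2 \quad (\rho = \text{charge density, the same as that of } \Psi(r_1,..,r_N))$$

Use the Local Density Approximation (LDA) for $V_{XC}[\rho(r)]$ (good for Si, C)

Selfconsistent Solution
• Fix $V_{in}(r,\rho)$, which gives a linear problem: $\left\{ -\frac{1}{2}\nabla^2 + V_{in}(r,\rho) \right\} \psi_i(r) = E_i\,\psi_i(r)$
• Take the lowest N eigenfunctions $\{\psi_i\}_{i=1,..,N}$ (N electrons, N wave functions)
• Compute $\rho(r) = \sum_i^N |\psi_i(r)|^2$ and from it $V_{out}(r,\rho)$
• Iterate to selfconsistency, until $V_{out} = V_{in}$

Choice of Basis for DFT(LDA)
• Increasing basis size M: Gaussian → FLAPW → Fourier → grid
• Percentage of eigenpairs M/N: 30% → 2%
• Eigensolvers: Direct (ScaLAPACK) → Iterative

Plane-wave Pseudopotential Method in DFT
Solve the Kohn-Sham equations self-consistently for the electron wavefunctions within the Local Density Approximation:

$$\left\{ -\frac{1}{2}\nabla^2 + \int \frac{\rho(r')}{|r - r'|}\,dr' - \sum_I \frac{Z}{|r - R_I|} + V_{XC}(\rho(r)) \right\} \psi_j(r) = E_j\,\psi_j(r)$$

1. Plane-wave (Fourier) expansion for $\psi_{j,k}$ to energy cutoff $|g| < r_{cut}$ (sphere): $\psi_{j,k}(r) = \sum_g C^j_g(k)\, e^{i(g+k)\cdot r}$
2. 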
Replace “frozen” atom core (core electrons and nuclei) by a pseudopotential (pseudo-wavefunctions in the core region allow $r_{cut}$ to be reduced without loss of accuracy)

Different parts of the Hamiltonian are calculated in different spaces (Fourier and real) via 3d FFT.

Computational Considerations
• H is never computed explicitly (available through the mat-vec product)
• Matrix-vector product in $N \log N$ steps (not $N^2$)
• Orthogonalization is expensive, mat-vec is cheap
• Matrix is dense (in Fourier or real space)
• Real-space grid is approximately double the size of the Fourier sphere
• At each selfconsistent step we have a good guess for the eigenvectors (from the previous step)
• Typically use stable CG-based iterative methods or Davidson
[Figure: $-\frac{1}{2}\nabla^2 \psi_i(r)$ and $V(r)$, connected via FFT]

Parallelization (load balance,
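The FFT-based matrix-vector product described under Computational Considerations can be sketched as follows. This is a minimal NumPy illustration, not PARATEC's implementation. It assumes a cubic box, a purely local potential, and the full FFT grid in place of the Fourier sphere. The names `apply_hamiltonian` and `make_k2` are hypothetical.

```python
import numpy as np

def apply_hamiltonian(psi_g, v_r, k2):
    """Return H*psi in Fourier space for H = -1/2 nabla^2 + V(r).

    psi_g : wavefunction coefficients on the full FFT grid (Fourier space)
    v_r   : local potential on the same grid (real space), assumed real
    k2    : |g|^2 for every Fourier grid point
    H itself is never built; only its action on psi is computed.
    """
    kinetic = 0.5 * k2 * psi_g            # kinetic term is diagonal in Fourier space
    psi_r = np.fft.ifftn(psi_g)           # Fourier -> real space, O(N log N)
    potential = np.fft.fftn(v_r * psi_r)  # potential is diagonal in real space
    return kinetic + potential

def make_k2(n, box_length):
    """|g|^2 on an n x n x n grid for a cubic box of side box_length (assumed geometry)."""
    g = 2.0 * np.pi * np.fft.fftfreq(n, d=box_length / n)
    gx, gy, gz = np.meshgrid(g, g, g, indexing="ij")
    return gx**2 + gy**2 + gz**2

n, box_length = 16, 10.0
k2 = make_k2(n, box_length)

# Check on a single plane wave in a constant potential V0,
# where H psi = (|g|^2 / 2 + V0) psi holds exactly.
psi_g = np.zeros((n, n, n), dtype=complex)
psi_g[1, 2, 0] = 1.0
v0 = 0.3
h_psi = apply_hamiltonian(psi_g, np.full((n, n, n), v0), k2)
expected = (0.5 * k2[1, 2, 0] + v0) * psi_g
print(np.allclose(h_psi, expected))
```

Each application costs two 3d FFTs plus pointwise multiplications, which is where the $N \log N$ scaling of the mat-vec comes from. An iterative eigensolver (CG-based or Davidson) would call such a routine repeatedly instead of storing the dense matrix.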