Development of GPU-accelerated MIKE 21 solver

GRØN DYST 2012 Technical University of Denmark

Paper
Authors: Peter J. Dinesen Pedersen (DHI – Water Environment Health, Technical University of Denmark, Denmark)
Peter Edward Aackermann (DTU Informatics, Technical University of Denmark, Denmark)
Date: 2012-06-22     Track: Main     Session: 1

ABSTRACT The focus on environment and climate is higher than ever. Changes in rivers and coastal areas in particular have a crucial impact on people's lives all over the world. In 2004 the world saw the devastating effect of a tsunami that killed over 230,000 people in fourteen countries bordering the Indian Ocean. The tsunami had waves up to 30 meters high, and the world donated more than US$14 billion in humanitarian aid. Efficient tools for predicting such events have the potential to save hundreds of thousands of lives and billions of dollars, and are therefore essential in modern society.
DHI has developed a 2D free-surface flow numerical engine called MIKE 21 HD (hydrodynamic module) to simulate water movement in lakes, estuaries, bays, coastal areas and seas, driven by rain, tidal variation, wind, etc. Its applications include the prediction of tidal hydraulics, wind- and wave-generated currents, storm surges, waves in harbours, dam breaks and tsunamis. MIKE 21 HD uses a set of hyperbolic partial differential equations that describe the flow below a pressure surface in a fluid. These equations are solved numerically on a rectangular grid using a finite difference method (FDM), with the Alternating Direction Implicit (ADI) method as the solution scheme.
The simulation speed of the program is very important because it determines how large and how many problems can be solved in a given amount of time. Sometimes hundreds of simulations have to be run to gain an accurate understanding of the changes in a given area. Improving the computing speed therefore has the potential to widen the kind and size of optimization problems to which MIKE 21 HD is applicable, and thereby to open new market segments for DHI. The project is thus highly relevant, and a satisfactory result will certainly be applied by DHI.
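To illustrate why an ADI scheme leads to one tridiagonal system per grid line, the following is a minimal sketch on a much simpler model problem: the 2D diffusion equation u_t = u_xx + u_yy with zero boundary values, advanced with the classic Peaceman-Rachford ADI step. This is not the MIKE 21 HD scheme or its equations; the function names (`thomas`, `adi_step`, `run_adi`), the grid size and the time step are all assumptions chosen for illustration.

```python
import math


def thomas(a, b, c, d):
    """Solve a tridiagonal system (a: sub-, b: main, c: super-diagonal).

    Convention assumed here: a[0] == 0 and c[-1] == 0.
    """
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                      # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def adi_step(u, r):
    """One Peaceman-Rachford step; u holds interior values, r = dt / (2 h^2)."""
    n = len(u)
    a = [0.0] + [-r] * (n - 1)
    b = [1.0 + 2.0 * r] * n
    c = [-r] * (n - 1) + [0.0]

    def get(v, i, j):
        # Values outside the interior are the zero boundary condition.
        return v[i][j] if 0 <= i < n and 0 <= j < n else 0.0

    # Half step 1: implicit along i, explicit along j.
    # Each j gives an independent tridiagonal system (one per grid line).
    half = [[0.0] * n for _ in range(n)]
    for j in range(n):
        rhs = [u[i][j] + r * (get(u, i, j - 1) - 2.0 * u[i][j] + get(u, i, j + 1))
               for i in range(n)]
        line = thomas(a, b, c, rhs)
        for i in range(n):
            half[i][j] = line[i]

    # Half step 2: implicit along j, explicit along i.
    new = [[0.0] * n for _ in range(n)]
    for i in range(n):
        rhs = [half[i][j]
               + r * (get(half, i - 1, j) - 2.0 * half[i][j] + get(half, i + 1, j))
               for j in range(n)]
        new[i] = thomas(a, b, c, rhs)
    return new


def run_adi(n=15, dt=0.01, steps=10):
    """Advance sin(pi x) sin(pi y) on the unit square by `steps` ADI steps."""
    h = 1.0 / (n + 1)
    r = dt / (2.0 * h * h)
    u = [[math.sin(math.pi * (i + 1) * h) * math.sin(math.pi * (j + 1) * h)
          for j in range(n)] for i in range(n)]
    for _ in range(steps):
        u = adi_step(u, r)
    return u


if __name__ == "__main__":
    u = run_adi()
    print("centre value after 10 steps:", u[7][7])
    print("continuous solution        :", math.exp(-2.0 * math.pi ** 2 * 0.1))
```

In each half step the loop over grid lines has no dependencies between iterations, which is the property a GPU implementation can exploit: every line's tridiagonal system can be solved concurrently.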
For this reason the project focuses on improving the simulation speed while maintaining the accuracy of the program. This is done by porting the program to run on a graphics processing unit (GPU) to exploit the massive parallelism of the architecture. Doing so requires a parallel solution scheme, and different parallel algorithms must be developed and tested in order to utilize the potential of the GPU. The technology for this is relatively new: in 2006 NVIDIA introduced CUDA for its CUDA-enabled GPUs as the world's first solution for general-purpose computing on GPUs. Very little research has therefore been done in this domain, which makes the project highly innovative.
Throughout the project it has become clear that two different approaches are beneficial, depending on the size of the grid. For small grids there is less work for the hardware to perform, so it is important to utilize the fast internal memory of the GPU to improve performance. This also means that a new algorithm has to be used to solve the tridiagonal systems for each line in the grid. We found a modified version of parallel cyclic reduction to be beneficial for small grids and a modified version of the more common Thomas algorithm for larger grids. The different approaches resulted in an approximately 25x speedup on the GPU compared to the CPU, independently of the problem size. A simulation that previously took a day can now be performed in less than an hour (24 h / 25 ≈ 58 minutes). This is a very significant improvement and a very satisfying result. Future work involves implementing the developed solution schemes in MIKE 21 HD so that the performance gain can be utilized in practice.
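The two tridiagonal solvers named in the abstract can be sketched in their textbook forms as follows. These are not the modified GPU versions developed in the project, whose details are not given here; the function names and the test system are assumptions for illustration, and the code is plain serial Python that only mirrors the data dependencies. Both functions assume a diagonally dominant system (no pivoting) with `a[0] == 0` and `c[-1] == 0`.

```python
def thomas(a, b, c, d):
    """Textbook Thomas algorithm: O(n) work, but each step needs the previous one."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                      # sequential forward sweep
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):             # sequential backward sweep
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def pcr(a, b, c, d):
    """Textbook parallel cyclic reduction: about log2(n) steps.

    Within one step every equation i is updated independently from the
    previous step's values, so each i could be handled by its own thread.
    """
    n = len(d)
    a, b, c, d = list(a), list(b), list(c), list(d)
    s = 1                                      # current coupling distance
    while s < n:
        na, nb, nc, nd = [0.0] * n, list(b), [0.0] * n, list(d)
        for i in range(n):
            if i - s >= 0:                     # eliminate x[i - s]
                alpha = -a[i] / b[i - s]
                na[i] = alpha * a[i - s]
                nb[i] += alpha * c[i - s]
                nd[i] += alpha * d[i - s]
            if i + s < n:                      # eliminate x[i + s]
                gamma = -c[i] / b[i + s]
                nc[i] = gamma * c[i + s]
                nb[i] += gamma * a[i + s]
                nd[i] += gamma * d[i + s]
        a, b, c, d = na, nb, nc, nd
        s *= 2
    # All couplings now reach outside the system: equations are decoupled.
    return [d[i] / b[i] for i in range(n)]


def tridiag_matvec(a, b, c, x):
    """Multiply the tridiagonal matrix (a, b, c) by the vector x."""
    n = len(x)
    return [(a[i] * x[i - 1] if i > 0 else 0.0)
            + b[i] * x[i]
            + (c[i] * x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


if __name__ == "__main__":
    n = 7
    a = [0.0] + [-1.0] * (n - 1)
    b = [4.0] * n
    c = [-1.0] * (n - 1) + [0.0]
    x_true = [float(i + 1) for i in range(n)]
    d = tridiag_matvec(a, b, c, x_true)
    print("Thomas:", thomas(a, b, c, d))
    print("PCR   :", pcr(a, b, c, d))
```

The trade-off visible in the sketch is that the Thomas algorithm does less total work but is sequential within a line, so parallelism has to come from solving many lines at once, whereas parallel cyclic reduction does more work but exposes parallelism inside each line. This is consistent with the finding above that the latter pays off on small grids, where there are fewer lines to keep the hardware busy.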