Automatic Multi-GPU Code Generation applied to Simulation of Electrical Machines

📝 Original Info

  • Title: Automatic Multi-GPU Code Generation applied to Simulation of Electrical Machines
  • ArXiv ID: 1107.0538
  • Date: 2011-07-05
  • Authors: Antonio Wendell De Oliveira Rodrigues (INRIA Lille - Nord Europe), Frédéric Guyomarc’h (INRIA Lille - Nord Europe), Jean-Luc Dekeyser (INRIA Lille - Nord Europe), Yvonnick Le Menach (L2EP)

📝 Abstract

Electrical and electronic engineering has long used parallel programming to solve large-scale, complex problems for performance reasons. However, because parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. To reduce design complexity, we propose an approach that generates code for hybrid architectures (e.g. CPU + GPU) using OpenCL, an open standard for parallel programming of heterogeneous systems. The approach is based on Model Driven Engineering (MDE) and the MARTE profile, a standard proposed by the Object Management Group (OMG). The aim is to give non-specialists in parallel programming the resources to implement their applications. Moreover, thanks to model reuse, functionality or the target architecture can be added or changed. Consequently, this approach helps industry meet time-to-market constraints, and experimental tests confirm performance improvements in multi-GPU environments.

📄 Full Content

Automatic Multi-GPU Code Generation applied to Simulation of Electrical Machines

A. Wendell O. Rodrigues, Frédéric Guyomarc’h and Jean-Luc Dekeyser
LIFL - USTL :: INRIA Lille Nord Europe - 59650 Villeneuve d’Ascq - France
{wendell.rodrigues,frederic.guyomarch,jean-luc.dekeyser}@inria.fr

Yvonnick Le Menach
L2EP - USTL Cité Scientifique Bat.P2 - 59655 Villeneuve d’Ascq - France
yvonnick.le-menach@univ-lille1.fr

arXiv:1107.0538v1 [cs.DC] 4 Jul 2011

Abstract: Electrical and electronic engineering has long used parallel programming to solve large-scale, complex problems for performance reasons. However, because parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. To reduce design complexity, we propose an approach that generates code for hybrid architectures (e.g. CPU + GPU) using OpenCL, an open standard for parallel programming of heterogeneous systems. The approach is based on Model Driven Engineering (MDE) and the MARTE profile, a standard proposed by the Object Management Group (OMG). The aim is to give non-specialists in parallel programming the resources to implement their applications. Moreover, thanks to model reuse, functionality or the target architecture can be added or changed. Consequently, this approach helps industry meet time-to-market constraints, and experimental tests confirm performance improvements in multi-GPU environments.

I. INTRODUCTION

Numerical computing methods are essential in many scientific and industrial areas. Nevertheless, because of time constraints, the communities in those areas are obliged to use parallel platforms to speed up their computations. Many architectures are suitable for parallelizing scientific algorithms. Hybrid architectures based on a CPU and other devices (e.g. GPUs) are popular for economic reasons (i.e. price and energy consumption) and for their good performance. However, creating applications on these architectures is an arduous task for non-specialists in parallel programming. This paper presents an approach that addresses:

  1) a design methodology based on MDE to generate application code automatically;
  2) the exploitation of high-performance multi-GPU platforms, validated by a case study.

II. BACKGROUND

A Graphics Processing Unit (GPU) is the many-core processor attached to a graphics card. Although it already has many cores, its parallelism continues to scale with Moore’s law, so application software must be developed to scale its parallelism transparently. Proposals such as OpenCL have been designed to overcome this challenge. The Khronos Group released OpenCL [1] as a standard for parallel computing consisting of a language (an extension of C), an API, libraries and a runtime system. OpenCL is based on a platform model that divides a system into one host and one or several compute devices. Compute devices (e.g. GPUs) act as co-processors to the host (e.g. a CPU). An OpenCL application runs on the host, which sends instructions, defined in special functions called kernels, to the device. A single host can have multiple devices, and OpenCL provides contexts and command queues to manage the tasks launched on all attached devices.

III. APPLICATION DESIGN AND CODE GENERATION

A. MDE and MARTE

Model Driven Engineering (MDE) [2] aims to raise the level of abstraction in program specification and to increase automation in program development. The UML profile for MARTE [3] extends the possibilities for modeling the application, the architecture and the relations between them. MARTE defines foundations for the model-based description of real-time and embedded systems.

B. Model Transformation Chain

In MDE, a model transformation is a compilation process that transforms a source model into a target model. It makes it possible to add, modify and transform model elements in order to reach a final model that is closer to the real application program. For instance, the last model has explicit information about variables and task scheduling. An overview of the tools used in the model-to-model and model-to-text processes is given in [2]. Additionally, we have used the Gaspard2 [4] framework as the engine to chain and encapsulate these transformations.

IV. CASE STUDY

The conjugate gradient (CG) method [5] is often used in the modeling and simulation of electrical systems. It should only be applied to systems that are symmetric (or Hermitian) positive definite, and it remains the method of choice for that case. The input data come from a finite element method (FEM) model of an electrical machine. The matrix is stored in Compressed Sparse Row (CSR) format, with N = 132651 and NNZ = 3442951 nonzero entries. The CG algorithm is modeled in MARTE as presented in Figure 2, where data reading and initial configuration are defined by stereotyped blocks. Highlighted gray blocks represent tasks, which are

Fig. 1. Conjugate Gradient UML/MARTE Model
Fig. 2. UML/MARTE Model for Setup and CG Overview
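To make the case study concrete, the following is a minimal pure-Python sketch of an unpreconditioned conjugate gradient solver over a CSR matrix. It is not the OpenCL code generated by the authors' tool chain; the function names (`csr_matvec`, `dot`, `conjugate_gradient`), the tolerance and the tiny 3x3 test matrix are all assumptions made for illustration.

```python
from math import sqrt


def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A stored in CSR form.

    Row i owns the entries values[row_ptr[i]:row_ptr[i + 1]],
    whose column positions are given by col_idx at the same offsets.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def dot(u, v):
    """Inner product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite CSR matrix A.

    Returns the approximate solution and the number of iterations used.
    tol and max_iter are illustrative defaults, not values from the paper.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual r = b - A x, with the initial guess x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for it in range(max_iter):
        if sqrt(rs_old) <= tol:
            return x, it
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


# Hypothetical 3x3 symmetric positive definite system (not data from the paper):
# A = [[4, 1, 0], [1, 3, 1], [0, 1, 2]], b = A @ [1, 2, 3]
values = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
b = [6.0, 10.0, 8.0]
solution, iterations = conjugate_gradient(values, col_idx, row_ptr, b)
```

In `csr_matvec` each row is computed independently of the others, which is why the sparse matrix-vector product is usually the step that gets mapped onto parallel devices. How the generated code actually distributes rows across several GPUs is not described in this excerpt.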

Reference

This content is AI-processed based on open access ArXiv data.
