Title: Automatic Multi-GPU Code Generation applied to Simulation of Electrical Machines
ArXiv ID: 1107.0538
Date: 2011-07-05
Authors: Antonio Wendell De Oliveira Rodrigues (INRIA Lille - Nord Europe), Frédéric Guyomarc’h (INRIA Lille - Nord Europe), Jean-Luc Dekeyser (INRIA Lille - Nord Europe), Yvonnick Le Menach (L2EP)
📝 Abstract
Electrical and electronic engineering has used parallel programming to solve its large-scale complex problems for performance reasons. However, as parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. Thus, in order to reduce design complexity, we propose an approach to generate code for hybrid architectures (e.g. CPU + GPU) using OpenCL, an open standard for parallel programming of heterogeneous systems. This approach is based on Model Driven Engineering (MDE) and the MARTE profile, a standard proposed by the Object Management Group (OMG). The aim is to give non-specialists in parallel programming the resources to implement their applications. Moreover, thanks to model reuse, functionalities or the target architecture can be added or changed. Consequently, this approach helps industries to meet their time-to-market constraints, and experimental tests confirm performance improvements in multi-GPU environments.
📄 Full Content
Automatic Multi-GPU Code Generation applied to
Simulation of Electrical Machines
A. Wendell O. Rodrigues, Frédéric Guyomarc’h
and Jean-Luc Dekeyser
LIFL - USTL :: INRIA Lille Nord Europe - 59650
Villeneuve d’Ascq - France
{wendell.rodrigues,frederic.guyomarch,jean-luc.dekeyser}@inria.fr
Yvonnick Le Menach
L2EP - USTL
Cité Scientifique Bât. P2 - 59655
Villeneuve d’Ascq - France
yvonnick.le-menach@univ-lille1.fr
I. INTRODUCTION
Numerical computing methods are essential in many scientific
and industrial areas. Nevertheless, due to time constraints,
practitioners in those areas are obliged to use parallel
platforms to speed up their computations. Many architectures
are suitable for parallelizing scientific algorithms. Hybrid
architectures based on a CPU and other devices (e.g. GPUs)
are popular for economic reasons (i.e. price and energy
consumption) and for their good performance. However, creating
applications for these architectures is an arduous task for
non-specialists in parallel programming. This paper presents an
approach that addresses:
1) a design methodology based on MDE to generate application
code automatically;
2) the exploitation of higher performance on multi-GPU
platforms, validated by a case study.
II. BACKGROUND
A Graphics Processing Unit (GPU) is a many-core processor
attached to a graphics card. It already has many cores, and
its parallelism continues to scale with Moore’s law. It is
therefore necessary to develop application software that
transparently scales its parallelism. Proposals such as
OpenCL have been designed to overcome this challenge. The
Khronos Group released OpenCL [1] as a standard for parallel
computing consisting of a language (an extension of C), an
API, libraries and a runtime system. OpenCL is based on a
platform model that divides a system into one host and one or
several compute devices. Compute devices (e.g. GPUs) act as
coprocessors to the host (e.g. a CPU). An OpenCL application
is executed on the host, which sends instructions, defined in
special functions called kernels, to the devices.
Additionally, a single host can have multiple devices. OpenCL
allows contexts and queues to be created in order to manage
the tasks launched on all attached devices.
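The host/device/queue relationship described above can be illustrated with a small pure-Python mock. This is only a conceptual sketch, not the OpenCL API: the names Device, enqueue, finish, split_range and run_on_devices are hypothetical, and the even split of the index range across devices is an assumption made for illustration rather than a scheme taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Device:
    """Hypothetical stand-in for a compute device with one command queue."""
    name: str
    queue: List[Tuple[Callable, int, int, tuple]] = field(default_factory=list)

    def enqueue(self, kernel, start, stop, *args):
        # The host only records the command; nothing runs yet.
        self.queue.append((kernel, start, stop, args))

    def finish(self):
        # Run every queued kernel once per work-item index, then empty the queue.
        for kernel, start, stop, args in self.queue:
            for gid in range(start, stop):
                kernel(gid, *args)
        self.queue.clear()


def axpy_kernel(gid, alpha, x, y):
    # Kernel body executed for a single work-item: y[gid] += alpha * x[gid].
    y[gid] += alpha * x[gid]


def split_range(n, parts):
    # Split [0, n) into `parts` contiguous chunks of nearly equal size.
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def run_on_devices(devices, kernel, n, *args):
    # Host side: enqueue one chunk per device, then wait for all devices.
    for dev, (start, stop) in zip(devices, split_range(n, len(devices))):
        dev.enqueue(kernel, start, stop, *args)
    for dev in devices:
        dev.finish()
```

In this mock the devices run one after another; on real hardware the queues would be processed concurrently, which is where the multi-GPU speed-up would come from.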
III. APPLICATION DESIGN AND CODE GENERATION
A. MDE and MARTE
Model Driven Engineering (MDE) [2] aims to raise the
level of abstraction in program specification and to increase
automation in program development. The UML profile for
MARTE [3] extends the possibilities for modeling applications
and architectures and the relations between them. MARTE
defines the foundations for the model-based description of
real-time and embedded systems.
B. Model Transformation Chain
In MDE, a model transformation is a compilation process
that transforms a source model into a target model. It
allows model elements to be added, modified or transformed
in order to obtain a final model closer to the real program.
For instance, the last model contains explicit information
about variables and task scheduling. An overview of the tools
used in the model-to-model and model-to-text processes is
given in [2]. Additionally, we have used the Gaspard2 [4]
framework as the engine that chains and encapsulates these
transformations.
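To make the idea of a transformation chain concrete, the following toy sketch applies two model-to-model steps (declaring variables, then scheduling tasks) followed by one model-to-text step. It does not reproduce Gaspard2 or MARTE: the dictionary-based model, the function names and the emitted pseudo-code are all hypothetical and chosen only for illustration.

```python
import copy


def declare_variables(model):
    # Model-to-model step: make every port an explicit, named variable.
    out = copy.deepcopy(model)
    names = []
    for task in out["tasks"]:
        for port in task["inputs"] + task["outputs"]:
            if port not in names:
                names.append(port)
    out["variables"] = names
    return out


def schedule_tasks(model):
    # Model-to-model step: order tasks so that producers precede consumers.
    # Assumes no task reads a variable that it also produces.
    out = copy.deepcopy(model)
    produced_by = {p: t["name"] for t in out["tasks"] for p in t["outputs"]}
    pending, done, order = list(out["tasks"]), set(), []
    while pending:
        ready = [t for t in pending
                 if all(p not in produced_by or produced_by[p] in done
                        for p in t["inputs"])]
        if not ready:
            raise ValueError("cyclic dependency between tasks")
        for t in ready:
            order.append(t)
            done.add(t["name"])
            pending.remove(t)
    out["tasks"] = order
    return out


def generate_text(model):
    # Model-to-text step: emit one declaration per variable, one call per task.
    lines = [f"declare {v}" for v in model["variables"]]
    for t in model["tasks"]:
        outs = ", ".join(t["outputs"])
        ins = ", ".join(t["inputs"])
        lines.append(f"{outs} = {t['name']}({ins})")
    return "\n".join(lines)


def run_chain(model, transformations):
    # Apply each transformation to the result of the previous one.
    for transform in transformations:
        model = transform(model)
    return model
```

A call such as `run_chain(model, [declare_variables, schedule_tasks, generate_text])` returns the generated text; each intermediate model stays inspectable because every step works on a copy.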
IV. CASE STUDY
The conjugate gradient (CG) method [5] is often used in the
modeling and simulation of electrical systems. It should only
be applied to systems that are symmetric or Hermitian positive
definite, and it remains the method of choice for such systems.
The input data result from a finite element method (FEM) model
of an electrical machine. The matrix is stored in Compressed
Sparse Row (CSR) format, with N=132651 and NNZ=3442951. The CG
algorithm is modeled in MARTE as presented in Fig. 2, where
data reading and initial configurations are defined by stereotyped
blocks. Highlighted gray blocks represent tasks, which are
arXiv:1107.0538v1 [cs.DC] 4 Jul 2011
Fig. 1. Conjugate Gradient UML/MARTE Model
Fig. 2. UML/MARTE Model for Setup and CG Overview
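As a point of reference for the case study, the sketch below shows the textbook conjugate gradient iteration on a matrix stored in CSR format. It is a sequential, pure-Python illustration and not the generated OpenCL code; the function names, the zero initial guess and the relative-residual stopping test are assumptions made here.

```python
def csr_matvec(n, row_ptr, col_idx, values, x):
    # y = A * x, where row i of A occupies values[row_ptr[i]:row_ptr[i + 1]].
    y = [0.0] * n
    for i in range(n):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y[i] = s
    return y


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(n, row_ptr, col_idx, values, b, tol=1e-10, max_iter=1000):
    # Solves A x = b for a symmetric positive definite A, starting from x = 0.
    x = [0.0] * n
    r = list(b)            # residual b - A x equals b for the zero guess
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    b_norm2 = dot(b, b)
    iterations = 0
    while iterations < max_iter and rs_old > tol * tol * b_norm2:
        ap = csr_matvec(n, row_ptr, col_idx, values, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
        iterations += 1
    return x, iterations
```

Each iteration is dominated by one sparse matrix-vector product plus a few dot products and vector updates, which are the operations that would typically be mapped to kernels on the devices.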