BABOULIN Marc
Faculty habilitation
Group : Parallelism
Fast and reliable solutions for numerical linear algebra solvers in high-performance computing.
Starts on 05/12/2012
Advisor :
Funding :
Affiliation : Université Paris-Saclay
Laboratory : LRI
Defended on 05/12/2012, committee :
Jean-Claude Bajard, Université Pierre et Marie Curie (examiner)
Philippe Dague, Université Paris-Sud (examiner)
Frederic Desprez, Inria/Ecole Normale Supérieure de Lyon (reviewer)
Jack Dongarra, University of Tennessee, USA (examiner)
Serge Gratton, ENSEEIHT and CERFACS, Toulouse (invited member)
Philippe Langlois, Université de Perpignan (reviewer)
Jose Roman, Universitat Politecnica de Valencia, Spain (reviewer)
Brigitte Rozoy, Université Paris-Sud (examiner)
Research activities :
Abstract :
Recent years have seen an increase in peak "local" speed through
parallelism in the form of multicore processors and GPU accelerators. At
the same time, the cost of communication between levels of the memory
hierarchy and/or between processors has become a major bottleneck for
most linear algebra algorithms.
We first explain how hybrid multicore+GPU systems can be used
efficiently to enhance the performance of linear algebra libraries. We
illustrate this approach by considering hybrid factorizations where we
split the computation between a multicore processor and a graphics
processor and where the amount of communication is significantly reduced.
Next we describe a class of randomized algorithms that accelerate the
factorization of general or symmetric indefinite systems on multicore
or hybrid multicore+GPU systems. Randomization avoids the
communication overhead due to pivoting, is computationally inexpensive
and requires very little storage. The resulting solvers outperform
existing routines while providing satisfactory accuracy.
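As an illustration of how randomization can replace pivoting, here is a minimal NumPy sketch. It assumes the random transform is a depth-one "butterfly" matrix applied on both sides of A; the abstract does not specify the transform, its depth, or the distribution of its entries, so the butterfly form, the function names, and the choice of random diagonal values below are all assumptions made for this example. A practical solver would apply the structured transform without forming dense matrices and would typically use a deeper recursion.

```python
import numpy as np

def random_butterfly(n, rng):
    """Depth-one random butterfly matrix of even order n (assumed form)."""
    h = n // 2
    # Random diagonal entries close to 1 (an illustrative choice) keep the
    # transform itself well conditioned.
    r0 = np.exp(rng.uniform(-0.5, 0.5, h) / 10.0)
    r1 = np.exp(rng.uniform(-0.5, 0.5, h) / 10.0)
    top = np.hstack([np.diag(r0), np.diag(r1)])
    bottom = np.hstack([np.diag(r0), -np.diag(r1)])
    return np.vstack([top, bottom]) / np.sqrt(2.0)

def lu_no_pivoting(a):
    """Gaussian elimination with no row exchanges; returns (L, U)."""
    a = np.array(a, dtype=float)
    n = a.shape[0]
    for k in range(n - 1):
        a[k + 1:, k] /= a[k, k]  # multipliers; breaks down if a[k, k] == 0
        a[k + 1:, k + 1:] -= np.outer(a[k + 1:, k], a[k, k + 1:])
    return np.tril(a, -1) + np.eye(n), np.triu(a)

def solve_randomized(a, b, rng):
    """Solve A x = b by factorizing U^T A V without pivoting."""
    n = a.shape[0]
    u = random_butterfly(n, rng)
    v = random_butterfly(n, rng)
    a_r = u.T @ a @ v                      # randomized matrix
    lower, upper = lu_no_pivoting(a_r)
    # np.linalg.solve is used for the triangular solves only for brevity.
    y = np.linalg.solve(upper, np.linalg.solve(lower, u.T @ b))
    return v @ y                           # map back: x = V y

# A matrix whose leading entry is zero, so unpivoted LU on A itself fails.
m = np.array([[5.0, 1.0], [1.0, 5.0]])
z = np.zeros((2, 2))
a = np.block([[z, m], [m, z]])
b = np.array([1.0, 2.0, 3.0, 4.0])
x = solve_randomized(a, b, np.random.default_rng(0))
```

In this small example the transformed matrix is block diagonal with diagonally dominant blocks, so elimination proceeds without row exchanges. For a general matrix, a single butterfly level is not guaranteed to remove zero or tiny pivots.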
Finally we present numerical tools that enable us to assess the
quality of the computed solution of overdetermined linear least
squares problems. Our method is based on deriving exact values or
statistical estimates for the condition number of these problems. We
describe algorithms and software to compute these quantities using HPC
libraries.
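To make the notion of a least squares condition number concrete, the sketch below evaluates one commonly cited closed form. It assumes that A has full column rank and that perturbations of the pair (A, b) are measured in the Frobenius norm, which gives kappa = ||A^+||_2 * sqrt(||A^+||_2^2 ||r||_2^2 + ||x||_2^2 + 1) with r = b - A x. The abstract does not say which norm or formula is used, so this choice and the function name are assumptions for this example.

```python
import numpy as np

def lls_condition_number(a, b):
    """Absolute condition number of min ||A x - b||_2 (assumed formula)."""
    x, _, rank, sv = np.linalg.lstsq(a, b, rcond=None)
    if rank < a.shape[1]:
        raise ValueError("A must have full column rank")
    r = b - a @ x                 # residual at the solution
    pinv_norm = 1.0 / sv[-1]      # ||A^+||_2 = 1 / smallest singular value
    return pinv_norm * np.sqrt(pinv_norm**2 * (r @ r) + x @ x + 1.0)

# Small example: x = (3, 4), r = (0, 0, 2), ||A^+||_2 = 1.
a = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
b = np.array([3.0, 4.0, 2.0])
kappa = lls_condition_number(a, b)   # sqrt(4 + 25 + 1) = sqrt(30)
```

The residual term grows with the fourth power of 1 / sigma_min once the square root is expanded, which is why problems with a large residual can be much more sensitive than the condition number of A alone suggests.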