Ph.D
Group : Parallelism
Methods and algorithms for solving linear systems of equations on massively parallel computers
Starts on 09/01/2008
Advisor : GRIGORI, Laura
Funding : Fixed-term contract (CDD) funded by INRIA
Affiliation : Université Paris-Saclay
Laboratory : LRI
Defended on 07/03/2012, committee :
Examiners:
Yannis Manoussakis, Professor, Université de Paris 11 Orsay, France
Jack Dongarra, Professor, University of Tennessee, USA
Thomas Guignon, Engineer, IFP, France
Reviewers:
Raymond Namyst, Professor, LaBRI, Université de Bordeaux I, France
Frédéric Desprez, Research Director, LIP, ENS Lyon, France
Thesis advisor:
Laura Grigori, Research Scientist, Université de Paris 11 Orsay, France
Abstract :
Multicore processors are now considered the future of computing, and they will have an important impact on scientific computing. In this thesis, we study methods and algorithms for efficiently solving large sparse and dense linear systems on future petascale machines. Because the cost of communication is growing relative to the time processors take to perform arithmetic operations, our approach embraces the communication-avoiding principle: it performs some redundant computation in exchange for less communication, and it uses several adaptations to achieve better performance on multicore machines.
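The communication-avoiding principle described above can be illustrated with a TSQR-style reduction for the QR factorization of a tall-skinny matrix: each block is factored locally with no communication, and only the small R factors are combined in a final reduction. This is a minimal sketch in NumPy (a flat reduction tree for brevity, not the thesis implementation):

```python
import numpy as np

def tsqr_r_factor(A, nblocks=4):
    """Communication-avoiding QR sketch (TSQR idea):
    factor each row block independently, then combine
    only the small R factors in a single reduction step."""
    blocks = np.array_split(A, nblocks, axis=0)
    # Local QR factorizations: no data exchange between blocks.
    local_Rs = [np.linalg.qr(B)[1] for B in blocks]
    # Reduction: stack the small R factors and factor once more.
    # Only these small triangles would be communicated in parallel.
    R = np.linalg.qr(np.vstack(local_Rs))[1]
    return R
```

For a full-column-rank matrix, the R factor obtained this way agrees with the R factor of a direct QR factorization up to the signs of its rows, which is why the redundant local factorizations do not change the result.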
We decompose the problem into several phases that are then designed or optimized separately. In the first part, we present an algorithm based on hypergraph partitioning that considerably reduces the fill-in incurred in the LU factorization of sparse unsymmetric matrices. In the second part, we present two communication-avoiding algorithms for computing the LU and QR factorizations that are adapted to multicore environments. The main contribution of this part is to reorganize the computations so as to reduce bus contention and use resources efficiently. We then extend this work to clusters of multicore processors. In the third part, we present a scheduling and optimization strategy that uses hybrid static/dynamic scheduling of the task dependency graph. This strategy balances data locality and load balancing, reduces scheduling overhead, and can adapt to dynamic changes in the system.
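The hybrid static/dynamic strategy of the third part can be sketched as follows: each worker first runs the tasks assigned to it statically (favoring data locality and low overhead), then drains a shared queue of remaining tasks (restoring load balance when some workers finish early). The task lists and function names here are illustrative, not the thesis scheduler:

```python
import threading
import queue

def run_hybrid(static_parts, dynamic_tasks, nthreads=2):
    """Hybrid static/dynamic scheduling sketch: a static phase
    per worker followed by a shared dynamic work queue."""
    shared = queue.Queue()
    for task in dynamic_tasks:
        shared.put(task)

    results = []
    lock = threading.Lock()

    def worker(my_tasks):
        # Static phase: tasks pre-assigned to this worker.
        for task in my_tasks:
            r = task()
            with lock:
                results.append(r)
        # Dynamic phase: pull leftover tasks until the queue is empty.
        while True:
            try:
                task = shared.get_nowait()
            except queue.Empty:
                break
            r = task()
            with lock:
                results.append(r)

    threads = [threading.Thread(target=worker, args=(static_parts[i],))
               for i in range(nthreads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results
```

Tuning the split between the static and dynamic portions is the key design choice: a larger static share improves locality and reduces queue contention, while a larger dynamic share absorbs load imbalance and runtime variability.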
Our results, obtained on several architectures, show that all our algorithms are efficient and lead to significant performance gains: we achieve improvements of 30% to 110% over the corresponding routines in well-known libraries.