Laboratoire de Recherche en Informatique

Séminaire d'équipe(s) Parallel Systems

New algorithms for the LU factorization and the generation of random orthogonal matrices
Amal Khabou

11 February 2014, 10:30
Salle/Bat : 465/PCRI-N
Contact :

Activités de recherche : High-performance computing

Résumé :

We consider two problems in numerical linear algebra. The first problem is the design of the block LU factorization with panel rank revealing pivoting (block LU PRRP), an algorithm based on strong rank revealing QR for the panel factorization. Block LU PRRP is more stable than Gaussian elimination with partial pivoting (GEPP), with a theoretical upper bound of the growth factor of (1 + τ b)(n/b)−1 , where b is the size of the panel used during the block factorization, τ is a parameter of the strong rank revealing QR factorization, and n is the number of columns of the matrix. Our numerical experiments show
that the new factorization scheme is as numerically stable as GEPP in practice, but it is more resistant to some pathological cases where GEPP fails. The block LU PRRP factorization performs only O(n2 b) additional floating point operations compared to GEPP. We also introduce a communication avoiding version of this algorithm. (joint work with James Demmel, Laura Grigori, and MingGu )
The second problem is the generation of random orthogonal matrices which have a wide variety of applications. The natural distribution over the space of orthogonal matrices is the Haar distribution. One way to generate a random orthogonal matrix from the Haar distribution is to generate a random matrix A with elements from the standard normal distribution and compute its QR factorization A = QR, where R is chosen to have nonnegative diagonal elements. Stewart develops a more efficient algorithm that directly generates an n × n orthogonal matrix from the Haar distribution as a product of Householder
transformations built from Householder vectors of dimensions 1, 2, . . . , n−1 cho- sen from the standard normal distribution. This algorithm requires m3 +n3 flops to form A. Here we present an algorithm that significantly reduces the computational cost of Stewart’s algorithm by giving up the property that Q is exactly Haar distributed. We also give some performance results showing the benefits of the new algorithm. (joint work with Nicholas Higham and Fran ̧coise Tisseur)

Pour en savoir plus :