Theses and habilitations (Scientific output) - Laboratoire de Recherche en Informatique

Habilitation à diriger des recherches (HDR, habilitation to supervise research)
Team: Intelligence Artificielle et Systèmes d'Inférence (Artificial Intelligence and Inference Systems)

Knowledge Representation meets Databases for the sake of ontology-based data management

Started on 11 July 2012
Supervision:

Doctoral school: ED STIC 580
Institution of registration: Université Paris-Saclay

Venue: Univ. Paris-Sud

Defended on 11 July 2012 before a jury composed of:

Serge Abiteboul, Research Director, INRIA Saclay Ile-de-France (Examiner)
Alexander Borgida, Professor, Rutgers University (Reviewer)
Michel Chein, Professor, Université de Montpellier (Examiner)
Vassilis Christophides, Professor, ICS-FORTH, Crete (Reviewer)
Philippe Dague, Professor, Université Paris-Sud (Examiner)
Pierre Marquis, Professor, Université de Lens (Reviewer)
Marie-Christine Rousset, Professor, Université de Grenoble (Sponsor)

Research activities:
   - Databases
   - Knowledge representation
   - Automated reasoning

Abstract:
Data management is a longstanding research topic in Knowledge Representation (KR), a prominent discipline of Artificial Intelligence (AI), and, of course, in Databases (DB).
Until the end of the 20th century, there were few interactions between these two research fields concerning data management, essentially because they addressed it from different perspectives. KR investigated data management according to human cognitive schemes for the sake of intelligibility, e.g., using Conceptual Graphs or Description Logics, while DB focused on data management according to simple mathematical structures for the sake of efficiency, e.g., using the relational model or the eXtensible Markup Language (XML).
At the beginning of the 21st century, these ideological stances changed with the new era of ontology-based data management. Roughly speaking, ontology-based data management brings data management one step closer to end-users, especially to those who are not computer scientists or engineers. It revisits the traditional architecture of database management systems by decoupling the models with which data is exposed to end-users from the models with which data is stored. Notably, ontology-based data management advocates the use of conceptual models from KR as human-intelligible front-ends called ontologies, relegating DB models to back-end storage.
The World Wide Web Consortium (W3C) has greatly contributed to ontology-based data management by providing standards for handling data through ontologies: the two Semantic Web data models. The first standard, the Resource Description Framework (RDF), was introduced in 1998. It is a graph data model that comes with a very simple ontology language, RDF Schema, strongly related to description logics. The second standard, the Web Ontology Language (OWL), was introduced in 2004. It is actually a family of well-established description logics with varying expressivity/complexity tradeoffs.
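As a rough illustration of what "a graph data model with a very simple ontology language" can mean in practice, the sketch below stores an RDF graph as a set of (subject, predicate, object) triples and closes it under two RDF Schema entailment rules for rdfs:subClassOf. It is a minimal sketch only: the `saturate` function and all `ex:` resource names are assumptions made up for this example, not something taken from the thesis or from any RDF library.

```python
# Illustrative sketch: an RDF graph as a set of (subject, predicate, object)
# triples. The function name and all ex:... resources are hypothetical.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def saturate(triples):
    """Close a triple set under two RDFS entailment rules:
    subclass transitivity (rdfs11) and type propagation (rdfs9)."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in closed:
            for (s2, p2, o2) in closed:
                # Only combine a triple with a subclass edge leaving its object.
                if p2 != SUBCLASS_OF or o != s2:
                    continue
                if p == SUBCLASS_OF:
                    new.add((s, SUBCLASS_OF, o2))  # rdfs11
                elif p == RDF_TYPE:
                    new.add((s, RDF_TYPE, o2))     # rdfs9
        if not new <= closed:
            closed |= new
            changed = True
    return closed


graph = {
    ("ex:Professor", SUBCLASS_OF, "ex:Researcher"),
    ("ex:Researcher", SUBCLASS_OF, "ex:Person"),
    ("ex:alice", RDF_TYPE, "ex:Professor"),
}
saturated = saturate(graph)
```

With this toy graph, the saturated set also states that ex:alice is an ex:Researcher and an ex:Person, facts that are only implicit in the stored triples. Real systems handle the full RDFS rule set and far larger graphs, typically with indexing or query reformulation rather than this naive fixpoint loop.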
The advent of RDF and OWL rapidly focused the attention of academia and industry on practical ontology-based data management. The research community has taken up this challenge at the highest level, leading to pioneering and compelling contributions in top venues on Artificial Intelligence (e.g., AAAI, ECAI, IJCAI, and KR), on Databases (e.g., ICDT/EDBT, ICDE, SIGMOD/PODS, and VLDB), and on the Web (e.g., ESWC, ISWC, and WWW). In addition, open-source and commercial software providers are releasing an ever-growing number of tools for effective RDF and OWL data management (e.g., Jena, ORACLE 10/11g, OWLIM, Protégé, RDF-3X, and Sesame).
Last but not least, large communities (e.g., library and information science, life science, and medicine) have promptly adopted RDF and OWL data management, sustaining and prompting further efforts towards ever more convenient, efficient, and scalable ontology-based data management techniques.
This HDR thesis reports on some of my contributions to the design, the optimization, and the decentralization of data management techniques for RDF and OWL.