Research activities: Algorithms for large volumes of distributed data
Abstract:
The deep web commonly refers to documents and resources that are not directly accessible through web browsers, crawlers and other types of clients. Data residing behind web forms or web services belong to this class, since one generally needs to provide input values to access them. Querying the deep web raises two difficult issues: (i) whether a query can be answered despite existing access restrictions, and (ii) if so, how to find an efficient plan that automatically chooses the right inputs to produce the answer. Although a query may appear to be unanswerable, reasoning about the access restrictions and existing dependencies may uncover ways to answer it. Our approach tackles both problems jointly by searching for proofs that a query is answerable and building a plan from each of these proofs. Plans are assigned costs, which serve both to select an optimal plan and to guide the proof search. I will present ongoing work on the topic, which has connections with data integration, view-based query answering and web service composition.
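
To make the notion of access restrictions concrete, here is a minimal sketch in Python. It is not the proof-based approach of the talk: it ignores dependencies and costs, and only computes which sources can be reached when every access requires values for certain inputs. The class AccessMethod, the function reachable_relations, and the relations Directory, Profile and Weather are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessMethod:
    """One way to call a source (hypothetical model, not from the talk)."""
    relation: str          # relation exposed by the form or service
    inputs: frozenset      # value domains that must be supplied
    outputs: frozenset     # value domains returned by the call


def reachable_relations(methods, known_domains):
    """Fixpoint: repeatedly use any method whose inputs are all obtainable."""
    known = set(known_domains)   # domains for which we can produce values
    used = set()                 # methods already applied
    changed = True
    while changed:
        changed = False
        for method in methods:
            if method not in used and method.inputs <= known:
                used.add(method)
                known |= method.outputs
                changed = True
    return {method.relation for method in used}


# Illustrative sources only: Directory needs no input, Profile needs a name,
# Weather needs both a city and a date.
methods = [
    AccessMethod("Directory", frozenset(), frozenset({"name"})),
    AccessMethod("Profile", frozenset({"name"}), frozenset({"name", "city"})),
    AccessMethod("Weather", frozenset({"city", "date"}), frozenset({"forecast"})),
]

without_date = reachable_relations(methods, set())
with_date = reachable_relations(methods, {"date"})
```

In this toy setting, Weather cannot be reached unless a date value is supplied up front, while Profile becomes reachable only through the names returned by Directory. A query over Profile alone thus looks unanswerable at first, yet chaining accesses answers it, which is the kind of situation the abstract describes.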