LRI Seminar
Room 445 - Exemplar queries on documents: Finding related information across and within documents by topological information exploitation.
Yannis Velegrakis

24 April 2017, 11h00
Room/Building: 445/PCRI-N

Research activities: Web data management

Abstract:
Finding information related to some data at hand has traditionally been based on content similarity. We claim that terms used in documents should be treated differently depending on their location within the document, i.e., they should have different weights and/or different semantics.
In the first part of the talk, we deal with short documents, in particular forum posts. We describe an approach that finds forum posts related to a post at hand. It does so by treating differently the sections of a post that were written with different intentions in mind. We explain how variations in text features can be exploited to recognise intentions, even though they are not explicitly stated in the text. We then segment the posts according to the identified intentions and compute similarities across segments only if they have the same intention. The similarity between two posts is then computed as a combination of the similarities of their respective segments with matching intentions. We provide evidence that this approach significantly improves on the performance of existing methods for finding related posts. The evidence comes from an extensive set of experiments with real datasets and users.
In the second part of the talk, we deal with full documents, in particular Wikipedia documents. We describe how topological and temporal information can be used to better identify, within a document, information related to a fact at hand. We describe a new structure, called a ladder, that links together parts of the document that are contextually related even if they do not match syntactically.
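The combination step described above can be illustrated with a minimal sketch. It is not the speaker's implementation: the intention labels, the bag-of-words cosine measure, and the weighted average used to combine segment scores are all assumptions made for illustration, and the posts are assumed to be already segmented by intention.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def post_similarity(post_a: dict, post_b: dict, weights: dict = None) -> float:
    """Combine per-intention segment similarities into a single score.

    Each post maps a (hypothetical) intention label to the text of the
    segment written with that intention. Segments are compared only when
    both posts contain a segment with the same intention.
    """
    shared = post_a.keys() & post_b.keys()
    weights = weights or {}
    # Assumed combination rule: weighted average, default weight 1.0.
    total = sum(weights.get(i, 1.0) for i in shared)
    if not total:
        return 0.0
    score = sum(
        weights.get(i, 1.0)
        * cosine(Counter(post_a[i].lower().split()), Counter(post_b[i].lower().split()))
        for i in shared
    )
    return score / total
```

For example, two posts that share only a "problem" segment are scored on that segment alone, and two posts with no intention in common score 0.0.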

About the speaker:
Yannis Velegrakis is a faculty member at the University of Trento, head of the Data Management Group, and coordinator of the EIT Digital MSc program in Trento. He holds a PhD degree in Computer Science from the University of Toronto. His research areas include large-scale data management (Big Data), social data analytics, integration of heterogeneous data, query answering, data quality, and graph data. Prior to joining the University of Trento, he held a researcher position at AT&T Research Labs in the United States. He has also spent time as a visitor at the University of California, Santa Cruz, the IBM Almaden Research Center, and the Center of Advanced Studies of the IBM Toronto Lab. He has been a general chair for VLDB13, and PC chair for WebDB12, DESWEB10/11, SWAE07, SDSW14, and ExploreDB17. He holds 2 US patents and has been a fellow of Marie Curie and of Paris Saclay "Jean d’Alembert".
