In this talk, we present our recent research on top-k query answering in social applications.
This problem requires a significant departure from socially agnostic techniques for information retrieval. One can now exploit the social links in order to obtain more relevant results, valid not only with respect to the keyword query but also with respect to the social context of the user who issued it. We propose a sound and complete algorithm, called TOPKS, which addresses important applicability issues of existing techniques. Moreover, we show that TOPKS is instance optimal when the search relies exclusively on the social weight of the data. To further address the efficiency needs of online applications, for which the exact search, albeit optimal, may still be expensive, we also consider approximate algorithms. These rely on concise statistics about the social network or on approximate shortest-path computations.
As a complementary direction for efficient, online answering, we also consider the materialization and exploitation of previous query results (views). We study social-aware query optimization based on views, presenting algorithms that address two important sub-problems. First, handling the possible differences in context between the various views and an input query leads to view results having uncertain scores, i.e., score ranges valid for the new context. As a consequence, current top-k algorithms are no longer directly applicable and need to be adapted to handle this uncertainty. Second, adapted view selection techniques are needed, which can leverage both the descriptions of queries and statistics over their results.
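To make the first sub-problem concrete, the following is a minimal sketch of how a top-k answer can be read off score ranges rather than exact scores. It is an illustration under stated assumptions, not the algorithm presented in the talk: the function name classify_top_k, the item names, and the interval values are all hypothetical. An item is a guaranteed answer if fewer than k other items could possibly outrank it, and a possible answer if fewer than k other items certainly outrank it.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (lower bound, upper bound) on an item's score.
Interval = Tuple[float, float]


def classify_top_k(ranges: Dict[str, Interval], k: int) -> Tuple[List[str], List[str]]:
    """Split items into guaranteed and possible top-k members.

    Ties are treated as not outranking (strict comparisons), which is an
    assumption of this sketch.
    """
    guaranteed, possible = [], []
    for item, (lo, hi) in ranges.items():
        others = [r for name, r in ranges.items() if name != item]
        # Rivals that certainly outrank `item`: their lower bound beats its upper bound.
        surely_above = sum(1 for other_lo, _ in others if other_lo > hi)
        # Rivals that may outrank `item`: their upper bound beats its lower bound.
        maybe_above = sum(1 for _, other_hi in others if other_hi > lo)
        if maybe_above < k:
            guaranteed.append(item)   # in the top-k whatever the exact scores are
        elif surely_above < k:
            possible.append(item)     # in the top-k for some exact scores only
    return sorted(guaranteed), sorted(possible)


# Made-up score ranges, as could be derived from views computed in another context.
example = {"a": (0.8, 0.9), "b": (0.5, 0.7), "c": (0.4, 0.6), "d": (0.1, 0.2)}
print(classify_top_k(example, k=2))
```

With these made-up ranges and k = 2, item a is guaranteed, items b and c remain possible because their ranges overlap, and item d is excluded. Narrowing the overlapping ranges would require further access to the data, which is the kind of adaptation the uncertainty calls for.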
Extensive experiments on both synthetic and real-world data (from Delicious and Twitter) show that our techniques have the potential to scale and meet the requirements of real applications. They were recently demonstrated in a social-aware search prototype called Taagle.
This is joint work with Silviu Maniu (now at Hong Kong University).