HomeHome

Performance of Multiattribute Top-K Queries on Relational Systems

Nicolas Bruno; Surajit Chaudhuri; Luis Gravano

Title:
Performance of Multiattribute Top-K Queries on Relational Systems
Author(s):
Bruno, Nicolas
Chaudhuri, Surajit
Gravano, Luis
Date:
Type:
Reports
Department(s):
Computer Science
Persistent URL:
Series:
Columbia University Computer Science Technical Reports
Part Number:
CUCS-021-00
Publisher:
Department of Computer Science, Columbia University
Publisher Location:
New York
Abstract:
In many applications, users specify target values for the attributes of a relation, and expect in return the k tuples that best match these values. Traditional RDBMSs do not process these ”top-k queries'' efficiently. In our previous work, we outlined a family of strategies to map a top-k query into a traditional selection query that a RDBMS can process efficiently. The goal of such mapping strategies is to get all needed tuples (but minimize the number of retrieved tuples) and thus avoid “restarts'' to get additional tuples. Unfortunately, no single mapping strategy performed consistently the best under all data distributions. In this paper, we develop a novel mapping technique that leverages information about the data distribution and adapts itself to the local characteristics of the data and the histograms available to do the mapping. We also report the first experimental evaluation of the new and old mapping strategies over a real RDBMS, namely over Microsoft's SQL Server7.0. The experiments show that our new techniques are robust and significantly more efficient than previously known strategies requiring at least one sequential scan of the data sets.
Subject(s):
Computer science
Item views
154
Metadata:
text | xml
Suggested Citation:
Nicolas Bruno, Surajit Chaudhuri, Luis Gravano, , Performance of Multiattribute Top-K Queries on Relational Systems, Columbia University Academic Commons, .

Columbia University Libraries | Policies | FAQ