Performance of Multiattribute Top-K Queries on Relational Systems
- Performance of Multiattribute Top-K Queries on Relational Systems
- Bruno, Nicolas
- Computer Science
- Persistent URL:
- Columbia University Computer Science Technical Reports
- Part Number:
- Department of Computer Science, Columbia University
- Publisher Location:
- New York
- In many applications, users specify target values for the attributes of a relation, and expect in return the k tuples that best match these values. Traditional RDBMSs do not process these ”top-k queries'' efficiently. In our previous work, we outlined a family of strategies to map a top-k query into a traditional selection query that a RDBMS can process efficiently. The goal of such mapping strategies is to get all needed tuples (but minimize the number of retrieved tuples) and thus avoid “restarts'' to get additional tuples. Unfortunately, no single mapping strategy performed consistently the best under all data distributions. In this paper, we develop a novel mapping technique that leverages information about the data distribution and adapts itself to the local characteristics of the data and the histograms available to do the mapping. We also report the first experimental evaluation of the new and old mapping strategies over a real RDBMS, namely over Microsoft's SQL Server7.0. The experiments show that our new techniques are robust and significantly more efficient than previously known strategies requiring at least one sequential scan of the data sets.
- Computer science
- Item views
text | xml
- Suggested Citation:
- Nicolas Bruno, Surajit Chaudhuri, Luis Gravano, 2000, Performance of Multiattribute Top-K Queries on Relational Systems, Columbia University Academic Commons, https://doi.org/10.7916/D8DJ5STF.