Technical reports:
Performance of Multiattribute Top-K Queries on Relational Systems
Nicolas Bruno; Surajit Chaudhuri; Luis Gravano
Downloads:
- Title:
- Performance of Multiattribute Top-K Queries on Relational Systems
- Author(s):
-
Bruno, Nicolas
Chaudhuri, Surajit
Gravano, Luis - Date:
- 2000
- Type:
- Technical reports
- Department:
- Computer Science
- Permanent URL:
- http://hdl.handle.net/10022/AC:P:29406
- Series:
- Columbia University Computer Science Technical Reports
- Part Number:
- CUCS-021-00
- Publisher:
- Department of Computer Science, Columbia University
- Publisher Location:
- New York
- Abstract:
- In many applications, users specify target valuesfor the attributes of a relation, and expect in return the k tuplesthat best match these values. Traditional RDBMSs do not process these``top-k queries'' efficiently. In our previous work, we outlined afamily of strategies to map a top-k query into a traditional selectionquery that a RDBMS can process efficiently. The goal of such mappingstrategies is to get all needed tuples (but minimize the number ofretrieved tuples) and thus avoid ``restarts'' to get additionaltuples. Unfortunately, no single mapping strategy performedconsistently the best under all data distributions. In this paper, wedevelop a novel mapping technique that leverages information about thedata distribution and adapts itself to the local characteristics ofthe data and the histograms available to do the mapping. We alsoreport the first experimental evaluation of the new and old mappingstrategies over a real RDBMS, namely over Microsoft's SQL Server7.0. The experiments show that our new techniques are robust andsignificantly more efficient than previously known strategiesrequiring at least one sequential scan of the data sets.
- Subject(s):
- Computer science
- Item views:
- 89