Academic Commons

Reports

Projecting XML Documents

Marian, Amelie; Simeon, Jerome

XQuery is not only useful to query XML in databases, but also to applications that must process XML documents as files or streams. These applications suffer from the limitations of current main-memory XQuery processors which break for rather small documents. In this paper we propose techniques, based on a notion of projection for XML, which can be used to drastically reduce memory requirements in XQuery processors. The main contribution of the paper is a static analysis technique that can identify at compile time which parts of the input document are needed to answer an arbitrary XQuery. We present a loading algorithm that takes the resulting information to build a projected document, which is smaller than the original document, and on which the query yields the same result. We implemented projection in the Galax XQuery processor. Our experiments show that projection reduces memory requirements by a factor of 20 on average, and is effective for a wide variety of queries. In addition, projection results in some speedup during query evaluation.

Subjects

Files

More About This Work

Academic Units
Computer Science
Publisher
Department of Computer Science, Columbia University
Series
Columbia University Computer Science Technical Reports, CUCS-004-03
Published Here
April 26, 2011
Academic Commons provides global access to research and scholarship produced at Columbia University, Barnard College, Teachers College, Union Theological Seminary and Jewish Theological Seminary. Academic Commons is managed by the Columbia University Libraries.