
<mods xmlns="http://www.loc.gov/mods/v3" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-4.xsd">
    
    <titleInfo>
        <title>Diversity Maximization Under Matroid Constraints</title>
    </titleInfo>
    <name type="personal" ID="za2153">
        <namePart type="family">Abbassi</namePart>
        <namePart type="given">Zeinab</namePart>
        <role>
            <roleTerm type="text">author</roleTerm>
        </role>
        <affiliation>Columbia University. Computer Science</affiliation>
    </name>
    <name type="personal">
        <namePart type="family">Mirrokni</namePart>
        <namePart type="given">Vahab S.</namePart>
        <role>
            <roleTerm type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="family">Thakur</namePart>
        <namePart type="given">Mayur</namePart>
        <role>
            <roleTerm type="text">author</roleTerm>
        </role>
    </name>
    <name type="corporate">
        <namePart>Columbia University. Computer Science</namePart>
        <role>
            <roleTerm type="text">originator</roleTerm>
        </role>
    </name>
    <typeOfResource>text</typeOfResource>
    <genre>Technical reports</genre>
    
    <originInfo>
        <place>
            <placeTerm type="text">New York</placeTerm>
        </place>
        <publisher>Department of Computer Science, Columbia University</publisher>
    </originInfo>
    <abstract>Aggregator websites typically present documents in the form of representative clusters. In order for users to get a broader perspective, it is important to deliver a diversified set of representative documents in those clusters. One approach to diversification is to maximize the average dissimilarity among documents. Another way to capture diversity is to avoid showing several documents from the same category (e.g., from the same news channel). We model the latter approach as a (partition) matroid constraint, and study diversity maximization problems under matroid constraints. We present the first constant-factor approximation algorithm for this problem, using a new technique. Our local search 0.5-approximation algorithm is also the first constant-factor approximation for the max-dispersion problem under matroid constraints. Our combinatorial proof technique for maximizing diversity under matroid constraints uses the existence of a family of Latin squares, which may also be of independent interest. In order to apply these diversity maximization algorithms in the context of aggregator websites, and as a preprocessing step for our diversity maximization tool, we develop greedy clustering algorithms that maximize weighted coverage of a predefined set of topics. Our algorithms are based on computing a set of cluster centers, around which clusters are formed. We demonstrate the improved performance of our algorithms for diversity and coverage maximization by running experiments on real (Twitter) and synthetic data in the context of real-time search over micro-posts. Finally, we perform a user study validating our algorithms and diversity metrics.</abstract>
    <subject>
        <topic>Computer Science</topic>
    </subject>
    <relatedItem type="series" ID="r.1">
        <titleInfo>
            <title>Columbia University Computer Science Technical Reports</title>
            <partNumber>CUCS-019-12</partNumber>
        </titleInfo>
    </relatedItem>
    <identifier type="hdl">http://hdl.handle.net/10022/AC:P:15365</identifier>

    <language>
        <languageTerm type="text">English</languageTerm>
    </language>
    
    <location>
        <physicalLocation authority="marcorg">NNC</physicalLocation>
    </location>
    
    <recordInfo>
        <recordContentSource authority="marcorg">NNC</recordContentSource>
        <recordCreationDate encoding="w3cdtf">2012-12-03T14:58:12-05:00</recordCreationDate>
        <recordChangeDate encoding="w3cdtf">2012-12-03T15:14:29-05:00</recordChangeDate>
        <recordIdentifier>9368</recordIdentifier>
        <languageOfCataloging>
            <languageTerm authority="iso639-2b">eng</languageTerm>
        </languageOfCataloging>
    </recordInfo>
    
</mods>
