Bounded approximation: A new criterion for dimensionality reduction approximation in similarity search

Authors

    Authors

    K. Vu; K. A. Hua; H. Cheng;S. D. Lang

    Comments

    Authors: contact us about adding a copy of your work at STARS@ucf.edu

    Abbreviated Journal Title

    IEEE Trans. Knowl. Data Eng.

    Keywords

    similarity search; spherical range search; dimensionality reduction; nonlinear transformation; bounded approximation; INDEXING METHOD; DATABASES; Computer Science, Artificial Intelligence; Computer Science, Information; Systems; Engineering, Electrical & Electronic

    Abstract

    We examine the problem of efficient distance-based similarity search over high-dimensional data. We show that a promising approach to this problem is to reduce dimensions and allow fast approximation. Conventional reduction approaches, however, entail a significant shortcoming: The approximation volume extends across the dataspace, which causes overestimation of retrieval sets and impairs performance. This paper focuses on a new criterion for dimensionality reduction methods: bounded approximation. We show that this requirement can be accomplished by a novel nonlinear transformation scheme that extracts two important parameters from the data. We devise two approximation formulations, namely, rectangular and spherical range search, each corresponding to a closed volume around the original search sphere. We discuss in detail how we can derive tight bounds for the parameters and prove further results, as well as highlight insights into the problems and our proposed solutions. To demonstrate the benefits of the new criterion, we study the effects of (un)boundedness on approximation performance, including selectivity, error toleration, and efficiency. Extensive experiments confirm the superiority of this technique over recent state-of-the-art schemes.

    Journal Title

    Ieee Transactions on Knowledge and Data Engineering

    Volume

    20

    Issue/Number

    6

    Publication Date

    1-1-2008

    Document Type

    Article

    Language

    English

    First Page

    768

    Last Page

    783

    WOS Identifier

    WOS:000254976900004

    ISSN

    1041-4347

    Share

    COinS