SubSpace Projection: A unified framework for a class of partition-based dimension reduction techniques

Authors

    Authors

    H. Cheng; K. Vu;K. A. Hua

    Comments

    Authors: contact us about adding a copy of your work at STARS@ucf.edu

    Abbreviated Journal Title

    Inf. Sci.

    Keywords

    SubSpace Projection; Dimensionality reduction; Similarity search; Multidimensional indexing; Dimension partition; PRINCIPAL COMPONENT ANALYSIS; SEARCH; Computer Science, Information Systems

    Abstract

    Similarity search in high dimensional space is a nontrivial problem due to the so-called curse of dimensionality. Recent techniques such as Piecewise Aggregate Approximation (PAA), Segmented Means (SMEAN) and Mean-Standard deviation (MS) prove to be very effective in reducing data dimensionality by partitioning dimensions into subsets and extracting aggregate values from each dimension subset. These partition-based techniques have many advantages including very efficient multi-phased approximation while being simple to implement. They, however, are not adaptive to the different characteristics of data in diverse applications. We propose SubSpace Projection (SSP) as a unified framework for these partition-based techniques. SSP projects data onto subspaces and computes a fixed number of salient features with respect to a reference vector. A study of the relationships between query selectivity and the corresponding space partitioning schemes uncovers indicators that can be used to predict the performance of the partitioning configuration. Accordingly, we design a greedy algorithm to efficiently determine a good partitioning of the data dimensions. The results of our extensive experiments indicate that the proposed method consistently outperforms state-of-the-art techniques. (C) 2008 Elsevier Inc. All rights reserved.

    Journal Title

    Information Sciences

    Volume

    179

    Issue/Number

    9

    Publication Date

    1-1-2009

    Document Type

    Article

    Language

    English

    First Page

    1234

    Last Page

    1248

    WOS Identifier

    WOS:000264567500003

    ISSN

    0020-0255

    Share

    COinS