Title

A Non-Linear Dimensionality-Reduction Technique For Fast Similarity Search In Large Databases

Abstract

To enable efficient similarity search in large databases, many indexing techniques use a linear transformation scheme to reduce dimensions and allow fast approximation. In this reduction approach the approximation is unbounded, so that the approximation volume extends across the dataspace. This causes over-estimation of retrieval sets and impairs performance.This paper presents a non-linear transformation scheme that extracts two important parameters specifying the data. We prove that these parameters correspond to a bounded volume around the search sphere, irrespective of dimensionality. We use a special workspace-mapping mechanism to derive tight bounds for the parameters and to prove further results, as well as highlighting insights into the problems and our proposed solutions. We formulate a measure that lower-bounds the Euclidean distance, and discuss the implementation of the technique upon a popular index structure. Extensive experiments confirm the superiority of this technique over recent state-of-the-art schemes. Copyright 2006 ACM.

Publication Date

12-1-2006

Publication Title

Proceedings of the ACM SIGMOD International Conference on Management of Data

Number of Pages

527-538

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1145/1142473.1142532

Socpus ID

34250655397 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/34250655397

This document is currently not available here.

Share

COinS