APHID: An architecture for private, high-performance integrated data mining

    Authors

    J. Secretan; M. Georgiopoulos; A. Koufakou; K. Cardona

    Abbreviated Journal Title

    Futur. Gener. Comp. Syst.

    Keywords

    Data mining; Privacy; Distributed architectures; Partitioned data; Infrastructure; Services; Toolkit; Computer Science, Theory & Methods

    Abstract

    While the emerging field of privacy-preserving data mining (PPDM) will enable many new data mining applications, it suffers from several practical difficulties. PPDM algorithms are challenging to develop and computationally intensive to execute. Developers need convenient abstractions to simplify the engineering of PPDM applications. The individual parties involved in the data mining process need a way to bring high-performance, parallel computers to bear on the computationally intensive parts of the PPDM tasks. This paper discusses APHID (Architecture for Private and High-performance Integrated Data mining), a practical architecture and software framework for developing and executing large-scale PPDM applications. At one tier, the system supports simplified use of cluster and grid resources, and at another tier, the system abstracts communication for easy PPDM algorithm development. This paper offers a detailed analysis of the challenges in developing PPDM algorithms with existing frameworks, and motivates the design of a new infrastructure based on these challenges. © 2010 Elsevier B.V. All rights reserved.

    Journal Title

    Future Generation Computer Systems-the International Journal of Grid Computing-Theory Methods and Applications

    Volume

    26

    Issue/Number

    7

    Publication Date

    1-1-2010

    Document Type

    Article

    Language

    English

    First Page

    891

    Last Page

    904

    WOS Identifier

    WOS:000279804200001

    ISSN

    0167-739X
