Scopus Export 2000s

Efficient Parallel Data Mining For Massive Datasets: Parallel Random Forests Classifier

Keywords

Cluster computing; Data mining; Parallel processing; Random forests

Abstract

Data mining refers to the process of finding hidden patterns inside a large dataset. While improving the accuracy of data mining algorithms has been the main focus of past research, massive dataset size imposes another challenge. Parallel and distributed processing techniques have been applied to data mining algorithms to make them scalable. In this paper, we discuss an emerging data mining algorithm, random forests, and its parallelization based on VCluster, a portable parallel runtime system we have developed for a cluster of multiprocessors. A random forest is an ensemble of many decision trees, and classification is performed by majority vote among those trees. We also present experimental results on the performance of the parallel random forests approach.
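The two ideas named in the abstract, independent training of many trees and classification by majority vote, can be illustrated with a minimal sketch. This is not the authors' implementation and does not use VCluster: it assumes one-split decision stumps in place of full decision trees and Python's standard `ThreadPoolExecutor` as a stand-in for a cluster runtime. All function names (`train_stump`, `train_forest`, `predict_forest`) are hypothetical.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def majority(labels, default):
    """Return the most common label, or `default` if the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def train_stump(data, seed):
    """Fit a one-split stump on a bootstrap sample of (features, label) rows.

    A stump stands in for a full decision tree (an assumption of this sketch).
    """
    rng = random.Random(seed)
    sample = [rng.choice(data) for _ in data]   # bootstrap: draw with replacement
    feature = rng.randrange(len(data[0][0]))    # one randomly chosen feature
    overall = majority([y for _, y in sample], None)
    best = None
    for x, _ in sample:                         # try each sampled value as threshold
        t = x[feature]
        left = majority([y for xs, y in sample if xs[feature] <= t], overall)
        right = majority([y for xs, y in sample if xs[feature] > t], overall)
        errors = sum((left if xs[feature] <= t else right) != y for xs, y in sample)
        if best is None or errors < best[0]:
            best = (errors, feature, t, left, right)
    return best[1:]                             # (feature, threshold, left, right)


def predict_tree(tree, x):
    feature, t, left, right = tree
    return left if x[feature] <= t else right


def train_forest(data, n_trees=25, workers=4):
    """Train the trees concurrently; each tree depends only on its own seed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda seed: train_stump(data, seed), range(n_trees)))


def predict_forest(forest, x):
    """Classify x by majority vote over all trees."""
    votes = Counter(predict_tree(tree, x) for tree in forest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    rows = [((0.0, 0.0), "a"), ((1.0, 1.0), "a"),
            ((9.0, 9.0), "b"), ((10.0, 10.0), "b")]
    forest = train_forest(rows)
    print(predict_forest(forest, (0.5, 0.5)))
```

Because each tree is trained from its own bootstrap sample and shares no state with the others, the training loop is naturally parallel, which is the property a cluster runtime can exploit. In CPython, threads mainly show the structure rather than deliver a speedup for CPU-bound work; a process pool or a distributed runtime would be needed for that.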

Publication Date

12-1-2005

Publication Title

Proceedings of the 2005 International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA'05

Volume

Number of Pages

1142-1148

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

Copyright Status

Unknown

Scopus ID

60749085012 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/60749085012

STARS Citation

Dai, Jianyong; Lee, Joohan; and Wang, Morgan C., "Efficient Parallel Data Mining For Massive Datasets: Parallel Random Forests Classifier" (2005). Scopus Export 2000s. 3182.
https://stars.library.ucf.edu/scopus2000/3182

This document is currently not available here.
