An Experimental Evaluation Of Fault Diagnosis From Imbalanced And Incomplete Data For Smart Semiconductor Manufacturing

Keywords

Classification; Data imputation; Fault detection; Machine learning; Semiconductor manufacturing

Abstract

The SECOM dataset contains information about a semiconductor production line, including the products that failed the in-house test line and their attributes. Like most semiconductor manufacturing data, this dataset contains missing values, imbalanced classes, and noisy features. In this work, we address these challenges and evaluate a range of classification approaches for fault diagnosis. We present an experimental evaluation that examines 288 combinations of data pruning, data imputation, feature selection, and classification methods to identify suitable approaches for this task. Furthermore, we introduce a novel data imputation approach, “In-painting KNN-Imputation”, and show that it outperforms the common data imputation technique. The results show the capability of each classifier, feature selection method, data generation method, and data imputation technique, with a full analysis of their respective parameter optimizations.
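For orientation, the kind of missing-value handling the abstract refers to can be illustrated with plain k-nearest-neighbour imputation. This is only a minimal sketch under stated assumptions: the abstract does not describe the proposed “In-painting KNN-Imputation” or name the baseline it is compared against, so the code below is generic KNN imputation and not the authors' method. The function name `knn_impute`, the parameter `k`, and the toy rows are hypothetical.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Hypothetical helper: generic KNN imputation, not the paper's method.
    Distance uses only features observed in both rows and is normalised by
    the number of shared features, so rows with different missing patterns
    remain comparable.
    """

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no common observed feature: not comparable
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in rows]  # copy so the input is not mutated
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that observed feature j.
            candidates = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            candidates = [c for c in candidates if c[0] != math.inf]
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:  # leave the gap if no usable neighbour exists
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Toy data (made up for illustration): the first row is missing its third feature.
toy_rows = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.2],
    [8.0, 9.0, 10.0],
]
filled = knn_impute(toy_rows, k=2)
print(filled[0])
```

In this toy case the two closest rows supply the missing value, while the distant fourth row is ignored. Real sensor data would normally be scaled before computing distances, which the sketch omits.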

Publication Date

12-1-2018

Publication Title

Big Data and Cognitive Computing

Volume

2

Issue

4

Number of Pages

1-20

Document Type

Article

Personal Identifier

scopus

DOI Link

https://doi.org/10.3390/bdcc2040030

Scopus ID

85070301527 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85070301527

This document is currently not available here.
