Authors: Adeyemi Olutoyin Adegbenjo, Michael Ngadi
Identifier: CSBE19185
Published in: CSBE-SCGAB Technical Conferences » AGM Vancouver 2019
As with other high-dimensional data, "dead pixels" (missing values) are known to pose a major difficulty in the multivariate analysis of hyperspectral imaging data. These "dead pixels", which mostly result from detector anomalies, typically appear as zero or missing values in spectral data and usually impede the building of robust multivariate models. One commonly used approach to handling "dead pixels" is column (feature) mean value estimation. However, some researchers omit this crucial step, an omission that can strongly influence the outputs of data analysis. Therefore, this study tested different missing value estimation methods, namely: small value replacement; missing value exclusion; replacement by the mean, median, or minimum value; and estimation using KNN, PPCA, BPCA, and SVD. The results show that the choice of "dead pixel" estimation method greatly affects classification accuracy. In the specific case of chicken egg spectral data, the small value replacement method (usually computed as half of the minimum positive value in the original data) and BPCA are recommended over the other estimation techniques considered. These methods performed best in terms of the observed AUC values and classification accuracies obtained for two classifiers, namely support vector machine (SVM) and random forest (RF).
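To make two of the simpler estimation methods concrete, the following is a minimal Python sketch of small value replacement (half of the minimum positive value) and column mean replacement. It is not the authors' implementation: the function names, the use of NumPy, and the toy `demo` array are assumptions made for illustration, and zeros are treated as "dead pixels" only because the abstract describes them as appearing as zero or missing values.

```python
import numpy as np


def mark_dead_pixels(spectra):
    """Return a float copy with zeros and non-finite entries set to NaN.

    Assumption: a reading of exactly zero denotes a dead pixel.
    """
    x = np.array(spectra, dtype=float)
    x[~np.isfinite(x) | (x == 0)] = np.nan
    return x


def small_value_replacement(spectra):
    """Fill dead pixels with half of the minimum positive value in the data."""
    x = mark_dead_pixels(spectra)
    valid = x[~np.isnan(x)]
    positive = valid[valid > 0]
    if positive.size == 0:
        raise ValueError("no positive values available to derive a fill value")
    fill = positive.min() / 2.0
    return np.where(np.isnan(x), fill, x)


def column_mean_replacement(spectra):
    """Fill dead pixels with the mean of the valid values in the same column.

    A column with no valid values stays NaN (NumPy also emits a warning).
    """
    x = mark_dead_pixels(spectra)
    col_means = np.nanmean(x, axis=0)  # one mean per wavelength (column)
    return np.where(np.isnan(x), col_means, x)


# Hypothetical toy data: rows are samples, columns are wavelengths.
demo = np.array([
    [0.42, 0.00, 0.38],    # zero reading treated as a dead pixel
    [0.40, 0.36, np.nan],  # missing reading
    [0.44, 0.34, 0.30],
])

filled_small = small_value_replacement(demo)
filled_mean = column_mean_replacement(demo)
```

On this toy array the small value fill is 0.15, half of the smallest positive entry (0.30), and it is applied to both dead pixels. Column mean replacement instead fills each gap from its own wavelength, so the two dead pixels receive different values. The model-based estimators named in the abstract (KNN, PPCA, BPCA, SVD) are not sketched here.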