Gene set imputation method-based rule for recovering missing data using deep learning approach
International Journal of Electrical and Computer Engineering
Abstract
Data imputation improves dataset completeness, enabling accurate analysis and informed decision-making across many domains. In this research, we propose a novel imputation method that combines spectral clustering based on a gene set using adaptive weighted k-nearest neighbor (AWKNN) with a convolutional neural network that imputes the missing data. We apply the method to the Kaggle water quality dataset to impute missing values in water quality monitoring data. Data cleaning detects inaccurate data in the dataset using the median modified Wiener filter (MMWFILT). Normalization uses the Z-score normalization (Z-SN) approach, which improves data organization and management for accurate imputation. Data reduction, performed with an improved kernel correlation filter (IKCF), removes unwanted data and reduces the storage capacity required. The characteristics and patterns of specific data columns are analyzed using enhanced principal component analysis (EPCA) to reduce overfitting. The dataset is classified into complete data and missing data using the light-DenseNet (LIGHT DN) approach. Results show that the proposed method outperforms traditional techniques in recovering missing data while preserving the data distribution. The evaluation is based on pH concentration, chloramine concentration, sulfate concentration, water level, and accuracy.
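To make two of the steps named in the abstract concrete, the following is a minimal sketch of Z-score normalization followed by a plain inverse-distance-weighted k-nearest-neighbor imputation. It is an illustration under stated assumptions, not the paper's AWKNN or CNN stage: the function names `zscore_normalize` and `weighted_knn_impute`, the parameter `k`, the inverse-distance weighting, and the sample values are all hypothetical and chosen only for demonstration.

```python
import numpy as np


def zscore_normalize(X):
    """Column-wise Z-score computed from observed (non-NaN) entries only."""
    mean = np.nanmean(X, axis=0)
    std = np.nanstd(X, axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    return (X - mean) / std, mean, std


def weighted_knn_impute(X, k=3, eps=1e-8):
    """Fill NaNs with an inverse-distance-weighted mean of the k nearest
    complete rows. Assumption: simple stand-in, not the paper's AWKNN."""
    X = np.asarray(X, dtype=float)
    Z, mean, std = zscore_normalize(X)
    complete = Z[~np.isnan(Z).any(axis=1)]
    if complete.shape[0] == 0:
        raise ValueError("at least one complete row is required")
    filled = Z.copy()
    for i in np.where(np.isnan(Z).any(axis=1))[0]:
        missing = np.isnan(Z[i])
        observed = ~missing
        # Euclidean distance measured on the columns this row actually has
        diffs = complete[:, observed] - Z[i, observed]
        dist = np.sqrt((diffs ** 2).sum(axis=1))
        nearest = np.argsort(dist)[:k]
        weights = 1.0 / (dist[nearest] + eps)
        neighbors = complete[nearest][:, missing]
        filled[i, missing] = weights @ neighbors / weights.sum()
    # Map back from Z-score space to the original units
    return filled * std + mean


# Hypothetical sample: columns stand for pH and sulfate; values are made up.
sample = np.array([
    [7.0, 300.0],
    [7.2, 310.0],
    [6.8, 290.0],
    [7.1, np.nan],
])
imputed = weighted_knn_impute(sample, k=2)
```

Normalizing before measuring distance keeps a large-valued column such as sulfate from dominating a small-valued one such as pH. Because each imputed value is a weighted average of neighboring complete rows, it always stays within the range those rows span.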





