A survey of missing data imputation techniques: statistical methods, machine learning models, and GAN-based approaches

International Journal of Artificial Intelligence

Abstract

Handling missing data efficiently is critical to data analysis across diverse domains. This survey evaluates traditional statistical, machine learning, and generative adversarial network (GAN)-based imputation methods, emphasizing their strengths, limitations, and applicability to different data types and to the three missing data mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). GAN-based models, including the generative adversarial imputation network (GAIN), the view imputation generative adversarial network (VIGAN), and SolarGAN, are highlighted for their adaptability and effectiveness in handling complex datasets such as images and time series. Despite challenges such as high computational demands, GAN-based methods outperform conventional methods in capturing non-linear dependencies. Future work includes optimizing GAN architectures for a broader range of data types and exploring hybrid models to improve imputation accuracy and scalability in real-world applications.
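To make the three missing data mechanisms concrete, the following is a minimal Python sketch that is not taken from the surveyed methods. It assumes synthetic Gaussian data, a hypothetical helper `mask_and_mean_impute`, and illustrative logistic missingness probabilities chosen only for demonstration. It uses mean imputation, a simple statistical baseline, to show how the mechanism affects the bias of the imputed mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)        # covariate that is always observed
y = x + rng.normal(size=n)    # variable that will lose values


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Probability that each y value is missing (illustrative choices, not from the paper)
mechanisms = {
    "MCAR": np.full(n, 0.3),          # independent of x and y
    "MAR": sigmoid(2.0 * x - 1.0),    # depends only on the observed x
    "MNAR": sigmoid(2.0 * y - 1.0),   # depends on the unobserved y itself
}


def mask_and_mean_impute(values, p_missing, rng):
    """Hide entries with the given probabilities, then fill them with the observed mean."""
    missing = rng.random(values.shape[0]) < p_missing
    observed = np.where(missing, np.nan, values)
    fill = np.nanmean(observed)
    imputed = np.where(missing, fill, observed)
    return missing, imputed


results = {}
for name, p_missing in mechanisms.items():
    missing, y_imputed = mask_and_mean_impute(y, p_missing, rng)
    results[name] = {
        "missing_rate": float(missing.mean()),
        "bias": float(y_imputed.mean() - y.mean()),  # error in the estimated mean of y
    }
    print(name, results[name])
```

In this setup, mean imputation leaves the estimated mean nearly unbiased under MCAR but shifts it downward under MAR and MNAR, because larger values are more likely to be hidden. Under MAR the bias could in principle be corrected by conditioning on `x`; under MNAR the observed data alone do not carry enough information to do so.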
