Addressing missing data is crucial for accurate data analysis and predictive modeling. Traditional methods like zero, mean, and K-Nearest Neighbors (KNN) imputation often fail to capture complex relationships within datasets, leading to information loss. Our study introduces a transformer-based approach to missing value imputation, offering a more advanced and data-driven solution. Transformers, known for their ability to handle sequential data, outperform traditional methods by leveraging self-attention mechanisms to capture intricate patterns and dependencies.
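To make the idea concrete, the sketch below shows one plausible way to frame transformer-based imputation, masked positions are encoded together with the observed values and then regressed. This is a minimal illustration in PyTorch, not the authors' published architecture; the class name, layer sizes, and masking convention are assumptions chosen for clarity.

```python
import torch
import torch.nn as nn

class TransformerImputer(nn.Module):
    """Hypothetical sketch: encode a time-series window with self-attention,
    then regress values at the missing positions."""
    def __init__(self, n_features, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.input_proj = nn.Linear(n_features, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.output_proj = nn.Linear(d_model, n_features)

    def forward(self, x, missing_mask):
        # x: (batch, seq_len, n_features), missing entries pre-filled with 0
        # missing_mask: same shape, 1.0 where the original value is missing
        h = self.encoder(self.input_proj(x))
        x_hat = self.output_proj(h)
        # keep observed values, fill only the missing positions
        return torch.where(missing_mask.bool(), x_hat, x)
```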
The proposed model was validated using Long Short-Term Memory (LSTM) networks, ensuring not only accurate imputation but also temporal coherence of the predicted values. The results showed a significant improvement, with the model achieving an R² score of 0.96 on hourly data, surpassing KNN by 0.195. On daily data, the R² score reached 0.806, outperforming mean imputation by 0.25. These findings highlight the superior capability of transformer models in maintaining data integrity and providing more accurate imputed values.
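For readers unfamiliar with how such scores are obtained, imputation quality is typically measured by artificially masking known values and comparing each method's fill-ins against the held-out ground truth. The snippet below is an assumed evaluation pattern with made-up numbers, shown only to illustrate how an R² comparison like the one above would be computed.

```python
from sklearn.metrics import r2_score

# Hypothetical ground-truth values at artificially masked positions
y_true = [4.2, 3.9, 5.1, 4.8]
y_transformer = [4.1, 4.0, 5.0, 4.7]   # illustrative transformer imputations
y_knn = [3.5, 4.4, 4.3, 5.2]           # illustrative KNN imputations

print(r2_score(y_true, y_transformer))  # closer to 1.0 means better imputation
print(r2_score(y_true, y_knn))
```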
This research sets a new benchmark in the field of data imputation, proving that advanced models can significantly enhance the reliability of data analysis. It opens up new avenues for applying such techniques in areas like healthcare, finance, and environmental monitoring, where precise data handling is essential.
Links:
Full Text: https://www.igminresearch.com/articles/html/igmin140
DOI Link: https://dx.doi.org/10.61927/igmin140