A Transformer-Based Framework for Accurate and Interpretable PM2.5 Forecasting through Integration of Country-Level Embeddings and Explainable AI for Enhanced Environmental Decision-Making

Authors

  • Syed Azeem Inam, Department of Artificial Intelligence and Mathematical Sciences, Sindh Madressatul Islam University, Karachi, Pakistan.
  • Hassan Hashim, Department of Artificial Intelligence and Mathematical Sciences, Sindh Madressatul Islam University, Karachi, Pakistan.
  • Asif Mehmood Awan, Department of Artificial Intelligence and Mathematical Sciences, Sindh Madressatul Islam University, Karachi, Pakistan.
  • Syeda Wajiha Naim, Department of Software Engineering, Sindh Madressatul Islam University, Karachi, Pakistan.
  • Haider Rajput, Department of Artificial Intelligence and Mathematical Sciences, Sindh Madressatul Islam University, Karachi, Pakistan.
  • Saddam Umer, Department of Artificial Intelligence and Mathematical Sciences, Sindh Madressatul Islam University, Karachi, Pakistan.

DOI:

https://doi.org/10.62019/hcqtrs67

Keywords:

Transformer Architecture, Explainable AI, Environmental Management, Pollution Forecasting, PM2.5 Concentration

Abstract

Air quality prediction is increasingly vital to environmental health in both urban and rural regions, where hazardous levels of particulate matter are common. This study presents a novel methodology for estimating PM2.5 concentrations using a transformer model combined with country-level embeddings and explainable AI (XAI). Compared with conventional machine learning and deep learning techniques, the proposed approach provides high accuracy while remaining interpretable and applicable to various geographic regions. With country-specific embeddings, the transformer architecture models the temporal and spatial variations in pollutant concentrations, allowing accurate predictions even for regions with sparse data. Furthermore, SHAP and LIME elucidate how the model arrives at its predictions, providing policymakers with valuable insights. Overall, the proposed architecture shows stronger predictive power than other forecasting models, with an R-squared value of 0.98 and a mean absolute error of 0.011. The country embeddings also improved accuracy and the model's ability to generalize to different regions. This research therefore offers a plausible framework for forecasting air pollution and for supporting evidence-based government policymaking and planning on air pollution and its health and environmental effects.
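
The two reported metrics can be made concrete with a minimal sketch. The function names and sample values below are illustrative assumptions, not the authors' evaluation code or data. Note that mean absolute error depends on the scale of the target, and the abstract does not state whether the value of 0.011 refers to normalized or raw concentrations.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when observations are constant")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical observed and predicted PM2.5 values (assumed to be
    # scaled to [0, 1]); these are NOT taken from the paper.
    observed = [0.10, 0.20, 0.30, 0.40]
    predicted = [0.11, 0.19, 0.31, 0.39]
    print("MAE:", mean_absolute_error(observed, predicted))
    print("R^2:", r_squared(observed, predicted))
```

An R-squared close to 1 means the predictions account for nearly all of the variance in the observations, while the MAE gives the typical size of an individual error in the units of the target.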

Published

2025-09-18

How to Cite

A Transformer-Based Framework for Accurate and Interpretable PM2.5 Forecasting through Integration of Country-Level Embeddings and Explainable AI for Enhanced Environmental Decision-Making. (2025). The Asian Bulletin of Big Data Management, 5(3), 245-255. https://doi.org/10.62019/hcqtrs67
