Boosting-Based Machine Learning Models and Hyperparameter Tuning for Predicting Vehicle Carbon Dioxide Emission
DOI:
https://doi.org/10.26877/asset.v7i4.2097
Keywords:
Vehicle Carbon Dioxide Emission, Machine Learning Models, SHAP, Comparative Study, Environmental Sustainability
Abstract
Sustainable development and climate change are central agendas in global policy and research. This study examines and compares three boosting-based ensemble learning models, Gradient Boosting Machine (GBM), Categorical Boosting (CatBoost), and Extreme Gradient Boosting (XGBoost), for predicting vehicle carbon dioxide (CO2) emissions. Data preprocessing used the Interquartile Range (IQR) method and median imputation, among other steps, to address missing values in the CO2 rating and smog rating variables. SHapley Additive exPlanations (SHAP) and Partial Dependence Plots (PDP) were employed for feature importance analysis and model interpretability. In the third experiment, XGBoost outperformed the other models, achieving a coefficient of determination (R²) of 0.9988, a Root-Mean-Square Error (RMSE) of 2.1696, a Mean Absolute Error (MAE) of 0.4977, and a Mean Absolute Percentage Error (MAPE) of 0.0019. The most important predictive features were combined fuel consumption (liters/100 km), city and highway fuel consumption, ethanol fuel consumption, model year, engine size, and diesel consumption. The findings suggest that boosting-based models have potential to support sustainable transport planning, emission-reduction policy, and evidence-based policy making.
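The four evaluation metrics named in the abstract can be computed as in the minimal sketch below. It is illustrative only and is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and MAPE is assumed to be expressed as a fraction rather than a percentage, which is consistent with the reported value of 0.0019 but is not confirmed by the abstract.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE, and MAPE (as a fraction) for two equal-length sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares

    return {
        "r2": 1.0 - ss_res / ss_tot,                      # undefined if y_true is constant
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # Assumption: MAPE as a fraction; requires non-zero true values.
        "mape": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


if __name__ == "__main__":
    # Hypothetical CO2 emission values (g/km), made up for illustration only.
    observed = [200.0, 250.0, 300.0]
    predicted = [198.0, 251.0, 303.0]
    print(regression_metrics(observed, predicted))
```

RMSE penalizes large errors more heavily than MAE, so reporting both indicates whether a few large misses dominate the error.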