Evaluating Ordinal Multivariate Models under Multicollinearity via Pairwise Likelihood: A Simulation Perspective

DOI:

https://doi.org/10.26877/asset.v7i4.2282

Keywords:

Latent Variable Modeling, Monte Carlo Simulation, Multivariate Ordinal Regression, Ordinal Probit, Pairwise Likelihood Estimation (PL)

Abstract

This study examines the effect of multicollinearity on ordinal regression through a two-stage Monte Carlo simulation. A synthetic population of 2,000,000 observations was generated, with predictors drawn from a normal distribution and responses simulated from an ordinal probit model. The Monte Carlo procedure used 10 repetitions, each consisting of 100 random samples of 1,000 observations. Parameters were estimated by Maximum Likelihood Estimation (MLE) for univariate models and by Pairwise Likelihood (PL) for multivariate models, and performance was assessed using bias, mean squared error (MSE), and computation time. The results show that multicollinearity had a negligible impact on estimator bias and MSE, which indicates that both MLE and PL are robust to correlated predictors. However, severe multicollinearity substantially increased computation time, indicating a trade-off between estimator stability and computational efficiency. These findings highlight PL as a viable approach for analyzing complex ordinal data, particularly in applications such as socio-economic surveys and health metrics, where predictor correlation is unavoidable.
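The data-generating step and the sampling design described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. The abstract does not state the number of predictors, the coefficients, the thresholds, or the correlation levels, so the two-predictor setup and the function names `simulate_ordinal_probit`, `draw_samples`, `bias`, and `mse` are all assumptions made for the example. The model-fitting step (MLE or PL) is deliberately left out.

```python
import math
import random


def simulate_ordinal_probit(n, rho, beta, thresholds, seed=0):
    """Draw n observations from a two-predictor ordinal probit model.

    rho, beta and thresholds are illustrative assumptions; the abstract
    does not report the values used in the study.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Two standard-normal predictors with correlation rho.
        x1 = rng.gauss(0.0, 1.0)
        x2 = rho * x1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Latent response with a standard-normal error (probit link).
        latent = beta[0] * x1 + beta[1] * x2 + rng.gauss(0.0, 1.0)
        # Observed category: 1 plus the number of thresholds below the latent value.
        y = 1 + sum(latent > t for t in thresholds)
        rows.append((x1, x2, y))
    return rows


def draw_samples(population, n_reps, n_samples, sample_size, seed=1):
    """Repetitions, each holding several simple random samples from the population."""
    rng = random.Random(seed)
    return [
        [rng.sample(population, sample_size) for _ in range(n_samples)]
        for _ in range(n_reps)
    ]


def bias(estimates, truth):
    """Mean of the estimates minus the true parameter value."""
    return sum(estimates) / len(estimates) - truth


def mse(estimates, truth):
    """Mean squared deviation of the estimates from the true value."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)
```

With the figures reported in the abstract, the call would be `simulate_ordinal_probit(2_000_000, ...)` followed by `draw_samples(population, 10, 100, 1000)`. A much smaller population is enough to try the sketch out.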

Author Biographies

  • Achmad Fauzan, Universitas Islam Indonesia

    1. Statistics Study Program, Faculty of Mathematics and Natural Science, Universitas Islam Indonesia, Yogyakarta, Indonesia

    2. Study Program of Statistics and Data Science, School of Data Science, Mathematics and Informatics, IPB University, Indonesia

  • Kusman Sadik, IPB University

    Study Program of Statistics and Data Science, School of Data Science, Mathematics and Informatics, IPB University, Indonesia

  • Anang Kurnia, IPB University

    Study Program of Statistics and Data Science, School of Data Science, Mathematics and Informatics, IPB University, Indonesia

Published

2025-10-28