Comparative Performance of GLMM and GEE for Longitudinal Beta Regression in Economic Inequality Modelling
DOI:
https://doi.org/10.26877/7y0xxb39
Keywords:
Beta, GEE, GLMM, Gini Index, Economic Inequality
Abstract
Conventional Gaussian methods are poorly suited to longitudinal data with bounded outcomes such as the Gini ratio, so specialized models are frequently needed. This study compares the performance of Generalized Linear Mixed Models (GLMM) and Generalized Estimating Equations (GEE) for beta-distributed longitudinal data in modeling economic inequality in Indonesia. Model performance is assessed with Root Mean Square Error (RMSE) and pseudo R-squared values, using panel data from 10 provinces between 2018 and 2024 together with important socioeconomic indicators. The results consistently show that GLMM outperforms both GEE and generalized linear models (GLM), with lower RMSE and higher explanatory power across all provincial subsets. ANOVA tests indicate that the differences in model performance are attributable to the modeling methodology rather than to data heterogeneity in GRDP or Gini values. These results show how well GLMM handles complex data structures and within-subject correlations, providing more accurate and effective estimates in longitudinal beta regression settings. The study supports the use of GLMM for more precise longitudinal analysis in economic and social research and offers guidance for researchers modeling inequality indices.
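The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, and the pseudo R-squared shown is one common definition for beta regression (the squared correlation between the logit of the observed values and the logit of the fitted means, following Ferrari and Cribari-Neto), which is an assumption about the definition used in the study.

```python
import math


def logit(p):
    """Logit link, defined for values strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def rmse(observed, fitted):
    """Root Mean Square Error between observed and fitted values."""
    n = len(observed)
    return math.sqrt(sum((y - m) ** 2 for y, m in zip(observed, fitted)) / n)


def pseudo_r2(observed, fitted):
    """Squared Pearson correlation between logit(observed) and logit(fitted).

    Assumed definition (Ferrari and Cribari-Neto style); it requires at
    least two distinct observed and two distinct fitted values.
    """
    gy = [logit(y) for y in observed]
    eta = [logit(m) for m in fitted]
    mean_gy = sum(gy) / len(gy)
    mean_eta = sum(eta) / len(eta)
    cov = sum((a - mean_gy) * (b - mean_eta) for a, b in zip(gy, eta))
    var_gy = sum((a - mean_gy) ** 2 for a in gy)
    var_eta = sum((b - mean_eta) ** 2 for b in eta)
    return cov * cov / (var_gy * var_eta)


if __name__ == "__main__":
    # Hypothetical Gini ratios and fitted means, for illustration only.
    gini_observed = [0.33, 0.36, 0.40, 0.38]
    gini_fitted = [0.34, 0.35, 0.39, 0.39]
    print("RMSE:", rmse(gini_observed, gini_fitted))
    print("Pseudo R-squared:", pseudo_r2(gini_observed, gini_fitted))
```

Under these definitions, a lower RMSE and a pseudo R-squared closer to 1 both indicate a better fit, which is the sense in which the abstract ranks GLMM above GEE and GLM.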