Revolutionizing high-dimensional regularization: EMLMLasso algorithm for linear mixed-effects models

Authors

  • Daniela C. R. Oliveira, Universidade Federal de São João del-Rei (UFSJ)
  • Fernanda L. Schumacher, The Ohio State University
  • Marcos S. Oliveira, Universidade Federal de São João del-Rei (UFSJ)
  • Daiane A. Zuanetti, Universidade Federal de São Carlos (UFSCar)
  • Victor H. Lachos, University of Connecticut

DOI:

https://doi.org/10.5540/03.2025.011.01.0472

Keywords:

EM algorithm, High-dimensional data, Mixed-effects models, R package glmnet, Regularized variable selection methods

Abstract

The expectation–maximization (EM) algorithm is a standard tool for maximum likelihood estimation, yet it has rarely been applied to high-dimensional regularization problems in linear mixed-effects models. This study introduces the EMLMLasso algorithm, which combines the EM algorithm with the efficient and widely used R package glmnet to perform Lasso variable selection on the fixed effects of such models. We evaluate its performance against two existing algorithms from the R packages glmmLasso and splmm. Results from simulations and real-data applications indicate that the approach is robust and effective, even when the number of predictors (p) exceeds the number of observations (n). In most scenarios, EMLMLasso outperforms both glmmLasso and splmm. The method is also versatile and straightforward to implement, and it could be extended to ridge and elastic net penalties in linear mixed-effects models.
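
To illustrate the general idea of alternating an EM step with a Lasso fit, the following is a minimal sketch and not the authors' implementation. It assumes a random-intercept model only, uses Python with NumPy rather than R, and substitutes a small hand-written coordinate-descent Lasso for glmnet. The function names (`lasso_cd`, `em_lasso_random_intercept`), the fixed penalty `lam`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    # Soft-thresholding operator used by the Lasso coordinate update.
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, beta, n_sweeps=10):
    # Coordinate descent for (1/2N)||y - X beta||^2 + lam * ||beta||_1,
    # warm-started at beta. A stand-in for glmnet (assumption).
    n = X.shape[0]
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != beta[j]:
                resid += X[:, j] * (beta[j] - new)
                beta[j] = new
    return beta

def em_lasso_random_intercept(X, y, groups, lam, n_iter=30):
    # Model assumed here: y_ij = x_ij' beta + b_i + e_ij,
    # with b_i ~ N(0, tau2) and e_ij ~ N(0, sigma2), no fixed intercept.
    n, p = X.shape
    _, idx = np.unique(groups, return_inverse=True)
    sizes = np.bincount(idx)
    beta = np.zeros(p)
    sigma2 = tau2 = np.var(y) / 2.0
    for _ in range(n_iter):
        # E-step: posterior mean and variance of each random intercept.
        resid = y - X @ beta
        post_var = 1.0 / (sizes / sigma2 + 1.0 / tau2)
        post_mean = post_var * np.bincount(idx, weights=resid) / sigma2
        # M-step for beta: Lasso fit to the response with the predicted
        # random effects subtracted.
        beta = lasso_cd(X, y - post_mean[idx], lam, beta)
        # M-step for the variance components (closed form).
        err = y - X @ beta - post_mean[idx]
        sigma2 = (err @ err + (sizes * post_var).sum()) / n
        tau2 = np.mean(post_mean ** 2 + post_var)
    return beta, sigma2, tau2

# Simulated example with more predictors (300) than observations (200).
rng = np.random.default_rng(0)
m, n_per_group, p = 40, 5, 300
groups = np.repeat(np.arange(m), n_per_group)
X = rng.standard_normal((m * n_per_group, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = (X @ beta_true
     + rng.normal(0.0, 1.0, m)[groups]
     + rng.normal(0.0, 0.5, m * n_per_group))

beta_hat, sigma2_hat, tau2_hat = em_lasso_random_intercept(X, y, groups, lam=0.25)
selected = np.flatnonzero(beta_hat)
print("selected predictors:", selected)
```

In this sketch the penalty is fixed in advance; in practice it would be chosen by a criterion such as BIC or cross-validation, and the random-effects structure could be richer than a single intercept.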

References

A. Alabiso and J. Shang. “High-dimensional linear mixed model selection by partial correlation”. In: Communications in Statistics - Theory and Methods 52.18 (2023), pp. 6355–6380. doi: 10.1080/03610926.2022.2028838.

D. Bates, M. Mächler, B. Bolker, and S. Walker. “Fitting Linear Mixed-Effects Models Using lme4”. In: Journal of Statistical Software 67.1 (2015), pp. 1–48. doi: 10.18637/jss.v067.i01.

J. Bradic, G. Claeskens, and T. Gueuning. “Fixed Effects Testing in High-Dimensional Linear Mixed Models”. In: Journal of the American Statistical Association 115.532 (2020), pp. 1835–1850. doi: 10.1080/01621459.2019.1660172.

P. Bühlmann, M. Kalisch, and L. Meier. “High-Dimensional Statistics with a View Toward Applications in Biology”. In: Annual Review of Statistics and Its Application 1.1 (2014), pp. 255–278. doi: 10.1146/annurev-statistics-022513-115545.

J. Friedman, T. Hastie, and R. Tibshirani. “Regularization Paths for Generalized Linear Models via Coordinate Descent”. In: Journal of Statistical Software 33.1 (2010), pp. 1–22. doi: 10.18637/jss.v033.i01.

A. Groll. glmmLasso: Variable selection for generalized linear mixed models by L1-penalized estimation. Online. https://cran.r-project.org/package=glmmLasso, Accessed May 05, 2023.

N. M. Laird and J. H. Ware. “Random-Effects Models for Longitudinal Data”. In: Biometrics 38.4 (1982), pp. 963–974. doi: 10.2307/2529876.

L. A. Matos, V. H. Lachos, N. Balakrishnan, and F. V. Labra. “Influence diagnostics in linear and nonlinear mixed-effects models with censored data”. In: Computational Statistics & Data Analysis 57.1 (2013), pp. 450–464. doi: 10.1016/j.csda.2012.06.021.

D. C. R. Oliveira, F. L. Schumacher, and V. H. Lachos. “EMLMLasso: The use of the EM algorithm for regularization problems in high-dimensional linear mixed-effects models”. In: (2023). url: https://arxiv.org/abs/2308.01518.

J. Schelldorfer, P. Bühlmann, and S. van de Geer. “Estimation for high-dimensional linear mixed-effects models using ℓ1-penalization”. In: Scandinavian Journal of Statistics 38.2 (2011), pp. 197–214. doi: 10.1111/j.1467-9469.2011.00740.x.

F. L. Schumacher, V. H. Lachos, and L. A. Matos. “Scale mixture of skew-normal linear mixed models with within-subject serial dependence”. In: Statistics in Medicine 40.7 (2021), pp. 1790–1810. doi: 10.1002/sim.8870.

H. Wang, R. Li, and C. L. Tsai. “Tuning parameter selectors for the smoothly clipped absolute deviation method”. In: Biometrika 94.3 (2007), pp. 553–568. doi: 10.1093/biomet/asm053.

L. Yang and T. T. Wu. “Model-based clustering of high-dimensional longitudinal data via regularization”. In: Biometrics 79.2 (2022), pp. 761–774. doi: 10.1111/biom.13672.

Published

2025-01-20

Section

Full Papers