Linear programming applied to separation detection in polytomous logistic regression

Inácio Andruski Guimarães, Thiago Schinda Bubniak

Abstract


The logistic regression model is widely used in discriminant analysis. However, parameter estimation is affected by the data configuration and may not be achievable when the groups in the data set are separated, which is a common problem in discriminant analysis. The use of linear programming to detect separation between groups was proposed by [1], and a large number of linear programming approaches have been used to detect separated data in discriminant analysis. However, most research focuses on two-group models, and few models address classification problems with multiple groups. In this paper, a linear programming formulation is proposed to detect separation between groups for the polytomous logistic regression model. The proposed model has a non-negative objective function that takes a positive value when separation is detected. It allows the data to be classified as completely separated, almost separated, or overlapped, and it can be used as part of the parameter estimation. A simulation using data sets from the literature shows that the proposed approach can be an efficient alternative among mathematical programming methods for problems with multiple groups.
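The idea of a non-negative objective that is positive only under separation can be illustrated with a minimal sketch. This is not the formulation proposed in the paper, which the abstract does not state; it is one possible linear program, assuming a baseline-category parametrization and using SciPy's `linprog`. The function name `separation_status`, the slack variables `s`, and the tolerance `tol` are all hypothetical. The label "quasi-complete" is the usual name for what the abstract calls almost separated.

```python
import numpy as np
from scipy.optimize import linprog


def separation_status(X, y, tol=1e-7):
    """Classify a data set as 'complete', 'quasi-complete' or 'overlap'.

    Illustrative LP (an assumption, not the paper's model):
        maximize   sum_r s_r
        subject to (beta_g - beta_h)^T z_i >= s_r,  0 <= s_r <= 1,
    for every observation i in group g and every other group h,
    where z_i = (1, x_i) and the first group is the baseline (beta = 0).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Z = np.hstack([np.ones((n, 1)), X])          # add intercept column
    groups, codes = np.unique(np.asarray(y), return_inverse=True)
    G, d = len(groups), p + 1

    # One constraint per (observation, competing group) pair.
    rows = [(i, codes[i], h) for i in range(n) for h in range(G) if h != codes[i]]
    m = len(rows)
    nb = (G - 1) * d                             # free coefficients (baseline excluded)

    A = np.zeros((m, nb + m))
    for r, (i, g, h) in enumerate(rows):
        # Encode s_r - (beta_g - beta_h)^T z_i <= 0.
        if g > 0:
            A[r, (g - 1) * d:g * d] -= Z[i]
        if h > 0:
            A[r, (h - 1) * d:h * d] += Z[i]
        A[r, nb + r] = 1.0

    c = np.concatenate([np.zeros(nb), -np.ones(m)])   # linprog minimizes, so negate
    bounds = [(None, None)] * nb + [(0.0, 1.0)] * m
    res = linprog(c, A_ub=A, b_ub=np.zeros(m), bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)

    total = -res.fun                             # non-negative by construction
    if total > m - tol:
        return "complete"                        # every margin can be made strict
    if total > tol:
        return "quasi-complete"                  # some, but not all, margins strict
    return "overlap"                             # only the trivial solution exists
```

In this sketch the objective is zero exactly when the only feasible coefficients give zero margin everywhere, so the maximum likelihood estimates would be expected to exist; any positive value signals some form of separation. For example, `separation_status([[0], [1], [5], [6], [10], [11]], [0, 0, 1, 1, 2, 2])` describes three well-spaced groups on a line.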


Keywords


Polytomous Logistic Regression; Discriminant Analysis; Linear Programming; Complete Separation.

Full text:

PDF (English)

References


A. Albert, J. A. Anderson, On the existence of maximum likelihood estimates in logistic regression models, Biometrika, 71, 1-10, 1984.

K. P. Bennett, O. L. Mangasarian, Multicategory discrimination via linear programming, Optimization Methods and Software, 1, 23-24, 1992.

D. Brodnjak-Voncina, Z. C. Kodba, C. Novic, Multivariate data analysis in classification of vegetable oils characterized by the content of fatty acids, Chemometrics and Intelligent Laboratory Systems, 75, 31-43, 2005.

R. A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics, 3, 179-188, 1936.

N. Freed, F. Glover, Simple but powerful goal programming models for discriminant problems, European Journal of Operational Research, 7, 44-60, 1981.

W. Gochet, V. Srinivasan, A. Stam, S. Chen, Multi-group discriminant analysis using linear programming, Operations Research, 45 (2), 213-225, 1997.

K. Konis, Linear programming algorithms for detecting separated data in binary logistic regression models, DPhil in Computational Statistics, Worcester College, University of Oxford, Department of Statistics, 1 South Parks Road, Oxford OX1 3TG, United Kingdom, 2007.

D. G. Luenberger, Linear and Nonlinear Programming, 4th ed., Addison-Wesley Publishing Company, Reading, MA, 1973.

T. J. Santner, D. E. Duffy, A note on A. Albert and J. A. Anderson's conditions for the existence of maximum likelihood estimates in logistic regression models, Biometrika, 73, 1-10, 1986.

M. J. Silvapulle, J. Burridge, Existence of maximum likelihood estimates in regression models for grouped and ungrouped data, J. R. Statist. Soc. B, 48, 100-106, 1986.




DOI: https://doi.org/10.5540/03.2021.008.01.0427



SBMAC - Sociedade de Matemática Aplicada e Computacional
Edifício Medical Center - Rua Maestro João Seppe, nº. 900, 16º. andar - Sala 163 | São Carlos/SP - CEP: 13561-120
 

