Linear programming applied to separation detection in polytomous logistic regression

Authors

  • Inácio Andruski Guimarães
  • Thiago Schinda Bubniak

DOI:

https://doi.org/10.5540/03.2021.008.01.0427

Keywords:

Polytomous Logistic Regression, Discriminant Analysis, Linear Programming, Complete Separation.

Abstract

The logistic regression model is widely used in discriminant analysis. However, parameter estimation depends on the data configuration and may fail when the groups in the data set are separated, which is a common problem in discriminant analysis. The use of linear programming to detect separation between groups was proposed by [1], and many linear programming approaches have since been used to detect separated data in discriminant analysis. Most of this research focuses on two-group models, and few models address classification problems with multiple groups. In this paper, a linear programming formulation is proposed to detect separation between groups for the polytomous logistic regression model. The proposed model has a non-negative objective function that takes a positive value when separation is detected. It allows the data to be classified as completely separated, almost separated, or overlapped, and it can be used as part of the parameter estimation. A simulation using data sets from the literature shows that the proposed approach can be an efficient alternative among mathematical programming methods for problems with multiple groups.
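The idea of a non-negative objective that becomes positive under separation can be illustrated with a small sketch. This is not the formulation proposed in the paper, which the abstract does not state; it is a generic two-step check built on assumptions: class labels are coded 0, ..., G-1, each class g has a coefficient vector beta_g (intercept included) bounded in [-1, 1], and the margin of observation i against a rival class k is (beta_{y_i} - beta_k)^T x_i. The function names `margin_matrix` and `classify_separation` are hypothetical, and the solver is SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def margin_matrix(X, y, n_classes):
    """One row per (observation i, rival class k != y_i).

    Multiplying a row by the stacked coefficient vector gives the margin
    (beta_{y_i} - beta_k)^T x_i.
    """
    n, p = X.shape
    rows = []
    for i in range(n):
        for k in range(n_classes):
            if k == y[i]:
                continue
            row = np.zeros(n_classes * p)
            row[y[i] * p:(y[i] + 1) * p] = X[i]
            row[k * p:(k + 1) * p] = -X[i]
            rows.append(row)
    return np.array(rows)


def classify_separation(X, y, tol=1e-8):
    """Return "complete", "almost" or "overlap" (illustrative labels)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    X = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    n_classes = int(y.max()) + 1               # assumes labels 0..G-1
    M = margin_matrix(X, y, n_classes)
    m, d = M.shape
    beta_bounds = [(-1.0, 1.0)] * d            # keeps both LPs bounded

    # LP 1: maximise the smallest margin t, subject to M @ beta >= t, t >= 0.
    # beta = 0, t = 0 is always feasible, so the optimum is non-negative.
    c1 = np.zeros(d + 1)
    c1[-1] = -1.0                              # linprog minimises, so use -t
    A1 = np.hstack([-M, np.ones((m, 1))])      # -M @ beta + t <= 0
    res1 = linprog(c1, A_ub=A1, b_ub=np.zeros(m),
                   bounds=beta_bounds + [(0.0, None)], method="highs")
    if res1.status != 0:
        raise RuntimeError(res1.message)
    if -res1.fun > tol:
        return "complete"                      # every margin strictly positive

    # LP 2: maximise the sum of margins, subject to M @ beta >= 0.
    # A positive optimum means some margin is positive and none is negative.
    c2 = -M.sum(axis=0)
    res2 = linprog(c2, A_ub=-M, b_ub=np.zeros(m),
                   bounds=beta_bounds, method="highs")
    if res2.status != 0:
        raise RuntimeError(res2.message)
    return "almost" if -res2.fun > tol else "overlap"
```

Calling `classify_separation(X, y)` with a feature array `X` and integer labels `y` returns one of the three labels. Using two linear programs is a design choice of this sketch: the sum-of-margins optimum alone can sit at a vertex where some margin is zero even when the data are completely separated, so the smallest margin is checked first.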

Author Biographies

Inácio Andruski Guimarães

DAMAT-UTFPR, Curitiba, PR

Thiago Schinda Bubniak

PIBIC-UTFPR, Curitiba, PR

References

A. Albert, J. A. Anderson, On the existence of maximum likelihood estimates in logistic regression models. Biometrika, 71, 1-10, 1984.

K. P. Bennett, O. L. Mangasarian, Multicategory discrimination via linear programming. Optimization Methods and Software, 1, 23-24, 1992.

D. Brodnjak-Voncina, Z. C. Kodba, C. Novic, Multivariate data analysis in classification of vegetable oils characterized by the content of fatty acids. Chemometrics and Intelligent Laboratory Systems 75, 31-43, 2005.

R. A. Fisher, The use of multiple measurements in taxonomic problems. Annals of Eugenics 7, 179-188, 1936.

N. Freed, F. Glover, Simple but powerful goal programming models for discriminant problems. European Journal of Operational Research 7, 44-60, 1981.

W. Gochet, V. Srinivasan, A. Stam, S. Chen, Multi-group discriminant analysis using linear programming. Operations Research 45 (2), 213-225, 1997.

K. Konis, Linear programming algorithms for detecting separated data in binary logistic regression models. DPhil thesis in Computational Statistics, Worcester College, University of Oxford, Department of Statistics, 1 South Parks Road, Oxford OX1 3TG, United Kingdom, 2007.

D. G. Luenberger, Linear and Nonlinear Programming, 4th ed. Addison-Wesley Publishing Company, Reading, MA, 1973.

T. J. Santner, D. E. Duffy, A note on A. Albert and J. A. Anderson's conditions for the existence of maximum likelihood estimates in logistic regression models. Biometrika, 73, 1-10, 1986.

M. J. Silvapulle, J. Burridge, Existence of maximum likelihood estimates in regression models for grouped and ungrouped data. J. R. Statist. Soc. B, 48, 100-106, 1986.

Published

2021-12-20

Section

Full Papers