On the Comparison of Some Link Functions of Binary Response Analysis Under Symmetric and Asymmetric Assumption

Binary response analysis applies when the response variable is nominal with two categories, a setting that violates the assumptions of the ordinary linear regression model. This paper uses the classical approach to fit categorical response regression models with the logit, probit, loglog, and complementary loglog (cloglog) link functions under symmetric and asymmetric assumptions. Past studies have suggested that these link functions can only be compared when n is large (say, n > 1000). In this study, we investigate that claim by comparing the link functions at sample sizes below 1000. We fit the cloglog and loglog models to data on 600 tuberculosis patients who may be co-infected with hypertension, and we used R to simulate binary data for fitting the logit and probit models. The Akaike Information Criterion (AIC) served as the basis of comparison for both the symmetric and the asymmetric links. Results from the simulated data of sample size 50 showed that the two symmetric link functions yield different AIC values, with the probit link having the smaller AIC, which indicates that the probit link should be preferred under the symmetric assumption. Under the asymmetric assumption, the loglog link outperformed the cloglog link, with smaller AIC values on the real-life dataset, which suggests that the loglog link should be preferred under the asymmetric assumption. Furthermore, Table 6 indicates that type of occupation is the only significant factor associated with hypertension among the tuberculosis patients under study, using both the cloglog and loglog link functions. On this basis, we recommend that patients with diabetes be given less strenuous jobs and occupations. Finally, we were able to show that the link functions can be distinguished even with small samples (n < 1000) under the two assumptions.
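The comparison described above, fitting the same binary data under several link functions and ranking the fits by AIC, can be sketched as follows. This is only an illustrative sketch in Python using the standard library, not the R code used in the study. The seed, the single covariate, the true coefficients (0.5 and 1.0), and the logit data-generating model are all assumptions made for the example, and the loglog link is taken here as eta = -log(-log(mu)), a convention that differs between texts. The fit uses Fisher scoring with step halving, and AIC = 2k - 2 log L with k = 2 parameters.

```python
import math
import random

# Each link returns (mu, d mu / d eta) for a linear predictor eta.
def _logit(eta):
    mu = 1.0 / (1.0 + math.exp(-eta))
    return mu, mu * (1.0 - mu)

def _probit(eta):
    mu = 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))
    return mu, math.exp(-0.5 * eta * eta) / math.sqrt(2.0 * math.pi)

def _cloglog(eta):
    return 1.0 - math.exp(-math.exp(eta)), math.exp(eta - math.exp(eta))

def _loglog(eta):
    # Assumed convention: eta = -log(-log(mu)).
    return math.exp(-math.exp(-eta)), math.exp(-eta - math.exp(-eta))

LINKS = {"logit": _logit, "probit": _probit,
         "cloglog": _cloglog, "loglog": _loglog}
EPS = 1e-10  # keeps fitted probabilities strictly inside (0, 1)

def _mu_and_slope(link, b0, b1, xi):
    eta = max(-30.0, min(30.0, b0 + b1 * xi))  # clamp to avoid overflow
    mu, d = link(eta)
    return min(1.0 - EPS, max(EPS, mu)), d

def log_likelihood(link, b0, b1, x, y):
    total = 0.0
    for xi, yi in zip(x, y):
        mu, _ = _mu_and_slope(link, b0, b1, xi)
        total += yi * math.log(mu) + (1 - yi) * math.log(1.0 - mu)
    return total

def fit(link, x, y, iterations=50):
    """Fisher scoring for intercept b0 and slope b1, with step halving."""
    b0, b1 = 0.0, 0.0
    ll = log_likelihood(link, b0, b1, x, y)
    for _ in range(iterations):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for xi, yi in zip(x, y):
            mu, d = _mu_and_slope(link, b0, b1, xi)
            var = mu * (1.0 - mu)
            r = (yi - mu) * d / var   # score contribution
            w = d * d / var           # Fisher information weight
            s0 += r
            s1 += r * xi
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        det = i00 * i11 - i01 * i01
        if det <= 1e-12:
            break
        step0 = (i11 * s0 - i01 * s1) / det
        step1 = (i00 * s1 - i01 * s0) / det
        scale = 1.0
        while scale > 1e-6:
            new_ll = log_likelihood(link, b0 + scale * step0,
                                    b1 + scale * step1, x, y)
            if new_ll >= ll:
                break
            scale /= 2.0
        else:
            break  # no improving step found
        b0 += scale * step0
        b1 += scale * step1
        converged = new_ll - ll < 1e-10
        ll = new_ll
        if converged:
            break
    return b0, b1, ll

# Simulated data of size 50 (all settings below are illustrative assumptions).
random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(50)]
y = [1 if random.random() < _logit(0.5 + 1.0 * xi)[0] else 0 for xi in x]

results = {}
for name, link in LINKS.items():
    b0, b1, ll = fit(link, x, y)
    results[name] = (b0, b1, ll, 2 * 2 - 2 * ll)  # AIC with k = 2

for name, (b0, b1, ll, aic) in sorted(results.items(), key=lambda kv: kv[1][3]):
    print(f"{name:8s} b0={b0:7.3f} b1={b1:7.3f} logLik={ll:9.3f} AIC={aic:8.3f}")
```

Because all four models have the same number of parameters, ranking by AIC here is equivalent to ranking by maximized log-likelihood. Which link attains the smallest AIC depends on the data, so a single simulated sample like this one should not be read as confirming or contradicting the paper's findings.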