DEVELOPMENT OF ALTERNATIVE LINEAR ESTIMATORS IN COMPLEX SURVEYS

ABSTRACT

The estimation of multiple characteristics using the Probability Proportional to Size (PPS) sampling scheme has introduced some complexities in sample surveys. It requires the transformation of auxiliary information into probability measures and the use of the correlation coefficient between the study variable $y$ and the measure of size $x$. Existing estimators of finite population characteristics are rigidly specified by a fixed order of positive correlation between $y$ and $x$ and are assumed efficient for all populations. However, these assumptions break down when the study variables are negatively correlated with the measure of size. In this study, a linear class of estimators that are functions of moments in positive and negative correlation coefficients was proposed. Using laws of proportions and probability measure theory, a class of alternative linear estimators $\hat{\tau}_{g,c}$ was developed for use in PPS sampling schemes. Using a linear regression model with slope $\beta$ and well-behaved error term $\varepsilon$, the expectation of the $c$-th standardized moment of the study variable, given by $E\left[\left(\frac{y-\mu_y}{\sigma_y}\right)^{c}\right] = E\left[\left(\beta\,\frac{x-\mu_x}{\sigma_y} + \frac{\varepsilon-\mu_\varepsilon}{\sigma_y}\right)^{c}\right]$, $c = 1,2,3,4$, with $\beta^{c} = \left(\rho^{2}\sigma_y^{2}/\sigma_x^{2}\right)^{c/2}$, provided a link between moments in the correlation coefficient and the distribution of the target population, where $\rho$ is the correlation coefficient, and $\mu_y, \mu_x, \mu_\varepsilon$ and $\sigma_y^{2}, \sigma_x^{2}, \sigma_\varepsilon^{2}$ are the means and variances of $y$, $x$ and $\varepsilon$ respectively. Minimum variance was used as the optimality criterion for comparing the performance of $\hat{\tau}_{g,c}$ with the conventional estimator, namely Hansen and Hurwitz's estimator ($\hat{\tau}_{HH}$), and with other existing alternative estimators, namely Amahia-Chaubey-Rao's estimator ($\hat{\tau}_{ACR}$), Grewal's estimator ($\hat{\tau}_{G}$), Rao's estimator ($\hat{\tau}_{R}$) and Ekaette's estimator ($\hat{\tau}_{E}$), under the PPS sampling design. Using the general super-population model with parameter $g$, the expected Mean Square Error (MSE) was derived for the estimators and their relative efficiencies were then computed. Empirical studies were conducted with samples drawn from four populations, namely Populations I, II, III and IV, having correlation coefficients $\rho = 0.16, 0.39, -0.32$ and $-0.775$ respectively. The derived transformation for the generalized selection probabilities defining the class of linear estimators is $p^{*}_{i,g} = \frac{1-\rho^{c}}{N} + \rho^{c} p_i$; $c = 1,2,3,4$, where $p_i = x_i/X$, $X = \sum_{i=1}^{N} x_i$, or $p_i = z_i/Z$, $Z = \sum_{i=1}^{N} z_i$, $z_i = 1/x_i$, for positive and negative correlations respectively. Provided that $CV_x < CV_y$, $\gamma_y < \gamma_x$, $K_y < K_x$ and $\rho^{2} < 1$ for both positive and negative correlations, where $CV_y, \gamma_y, K_y$ and $CV_x, \gamma_x, K_x$ are the coefficients of variation, skewness and kurtosis of $y$ and $x$ respectively and $\rho^{2}$ is the coefficient of determination, $\hat{\tau}_{g,c}$ with $c = 2$ was the best estimator for Population II, while $\hat{\tau}_{g,c}$ with $c = 1$ was the best estimator for Population I in terms of relative mean square error for positive correlation.
Under the same conditions and for negative correlation, $\hat{\tau}_{g,c}$ with $c = 2$ and $c = 4$ were the best estimators for Populations III and IV respectively in terms of relative mean square error. At $g = 0$, $\xi\mathrm{MSE}(\hat{\tau}_{1}) = 131.293 < \xi\mathrm{MSE}(\hat{\tau}_{HH}) = 134.3$, $\xi\mathrm{MSE}(\hat{\tau}_{2}) = 826.5 < \xi\mathrm{MSE}(\hat{\tau}_{HH}) = 1043.0$, $\xi\mathrm{MSE}(\hat{\tau}_{2}) = 254.3 < \xi\mathrm{MSE}(\hat{\tau}_{HH}) = 329.7$ and $\xi\mathrm{MSE}(\hat{\tau}_{4}) = 266.3 < \xi\mathrm{MSE}(\hat{\tau}_{HH}) = 229.2$ for Populations I, II, III and IV respectively. Similarly, when $g = 1$, $\xi\mathrm{MSE}(\hat{\tau}_{g,c}) < \xi\mathrm{MSE}(\hat{\tau}_{HH})$ for all populations. However, at $g = 2$, $\hat{\tau}_{HH}$ is relatively more efficient than the alternative estimators. All estimators converge to $\hat{\tau}_{HH}$ when $\rho = \pm 1$ and to $\hat{\tau}_{R}$ when $\rho = 0$. The developed alternative estimators accommodated all dimensions of correlation coefficients. The derived estimators also reflected the structure of the population distribution and enhanced its power of estimation.
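
The transformation of selection probabilities described above can be illustrated with a minimal Python sketch. It is not the study's implementation. The function names `generalized_probabilities` and `pps_total_estimate` are hypothetical, and the sketch rests on two assumptions that the abstract does not state explicitly: that the magnitude $|\rho|^{c}$ is used as the mixing weight, so that the probabilities stay non-negative and reduce to $p_i$ at $\rho = \pm 1$ and to $1/N$ at $\rho = 0$, and that the estimator of the population total takes the Hansen-Hurwitz form with $p^{*}_{i,g}$ in place of $p_i$.

```python
def generalized_probabilities(x, rho, c):
    """Return p*_i = (1 - w)/N + w * p_i for every unit in the population.

    ASSUMPTION: the weight is w = |rho|**c (magnitude of the correlation).
    For rho >= 0 the size measure is x_i; for rho < 0 it is z_i = 1/x_i,
    as described in the abstract.
    """
    if c not in (1, 2, 3, 4):
        raise ValueError("c must be 1, 2, 3 or 4")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    if any(value <= 0 for value in x):
        raise ValueError("size measures must be strictly positive")

    # Choose the size measure according to the sign of the correlation.
    size = [1.0 / value for value in x] if rho < 0 else [float(value) for value in x]
    total = sum(size)
    n_units = len(size)
    weight = abs(rho) ** c

    return [(1.0 - weight) / n_units + weight * s / total for s in size]


def pps_total_estimate(y_sample, p_sample):
    """Hansen-Hurwitz-type estimate of a population total from a PPS
    with-replacement sample: the mean of y_i / p_i over the sampled units.

    ASSUMPTION: the alternative estimators use this form with p*_i.
    """
    if len(y_sample) != len(p_sample) or not y_sample:
        raise ValueError("samples must be non-empty and of equal length")
    return sum(y / p for y, p in zip(y_sample, p_sample)) / len(y_sample)


if __name__ == "__main__":
    sizes = [2.0, 4.0, 4.0]          # illustrative size measures, not study data
    probs = generalized_probabilities(sizes, rho=0.5, c=2)
    print(probs, sum(probs))
```

Under these assumptions the transformed probabilities always sum to one, because they are a convex combination of the equal-probability design ($1/N$) and the size-based design ($p_i$).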