Regression estimator of population mean with random missing values in stratified two stage sampling using double sampling for auxiliary information
DOI: https://doi.org/10.64497/jssci.128
Abstract
Missing data are a recurring challenge in survey sampling, often reducing the efficiency and reliability of estimators. This study proposed and investigated a regression-type estimator of the population mean under a stratified two-stage sampling design in the presence of missing values. The study data comprised field survey records of students’ school attendance (auxiliary variable) and mathematics test scores (study variable). A stratified two-stage design was adopted, with schools as primary sampling units and students as secondary sampling units. To reflect item nonresponse, 20% of the study-variable values were set to missing completely at random (MCAR) and handled through both regression and ratio imputation. The performance of the proposed regression estimator was compared with the Bahl-Saini (2011) ratio and difference estimators across sample sizes of 25, 40, 70, and 100, using the coefficient of variation (CV) and confidence intervals as evaluation criteria. Results showed that the regression estimator consistently achieved lower variances and CVs than the existing estimators, with efficiency improving as sample size increased. Even in the presence of missing data, the regression estimator maintained superior performance, particularly under regression imputation. The study demonstrated the efficiency of the regression estimator in handling incomplete data and highlighted its practical significance for reliable estimation in complex survey designs.
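The workflow summarised in the abstract (MCAR deletion, regression or ratio imputation, a regression estimator that borrows the auxiliary mean from a larger first-phase sample, and CV as the yardstick) can be illustrated with a minimal Python sketch. It is not the authors' estimator: it assumes simple random sampling at both phases instead of the stratified two-stage design, uses the classical double-sampling regression form ȳ + b(x̄′ − x̄), and runs on a synthetic attendance/score population. All function names, parameters, and population values are hypothetical.

```python
import random
import statistics as st


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = st.fmean(x), st.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def impute(x, y, method="regression"):
    """Fill None entries of y from x, using respondents only."""
    xr = [xi for xi, yi in zip(x, y) if yi is not None]
    yr = [yi for yi in y if yi is not None]
    if method == "regression":
        a, b = ols_fit(xr, yr)
        fill = lambda xi: a + b * xi      # predicted value on the fitted line
    elif method == "ratio":
        r = st.fmean(yr) / st.fmean(xr)
        fill = lambda xi: r * xi          # respondent ratio times x
    else:
        raise ValueError(f"unknown method: {method}")
    return [fill(xi) if yi is None else yi for xi, yi in zip(x, y)]


def double_sampling_regression_mean(x_first, x_second, y_second):
    """Classical double-sampling regression estimate of the mean of y."""
    _, b = ols_fit(x_second, y_second)
    return st.fmean(y_second) + b * (st.fmean(x_first) - st.fmean(x_second))


def demo(seed=0, n_first=300, n_second=100, miss_rate=0.2, reps=200):
    """Monte Carlo CV (%) of the estimator under each imputation method."""
    rng = random.Random(seed)
    # Hypothetical population: attendance (x) and mathematics score (y).
    pop_x = [rng.uniform(40, 100) for _ in range(5000)]
    pop_y = [10 + 0.6 * xi + rng.gauss(0, 8) for xi in pop_x]
    estimates = {"regression": [], "ratio": []}
    for _ in range(reps):
        first = rng.sample(range(len(pop_x)), n_first)   # first phase: x only
        second = rng.sample(first, n_second)             # second phase: x and y
        x1 = [pop_x[i] for i in first]
        x2 = [pop_x[i] for i in second]
        y2 = [pop_y[i] for i in second]
        # MCAR: delete a fixed share of y values, chosen independently of x and y.
        for j in rng.sample(range(n_second), round(miss_rate * n_second)):
            y2[j] = None
        for method in estimates:
            y_filled = impute(x2, y2, method)
            estimates[method].append(
                double_sampling_regression_mean(x1, x2, y_filled)
            )
    return {m: 100 * st.stdev(v) / st.fmean(v) for m, v in estimates.items()}


if __name__ == "__main__":
    for method, cv in demo().items():
        print(f"{method} imputation: CV = {cv:.2f}%")
```

Calling `demo()` returns one Monte Carlo CV per imputation method. The figures describe only this synthetic setup and should not be read as a reproduction of the study's results.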
References
[1] Neyman, J. (1934). On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society, 97(4), 558–625. https://doi.org/10.2307/2342192
[2] Rao, J. N. K. (2020). On Making Valid Inferences by Integrating Data from Surveys and Other Sources. Sankhya Series B, 83‑B, 242–272. https://doi.org/10.1007/s13571-020-00227-w
[3] Särndal, C. E., Swensson, B., and Wretman, J. (2003). Model Assisted Survey Sampling (pp. 303–327). Springer.
[4] Little, R. J. A., and Rubin, D. B. (2019). Statistical Analysis with Missing Data (3rd ed., pp. 53–98). John Wiley & Sons. https://doi.org/10.1002/9781119482260
[5] Kim, J. K., and Kim, J. J. (2016). A Note on Fully Efficient Estimators in Two-Phase Sampling with Nonresponse. Survey Methodology, 42(2), 283–293.
[6] Rubin, D. B. (2020). Multiple Imputation for Nonresponse in Surveys (2nd ed., pp. 25–76). John Wiley & Sons.
[7] Bahl, S., and Saini, M. (2011). Estimation of Population Total in Two Stage Design with PPS Sampling and Using Double Sampling for Auxiliary Information. International Journal of Statistics and Systems, 6(1), 67–76.
[8] Saini, M., and Bahl, S. (2013). Estimation of Mean in Two Stage Design Using Double Sampling for Stratification and Multiauxiliary Information at SSU Level. International Journal of Agricultural and Statistical Sciences, 9(1), 45–56.
[9] Little, R. J., and Rubin, D. B. (2002). Statistical Analysis with Missing Data. Hoboken, NJ: John Wiley & Sons. https://doi.org/10.1002/9781119013563
[10] Ahmed, M. G., and Rania, H. M. A. (2017). Imputation Methods for Longitudinal Data: A Comparative Study. International Journal of Statistical Distributions and Applications, 3(4), 72–80. https://doi.org/10.11648/j.ijsd.20170304.13
License
Copyright (c) 2025 P.I. Iorlaha, S.C. Nwaosu, T. Uba, A.J. Ikughur

This work is licensed under a Creative Commons Attribution 4.0 International License.

