Journal of Statistical Theory and Applications

Volume 19, Issue 1, March 2020, Pages 28 - 35

On a Class of Almost Unbiased Ratio Type Estimators

Authors
A.K.P.C. Swain¹,*, Priyaranjan Dash²
¹Former Professor of Statistics, Utkal University, Bhubaneswar, India
²Department of Statistics, Utkal University, Bhubaneswar, India
*Corresponding author. Email: akpcs@rediffmail.com
Received 4 July 2018, Accepted 15 October 2019, Available Online 2 March 2020.
DOI
10.2991/jsta.d.200224.007
Keywords
Simple random sampling; Ratio estimator; Bias; Almost unbiased ratio type estimator; Variance; Efficiency
Abstract

In sample surveys the ratio estimator has found extensive application in obtaining more precise estimators of the population ratio, population mean, and population total of the study variable in the presence of auxiliary information, when the study variable is positively correlated with the auxiliary variable. The theory underlying the ratio method of estimation is the same whether we estimate the population ratio or the population mean or total, except that in the latter case we assume advance knowledge of the population mean or total of the auxiliary variable. In this paper we use the term ratio estimator for both purposes. In spite of its simplicity, the ratio estimator carries an unwelcome bias, although the bias decreases as the sample size increases and is negligible for large samples. In small samples the bias may be substantial enough to affect the reliability of the estimate. As pointed out by L.A. Goodman, H.O. Hartley, J. Am. Stat. Assoc. 53 (1958), 491–508, in sample surveys where very small samples are drawn from a large number of strata in stratified random sampling, with the ratio method of estimation applied in each stratum, the combined bias from all the strata may assume serious proportions. This calls for techniques, either at the estimation stage or in the sampling scheme at the selection stage, that reduce the bias or eliminate it completely so that the estimator is usable in practice. This has motivated many research workers, among them E.M.L. Beale, Ind. Organ. 31 (1962), 27–28 and M. Tin, J. Am. Stat. Assoc. 60 (1965), 294–307, to construct estimators at the estimation stage that remove the bias of $O(1/n)$, where $n$ is the sample size, thus reducing the bias to $O(1/n^2)$. Such estimators are termed almost unbiased ratio-type estimators in the literature.
In this paper we propose a class of almost unbiased ratio-type estimators following the techniques of E.M.L. Beale, Ind. Organ. 31 (1962), 27–28 and M. Tin, J. Am. Stat. Assoc. 60 (1965), 294–307, and compare them with regard to bias and efficiency.

Copyright
© 2020 The Authors. Published by Atlantis Press SARL.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

Large-scale sample surveys are often conducted in countries around the world to assess the present status of certain sectors of the economy for future planning. In such surveys it is general practice to adopt stratification, dividing the heterogeneous population into homogeneous groups called strata. Sometimes, besides observing the main variable under study, observations are also taken on certain auxiliary variables, stipulated at the planning stage or even during the course of the investigation, to improve the efficiency of the estimators of the parameters of the main variable. The simplest method of using auxiliary information in the case of a single auxiliary variable, when the main variable and the auxiliary variable are positively correlated, is the ratio method of estimation, advocated by Cochran [3] among many earlier workers in sample surveys [5]. It is well known that the ratio estimator of the population mean, total, or ratio is biased, although the bias may be negligible for large sample sizes. Even for moderately large samples the bias may be substantial, more so under stratification, where the biases accumulate over strata and may make the overall estimate unacceptable for its intended purpose (Goodman and Hartley [7], Cochran [4]). This calls for estimators in which the bias of $O(1/n)$, $n$ being the sample size, is removed, so that the remaining bias is of $O(1/n^2)$. Beale [1] and Tin [14] devised ways to adjust the estimator for the bias through the asymptotic series expansion of the ratio estimator under certain assumptions. These improved ratio estimators, with the first-order bias removed, are known in the sampling theory literature as almost unbiased ratio-type estimators. De-Graft Johnson [6] and David [5] have made extensive studies of the ratio method of estimation.

Let there be a finite population $U$ having $N$ distinct and identifiable units $\{U_1, U_2, \ldots, U_N\}$ indexed by paired values of the study variable $y$ and the positively correlated auxiliary variable $x$, such as $(Y_1, X_1), (Y_2, X_2), \ldots, (Y_N, X_N)$. Assume that both $y$ and $x$ are positively measured.

Draw a simple random sample of size $n$ without replacement from the finite population of $N$ units, and let the paired values on the sample units be $(y_1, x_1), (y_2, x_2), \ldots, (y_n, x_n)$.

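Define the population means of $y$ and $x$ as $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ and $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$ respectively, and the population variances and covariance of $y$ and $x$ as $S_y^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2$, $S_x^2 = \frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})^2$, and $S_{xy} = \frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})$ respectively. Define further $C_x^2 = S_x^2/\bar{X}^2$ and $C_y^2 = S_y^2/\bar{Y}^2$ as the population squared coefficients of variation of $x$ and $y$ respectively. Also, the population coefficient of co-variation is $C_{xy} = S_{xy}/(\bar{Y}\bar{X}) = \rho C_x C_y$, $\rho$ being the coefficient of correlation between $y$ and $x$. The population regression coefficient of $y$ on $x$ is $\beta = S_{xy}/S_x^2$. Further, the population ratio is $R = \bar{Y}/\bar{X}$.

The sample means of $y$ and $x$ are respectively $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ and $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The sample variances of $y$ and $x$ are $s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2$ and $s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ respectively, and the sample covariance is $s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$.

Define $c_x^2 = s_x^2/\bar{x}^2$, $c_y^2 = s_y^2/\bar{y}^2$, and $c_{xy} = s_{xy}/(\bar{x}\bar{y})$.

$c_x^2$, $c_y^2$, and $c_{xy}$ are consistent estimators of $C_x^2$, $C_y^2$, and $C_{xy}$ respectively.

Also, the sample ratio is $r = \bar{y}/\bar{x}$.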

The ratio estimator of the population mean $\bar{Y}$ is given by

$$\hat{\bar{Y}}_r = \frac{\bar{y}}{\bar{x}}\,\bar{X}, \tag{1}$$
where $\bar{X}$ is known in advance.

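Expanding (1) in a power series (Sukhatme et al. [12]) and using the results $V(\bar{y})/\bar{Y}^2 = \theta C_y^2$, $V(\bar{x})/\bar{X}^2 = \theta C_x^2$, and $\mathrm{Cov}(\bar{x}, \bar{y})/(\bar{X}\bar{Y}) = \theta C_{xy}$, we have, to $O(1/n)$,

$$E(\hat{\bar{Y}}_r) = \bar{Y}\left[1 + \theta\left(C_x^2 - C_{xy}\right)\right], \tag{2}$$
where $\theta = \frac{1}{n} - \frac{1}{N}$.

Hence the bias to $O(1/n)$ is given by

$$\mathrm{Bias}(\hat{\bar{Y}}_r) = \theta\,\bar{Y}\left(C_x^2 - C_{xy}\right).$$

$\hat{\bar{Y}}_r$ is a biased but consistent estimator of $\bar{Y}$, with the bias being negligible in large samples.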
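The first-order bias in (2) can be estimated from a single sample by replacing $C_x^2$ and $C_{xy}$ with their sample analogues $c_x^2$ and $c_{xy}$. The Python sketch below illustrates this under stated assumptions: the function names `sample_cov` and `ratio_estimate` are hypothetical, and substituting the ratio estimate itself for the unknown $\bar{Y}$ in the bias formula is a choice made here for illustration only.

```python
from statistics import mean, variance


def sample_cov(xs, ys):
    """Sample covariance s_xy with divisor n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def ratio_estimate(ys, xs, x_bar_pop, N):
    """Return the ratio estimate (1) and a plug-in estimate of its O(1/n) bias."""
    n = len(ys)
    y_bar, x_bar = mean(ys), mean(xs)
    theta = 1 / n - 1 / N
    c_x2 = variance(xs) / x_bar ** 2                 # c_x^2 = s_x^2 / xbar^2
    c_xy = sample_cov(xs, ys) / (x_bar * y_bar)      # c_xy = s_xy / (xbar * ybar)
    estimate = y_bar / x_bar * x_bar_pop
    # Assumption: the estimate stands in for the unknown population mean.
    bias_hat = theta * estimate * (c_x2 - c_xy)
    return estimate, bias_hat
```

When $y$ is exactly proportional to $x$ in the sample, $c_x^2 = c_{xy}$ and the estimated first-order bias vanishes, in line with (2).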

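Alternatively we may put

$$E(\hat{\bar{Y}}_r) = \bar{Y} + \bar{Y}\theta_1\left(C_{20} - C_{11}\right), \tag{3}$$
where $C_{ij} = \dfrac{\mu_{ij}}{\bar{X}^i\bar{Y}^j} = \dfrac{1}{N}\sum_{k=1}^{N}\dfrac{(X_k - \bar{X})^i (Y_k - \bar{Y})^j}{\bar{X}^i\bar{Y}^j}$ and $\theta_1 = \dfrac{N-n}{(N-1)n}$.

For very large $N$, $\theta_1 \approx \theta$.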

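The variance of $\hat{\bar{Y}}_r$ to $O(1/n)$ is given by (Sukhatme et al. [12])

$$V(\hat{\bar{Y}}_r) = \theta\bar{Y}^2\left(C_y^2 + C_x^2 - 2C_{xy}\right). \tag{4}$$

Alternatively,

$$V(\hat{\bar{Y}}_r) = \theta\bar{Y}^2\left(C_{02} + C_{20} - 2C_{11}\right). \tag{5}$$

$\hat{\bar{Y}}_r$ is more efficient than $\bar{y}$ if

$$\rho > \frac{1}{2}\frac{C_x}{C_y}, \quad\text{that is, if}\quad C_{11} > \frac{1}{2}C_{20}, \tag{6}$$
where $\rho$ is the correlation coefficient between $y$ and $x$.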

Beale [1] suggested an ingenious almost unbiased ratio-type estimator given by

$$\hat{\bar{Y}}_{rB} = \hat{\bar{Y}}_r\,\frac{1 + \theta s_{xy}/(\bar{x}\bar{y})}{1 + \theta s_x^2/\bar{x}^2}. \tag{7}$$

Tin [14] derived another almost unbiased estimator by subtracting the estimate of the first-order bias of $O(1/n)$ from the estimator $\hat{\bar{Y}}_r$ itself, so as to obtain an estimator whose bias is of $O(1/n^2)$. Thus Tin's estimator is given by

$$\hat{\bar{Y}}_{rT} = \hat{\bar{Y}}_r\left[1 + \theta\left(\frac{s_{xy}}{\bar{x}\bar{y}} - \frac{s_x^2}{\bar{x}^2}\right)\right]. \tag{8}$$

Beale's estimator is in ratio form, which reduces to Tin's form after asymptotic expansion retaining terms up to $O(1/n)$. It also eliminates the bias of $O(1/n)$ of the ratio estimator. Both Tin's and Beale's estimators use the same information, $s_{xy}/(\bar{x}\bar{y})$ and $s_x^2/\bar{x}^2$, in their formulations.
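As a minimal sketch of (7) and (8), the following Python function computes the sample ratio together with Beale's and Tin's adjusted versions of it (multiplying each by the known $\bar{X}$ gives the corresponding estimator of $\bar{Y}$). The function name `ratio_estimators` and the toy data in the usage lines are hypothetical and serve only as an illustration.

```python
from statistics import mean, variance


def ratio_estimators(ys, xs, N):
    """Return (r, r_beale, r_tin), three estimators of the population ratio R."""
    n = len(ys)
    theta = 1 / n - 1 / N
    y_bar, x_bar = mean(ys), mean(xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    c_xy = s_xy / (x_bar * y_bar)
    c_x2 = variance(xs) / x_bar ** 2
    r = y_bar / x_bar
    r_beale = r * (1 + theta * c_xy) / (1 + theta * c_x2)   # ratio form, as in (7)
    r_tin = r * (1 + theta * (c_xy - c_x2))                 # additive form, as in (8)
    return r, r_beale, r_tin


# Hypothetical sample of size 4 from a population of size 50.
r, r_beale, r_tin = ratio_estimators([3.0, 5.0, 4.0, 9.0], [1.0, 2.0, 3.0, 4.0], N=50)
print(r, r_beale, r_tin)
```

Algebraically the two adjusted estimators differ by $r\,\theta^2 c_x^2 (c_x^2 - c_{xy})/(1 + \theta c_x^2)$, a term of order $\theta^2$, which is why they agree to $O(1/n)$.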

Considering terms up to $O(1/n^2)$, Tin [14] has shown that Beale's estimator is less biased than his own estimator and equally efficient. A disadvantage of Tin's estimator is that it may take negative values for a positive population ratio, a situation pointed out by Beale in a private communication with Tin [14] for a sample of size two. A similar picture has been seen on drawing a sample of size 8 from a bivariate normal population having $\bar{X} = 5$, $\bar{Y} = 15$, $S_x^2 = 45$, $S_y^2 = 500$, and correlation coefficient $\rho = 0.4$. The computed estimates are $r = 117.11$, $r_b = 1.55$, and $r_t = 64963.33$ (privately communicated by Mr. Xiafei Zhang, Iowa State University).

Again, writing Beale's estimator $r_b$ for $R$ as

$$r_b = r\,\frac{1 + \theta s_{yx}/(\bar{y}\bar{x})}{1 + \theta s_x^2/\bar{x}^2} = \frac{\bar{y}\bar{x} + \theta s_{yx}}{\bar{x}^2 + \theta s_x^2},$$
we find that when $\bar{x}$ is small or even $\bar{x} = 0$, Beale's estimator is dominated by $s_{yx}/s_x^2$ and hence will not give extreme values; it thus has a control to avoid extremes. As compared to Beale's estimator, Tin's estimator is dominated by $s_x^2/\bar{x}^2$ and hence is extremely large when $\bar{x}$ is small. The advantage of Beale's estimator is that it can deal with the case $\bar{x} = 0$. These thoughts were expressed by Professor W.A. Fuller in a private communication.
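The contrast described above can be illustrated numerically. In the sketch below the two estimators are written in terms of sample summaries, and $\bar{x}$ is moved towards zero while the other summaries are held fixed. The function names and the numerical values are hypothetical; holding $\bar{y}$, $s_{yx}$, and $s_x^2$ fixed while $\bar{x}$ varies is a simplifying assumption made only to isolate the effect of a small $\bar{x}$.

```python
def beale_ratio(y_bar, x_bar, s_yx, s_x2, theta):
    """Beale's estimator of R in the form (ybar*xbar + theta*s_yx) / (xbar^2 + theta*s_x2)."""
    return (y_bar * x_bar + theta * s_yx) / (x_bar ** 2 + theta * s_x2)


def tin_ratio(y_bar, x_bar, s_yx, s_x2, theta):
    """Tin's estimator of R; it divides by x_bar, so it is undefined at x_bar = 0."""
    r = y_bar / x_bar
    return r * (1 + theta * (s_yx / (x_bar * y_bar) - s_x2 / x_bar ** 2))


# Hypothetical sample summaries, chosen only for illustration.
y_bar, s_yx, s_x2, theta = 15.0, 60.0, 45.0, 1 / 8

for x_bar in (5.0, 1.0, 0.1, 0.01):
    print(x_bar,
          beale_ratio(y_bar, x_bar, s_yx, s_x2, theta),
          tin_ratio(y_bar, x_bar, s_yx, s_x2, theta))

# At x_bar = 0 Beale's form reduces to s_yx / s_x2.
print(beale_ratio(y_bar, 0.0, s_yx, s_x2, theta))
```

As $\bar{x}$ shrinks, Beale's form stays close to $s_{yx}/s_x^2$, whereas Tin's form is driven by the term $s_x^2/\bar{x}^2$ and grows without bound in absolute value.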

As noted by Tin [14], Beale's estimator seems to be better than his own as regards reducing the bias of the ratio estimator, and in large samples there is only a marginal loss of efficiency compared with his estimator.

Beale's estimator has been fruitfully applied in hydrological studies by Lee et al. [8] for load estimation using dense water quality data. Assuming positive correlation between flux and flow, Richards and Holloway [9] and Richards [10] used Beale's estimator for flux estimation in the Great Lakes region and other parts of the United States, generally applying more complex strata. They showed that Beale's estimator generally exhibited greater estimation accuracy and lower bias. Carriquiry et al. [2] have studied the estimation of usual intake distributions of ratios of dietary components using Beale's estimator.

In this paper Srivastava's [11] class of estimators is considered in order to derive its Beale-type and Tin-type almost unbiased ratio-type estimators and to compare them with regard to bias and efficiency. Further, as a special case, Swain's [13] square root transformation estimator is discussed at length to compare the Beale-type and Tin-type estimators with regard to bias and efficiency. As exact comparisons are not possible, we have used asymptotic expansions and considered terms to $O(1/n^2)$.

2. A CLASS OF ALMOST UNBIASED RATIO TYPE ESTIMATORS

Srivastava [11] proposed a class of power transformation ratio estimators of the population mean $\bar{Y}$, with known population mean $\bar{X}$, as

$$t_s = \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right)^{\alpha}, \tag{9}$$
where $\alpha$ is a real constant.

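Define

$$\bar{y} = \bar{Y}(1+e_0),\quad \bar{x} = \bar{X}(1+e_1),\quad s_{xy} = S_{xy}(1+e_2),\quad s_x^2 = S_x^2(1+e_3),$$
$$E(e_i) = 0,\quad i = 0, 1, 2, 3.$$

Expanding $t_s = \bar{y}(\bar{X}/\bar{x})^{\alpha}$ in a power series, assuming $|e_i| < 1$ for all possible samples, $i = 0, 1, 2, 3$, and retaining terms up to degree four, we have

$$
\begin{aligned}
t_s &= \bar{Y}(1+e_0)(1+e_1)^{-\alpha} \\
&= \bar{Y}\Big(1 - \alpha e_1 + \frac{\alpha(\alpha+1)}{2}e_1^2 - \frac{\alpha(\alpha+1)(\alpha+2)}{6}e_1^3 + \frac{\alpha(\alpha+1)(\alpha+2)(\alpha+3)}{24}e_1^4 \\
&\qquad + e_0 - \alpha e_1e_0 + \frac{\alpha(\alpha+1)}{2}e_1^2e_0 - \frac{\alpha(\alpha+1)(\alpha+2)}{6}e_1^3e_0 + \cdots\Big) \\
&= \bar{Y}\left(1 - \lambda_1e_1 + \lambda_2e_1^2 - \lambda_3e_1^3 + \lambda_4e_1^4 + e_0 - \lambda_1e_1e_0 + \lambda_2e_1^2e_0 - \lambda_3e_1^3e_0 + \cdots\right),
\end{aligned}
\tag{10}
$$
where $\lambda_1 = \alpha$, $\lambda_2 = \dfrac{\alpha(\alpha+1)}{2}$, $\lambda_3 = \dfrac{\alpha(\alpha+1)(\alpha+2)}{6}$, and $\lambda_4 = \dfrac{\alpha(\alpha+1)(\alpha+2)(\alpha+3)}{24}$.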
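The coefficients $\lambda_1, \ldots, \lambda_4$ and the accuracy of the truncated expansion (10) can be checked numerically. The sketch below uses hypothetical helper names (`lambdas`, `series_ts`, `exact_ts`) and arbitrary small values of $e_0$ and $e_1$ chosen only for illustration; it compares the truncated series for $t_s/\bar{Y}$ with the exact value $(1+e_0)(1+e_1)^{-\alpha}$.

```python
from fractions import Fraction


def lambdas(alpha):
    """Return (lambda_1, lambda_2, lambda_3, lambda_4) as exact fractions."""
    a = Fraction(alpha)
    l1 = a
    l2 = a * (a + 1) / 2
    l3 = a * (a + 1) * (a + 2) / 6
    l4 = a * (a + 1) * (a + 2) * (a + 3) / 24
    return l1, l2, l3, l4


def series_ts(e0, e1, alpha):
    """t_s / Ybar from the expansion (10), truncated as displayed there."""
    l1, l2, l3, l4 = (float(v) for v in lambdas(alpha))
    return (1 - l1 * e1 + l2 * e1 ** 2 - l3 * e1 ** 3 + l4 * e1 ** 4
            + e0 - l1 * e1 * e0 + l2 * e1 ** 2 * e0 - l3 * e1 ** 3 * e0)


def exact_ts(e0, e1, alpha):
    """Exact value of t_s / Ybar = (1 + e0) * (1 + e1) ** (-alpha)."""
    return (1 + e0) * (1 + e1) ** (-float(alpha))


alpha = Fraction(1, 2)
print(lambdas(alpha))
print(series_ts(0.05, -0.1, alpha), exact_ts(0.05, -0.1, alpha))
```

For $\alpha = 1$ every $\lambda_i$ equals 1, which recovers the expansion of the usual ratio estimator.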

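After some lengthy derivations using the traditional techniques adopted for the asymptotic expansion of the ratio estimator (see Sukhatme et al. [12]), we have, to $O(1/n^2)$,

$$E(t_s) = \bar{Y} + \bar{Y}\left[\theta\left(\lambda_2C_{20} - \lambda_1C_{11}\right) + \theta^2\left(-\lambda_3C_{30} + 3\lambda_4C_{20}^2 + \lambda_2C_{21} - 3\lambda_3C_{20}C_{11}\right)\right]. \tag{11}$$

Now,

$$
\begin{aligned}
V(t_s) ={}& \bar{Y}^2\theta\left(\lambda_1^2C_{20} - 2\lambda_1C_{11} + C_{02}\right) \\
&+ \bar{Y}^2\theta^2\left[C_{20}^2\left(2\lambda_2^2 + 6\lambda_1\lambda_3\right) + C_{11}^2\left(\lambda_1^2 + 4\lambda_2\right) + C_{20}C_{02}\left(\lambda_1^2 + 2\lambda_2\right) + C_{30}\left(-2\lambda_1\lambda_2\right)\right] \\
&+ \bar{Y}^2\theta^2\left[C_{21}\left(2\lambda_1^2 + 2\lambda_2\right) + C_{20}C_{11}\left(-6\lambda_3 - 10\lambda_1\lambda_2\right) - 2\lambda_1C_{12}\right].
\end{aligned}
\tag{12}
$$

When $\lambda_i = 1$ for $i = 1, 2, 3, 4$,

$$E(\hat{\bar{Y}}_r) = \bar{Y} + \bar{Y}\left[\theta\left(C_{20} - C_{11}\right) + \theta^2\left(-C_{30} + 3C_{20}^2 + C_{21} - 3C_{20}C_{11}\right)\right], \tag{13}$$
$$V(\hat{\bar{Y}}_r) = \bar{Y}^2\theta\left(C_{20} - 2C_{11} + C_{02}\right) + \bar{Y}^2\theta^2\left(8C_{20}^2 + 5C_{11}^2 + 3C_{20}C_{02} - 2C_{30} + 4C_{21} - 16C_{20}C_{11} - 2C_{12}\right). \tag{14}$$

The expressions for $E(\hat{\bar{Y}}_r)$ and $V(\hat{\bar{Y}}_r)$ are the same as those derived by Tin [14] and De-Graft Johnson [6].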
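The reduction of the general second-order terms in (11) and (12) to the ratio-estimator case (13) and (14) can be checked mechanically. The sketch below collects the coefficients of the $\theta^2$ terms in dictionaries keyed by the moment products; the function name `second_order_terms` and the key labels are hypothetical conventions adopted only for this illustration.

```python
from fractions import Fraction


def second_order_terms(l1, l2, l3, l4):
    """Coefficients of the theta^2 terms in (11) and (12), keyed by moment product."""
    bias = {
        "C30": -l3,
        "C20^2": 3 * l4,
        "C21": l2,
        "C20*C11": -3 * l3,
    }
    var = {
        "C20^2": 2 * l2 ** 2 + 6 * l1 * l3,
        "C11^2": l1 ** 2 + 4 * l2,
        "C20*C02": l1 ** 2 + 2 * l2,
        "C30": -2 * l1 * l2,
        "C21": 2 * l1 ** 2 + 2 * l2,
        "C20*C11": -(6 * l3 + 10 * l1 * l2),
        "C12": -2 * l1,
    }
    return bias, var


# Setting every lambda to 1 should reproduce the coefficients in (13) and (14).
one = Fraction(1)
bias_ratio, var_ratio = second_order_terms(one, one, one, one)
print(bias_ratio)
print(var_ratio)
```

Passing exact fractions for the $\lambda_i$ keeps the coefficients free of rounding error, which makes term-by-term comparison with the printed formulas straightforward.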

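Following Beale [1], we write an almost unbiased ratio estimator of $\bar{Y}$ using $t_s$, given by

$$t_{sB} = \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right)^{\alpha}\frac{1 + \lambda_1\theta\, s_{xy}/(\bar{x}\bar{y})}{1 + \lambda_2\theta\, s_x^2/\bar{x}^2}. \tag{15}$$

Now,

$$
\begin{aligned}
1 + \lambda_1\theta\frac{s_{xy}}{\bar{x}\bar{y}} &= 1 + \lambda_1\theta C_{11}(1+e_0)^{-1}(1+e_1)^{-1}(1+e_2) \\
&= 1 + \lambda_1\theta C_{11}\left(1 - e_0 + e_0^2 - e_0^3 + e_0^4 - \cdots\right)\left(1 - e_1 + e_1^2 - e_1^3 + e_1^4 - \cdots\right)(1+e_2) = B, \text{ say}.
\end{aligned}
$$

Further,

$$
\begin{aligned}
\left(1 + \lambda_2\theta\frac{s_x^2}{\bar{x}^2}\right)^{-1} &= \left[1 + \lambda_2\theta C_{20}(1+e_3)(1+e_1)^{-2}\right]^{-1} \\
&= 1 - \lambda_2\theta C_{20}\left(1 - 2e_1 + 3e_1^2 - 4e_1^3 + e_3 - 2e_1e_3\right) + \lambda_2^2\theta^2C_{20}^2 = A, \text{ say}.
\end{aligned}
$$

Thus we write

$$t_{sB} = \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right)^{\alpha}(BA).$$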

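After some mathematical simplifications, we write the expressions for the expected value and variance of $t_{sB}$ to $O(1/n^2)$ as

$$E(t_{sB}) = \bar{Y} + \bar{Y}\theta^2\left[C_{30}\left(3\lambda_2 - 1\right) + C_{20}^2\left(3 - 6\lambda_2 + \lambda_2^2\right) + C_{20}C_{11}\left(3\lambda_2 - \lambda_1\lambda_2\right) + C_{21}\left(1 - 2\lambda_1 - \lambda_2\right)\right], \tag{16}$$

$$
\begin{aligned}
V(t_{sB}) ={}& \bar{Y}^2\theta\left(\lambda_1^2C_{20} - 2\lambda_1C_{11} + C_{02}\right) + \bar{Y}^2\theta^2\left[C_{11}^2\left(4\lambda_2 - 2\lambda_1 - \lambda_1^2\right) + \lambda_1^2C_{20}C_{02}\right] \\
&+ \bar{Y}^2\theta^2\left[C_{20}^2\left(6\lambda_4 + 6\lambda_1\lambda_3 + 6\lambda_2 - 8\lambda_1\lambda_2 - 2\lambda_1^2\lambda_2 - 6\right) + C_{20}C_{11}\left(2\lambda_1^3 + 4\lambda_1^2\lambda_2 + 2\lambda_1 - 2\lambda_2 + 2\lambda_1\lambda_2 - 12\lambda_3\right)\right].
\end{aligned}
\tag{17}
$$

Substituting $\lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = 1$,

$$E(\hat{\bar{Y}}_{rB}) = \bar{Y} + \bar{Y}\theta^2\left(2C_{30} - 2C_{20}^2 + 2C_{20}C_{11} - 2C_{21}\right), \tag{18}$$
$$V(\hat{\bar{Y}}_{rB}) = \bar{Y}^2\left[\theta\left(C_{20} - 2C_{11} + C_{02}\right) + \theta^2\left(C_{11}^2 + C_{20}C_{02} + 2C_{20}^2 - 4C_{20}C_{11}\right)\right]. \tag{19}$$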

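Further, following Tin [14], we have another almost unbiased ratio-type estimator given by

$$
\begin{aligned}
t_{sT} &= \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right)^{\alpha}\left[1 + \lambda_1\theta\frac{s_{xy}}{\bar{x}\bar{y}} - \lambda_2\theta\frac{s_x^2}{\bar{x}^2}\right] \\
&= \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right)^{\alpha}\left[1 + \lambda_1\theta C_{11}(1+e_0)^{-1}(1+e_1)^{-1}(1+e_2) - \lambda_2\theta C_{20}(1+e_3)(1+e_1)^{-2}\right] \\
&= \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right)^{\alpha}\Big[1 + \lambda_1\theta C_{11}\left(1 - e_1 + e_1^2 - e_0 + e_1e_0 + e_0^2 + e_2 - e_1e_2 - e_0e_2\right) \\
&\qquad\qquad - \lambda_2\theta C_{20}\left(1 - 2e_1 + 3e_1^2 - 4e_1^3 + e_3 - 2e_1e_3\right)\Big].
\end{aligned}
\tag{20}
$$

Retaining terms up to $O(1/n^2)$,

$$
\begin{aligned}
E(t_{sT}) ={}& \bar{Y} + \bar{Y}\theta^2\left[C_{30}\left(-\lambda_3 + 2\lambda_2 + \lambda_1\lambda_2\right) + C_{20}^2\left(3\lambda_4 - 4\lambda_2 - 2\lambda_1\lambda_2\right)\right] \\
&+ \bar{Y}\theta^2\left[C_{20}C_{11}\left(-\lambda_3 + \lambda_1\lambda_2 + 2\lambda_2 + \lambda_1\right) + C_{21}\left(-\lambda_1^2 - \lambda_1\right)\right],
\end{aligned}
\tag{21}
$$

$$
\begin{aligned}
V(t_{sT}) ={}& \bar{Y}^2\theta\left(\lambda_1^2C_{20} - 2\lambda_1C_{11} + C_{02}\right) + \bar{Y}^2\theta^2\left[C_{11}^2\left(4\lambda_2 - 2\lambda_1 - \lambda_1^2\right) + \lambda_1^2C_{20}C_{02}\right] \\
&+ \bar{Y}^2\theta^2\left[C_{20}^2\left(6\lambda_1\lambda_3 + 2\lambda_2 - 4\lambda_1\lambda_2 - 2\lambda_1^2\lambda_2\right) + C_{20}C_{11}\left(2\lambda_1^3 + 4\lambda_1^2 + 4\lambda_2 - 4\lambda_1\lambda_2 - 10\lambda_3\right)\right].
\end{aligned}
\tag{22}
$$

When $\lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = 1$,

$$E(\hat{\bar{Y}}_{rT}) = \bar{Y} + \bar{Y}\theta^2\left(2C_{30} - 3C_{20}^2 + 3C_{20}C_{11} - 2C_{21}\right), \tag{23}$$
$$V(\hat{\bar{Y}}_{rT}) = V(\hat{\bar{Y}}_{rB}). \tag{24}$$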

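Note: The expressions for $E(\hat{\bar{Y}}_{rB})$, $E(\hat{\bar{Y}}_{rT})$, $V(\hat{\bar{Y}}_{rB})$, and $V(\hat{\bar{Y}}_{rT})$ are derived by Tin [14] using bivariate cumulants and by De-Graft Johnson [6] using bivariate moments. Some of the higher-order bivariate moments mentioned by De-Graft Johnson [6], neglecting the finite population correction factor for a large finite population, are given below:

$$E(e_1^2e_0^2) = \frac{1}{n^2}\left(2C_{11}^2 + C_{20}C_{02}\right),\quad E(e_1^4) = \frac{3C_{20}^2}{n^2},\quad E(e_1^3e_0) = \frac{1}{n^2}\,3C_{20}C_{11},$$
$$E(e_1e_2) = \frac{1}{n}\frac{C_{21}}{C_{11}},\quad E(e_1e_3) = \frac{1}{n}\frac{C_{30}}{C_{20}},\quad E(e_0e_2) = \frac{1}{n}\frac{C_{12}}{C_{11}}.$$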

2.1. Comparison of Bias and Variance of the Class of Beale Type and Tin Type Almost Unbiased Ratio-Type Estimators

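Consider the situation when $y$ and $x$ follow a bivariate symmetric distribution, with odd-order moments being zero.

From (16) we have

$$E(t_{sB}) = \bar{Y} + \bar{Y}\theta^2\left[\left(3 - 6\lambda_2 + \lambda_2^2\right)C_{20}^2 + \left(3\lambda_2 - \lambda_1\lambda_2\right)C_{20}C_{11}\right] = \bar{Y} + \bar{Y}\theta^2\left[B_1C_{20}^2 + B_2C_{20}C_{11}\right],$$

where $B_1 = 3 - 6\lambda_2 + \lambda_2^2$ and $B_2 = 3\lambda_2 - \lambda_1\lambda_2$, and from (21)

$$E(t_{sT}) = \bar{Y} + \bar{Y}\theta^2\left[\left(3\lambda_4 - 4\lambda_2 - 2\lambda_1\lambda_2\right)C_{20}^2 + \left(-\lambda_3 + \lambda_1\lambda_2 + 2\lambda_2 + \lambda_1\right)C_{20}C_{11}\right] = \bar{Y} + \bar{Y}\theta^2\left[T_1C_{20}^2 + T_2C_{20}C_{11}\right],$$

where $T_1 = 3\lambda_4 - 4\lambda_2 - 2\lambda_1\lambda_2$ and $T_2 = -\lambda_3 + \lambda_1\lambda_2 + 2\lambda_2 + \lambda_1$.

Since $C_{11}/C_{20} = \beta/R$, $t_{sB}$ will be less biased than $t_{sT}$ if

$$\left|B_1 + B_2\frac{\beta}{R}\right| < \left|T_1 + T_2\frac{\beta}{R}\right|. \tag{25}$$

Further,

$$V(t_{sB}) < V(t_{sT})$$
if

$$\frac{\beta}{R} > \frac{6\lambda_4 + 4\lambda_2 - 4\lambda_1\lambda_2 - 6}{6\lambda_2 + 2\lambda_3 - 6\lambda_1\lambda_2 - 2\lambda_1 + 4\lambda_1^2 - 4\lambda_1^2\lambda_2}, \tag{26}$$
that is, since $\beta/R = \rho C_y/C_x$, if $\rho > \dfrac{C_x}{C_y}\cdot\dfrac{6\lambda_4 + 4\lambda_2 - 4\lambda_1\lambda_2 - 6}{6\lambda_2 + 2\lambda_3 - 6\lambda_1\lambda_2 - 2\lambda_1 + 4\lambda_1^2 - 4\lambda_1^2\lambda_2}$, provided the denominator is positive.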
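The condition (26) can be evaluated for any given $\alpha$. The sketch below computes the numerator and denominator of the right-hand side of (26) exactly and then tests the inequality in cross-multiplied form, which coincides with (26) when the denominator is positive. The helper names (`lambdas`, `condition_terms`, `beale_more_efficient`) are hypothetical, and the cross-multiplication is a reformulation adopted here, as an assumption, so that a zero denominator does not cause a division error.

```python
from fractions import Fraction


def lambdas(alpha):
    """Return (lambda_1, ..., lambda_4) for a given alpha as exact fractions."""
    a = Fraction(alpha)
    return (a,
            a * (a + 1) / 2,
            a * (a + 1) * (a + 2) / 6,
            a * (a + 1) * (a + 2) * (a + 3) / 24)


def condition_terms(alpha):
    """Numerator and denominator of the right-hand side of (26)."""
    l1, l2, l3, l4 = lambdas(alpha)
    num = 6 * l4 + 4 * l2 - 4 * l1 * l2 - 6
    den = 6 * l2 + 2 * l3 - 6 * l1 * l2 - 2 * l1 + 4 * l1 ** 2 - 4 * l1 ** 2 * l2
    return num, den


def beale_more_efficient(beta_over_r, alpha):
    """Check den * (beta/R) > num, the cross-multiplied form of (26)."""
    num, den = condition_terms(alpha)
    return den * Fraction(beta_over_r) > num


print(condition_terms(Fraction(1, 2)))
print(beale_more_efficient(Fraction(1, 2), Fraction(1, 2)))
```

For $\alpha = 1$ both the numerator and the denominator vanish, so neither estimator is strictly more efficient to this order, in agreement with (24).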

2.2. A Special Case of Class of Almost Unbiased Ratio-Type Estimators

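Consider the special case of $t_s$ with $\alpha = 1/2$, studied by Swain [13], of the form

$$t_{sqr} = \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right)^{1/2}. \tag{27}$$

This estimator is termed the square root transformation estimator by Swain [13].

Substituting $\lambda_1 = \frac{1}{2}$ and $\lambda_2 = \frac{3}{8}$ in the expressions (15) and (20), Beale's almost unbiased ratio estimator is given by

$$t_{sqrB} = \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right)^{1/2}\frac{1 + \frac{1}{2}\theta\, s_{xy}/(\bar{x}\bar{y})}{1 + \frac{3}{8}\theta\, s_x^2/\bar{x}^2}, \tag{28}$$
and Tin's almost unbiased ratio estimator is written as
$$t_{sqrT} = \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right)^{1/2}\left[1 + \frac{1}{2}\theta\frac{s_{xy}}{\bar{x}\bar{y}} - \frac{3}{8}\theta\frac{s_x^2}{\bar{x}^2}\right]. \tag{29}$$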
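A minimal computational sketch of the estimators (27), (28), and (29), written for general $\alpha$ so that the square root case is obtained with the default `alpha=0.5`. The function name `power_ratio_estimators` and the toy data in the usage lines are hypothetical and are given only as an illustration.

```python
from statistics import mean, variance


def power_ratio_estimators(ys, xs, x_bar_pop, N, alpha=0.5):
    """Return (t_s, t_sB, t_sT) for the power transformation estimator with exponent alpha."""
    n = len(ys)
    theta = 1 / n - 1 / N
    y_bar, x_bar = mean(ys), mean(xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    c_xy = s_xy / (x_bar * y_bar)
    c_x2 = variance(xs) / x_bar ** 2
    l1 = alpha                       # lambda_1
    l2 = alpha * (alpha + 1) / 2     # lambda_2
    t_s = y_bar * (x_bar_pop / x_bar) ** alpha
    t_beale = t_s * (1 + l1 * theta * c_xy) / (1 + l2 * theta * c_x2)
    t_tin = t_s * (1 + l1 * theta * c_xy - l2 * theta * c_x2)
    return t_s, t_beale, t_tin


# Hypothetical sample of size 4 from a population of size 100 with known mean of x.
print(power_ratio_estimators([2, 4, 6, 8], [1, 2, 3, 4], x_bar_pop=2.5, N=100))
```

Only $\lambda_1$ and $\lambda_2$ enter the estimators themselves; $\lambda_3$ and $\lambda_4$ appear only in the expressions for the bias and variance.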

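Substituting $\lambda_1 = \frac{1}{2}$, $\lambda_2 = \frac{3}{8}$, $\lambda_3 = \frac{5}{16}$, and $\lambda_4 = \frac{35}{128}$ in the expressions (11), (16), (21), (12), (17), and (22) respectively, we have

$$E(t_{sqr}) = \bar{Y} + \bar{Y}\left[\theta\left(\frac{3}{8}C_{20} - \frac{1}{2}C_{11}\right) + \theta^2\left(-\frac{5}{16}C_{30} + \frac{105}{128}C_{20}^2 + \frac{3}{8}C_{21} - \frac{15}{16}C_{20}C_{11}\right)\right], \tag{30}$$
$$E(t_{sqrB}) = \bar{Y} + \bar{Y}\theta^2\left[\frac{1}{8}C_{30} - \frac{3}{8}C_{21} + \frac{57}{64}C_{20}^2 + \frac{15}{16}C_{20}C_{11}\right], \tag{31}$$
$$E(t_{sqrT}) = \bar{Y} + \bar{Y}\theta^2\left[\frac{5}{8}C_{30} - \frac{3}{4}C_{21} - \frac{135}{128}C_{20}^2 + \frac{9}{8}C_{20}C_{11}\right], \tag{32}$$
$$V(t_{sqr}) = \bar{Y}^2\theta\left(\frac{1}{4}C_{20} - C_{11} + C_{02}\right) + \bar{Y}^2\theta^2\left[\frac{39}{32}C_{20}^2 + \frac{7}{4}C_{11}^2 + C_{20}C_{02} - \frac{3}{8}C_{30} - \frac{15}{4}C_{20}C_{11} + \frac{5}{4}C_{21} - C_{12}\right], \tag{33}$$
$$V(t_{sqrB}) = \bar{Y}^2\theta\left(\frac{1}{4}C_{20} - C_{11} + C_{02}\right) + \bar{Y}^2\theta^2\left[-\frac{183}{64}C_{20}^2 + \frac{1}{4}C_{11}^2 + \frac{1}{4}C_{20}C_{02} - \frac{5}{2}C_{20}C_{11}\right], \tag{34}$$
$$V(t_{sqrT}) = \bar{Y}^2\theta\left(\frac{1}{4}C_{20} - C_{11} + C_{02}\right) + \bar{Y}^2\theta^2\left[\frac{3}{4}C_{20}^2 + \frac{1}{4}C_{11}^2 + \frac{1}{4}C_{20}C_{02} - \frac{9}{8}C_{20}C_{11}\right]. \tag{35}$$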

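Comparison of biases and variances of the almost unbiased versions of $t_{sqr} = \bar{y}\left(\bar{X}/\bar{x}\right)^{1/2}$

Under the symmetry assumption of Section 2.1, $t_{sqrB}$ will be less biased than $t_{sqrT}$ to $O(1/n^2)$ if

$$\left|\frac{57}{64} + \frac{15}{16}\frac{\beta}{R}\right| < \left|-\frac{135}{128} + \frac{9}{8}\frac{\beta}{R}\right|. \tag{36}$$

$t_{sqrB}$ will be more efficient than $t_{sqrT}$ to $O(1/n^2)$ if

$$\frac{231}{64} + \frac{11}{8}\frac{\beta}{R} > 0, \tag{37}$$
which is always true since both $\beta$ and $R$ are positive.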

3. NUMERICAL ILLUSTRATION

To compare the Beale-type and Tin-type square root transformation estimators, we have considered four natural populations (populations 1, 2, 3, and 5), having size N = 3164, whose parameters in terms of product moments are given in De-Graft Johnson [6]. The biases and variances to $O(1/n^2)$, ignoring the finite population correction factor, are given in Table 1 for different sample sizes.

| Population | θ | B($t_{sqr}$) | B($t_{sqrB}$) | B($t_{sqrT}$) | V($t_{sqr}$) | V($t_{sqrB}$) | V($t_{sqrT}$) |
|---|---|---|---|---|---|---|---|
| POP-1 | 1/10 | −0.02490112 | 0.00045152 | −0.00179427 | 0.15693527 | 0.14491935 | 0.14988051 |
| POP-1 | 1/20 | −0.01255322 | 0.00011288 | −0.00044857 | 0.07643719 | 0.07343321 | 0.07467350 |
| POP-1 | 1/50 | −0.00504592 | 0.00001806 | −0.00007177 | 0.03008757 | 0.02960693 | 0.02980538 |
| POP-1 | 1/100 | −0.00252707 | 0.00000452 | −0.00001794 | 0.01496257 | 0.01484241 | 0.01489202 |
| POP-2 | 1/10 | −0.01953039 | 0.00154964 | −0.00048017 | 0.16621535 | 0.15482812 | 0.16030793 |
| POP-2 | 1/20 | −0.00972207 | 0.00038741 | −0.00012004 | 0.08135202 | 0.07850522 | 0.07987517 |
| POP-2 | 1/50 | −0.00387848 | 0.00006199 | −0.00001921 | 0.03211945 | 0.03166396 | 0.03188316 |
| POP-2 | 1/100 | −0.00193751 | 0.00001550 | −0.00000480 | 0.01598950 | 0.01587563 | 0.01593043 |
| POP-3 | 1/10 | −0.03381434 | 0.01896277 | −0.00493000 | 0.10236741 | 0.01425384 | 0.08892160 |
| POP-3 | 1/20 | −0.01682702 | 0.00474069 | −0.00123250 | 0.04729623 | 0.02526784 | 0.04393478 |
| POP-3 | 1/50 | −0.00671157 | 0.00075851 | −0.00019720 | 0.01798550 | 0.01446095 | 0.01744766 |
| POP-3 | 1/100 | −0.00335258 | 0.00018963 | −0.00004930 | 0.00883725 | 0.00795611 | 0.00870279 |
| POP-5 | 1/10 | −0.01751400 | 0.00049767 | −0.00041688 | 0.17321951 | 0.16626384 | 0.16834761 |
| POP-5 | 1/20 | −0.00874797 | 0.00012442 | −0.00010422 | 0.08530856 | 0.08356965 | 0.08401559 |
| POP-5 | 1/50 | −0.00349702 | 0.00001991 | −0.00001668 | 0.03381114 | 0.03353291 | 0.03360626 |
| POP-5 | 1/100 | −0.00174815 | 0.00000498 | −0.00000417 | 0.01685352 | 0.01678397 | 0.01680480 |

Table 1. Comparison of biases and variances of the Beale-type and Tin-type estimators for the square root transformation estimator, omitting constant multipliers.
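The orders of the biases reported in Table 1 can be checked from the tabulated values themselves. Since the finite population correction is ignored, $\theta = 1/n$; a bias of $O(\theta)$ should then be roughly constant after division by $\theta$, and a bias of $O(\theta^2)$ roughly constant after division by $\theta^2$. The sketch below does this for population 1, with the values copied from Table 1; the helper name `rescale` is hypothetical.

```python
# Biases for population 1 copied from Table 1, keyed by 1/theta.
bias_tsqr = {10: -0.02490112, 20: -0.01255322, 50: -0.00504592, 100: -0.00252707}
bias_tsqr_beale = {10: 0.00045152, 20: 0.00011288, 50: 0.00001806, 100: 0.00000452}


def rescale(biases, power):
    """Divide each bias by theta**power, where theta = 1 / key."""
    return {k: b * k ** power for k, b in biases.items()}


first_order = rescale(bias_tsqr, 1)          # roughly constant if the bias is O(theta)
second_order = rescale(bias_tsqr_beale, 2)   # roughly constant if the bias is O(theta^2)

print(first_order)
print(second_order)
```

The rescaled values vary only slightly across the four rows, which is consistent with the uncorrected estimator having a leading bias term of order $\theta$ and the Beale-type estimator a leading term of order $\theta^2$.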

Comments:

For all populations, Beale's estimator is more efficient than Tin's estimator for all sample sizes under consideration. For population 1 Beale's estimator is less biased than Tin's estimator, but for populations 2, 3, and 5 Tin's estimator is marginally less biased than Beale's estimator.

Thus, Beale's estimator appears to be a preferred estimator over Tin's estimator for the square root transformation estimator with regard to bias and efficiency.

4. CONCLUSIONS

Almost unbiased Tin-type and Beale-type estimators for Srivastava's [11] class of estimators $t_s$ are derived and compared with regard to bias and efficiency. As a special case, Beale's and Tin's almost unbiased estimators for Swain's [13] square root transformation estimator are formulated and compared. It is seen that Beale's estimator is conditionally less biased than Tin's estimator but, interestingly, is more efficient than Tin's estimator to $O(1/n^2)$. Numerical illustrations show that Beale's estimator has better performance with regard to bias and efficiency.

CONFLICT OF INTEREST

There is no conflict of interest involved, and the research was carried out through the authors' own contribution without any outside funding.

REFERENCES

1. E.M.L. Beale, Ind. Organ., Vol. 31, 1962, pp. 27-28.
2. A.L. Carriquiry, W.A. Fuller, J.J. Goneyeche, and K.W. Dodd, Estimation of the Usual Intake of Distributions of Ratios of Dietary Supplements, Department of Statistics, Iowa State University, Iowa, USA, 1995. Research Report for Agricultural Research Service, Department of Agriculture.
4. W.G. Cochran, Sampling Techniques, John Wiley and Sons, New York, USA, 1977.
5. I.P. David, Contribution to Ratio Method of Estimation, Iowa State University, Ames, IA, USA, 1971. Ph.D. Dissertation.
6. K.T. De-Graft Johnson, Some Contributions to the Theory of Two Phase Sampling, a Dissertation Submitted for the Degree of Doctor of Philosophy, Iowa State University, Ames, IA, USA, 1969.
7. L.A. Goodman and H.O. Hartley, J. Am. Stat. Assoc., Vol. 53, 1958, pp. 491-508.
10. R.P. Richards, Estimation of Pollutant Loads in Rivers and Streams: A Guidance Document for NPS Programs, Water Quality Laboratory, Heidelberg College, Project Report Prepared under Grant from US Environment Protection Agency, OH, USA, 1998.
12. P.V. Sukhatme, B.V. Sukhatme, and C. Asok, Sampling Theory in Surveys with Applications, Iowa State University Press, Ames, IA, USA, 1984.
13. A.K.P.C. Swain, Revista Investigacion Operacional, Vol. 35, 2014, pp. 49-57.
14. M. Tin, J. Am. Stat. Assoc., Vol. 60, 1965, pp. 294-307.
ISSN (Online)
2214-1766
ISSN (Print)
1538-7887