Parameter Estimation and Application of Generalized Inflated Geometric Distribution

Avishek Mallick; Ram Joshi

doi:10.2991/jsta.2018.17.3.7

Volume 17, Issue 3, September 2018, Pages 491 - 519

Authors

Avishek Mallick*, mallicka@marshall.edu

Department of Mathematics, Marshall University, Huntington, WV, USA

Ram Joshi, ram.d.joshi@ttu.edu

Department of Mathematics & Statistics, Texas Tech University, Lubbock, Texas, USA

*Corresponding author.

Received 12 January 2017, Accepted 2 August 2017, Available Online 30 September 2018.

DOI: 10.2991/jsta.2018.17.3.7
Keywords: Standardized bias; Standardized mean squared error; Akaike’s Information Criterion (AIC); Bayesian Information Criterion (BIC); Monte Carlo study
Abstract: Count data with an excess of zeros, ones, twos or threes are commonplace in experimental studies. These inflated frequencies at particular counts may lead to overdispersion and thus cause difficulty in data analysis. To obtain appropriate results and to overcome possible anomalies in parameter estimation, we may need to consider a suitable inflated distribution. Generally, the Inflated Poisson and Inflated Negative Binomial distributions are the most commonly used for modeling and analyzing such data. The Geometric distribution is a special case of the Negative Binomial distribution. This work deals with parameter estimation for a Geometric distribution inflated at certain counts, which we call the Generalized Inflated Geometric (GIG) distribution. Parameter estimation is done using the method of moments, an empirical probability generating function based method and the maximum likelihood estimation approach. The three types of estimators are then compared using simulation studies, and finally a Swedish fertility dataset is modeled using a GIG distribution.
Copyright: © 2018, the Authors. Published by Atlantis Press.
Open Access: This is an open access article under the CC BY-NC license (http://creativecommons.org/licences/by-nc/4.0/).

1. Introduction

A random variable X that counts the number of trials needed to obtain the r^th success in a series of independent and identical Bernoulli trials is said to have a Negative Binomial distribution, whose probability mass function (pmf) is given by

$$P(X=k)=P(k\mid p)=\binom{k-1}{r-1}p^{r}(1-p)^{k-r} \tag{1.1}$$

where r = 1, 2, 3,…, k = r, r + 1,… and p > 0.

The above distribution is also a “Generalized Power Series distribution”, as mentioned in Johnson et al. (2005) [9]. Some writers, for instance Patil et al. (1984) [12], call this the “Pólya-Eggenberger distribution”, as it arises as a limiting form of Eggenberger and Pólya’s (1923) [5] urn model distribution. A special case of the Negative Binomial distribution is the Geometric distribution, which can be defined in two different ways.

Firstly, the probability distribution of a Geometric random variable X (where X is the number of independent and identical trials needed to get the first success) is given by

$$P(X=k\mid p)=\begin{cases}p(1-p)^{k-1} & \text{if } k=1,2,\ldots\\ 0 & \text{otherwise.}\end{cases} \tag{1.2}$$

However, instead of counting the number of trials, if the random variable X counts the number of failures before the first success, then it will result in the second type of Geometric distribution which again is a special case of Negative Binomial distribution when r = 1 (first success) and its pmf is given by

$$P(X=k)=P(k\mid p)=\begin{cases}p(1-p)^{k} & \text{if } k=0,1,2,\ldots\\ 0 & \text{otherwise.}\end{cases} \tag{1.3}$$

The support set of this random variable is {0, 1, 2, …}, which makes it different from the distribution in (1.2). The above model in (1.3), henceforth referred to as “Geometric(p)”, has mean (1 − p)/p and variance (1 − p)/p², and is the only distribution with non-negative integer support that can be characterized by the “memoryless property” or the “Markovian property”. Many other characterizations of this distribution can be found in Feller (1968) [6] and Feller (1969) [7]. The distribution occurs in many applications, some of which are indicated in the references below:

• The famous problem of Banach’s match boxes (Feller (1968) [6]);
• The runs of one plant species with respect to another in transects through plant populations (Pielou (1962, 1963) [13] [14]);
• A ticket control problem (Jagers (1973) [8]);
• A surveillance system for congenital malformations (Chen (1978) [4]);
• The number of tosses of a fair coin before the first head (success) appears;
• The number of drills in an area before observing the first productive well by an oil prospector (Wackerly et al. (2008) [15]).

The Geometric model in (1.3), which is widely used for modeling count data, may be inadequate for dealing with overdispersed count data. One such instance is an abundance of zero counts in the data; (1.3) may be an inefficient model in such cases because of heterogeneity in the data, which usually results in undesired overdispersion. To explain or capture such heterogeneity, we consider a ‘two-mass distribution’ that gives mass π₁ to the zero counts and mass (1 − π₁) to the other class, which follows Geometric(p). The resulting ‘mixture distribution’ is called the ‘Zero-Inflated Geometric’ (ZIG) distribution, with probability mass function

$$P(k\mid p,\pi_1)=\begin{cases}\pi_1+(1-\pi_1)p & \text{if } k=0\\ (1-\pi_1)P(k\mid p) & \text{if } k=1,2,\ldots\end{cases} \tag{1.4}$$

where p > 0 and P(k | p) is given in (1.3). The mixing parameter π₁ is chosen such that P(k = 0) ∈ (0, 1) in (1.4), i.e., it ranges over the interval −p/(1 − p) < π₁ < 1. This allows the distribution to be well defined for certain negative values of π₁, depending on p. Although the mixing interpretation is lost when π₁ < 0, these values have a natural interpretation in terms of zero-deflation relative to a Geometric(p) model. Correspondingly, π₁ > 0 can be regarded as zero inflation, as discussed in Johnson et al. (2005) [9].

A further generalization of (1.4) can be obtained by inflating/deflating the Geometric distribution at several specific values. To be precise, if the discrete random variable X is thought to have inflated probabilities at the values k₁, …, k_m ∈ {0, 1, 2, …}, then the following general probability mass function can be considered:

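$$P(k\mid p,\pi_i,1\le i\le m)=\begin{cases}\pi_i+\left(1-\sum_{l=1}^{m}\pi_l\right)P(k\mid p) & \text{if } k=k_i,\ i=1,\ldots,m\\ \left(1-\sum_{l=1}^{m}\pi_l\right)P(k\mid p) & \text{if } k\ne k_i,\ 1\le i\le m\end{cases} \tag{1.5}$$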

where k = 0,1,2,…; p > 0 and π_i’s are chosen in such a way that P(k_i) ∈ (0,1) for all i = 1, 2,..., m in (1.5). For the remainder of this work, we will refer to (1.5) as the Generalized Inflated Geometric (GIG) distribution which is the main focus of this work.
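The pmf in (1.5) can be sketched in a few lines of Python. This is only an illustration: the function names and the choice of a dict `{k_i: pi_i}` to hold the inflation points are assumptions made here, not notation from the paper.

```python
def geometric_pmf(k, p):
    """Geometric(p) pmf of (1.3): number of failures before the first success."""
    return p * (1.0 - p) ** k if k >= 0 else 0.0


def gig_pmf(k, p, inflation):
    """GIG pmf of (1.5).

    `inflation` is a dict {k_i: pi_i}; this container is a hypothetical
    choice made for the sketch.
    """
    rest = 1.0 - sum(inflation.values())          # 1 - sum of the pi_i
    return inflation.get(k, 0.0) + rest * geometric_pmf(k, p)


# Zero-inflated example: pi_1 = 0.3 at k_1 = 0, with p = 0.2.
zig = {0: 0.3}
print(gig_pmf(0, 0.2, zig))                            # mass at the inflated point
print(sum(gig_pmf(k, 0.2, zig) for k in range(2000)))  # should be close to 1
```

Negative values of pi_i are accepted by this sketch as long as every resulting probability stays in (0, 1), which is the condition stated for (1.5).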

We will consider some special cases of the GIG distribution, such as the Zero-One-Inflated Geometric (ZOIG) distribution, i.e., the case m = 2 with k₁ = 0 and k₂ = 1, and the Zero-One-Two Inflated Geometric (ZOTIG) model. Similar Generalized Inflated Poisson (GIP) models were considered by Melkersson and Rooth (2000) [11] to study fertility data on 1170 Swedish women in the age group 46–76 years (Table 1). This data set consists of the number of child(ren) per woman, for women who had crossed the childbearing age in the year 1991. They argued that the Zero-Two Inflated Poisson distribution was the best model for these data. More recently, however, Begum et al. (2014) [2] studied the same data set and found that a Zero-Two-Three Inflated Poisson (ZTTIP) distribution was a better fit.

Count	Frequency	Proportion
0	114	.097
1	205	.175
2	466	.398
3	242	.207
4	85	.073
5	35	.030
6	16	.014
7	4	.003
8	1	.001
10	1	.001
12	1	.001

Total	1,170	1.000

Table 1.

Observed number of children (= count) per woman.

Instead of using an Inflated Poisson model, we will consider fitting appropriate Inflated Geometric models to the data in Table 1. Whether the best fit is a GIG model with focus on counts (0, 1), i.e., ZOIG, or a GIG model focusing on some other set {k₁, k₂, …, k_m} will eventually be decided by different model selection criteria in section 5. The rest of the paper is organized as follows. In the next section, we provide some properties of the GIG distribution. In section 3, we discuss different techniques of parameter estimation, namely the method of moments (MME), empirical probability generating function (epgf) based methods (EPGE) suggested by Kemp and Kemp (1988) [10], and maximum likelihood estimation (MLE). In section 4, we compare the performances of MMEs, EPGEs and MLEs for different GIG model parameters using simulation studies. Section 5 fits GIG models to the Swedish fertility data. Finally, the conclusion is given in section 6.

2. Some Properties

In this section, we present some properties of the GIG distribution and that of some special cases of this distribution.

Proposition 1.

The probability generating function (pgf) of a random variable X following a GIG in (1.5) with parameters π₁,…, π_m and p is given by

$$G_X(s)=\sum_{i=1}^{m}s^{k_i}\pi_i+\left(1-\sum_{i=1}^{m}\pi_i\right)\frac{p}{1-(1-p)s},\quad \text{if } s<\frac{1}{1-p}. \tag{2.1}$$

Proof.

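$$\begin{aligned}
G_X(s)&=\sum_{k=0}^{\infty}s^{k}P(k\mid p,\pi_i,1\le i\le m)\\
&=\sum_{i=1}^{m}s^{k_i}\pi_i+\left(1-\sum_{i=1}^{m}\pi_i\right)\sum_{k=0}^{\infty}s^{k}P(k\mid p)\\
&=\sum_{i=1}^{m}s^{k_i}\pi_i+\left(1-\sum_{i=1}^{m}\pi_i\right)\sum_{k=0}^{\infty}p\{(1-p)s\}^{k}\\
&=\sum_{i=1}^{m}s^{k_i}\pi_i+\left(1-\sum_{i=1}^{m}\pi_i\right)\frac{p}{1-(1-p)s}.
\end{aligned}$$

The infinite sum converges if (1 − p)s < 1, i.e., if s < 1/(1 − p).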

Proposition 2.

The r^th raw moment of a random variable X following a GIG in (1.5) with parameters π₁,…, π_m and p is given by

$$E[X^{r}]=\sum_{i=1}^{m}k_i^{r}\pi_i+\left(1-\sum_{i=1}^{m}\pi_i\right)\mu_r'(p) \tag{2.2}$$

where μ′_r(p) is the r^th raw moment of the Geometric(p) distribution, which can be calculated easily by differentiating its moment generating function (mgf) p/{1 − (1 − p)e^t}, i.e., μ′_r(p) = (d^r/dt^r)[p/{1 − (1 − p)e^t}] evaluated at t = 0.

Proof.

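$$\begin{aligned}
E[X^{r}]&=\sum_{k=0}^{\infty}k^{r}P(k\mid p,\pi_i,1\le i\le m)\\
&=\sum_{i=1}^{m}k_i^{r}\pi_i+\left(1-\sum_{i=1}^{m}\pi_i\right)\sum_{k=0}^{\infty}k^{r}P(k\mid p)\\
&=\sum_{i=1}^{m}k_i^{r}\pi_i+\left(1-\sum_{i=1}^{m}\pi_i\right)\mu_r'(p).
\end{aligned}$$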
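For reference, the first four raw moments of Geometric(p) that result from differentiating the mgf are written out below. They are the same expressions that multiply the non-inflated mass in the moment equations (3.4) and (3.5) of Section 3.1.

```latex
\begin{aligned}
\mu_1'(p) &= \frac{1-p}{p}, \\
\mu_2'(p) &= \frac{(1-p)(2-p)}{p^{2}} = \frac{p^{2}-3p+2}{p^{2}}, \\
\mu_3'(p) &= \frac{(1-p)(p^{2}-6p+6)}{p^{3}}, \\
\mu_4'(p) &= \frac{(1-p)(2-p)(p^{2}-12p+12)}{p^{4}}.
\end{aligned}
```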

For the ZIG distribution, i.e., when m = 1 and k₁ = 0, using (2.1) we get the pgf as

$$G_X(s)=\pi_1+(1-\pi_1)\frac{p}{1-(1-p)s}.$$

Using (2.2) the population mean and population variance in this special case can be obtained as

$$E(X)=\frac{(1-\pi_1)(1-p)}{p}\quad\text{and}\quad \mathrm{Var}(X)=\frac{(1-\pi_1)\{1+\pi_1(1-p)\}(1-p)}{p^{2}}. \tag{2.3}$$

Note that ∂E(X)/∂π₁ < 0 and ∂E(X)/∂p < 0, so the mean decreases in both parameters π₁ and p. Table 2 shows the variance-to-mean ratio Var(X)/E(X), which by (2.3) equals {1 + π₁(1 − p)}/p, for different values of the parameters π₁ and p. The ratio increases with π₁ and decreases with p. It is also always larger than 1, and thus the ZIG distribution is overdispersed.

π₁ \ p	0.1	0.2	0.3	0.4	0.5	0.6	0.7	0.8	0.9
0.1	10.90	5.40	3.57	2.65	2.10	1.73	1.47	1.28	1.12
0.2	11.80	5.80	3.80	2.80	2.20	1.80	1.51	1.30	1.13
0.3	12.70	6.20	4.03	2.95	2.30	1.87	1.56	1.33	1.14
0.4	13.60	6.60	4.27	3.10	2.40	1.93	1.60	1.35	1.16
0.5	14.50	7.00	4.50	3.25	2.50	2.00	1.64	1.38	1.17
0.6	15.40	7.40	4.73	3.40	2.60	2.07	1.69	1.40	1.18
0.7	16.30	7.80	4.97	3.55	2.70	2.13	1.73	1.43	1.19
0.8	17.20	8.20	5.20	3.70	2.80	2.20	1.77	1.45	1.20
0.9	18.10	8.60	5.43	3.85	2.90	2.27	1.81	1.48	1.21

Table 2.

Var(X) / E(X) for ZIG distribution
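The entries of Table 2 follow directly from (2.3). The short sketch below recomputes the grid; the function names are chosen here for illustration only.

```python
def zig_mean(pi1, p):
    """E(X) for the ZIG distribution, from (2.3)."""
    return (1.0 - pi1) * (1.0 - p) / p


def zig_var(pi1, p):
    """Var(X) for the ZIG distribution, from (2.3)."""
    return (1.0 - pi1) * (1.0 + pi1 * (1.0 - p)) * (1.0 - p) / p ** 2


def dispersion_ratio(pi1, p):
    """Variance-to-mean ratio; algebraically {1 + pi1 (1 - p)} / p."""
    return zig_var(pi1, p) / zig_mean(pi1, p)


grid = [round(0.1 * j, 1) for j in range(1, 10)]
for pi1 in grid:                       # rows: pi_1, columns: p
    print(pi1, " ".join(f"{dispersion_ratio(pi1, p):6.2f}" for p in grid))
```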

The cumulative distribution function (cdf) of the ZIG distribution is given by

$$F(k\mid p,\pi_1)=\begin{cases}0 & \text{if } k<0\\ \pi_1+(1-\pi_1)\{1-(1-p)^{k+1}\} & \text{if } k=0,1,2,\ldots\end{cases} \tag{2.4}$$

Thus, the survival function for the ZIG distribution is given by

$$S(k)=P(X>k)=\begin{cases}1 & \text{if } k<0\\ (1-\pi_1)(1-p)^{k+1} & \text{if } k=0,1,2,\ldots\end{cases} \tag{2.5}$$

The failure rate (hazard function) for ZIG, using (1.4) and (2.5), is given by

$$r(k)=P(X=k\mid X\ge k)=\frac{P(X=k)}{P(X>k-1)}=\frac{P(k)}{S(k-1)}=\begin{cases}\pi_1+(1-\pi_1)p & \text{if } k=0\\ p & \text{if } k=1,2,\ldots\end{cases}$$

which is constant in k, except at k = 0.
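The constancy of the ZIG hazard for k ≥ 1 can be checked numerically from (1.4) and (2.5). The helper names below are assumptions made for this sketch.

```python
def zig_pmf(k, p, pi1):
    """ZIG pmf of (1.4)."""
    geo = p * (1.0 - p) ** k
    return pi1 + (1.0 - pi1) * geo if k == 0 else (1.0 - pi1) * geo


def zig_survival(k, p, pi1):
    """S(k) = P(X > k) from (2.5)."""
    return 1.0 if k < 0 else (1.0 - pi1) * (1.0 - p) ** (k + 1)


def zig_hazard(k, p, pi1):
    """r(k) = P(k) / S(k - 1)."""
    return zig_pmf(k, p, pi1) / zig_survival(k - 1, p, pi1)


for k in range(5):
    print(k, zig_hazard(k, 0.2, 0.3))   # pi1 + (1 - pi1) p at k = 0, then p
```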

Using (1.5), the probability mass function of the ‘Zero-One-Inflated Geometric’ (ZOIG) distribution, i.e., when m = 2 and k₁ = 0 and k₂ = 1, can be obtained as

$$P(k\mid p,\pi_1,\pi_2)=\begin{cases}\pi_1+(1-\pi_1-\pi_2)p & \text{if } k=0\\ \pi_2+(1-\pi_1-\pi_2)p(1-p) & \text{if } k=1\\ (1-\pi_1-\pi_2)P(k\mid p) & \text{if } k=2,3,\ldots\end{cases} \tag{2.6}$$

From (2.6), the cdf of ZOIG can be easily obtained as

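$$F(k\mid p,\pi_1,\pi_2)=\begin{cases}0 & \text{if } k<0\\ \pi_1+(1-\pi_1-\pi_2)p & \text{if } k=0\\ \pi_1+\pi_2+(1-\pi_1-\pi_2)\{1-(1-p)^{k+1}\} & \text{if } k=1,2,\ldots\end{cases} \tag{2.7}$$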

Thus, the survival function of ZOIG distribution is given by

$$S(k)=\begin{cases}1 & \text{if } k<0\\ (1-\pi_1)(1-p)+\pi_2 p & \text{if } k=0\\ (1-\pi_1-\pi_2)(1-p)^{k+1} & \text{if } k=1,2,\ldots\end{cases} \tag{2.8}$$

Therefore, the hazard function becomes

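$$r(k)=\begin{cases}\pi_1+(1-\pi_1-\pi_2)p & \text{if } k=0\\[4pt] \dfrac{\pi_2+(1-\pi_1-\pi_2)p(1-p)}{(1-\pi_1)(1-p)+\pi_2 p} & \text{if } k=1\\[4pt] p & \text{if } k=2,3,\ldots\end{cases} \tag{2.9}$$

which is constant in k, except at k = 0 and k = 1.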

Also using (2.1), we obtain the pgf of ZOIG distribution as

$$G_X(s)=\pi_1+s\pi_2+(1-\pi_1-\pi_2)\frac{p}{1-(1-p)s}.$$

3. Estimation of Model Parameters

In this section, we estimate the parameters by three well-known methods of parameter estimation, namely the method of moments, methods based on the empirical probability generating function, and the method of maximum likelihood.

3.1. Method of Moments Estimation (MME)

The easiest way to obtain estimators of the parameters is through the method of moments estimation (MME). Given a random sample X₁,...., X_n, i.e. independent and identically distributed (iid) observations from the GIG distribution, we equate the sample moments with the corresponding population moments (2.2) to get a system of (m+1) equations involving the (m+1) model parameters p, π₁,..., π_m of the form

$$m_r'=\sum_{i=1}^{m}k_i^{r}\pi_i+\left(1-\sum_{i=1}^{m}\pi_i\right)\mu_r'(p) \tag{3.1}$$

where r = 1, 2, …, (m + 1). Note that m′_r = (1/n) Σ_{j=1}^{n} X_j^r is the r^th raw sample moment. The values of π_i, i = 1, 2, …, m, and p obtained by solving the system of equations (3.1) are denoted by π̂_i(MM) and p̂_MM respectively. The subscript “(MM)” indicates the MME approach. Note that the parameter p must lie between 0 and 1, and hence its estimate must obey this restriction. Since the solution of (3.1) carries no such guarantee, we propose the corrected MMEs, which keep the moment estimator of p within this range, as

$$\hat p_{MM}^{\,c}=\hat p_{MM}\ \text{truncated at 0 and 1}\quad\text{and}\quad \hat\pi_{i(MM)}^{\,c}=\hat\pi_{i(MM)} \tag{3.2}$$

where π̂_i(MM)^c is the solution for π_i in (3.1) obtained after substituting p̂_MM.

Consider the case of the ZIG distribution, i.e., m = 1 and k₁ = 0, which leaves only two parameters to estimate, π₁ and p. To obtain the method of moments estimators, we can equate the mean and variance given in (2.3) with the sample mean (X̄) and sample variance (s²) respectively. This is an alternative to working with the raw sample moments. Hence we obtain

$$\begin{aligned}
\frac{(1-\pi_1)(1-p)}{p}&=\bar X\\
\frac{(1-\pi_1)\{1+\pi_1(1-p)\}(1-p)}{p^{2}}&=s^{2}
\end{aligned} \tag{3.3}$$

Solving the above equations simultaneously for p and π₁, we get p̂_MM^c = 2X̄/(s² + X̄ + X̄²) and π̂_1(MM)^c = (s² − X̄ − X̄²)/(s² − X̄ + X̄²).
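The closed-form ZIG moment estimators can be written as a small Python sketch. The function names are hypothetical, and the use of the (n − 1)-divisor sample variance from the standard library is an assumption, since the text does not say which divisor defines s².

```python
from statistics import mean, variance


def zig_mme_from_moments(xbar, s2):
    """Corrected MME for ZIG from the sample mean and sample variance.

    Implements the closed-form solution of (3.3); p is truncated to [0, 1]
    as in (3.2). Degenerate inputs (e.g. xbar == 0) are not handled here.
    """
    p_hat = 2.0 * xbar / (s2 + xbar + xbar ** 2)
    pi_hat = (s2 - xbar - xbar ** 2) / (s2 - xbar + xbar ** 2)
    p_hat = min(max(p_hat, 0.0), 1.0)
    return p_hat, pi_hat


def zig_mme(sample):
    """Corrected MME from raw data (assumes the n - 1 divisor for s^2)."""
    return zig_mme_from_moments(mean(sample), variance(sample))


# Sanity check with exact population moments for p = 0.2, pi_1 = 0.3:
# E(X) = 2.8 and Var(X) = 17.36 by (2.3).
print(zig_mme_from_moments(2.8, 17.36))
```

Feeding the population mean and variance back in recovers the true parameters, which is a quick way to confirm the algebra.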

Now let us consider another special case of GIG, the Zero-One Inflated Geometric (ZOIG) distribution, i.e., m = 2 and k₁ = 0 and k₂ = 1. It has three parameters π₁, π₂ and p and to estimate them we need to equate the first three raw sample moments with the corresponding population moments and we thus obtain the following system of equations:

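$$\begin{aligned}
m_1'&=\pi_2+(1-\pi_1-\pi_2)\frac{1-p}{p}\\
m_2'&=\pi_2+(1-\pi_1-\pi_2)\frac{p^{2}-3p+2}{p^{2}}\\
m_3'&=\pi_2+(1-\pi_1-\pi_2)\frac{(1-p)(p^{2}-6p+6)}{p^{3}}
\end{aligned} \tag{3.4}$$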

Similarly, another special case of GIG is the Zero-One-Two Inflated Geometric (ZOTIG) distribution. Here we have four parameters π₁, π₂, π₃ and p to estimate. This is done by solving the following system of four equations obtained by equating the first four raw sample moments with their corresponding population moments to have,

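$$\begin{aligned}
m_1'&=\pi_2+2\pi_3+(1-\pi_1-\pi_2-\pi_3)\frac{1-p}{p}\\
m_2'&=\pi_2+4\pi_3+(1-\pi_1-\pi_2-\pi_3)\frac{(1-p)(2-p)}{p^{2}}\\
m_3'&=\pi_2+8\pi_3+(1-\pi_1-\pi_2-\pi_3)\frac{(1-p)(p^{2}-6p+6)}{p^{3}}\\
m_4'&=\pi_2+16\pi_3+(1-\pi_1-\pi_2-\pi_3)\frac{(1-p)(2-p)(p^{2}-12p+12)}{p^{4}}
\end{aligned} \tag{3.5}$$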

Algebraic solutions to these systems of equations in (3.4) and (3.5) are obtained by using Mathematica. We note that these solutions may not fall in the feasible regions of the parameter space, so we put appropriate restrictions to these solutions as discussed for the ZIG distribution to obtain the corrected MMEs.

3.2. Methods based on Probability Generating Function (EPGE)

Given a random sample X₁, X₂,…, X_n of size n from a distribution with pgf G(s), the empirical probability generating function (epgf) is defined as

$$G_n(s)=\frac{1}{n}\sum_{i=1}^{n}s^{X_i}. \tag{3.6}$$

A number of parameter estimation methods have been proposed based on the epgf, owing to the fact that the epgf G_n(s) converges to the pgf G(s) in various modes as n → ∞. The simplest of these methods for estimating θ = (θ₁, θ₂, …, θ_m) equates the epgf to the pgf at m different numerical values of s and then solves the system of m equations of the form

$$G_n(s_i)=G(s_i),\quad i=1,2,\ldots,m, \tag{3.7}$$

where, −1 ≤ s_i ≤ 1. Another system of estimating equations can be obtained and solved by equating the derivatives,

$$G_n'(s_i)=G'(s_i),\quad i=1,2,\ldots,m, \tag{3.8}$$

or by considering the derivatives of logs,

$$G_n'(s_i)\,G(s_i)=G'(s_i)\,G_n(s_i),\quad i=1,2,\ldots,m. \tag{3.9}$$

A combination of these three types of estimating equations can be used as well. Kemp and Kemp (1988) [10] showed that some well-known estimation procedures, such as the method of moments discussed earlier, are special cases of these methods. For the ZIG distribution, which has two parameters (π₁ and p), so that m = 2 in (3.7), we used one such well-known procedure, the method of mean-and-zero-frequency, where the sample mean is equated to the population mean and the relative frequency of X = 0 in the sample (f₀) is equated to P(X = 0). Hence we obtain

$$\begin{aligned}
\bar X&=\frac{(1-\pi_1)(1-p)}{p}\\
f_0&=\pi_1+(1-\pi_1)p.
\end{aligned} \tag{3.10}$$

Solving the above set of equations for p and π₁, we get p̂_PG = (1 − f₀)/X̄ and π̂_1(PG) = (X̄f₀ + f₀ − 1)/(X̄ + f₀ − 1). The subscript “(PG)” indicates the EPGE approach. As in the case of the MME approach, the parameter estimates obtained using the EPGE approach may lie outside the parameter space. In that case, as with the MME approach, we need to correct them by appropriate truncation. We can extend this approach to any GIG distribution by equating the sample mean with the population mean and the probabilities of the inflated points P(X = k_i) with the corresponding relative frequencies in the sample (f_{k_i}). For example, in the case of the ZOIG distribution, we obtain the following set of equations:

$$\begin{aligned}
\bar X&=\pi_2+(1-\pi_1-\pi_2)\frac{1-p}{p}\\
f_0&=\pi_1+(1-\pi_1-\pi_2)p\\
f_1&=\pi_2+(1-\pi_1-\pi_2)p(1-p)
\end{aligned} \tag{3.11}$$

and for ZOTIG distribution, we have

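$$\begin{aligned}
\bar X&=\pi_2+2\pi_3+(1-\pi_1-\pi_2-\pi_3)\frac{1-p}{p}\\
f_0&=\pi_1+(1-\pi_1-\pi_2-\pi_3)p\\
f_1&=\pi_2+(1-\pi_1-\pi_2-\pi_3)p(1-p)\\
f_2&=\pi_3+(1-\pi_1-\pi_2-\pi_3)p(1-p)^{2}
\end{aligned} \tag{3.12}$$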

Again, algebraic solutions to these systems of equations in (3.11) and (3.12) are obtained by using Mathematica.
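For the ZIG case the mean-and-zero-frequency estimators are available in closed form, so they can be sketched without any symbolic solver. The function names below are hypothetical, and no truncation is applied in this sketch.

```python
def zig_epge_from_stats(xbar, f0):
    """Solve (3.10) for (p, pi_1) given the sample mean and zero frequency."""
    p_hat = (1.0 - f0) / xbar
    pi_hat = (xbar * f0 + f0 - 1.0) / (xbar + f0 - 1.0)
    return p_hat, pi_hat


def zig_epge(sample):
    """Mean-and-zero-frequency estimates computed from raw counts."""
    n = len(sample)
    xbar = sum(sample) / n
    f0 = sum(1 for x in sample if x == 0) / n
    return zig_epge_from_stats(xbar, f0)


# With the exact population values for p = 0.2, pi_1 = 0.3
# (mean 2.8, P(X = 0) = 0.44) the true parameters are recovered.
print(zig_epge_from_stats(2.8, 0.44))
print(zig_epge([0, 0, 1, 3, 4]))
```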

3.3. Maximum Likelihood Estimation (MLE)

Now we discuss the approach of estimating our parameters by the method of maximum likelihood estimation (MLE). Based on the random sample X = (X₁, X₂, …, X_n), we define the likelihood function L = L(p, π_i, 1 ≤ i ≤ m | X) as follows. Let Y_i be the number of observations at k_i, the i-th point with inflated probability; i.e., if I is an indicator function, then Y_i = Σ_{j=1}^{n} I(X_j = k_i), 1 ≤ i ≤ m. Also, let Y. = Σ_{i=1}^{m} Y_i be the total number of observations at the inflated points and n the total number of observations, so that (n − Y.) is the total number of observations at non-inflated points. Then,

$$\begin{aligned}
L&=\prod_{i=1}^{m}\Big\{\pi_i+\Big(1-\sum_{l=1}^{m}\pi_l\Big)P(k_i\mid p)\Big\}^{Y_i}\prod_{X_j\ne k_i}\Big\{\Big(1-\sum_{l=1}^{m}\pi_l\Big)P(X_j\mid p)\Big\}\\
&=\prod_{i=1}^{m}\Big\{\pi_i+\Big(1-\sum_{l=1}^{m}\pi_l\Big)P(k_i\mid p)\Big\}^{Y_i}\Big(1-\sum_{l=1}^{m}\pi_l\Big)^{n-Y_{\cdot}}\prod_{X_j\ne k_i}P(X_j\mid p)
\end{aligned} \tag{3.13}$$

The log likelihood function l = lnL is given by

$$l=\sum_{i=1}^{m}Y_i\ln\Big\{\pi_i+\Big(1-\sum_{l=1}^{m}\pi_l\Big)P(k_i\mid p)\Big\}+(n-Y_{\cdot})\ln\Big(1-\sum_{l=1}^{m}\pi_l\Big)+\sum_{X_j\ne k_i}\ln P(X_j\mid p)$$

But we have,

$$\sum_{X_j\ne k_i}\ln P(X_j\mid p)=(n-Y_{\cdot})\ln p+\ln(1-p)\Big(\sum_{j=1}^{n}X_j-\sum_{l=1}^{m}k_lY_l\Big)$$

hence the log likelihood function becomes

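$$\begin{aligned}
l&=\sum_{i=1}^{m}Y_i\ln\Big\{\pi_i+\Big(1-\sum_{l=1}^{m}\pi_l\Big)P(k_i\mid p)\Big\}+(n-Y_{\cdot})\ln\Big(1-\sum_{l=1}^{m}\pi_l\Big)\\
&\quad+(n-Y_{\cdot})\ln p+\ln(1-p)\Big(\sum_{j=1}^{n}X_j-\sum_{l=1}^{m}k_lY_l\Big)
\end{aligned} \tag{3.14}$$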

Now to obtain the MLEs, we maximize l in (3.14) with respect to the parameters π_i, 1 ≤ i ≤ m, and p over the appropriate parameter space. Differentiating l partially w.r.t the parameters and setting them equal to zero yields the following system of likelihood equations or score equations.

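$$\begin{aligned}
\frac{\partial l}{\partial \pi_i}&=\frac{Y_i}{\pi_i+\big(1-\sum_{l=1}^{m}\pi_l\big)P(k_i\mid p)}-\sum_{h=1}^{m}\frac{Y_h\,P(k_h\mid p)}{\pi_h+\big(1-\sum_{l=1}^{m}\pi_l\big)P(k_h\mid p)}-\frac{n-Y_{\cdot}}{1-\sum_{l=1}^{m}\pi_l}=0,\quad \forall\, i=1,\ldots,m;\\
\frac{\partial l}{\partial p}&=\sum_{i=1}^{m}\frac{Y_i\big(1-\sum_{l=1}^{m}\pi_l\big)P^{(p)}(k_i\mid p)}{\pi_i+\big(1-\sum_{l=1}^{m}\pi_l\big)P(k_i\mid p)}+\frac{n-Y_{\cdot}}{p}-\frac{n\bar X-\sum_{l=1}^{m}k_lY_l}{1-p}=0
\end{aligned} \tag{3.15}$$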

where, P^(p) (k_i | p) = (∂ / ∂p)P(k_i | p).

Of the three methods discussed, MLE is usually known to work best, but in the absence of closed-form expressions we have to resort to numerical optimization techniques, which makes it the most difficult to obtain. With the other two methods (MME and EPGE), estimators are much easier to obtain and do not require extensive computation, but they may lie outside the feasible region of the parameter space, so we need to correct them appropriately. We therefore conduct simulation studies in the next section to provide some guidance about their performances.
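As a rough sketch of the objective that a numerical optimizer would maximize, the log-likelihood (3.14) can be coded directly and cross-checked against the plain sum of log-probabilities. The function names and the dict layout `{k_i: pi_i}` are assumptions made here; the optimization step itself is not shown.

```python
import math


def gig_loglik(sample, p, inflation):
    """Log-likelihood (3.14); `inflation` maps k_i -> pi_i (hypothetical layout)."""
    n = len(sample)
    rest = 1.0 - sum(inflation.values())
    counts = {k: sum(1 for x in sample if x == k) for k in inflation}  # Y_i
    y_dot = sum(counts.values())                                       # Y.
    ll = 0.0
    for k, pi in inflation.items():
        if counts[k] > 0:
            ll += counts[k] * math.log(pi + rest * p * (1.0 - p) ** k)
    ll += (n - y_dot) * (math.log(rest) + math.log(p))
    ll += math.log(1.0 - p) * (sum(sample) - sum(k * y for k, y in counts.items()))
    return ll


def gig_loglik_direct(sample, p, inflation):
    """Same quantity as a direct sum of log pmf values from (1.5)."""
    rest = 1.0 - sum(inflation.values())
    return sum(
        math.log(inflation.get(x, 0.0) + rest * p * (1.0 - p) ** x)
        for x in sample
    )


data = [0, 0, 1, 2, 2, 3, 5, 7]
print(gig_loglik(data, 0.4, {0: 0.1, 2: 0.2}))
print(gig_loglik_direct(data, 0.4, {0: 0.1, 2: 0.2}))
```

Both functions require 0 < p < 1 and parameter values for which every probability is positive; a box-constrained optimizer would be one way to respect such limits.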

4. Simulation Study

We have considered the following three cases for our simulation study:

(i) m = 1, k₁ = 0 (Zero Inflated Geometric (ZIG) distribution)
(ii) m = 2, k₁ = 0, k₂ = 1 (Zero-One Inflated Geometric (ZOIG) distribution)
(iii) m = 3, k₁ = 0, k₂ = 1, k₃ = 2 (Zero-One-Two Inflated Geometric (ZOTIG) distribution)

For each model mentioned above, we generate random data X₁, …, X_n from the distribution (with given parameter values) N = 10,000 times. Let us denote a parameter (either π_i or p) by the generic notation θ. The parameter θ is estimated by three possible estimators: θ̂_MM(c) (the corrected MME), θ̂_PG(c) (the corrected EPGE) and θ̂_ML (the MLE). At the l-th replication, 1 ≤ l ≤ N, the estimates of θ are θ̂_MM(c)^(l), θ̂_PG(c)^(l) and θ̂_ML^(l) respectively. Then the standardized bias (called ‘SBias’) and standardized mean squared error (called ‘SMSE’) are defined and approximated as below

$$\begin{aligned}
\mathrm{SBias}(\hat\theta)&=E(\hat\theta-\theta)/\theta\approx\Big\{\sum_{l=1}^{N}(\hat\theta^{(l)}-\theta)/\theta\Big\}\Big/N\\
\mathrm{SMSE}(\hat\theta)&=E(\hat\theta-\theta)^{2}/\theta^{2}\approx\Big\{\sum_{l=1}^{N}(\hat\theta^{(l)}-\theta)^{2}/\theta^{2}\Big\}\Big/N
\end{aligned} \tag{4.1}$$

Note that θ̂ will be replaced by θ̂_MM(c), θ̂_PG(c) and θ̂_ML in our simulation study. We use SBias and SMSE instead of the actual bias and MSE because the standardized versions are more informative: an error of magnitude 0.01 in estimating a parameter with true value 1.00 is more severe than the same error when the parameter’s true value is 10.0. This fact is revealed by SBias and/or SMSE more clearly than by the actual bias and/or MSE.
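The Monte Carlo approximations in (4.1) can be sketched as follows for the ZIG case. Everything here is illustrative: the sampler assumes 0 ≤ π₁ < 1 so that the mixture form applies, the estimator shown is the mean-and-zero-frequency estimate of p only, the all-zero-sample convention is an assumption, and 2,000 replications are used instead of the 10,000 of the study to keep the run short.

```python
import random


def sample_zig(n, p, pi1, rng):
    """Draw n values from ZIG(p, pi1) via the mixture form (needs 0 <= pi1 < 1)."""
    out = []
    for _ in range(n):
        if rng.random() < pi1:
            out.append(0)                 # structural zero
        else:
            k = 0
            while rng.random() >= p:      # count failures before first success
                k += 1
            out.append(k)
    return out


def sbias(estimates, theta):
    """Monte Carlo approximation of SBias in (4.1)."""
    return sum((e - theta) / theta for e in estimates) / len(estimates)


def smse(estimates, theta):
    """Monte Carlo approximation of SMSE in (4.1)."""
    return sum(((e - theta) / theta) ** 2 for e in estimates) / len(estimates)


def epge_p(sample):
    """Mean-and-zero-frequency estimate of p, truncated to [0, 1]."""
    n = len(sample)
    xbar = sum(sample) / n
    f0 = sample.count(0) / n
    if xbar == 0:
        return 1.0                        # all-zero sample: convention assumed here
    return min(max((1.0 - f0) / xbar, 0.0), 1.0)


rng = random.Random(2018)                 # fixed seed for repeatability
reps = [epge_p(sample_zig(25, 0.2, 0.3, rng)) for _ in range(2000)]
print(sbias(reps, 0.2), smse(reps, 0.2))
```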

4.1. The ZIG Distribution

In our simulation study for the Zero Inflated Geometric (ZIG) distribution, we fix p = 0.2 and vary π₁ from 0.1 to 0.8 with an increment of 0.1 for n = 25, 100. The constrained optimization algorithm “L-BFGS-B” (Byrd et al. (1995) [3]) is implemented in the R programming language to obtain the maximum likelihood estimators (MLEs) of the parameters p and π₁, and the corrected MMEs (CMMEs) and EPGEs are obtained by solving a system of equations and imposing appropriate restrictions on the parameters. In order to compare the performances of the MLEs with those of the CMMEs and EPGEs, we plot the standardized biases (SBias) and standardized MSEs (SMSE) of these estimators over the allowable range of π₁. The SBias and SMSE plots are presented in Figure 1 for sample size 25. Due to space constraints, the plots for sample size 100 are presented in Figure 8 in the Appendix, along with a plot for p = 0.7 and n = 100 (Figure 9).

For the ZIG distribution with n = 25, from Figure 1(a) we see that MLE and EPGE outperform CMME for all values of π₁ with respect to SBias except at π₁ = 0.1. All the estimators are negatively biased for π₁ beyond 0.2. In Figure 1(b), we again see that MLE and EPGE uniformly outperform CMME and are increasing with the values of π₁. In Figure 1(c), MLE and EPGE consistently outperform CMME at all points w.r.t. SMSE. For all the estimators, the SMSE starts off at its highest value and then decreases rapidly until it reaches nearly zero. The SMSE of MLE and EPGE is consistently smaller than that of CMME in Figure 1(d); both start off at their lowest values and increase as the values of π₁ get higher. For n = 100, Figure 8 shows a similar pattern. However, as expected, the values of SBias and SMSE are considerably smaller for all estimators.

When p = 0.7, from Figure 9 we observe that MLE outperforms both CMME and EPGE. Again, all estimators of π₁ seem to be negatively biased, and the MLE and EPGE of p are almost unbiased up to π₁ = 0.7. We also observe that the SMSE of MLE for both parameters π₁ and p is the smallest among the three estimators.

4.2. The ZOIG Distribution

In the case of the Zero-One Inflated Geometric (ZOIG) distribution we have three parameters to consider, namely π₁, π₂ and p. For fixed p = 0.3 we vary π₁ and π₂ one at a time for sample sizes 25 and 100. Figure 2 presents the comparisons of the three types of estimators for the three parameters π₁, π₂ and p in terms of standardized bias and standardized MSE for n = 25, varying π₁ from 0.1 to 0.5 and keeping π₂ and p fixed at 0.15 and 0.3 respectively. Figure 10 in the Appendix presents the same for sample size 100, and Figure 11 in the Appendix gives a plot for p fixed at 0.7 and n = 100. In Figure 2(a), both MLE and EPGE outperform CMME at all points with respect to SBias. The SBias of MLE and EPGE starts above zero and quickly becomes negative, whereas that of CMME is negative throughout. In Figure 2(b), we see that the MLE is almost unbiased for values of π₁ up to 0.4, and the SBias of both CMME and EPGE is always negative. Again in Figure 2(c), we see that MLE outperforms both CMME and EPGE. In Figure 2(d), the SMSE of all the estimators is decreasing, but the SMSE of MLE stays below that of the other two estimators. In Figure 2(e), the SMSE of MLE stays constant at 0.55, and in Figure 2(f) the SMSE of MLE stays below that of the other estimators.

For sample size n = 100, from Figure 10 we observe that both MLE and EPGE for all the parameters are almost unbiased and perform better than CMME. However, in terms of SMSE, MLE is marginally better. We also observe that with the increase in sample size the SBias and SMSE of all the estimators have gone down.

When p = 0.7 and n = 100, from Figure 11 we observe that the SBias of π₁ is decreasing for all the estimators. For π₂, however, the SBias of MLE starts off with a very small positive value and then becomes negative, whereas both EPGE and CMME are negatively biased; MLE is almost unbiased for the parameter p. Also, in terms of SMSE, MLE seems to be the best estimator.

In our second scenario, which is presented in Figure 3 for sample size 25, we vary π₂ keeping π₁ and p fixed at 0.15 and 0.3 respectively. Figure 12 in the Appendix contains the same plots for sample size 100. Also in the Appendix, Figure 13 presents comparison plots of SBias and SMSE for sample size 100, varying π₂ and keeping π₁ and p fixed at 0.15 and 0.7 respectively. In Figure 3(a), we see that all three estimators are negatively biased, but MLE always performs better. In Figure 3(b), however, MLE starts off with a positive bias and then becomes almost unbiased for π₂. The SBias of EPGE is negative throughout but stays very close to 0, while the SBias of CMME stays between −0.6 and −0.8 throughout. In Figure 3(c), we see that MLE outperforms both CMME and EPGE throughout with respect to SBias. From Figures 3(d), 3(e) and 3(f), it is clear that MLE outperforms both CMME and EPGE with respect to SMSE for all permissible values of π₂. Thus we observe that the MLEs of all three parameters perform better in terms of both SBias and SMSE.

We observe very similar behavior from the plots for n = 100, presented in Figure 12. MLE seems to be performing best and SBias and SMSEs are going down as expected with an increase in sample size except for SMSE of π₂ (Figure 12(e)).

Again from Figure 13, when p = 0.7 and n = 100, we observe that MLE is performing better than EPGE and CMME w.r.t both SBias and SMSE.

4.3. The ZOTIG Distribution

For the Zero-One-Two Inflated Geometric (ZOTIG) distribution we have four parameters to consider, namely π₁, π₂, π₃ and p. For fixed p = 0.3 we vary π₁, π₂, and π₃ one at a time from 0.1 to 0.5 for sample size n = 25. Thus we have 12 estimators: π̂₁(MM)(c), π̂₂(MM)(c), π̂₃(MM)(c), p̂(MM)(c), π̂₁(PG), π̂₂(PG), π̂₃(PG), p̂(PG), π̂₁(ML), π̂₂(ML), π̂₃(ML) and p̂(ML). The comparisons of these estimators in terms of standardized bias and standardized MSE for sample size n = 25 are presented in Figures 4–6, and those for sample size 100 are presented in the Appendix (Figures 14–16).

In the first scenario of the ZOTIG distribution, which is presented in Figure 4, we vary π₁ keeping π₂, π₃ and p fixed at 0.2, 0.2 and 0.3 respectively. From Figure 4(a, b, c, d), we see that the CMMEs of all four parameters perform consistently worse than the MLEs and EPGEs with respect to SBias. The MLEs and EPGEs seem to have very similar performance. Also, from parts (e, f, g, h) of Figure 4 concerning the SMSE, we notice that the MLEs of all the parameters perform consistently better than the other two types of estimators. We observe similar performance of the three types of estimators when the sample size is increased to 100 (Figure 14).

In our second scenario, which is presented in Figure 5, we vary π₂ keeping π₁, π₃ and p fixed at 0.2, 0.2 and 0.3 respectively. We observe some interesting things in these plots. From Figure 5 we see that the MLE and EPGE of π₁, π₂, π₃ and p perform better up to π₂ = 0.4 with respect to SBias, but beyond that the performance of all the estimators is the same. However, the MLEs of all four parameters perform better than their CMME and EPGE counterparts with respect to SMSE. In Figure 15, when the sample size is 100, we observe that the CMMEs are consistently worse w.r.t. both SBias and SMSE and that the EPGEs perform marginally better than the MLEs.

In the third scenario, which is presented in Figure 6, we vary π₃ keeping π₁, π₂ and p fixed at 0.2, 0.2 and 0.3 respectively. Here we observe results similar to the second case of the ZOTIG distribution: the MLEs and EPGEs for all four parameters uniformly outperform the CMMEs with respect to SBias up to π₃ = 0.4. Also, as before, the MLEs uniformly outperform the CMMEs and EPGEs of all four parameters with respect to SMSE. But as the sample size increases to 100, we observe that the EPGEs perform marginally better than the MLEs (Figure 16).

Thus from our simulation study it is evident that MLE has an overall better performance than CMME and EPGE for all the Generalized Inflated Geometric models that we have considered. So in the next section, we consider an example where we fit an appropriate GIG model to the Swedish fertility data set.

5. Application of GIG Distribution

In this section, we consider the Swedish fertility data presented in Table 1 and try to fit a suitable GIG model. Since our simulation study in the previous section suggests that the MLE has an overall better performance, all of our estimations of model parameters are carried out using the maximum likelihood estimation approach. While fitting the Inflated Geometric models, the parameter estimates of some mixing proportions (π_i) came out negative, so we need to make sure that all the estimated probabilities according to the fitted models are non-negative. We tried all possible combinations of GIG models and then compared these Inflated Geometric models using the Chi-square goodness of fit test, Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC). While performing the Chi-square goodness of fit test, the last three categories of Table 1 are collapsed into one group due to small frequencies. More details of our model fitting are presented below.
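The two information criteria can be sketched as below, assuming the standard definitions AIC = 2q − 2l and BIC = q ln n − 2l for a model with q parameters and maximized log-likelihood l; the text does not state the formulas, so this is an assumption. Under these definitions BIC − AIC depends only on q and n, and with n = 1170 the gaps agree with the AIC/BIC pairs reported in this section for the GIG models (for example 3826.012 − 3800.688 = 25.324 for the five-parameter model).

```python
import math


def aic(loglik, n_params):
    """Akaike's Information Criterion (standard definition assumed)."""
    return 2.0 * n_params - 2.0 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion (standard definition assumed)."""
    return n_params * math.log(n_obs) - 2.0 * loglik


# The gap BIC - AIC depends only on the parameter count and sample size.
n_obs = 1170
for n_params in (2, 3, 4, 5):
    gap = bic(0.0, n_params, n_obs) - aic(0.0, n_params)
    print(n_params, round(gap, 3))
```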

First, we try single-point inflation at each of the four values (0, 1, 2 and 3). In this first phase, an inflation at 2 seems most plausible, as it gives the smallest AIC and BIC values, 4188.794 and 4198.924 respectively. However, the p-value of the Chi-square test is very close to 0, suggesting that this is not a good model. Next, we try two-point inflations at {0, 1}, {0, 2}, {0, 3}, {1, 2}, etc. At this stage, {2, 3} inflation seems most appropriate going by the values of AIC (3947.064) and BIC (3962.258). But a p-value of the Chi-square test close to 0 again makes it an inadequate model.

In the next stage, we try three-point inflation models, and here we note that a GIG with inflation set {0, 2, 3} significantly improves over the earlier {2, 3} inflation model (i.e., TTIG). This ZTTIG model significantly improves the p-value (but still close to 0) while maintaining a low AIC and BIC of 3841.169 and 3861.428 respectively. The main reason for low p-value is that this model is unable to capture the tail behavior. The estimated value of the parameters are (with k₁ = 0,k₂ = 2,k₃ = 3): π^1=−0.2340517, π^2=0.2816816, π^3=0.1376353, and p^=0.4068477 using the Maximum Likelihood Estimation approach. Interpretation of the negative π^1 is very difficult in this case, perhaps it can be thought of as a deflation point.

Finally, we fitted the full {0, 1, 2, 3} inflated model. We obtained the maximum likelihood estimates of the model parameters as $\hat{\pi}_1 = -2.60853103$, $\hat{\pi}_2 = -0.92017233$, $\hat{\pi}_3 = -0.04498528$, $\hat{\pi}_4 = 0.02737037$, and $\hat{p} = 0.59520603$. The AIC and the BIC values for this model are 3800.688 and 3826.012 respectively, both lower than those of all the previous models. Also, the p-value of the Chi-square test is 0.810944, indicating that this Zero-One-Two-Three inflated Geometric (ZOTTIG) model is a very good fit.

For the sake of completeness, we have also fitted a discrete Lindley distribution, proposed by Bakouch et al. (2014) [1], whose probability mass function is given in the Appendix (A.1), to the data and obtained the parameter estimate $\hat{\theta} = 0.614$. The AIC and BIC for this model are 4353.098 and 4358.163 respectively, and the p-value of the Chi-square test is almost 0.

Recently, this data set was also modeled using a Generalized Inflated Poisson distribution [2]. It was observed that the Zero-Two-Three inflated Poisson (ZTTIP) model gives the best fit to the data, with AIC and BIC values of 3805.485 and 3769.226 respectively. The Chi-square test produced a test statistic and p-value of 2.085 and 0.720 respectively. Although both the ZOTTIG and ZTTIP models seem to perform reasonably well for this data set, the ZOTTIG model seems to be a better fit based on the AIC and the Chi-square test. We have included a plot of the regular geometric model, the discrete Lindley model and the ZOTTIG model in Figure 7. It is evident from this plot that the ZOTTIG model performs considerably better than the regular geometric and discrete Lindley models for the Swedish fertility data.
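The AIC and BIC values reported for the GIG models in this section can be cross-checked against each other using the standard definitions AIC = 2k − 2ℓ and BIC = k ln(n) − 2ℓ, where ℓ is the maximized log-likelihood. The sketch below is illustrative only: it assumes that the number of free parameters k equals the number of inflation points plus one (for p), and it backs out the log-likelihood and the sample size n implied by each reported pair.

```python
import math

# (label, assumed number of free parameters k, reported AIC, reported BIC)
reported = [
    ("inflation at {2}",    2, 4188.794, 4198.924),
    ("inflation at {2, 3}", 3, 3947.064, 3962.258),
    ("ZTTIG {0, 2, 3}",     4, 3841.169, 3861.428),
    ("ZOTTIG {0, 1, 2, 3}", 5, 3800.688, 3826.012),
]


def implied_loglik(aic, k):
    """Invert AIC = 2k - 2*loglik."""
    return k - aic / 2.0


def implied_sample_size(aic, bic, k):
    """Invert BIC - AIC = k * (ln(n) - 2)."""
    return math.exp((bic - aic) / k + 2.0)


for label, k, aic, bic in reported:
    n = implied_sample_size(aic, bic, k)
    ll = implied_loglik(aic, k)
    print(f"{label:20s} loglik = {ll:10.3f}  implied n = {n:7.1f}")
```

If the assumed parameter counts are right, the four models should imply essentially the same sample size (roughly 1170), which is a quick consistency check on the reported criteria.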

6. Conclusion

This work deals with a Generalized Inflated Geometric (GIG) distribution, which can be thought of as a generalization of the regular Geometric distribution. This type of distribution can effectively model datasets with elevated counts at particular values. We have discussed some properties of this distribution and also outlined the parameter estimation procedure using the method of moments, the empirical probability generating function based method and maximum likelihood estimation. Simulation studies were also performed, and we found that the MLEs performed better than the corrected MMEs (CMMEs) and EPGEs in estimating the model parameters with respect to the standardized bias (SBias) and standardized mean squared error (SMSE). While performing the simulation, we observed that for certain ranges of the inflated proportions in the GIG models, the MMEs and EPGEs lie outside the parameter space. Nonetheless, we selected all permissible values and compared the overall performance of the CMMEs, EPGEs and MLEs for three special cases of GIG. Different GIG models were used for analyzing the fertility data of Swedish women and were compared with a discrete Lindley model. It was found that the Zero-One-Two-Three inflated Geometric (ZOTTIG) model is a good fit. Because of the extra parameter(s), the GIG distribution seems to be much more flexible in model fitting than the regular geometric distribution.

Acknowledgment

The authors are grateful to the editor and two referees for providing valuable comments and suggestions, which enhanced this work substantially.

Appendix A.

The probability mass function of the discrete Lindley distribution is given by

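\[
P(X = k \mid \theta) = p(k) =
\begin{cases}
\dfrac{e^{-\theta k}}{1+\theta}\left\{\theta\left(1 - 2e^{-\theta}\right) + \left(1 - e^{-\theta}\right)(1 + \theta k)\right\}, & k = 0, 1, 2, \ldots \\[6pt]
0, & \text{otherwise,}
\end{cases}
\tag{A.1}
\]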

where θ > 0.
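As a quick numerical sanity check of (A.1), the pmf can be evaluated directly. The following Python sketch is illustrative only: the function name `discrete_lindley_pmf` is hypothetical, and the value of θ is the estimate reported in the application section.

```python
import math


def discrete_lindley_pmf(k, theta):
    """Equation (A.1): pmf of the discrete Lindley distribution, theta > 0."""
    if k < 0 or k != int(k):
        return 0.0  # zero outside the support {0, 1, 2, ...}
    e = math.exp(-theta)
    bracket = theta * (1.0 - 2.0 * e) + (1.0 - e) * (1.0 + theta * k)
    return math.exp(-theta * k) / (1.0 + theta) * bracket


theta_hat = 0.614  # estimate reported in the text
# Truncate the support at 200; the remaining tail mass is negligible.
probs = [discrete_lindley_pmf(k, theta_hat) for k in range(200)]
print([round(q, 4) for q in probs[:5]])
print(min(probs) >= 0.0, abs(sum(probs) - 1.0) < 1e-9)
```

The probabilities are non-negative and sum to one up to truncation error, as expected for a valid pmf.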

References

[1]HS Bakouch, MA Jazi, and S Nadarajah, A new discrete distribution, Statistics, Vol. 48, No. 1, 2014, pp. 200-240.

[2]M Begum, A Mallick, and N Pal, A Generalized Inflated Poisson Distribution with Application to Modeling Fertility Data, Thailand Statistician, Vol. 12, 2014, pp. 135-159.

[3]RH Byrd, P Lu, J Nocedal, and C Zhu, A limited memory algorithm for bound constrained optimization, SIAM Journal on Scientific Computing, Vol. 16, No. 5, 1995, pp. 1190-1208.

[4]R Chen, A surveillance system for congenital malformations, Journal of the American Statistical Association, Vol. 73, No. 362, 1978, pp. 323-327.

[5]F Eggenberger and G Pólya, Über die Statistik verketteter Vorgänge, Zeitschrift für Angewandte Mathematik und Mechanik, Vol. 1, 1923, pp. 279-289. (German).

[6]W Feller, An introduction to probability theory and its applications, 3rd ed., John Wiley & Sons, Inc, New York, Vol. 1, 1968.

[7]W Feller, An introduction to probability theory and its applications, 3rd ed., John Wiley & Sons, Inc, New York, Vol. 2, 1971.

[8]P Jagers, How many people pay their tram fares?, Journal of the American Statistical Association, Vol. 68, No. 344, 1973, pp. 801-804.

[9]NL Johnson, S Kotz, and AW Kemp, Univariate discrete distributions, 3rd ed., John Wiley & Sons, Inc, Hoboken, New Jersey, 2005.

[10]CD Kemp and AW Kemp, Rapid Estimation for Discrete Distributions, The Statistician, Vol. 37, No. 3, 1988, pp. 243-255.

[11]M Melkersson and DO Rooth, Modeling female fertility using inflated count data models, Journal of Population Economics, Vol. 13, No. 2, 2000, pp. 189-203. (English).

[12]GP Patil, MT Boswell, SW Joshi, and MV Ratnaparkhi, Dictionary and classified bibliography of statistical distributions in scientific work: Discrete models, International Co-operative Publishing House, Vol. 1, 1984.

[13]EC Pielou, Runs of one species with respect to another in transects through plant populations, Biometrics, Vol. 18, No. 4, 1962, pp. 579-593.

[14]EC Pielou, Runs of healthy and diseased trees in transects through an infected forest, Biometrics, 1963, pp. 603-614.

[15]DD Wackerly, W Mendenhall, and RL Scheaffer, Mathematical statistics with applications, 7th ed., Cengage Learning, 2008.

Journal: Journal of Statistical Theory and Applications
Volume-Issue: 17 - 3
Pages: 491 - 519
Publication Date: 2018/09/30
ISSN (Online): 2214-1766
ISSN (Print): 1538-7887
DOI: 10.2991/jsta.2018.17.3.7
Open Access: This is an open access article under the CC BY-NC license (http://creativecommons.org/licences/by-nc/4.0/).
