Modeling Frailty Using Birnbaum Saunders Distribution for Bivariate Survival Data

Lal Pawimawha

doi:10.2991/jsta.d.210222.001

<Previous Article In Issue

Download article (PDF)

Next Article In Issue>

Volume 20, Issue 2, June 2021, Pages 355 - 363

Modeling Frailty Using Birnbaum Saunders Distribution for Bivariate Survival Data

Authors

Lal Pawimawha^*^,

Department of Statistics, Pachhunga University College, Mizoram, 796001, India

^*Corresponding author. Email: raltelalpawimawha08@gmail.com

Corresponding Author

Lal Pawimawha

Received 23 June 2020, Accepted 18 February 2021, Available Online 12 March 2021.

DOI: 10.2991/jsta.d.210222.001 How to use a DOI?
Keywords: Bayesian model comparison; Birnbaum–Saunders frailty; Exponential power distribution; Pareto distribution; Shared frailty; Weibull distribution
Abstract: Frailty models are survival models, which are designed to explore the properties of the unobserved heterogeneity in individuals pertaining to disease and death. In this paper, we propose the new frailty model called Birnbaum–Saunders frailty model with Weibull, Pareto, and exponential power as baseline distributions. Estimation of the proposed model parameters was done using Bayesian estimation procedure under Markov chain Monte Carlo technique. A simulation study was also presented to compare the true values and the estimated parameters. We then applied these proposed models to kidney infection data. Models comparison was done using various information criteria and Bayes factor. The best model for the data related to infection occur due to insertion of catheter is suggested.
Copyright: © 2021 The Authors. Published by Atlantis Press B.V.
Open Access: This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

Many researchers have used Cox [1] proportional-hazards model to analyze the survival data by considering the study population as homogeneous indicating that all the individuals are considered as having the same risk for getting diseases or death. But, in many practical situations, it may not be possible to consider all the individuals as homogeneous since they are differ from each other in contracting disease due to genetic factor or environmental factor or some other factor, which are generally omitted during the analysis of the survival data. In these conditions, it is not possible to assume the dependency between the survival times. To investigate such data, it is important to explain dependency within the subject in the multiple event times. In such case, it is expected to present an additional random part “frailty” into the proportional hazard model. Here, the random family impact itself is of prime interest in the investigation. These models are becoming alluring and more attractive due to the introduction of extra unobserved variable and name as “frailty,” which is assumed to be shared by a pair of organs.

The frailty model is also known as a random effect model, which is designed to account for heterogeneity amongst the groups of individuals due to unobserved factors. Generally, it is omitted by many researchers. Clayton [2] first proposed the model for such data without using frailty. The term frailty was first used by Vaupel et al. [3]. Keyfitz and Littman [4] also indicate that omitting individual heterogeneity can cause inaccurate conclusions. Balakrishnan and Liu [5] discussed Birnbaum–Saunders (BS) as frailty distribution in semi-parametric frailty model. In addition, BSas frailty distribution is used to analyze medical and cancer data (Leao et al. [6–8]). Many distributions such as gamma, inverse Gaussian, positive stable distribution, power variance function, Weibull, compound Poisson are also engaged as frailty distribution for modeling heterogeneity in the populations.

Generally, the frailty model is developed as an unobserved random variable is adding multiplicatively on the baseline hazard function (BHF). Considering univariate frailty model, let a continuous random variable $Y$ be a life time of an individual and the random variable $V$ be the frailty variable. The conditional hazard function (CHF) for a given frailty variable $V=v$ at time $y>0$ is,

$m(y∣v)=vm0(y)eX_β_$ (1.1)

where

$m0(y)$ is a BHF at time

$y>$ 0.

$X_$ and

$β_$ are the vector form of covariates and regression coefficients. The conditional survival function (CSF) for given frailty at time

$y>$ 0 is,

$S(y∣υ)=e−∫0ym(x∣υ)dx$ (1.2)

$=e−υM0(y)eX_β_$ (1.3)

where

$M0(y)$ is the cumulative BHF at time

$y>$ 0.

The marginal survival function(SF) can be obtained by taking integration of Equation (1.3) with respect to frailty variable $V$ having probability density function (pdf) $f(v)$ , and is given as

$S(y)=∫0∞S(y∣υ)f(υ)dυ =∫0∞e−υM0(y)eX_β_f(υ)dυ$ (1.4)

$=LV(M0(y)eX_β_)$ (1.5)

where

$LV(.)$ is the Laplace transform (LT) of the frailty variable

$V$ .

In this manuscript, we consider right censored data with BS as the frailty distribution. Weibull, Pareto and exponential power are used as the baseline distributions to explore the salient features of the shared BS frailty models based on multiplicative hazard. Here, the dependence between the survival times is due to BS distributed common frailty variable $V$ . The next sections are organized as follows: in Section 2, the general shared frailty model was introduced. Frailty distribution is given in Section 3. In Section 4, baseline distributions are given. Sections 5 and 6 are reserved for proposed models and Estimation strategies. Section 7 present the application to real life data and in Section 8, conclusion is given.

2. GENERAL SHARED FRAILTY MODEL

In the study, suppose there are n individuals having first and second failure times ( $y1k,y2k$ ), $Xpk$ (p = 1,2,…,q) be the found covariate for the $kth$ individual. Here it is acknowledged that the two survival times share a similar kind of covariates. Let $Vk$ represent the shared frailty for the $kth$ individual and are assumed to be adding multiplicatively on BHF. Another assumption is that conditional independence between the two survival times of an individual for given shared frailty. Under these cases, the CHF and CSF for the $kth$ individual at $jth$ (j = 1,2) survival times $yjk$ for given frailty are obtained as

$m(yjk∣vk,X_k)=vkm0(yjk)eX_kβ_$ (2.1)

where

$m0(yjk)$ is a BHF at time

$yjk>0$ and

$β_$ is a vector form of regression coefficients.

The conditional cumulative hazard function (CuHF) for the $kth$ individual at the $jth$ survival time $yjk>0$ for given frailty $Vk$ = $vk$ is

$M(yjk∣vk,X_k)=vkM0(yjk)ηk$ (2.2)

where

$ηk=eX_kβ_$ and

$M0(yjk)$ is the cumulative BHF at time

$yjk>0$ .

The CSF for $kth$ individual at $jth$ survival time $yjk>0$ for given frailty $Vk$ = $vk$ is

$S(yjk∣vk,X_k)=e−M(yjk∣vk,X_k) =e−vkM0(yjk)ηk$ (2.3)

Under the assumption of independence, the bivariate CSF for given frailty $Vk$ = $vk$ at time $y1k>0$ and $y2k>0$ is

$S(y1k,y2k∣vk,X_k)=S(y1k∣vk,X_k)S(y2k∣vk,X_k) =e−vk(M01(y1k)+M02(y2k))ηk$ (2.4)

The unconditional SF (in bivariate case) at time $y1k>0$ and $y2k>0$ can be derived by taking integration of Equation (2.4) with respect to frailty variable $Vk$ having the pdf $f(vk)$ , and is given as

$S(y1k,y2k∣X_k)=∫VkS(y1k,y2k∣vk)f(vk)dvk =∫Vke−vk(M01(y1k)+M02(y2k))ηkf(vk)dvk =LVk[(M01(y1k)+M02(y2k))ηk]$ (2.5)

where

$LVk(.)$ is the LT of frailty variable of

$Vk$ for

$kth$ individual.

Here onward we represent $S(y1k,y2k∣X_k)$ as $S(y1k,y2k)$ .

3. SHARED BIRNBAUM SAUNDERS FRAILTY MODEL

The first frailty distribution we have considered here is BS distribution [9]. The pdf of the BS distribution is unimodal. The shape of the HF plays a significant role in the analysis of lifetime data. Leiva [10] also showed that BS distribution is right-skewed (asymmetrical), continuous, and unimodal. It is otherwise called the fatigue life distribution and has gotten extensive consideration because of its theoretical arguments, its alluring properties and its connection with the normal distribution. In this regard, Mann et al. [11] first conjectured that the HF is not an increasing function, but the average HF is nearly nondecreasing function. Later, Kundu et al. [12] and Bebbington et al. [13] showed that the HF of the BS distribution is a unimodal function. In many real-life situations, the HF may not be monotone, and it increases up to a point and then decreases.

A continuous random variable $V$ is said to follow BS distribution with scale parameter $α$ and shape parameter $ξ$ if its probability density function is

$f(v)={122παξ[(ξx)1/2+(ξx)3/2]e[−12α2(xξ+ξx−2)];v>0,ξ>0,α>00;otherwise,$ (3.1)

having mean

$ξ2$ (

$α2$ +2) and variance

$ξ24$ (5

$α4$ +4

$α2$ )

The LT is derived as

$LV(s)=e1−2s+1α2ξα2ξξα2(2s+1α2ξ)1/4(1+2s+1α2ξα2ξξ)2α(2s+1α2ξα2ξ)3/2(α2ξ)3/4ξ$

For the identifiability issues, we assume V has mean one which presents restriction on the parameters $α$ = $θ$ and $ξ$ = $2θ2+2$ . Under this restriction, the pdf and the LT of a BS distribution are derived as

$f(v)={(2+θ2)(21x(2+θ2)+22(1x(2+θ2))3/2)42πθe−2+2x(2+θ2)+12x(2+θ2)2θ2;v>0,θ>00;otherwise,$ (3.2)

and

$LV(s)=e1θ2−1+4s+2θ2θ2(2+θ2)(1+4s+2θ2)1/4θ(2+θ2+1+4s+2θ2θ2(2+θ2))212+θ2(θ2(2+θ2))7/4(1+4s+2θ2θ2(2+θ2))3/2$

Substituting the LT in Equation (2.5), we get the unconditional bivariate survival function for the $kth$ individual at the time $y1k$ and $y2k$ as,

$S(y1k,y2k)=e1θ2−1+4ηk((M01(y1k)+M02(y2k)))+2θ2θ2(2+θ2)(1+4ηk((M01(y1k)+M02(y2k)))+2θ2)1/4×θ(2+θ2+1+4ηk((M01(y1k)+M02(y2k)))+2θ2θ2(2+θ2))212+θ2(θ2(2+θ2))7/4(1+4ηk((M01(y1k)+M02(y2k)))+2θ2θ2(2+θ2))3/2$ (3.3)

where

$M01(y1k)$ and

$M02(y2k)$ are the cumulative BHFs of the life time random variables

$Y1k$ and

$Y2k$ , respectively.

4. BASELINE DISTRIBUTIONS

The first baseline distribution engaged here is Weibull distribution, which is one of the most commonly used distribution for modeling life-time data in survival analysis. A continuous random variable $Y$ is said to have two parameter Weibull distribution if the SF, the HF, and the CuHF, respectively;

$S(y)=e−λyα;y>0,α>0,λ>0$ (4.1)

$h(y)=αλyα−1;y>0,α>0,λ>0$ (4.2)

$H(y)=λyα;y>0,α>0,λ>0$ (4.3)

where

$λ$ and

$α$ represents scale and shape parameters, respectively. The HF is increasing when

$α>1$ and decreasing when

$α<1$ and constant hazard rate is observed for the distribution when

$α$ = 1.

The second baseline distribution engaged here is the Pareto distribution [14]. A continuous random variable $Y$ is said to have the Pareto distribution with the scale parameter $λ$ and the shape parameter $α$ if its SF is

$S(y)=(λy+1)−α;y>0,α>0,λ>0$ (4.4)

and the HF and the CuHF as

$h(y)=αλ1+λy;y>0,α>0,λ>0$ (4.5)

$H(y)=αlog(λy+1)$ (4.6)

It is observed that $h(y)$ decreases with $y$ , $λ>0$ , $α>0$ . Hence this distribution is also among the class of decreasing failure rate.

The third and last baseline distribution we have considered here is the exponential power distribution. A continuous random variable $Y$ is said to have the exponential power distribution if its SF is

$S(y)=e[1−eλyα];y>0,α>0,λ>0$ (4.7)

where

$λ$ and

$α$ represents scale and shape parameters of the distribution respectively. The HF and the CuHF are respectively,

$h(y)=αλyα−1eλyα;y>0,α>0,λ>0$ (4.8)

$H(y)=eλyα−1;y>0,α>0,λ>0$ (4.9)

It is observed that the HF is increasing, decreasing and $U$ shaped for the exponential power distribution.

5. PROPOSED MODELS

Replacing the CuHF for the Weibull distribution, Pareto distribution, and exponential power distribution in Equation (3.3), we get the unconditional bivariate SFs at time $y1k>0$ and $y2k>0$ as

$S(y1k,y2k)=e1θ2−1+4ηk((λ1y1kα1k+λ2y2kα2k))+2θ2θ2(2+θ2)(1+4ηk((λ1y1kα1k+λ2y2kα2k))+2θ2)1/4×θ(2+θ2+1+4ηk((λ1y1kα1k+λ2y2kα2k))+2θ2θ2(2+θ2))212+θ2(θ2(2+θ2))7/4(1+4ηk((λ1y1kα1k+λ2y2kα2k)) + 2θ2θ2(2+θ2))3/2$ (5.1)

$S(y1k,y2k)=e1θ2−1+4ηk((α1log(λ1y1k+1)+α2log(λ2y2k+1))) + 2θ2θ2(2+θ2)(1+4ηk((α1log(λ1y1k+1)+α2log (λ2y2k+1)))+2θ2)1/4θ(2+θ2+1+4ηk((α1log(λ1y1k+1)+α2log(λ2y2k+1)))+2θ2θ2(2+θ2))212+θ2(θ2(2+θ2))7/4(1+4ηk((α1log(λ1y1k+1)+α2log(λ2y2k+1)))+2θ2θ2(2+θ2))3/2$ (5.2)

$S(y1k,y2k)=e1θ2−1+4ηk((α1log(λ1y1k+1)+α2log(λ2y2k+1)))+2θ2θ2(2+θ2)(1+4ηk((α1log(λ1y1k+1) +α2log(λ2y2k+1))+2θ2)1/4θ(2+θ2 +1+4ηk((α1log(λ1y1k+1)+α2log(λ2y2k+1))+2θ2θ2(2+θ2))212+θ2(θ2(2+θ2))7/4(1+4ηk((α1log(λ1y1k+1)+α2log(λ2y2k+1))+2θ2θ2(2+θ2))3/2$ (5.3)

Here from now onwards we denote Equations (5.1−5.3) as Model I, Model II, and Model III, respectively. Model I is shared BS frailty model with Weibull baseline distribution, Model II is shared BS frailty model with Pareto baseline distribution, and Model III is shared BS frailty model with exponential power baseline distribution.

6. ESTIMATION STRATEGIES

Let us consider n individuals under the study having first and second failure times given by ( $y1k,y2k$ ). Let $d1k$ and $d2k$ be considered as the observed censoring times for the $kth$ individual ( $k=1,2,3,…,n$ ) associated with the first and second recurrence times respectively. It is also assumed that censoring scheme and life times of individuals are independent.

The likelihood function (LF) associated with the $kth$ individual having bivariate life time random variable is given by

$Lk(y1k,y2k)={f1(y1k,y2k),;y1k<d1k,y2k<d2k,f2(y1k,d2k),;y1k<d1k,y2k>d2k,f3(d1k,y2k),;y1k>d1k,y2k<d2k,f4(d1k,d2k),;y1k>d1k,y2k>d2k.$

and the LF is

$L(ψ_,β_,σ)=∏k=1n1f1(y1k,y2k)∏k=1n2f2(y1k,d2k)∏k=1n3f3(d1k,y2k)∏k=1n4f4(d1k,d2k)$ (6.1)

where

$σ$ ,

$ψ_$ , and

$β_$ are respectively the frailty parameter, the baseline parameters and the regression coefficients, which are in vector form.

Where $n1,n2,n3$ , and $n4$ are the random number of observations for which first and second failure times ( $y1k,y2k$ ) observed to lie in the ranges $y1k<d1k,y2k<d2k$ ; $y1k<d1k,y2k>d2k$ ; $y1k>d1k,y2k<d2k$ and $y1k>d1k,y2k>d2k$ , respectively and

$f1(y1k,y2k)=∂2S(y1k,y2k)∂y1k∂y2kf2(y1k,d2k)=−∂S(y1k,d2k)∂y1kf3(d1k,y2k)=−∂S(d1k,y2k)∂y2kandf4(d1k,d2k)=S(d1k,d2k)$ (6.2)

The LFs given in Equation (6.1) is very difficult to solve using the Newton–Raphson method. Maximum likelihood estimations (MLEs) may not be convergent since large number of parameters are involved in the models. Therefore, the Bayesian approach of estimation is engaged to estimate the parameters involved in the models, which does not suffer any such kind of troubles. Many authors have studied the Bayesian approach for the estimation of frailty model parameters. Some of the authors are Ibrahim et al. [15] and references therein, Santos and Achcar [16]. Santos and Achcar [16] have used log-normal as frailty distribution with Weibull and generalized gamma distribution as the baseline in parametric models. Ibrahim et al. [15] and references therein used gamma frailty in the Weibull model and the piecewise exponential model.

The joint posterior density function associated with the parameters for given failure times is obtained as

$π(α1,λ1,α2,λ2,θ,β_)∝L(α1,λ1,α2,λ2,θ,β_) ×q1(α1)q2(λ1)q3(α2)q4(λ2)q5(θ)∏i=1kgi(β_i),$

where

$qi(.)$ (

$i=1,2,3,…,5$ ) represents the prior density function having known hyper parameters which is concurring justification for baseline parameters and variance of frailty;

$gi(.)$ represents prior density function associated with the regression coefficient

$βi$ ;

$β_i$ is regression coefficients, which represent in a vector except

$βi$ ,

$i=1,2,…,k$ and

$L(.)$ is given by Equation (6.1). Here all the parameters are assumed to be independently distributed.

Metropolis–Hasting and Gibbs Sampler Algorithm is utilized for the estimation of the parameters in the models fitted with the above prior density function and likelihood equation. Geweke test and Gelman–Rubin Statistics given by Geweke [17] and Gelman and Rubin [18] to observe the Markov chain to be converged to a stationary distribution. To explore the property of the chain, to calculate the burn-in period and to choose autocorrelation lag, we utilized trace plots, coupling from the past plots and sample autocorrelation plots, respectively.

The adequacy of the models is checked by using predictive intervals, which is based on the sample generated from the posterior predictive density. For comparing different proposed models, we used various information criteria such as Bayesian Information Criteria (BIC), Akaike Information Criteria (AIC), and Deviance Information Criteria (DIC). Also, the Bayes factor $Buv$ (Kass and Raftery [19]) is utilized for comparing the models $Mu$ against $Mv$ .

7. APPLICATION TO REAL-LIFE DATA

The models performance was examined by applying the proposed models to the data related to infection of kidney occurring due to catheter insertion [20]. First, Kolmogorov–Smirnov (KS) test is utilized to check the goodness of fit for the given data and it is seen that the p values for the first and second recurrences are adequately enormous to say that there is no motivation to reject the hypothesis that the first and second recurrence time to follow the distributions with SF as given in Equations (4.1), (4.4), and (4.7) in univariate case and this condition is assumed to be valid for bivariate case. The p values obtained from KS test are given in Table 1.

	Recurrence Times
Baseline Distribution	First	Second
Weibull	0.6671	0.5003
Pareto	0.9736	0.8452
Exponential power	0.7012	0.4929

Table 1

p values of Kolmogorov–Smirnov (KS) statistics for goodness of fit test for kidney infection data set.

The posterior results of the different proposed models are presented in Tables 2–4. It comprises the estimate (posterior mean), standard error (SE), 95% lower and upper credible limits (LCL and UCL), Gelman and Rubin (GR) statistics values with p values and Geweke (GK) test values. From Tables 2–4, it is observed that regression coefficients for all the models are more or less the same. Also for all these different proposed models, the value zero is not a credible value for all the credible intervals of the regression coefficients, so all the covariates seem to be significant. It is observed that the values of frailty parameter for all the models are different. In particular, the sex effect $β2$ = −1.8987, −0.8597, and −1.6724 indicates that in the frailty models the female patients have significantly lower risk of infection than the male patients for all the models since the credible interval of regression coefficient does not contain zero. Also, estimate of $θ$ = 1.1217, 0.6175, and 1.1938 for the models shows that there exists heterogeneity in the population of patients though each patient within a cluster share the same covariates that is, age, sex, and disease type.

Parameter	Estimate	SE	LCL	UCL	GK Values	p Values	GR Values
Burn in period = 7200;		Autocorrelation lag = 250
$α1$	1.2271	0.0322	1.1701	1.2938	0.0070	0.5028	0.9999
$α2$	1.1558	0.0482	1.0510	1.2344	0.0115	0.5046	1.0017
$λ1$	0.0178	0.0047	0.0093	0.0271	0.0097	0.5038	1.0002
$λ2$	0.0150	0.0045	0.0074	0.0243	0.0019	0.5007	0.9999
$θ$	1.1217	0.0486	1.0409	1.2200	0.0094	0.5037	1.0011
$β1$	0.0082	0.0005	0.0072	0.0091	0.0002	0.5000	1.0001
$β2$	−1.8987	0.3475	−2.5616	−1.2270	−0.0120	0.5000	1.0008
$β3$	0.0421	0.0047	0.0331	0.0510	0.0106	0.5042	1.0000
$β4$	0.0616	0.0050	0.0528	0.0711	0.0049	0.5019	1.0001
$β5$	0.0021	0.0005	0.0011	0.0030	0.0006	0.5002	1.0000

SE, standard error; LCL, lower credible limit; UCL, upper credible limit; GK, Geweke.

Table 2

Posterior results for Model I using kidney infection data set.

Parameter	Estimate	SE	LCL	UCL	GK Values	p Values	GR Values
Burn in period = 4900;		Autocorrelation lag = 220
$α1$	2.2022	0.2959	1.6299	2.8432	−0.0023	0.4990	1.0005
$α2$	2.6089	0.4641	1.7035	3.4416	0.0045	0.5018	1.0085
$λ1$	0.0159	0.0043	0.0102	0.0264	0.0069	0.5027	1.0000
$λ2$	0.0089	0.0005	0.0080	0.0099	0.0021	0.5008	0.9999
$θ$	0.6175	0.0514	0.5427	0.7241	0.0234	0.5093	1.0014
$β1$	0.0020	0.0005	0.0010	0.0029	0.0048	0.5019	1.0006
$β2$	−0.8597	0.1796	−1.1968	−0.5025	−0.0074	0.5019	1.0058
$β3$	0.0620	0.0051	0.0529	0.0712	0.0180	0.5072	1.0033
$β4$	0.0620	0.0051	0.0525	0.0712	0.0051	0.5020	1.0056
$β5$	0.0209	0.0052	0.0116	0.0303	−0.0006	0.4997	1.0035

SE, standard error; LCL, lower credible limit; UCL, upper credible limit; GK, Geweke.

Table 3

Posterior results for Model II using kidney infection data set.

Parameter	Estimate	SE	LCL	UCL	GK Values	p Values	GR Values
Burn in period = 5100;		Autocorrelation lag = 375
$α1$	0.5117	0.0293	0.4515	0.5720	−0.0024	0.4990	1.0014
$α2$	0.5952	0.0250	0.5517	0.6415	−0.0042	0.4982	0.9999
$λ1$	0.1623	0.0330	0.1073	0.2432	0.0024	0.5009	1.0005
$λ2$	0.0891	0.0049	0.0807	0.0989	0.0038	0.5015	0.9999
$θ$	1.1938	0.0479	1.1116	1.2846	0.0054	0.5021	1.0020
$β1$	0.0015	0.0004	0.0006	0.0024	−0.0043	0.4982	1.0009
$β2$	−1.6724	0.3614	−2.3837	-0.8913	0.0023	0.4982	1.0001
$β3$	0.0741	0.0048	0.0646	0.0828	0.0083	0.5033	1.0000
$β4$	0.0822	0.0049	0.0731	0.0906	0.0001	0.4999	1.0000
$β5$	0.0021	0.0005	0.0012	0.0030	0.0089	0.5035	1.0009

SE, standard error; LCL, lower credible limit; UCL, upper credible limit; GK, Geweke.

Table 4

Posterior results for Model II using kidney infection data set.

The adequacy of the different proposed models is checked by constructing 95% and 50% predictive intervals of a generated random sample from predictive distribution as given in Gelfand and Ghosh [21] and Sahu et al. [22] and counted the number of times recurrence happen in the intervals for first and second kidney infections. Under the Model I, Model II, and Model III, 76, 65; 76, 62; and 71, 58, respectively out of 76 observations are contained in the 95% and 50% predictive intervals. This shows that the three models are adequate for the data.

The best model as per AIC, BIC, and DIC values is Model I, since the values of AIC, BIC, and DIC of Model I are smaller than Model II and Model III, which is given in Table 5. But the distinction between AIC, BIC, and DIC values for Model I, Model II, and Model III are very little, so AIC, BIC, and DIC values may not be commendable to make a choice between the proposed models. Here, we consider Bayes factor to compare model $u$ against model $v$ . The value in Table 6 shows that Model I is better than Model II and Model III as the corresponding value of $2log(Buv)$ is greater than 10 indicating that there is a very strong positive to favor Model I than Model II and Model III for the given dataset, which affirms our earlier result given in Table 5. Hence from all the demonstrated comparison criteria we can say Model I is better than Model II and Model III for modeling the given data.

Model No.	AIC	BIC	DIC	Log likelihood
Model I	690.1043	706.4801	675.1946	−335.0521
Model II	702.0792	718.4551	685.4432	−341.0396
Model III	698.0372	714.413	685.1324	−339.0186

BIC, Bayesian Information Criteria; AIC, Akaike Information Criteria; DIC, Deviance Information Criteria.

Table 5

AIC, BIC, and DIC values for all the proposed models.

Numerator Model Against Denominator Model	$2loge(Buv)$	Range	Evidence Against Model in Denominator
$MI$ against $MII$	11.1430	$≥ 10$	Very Strong Positive
$MI$ against $MIII$	14.1158	$≥ 0$ and $< 2$	Not significant
$MIII$ against $MII$	2.9728	$≥ 2$ and $< 6$	Positive

Table 6

The values of Bayes factor and decision based on it.

8. CONCLUSION

In this paper we discussed the results for BS shared frailty models with Weibull, Pareto, and exponential power as baseline distributions. Our main objective is to investigate which model fits better with the given infectious disease data. We performed the analysis of the given data by using R. Since, the LFs do not converge and maximum likelihood fails to estimate the parameters, Bayesian approach was utilized to analyze the data.

The comparison between the three proposed models was done using AIC, BIC, and DIC values. The smallest AIC value was Model I(BS frailty model with Weibull baseline distribution). The same result holds for BIC and DIC value. But these differences are not much significant. To take the decision about Model I, Model II, and Model III, we used the Bayes factor. We observed that, Model I is the best. The values of frailty variances were also calculated, which are 1.2197, 0.3971 and 1.3515 for Model I, Model II and Model III respectively indicating that there is a strong evidence of high degree of heterogeneity in the population of patients. Bayes factor was employed to test the frailty parameter $θ=0$ and it was observed that frailty is present. Negative value of regression coefficient $(β2)$ of covariate sex indicated that the female patients have a slightly lower risk of infection.

By referring all the above analysis, we suggested that the new shared BS frailty model with Weibull distribution as baseline distribution is the best in the proposed models for modeling of kidney infection data. This also confirms that model with frailty is appropriate to model kidney infection dataset. Therefore, from the above results, it is evident that cluster effects (frailty variable) significantly incorporate with the recurrence time of kidney infection due to insertion of catheter. One important property that was observed from BS frailty model with Weibull baseline distribution is the estimate of frailty variance, which is larger than earlier frailty models given by McGrilchrist and Aisbett [20] on log-normal frailty, Hanagal and Dabade [23] on compound Poisson frailty, and Hanagal and Dabade [24] on gamma frailty models.

CONFLICTS OF INTEREST

The authors declare that they have no conflicts of interest.

AUTHORS' CONTRIBUTIONS

The author developed the models, performed the analytic calculations, numerical simulations, wrote the manuscript and discussed the results.

ACKNOWLEDGEMENTS

The author would like to thank the referee for their valuable suggestions and comments which improved the earlier version of the manuscript.

REFERENCES

1.D.R. Cox, J. R. Stat. Soc. B, Vol. 34, 1972, pp. 187-220.

2.D.G. Clayton, Biometrika, Vol. 65, 1978, pp. 141-151.

3.J.W. Vaupel, K.G. Manton, and E. Stallard, Demography, Vol. 16, 1979, pp. 439-454.

4.N. Keyfitz and G. Littman, Popul. Stud., Vol. 33, 1979, pp. 333-342.

5.N. Balakrishnan and L. Liu, Revstat., Vol. 16, 2018, pp. 231-255.

6.J. Leao, V. Leiva, H. Saulo, and V. Tomazella, Biometrical J., Vol. 59, 2017, pp. 291-314.

7.J. Leao, V. Leiva, H. Saulo, and V. Tomazella, Brazilian J. Probab. Stat., Vol. 32, 2018, pp. 707-729.

8.J. Leao, V. Leiva, H. Saulo, and V. Tomazella, Stat. Med., Vol. 37, 2018, pp. 4421-4440.

9.Z.W. Birnbaum and S.C. Saunders, J. Appl. Probab., Vol. 6, 1969, pp. 319-327.

10.V. Leiva, The Birnbaum-Saunders Distribution, Academic Press, New York, US, USA, 2016.

11.N.R. Mann, R.E. Schafer, and N.D. Singpurwalla, Methods for Statistical Analysis of Reliability and Life Data, John Wiley and Sons, New York, NY, USA, 1974.

12.D. Kundu, N. Kannan, and N. Balakrishnan, Comput. Stat. Data Anal., Vol. 52, 2008, pp. 2692-2702.

13.M. Bebbington, C.-D. Lai, and R. Zitikis, Math. Sci., Vol. 33, 2008, pp. 49-56.

14.J.V. Deshpande and S.G. Purohit, Life Time Data: Statistical Models and Methods, World Scientific, New Jersey, NJ, USA, 2005.

15.J.G. Ibrahim, M.-H. Chen, and D. Sinha, Bayesian Survival Analysis, Springer, Verlag, New York, NY, USA, 2001.

16.C.A. Santos and J.A. Achcar, J. Stat. Theory Appl., Vol. 9, 2010, pp. 233-253.

17.J. Geweke, J.M. Bernardo, J. Berger, A.P. Dawid, and A.F.M. Smith (editors), Bayesian Statistics 4, Oxford University Press, Oxford, UK, 1992, pp. 169-193.

18.A. Gelman and D.B. Rubin, J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F.M. Smith (editors), Bayesian Statistics 4, Oxford University Press, Oxford, 1992, pp. 625-632.

19.R.A. Kass and A.E. Raftery, J. Am. Stat. Assoc., Vol. 90, 1995, pp. 773-795.

20.C.A. McGilchrist and C.W. Aisbett, Biometrics, Vol. 47, 1991, pp. 461-466.

21.A.E. Gelfand and S.K. Ghosh, Biometrika, Vol. 85, 1998, pp. 1-11.

22.S.K. Sahu, D.K. Dey, H. Aslanidou, and D. Sinha, Life Time Data Anal., Vol. 3, 1997, pp. 123-137.

23.D.D. Hanagal and A.D. Dabade, Int. J. Stat. Manag. Syst., Vol. 7, 2012, pp. 36-84.

24.D.D. Hanagal and A.D. Dabade, Commun. Stat. Simul. Comput., Vol. 42, 2013, pp. 910-931.

<Previous Article In Issue

Download article (PDF)

Next Article In Issue>

Journal: Journal of Statistical Theory and Applications
Volume-Issue: 20 - 2
Pages: 355 - 363
Publication Date: 2021/03/12
ISSN (Online): 2214-1766
ISSN (Print): 1538-7887
DOI: 10.2991/jsta.d.210222.001 How to use a DOI?
Open Access: This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

Cite this article

ris enw bib

TY  - JOUR
AU  - Lal Pawimawha
PY  - 2021
DA  - 2021/03/12
TI  - Modeling Frailty Using Birnbaum Saunders Distribution for Bivariate Survival Data
JO  - Journal of Statistical Theory and Applications
SP  - 355
EP  - 363
VL  - 20
IS  - 2
SN  - 2214-1766
UR  - https://doi.org/10.2991/jsta.d.210222.001
DO  - 10.2991/jsta.d.210222.001
ID  - Pawimawha2021
ER  -

download .riscopy to clipboard