Research Article Volume 7 Issue 3
A two–parameter Sujatha distribution
Mussie Tesfay, Rama Shanker
Department of Statistics, Eritrea Institute of Technology Asmara, Eritrea
Received: March 15, 2018 | Published: May 10, 2018
Citation: Tesfay M, Shanker R. A two–parameter Sujatha distribution. Biom Biostat Int J. 2018;7(3):188–197. DOI: 10.15406/bbij.2018.07.00208
Abstract
This paper proposes a two–parameter Sujatha distribution (TPSD), which includes the size–biased Lindley distribution and the Sujatha distribution as particular cases. Its important statistical properties, including its shapes for varying values of the parameters, coefficient of variation, skewness, kurtosis, index of dispersion, hazard rate function, mean residual life function, stochastic ordering, mean deviations, Bonferroni and Lorenz curves, and stress–strength reliability, have been discussed. The estimation of parameters has been discussed using the method of moments and the method of maximum likelihood. An application of the distribution has been discussed with a real lifetime data set.
Keywords: Sujatha distribution, moments, statistical properties, estimation of parameters, application
Introduction
The statistical analysis and modeling of lifetime data are crucial for statisticians working in various fields of knowledge, including medical science, engineering, social science, behavioral science, insurance and finance, among others. The classical one–parameter lifetime distributions that have been popular for modeling lifetime data are the exponential distribution and the Lindley distribution proposed by Lindley.1 Shanker et al.2 carried out a detailed critical study of applications to lifetime data from engineering and biomedical science and observed that the exponential and Lindley distributions are not always suitable, from a theoretical or an applied point of view, owing to the presence of a single parameter. In search of a lifetime distribution which gives a better fit than the exponential and Lindley distributions, Shanker3 proposed a new lifetime distribution named the Sujatha distribution, defined by its probability density function (pdf) and cumulative distribution function (cdf)
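$$f(x;\theta)=\frac{\theta^{3}}{\theta^{2}+\theta+2}\left(1+x+x^{2}\right)e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.1}$$
$$F(x;\theta)=1-\left[1+\frac{\theta x(\theta x+\theta+2)}{\theta^{2}+\theta+2}\right]e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.2}$$
where $\theta$ is a scale parameter. It has been shown by Shanker3 that the Sujatha distribution is a convex combination of an exponential ($\theta$) distribution, a gamma (2, $\theta$) distribution and a gamma (3, $\theta$) distribution. The first four moments about the origin and the central moments of the Sujatha distribution obtained by Shanker3 are
$$\mu_1'=\frac{\theta^{2}+2\theta+6}{\theta(\theta^{2}+\theta+2)},\quad \mu_2'=\frac{2(\theta^{2}+3\theta+12)}{\theta^{2}(\theta^{2}+\theta+2)},\quad \mu_3'=\frac{6(\theta^{2}+4\theta+20)}{\theta^{3}(\theta^{2}+\theta+2)},\quad \mu_4'=\frac{24(\theta^{2}+5\theta+30)}{\theta^{4}(\theta^{2}+\theta+2)}$$
$$\mu_2=\frac{\theta^{4}+4\theta^{3}+18\theta^{2}+12\theta+12}{\theta^{2}(\theta^{2}+\theta+2)^{2}}$$
$$\mu_3=\frac{2\left(\theta^{6}+6\theta^{5}+36\theta^{4}+44\theta^{3}+54\theta^{2}+36\theta+24\right)}{\theta^{3}(\theta^{2}+\theta+2)^{3}}$$
$$\mu_4=\frac{3\left(3\theta^{8}+24\theta^{7}+172\theta^{6}+376\theta^{5}+736\theta^{4}+864\theta^{3}+912\theta^{2}+480\theta+240\right)}{\theta^{4}(\theta^{2}+\theta+2)^{4}}$$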
Shanker3 has discussed its important properties, including shapes of the density function for varying values of the parameter, hazard rate function, mean residual life function, stochastic ordering, mean deviations, Bonferroni and Lorenz curves, and stress–strength reliability. Shanker3 also discussed the maximum likelihood estimation of the parameter and showed applications of the Sujatha distribution to model lifetime data from biomedical science and engineering. Shanker4 has introduced the Poisson–Sujatha distribution (PSD), a Poisson mixture of the Sujatha distribution, and studied its properties, estimation of the parameter and applications to model count data. Shanker & Hagos5 have discussed the zero–truncated Poisson–Sujatha distribution (ZTPSD) and its applications for modeling count data excluding zero counts. Shanker & Hagos6 have also studied the size–biased Poisson–Sujatha distribution and its applications for count data excluding zero counts.
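The Lindley distribution and the size–biased Lindley distribution (SBLD) having parameter $\theta$ are defined by their pdfs
$$f(x;\theta)=\frac{\theta^{2}}{\theta+1}\,(1+x)\,e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.3}$$
$$f(x;\theta)=\frac{\theta^{3}}{\theta+2}\,x\,(1+x)\,e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.4}$$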
Ghitany et al.7 have discussed various statistical and mathematical properties, estimation of the parameter and an application of the Lindley distribution to model waiting time data in a bank, and it has been shown that the Lindley distribution provides a better fit than the exponential distribution.
In this paper, a two–parameter Sujatha distribution (TPSD), which includes the size–biased Lindley distribution and the Sujatha distribution as particular cases, has been proposed. Its important statistical properties, including coefficient of variation, skewness, kurtosis, index of dispersion, hazard rate function, mean residual life function, stochastic ordering, mean deviations, Bonferroni and Lorenz curves, and stress–strength reliability, have been discussed. The estimation of the parameters has been discussed using maximum likelihood estimation. A numerical example has been given to test the goodness of fit of TPSD over the Lindley and Sujatha distributions.
A two–parameter Sujatha distribution
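A two–parameter Sujatha distribution (TPSD) having parameters $\theta$ and $\alpha$ is defined by its pdf
$$f(x;\theta,\alpha)=\frac{\theta^{3}}{\alpha\theta^{2}+\theta+2}\left(\alpha+x+x^{2}\right)e^{-\theta x};\quad x>0,\ \theta>0,\ \alpha\ge 0 \tag{2.1}$$
where $\theta$ is a scale parameter and $\alpha$ is a shape parameter. It can be easily verified that (2.1) reduces to the Sujatha distribution (1.1) and the SBLD (1.4) for $\alpha=1$ and $\alpha=0$ respectively.
Like the Sujatha distribution (1.1), TPSD (2.1) is also a convex combination of exponential ($\theta$), gamma (2, $\theta$) and gamma (3, $\theta$) distributions. We have
$$f(x;\theta,\alpha)=p_1\,g_1(x)+p_2\,g_2(x)+p_3\,g_3(x) \tag{2.2}$$
where
$$p_1=\frac{\alpha\theta^{2}}{\alpha\theta^{2}+\theta+2},\quad p_2=\frac{\theta}{\alpha\theta^{2}+\theta+2},\quad p_3=\frac{2}{\alpha\theta^{2}+\theta+2},$$
$$g_1(x)=\theta e^{-\theta x},\quad g_2(x)=\theta^{2}x\,e^{-\theta x},\quad g_3(x)=\frac{\theta^{3}x^{2}e^{-\theta x}}{2}.$$
The corresponding cdf of TPSD (2.1) can be obtained as
$$F(x;\theta,\alpha)=1-\left[1+\frac{\theta x(\theta x+\theta+2)}{\alpha\theta^{2}+\theta+2}\right]e^{-\theta x};\quad x>0,\ \theta>0,\ \alpha\ge 0 \tag{2.3}$$
The behavior of the pdf and the cdf of TPSD for varying values of the parameters $\theta$ and $\alpha$ is shown in Figures 1 & 2 respectively.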
Figure 1 Behavior of the pdf of TPSD for varying values of parameter θ and α.
Figure 2 Behavior of the cdf of TPSD for varying values of parameter θ and α.
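As a quick numerical illustration, the sketch below evaluates the pdf, the cdf and the three-component mixture form. It assumes the pdf $\theta^{3}(\alpha+x+x^{2})e^{-\theta x}/(\alpha\theta^{2}+\theta+2)$, which is the form consistent with the stated special cases (Sujatha at $\alpha=1$, SBLD at $\alpha=0$); the function names are illustrative and not taken from the paper.

```python
import math

def tpsd_pdf(x, theta, alpha):
    """Assumed TPSD density: theta^3 (alpha + x + x^2) e^{-theta x} / (alpha theta^2 + theta + 2)."""
    d = alpha * theta**2 + theta + 2.0
    return theta**3 * (alpha + x + x * x) * math.exp(-theta * x) / d

def tpsd_cdf(x, theta, alpha):
    """Assumed TPSD cdf, obtained by integrating the density above from 0 to x."""
    d = alpha * theta**2 + theta + 2.0
    return 1.0 - (1.0 + theta * x * (theta * x + theta + 2.0) / d) * math.exp(-theta * x)

def tpsd_mixture_pdf(x, theta, alpha):
    """Same density written as a mixture of exponential, gamma(2) and gamma(3) components."""
    d = alpha * theta**2 + theta + 2.0
    p1, p2, p3 = alpha * theta**2 / d, theta / d, 2.0 / d   # mixing weights, sum to 1
    g1 = theta * math.exp(-theta * x)
    g2 = theta**2 * x * math.exp(-theta * x)
    g3 = theta**3 * x * x * math.exp(-theta * x) / 2.0
    return p1 * g1 + p2 * g2 + p3 * g3
```

Comparing `tpsd_pdf` with `tpsd_mixture_pdf` at a few points, and differentiating `tpsd_cdf` numerically, is a simple way to check that the three expressions agree.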
Moments and related measures
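The moment generating function of TPSD (2.1) can be obtained as
$$M_X(t)=\frac{\theta^{3}}{\alpha\theta^{2}+\theta+2}\left[\frac{\alpha}{\theta-t}+\frac{1}{(\theta-t)^{2}}+\frac{2}{(\theta-t)^{3}}\right]=\sum_{k=0}^{\infty}\frac{\alpha\theta^{2}+(k+1)\theta+(k+1)(k+2)}{\alpha\theta^{2}+\theta+2}\left(\frac{t}{\theta}\right)^{k},\quad t<\theta$$
Thus, the $r$th moment about the origin of TPSD (2.1), obtained as the coefficient of $t^{r}/r!$ in $M_X(t)$, is given by
$$\mu_r'=\frac{r!\left\{\alpha\theta^{2}+(r+1)\theta+(r+1)(r+2)\right\}}{\theta^{r}(\alpha\theta^{2}+\theta+2)};\quad r=1,2,3,\ldots \tag{3.1}$$
The first four moments about the origin of TPSD are obtained as
$$\mu_1'=\frac{\alpha\theta^{2}+2\theta+6}{\theta(\alpha\theta^{2}+\theta+2)},\quad \mu_2'=\frac{2(\alpha\theta^{2}+3\theta+12)}{\theta^{2}(\alpha\theta^{2}+\theta+2)},\quad \mu_3'=\frac{6(\alpha\theta^{2}+4\theta+20)}{\theta^{3}(\alpha\theta^{2}+\theta+2)},\quad \mu_4'=\frac{24(\alpha\theta^{2}+5\theta+30)}{\theta^{4}(\alpha\theta^{2}+\theta+2)}$$
Using the relationship between moments about the mean and moments about the origin, the moments about the mean of TPSD are obtained as
$$\mu_2=\frac{\alpha^{2}\theta^{4}+4\alpha\theta^{3}+16\alpha\theta^{2}+2\theta^{2}+12\theta+12}{\theta^{2}(\alpha\theta^{2}+\theta+2)^{2}}$$
$$\mu_3=\frac{2\left\{\alpha^{3}\theta^{6}+6\alpha^{2}\theta^{5}+30\alpha^{2}\theta^{4}+6\alpha\theta^{4}+42\alpha\theta^{3}+36\alpha\theta^{2}+2\theta^{3}+18\theta^{2}+36\theta+24\right\}}{\theta^{3}(\alpha\theta^{2}+\theta+2)^{3}}$$
$$\mu_4=\frac{3\left\{3\alpha^{4}\theta^{8}+24\alpha^{3}\theta^{7}+128\alpha^{3}\theta^{6}+44\alpha^{2}\theta^{6}+344\alpha^{2}\theta^{5}+408\alpha^{2}\theta^{4}+32\alpha\theta^{5}+320\alpha\theta^{4}+768\alpha\theta^{3}+576\alpha\theta^{2}+8\theta^{4}+96\theta^{3}+336\theta^{2}+480\theta+240\right\}}{\theta^{4}(\alpha\theta^{2}+\theta+2)^{4}}$$
The coefficient of variation (C.V.), coefficient of skewness $\sqrt{\beta_1}$, coefficient of kurtosis $\beta_2$ and index of dispersion $\gamma$ of TPSD are given by
$$\text{C.V.}=\frac{\sigma}{\mu_1'}=\frac{\sqrt{\alpha^{2}\theta^{4}+4\alpha\theta^{3}+16\alpha\theta^{2}+2\theta^{2}+12\theta+12}}{\alpha\theta^{2}+2\theta+6},\quad \sqrt{\beta_1}=\frac{\mu_3}{\mu_2^{3/2}},\quad \beta_2=\frac{\mu_4}{\mu_2^{2}},$$
$$\gamma=\frac{\sigma^{2}}{\mu_1'}=\frac{\alpha^{2}\theta^{4}+4\alpha\theta^{3}+16\alpha\theta^{2}+2\theta^{2}+12\theta+12}{\theta(\alpha\theta^{2}+\theta+2)(\alpha\theta^{2}+2\theta+6)}$$
It can be easily verified that these statistical constants of TPSD reduce to the corresponding statistical constants of the Sujatha distribution and the SBLD at $\alpha=1$ and $\alpha=0$ respectively. To study the behavior of C.V., $\sqrt{\beta_1}$, $\beta_2$ and $\gamma$, their values for varying values of the parameters $\theta$ and $\alpha$ have been computed and presented in Tables 1–4.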
α \ θ | 0.2 | 0.5 | 1 | 2 | 3 | 4 | 5
0.2 | 0.59658 | 0.624798 | 0.668399 | 0.739814 | 0.792609 | 0.8317 | 0.861102
0.5 | 0.599565 | 0.639569 | 0.708329 | 0.816497 | 0.882958 | 0.922627 | 0.946881
1 | 0.604466 | 0.662392 | 0.761739 | 0.892143 | 0.95119 | 0.977525 | 0.989835
2 | 0.614004 | 0.702377 | 0.83666 | 0.96225 | 0.996661 | 1.005655 | 1.007547
3 | 0.623205 | 0.736304 | 0.886072 | 0.991701 | 1.009814 | 1.011382 | 1.009973
4 | 0.632091 | 0.765466 | 0.920447 | 1.0059 | 1.014222 | 1.012415 | 1.009836
5 | 0.640678 | 0.790787 | 0.945247 | 1.013246 | 1.015576 | 1.012149 | 1.009163
Table 1 C.V. of TPSD for varying values of parameters θ (columns) and α (rows)
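In Table 1, the C.V. increases with θ for each α ≤ 2 and increases with α for each θ ≤ 3. For larger values of the parameters it levels off slightly above 1 and then decreases marginally; for example, along α = 5 it falls from 1.015576 at θ = 3 to 1.009163 at θ = 5.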
α \ θ | 0.2 | 0.5 | 1 | 2 | 3 | 4 | 5
0.2 | 1.156092 | 1.164414 | 1.193838 | 1.288579 | 1.40832 | 1.544566 | 1.694179
0.5 | 1.151692 | 1.153618 | 1.202728 | 1.394848 | 1.600302 | 1.785072 | 1.947347
1 | 1.145006 | 1.145839 | 1.247611 | 1.535588 | 1.733747 | 1.848046 | 1.912879
2 | 1.133828 | 1.153526 | 1.352316 | 1.647373 | 1.698737 | 1.653874 | 1.586127
3 | 1.125191 | 1.176753 | 1.43637 | 1.643895 | 1.562899 | 1.429794 | 1.310578
4 | 1.118703 | 1.206238 | 1.496066 | 1.59473 | 1.421347 | 1.244821 | 1.108
5 | 1.114041 | 1.237609 | 1.535958 | 1.528385 | 1.293862 | 1.097469 | 0.956984
Table 2 Coefficient of skewness of TPSD for varying values of parameters θ (columns) and α (rows)
Since $\sqrt{\beta_1}>0$, TPSD is always positively skewed, which means that TPSD is a suitable model for positively skewed lifetime data.
α \ θ | 0.2 | 0.5 | 1 | 2 | 3 | 4 | 5
0.2 | 5.003116 | 5.022048 | 5.093943 | 5.346882 | 5.661781 | 5.979645 | 6.275987
0.5 | 4.991667 | 4.984856 | 5.082378 | 5.625 | 6.28542 | 6.865586 | 7.326691
1 | 4.973635 | 4.944566 | 5.170213 | 6.21499 | 7.193906 | 7.868405 | 8.297711
2 | 4.94128 | 4.924032 | 5.510204 | 7.2144 | 8.270528 | 8.774988 | 9.001011
3 | 4.913483 | 4.956867 | 5.903269 | 7.900925 | 8.799663 | 9.113171 | 9.206366
4 | 4.889821 | 5.022933 | 6.283795 | 8.3676 | 9.077558 | 9.253624 | 9.271966
5 | 4.869916 | 5.109996 | 6.633262 | 8.690336 | 9.230612 | 9.313262 | 9.289003
Table 3 Coefficient of kurtosis of TPSD for varying values of parameters θ (columns) and α (rows)
Since $\beta_2>3$, TPSD is always leptokurtic, which means that TPSD is more peaked than the normal curve. Thus TPSD is suitable for lifetime data which are leptokurtic.
α \ θ | 0.2 | 0.5 | 1 | 2 | 3 | 4 | 5
0.2 | 5.164536 | 2.158531 | 1.144817 | 0.615741 | 0.424979 | 0.323306 | 0.259524
0.5 | 5.197861 | 2.220551 | 1.218487 | 0.666667 | 0.451356 | 0.334416 | 0.262078
1 | 5.252329 | 2.31348 | 1.305556 | 0.696429 | 0.452381 | 0.325758 | 0.251067
2 | 5.357375 | 2.466667 | 1.4 | 0.694444 | 0.431884 | 0.306064 | 0.235088
3 | 5.457478 | 2.585608 | 1.439394 | 0.676136 | 0.414263 | 0.293608 | 0.2264
4 | 5.552914 | 2.678571 | 1.452381 | 0.657692 | 0.401423 | 0.285531 | 0.221109
5 | 5.643939 | 2.751515 | 1.451923 | 0.641667 | 0.39193 | 0.279936 | 0.217569
Table 4 Index of dispersion of TPSD for varying values of parameters θ (columns) and α (rows)
For the parameter values in Table 4, TPSD is over-dispersed ($\gamma>1$) when θ ≤ 1 and under-dispersed ($\gamma<1$) when θ ≥ 2, for every tabulated value of α.
The behavior of C.V., $\sqrt{\beta_1}$, $\beta_2$ and $\gamma$ for selected values of the parameters θ and α is shown in Figure 3.
Figure 3 Behavior of C.V., $\sqrt{\beta_1}$, $\beta_2$ and $\gamma$ for varying values of the parameters θ and α.
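The entries of Tables 1–4 can be recomputed from the raw moments. The sketch below assumes the raw-moment formula $\mu_r'=r!\{\alpha\theta^{2}+(r+1)\theta+(r+1)(r+2)\}/\{\theta^{r}(\alpha\theta^{2}+\theta+2)\}$ implied by the gamma-mixture form of the TPSD, and the usual definitions $\sqrt{\beta_1}=\mu_3/\mu_2^{3/2}$ and $\beta_2=\mu_4/\mu_2^{2}$; the function names are illustrative.

```python
import math

def tpsd_raw_moment(r, theta, alpha):
    """r-th moment about the origin under the assumed TPSD form."""
    d = alpha * theta**2 + theta + 2.0
    num = alpha * theta**2 + (r + 1) * theta + (r + 1) * (r + 2)
    return math.factorial(r) * num / (theta**r * d)

def tpsd_measures(theta, alpha):
    """C.V., skewness, kurtosis and index of dispersion from the first four raw moments."""
    m1, m2, m3, m4 = (tpsd_raw_moment(r, theta, alpha) for r in range(1, 5))
    var = m2 - m1**2                                   # second central moment
    mu3 = m3 - 3.0 * m2 * m1 + 2.0 * m1**3             # third central moment
    mu4 = m4 - 4.0 * m3 * m1 + 6.0 * m2 * m1**2 - 3.0 * m1**4
    return {
        "cv": math.sqrt(var) / m1,
        "skewness": mu3 / var**1.5,
        "kurtosis": mu4 / var**2,
        "dispersion": var / m1,
    }
```

For example, `tpsd_measures(1.0, 1.0)` corresponds to the α = 1, θ = 1 cells of Tables 1–4.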
Statistical properties
In this section, statistical properties of TPSD including hazard rate function, mean residual life function, stochastic ordering, mean deviation, Bonferroni and Lorenz curves and stress–strength reliability have been discussed.
Hazard rate function and mean residual life function
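Let $X$ be a continuous random variable with pdf $f(x)$ and cdf $F(x)$. The hazard rate function (also known as the failure rate function), $h(x)$, and the mean residual life function, $m(x)$, of $X$ are respectively defined as
$$h(x)=\lim_{\Delta x\to 0}\frac{P(X<x+\Delta x\mid X>x)}{\Delta x}=\frac{f(x)}{1-F(x)}$$
and
$$m(x)=E\left[X-x\mid X>x\right]=\frac{1}{1-F(x)}\int_{x}^{\infty}\left[1-F(t)\right]dt$$
The corresponding hazard rate function, $h(x)$, and the mean residual life function, $m(x)$, of TPSD (2.1) are thus obtained as
$$h(x)=\frac{\theta^{3}\left(\alpha+x+x^{2}\right)}{\theta^{2}\left(x^{2}+x+\alpha\right)+\theta(2x+1)+2}$$
and
$$m(x)=\frac{\theta^{2}\left(x^{2}+x+\alpha\right)+2\theta(2x+1)+6}{\theta\left[\theta^{2}\left(x^{2}+x+\alpha\right)+\theta(2x+1)+2\right]}$$
It can be easily verified that $h(0)=\dfrac{\alpha\theta^{3}}{\alpha\theta^{2}+\theta+2}=f(0)$ and $m(0)=\dfrac{\alpha\theta^{2}+2\theta+6}{\theta(\alpha\theta^{2}+\theta+2)}=\mu_1'$.
It can also be easily verified that the expressions for $h(x)$ and $m(x)$ of TPSD reduce to the corresponding $h(x)$ and $m(x)$ of the Sujatha distribution at $\alpha=1$.
The behavior of $h(x)$ and $m(x)$ of TPSD (2.1) for different values of its parameters is shown in Figures 4 & 5 respectively.
Figure 4 Behavior of $h(x)$ of TPSD for selected values of parameters θ and α.
Figure 5 Behavior of $m(x)$ of TPSD for selected values of parameters θ and α.
For the selected parameter values, it is clearly seen from the graphs of $h(x)$ and $m(x)$ that $h(x)$ is a monotonically increasing function of $x$, whereas $m(x)$ is a monotonically decreasing function of $x$.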
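The boundary values and the monotone behavior described above can be checked numerically. This is a minimal sketch assuming the closed forms $h(x)=\theta^{3}q/\{\theta^{2}q+\theta(2x+1)+2\}$ and $m(x)=\{\theta^{2}q+2\theta(2x+1)+6\}/[\theta\{\theta^{2}q+\theta(2x+1)+2\}]$ with $q=\alpha+x+x^{2}$, which follow from the assumed TPSD pdf $\theta^{3}(\alpha+x+x^{2})e^{-\theta x}/(\alpha\theta^{2}+\theta+2)$; the function names are illustrative.

```python
def tpsd_hazard(x, theta, alpha):
    """Hazard rate f(x) / (1 - F(x)) under the assumed TPSD form."""
    q = alpha + x + x * x
    return theta**3 * q / (theta**2 * q + theta * (2.0 * x + 1.0) + 2.0)

def tpsd_mrl(x, theta, alpha):
    """Mean residual life E[X - x | X > x] under the assumed TPSD form."""
    q = alpha + x + x * x
    num = theta**2 * q + 2.0 * theta * (2.0 * x + 1.0) + 6.0
    den = theta * (theta**2 * q + theta * (2.0 * x + 1.0) + 2.0)
    return num / den

def is_monotone(values, increasing=True):
    """True if the sequence is non-decreasing (or non-increasing)."""
    pairs = list(zip(values, values[1:]))
    return all(b >= a for a, b in pairs) if increasing else all(b <= a for a, b in pairs)
```

Evaluating `tpsd_hazard` and `tpsd_mrl` on a grid and passing the results to `is_monotone` gives a quick check for a given (θ, α); monotonicity need not hold for every parameter pair.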
Stochastic ordering
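Stochastic ordering of positive continuous random variables is an important tool for judging the comparative behavior of continuous distributions. A random variable $X$ is said to be smaller than a random variable $Y$ in the
- Stochastic order $(X\le_{st}Y)$ if $F_X(x)\ge F_Y(x)$ for all $x$
- Hazard rate order $(X\le_{hr}Y)$ if $h_X(x)\ge h_Y(x)$ for all $x$
- Mean residual life order $(X\le_{mrl}Y)$ if $m_X(x)\le m_Y(x)$ for all $x$
- Likelihood ratio order $(X\le_{lr}Y)$ if $f_X(x)/f_Y(x)$ decreases in $x$
The following results due to Shaked & Shanthikumar8 are well known for establishing stochastic ordering of distributions:
$$X\le_{lr}Y\ \Rightarrow\ X\le_{hr}Y\ \Rightarrow\ X\le_{mrl}Y\quad\text{and}\quad X\le_{hr}Y\ \Rightarrow\ X\le_{st}Y$$
The TPSD (2.1) is ordered with respect to the strongest “likelihood ratio” ordering as shown in the following theorem:
Theorem: Let $X\sim$ TPSD $(\theta_1,\alpha_1)$ and $Y\sim$ TPSD $(\theta_2,\alpha_2)$. If $\alpha_1=\alpha_2$ and $\theta_1>\theta_2$ (or $\theta_1=\theta_2$ and $\alpha_1>\alpha_2$), then $X\le_{lr}Y$ and hence $X\le_{hr}Y$, $X\le_{mrl}Y$ and $X\le_{st}Y$.
Proof: We have
$$\frac{f_X(x;\theta_1,\alpha_1)}{f_Y(x;\theta_2,\alpha_2)}=\frac{\theta_1^{3}\left(\alpha_2\theta_2^{2}+\theta_2+2\right)}{\theta_2^{3}\left(\alpha_1\theta_1^{2}+\theta_1+2\right)}\left(\frac{\alpha_1+x+x^{2}}{\alpha_2+x+x^{2}}\right)e^{-(\theta_1-\theta_2)x};\quad x>0$$
This gives
$$\frac{d}{dx}\ln\frac{f_X(x;\theta_1,\alpha_1)}{f_Y(x;\theta_2,\alpha_2)}=\frac{(\alpha_2-\alpha_1)(1+2x)}{\left(\alpha_1+x+x^{2}\right)\left(\alpha_2+x+x^{2}\right)}-(\theta_1-\theta_2)$$
Thus, for $\alpha_1=\alpha_2$ and $\theta_1>\theta_2$ (or $\theta_1=\theta_2$ and $\alpha_1>\alpha_2$), $\dfrac{d}{dx}\ln\dfrac{f_X(x;\theta_1,\alpha_1)}{f_Y(x;\theta_2,\alpha_2)}<0$.
This means that $X\le_{lr}Y$ and hence $X\le_{hr}Y$, $X\le_{mrl}Y$ and $X\le_{st}Y$. This shows the flexibility of TPSD over the Sujatha distribution.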
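The likelihood ratio condition can be illustrated numerically by checking that the log ratio of two TPSD densities is decreasing on a grid. This is only a sketch on a finite grid, not a proof; it assumes the TPSD pdf $\theta^{3}(\alpha+x+x^{2})e^{-\theta x}/(\alpha\theta^{2}+\theta+2)$, and the helper names are illustrative.

```python
import math

def tpsd_log_pdf(x, theta, alpha):
    """Log of the assumed TPSD density (requires alpha + x + x^2 > 0)."""
    d = alpha * theta**2 + theta + 2.0
    return 3.0 * math.log(theta) - math.log(d) + math.log(alpha + x + x * x) - theta * x

def lr_is_decreasing(theta1, alpha1, theta2, alpha2, xs):
    """True if ln f_X(x) - ln f_Y(x) is non-increasing along the sorted grid xs."""
    log_ratio = [tpsd_log_pdf(x, theta1, alpha1) - tpsd_log_pdf(x, theta2, alpha2) for x in xs]
    return all(b <= a + 1e-12 for a, b in zip(log_ratio, log_ratio[1:]))
```

With a grid such as `[0.01 * i for i in range(1, 2001)]`, the check passes for θ1 > θ2 with equal α and fails when the roles of θ1 and θ2 are swapped.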
Mean deviations
The amount of scatter in a population is evidently measured to some extent by the totality of deviations from the mean and the median. These are known as the mean deviation about the mean and the mean deviation about the median and are defined as
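$$\delta_1(X)=\int_{0}^{\infty}|x-\mu|\,f(x)\,dx\quad\text{and}\quad\delta_2(X)=\int_{0}^{\infty}|x-M|\,f(x)\,dx\quad\text{respectively,}$$
where $\mu=E(X)$ and $M=\text{Median}(X)$. The measures $\delta_1(X)$ and $\delta_2(X)$ can be calculated using the following relationships
$$\delta_1(X)=\int_{0}^{\mu}(\mu-x)f(x)\,dx+\int_{\mu}^{\infty}(x-\mu)f(x)\,dx=2\mu F(\mu)-2\int_{0}^{\mu}x f(x)\,dx \tag{4.3.1}$$
and
$$\delta_2(X)=\int_{0}^{M}(M-x)f(x)\,dx+\int_{M}^{\infty}(x-M)f(x)\,dx=\mu-2\int_{0}^{M}x f(x)\,dx \tag{4.3.2}$$
Using the pdf (2.1) and the expression for the mean of TPSD, we get
$$\int_{0}^{\mu}x f(x;\theta,\alpha)\,dx=\mu-\frac{\left\{\theta^{3}(\mu^{3}+\mu^{2}+\alpha\mu)+\theta^{2}(3\mu^{2}+2\mu+\alpha)+2\theta(3\mu+1)+6\right\}e^{-\theta\mu}}{\theta(\alpha\theta^{2}+\theta+2)} \tag{4.3.3}$$
$$\int_{0}^{M}x f(x;\theta,\alpha)\,dx=\mu-\frac{\left\{\theta^{3}(M^{3}+M^{2}+\alpha M)+\theta^{2}(3M^{2}+2M+\alpha)+2\theta(3M+1)+6\right\}e^{-\theta M}}{\theta(\alpha\theta^{2}+\theta+2)} \tag{4.3.4}$$
Using expressions (4.3.1), (4.3.2), (4.3.3) and (4.3.4), and after some tedious algebraic simplifications, the mean deviation about the mean, $\delta_1(X)$, and the mean deviation about the median, $\delta_2(X)$, of TPSD are obtained as
$$\delta_1(X)=\frac{2\left\{\theta^{2}(\mu^{2}+\mu+\alpha)+2\theta(2\mu+1)+6\right\}e^{-\theta\mu}}{\theta(\alpha\theta^{2}+\theta+2)} \tag{4.3.5}$$
and
$$\delta_2(X)=\frac{2\left\{\theta^{3}(M^{3}+M^{2}+\alpha M)+\theta^{2}(3M^{2}+2M+\alpha)+2\theta(3M+1)+6\right\}e^{-\theta M}}{\theta(\alpha\theta^{2}+\theta+2)}-\mu \tag{4.3.6}$$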
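A closed form for the mean deviation about the mean can be cross-checked against direct numerical integration of $|x-\mu|f(x)$. The sketch below assumes the TPSD pdf $\theta^{3}(\alpha+x+x^{2})e^{-\theta x}/(\alpha\theta^{2}+\theta+2)$ and the closed form $\delta_1=2\{\theta^{2}(\mu^{2}+\mu+\alpha)+2\theta(2\mu+1)+6\}e^{-\theta\mu}/\{\theta(\alpha\theta^{2}+\theta+2)\}$ derived from it; the truncation point `60 / theta` and the step count are arbitrary choices made for illustration.

```python
import math

def tpsd_pdf(x, theta, alpha):
    d = alpha * theta**2 + theta + 2.0
    return theta**3 * (alpha + x + x * x) * math.exp(-theta * x) / d

def tpsd_mean(theta, alpha):
    d = alpha * theta**2 + theta + 2.0
    return (alpha * theta**2 + 2.0 * theta + 6.0) / (theta * d)

def mean_deviation_closed(theta, alpha):
    """Assumed closed form for the mean deviation about the mean."""
    mu = tpsd_mean(theta, alpha)
    d = alpha * theta**2 + theta + 2.0
    bracket = theta**2 * (mu * mu + mu + alpha) + 2.0 * theta * (2.0 * mu + 1.0) + 6.0
    return 2.0 * bracket * math.exp(-theta * mu) / (theta * d)

def mean_deviation_numeric(theta, alpha, steps=20000):
    """Trapezoidal approximation of the integral of |x - mu| f(x) over [0, 60/theta]."""
    mu = tpsd_mean(theta, alpha)
    h = (60.0 / theta) / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 0.5 if i in (0, steps) else 1.0   # end points get half weight
        total += weight * abs(x - mu) * tpsd_pdf(x, theta, alpha)
    return total * h
```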
Bonferroni and Lorenz curves and indices
The Bonferroni and Lorenz curves and Bonferroni9 and Gini indices have applications not only in economics to study income and poverty, but also in other fields like reliability, demography and medical science. The Bonferroni and Lorenz curves are defined as
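$$B(p)=\frac{1}{p\mu}\int_{0}^{q}x f(x)\,dx=\frac{1}{p\mu}\left[\int_{0}^{\infty}x f(x)\,dx-\int_{q}^{\infty}x f(x)\,dx\right]=\frac{1}{p\mu}\left[\mu-\int_{q}^{\infty}x f(x)\,dx\right] \tag{4.4.1}$$
and
$$L(p)=\frac{1}{\mu}\int_{0}^{q}x f(x)\,dx=\frac{1}{\mu}\left[\mu-\int_{q}^{\infty}x f(x)\,dx\right] \tag{4.4.2}$$
respectively, or equivalently
$$B(p)=\frac{1}{p\mu}\int_{0}^{p}F^{-1}(x)\,dx \tag{4.4.3}$$
and
$$L(p)=\frac{1}{\mu}\int_{0}^{p}F^{-1}(x)\,dx \tag{4.4.4}$$
respectively, where $\mu=E(X)$ and $q=F^{-1}(p)$.
The Bonferroni and Gini indices are thus defined as
$$B=1-\int_{0}^{1}B(p)\,dp \tag{4.4.5}$$
and
$$G=1-2\int_{0}^{1}L(p)\,dp \tag{4.4.6}$$
respectively.
Using the pdf of TPSD (2.1), we get
$$\int_{q}^{\infty}x f(x;\theta,\alpha)\,dx=\frac{\left\{\theta^{3}(q^{3}+q^{2}+\alpha q)+\theta^{2}(3q^{2}+2q+\alpha)+2\theta(3q+1)+6\right\}e^{-\theta q}}{\theta(\alpha\theta^{2}+\theta+2)} \tag{4.4.7}$$
Now using equation (4.4.7) in (4.4.1) and (4.4.2), we get
$$B(p)=\frac{1}{p}\left[1-\frac{\left\{\theta^{3}(q^{3}+q^{2}+\alpha q)+\theta^{2}(3q^{2}+2q+\alpha)+2\theta(3q+1)+6\right\}e^{-\theta q}}{\alpha\theta^{2}+2\theta+6}\right] \tag{4.4.8}$$
and
$$L(p)=1-\frac{\left\{\theta^{3}(q^{3}+q^{2}+\alpha q)+\theta^{2}(3q^{2}+2q+\alpha)+2\theta(3q+1)+6\right\}e^{-\theta q}}{\alpha\theta^{2}+2\theta+6} \tag{4.4.9}$$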
Now using the equations (4.4.8) and (4.4.9) in (4.4.5) and (4.4.6), the Bonferroni and Gini indices of TPSD (2.1) are obtained as
(4.4.10)
(4.4.11)
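As a numerical companion to the definitions above, the following sketch evaluates the Lorenz and Bonferroni curves through the quantile $q=F^{-1}(p)$ and approximates the Gini index $G=1-2\int_0^1 L(p)\,dp$ by the trapezoidal rule. It assumes the TPSD cdf $1-[1+\theta x(\theta x+\theta+2)/(\alpha\theta^{2}+\theta+2)]e^{-\theta x}$ and the partial-mean expression derived from the assumed pdf; the bisection quantile routine and all function names are illustrative choices, and the numerical Gini value is not claimed to reproduce the paper's closed-form index.

```python
import math

def tpsd_cdf(x, theta, alpha):
    d = alpha * theta**2 + theta + 2.0
    return 1.0 - (1.0 + theta * x * (theta * x + theta + 2.0) / d) * math.exp(-theta * x)

def tpsd_quantile(p, theta, alpha):
    """Solve F(q) = p for 0 < p < 1 by bisection."""
    lo, hi = 0.0, 1.0
    while tpsd_cdf(hi, theta, alpha) < p:      # expand the bracket until it contains q
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if tpsd_cdf(mid, theta, alpha) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lorenz(p, theta, alpha):
    """L(p) = 1 - (integral of x f(x) over [q, inf)) / mean, with q = F^{-1}(p)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    q = tpsd_quantile(p, theta, alpha)
    k = (theta**3 * (q**3 + q * q + alpha * q)
         + theta**2 * (3.0 * q * q + 2.0 * q + alpha)
         + 2.0 * theta * (3.0 * q + 1.0) + 6.0)
    return 1.0 - k * math.exp(-theta * q) / (alpha * theta**2 + 2.0 * theta + 6.0)

def bonferroni(p, theta, alpha):
    """B(p) = L(p) / p for 0 < p <= 1."""
    return lorenz(p, theta, alpha) / p

def gini_numeric(theta, alpha, steps=2000):
    """G = 1 - 2 * integral of L(p) over [0, 1], by the trapezoidal rule."""
    h = 1.0 / steps
    area = sum((0.5 if i in (0, steps) else 1.0) * lorenz(i * h, theta, alpha)
               for i in range(steps + 1)) * h
    return 1.0 - 2.0 * area
```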
Stress–strength reliability
The stress–strength reliability of a component describes the life of a component which has random strength $X$ and is subjected to random stress $Y$. When the stress $Y$ applied to the component exceeds its strength $X$, the component fails instantly, and the component functions satisfactorily as long as $X>Y$. Therefore, $R=P(Y<X)$ is a measure of the component reliability and is known as stress–strength reliability in the statistical literature. It has extensive applications in almost all areas of knowledge, especially in engineering, such as structures, deterioration of rocket motors, static fatigue of ceramic components, aging of concrete pressure vessels, etc.
Let $X$ and $Y$ be independent strength and stress random variables having TPSD (2.1) with parameters $(\theta_1,\alpha_1)$ and $(\theta_2,\alpha_2)$ respectively. Then the stress–strength reliability $R$ of TPSD can be obtained as
$$R=P(Y<X)=\int_{0}^{\infty}P(Y<X\mid X=x)\,f_X(x)\,dx=\int_{0}^{\infty}f(x;\theta_1,\alpha_1)\,F(x;\theta_2,\alpha_2)\,dx,$$
which can be evaluated in closed form by integrating term by term, since the integrand is a sum of polynomials in $x$ multiplied by exponentials.
It can be verified that the stress–strength reliability of the Sujatha distribution is a particular case of the stress–strength reliability of TPSD at $\alpha_1=\alpha_2=1$.
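The reliability $R=\int_0^\infty f(x;\theta_1,\alpha_1)F(x;\theta_2,\alpha_2)\,dx$ can also be approximated numerically, which is a convenient check on any closed-form expression. The sketch below assumes the TPSD pdf and cdf forms used throughout ($\theta^{3}(\alpha+x+x^{2})e^{-\theta x}/(\alpha\theta^{2}+\theta+2)$ and its integral); the truncation point and step count are illustrative choices.

```python
import math

def tpsd_pdf(x, theta, alpha):
    d = alpha * theta**2 + theta + 2.0
    return theta**3 * (alpha + x + x * x) * math.exp(-theta * x) / d

def tpsd_cdf(x, theta, alpha):
    d = alpha * theta**2 + theta + 2.0
    return 1.0 - (1.0 + theta * x * (theta * x + theta + 2.0) / d) * math.exp(-theta * x)

def stress_strength(theta1, alpha1, theta2, alpha2, steps=20000):
    """Trapezoidal approximation of P(Y < X), X ~ TPSD(theta1, alpha1), Y ~ TPSD(theta2, alpha2)."""
    upper = 60.0 / theta1          # the integrand decays like exp(-theta1 * x)
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * tpsd_pdf(x, theta1, alpha1) * tpsd_cdf(x, theta2, alpha2)
    return total * h
```

When strength and stress share the same parameters the result should be close to 0.5, and swapping the two parameter pairs should give the complementary probability.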
Estimation of parameters
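In this section, the estimation of the parameters of TPSD using the method of moments and the method of maximum likelihood has been discussed.
Method of moment estimates (MOME)
Since TPSD (2.1) has two parameters to be estimated, the first two moments about the origin are required to estimate its parameters using the method of moments. Equating the population mean to the sample mean $\bar{x}$, we have
$$\bar{x}=\frac{\alpha\theta^{2}+2\theta+6}{\theta(\alpha\theta^{2}+\theta+2)} \tag{5.1.1}$$
Again equating the second population moment with the corresponding sample moment $m_2'$, we have
$$m_2'=\frac{2(\alpha\theta^{2}+3\theta+12)}{\theta^{2}(\alpha\theta^{2}+\theta+2)} \tag{5.1.2}$$
Eliminating $\alpha\theta^{2}+\theta+2$ between (5.1.1) and (5.1.2) gives the following cubic equation in $\theta$
$$m_2'\,\theta^{3}+4\left(m_2'-\bar{x}\right)\theta^{2}+2\left(1-10\bar{x}\right)\theta+12=0 \tag{5.1.3}$$
Solving equation (5.1.3) using any iterative method such as the Newton–Raphson method, the Regula–Falsi method or the Bisection method, the method of moment estimate (MOME) $\tilde{\theta}$ of $\theta$ can be obtained, and substituting the value of $\tilde{\theta}$ in equation (5.1.1), the MOME $\tilde{\alpha}$ of $\alpha$ can be obtained as
$$\tilde{\alpha}=\frac{6+2(1-\bar{x})\tilde{\theta}-\bar{x}\,\tilde{\theta}^{2}}{\tilde{\theta}^{2}\left(\bar{x}\,\tilde{\theta}-1\right)} \tag{5.1.4}$$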
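The moment-matching steps can be sketched with a bisection solver. The code assumes the cubic $m_2'\theta^{3}+4(m_2'-\bar{x})\theta^{2}+2(1-10\bar{x})\theta+12=0$ obtained by eliminating $\alpha\theta^{2}+\theta+2$ between the two moment equations of the assumed TPSD form. The search bracket $(1/\bar{x},\,3/\bar{x})$ is an assumption of this sketch: it follows from requiring $\alpha\theta^{2}+\theta+2>0$ and $\alpha\ge 0$ in the mean equation. The function name is illustrative.

```python
def tpsd_mome(xbar, m2):
    """Method-of-moments estimates (theta, alpha) from the sample mean and second raw moment."""
    def cubic(t):
        return m2 * t**3 + 4.0 * (m2 - xbar) * t**2 + 2.0 * (1.0 - 10.0 * xbar) * t + 12.0

    lo, hi = (1.0 / xbar) * (1.0 + 1e-9), 3.0 / xbar
    if cubic(lo) * cubic(hi) > 0.0:
        raise ValueError("no sign change in the bracket; moment estimates may not exist")
    for _ in range(200):                      # plain bisection on the cubic
        mid = 0.5 * (lo + hi)
        if cubic(lo) * cubic(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    theta = 0.5 * (lo + hi)
    # alpha follows from the mean equation once theta is known
    alpha = (6.0 + 2.0 * (1.0 - xbar) * theta - xbar * theta**2) / (theta**2 * (xbar * theta - 1.0))
    return theta, alpha
```

A simple round-trip check is to feed in the population moments for known parameters, for example $\bar{x}=1$ and $m_2'=5/3$ for θ = 2, α = 0.5, and confirm that the same values are recovered.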
Maximum likelihood estimates (MLE)
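Let $(x_1,x_2,\ldots,x_n)$ be a random sample from TPSD (2.1). The likelihood function $L$ is given by
$$L=\left(\frac{\theta^{3}}{\alpha\theta^{2}+\theta+2}\right)^{n}\prod_{i=1}^{n}\left(\alpha+x_i+x_i^{2}\right)e^{-n\theta\bar{x}},$$
where $\bar{x}$ is the sample mean.
The natural log likelihood function is thus obtained as
$$\ln L=n\left[3\ln\theta-\ln\left(\alpha\theta^{2}+\theta+2\right)\right]+\sum_{i=1}^{n}\ln\left(\alpha+x_i+x_i^{2}\right)-n\theta\bar{x}$$
The maximum likelihood estimates (MLEs) $(\hat{\theta},\hat{\alpha})$ of $(\theta,\alpha)$ are then the solutions of the following non–linear equations
$$\frac{\partial\ln L}{\partial\theta}=\frac{3n}{\theta}-\frac{n(2\alpha\theta+1)}{\alpha\theta^{2}+\theta+2}-n\bar{x}=0 \tag{5.2.1}$$
$$\frac{\partial\ln L}{\partial\alpha}=-\frac{n\theta^{2}}{\alpha\theta^{2}+\theta+2}+\sum_{i=1}^{n}\frac{1}{\alpha+x_i+x_i^{2}}=0 \tag{5.2.2}$$
These two natural log likelihood equations cannot be solved directly because their solutions cannot be expressed in closed form. The MLEs $(\hat{\theta},\hat{\alpha})$ of $(\theta,\alpha)$ can be computed by solving the natural log likelihood equations numerically with the Newton–Raphson iteration method in R software until sufficiently close values of $\hat{\theta}$ and $\hat{\alpha}$ are obtained. The initial values of the parameters θ and α are the MOME $(\tilde{\theta},\tilde{\alpha})$ of the parameters.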
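The log likelihood and its two partial derivatives are the only ingredients an iterative solver needs. The sketch below writes them out under the assumed TPSD pdf $\theta^{3}(\alpha+x+x^{2})e^{-\theta x}/(\alpha\theta^{2}+\theta+2)$; it is written in Python rather than R purely for illustration, and the function names are hypothetical.

```python
import math

def tpsd_loglik(theta, alpha, data):
    """Natural log likelihood of an i.i.d. sample under the assumed TPSD form."""
    n = len(data)
    d = alpha * theta**2 + theta + 2.0
    return (n * (3.0 * math.log(theta) - math.log(d))
            + sum(math.log(alpha + x + x * x) for x in data)
            - theta * sum(data))

def tpsd_score(theta, alpha, data):
    """Partial derivatives of the log likelihood with respect to theta and alpha."""
    n = len(data)
    xbar = sum(data) / n
    d = alpha * theta**2 + theta + 2.0
    d_theta = 3.0 * n / theta - n * (2.0 * alpha * theta + 1.0) / d - n * xbar
    d_alpha = -n * theta**2 / d + sum(1.0 / (alpha + x + x * x) for x in data)
    return d_theta, d_alpha
```

Comparing `tpsd_score` with finite differences of `tpsd_loglik` is a cheap way to catch algebra slips before handing the functions to a Newton–Raphson or other root-finding routine.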
A numerical example
In this section, an application of TPSD using maximum likelihood estimates has been discussed with a real lifetime data set. The data set regarding vinyl chloride obtained from clean upgradient monitoring wells in mg/l, available in Bhaumik et al.,10 has been considered. The data set is
5.1, 1.2, 1.3, 0.6, 0.5, 2.4, 0.5, 1.1, 8, 0.8, 0.4,
0.6, 0.9, 0.4, 2, 0.5, 5.3, 3.2, 2.7, 2.9, 2.5, 2.3,
1, 0.2, 0.1, 0.1, 1.8, 0.9, 2.4, 6.8, 1.2, 0.4, 0.2
In order to compare lifetime distributions, values of $-2\ln L$, AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion) and the K–S Statistic (Kolmogorov–Smirnov Statistic) for the above data set have been computed. The formulae for computing AIC, BIC and the K–S Statistic are as follows:
$$\text{AIC}=-2\ln L+2k,\quad \text{BIC}=-2\ln L+k\ln n,\quad\text{and}\quad D=\sup_{x}\left|F_n(x)-F_0(x)\right|,$$
where $k$ = the number of parameters, $n$ = the sample size, $F_n(x)$ = the empirical distribution function and $F_0(x)$ = the fitted cdf. The best distribution is the distribution which corresponds to lower values of $-2\ln L$, AIC, BIC and the K–S statistic and a higher p–value. The MLEs along with their standard errors, $-2\ln L$, AIC, BIC, K–S Statistic and p–value of the fitted distributions are presented in Table 5.
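The three comparison criteria are straightforward to compute once a fitted log likelihood and cdf are available. The following is a minimal sketch using the standard definitions AIC $=-2\ln L+2k$, BIC $=-2\ln L+k\ln n$ and the one-sample Kolmogorov–Smirnov distance; the function names are illustrative, and the p–value computation is not included.

```python
import math

def aic(loglik, k):
    """Akaike Information Criterion for a model with k parameters."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian Information Criterion for k parameters and sample size n."""
    return -2.0 * loglik + k * math.log(n)

def ks_statistic(data, cdf):
    """sup_x |F_n(x) - F_0(x)|, evaluated at the jumps of the empirical cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f0 = cdf(x)
        # the empirical cdf jumps from (i - 1)/n to i/n at x
        d = max(d, i / n - f0, f0 - (i - 1) / n)
    return d
```

Here `cdf` is any callable returning the fitted cdf at a point, for example a TPSD cdf with the estimated parameters bound in.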
The results show that TPSD gives a much closer fit than the Sujatha and Lindley distributions. Therefore, TPSD can be considered an important two–parameter lifetime distribution. In order to see the closeness of the fit given by the Lindley, Sujatha and TPSD distributions, the fitted pdf plots of these distributions for the given data set are shown in Figure 6. The fitted plots, together with the histogram of the original data set, also indicate that TPSD gives a much closer fit than the Lindley and Sujatha distributions.
Figure 6 Fitted pdf plots of distributions for the given dataset.
Conclusion
A two–parameter Sujatha distribution (TPSD) has been introduced which includes the size–biased Lindley distribution and the Sujatha distribution, proposed by Shanker,3 as particular cases. Moments about the origin and moments about the mean have been obtained, and the nature of the coefficient of variation, coefficient of skewness, coefficient of kurtosis and index of dispersion of TPSD has been studied for varying values of the parameters. The nature of the probability density function, cumulative distribution function, hazard rate function and mean residual life function has been discussed for varying values of the parameters. The stochastic ordering, mean deviations, Bonferroni and Lorenz curves, and stress–strength reliability have also been discussed. The method of moments and the method of maximum likelihood have been discussed for estimating the parameters. A numerical example with a real lifetime data set has been presented to show the application of TPSD, and TPSD gives a much closer fit than the Sujatha and Lindley distributions.
Acknowledgement
Authors are grateful to the editor–in–chief of the journal and the anonymous reviewer for constructive comments on the paper.
Conflict of interest
Authors declare that there is no conflict of interest.
References
©2018 Tesfay, et al. This is an open access article distributed under license terms which permit unrestricted use, distribution, and building upon the work non-commercially.