Research Article Volume 9 Issue 3
1Department of Statistics, Assam University, Silchar, India
2Department of Statistics, Mainefhi College of Science, Asmara, Eritrea
3 Department of Mathematics, G.L.A.College, N.P University, India
Correspondence: Kamlesh Kumar Shukla, Mainefhi College of Science, Asmara, Eritrea
Received: April 24, 2020 | Published: June 9, 2020
Citation: Shanker R, Shukla KK, Shanker R. A note on size–biased new quasi Poisson–Lindley distribution. Biom Biostat Int J. 2020;9(3):97-104. DOI: 10.15406/bbij.2020.09.00306
In this paper some important properties including coefficients of variation, skewness, kurtosis and index of dispersion of size–biased new quasi Poisson–Lindley distribution (SBNQPLD) have been discussed and their behaviors have been explained graphically for varying values of parameters. Some applications of SBNQPLD have also been discussed.
Keywords: Size–biased new Quasi Poisson–Lindley distribution, moments based measures, maximum likelihood estimation, goodness of fit
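The size–biased version of the Poisson–Lindley distribution proposed by Sankaran,1 known as the size–biased Poisson–Lindley distribution (SBPLD), was introduced by Ghitany and Mutairi2 and is defined by its probability mass function (pmf)

$$P_1(x;\theta)=\frac{\theta^3}{\theta+2}\cdot\frac{x\,(x+\theta+2)}{(\theta+1)^{x+2}};\quad \theta>0,\ x=1,2,3,\ldots$$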
Shanker et al.3 proposed a simple method of deriving the moments of the SBPLD and applied the SBPLD to model thunderstorm events. The Poisson–Lindley distribution (PLD), a Poisson mixture of the Lindley distribution of Lindley,4 is defined by its pmf

$$P_2(x;\theta)=\frac{\theta^2\,(x+\theta+2)}{(\theta+1)^{x+3}};\quad x=0,1,2,\ldots,\ \theta>0.$$

The Lindley distribution is defined by its probability density function (pdf)

$$f_1(x;\theta)=\frac{\theta^2}{\theta+1}\,(1+x)\,e^{-\theta x};\quad x>0,\ \theta>0.$$
The size–biased quasi Poisson–Lindley distribution (SBQPLD), the size–biased version of the quasi Poisson–Lindley distribution (QPLD) of Shanker and Mishra,5 was suggested by Shanker and Mishra6. With parameters $\theta$ and $\alpha$, it is defined by its pmf

$$P_3(x;\theta,\alpha)=\frac{\theta^2}{\alpha+2}\cdot\frac{x\,(\theta x+\theta\alpha+\theta+\alpha)}{(\theta+1)^{x+2}};\quad x=1,2,3,\ldots,\ \theta>0,\ \alpha>-2.$$
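The QPLD, a Poisson mixture of the quasi Lindley distribution (QLD) proposed by Shanker and Mishra,7 is defined by its pmf

$$P_4(x;\theta,\alpha)=\frac{\theta\,(\theta x+\theta\alpha+\theta+\alpha)}{(\alpha+1)\,(\theta+1)^{x+2}};\quad x=0,1,2,\ldots;\ \theta>0,\ \alpha>-1.$$

The QLD is defined by its pdf

$$f_2(x;\theta,\alpha)=\frac{\theta}{\alpha+1}\,(\alpha+x\theta)\,e^{-\theta x};\quad x>0,\ \theta>0,\ \alpha>-1.$$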
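Shanker and Amanuel8 proposed a new quasi Lindley distribution (NQLD) having pdf

$$f_3(x;\theta,\alpha)=\frac{\theta^2}{\theta^2+\alpha}\,(\theta+\alpha x)\,e^{-\theta x},$$

where either $\theta+\alpha x>0$ and $\theta^2+\alpha>0$, or $\theta+\alpha x<0$ and $\theta^2+\alpha<0$, for $x>0$, $\theta>0$. The Lindley distribution is a particular case of the NQLD at $\alpha=\theta$. A new quasi Poisson–Lindley distribution (NQPLD), a Poisson mixture of the NQLD, was suggested by Shanker and Tekie9 and is defined by its pmf

$$P_5(x;\theta,\alpha)=\frac{\theta^2}{(\theta+1)^{x+2}}\left[1+\frac{\theta+\alpha x}{\theta^2+\alpha}\right],$$

where either $\theta+\alpha x>0$ and $\theta^2+\alpha>0$, or $\theta+\alpha x<0$ and $\theta^2+\alpha<0$, for $x=0,1,2,\ldots$; $\theta>0$.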
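It can be seen that the PLD is a particular case of the NQPLD at $\alpha=\theta$. Shanker et al.10 derived the size–biased new quasi Poisson–Lindley distribution (SBNQPLD), which has pmf

$$P_6(x;\theta,\alpha)=\frac{\theta^3}{\theta^2+2\alpha}\cdot\frac{x\,(\theta^2+\theta+\alpha+\alpha x)}{(\theta+1)^{x+2}};\quad x=1,2,3,\ldots,\ \theta>0,\ \theta^2+2\alpha>0.$$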
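As a quick numerical illustration of the pmf $P_6$, the following Python sketch evaluates it and checks that the probabilities sum to one. The function name `sbnqpld_pmf` and the parameter values are illustrative assumptions and do not come from the original paper.

```python
def sbnqpld_pmf(x, theta, alpha):
    """SBNQPLD pmf P6(x; theta, alpha) for x = 1, 2, 3, ...

    Assumes theta > 0 and theta**2 + 2*alpha > 0.
    """
    if x < 1:
        return 0.0
    norm = theta**3 / (theta**2 + 2.0 * alpha)
    return norm * x * (theta**2 + theta + alpha + alpha * x) / (theta + 1.0) ** (x + 2)


# Illustrative parameter values (assumed, not taken from the paper).
theta, alpha = 2.0, 1.5

# The tail beyond x = 200 is negligible for these values.
total = sum(sbnqpld_pmf(x, theta, alpha) for x in range(1, 201))
print(total)
```

The truncation point of 200 is a convenience choice; smaller values of $\theta$ give heavier tails and may need a larger upper limit.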
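Shanker et al.10 discussed various statistical properties, parameter estimation and applications of the SBNQPLD. They also showed that the SBNQPLD can be obtained from the size–biased Poisson distribution when its parameter $\lambda$ follows a size–biased new quasi Lindley distribution (SBNQLD) with pdf

$$f_4(\lambda;\theta,\alpha)=\frac{\theta^3}{\theta^2+2\alpha}\,\lambda\,(\theta+\alpha\lambda)\,e^{-\theta\lambda},$$

where either $\theta+\alpha\lambda>0$ and $\theta^2+2\alpha>0$, or $\theta+\alpha\lambda<0$ and $\theta^2+2\alpha<0$, for $\lambda>0$, $\theta>0$. That is,

$$P(X=x)=\int_0^\infty \frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!}\cdot\frac{\theta^3}{\theta^2+2\alpha}\,\lambda\,(\theta+\alpha\lambda)\,e^{-\theta\lambda}\,d\lambda
=\frac{\theta^3}{\theta^2+2\alpha}\cdot\frac{x\,(\theta^2+\theta+\alpha+\alpha x)}{(\theta+1)^{x+2}};\quad x=1,2,3,\ldots$$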
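The intermediate steps of this integration are not spelled out in the text. A sketch, using only the gamma integral $\int_0^\infty \lambda^{k}e^{-(\theta+1)\lambda}\,d\lambda = k!/(\theta+1)^{k+1}$, is:

```latex
\begin{aligned}
P(X=x)
&=\frac{\theta^3}{(\theta^2+2\alpha)\,(x-1)!}
  \int_0^\infty \lambda^{x}\,(\theta+\alpha\lambda)\,e^{-(\theta+1)\lambda}\,d\lambda \\
&=\frac{\theta^3}{(\theta^2+2\alpha)\,(x-1)!}
  \left[\frac{\theta\,x!}{(\theta+1)^{x+1}}+\frac{\alpha\,(x+1)!}{(\theta+1)^{x+2}}\right] \\
&=\frac{\theta^3}{\theta^2+2\alpha}\cdot
  \frac{x\,\{\theta(\theta+1)+\alpha(x+1)\}}{(\theta+1)^{x+2}}
 =\frac{\theta^3}{\theta^2+2\alpha}\cdot
  \frac{x\,(\theta^2+\theta+\alpha+\alpha x)}{(\theta+1)^{x+2}} .
\end{aligned}
```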
The $r$th factorial moment about the origin, $\mu_{(r)}'$, of the SBNQPLD was obtained by Shanker et al.10 as

$$\mu_{(r)}'=\frac{r!\,\{r\theta^3+(r+1)\theta^2+r(r+1)\alpha\theta+(r+1)(r+2)\alpha\}}{\theta^{r}\,(\theta^2+2\alpha)};\quad r=1,2,3,\ldots$$
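Thus, the first four moments about the origin obtained by Shanker et al.10 are

$$\mu_1'=1+\frac{2(\theta^2+3\alpha)}{\theta(\theta^2+2\alpha)}$$

$$\mu_2'=1+\frac{6(\theta^2+3\alpha)}{\theta(\theta^2+2\alpha)}+\frac{6(\theta^2+4\alpha)}{\theta^2(\theta^2+2\alpha)}$$

$$\mu_3'=1+\frac{14(\theta^2+3\alpha)}{\theta(\theta^2+2\alpha)}+\frac{36(\theta^2+4\alpha)}{\theta^2(\theta^2+2\alpha)}+\frac{24(\theta^2+5\alpha)}{\theta^3(\theta^2+2\alpha)}$$

$$\mu_4'=1+\frac{30(\theta^2+3\alpha)}{\theta(\theta^2+2\alpha)}+\frac{150(\theta^2+4\alpha)}{\theta^2(\theta^2+2\alpha)}+\frac{240(\theta^2+5\alpha)}{\theta^3(\theta^2+2\alpha)}+\frac{120(\theta^2+6\alpha)}{\theta^4(\theta^2+2\alpha)}.$$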
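The raw moments follow from the factorial moments through the Stirling numbers of the second kind: $\mu_2'=\mu_{(2)}'+\mu_{(1)}'$, $\mu_3'=\mu_{(3)}'+3\mu_{(2)}'+\mu_{(1)}'$ and $\mu_4'=\mu_{(4)}'+6\mu_{(3)}'+7\mu_{(2)}'+\mu_{(1)}'$. The Python sketch below applies this conversion and compares the result with direct summation over the pmf. All function names and the parameter values are illustrative assumptions.

```python
from math import factorial


def sbnqpld_pmf(x, theta, alpha):
    """SBNQPLD pmf for x = 1, 2, 3, ..."""
    norm = theta**3 / (theta**2 + 2.0 * alpha)
    return norm * x * (theta**2 + theta + alpha + alpha * x) / (theta + 1.0) ** (x + 2)


def factorial_moment(r, theta, alpha):
    """Closed-form r-th factorial moment about the origin."""
    num = (r * theta**3 + (r + 1) * theta**2
           + r * (r + 1) * alpha * theta + (r + 1) * (r + 2) * alpha)
    return factorial(r) * num / (theta**r * (theta**2 + 2.0 * alpha))


def raw_moments(theta, alpha):
    """First four raw moments from factorial moments (Stirling numbers of the second kind)."""
    f1, f2, f3, f4 = (factorial_moment(r, theta, alpha) for r in range(1, 5))
    return [f1, f2 + f1, f3 + 3 * f2 + f1, f4 + 6 * f3 + 7 * f2 + f1]


# Illustrative parameter values (assumed, not taken from the paper).
theta, alpha = 2.0, 1.5

closed_form = raw_moments(theta, alpha)
# Direct summation, truncated where the tail is negligible for these values.
numeric = [sum(x**k * sbnqpld_pmf(x, theta, alpha) for x in range(1, 201))
           for k in range(1, 5)]

for c, m in zip(closed_form, numeric):
    print(c, m)
```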
The central moments (moments about the mean) were not given by Shanker et al.10, and hence several important characteristics of the SBNQPLD, including the coefficient of variation, skewness, kurtosis and index of dispersion, were not studied there.
The main purpose of this paper is to derive expressions for the coefficients of variation, skewness, kurtosis and index of dispersion of the SBNQPLD and to study their behaviour graphically. The goodness of fit of the distribution, based on maximum likelihood estimates, is presented for a number of count datasets from various fields of knowledge.
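Using the relationship between moments about the mean and moments about the origin, the moments about the mean of the SBNQPLD are obtained as

$$\mu_2=\frac{2\,(\theta^5+\theta^4+5\alpha\theta^3+6\alpha\theta^2+6\alpha^2\theta+6\alpha^2)}{\theta^2(\theta^2+2\alpha)^2}$$

$$\mu_3=\frac{2\,\{\theta^8+3\theta^7+(7\alpha+2)\theta^6+24\alpha\theta^5+(16\alpha^2+18\alpha)\theta^4+54\alpha^2\theta^3+(12\alpha^3+36\alpha^2)\theta^2+36\alpha^3\theta+24\alpha^3\}}{\theta^3(\theta^2+2\alpha)^3}$$

$$\mu_4=\frac{2\left\{\begin{aligned}
&\theta^{11}+13\theta^{10}+(9\alpha+24)\theta^{9}+(130\alpha+12)\theta^{8}+(30\alpha^2+264\alpha)\theta^{7}+(460\alpha^2+144\alpha)\theta^{6}\\
&+(44\alpha^3+936\alpha^2)\theta^{5}+(696\alpha^3+504\alpha^2)\theta^{4}+(24\alpha^4+1368\alpha^3)\theta^{3}\\
&+(384\alpha^4+720\alpha^3)\theta^{2}+720\alpha^4\theta+360\alpha^4
\end{aligned}\right\}}{\theta^4(\theta^2+2\alpha)^4}.$$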
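For reference, the relationships between central and raw moments used in this step are the standard ones:

```latex
\begin{aligned}
\mu_2 &= \mu_2' - (\mu_1')^2,\\
\mu_3 &= \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3,\\
\mu_4 &= \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 .
\end{aligned}
```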
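The coefficient of variation (C.V.), coefficient of skewness $(\sqrt{\beta_1})$, coefficient of kurtosis $(\beta_2)$ and index of dispersion $(\gamma)$ of the SBNQPLD are obtained as

$$\text{C.V.}=\frac{\sigma}{\mu_1'}=\frac{\sqrt{2\,(\theta^5+\theta^4+5\alpha\theta^3+6\alpha\theta^2+6\alpha^2\theta+6\alpha^2)}}{\theta^3+2\theta^2+2\alpha\theta+6\alpha}$$

$$\sqrt{\beta_1}=\frac{\mu_3}{(\mu_2)^{3/2}}=\frac{\theta^8+3\theta^7+(7\alpha+2)\theta^6+24\alpha\theta^5+(16\alpha^2+18\alpha)\theta^4+54\alpha^2\theta^3+(12\alpha^3+36\alpha^2)\theta^2+36\alpha^3\theta+24\alpha^3}{\sqrt{2}\,(\theta^5+\theta^4+5\alpha\theta^3+6\alpha\theta^2+6\alpha^2\theta+6\alpha^2)^{3/2}}$$

$$\beta_2=\frac{\mu_4}{\mu_2^2}=\frac{\left\{\begin{aligned}
&\theta^{11}+13\theta^{10}+(9\alpha+24)\theta^{9}+(130\alpha+12)\theta^{8}+(30\alpha^2+264\alpha)\theta^{7}+(460\alpha^2+144\alpha)\theta^{6}\\
&+(44\alpha^3+936\alpha^2)\theta^{5}+(696\alpha^3+504\alpha^2)\theta^{4}+(24\alpha^4+1368\alpha^3)\theta^{3}\\
&+(384\alpha^4+720\alpha^3)\theta^{2}+720\alpha^4\theta+360\alpha^4
\end{aligned}\right\}}{2\,(\theta^5+\theta^4+5\alpha\theta^3+6\alpha\theta^2+6\alpha^2\theta+6\alpha^2)^2}$$

$$\gamma=\frac{\sigma^2}{\mu_1'}=\frac{2\,(\theta^5+\theta^4+5\alpha\theta^3+6\alpha\theta^2+6\alpha^2\theta+6\alpha^2)}{\theta\,(\theta^2+2\alpha)\,(\theta^3+2\theta^2+2\alpha\theta+6\alpha)}.$$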
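These four measures can also be evaluated numerically by summing over the pmf, which gives an independent check on the closed forms. The Python sketch below computes all four by direct summation and compares the coefficient of variation and index of dispersion with the expressions above. Function names, the truncation point and the parameter values are illustrative assumptions.

```python
from math import sqrt


def sbnqpld_pmf(x, theta, alpha):
    """SBNQPLD pmf for x = 1, 2, 3, ..."""
    norm = theta**3 / (theta**2 + 2.0 * alpha)
    return norm * x * (theta**2 + theta + alpha + alpha * x) / (theta + 1.0) ** (x + 2)


def numeric_measures(theta, alpha, x_max=200):
    """C.V., skewness, kurtosis and index of dispersion by direct summation."""
    xs = range(1, x_max + 1)
    p = [sbnqpld_pmf(x, theta, alpha) for x in xs]
    mean = sum(x * px for x, px in zip(xs, p))
    mu2, mu3, mu4 = (sum((x - mean) ** k * px for x, px in zip(xs, p)) for k in (2, 3, 4))
    return {
        "cv": sqrt(mu2) / mean,
        "skewness": mu3 / mu2**1.5,
        "kurtosis": mu4 / mu2**2,
        "dispersion": mu2 / mean,
    }


def closed_form_cv_and_dispersion(theta, alpha):
    """Closed-form C.V. and index of dispersion from the expressions in the text."""
    a = (theta**5 + theta**4 + 5 * alpha * theta**3 + 6 * alpha * theta**2
         + 6 * alpha**2 * theta + 6 * alpha**2)
    b = theta**3 + 2 * theta**2 + 2 * alpha * theta + 6 * alpha
    cv = sqrt(2 * a) / b
    gamma = 2 * a / (theta * (theta**2 + 2 * alpha) * b)
    return cv, gamma


# Illustrative parameter values (assumed, not taken from the paper).
theta, alpha = 2.0, 1.5
measures = numeric_measures(theta, alpha)
cv_formula, gamma_formula = closed_form_cv_and_dispersion(theta, alpha)
print(measures)
print(cv_formula, gamma_formula)
```

Repeating the calculation over a grid of $\theta$ and $\alpha$ values is one way to trace how the measures vary with the parameters.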
The behaviour of the coefficient of variation, skewness, kurtosis and index of dispersion of the SBNQPLD for varying values of the parameters is shown in Figure 1.
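Let $(x_1,x_2,\ldots,x_n)$ be a random sample of size $n$ from the SBNQPLD and let $f_x$ be the observed frequency in the sample corresponding to $X=x$ $(x=1,2,\ldots,k)$, such that $\sum_{x=1}^{k}f_x=n$, where $k$ is the largest observed value having non–zero frequency. The log likelihood function of the SBNQPLD can be written as

$$\log L=n\log\!\left(\frac{\theta^3}{\theta^2+2\alpha}\right)-\sum_{x=1}^{k}f_x\,(x+2)\log(\theta+1)+\sum_{x=1}^{k}f_x\log\!\left[\alpha x^2+x(\theta^2+\theta+\alpha)\right].$$

The two log likelihood equations are thus obtained as

$$\frac{\partial\log L}{\partial\theta}=\frac{3n}{\theta}-\frac{2n\theta}{\theta^2+2\alpha}-\sum_{x=1}^{k}\frac{(x+2)f_x}{\theta+1}+\sum_{x=1}^{k}\frac{(2\theta+1)\,x\,f_x}{\alpha x^2+x(\theta^2+\theta+\alpha)}=0$$

$$\frac{\partial\log L}{\partial\alpha}=-\frac{2n}{\theta^2+2\alpha}+\sum_{x=1}^{k}\frac{x(x+1)\,f_x}{\alpha x^2+x(\theta^2+\theta+\alpha)}=0.$$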
These two log likelihood equations cannot be solved in closed form. The maximum likelihood estimates (MLEs) $(\hat\theta,\hat\alpha)$ of the parameters $(\theta,\alpha)$ can instead be computed by solving the log likelihood equations numerically, using the Newton–Raphson iteration available in R and iterating until sufficiently close values of $\hat\theta$ and $\hat\alpha$ are obtained. The initial values of $\theta$ and $\alpha$ are taken to be the method of moments estimates (MOME) $(\tilde\theta,\tilde\alpha)$ of the parameters, given in Shanker et al.10
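To make the estimation step concrete, the Python sketch below codes the log likelihood and its two partial derivatives and checks the derivatives against central finite differences. It is not the authors' R code: the function names are hypothetical, the frequencies are those of Table 9, and the evaluation point is an arbitrary trial value rather than an estimate.

```python
from math import log


def log_likelihood(theta, alpha, freq):
    """Log likelihood of the SBNQPLD; freq maps x to its observed frequency f_x."""
    n = sum(freq.values())
    ll = n * log(theta**3 / (theta**2 + 2.0 * alpha))
    for x, f in freq.items():
        ll -= f * (x + 2) * log(theta + 1.0)
        ll += f * log(alpha * x**2 + x * (theta**2 + theta + alpha))
    return ll


def score(theta, alpha, freq):
    """Partial derivatives of the log likelihood with respect to theta and alpha."""
    n = sum(freq.values())
    d_theta = 3.0 * n / theta - 2.0 * n * theta / (theta**2 + 2.0 * alpha)
    d_alpha = -2.0 * n / (theta**2 + 2.0 * alpha)
    for x, f in freq.items():
        denom = alpha * x**2 + x * (theta**2 + theta + alpha)
        d_theta += -f * (x + 2) / (theta + 1.0) + f * (2.0 * theta + 1.0) * x / denom
        d_alpha += f * x * (x + 1) / denom
    return d_theta, d_alpha


def numerical_score(theta, alpha, freq, h=1e-6):
    """Central finite-difference approximation to the score, used only as a check."""
    d_theta = (log_likelihood(theta + h, alpha, freq)
               - log_likelihood(theta - h, alpha, freq)) / (2.0 * h)
    d_alpha = (log_likelihood(theta, alpha + h, freq)
               - log_likelihood(theta, alpha - h, freq)) / (2.0 * h)
    return d_theta, d_alpha


# Observed frequencies from Table 9 (number of fly eggs, x = 1..9).
freq = {1: 22, 2: 18, 3: 18, 4: 11, 5: 9, 6: 6, 7: 3, 8: 0, 9: 1}

# Arbitrary trial point (assumed for illustration, not an estimate).
analytic = score(1.3, 2.5, freq)
numeric = numerical_score(1.3, 2.5, freq)
print(analytic)
print(numeric)
```

Functions of this kind can be handed to any Newton-type or quasi-Newton optimiser. Starting from the moment estimates, as the text suggests, helps keep the iteration inside the region where $\theta>0$ and $\theta^2+2\alpha>0$.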
To test the goodness of fit of the SBNQPLD, several count datasets from various fields of knowledge have been considered, and the fit has been compared with that of the size–biased Poisson distribution (SBPD), the SBPLD and the SBQPLD. The expected frequencies under the SBPD, SBPLD and SBQPLD are also given in Tables 1–10. The estimates of the parameters have been obtained by the method of maximum likelihood. The goodness-of-fit results indicate that the SBNQPLD provides a better fit than the SBPD and SBPLD and competes well with the SBQPLD in the majority of datasets. The following datasets have been considered for testing the goodness of fit of the SBNQPLD.
| Group size | Observed frequency | Expected: SBPD | Expected: SBPLD | Expected: SBQPLD | Expected: SBNQPLD |
|---|---|---|---|---|---|
| 1 | 1486 | 1452.4 | 1532.5 | 1485.4 | 1505.5 |
| 2 | 694 | 743.3 | 630.6 | 697.2 | 656.8 |
| 3 | 195 | 190.2 | 191.9 | 189.7 | 202.5 |
| 4 | 37 | 32.4 | 51.3 | 41.1 | 49.2 |
| 5 | 10 | 4.1 | 12.8 | 7.8 | 9.0 |
| 6 | 1 | 0.6 | 3.9 | 1.8 | 0.0 |
| Total | 2423 | 2423.0 | 2423.0 | 2423 | |
| ML estimate | | $\hat\theta=0.5118$ | $\hat\theta=4.5082$ | $\hat\theta=7.14063$, $\hat\alpha=-0.79104$ | $\hat\theta=2.69606$, $\hat\alpha=-1.39128$ |
| $\chi^2$ | | 7.370 | 13.760 | 0.776 | 6.1 |
| d.f. | | 2 | 3 | 2 | 2 |
| p-value | | 0.0251 | 0.003 | 0.6804 | 0.04735 |
| $-2\log L$ | | 10445.34 | 4622.36 | 4607.8 | 4610.0 |
| AIC | | 10447.34 | 4624.36 | 4611.8 | 4614.0 |
Table 1 Pedestrians-Eugene, Spring, Morning, available in Coleman and James11
| Group size | Observed frequency | Expected: SBPD | Expected: SBPLD | Expected: SBQPLD | Expected: SBNQPLD |
|---|---|---|---|---|---|
| 1 | 316 | 306.3 | 322.9 | 315.7 | 313.5 |
| 2 | 141 | 156.1 | 132.5 | 142.7 | 141.4 |
| 3 | 44 | 39.8 | 40.2 | 40.1 | 44.1 |
| 4 | 5 | 6.7 | 10.7 | 9.1 | 10.4 |
| 5 | 4 | 1.1 | 3.7 | 2.4 | 0.6 |
| Total | 510 | 510.0 | 510.0 | 510.0 | |
| ML estimate | | $\hat\theta=0.5098$ | $\hat\theta=4.5211$ | $\hat\theta=6.5501$, $\hat\alpha=-0.5069$ | $\hat\theta=2.4693$, $\hat\alpha=-1.2977$ |
| $\chi^2$ | | 2.39 | 3.07 | 0.94 | 0.38 |
| d.f. | | 2 | 2 | 1 | 1 |
| p-value | | 0.3027 | 0.2154 | 0.3322 | 0.5376 |
| $-2\log L$ | | 916.63 | 972.78 | 971.07 | 970.24 |
| AIC | | 918.63 | 974.78 | 975.07 | 974.24 |
Table 2 Play Groups-Eugene, Spring, Public Playground A, available in Coleman and James11
| Group size | Observed frequency | Expected: SBPD | Expected: SBPLD | Expected: SBQPLD | Expected: SBNQPLD |
|---|---|---|---|---|---|
| 1 | 306 | 292.2 | 309.4 | 304.4 | 306.4 |
| 2 | 132 | 155.2 | 131.2 | 137.9 | 134.4 |
| 3 | 47 | 41.2 | 41.1 | 41.3 | 41.6 |
| 4 | 10 | 7.3 | 11.3 | 10.3 | 11.0 |
| 5 | 2 | 1.1 | 4.0 | 3.1 | 3.6 |
| Total | 497 | 497.0 | 497.0 | | |
| ML estimate | | $\hat\theta=0.5312$ | $\hat\theta=4.3548$ | $\hat\theta=5.71547$, $\hat\alpha=4.9998$ | $\hat\theta=4.9998$ |
| $\chi^2$ | | 6.479 | 0.932 | 1.19 | 1.2 |
| d.f. | | 2 | 2 | 1 | 1 |
| p-value | | 0.039 | 0.6281 | 0.2753 | 0.2733 |
| $-2\log L$ | | 2142.03 | 971.86 | 970.96 | 971.25 |
| AIC | | 2144.03 | 973.86 | 974.96 | 975.25 |
Table 3 Play Groups-Eugene, Spring, Public Playground A, available in Coleman and James11
| Group size | Observed frequency | Expected: SBPD | Expected: SBPLD | Expected: SBQPLD | Expected: SBNQPLD |
|---|---|---|---|---|---|
| 1 | 305 | 296.5 | 314.4 | 304.3 | 310.1 |
| 2 | 144 | 159.0 | 134.4 | 148.2 | 138.8 |
| 3 | 50 | 42.6 | 42.5 | 42.3 | 43.1 |
| 4 | 5 | 7.6 | 11.8 | 9.6 | 11.3 |
| 5 | 2 | 1.0 | 3.1 | 1.9 | 2.7 |
| 6 | 1 | 0.3 | 0.8 | 0.7 | 1.0 |
| Total | 507 | 507.0 | 507.0 | 507.0 | 507.0 |
| ML estimate | | $\hat\theta=0.5365$ | $\hat\theta=4.3179$ | $\hat\theta=6.70804$, $\hat\alpha=-0.74907$ | $\hat\theta=5.1516$, $\hat\alpha=48.6067$ |
| $\chi^2$ | | 3.035 | 6.415 | 2.96 | 4.64 |
| d.f. | | 2 | 2 | 1 | 1 |
| p-value | | 0.219 | 0.040 | 0.0853 | 0.0312 |
| $-2\log L$ | | 2376.75 | 993.10 | 990.02 | 991.51 |
| AIC | | 2378.75 | 995.1 | 994.02 | 995.51 |
Table 4 Play Groups-Eugene, Spring, Public Playground D, available in Coleman and James11
| Group size | Observed frequency | Expected: SBPD | Expected: SBPLD | Expected: SBQPLD | Expected: SBNQPLD |
|---|---|---|---|---|---|
| 1 | 276 | 292.3 | 319.6 | 276.0 | 313.7 |
| 2 | 229 | 200.7 | 166.5 | 228.3 | 173.1 |
| 3 | 61 | 68.9 | 63.8 | 61.9 | 65.2 |
| 4 | 12 | 15.8 | 21.4 | 12.2 | 20.7 |
| 5 | 3 | 3.3 | 9.7 | 2.6 | 8.3 |
| Total | 581 | 581.0 | 581.0 | 581.0 | 581.0 |
| ML estimate | | $\hat\theta=0.6867$ | $\hat\theta=3.4359$ | $\hat\theta=8.6724$, $\hat\alpha=-1.4944$ | $\hat\theta=4.1645$, $\hat\alpha=61.0287$ |
| $\chi^2$ | | 6.68 | 37.86 | 0.017 | 29.6 |
| d.f. | | 2 | 2 | 1 | 1 |
| p-value | | 0.0354 | 0.00 | 0.8962 | 0.000 |
| $-2\log L$ | | 1146.7 | 1277.42 | 1238.11 | 1268.77 |
| AIC | | 1148.7 | 1279.42 | 1242.11 | 1272.77 |
Table 5 Play Groups-Eugene, Spring, Public Playground D, available in Coleman and James11
| No. of sites with particles | Observed frequency | Expected: SBPD | Expected: SBPLD | Expected: SBQPLD | Expected: SBNQPLD |
|---|---|---|---|---|---|
| 1 | 122 | 111.3 | 119.0 | 119.2 | 119.3 |
| 2 | 50 | 64.1 | 53.8 | 53.5 | 53.3 |
| 3 | 18 | | 18.0 | 17.9 | 17.8 |
| 4 | 4 | | | 5.3 | 5.3 |
| 5 | 4 | | | 2.1 | 2.3 |
| Total | 198 | 198.0 | 198.0 | 198.0 | 198.0 |
| ML estimate | | $\hat\theta=0.575758$ | $\hat\theta=4.050987$ | $\hat\theta=3.7564$, $\hat\alpha=10.1281$ | $\hat\theta=3.4795$, $\hat\alpha=0.0216$ |
| $\chi^2$ | | 4.64 | 0.43 | 0.34 | 0.28 |
| d.f. | | 1 | 2 | 1 | 1 |
| p-value | | 0.0312 | 0.8065 | 0.5598 | 0.5967 |
| $-2\log L$ | | 393.95 | 409.28 | 409.17 | 409.13 |
| AIC | | 395.95 | 411.28 | 413.17 | 413.13 |
Table 6 Distribution of number of counts of sites with particles from Immunogold data, available in Mathews and Appleton12
| No. of times hares caught | Observed frequency | Expected: SBPD | Expected: SBPLD | Expected: SBQPLD | Expected: SBNQPLD |
|---|---|---|---|---|---|
| 1 | 184 | 170.6 | 177.3 | 177.4 | 177.5 |
| 2 | 55 | 72.5 | 62.5 | 62.3 | 62.2 |
| 3 | 14 | | | 16.3 | 16.3 |
| 4 | 4 | | | 3.8 | 3.8 |
| 5 | 4 | | | 1.2 | 1.2 |
| Total | 261 | 261.0 | 261.0 | 261 | 261.0 |
| ML estimate | | $\hat\theta=0.425287$ | $\hat\theta=5.351256$ | $\hat\theta=4.9800$, $\hat\alpha=14.9193$ | $\hat\theta=4.6959$, $\hat\alpha=-0.0302$ |
| $\chi^2$ | | 6.22 | 1.18 | 3.2 | 3.19 |
| d.f. | | 1 | 1 | 1 | 1 |
| p-value | | 0.0126 | 0.2773 | 0.0736 | 0.07409 |
| $-2\log L$ | | 452.40 | 457.10 | 456.86 | 456.80 |
| AIC | | 454.40 | 459.10 | 460.86 | 460.80 |
Table 7 Distribution of snowshoe hares captured over 7 days, available in Keith and Meslow13
| Number of pairs of running shoes | Observed frequency | Expected: SBPD | Expected: SBPLD | Expected: SBQPLD | Expected: SBNQPLD |
|---|---|---|---|---|---|
| 1 | 18 | 15.0 | 20.3 | 17.4 | 19.5 |
| 2 | 18 | 20.8 | 17.4 | 19.6 | 18.0 |
| 3 | 12 | 14.4 | 10.9 | 12.3 | 11.3 |
| 4 | 7 | | 5.9 | 6.1 | 6.0 |
| 5 | 5 | | 5.5 | 4.6 | 5.2 |
| Total | 60 | 60.0 | 60.0 | 60.0 | 60 |
| ML estimate | | $\hat\theta=1.383333$ | $\hat\theta=1.818978$ | $\hat\theta=2.5858$, $\hat\alpha=-0.7318$ | $\hat\theta=2.08739$, $\hat\alpha=17.3228$ |
| $\chi^2$ | | 1.87 | 0.64 | 0.31 | 0.33 |
| d.f. | | 2 | 3 | 1 | 2 |
| p-value | | 0.3926 | 0.8872 | 0.5777 | 0.8478 |
| $-2\log L$ | | 147.1 | 187.08 | 185.55 | 186.33 |
| AIC | | 149.1 | 189.08 | 189.55 | 190.33 |
Table 8 Number of counts of pairs of running shoes owned by 60 members of an athletics club, reported by Simonoff14
| Number of fly eggs | Observed frequency | Expected: SBPD | Expected: SBPLD | Expected: SBQPLD | Expected: SBNQPLD |
|---|---|---|---|---|---|
| 1 | 22 | 11.3 | 20.3 | 19.8 | 19.8 |
| 2 | 18 | 23.2 | 22.0 | 22.1 | 22.1 |
| 3 | 18 | 23.8 | 17.2 | 17.5 | 17.5 |
| 4 | 11 | 16.2 | 11.6 | 11.8 | 11.8 |
| 5 | 9 | 8.3 | 7.2 | 7.3 | 7.3 |
| 6 | 6 | 3.4 | 4.2 | 4.2 | 4.2 |
| 7 | 3 | 1.1 | 2.4 | 2.3 | 2.3 |
| 8 | 0 | 0.3 | 1.3 | 1.3 | 1.3 |
| 9 | 1 | 0.4 | 1.8 | 1.7 | 1.7 |
| Total | 88 | | | 88.0 | 88.0 |
| ML estimate | | $\hat\theta=2.0454$ | $\hat\theta=1.2822$ | $\hat\theta=1.3483$, $\hat\alpha=0.6925$ | $\hat\theta=1.3465$, $\hat\alpha=2.5654$ |
| $\chi^2$ | | 18.8 | 1.39 | 1.49 | 1.49 |
| d.f. | | 4 | 4 | 3 | 3 |
| p-value | | 0.0008 | 0.8459 | 0.6845 | 0.6845 |
| $-2\log L$ | | 206.59 | 329.92 | 329.86 | 329.86 |
| AIC | | 208.59 | 331.92 | 333.86 | 333.86 |
Table 9 The numbers of counts of flower heads as per the number of fly eggs reported by Finney and Varley15
| x | Observed frequency | Expected: SBPD | Expected: SBPLD | Expected: SBQPLD | Expected: SBNQPLD |
|---|---|---|---|---|---|
| 1 | 375 | 341.2 | 262.8 | 363.3 | 363.6 |
| 2 | 143 | 186.8 | 157.4 | 156.5 | 156.3 |
| 3 | 49 | 51.1 | 50.4 | 50.4 | 50.4 |
| 4 | 17 | 9.3 | 14.2 | 14.4 | 14.4 |
| 5 | 2 | 1.2 | 3.7 | 3.9 | 3.8 |
| 6 | 2 | 0.1 | 0.9 | 1.0 | 1.0 |
| 7 | 1 | 0.2 | 0.2 | 0.2 | 0.2 |
| 8 | 1 | 0.1 | 0.3 | 0.4 | 0.3 |
| Total | 590 | | 590.0 | 590.0 | 590.0 |
| ML estimate | | $\hat\theta=0.5474$ | $\hat\theta=4.24$ | $\hat\theta=3.8386$, $\hat\alpha=17.2968$ | $\hat\theta=3.6534$, $\hat\alpha=0.00067$ |
| $\chi^2$ | | 14.1 | 2.48 | 2.11 | 2.08 |
| d.f. | | 2 | 3 | 2 | 2 |
| p-value | | 0.0008 | 0.4789 | 0.3481 | 0.3534 |
| $-2\log L$ | | 1124.3 | 1190.4 | 1189.67 | 1189.57 |
| AIC | | 1126.3 | 1192.4 | 1193.67 | 1193.57 |
Table 10 Number of households having at least one migrant according to the number of migrants, reported by Singh and Yadav16
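The way the table entries relate to the fitted pmf can be sketched in Python. The example below takes the SBNQPLD estimates reported in Table 9, computes expected frequencies as $n\,P_6(x;\hat\theta,\hat\alpha)$, and evaluates $-2\log L$ and AIC $=-2\log L+2p$ with $p=2$ parameters. Two assumptions are made here that the paper does not state: the likelihood is evaluated on the ungrouped counts, and the last cell absorbs the upper tail. The chi-square statistic is omitted because the pooling of cells used by the authors is not recoverable from the tables.

```python
from math import log


def sbnqpld_pmf(x, theta, alpha):
    """SBNQPLD pmf for x = 1, 2, 3, ..."""
    norm = theta**3 / (theta**2 + 2.0 * alpha)
    return norm * x * (theta**2 + theta + alpha + alpha * x) / (theta + 1.0) ** (x + 2)


# Table 9: observed frequencies for x = 1..9 and the reported SBNQPLD estimates.
observed = [22, 18, 18, 11, 9, 6, 3, 0, 1]
theta_hat, alpha_hat = 1.3465, 2.5654
n = sum(observed)

probs = [sbnqpld_pmf(x, theta_hat, alpha_hat) for x in range(1, 10)]

# Expected frequencies; the last cell takes the remaining mass (assumption).
expected = [n * p for p in probs[:-1]]
expected.append(n - sum(expected))

# -2 log L on the ungrouped counts (assumption) and AIC with two parameters.
minus2_loglik = -2.0 * sum(f * log(p) for f, p in zip(observed, probs))
aic = minus2_loglik + 2 * 2

print([round(e, 1) for e in expected])
print(round(minus2_loglik, 2), round(aic, 2))
```

With these inputs the leading expected frequencies should come out close to the SBNQPLD column of Table 9, and $-2\log L$ and AIC close to the reported 329.86 and 333.86. The final cell will differ somewhat from the tabulated value, depending on how the tail and rounding are handled.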
In this paper, expressions based on the central moments, including the coefficients of variation, skewness, kurtosis and index of dispersion of the SBNQPLD, have been derived, and their behaviour has been explained graphically for varying values of the parameters. Some important applications of the SBNQPLD have also been discussed, and its goodness of fit has been compared with that of other discrete distributions. It has been observed that the SBNQPLD provides a much better fit than the SBPD and SBPLD and competes well with the SBQPLD in the majority of datasets.
None.
None.
©2020 Shanker, et al. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.