
Kenyan insurance firms have introduced insurance policies for chronic illnesses such as cancer; however, they have faced a major challenge in pricing these policies because cancer can transition through different stages, which leads to variation in the cost of treatment. This has made the estimation of aggregate losses for diseases with multiple transition stages, such as cancer, an area of interest for many insurance firms. Mixture phase type distributions can address this problem, as they incorporate the transitions in the estimation of claim frequency while also capturing the heterogeneity of claim data. In this paper, we estimate the aggregate losses of secondary cancer cases in Kenya using mixture phase type Poisson Lindley distributions. Phase type (PH) distributions for the one and two parameter Poisson Lindley distributions are developed, as well as their compound distributions. The matrix parameters of the PH distributions are estimated using continuous Chapman-Kolmogorov equations, as the disease process of cancer is continuous, while severity is modeled using Pareto, Generalized Pareto and Weibull distributions. This study shows that aggregate losses for Kenyan data are best estimated using the PH-OPPL-Weibull model among the PH-OPPL models and the PH-TPPL-Generalized Pareto model among the PH-TPPL models. Comparing the two best models, the PH-OPPL-Weibull model provided the best fit for secondary cancer cases in Kenya. This model is also recommended for other diseases which, like cancer, are dynamic in nature.

Aggregate losses are estimated by incorporating both claim frequency and claim severity distributions. Pavel (2010) [

$$S_N = \sum_{i=1}^{N} X_i \tag{1}$$

where $X_i$ is the severity distribution and $N$ is the claim count distribution. The distribution of $N$ in this paper is considered to follow mixed PH Poisson distributions.

Phase type distributions are constructed when mixture distributions are convolved, resulting in an interrelated Poisson process occurring in phases. Phase type distributions were first introduced by Erlang (1909) [

Markov chains were introduced by Andrei Markov (1856-1922). Nurul et al. (2019) [

Frequency data are used to model occurrences in areas such as engineering, insurance and biology. The Poisson distribution is often used to model count data; however, it assumes that the variance-to-mean ratio is unity (equi-dispersion), which rarely holds for real data, so it is considered an inflexible model. Most real-life data exhibit either over-dispersion, where the variance exceeds the mean, or under-dispersion, where the mean exceeds the variance, which can be modeled using Poisson mixtures [

In the insurance sector, when calculating aggregate losses for chronic diseases with several stages, such as cancer, the claim frequency distributions usually considered do not incorporate the different stages of the disease. Incorporating phase type distributions resolves this shortcoming of ordinary distributions. Considering mixed phase type distributions further improves the modeling of claim frequency data, as it accounts for the heterogeneity of claim data. In this paper, we develop the PH one parameter Poisson Lindley distribution and the PH two parameter Poisson Lindley distribution, where the mixing distribution follows a PH Lindley distribution. The resulting PH distributions are used to model claim numbers of secondary cancers in Kenya. Section 1 has a brief introduction to Poisson distributions and Poisson Lindley distributions.

The structure of this paper is as follows: Section 2 discusses the construction of phase type distributions using PH Lindley distributions, which are later applied in modeling the aggregate losses. Compound distributions from the frequency and severity distributions are developed in Section 3. Aggregate losses for the data are estimated using Discrete Fourier Transforms and the results are discussed in Section 4, and Section 5 outlines the conclusions.

In this section we develop phase type distributions for the one parameter Poisson Lindley and two parameter Poisson Lindley distributions. Phase type Poisson Lindley distributions are derived when the mixing distribution follows a phase type Lindley distribution.

Definition 1. A random variable $X$ is said to follow a phase type one parameter Poisson Lindley (PH-OPPL) distribution if:

$$X \mid \lambda \sim \mathrm{Po}(\lambda)$$

$$\lambda \mid \Lambda \sim \mathrm{PH\text{-}OPL}(\Lambda)$$

for $\lambda > 0$, where $\Lambda$ is an $M \times M$ matrix.

Theorem 1. If $X \sim \mathrm{PH\text{-}OPPL}$, then the probability mass function of $X$ is:

$$f(x;\Lambda) = \vec{\gamma}\,\frac{\Lambda^{2}}{(I+\Lambda)^{x+3}}\,\{(x+2)I+\Lambda\}\,\vec{1}^{T} \tag{2}$$

where $\Lambda$ is $M \times M$ and $I$ is an identity matrix.

Proof:

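If $X \mid \lambda \sim \mathrm{Po}(\lambda)$ and $\lambda \mid \Lambda \sim \mathrm{PH\text{-}OPL}(\Lambda)$, then the pmf of the variable $X$ is expressed as:

$$P(x) = \int_{0}^{\infty} \Pr(x \mid \lambda)\, f(\lambda;\Lambda)\, d\lambda$$

where $f(\lambda;\Lambda)$ is the $\mathrm{PH\text{-}OPL}(\Lambda)$ density. For $\lambda > 0$ and $\Lambda$ an $M \times M$ matrix:

$$\begin{aligned}
P(x) &= \int_{0}^{\infty} \frac{e^{-\lambda}\lambda^{x}}{x!}\,\frac{\Lambda^{2}}{I+\Lambda}\,(1+\lambda)\,e^{-\Lambda\lambda}\, d\lambda \\
&= \frac{\Lambda^{2}}{I+\Lambda}\int_{0}^{\infty}\left[\frac{\lambda^{x}}{x!}e^{-\lambda(I+\Lambda)} + \frac{\lambda^{x+1}}{x!}e^{-\lambda(I+\Lambda)}\right] d\lambda \\
&= \vec{\gamma}\,\frac{\Lambda^{2}}{(I+\Lambda)^{x+3}}\,\{(x+2)I+\Lambda\}\,\vec{1}^{T}
\end{aligned} \tag{3}$$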
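As a quick numerical sanity check of Equation (2), the sketch below works in the scalar case only (M = 1, with the matrix Λ replaced by a single hypothetical rate `theta` and γ taken as 1). The function names, the value of `theta`, and the integration settings are illustrative assumptions and are not taken from the paper.

```python
import math


def oppl_pmf(x: int, theta: float) -> float:
    """Scalar (M = 1) case of Equation (2)."""
    # theta^2 (x + 2 + theta) / (1 + theta)^(x + 3), evaluated in log space
    return theta**2 * (x + 2 + theta) * math.exp(-(x + 3) * math.log1p(theta))


def oppl_pmf_by_mixing(x: int, theta: float, upper: float = 40.0,
                       steps: int = 40_000) -> float:
    """Trapezoid-rule value of the Poisson-Lindley mixing integral."""
    def integrand(lam: float) -> float:
        if lam == 0.0:
            poisson = 1.0 if x == 0 else 0.0
        else:
            poisson = math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))
        lindley = theta**2 / (1.0 + theta) * (1.0 + lam) * math.exp(-theta * lam)
        return poisson * lindley

    h = upper / steps
    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, steps))
    return total * h


if __name__ == "__main__":
    theta = 1.5  # hypothetical rate, not estimated from the paper's data
    print("total mass:", sum(oppl_pmf(x, theta) for x in range(400)))
    print("closed form vs integral:", oppl_pmf(3, theta), oppl_pmf_by_mixing(3, theta))
```

The default integration range suits moderately large `theta`; a smaller rate would need a larger `upper` because the Lindley density then decays more slowly.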

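Properties of Phase Type One Parameter Poisson Lindley Distribution

The $r^{th}$ moment of the PH-OPPL distribution is given by:

$$\begin{aligned}
E(X^{r}) &= \int_{0}^{\infty} x^{r} f(x,\Lambda)\, dx = \frac{\Lambda^{2}}{I+\Lambda}\int_{0}^{\infty} x^{r} e^{-\Lambda x}(1+x)\, dx \\
&= \frac{\Lambda^{2}}{I+\Lambda}\left[\frac{\Gamma(r+1)}{\Lambda^{r+1}} + \frac{\Gamma(r+2)}{\Lambda^{r+2}}\right] \\
&= \vec{\gamma}\,\frac{r!\,[(r+1)I+\Lambda]}{\Lambda^{r}(\Lambda+I)}\,\vec{1}^{T}
\end{aligned} \tag{4}$$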

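The expectation and variance of the PH-OPPL distribution can be obtained from Equation (4) as:

1) Expectation

$$E(x) = \frac{1!\,[(1+1)I+\Lambda]}{\Lambda(\Lambda+I)} = \vec{\gamma}\,\frac{2I+\Lambda}{\Lambda(\Lambda+I)}\,\vec{1}^{T} \tag{5}$$

2) Variance

Because $X \mid \lambda$ is Poisson, the variance of $X$ is the variance of the mixing distribution plus its mean:

$$\begin{aligned}
Var(x) &= \frac{2!\,[(2+1)I+\Lambda]}{\Lambda^{2}(\Lambda+I)} - \left\{\frac{2I+\Lambda}{\Lambda(\Lambda+I)}\right\}^{2} + \frac{2I+\Lambda}{\Lambda(\Lambda+I)} \\
&= \vec{\gamma}\left[\frac{2I+4\Lambda+\Lambda^{2}}{\Lambda^{2}(\Lambda+I)^{2}} + \frac{2I+\Lambda}{\Lambda(\Lambda+I)}\right]\vec{1}^{T}
\end{aligned} \tag{6}$$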
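The mean and variance can be checked numerically in the scalar case (M = 1, γ = 1). The sketch below sums the scalar PH-OPPL probabilities directly and compares the result with the closed forms, where the variance is the Lindley variance plus the mean. The rate `theta` and all function names are hypothetical choices for illustration.

```python
import math


def oppl_pmf(x: int, theta: float) -> float:
    """Scalar one parameter Poisson Lindley probability."""
    return theta**2 * (x + 2 + theta) * math.exp(-(x + 3) * math.log1p(theta))


def moments_from_pmf(theta: float, terms: int = 500):
    """Mean and variance obtained by summing the pmf term by term."""
    mean = sum(x * oppl_pmf(x, theta) for x in range(terms))
    second = sum(x * x * oppl_pmf(x, theta) for x in range(terms))
    return mean, second - mean**2


def oppl_mean(theta: float) -> float:
    """Scalar form of Equation (5)."""
    return (theta + 2.0) / (theta * (theta + 1.0))


def oppl_variance(theta: float) -> float:
    """Scalar form of Equation (6): mixing variance plus mixing mean."""
    mixing_var = (theta**2 + 4.0 * theta + 2.0) / (theta**2 * (theta + 1.0) ** 2)
    return mixing_var + oppl_mean(theta)


if __name__ == "__main__":
    theta = 0.8  # hypothetical rate
    print(moments_from_pmf(theta))
    print(oppl_mean(theta), oppl_variance(theta))
```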

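The probability generating function of the PH-OPPL distribution is given by:

$$\begin{aligned}
G(s) &= \int_{0}^{\infty} e^{-\lambda(1-s)}\,\frac{\Lambda^{2}}{I+\Lambda}\,(1+\lambda)\,e^{-\Lambda\lambda}\, d\lambda \\
&= \frac{\Lambda^{2}}{I+\Lambda}\left[\int_{0}^{\infty}\lambda\, e^{-\lambda(I+\Lambda-sI)}\, d\lambda + \int_{0}^{\infty} e^{-\lambda(\Lambda+I-sI)}\, d\lambda\right] \\
&= \vec{\gamma}\,\frac{\Lambda^{2}}{I+\Lambda}\left[\frac{\Lambda+(2-s)I}{[\Lambda+(1-s)I]^{2}}\right]\vec{1}^{T}
\end{aligned} \tag{7}$$

The parameter $\Lambda$ of the PH-OPPL distribution is estimated using the continuous Chapman-Kolmogorov equation.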

Definition 2. A random variable $X$ is said to follow a phase type two parameter Poisson Lindley (PH-TPPL) distribution if:

$$X \mid \lambda \sim \mathrm{Po}(\lambda)$$

$$\lambda \mid \Lambda, \alpha \sim \mathrm{PH\text{-}TPL}(\Lambda, \alpha)$$

for $\alpha > 0$, $\lambda > 0$, where $\Lambda$ is an $M \times M$ matrix.

Theorem 2. If $X \sim \mathrm{PH\text{-}TPPL}$, then the probability mass function of $X$ is expressed as:

$$f(x;\Lambda,\alpha) = \vec{\gamma}\,\frac{\Lambda^{2}}{(I+\Lambda)^{x+2}}\left[I + \frac{(\alpha+x)I}{\alpha\Lambda+I}\right]\vec{1}^{T} \tag{8}$$

where $\alpha > 0$, $\Lambda$ is $M \times M$ and $I$ is an identity matrix.

Proof:

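If $X \mid \lambda \sim \mathrm{Po}(\lambda)$ and $\lambda \mid \Lambda, \alpha \sim \mathrm{PH\text{-}TPL}(\Lambda, \alpha)$, then the pmf of the variable $X$ is given by:

$$P(x) = \int_{0}^{\infty} \Pr(X = x \mid \lambda)\, f(\lambda;\Lambda,\alpha)\, d\lambda$$

where $f(\lambda;\Lambda,\alpha)$ is the $\mathrm{PH\text{-}TPL}(\Lambda,\alpha)$ density. For $\lambda > 0$ and $\Lambda$ an $M \times M$ matrix:

$$\begin{aligned}
P(x) &= \int_{0}^{\infty} \frac{e^{-\lambda}\lambda^{x}}{x!}\,\frac{\Lambda^{2}}{I+\Lambda\alpha}\,(\alpha+\lambda)\,e^{-\Lambda\lambda}\, d\lambda \\
&= \frac{\Lambda^{2}}{I+\Lambda\alpha}\left[\int_{0}^{\infty}\frac{\alpha\lambda^{x}}{x!}e^{-\lambda(I+\Lambda)}\, d\lambda + \int_{0}^{\infty}\frac{\lambda^{x+1}}{x!}e^{-\lambda(I+\Lambda)}\, d\lambda\right] \\
&= \frac{\Lambda^{2}}{\alpha\Lambda+I}\left[\frac{\alpha\,\Gamma(x+1)\,I}{x!\,(I+\Lambda)^{x+1}} + \frac{\Gamma(x+2)}{x!\,(I+\Lambda)^{x+2}}\right] \\
&= \vec{\gamma}\,\frac{\Lambda^{2}}{(I+\Lambda)^{x+2}}\left[I + \frac{(\alpha+x)I}{\alpha\Lambda+I}\right]\vec{1}^{T}
\end{aligned} \tag{9}$$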
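The two parameter result can be checked the same way in the scalar case (M = 1, γ = 1). The sketch below evaluates the scalar form of Equation (8), confirms that the probabilities sum to one, and compares the mean with the scalar form of the expectation, $(\alpha\theta+2)/(\theta(\alpha\theta+1))$. Setting `alpha = 1` recovers the one parameter probabilities. The parameter values and function names are hypothetical.

```python
import math


def tppl_pmf(x: int, theta: float, alpha: float) -> float:
    """Scalar (M = 1) case of Equation (8)."""
    base = theta**2 * math.exp(-(x + 2) * math.log1p(theta))
    return base * (1.0 + (alpha + x) / (alpha * theta + 1.0))


def tppl_mean(theta: float, alpha: float) -> float:
    """Scalar form of the PH-TPPL expectation."""
    return (alpha * theta + 2.0) / (theta * (alpha * theta + 1.0))


if __name__ == "__main__":
    theta, alpha = 1.2, 0.7  # hypothetical parameter values
    probs = [tppl_pmf(x, theta, alpha) for x in range(400)]
    print("total mass:", sum(probs))
    print("mean from pmf:", sum(x * p for x, p in enumerate(probs)))
    print("mean from formula:", tppl_mean(theta, alpha))
```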

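Properties of Phase Type Two Parameter Poisson Lindley Distribution

The $r^{th}$ moment of the PH-TPPL distribution is given by:

$$\begin{aligned}
E(X^{r}) &= \int_{0}^{\infty} x^{r} f(x,\Lambda,\alpha)\, dx \\
&= \int_{0}^{\infty}\left[\sum_{x=0}^{\infty} x^{r}\,\frac{e^{-\lambda}\lambda^{x}}{x!}\right]\frac{\Lambda^{2}}{I+\alpha\Lambda}\,(\alpha+\lambda)\,e^{-\Lambda\lambda}\, d\lambda \\
&= \frac{\Lambda^{2}}{I+\Lambda\alpha}\left[\int_{0}^{\infty}\alpha\lambda^{r} e^{-\Lambda\lambda}\, d\lambda + \int_{0}^{\infty}\lambda^{r+1} e^{-\Lambda\lambda}\, d\lambda\right] \\
&= \frac{\Lambda^{2}}{I+\Lambda\alpha}\left[\frac{\alpha\,\Gamma(r+1)\,I}{\Lambda^{r+1}} + \frac{\Gamma(r+2)}{\Lambda^{r+2}}\right] \\
&= \vec{\gamma}\,\frac{\Gamma(r+1)\,I}{\Lambda^{r}}\,\frac{\alpha\Lambda+(r+1)I}{\alpha\Lambda+I}\,\vec{1}^{T}
\end{aligned} \tag{10}$$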

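The expectation and variance of the PH-TPPL distribution can be obtained from Equation (10) as:

1) Expectation

$$E(x) = \frac{\Lambda^{2}}{\alpha\Lambda+I}\int_{0}^{\infty}\lambda(\alpha+\lambda)\,e^{-\Lambda\lambda}\, d\lambda = \vec{\gamma}\,\frac{2I+\Lambda\alpha}{\Lambda(\Lambda\alpha+I)}\,\vec{1}^{T} \tag{11}$$

2) Variance

$$Var(x) = E(x^{2}) - [E(x)]^{2}$$

where

$$E(x^{2}) = \frac{\alpha\Lambda+2I}{\Lambda(\alpha\Lambda+I)} + \frac{2(\alpha\Lambda+3I)}{\Lambda^{2}(\alpha\Lambda+I)}$$

so that

$$Var(x) = \vec{\gamma}\left[\frac{\alpha\Lambda+2I}{\Lambda(\alpha\Lambda+I)} + \frac{2(\alpha\Lambda+3I)}{\Lambda^{2}(\alpha\Lambda+I)} - \left[\frac{2I+\Lambda\alpha}{\Lambda(\Lambda\alpha+I)}\right]^{2}\right]\vec{1}^{T} \tag{12}$$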

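The probability generating function of the PH-TPPL distribution is given by:

$$\begin{aligned}
G(s) &= \frac{\Lambda^{2}}{(\Lambda+I)^{2}}\sum_{x=0}^{\infty}\left[\frac{sI}{\Lambda+I}\right]^{x} + \frac{\Lambda^{2}}{(\Lambda+I)^{2}(\alpha\Lambda+I)}\sum_{x=0}^{\infty}(\alpha+x)\left[\frac{s}{\Lambda+I}\right]^{x} \\
&= \vec{\gamma}\,\frac{\alpha\Lambda\,[\Lambda+(1-s)I]+\Lambda^{2}}{(\alpha\Lambda+I)\,[\Lambda+(1-s)I]^{2}}\,\vec{1}^{T}
\end{aligned} \tag{13}$$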

Since the value of $\Lambda$ is known, the value of $\alpha$ can be obtained from Equation (11) if the value of $E(x)$ is known.
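In the scalar case (M = 1, γ = 1), Equation (11) reads $E(x) = (\alpha\theta+2)/(\theta(\alpha\theta+1))$ and can be inverted for $\alpha$ in closed form. The sketch below shows that inversion. The function names and parameter values are hypothetical, and the matrix case would require the corresponding matrix algebra instead.

```python
def tppl_mean(theta: float, alpha: float) -> float:
    """Scalar form of Equation (11)."""
    return (alpha * theta + 2.0) / (theta * (alpha * theta + 1.0))


def alpha_from_mean(mean: float, theta: float) -> float:
    """Solve the scalar form of Equation (11) for alpha.

    A positive solution exists only when 1 < mean * theta < 2,
    because the scalar mean lies strictly between 1/theta and 2/theta.
    """
    scaled = mean * theta
    if not 1.0 < scaled < 2.0:
        raise ValueError("mean * theta must lie strictly between 1 and 2")
    return (2.0 - scaled) / (theta * (scaled - 1.0))


if __name__ == "__main__":
    theta, alpha = 1.2, 0.7  # hypothetical parameter values
    observed_mean = tppl_mean(theta, alpha)
    print(alpha_from_mean(observed_mean, theta))
```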

The matrix $\Lambda$ was determined using the continuous Chapman-Kolmogorov equation for cancer data in Kenya, and $\vec{\gamma}$ is the vector of stationary probabilities obtained using the formula $\pi_k = \pi_0 \Lambda^{k}$. The three state Markov model represents cancer patients who transition through the Healthy-Leukemia-Dead states, the four state model represents patients who transition through the Healthy-Liver-Colon-Dead states, the five state model represents the Healthy-Stomach-Pharynx-Colon-Dead states and the six state model represents patients transitioning through the Healthy-Oesophagus-Stomach-Lung-Kidney-Dead states. The values of $\Lambda$ for the three, four, five and six state models, respectively, are:

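$$\begin{bmatrix}
0.8783 & 0.1217 & 0 \\
0 & 0.3938 & 0.6062 \\
0 & 0 & 1.0000
\end{bmatrix}$$

$$\begin{bmatrix}
0.7900 & 0.2100 & 0 & 0 \\
0 & 0.2898 & 0.7102 & 0 \\
0 & 0 & 0.8985 & 0.1015 \\
0 & 0 & 0 & 1.0000
\end{bmatrix}$$

$$\begin{bmatrix}
0.8364 & 0.1636 & 0 & 0 & 0 \\
0 & 0.3892 & 0.6108 & 0 & 0 \\
0 & 0 & 0.6688 & 0.3312 & 0 \\
0 & 0 & 0 & 0.5524 & 0.4476 \\
0 & 0 & 0 & 0 & 1.0000
\end{bmatrix}$$

$$\begin{bmatrix}
0.4851 & 0.5149 & 0 & 0 & 0 & 0 \\
0 & 0.1223 & 0.8777 & 0 & 0 & 0 \\
0 & 0 & 0.1533 & 0.8467 & 0 & 0 \\
0 & 0 & 0 & 0.4410 & 0.5590 & 0 \\
0 & 0 & 0 & 0 & 0.8668 & 0.1332 \\
0 & 0 & 0 & 0 & 0 & 1.0000
\end{bmatrix}$$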
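The relation $\pi_k = \pi_0 \Lambda^{k}$ can be illustrated with the three state (Healthy-Leukemia-Dead) matrix. The sketch below assumes, purely for illustration, that every patient starts in the Healthy state, so $\pi_0 = (1, 0, 0)$; that starting vector and the function names are not taken from the paper.

```python
from typing import List

# Three state transition matrix (Healthy, Leukemia, Dead) as reported above.
LAMBDA_3 = [
    [0.8783, 0.1217, 0.0],
    [0.0, 0.3938, 0.6062],
    [0.0, 0.0, 1.0],
]


def step(dist: List[float], matrix: List[List[float]]) -> List[float]:
    """One application of pi <- pi * Lambda (row vector times matrix)."""
    size = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]


def state_distribution(pi0: List[float], matrix: List[List[float]], k: int) -> List[float]:
    """Return pi_k = pi_0 * Lambda^k by repeated multiplication."""
    dist = list(pi0)
    for _ in range(k):
        dist = step(dist, matrix)
    return dist


if __name__ == "__main__":
    pi0 = [1.0, 0.0, 0.0]  # assumption: all patients start Healthy
    for k in (1, 5, 20):
        print(k, state_distribution(pi0, LAMBDA_3, k))
```

Because Dead is an absorbing state in this matrix, the probability mass moves into the last state as `k` grows.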

[Figure: shape of the probability function of the phase type one parameter Poisson Lindley distribution]

[Figure: shape of the probability function of the phase type two parameter Poisson Lindley distribution]

In the actuarial field, a compound distribution describes the total losses in a group of insurance policies. In this section we develop compound phase type distributions (CPHD), which can be used to model secondary cancer cases.

Definition 3. Let $N$ be a random variable with probability generating function $F(S)$ and let $X_1, \cdots, X_N$ be a set of iid random variables with a common probability generating function $G(S)$, independent of $N$. Then the probability generating function of the compound distribution is expressed as:

$$H(S) = F[G(S)] \tag{14}$$

Unlike ordinary compound distributions, which do not consider the transition phases of diseases, CPHD incorporates the transition states. Probability generating functions of compound distributions can be derived by composing the probability generating functions of the two distributions, as shown in Equation (14).
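The composition in Equation (14) can be illustrated numerically. The sketch below uses the scalar (M = 1, γ = 1) one parameter Poisson Lindley pgf as $F$ and a small discrete severity distribution as $G$, then checks that $H(1) = 1$ and that the derivative $H'(1)$ matches $E(N)\,E(X)$. The severity probabilities, the rate `theta`, and the function names are hypothetical.

```python
from typing import Sequence


def oppl_pgf(s: float, theta: float) -> float:
    """Scalar one parameter Poisson Lindley pgf, valid for s < theta + 1."""
    return theta**2 * (theta + 2.0 - s) / ((theta + 1.0) * (theta + 1.0 - s) ** 2)


def severity_pgf(s: float, probs: Sequence[float]) -> float:
    """pgf of a discrete severity with P(X = j) = probs[j]."""
    return sum(p * s**j for j, p in enumerate(probs))


def compound_pgf(s: float, theta: float, probs: Sequence[float]) -> float:
    """H(s) = F[G(s)] as in Equation (14)."""
    return oppl_pgf(severity_pgf(s, probs), theta)


def compound_mean(theta: float, probs: Sequence[float], h: float = 1e-5) -> float:
    """Central-difference estimate of H'(1), the mean aggregate loss."""
    return (compound_pgf(1.0 + h, theta, probs) - compound_pgf(1.0 - h, theta, probs)) / (2.0 * h)


if __name__ == "__main__":
    theta = 1.5                      # hypothetical frequency rate
    severity = [0.0, 0.5, 0.3, 0.2]  # hypothetical P(X = 0), ..., P(X = 3)
    print(compound_pgf(1.0, theta, severity))
    print(compound_mean(theta, severity))
```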

Theorem 3 (Compound one parameter Poisson Lindley distribution). If $N \sim \mathrm{PH\text{-}OPPL}(\Lambda)$, the pgf of the compound distribution is:

$$H(S) = \vec{\gamma}\,\frac{\Lambda^{2}}{I+\Lambda}\left[\frac{\Lambda+\left(2-L_x[G(S)]\right)I}{\left[\Lambda+\left(1-L_x[G(S)]\right)I\right]^{2}}\right]\vec{1}^{T} \tag{15}$$

where $L_x[G(S)]$ is the Laplace transform of the severity distribution, used because the pgf is not available for most continuous distributions.

Proof:

$$H(S) = F[G(S)] = F\left[L_x[G(S)]\right] = \vec{\gamma}\,\frac{\Lambda^{2}}{I+\Lambda}\left[\frac{\Lambda+\left(2-L_x[G(S)]\right)I}{\left[\Lambda+\left(1-L_x[G(S)]\right)I\right]^{2}}\right]\vec{1}^{T} \tag{16}$$

Theorem 4 (Compound two parameter Poisson Lindley distribution). If $N \sim \mathrm{PH\text{-}TPPL}(\Lambda, \alpha)$, the pgf of the compound distribution is:

$$H(S) = \vec{\gamma}\,\frac{\alpha\Lambda\left[\Lambda+\left(1-L_x[G(S)]\right)I\right]+\Lambda^{2}}{(\alpha\Lambda+I)\left[\Lambda+\left(1-L_x[G(S)]\right)I\right]^{2}}\,\vec{1}^{T} \tag{17}$$

where $L_x[G(S)]$ is as defined in Theorem 3.

Proof:

$$H(S) = F[G(S)] = F\left[L_x[G(S)]\right] = \vec{\gamma}\,\frac{\alpha\Lambda\left[\Lambda+\left(1-L_x[G(S)]\right)I\right]+\Lambda^{2}}{(\alpha\Lambda+I)\left[\Lambda+\left(1-L_x[G(S)]\right)I\right]^{2}}\,\vec{1}^{T} \tag{18}$$

The continuous distributions considered in this research are the Weibull, Pareto and Generalized Pareto distributions; their Laplace transforms are derived and substituted into Equations (16) and (18) to obtain the pgfs of the compound distributions based on the PH-OPPL and PH-TPPL distributions respectively. The Laplace transforms of the Weibull, Pareto and Generalized Pareto distributions are derived as:

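1) Weibull distribution

$$L_x(S) = E[e^{-sx}]$$

$$\begin{aligned}
L_x G(S) &= \int_{0}^{\infty} e^{-sx}\,\frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta-1} e^{-\left(\frac{x}{\alpha}\right)^{\beta}}\, dx \\
&= \frac{\beta}{\alpha}\int_{0}^{\infty}\left(\frac{x}{\alpha}\right)^{\beta-1} e^{-\frac{x}{\alpha}\left[s\alpha+\left(\frac{x}{\alpha}\right)^{\beta-1}\right]}\, dx \\
&= \frac{\beta}{\alpha}\,\frac{\Gamma(\beta)}{\left[s\alpha+\left(\frac{x}{\alpha}\right)^{\beta-1}\right]^{\beta}}
\end{aligned} \tag{19}$$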

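2) Pareto distribution

$$L_x(S) = E[e^{-sx}]$$

$$\begin{aligned}
L_x G(S) &= \alpha\beta^{\alpha}\int_{0}^{\infty}\frac{e^{-sx}}{(x+\beta)^{\alpha+1}}\, dx \\
&= \frac{\alpha}{\beta}\int_{0}^{\infty} e^{-\beta x}\sum_{k=0}^{\infty}\binom{-(\alpha+1)}{k}\left(\frac{x}{\beta}\right)^{k} dx \\
&= \frac{\alpha}{\beta}\sum_{k=0}^{\infty}(-1)^{k}\,\frac{\Gamma(\alpha+k)}{k!\,\Gamma(\alpha)}\,\frac{k!}{\beta^{2k+1}} \\
&= \sum_{k=0}^{\infty}(-1)^{k}\,\frac{\alpha\,\Gamma(\alpha+k)}{\Gamma(\alpha)\,\beta^{2k+2}}
\end{aligned} \tag{20}$$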

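3) Generalized Pareto distribution

$$L_X(S) = E[e^{-sx}]$$

$$\begin{aligned}
L_X G(S) &= \int_{0}^{\infty} e^{-sx}\,\frac{x^{\alpha-1}}{\beta(\alpha,\gamma)\,(x+\lambda)^{\alpha+\gamma}}\, dx \\
&= \frac{1}{\lambda^{\gamma}\beta(\alpha,\gamma)}\int_{0}^{\infty} x^{\alpha} e^{-sx}\sum_{k=0}^{\infty}\binom{\alpha+\gamma}{k}\frac{x^{k}}{\lambda^{k}}\, dx \\
&= \frac{1}{\lambda^{\gamma}\beta(\alpha,\gamma)}\sum_{k=0}^{\infty}\frac{-(\alpha+\gamma)}{\lambda^{k}}\int_{0}^{\infty} x^{\gamma+k+1-1} e^{-sx}\, dx \\
&= \frac{1}{\lambda^{\gamma}\beta(\alpha,\gamma)}\sum_{k=0}^{\infty}\frac{-(\alpha+\gamma)}{\lambda^{k}}\,\frac{\Gamma(\alpha+k)}{s^{\alpha+k}}
\end{aligned} \tag{21}$$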

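Substituting Equations (19), (20) and (21) into Equation (16), the pgfs of the compound distributions of the PH one parameter Poisson Lindley with the Weibull, Pareto and Generalized Pareto distributions respectively are:

1) Compound PH-OPPL-Weibull distribution

$$H(S) = \vec{\gamma}\,\frac{\Lambda^{2}}{I+\Lambda}\left[\frac{\Lambda+\left(2-\frac{\beta}{\alpha}\frac{\Gamma(\beta)}{\left[s\alpha+\left(\frac{x}{\alpha}\right)^{\beta-1}\right]^{\beta}}\right)I}{\left[\Lambda+\left(1-\frac{\beta}{\alpha}\frac{\Gamma(\beta)}{\left[s\alpha+\left(\frac{x}{\alpha}\right)^{\beta-1}\right]^{\beta}}\right)I\right]^{2}}\right]\vec{1}^{T} \tag{22}$$

2) Compound PH-OPPL-Pareto distribution

$$H(S) = \vec{\gamma}\,\frac{\Lambda^{2}}{I+\Lambda}\left[\frac{\Lambda+\left(2-\sum_{k=0}^{\infty}(-1)^{k}\frac{\alpha\,\Gamma(\alpha+k)}{\Gamma(\alpha)\,\beta^{2k+2}}\right)I}{\left[\Lambda+\left(1-\sum_{k=0}^{\infty}(-1)^{k}\frac{\alpha\,\Gamma(\alpha+k)}{\Gamma(\alpha)\,\beta^{2k+2}}\right)I\right]^{2}}\right]\vec{1}^{T} \tag{23}$$

3) Compound PH-OPPL-Generalized Pareto distribution

$$H(S) = \vec{\gamma}\,\frac{\Lambda^{2}}{I+\Lambda}\left[\frac{\Lambda+\left(2-\frac{1}{\lambda^{\gamma}\beta(\alpha,\gamma)}\sum_{k=0}^{\infty}\frac{-(\alpha+\gamma)}{\lambda^{k}}\frac{\Gamma(\alpha+k)}{s^{\alpha+k}}\right)I}{\left[\Lambda+\left(1-\frac{1}{\lambda^{\gamma}\beta(\alpha,\gamma)}\sum_{k=0}^{\infty}\frac{-(\alpha+\gamma)}{\lambda^{k}}\frac{\Gamma(\alpha+k)}{s^{\alpha+k}}\right)I\right]^{2}}\right]\vec{1}^{T} \tag{24}$$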

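Substituting Equations (19), (20) and (21) into Equation (18), the pgfs of the compound distributions of the PH two parameter Poisson Lindley with the Weibull, Pareto and Generalized Pareto distributions respectively are:

1) Compound PH-TPPL-Weibull distribution

$$H(S) = \vec{\gamma}\,\frac{\alpha\Lambda\left[\Lambda+\left(1-\frac{\beta}{\alpha}\frac{\Gamma(\beta)}{\left[s\alpha+\left(\frac{x}{\alpha}\right)^{\beta-1}\right]^{\beta}}\right)I\right]+\Lambda^{2}}{(\alpha\Lambda+I)\left[\Lambda+\left(1-\frac{\beta}{\alpha}\frac{\Gamma(\beta)}{\left[s\alpha+\left(\frac{x}{\alpha}\right)^{\beta-1}\right]^{\beta}}\right)I\right]^{2}}\,\vec{1}^{T} \tag{25}$$

2) Compound PH-TPPL-Pareto distribution

$$H(S) = \vec{\gamma}\,\frac{\alpha\Lambda\left[\Lambda+\left(1-\sum_{k=0}^{\infty}(-1)^{k}\frac{\alpha\,\Gamma(\alpha+k)}{\Gamma(\alpha)\,\beta^{2k+2}}\right)I\right]+\Lambda^{2}}{(\alpha\Lambda+I)\left[\Lambda+\left(1-\sum_{k=0}^{\infty}(-1)^{k}\frac{\alpha\,\Gamma(\alpha+k)}{\Gamma(\alpha)\,\beta^{2k+2}}\right)I\right]^{2}}\,\vec{1}^{T} \tag{26}$$

3) Compound PH-TPPL-Generalized Pareto distribution

$$H(S) = \vec{\gamma}\,\frac{\alpha\Lambda\left[\Lambda+\left(1-\frac{1}{\lambda^{\gamma}\beta(\alpha,\gamma)}\sum_{k=0}^{\infty}\frac{-(\alpha+\gamma)}{\lambda^{k}}\frac{\Gamma(\alpha+k)}{s^{\alpha+k}}\right)I\right]+\Lambda^{2}}{(\alpha\Lambda+I)\left[\Lambda+\left(1-\frac{1}{\lambda^{\gamma}\beta(\alpha,\gamma)}\sum_{k=0}^{\infty}\frac{-(\alpha+\gamma)}{\lambda^{k}}\frac{\Gamma(\alpha+k)}{s^{\alpha+k}}\right)I\right]^{2}}\,\vec{1}^{T} \tag{27}$$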

The cancer data considered in this research were obtained from a medical facility in Kenya. The cancer transition states considered are Healthy-Leukemia-Dead for the three state model, Healthy-Liver-Colon-Dead for the four state model, Healthy-Stomach-Pharynx-Colon-Dead for the five state model and Healthy-Oesophagus-Stomach-Lung-Kidney-Dead for the six state model. The values of $\Lambda$ for the data are obtained using the continuous Chapman-Kolmogorov equations, expressed as:

$$p_{ij}(\phi_A, \gamma_t + \Psi_d) = \sum_{k=1}^{n} p_{ik}(\phi_A, \gamma_t)\, p_{kj}(\gamma_t, \gamma_t + \Psi_d)$$

$$\lim_{\Psi_d \to 0}\frac{p_{ij}(\phi_A, \gamma_t + \Psi_d) - p_{ij}(\phi_A, \gamma_t)}{\Psi_d} = \lim_{\Psi_d \to 0}\frac{\sum_{k \neq j} p_{ik}(\phi_A, \gamma_t)\, p_{kj}(\gamma_t, \gamma_t + \Psi_d) - p_{ij}(\phi_A, \gamma_t)\left[1 - p_{jj}(\gamma_t, \gamma_t + \Psi_d)\right]}{\Psi_d}$$

$$\frac{\partial}{\partial \gamma_t}\, p_{ij}(\phi_A, \gamma_t) = \sum_{k \neq j} p_{ik}(\phi_A, \gamma_t)\, \Im_{kj} - p_{ij}(\phi_A, \gamma_t)\, \Im_{j}$$

$$p_{ij}(\phi_A, \gamma_t) = 1 - e^{-\Im_{ij}(\phi_A)\, t} \tag{28}$$

where:

$$\lim_{\Psi_d \to 0}\frac{p_{kj}(\gamma_t, \gamma_t + \Psi_d)}{\Psi_d} = \Im_{kj}$$

$$\lim_{\Psi_d \to 0}\frac{1 - p_{jj}(\gamma_t, \gamma_t + \Psi_d)}{\Psi_d} = \Im_{j}$$
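A small numerical illustration may help connect Equation (28) to the Chapman-Kolmogorov relation. The sketch below assumes a hypothetical two state chain (Alive, Dead) with a constant transition intensity `rate`, so that $p_{01}(t) = 1 - e^{-\text{rate}\cdot t}$, and checks that $P(s+t) = P(s)P(t)$. The intensity value and the function names are illustrative and are not estimates from the cancer data.

```python
import math
from typing import List

Matrix = List[List[float]]


def transition_matrix(rate: float, t: float) -> Matrix:
    """P(t) for a two state chain with one absorbing state."""
    stay = math.exp(-rate * t)
    # Row 0: remain Alive, move to Dead (the form of Equation (28)).
    # Row 1: Dead is absorbing.
    return [[stay, 1.0 - stay], [0.0, 1.0]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two square matrices."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


if __name__ == "__main__":
    rate = 0.3  # hypothetical constant intensity
    lhs = transition_matrix(rate, 1.2 + 2.5)
    rhs = mat_mul(transition_matrix(rate, 1.2), transition_matrix(rate, 2.5))
    print(lhs)
    print(rhs)
```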

The values of Λ for three, four, five and six state using the data obtained were as shown in Section 2.3.

The severity distributions considered in this research are the Weibull, Pareto and Generalized Pareto distributions. The DFT requires the severity probabilities to be discrete; hence they are discretized using the method of mass rounding, which is expressed as:

$$\begin{aligned}
f_0 &= F_J\!\left(\frac{h}{2}\right) \\
f_x &= F_J\!\left(xh + \frac{h}{2}\right) - F_J\!\left(xh - \frac{h}{2}\right), \quad x = 1, 2, 3, \cdots, m-1 \\
f_m &= 1 - F_J\!\left(mh - \frac{h}{2}\right)
\end{aligned}$$
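The rounding scheme can be sketched in a few lines. The example below discretizes a Weibull severity onto the points $0, h, 2h, \ldots, mh$; the scale, shape, span `h` and number of points are hypothetical values chosen only to show that the resulting probabilities are non-negative and sum to one.

```python
import math
from typing import Callable, List


def weibull_cdf(x: float, scale: float, shape: float) -> float:
    """Weibull distribution function F(x) = 1 - exp(-(x / scale)^shape)."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def discretize_by_rounding(cdf: Callable[[float], float], h: float, m: int) -> List[float]:
    """Method of mass rounding on the grid 0, h, ..., m*h."""
    probs = [cdf(h / 2.0)]                      # f_0
    for x in range(1, m):                       # f_1, ..., f_{m-1}
        probs.append(cdf(x * h + h / 2.0) - cdf(x * h - h / 2.0))
    probs.append(1.0 - cdf(m * h - h / 2.0))    # f_m collects the right tail
    return probs


if __name__ == "__main__":
    # Hypothetical Weibull parameters and grid, eight probabilities in total.
    probs = discretize_by_rounding(lambda x: weibull_cdf(x, 2.0, 1.5), h=0.5, m=7)
    print(probs)
    print(sum(probs))
```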

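The pdfs of the Weibull, Pareto and Generalized Pareto distributions respectively are expressed as:

$$f(x) = \frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta-1} e^{-\left(\frac{x}{\alpha}\right)^{\beta}}; \quad x > 0; \quad \alpha, \beta > 0$$

$$f(x) = \frac{\alpha\beta^{\alpha}}{(x+\beta)^{\alpha+1}}$$

$$f(x) = \frac{\Gamma(\alpha+\gamma)}{\Gamma(\gamma)\,\Gamma(\alpha)}\,\frac{x^{\alpha-1}\lambda^{\gamma}}{(x+\lambda)^{\alpha+\gamma}}$$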

The frequency and severity probabilities for secondary cancer cases are:

States | PH-OPPL | PH-TPPL | Weibull | Pareto | Gen Pareto |
---|---|---|---|---|---|
3 state | 0.011 | 0.0078 | 0.1506 | 0.1099 | 0.0192 |
4 state | 0.0082 | 0.0054 | 0.1266 | 0.0591 | 0.0091 |
5 state | 0.0083 | 0.0055 | 0.1703 | 0.1904 | 0.0429 |
6 state | 0.0083 | 0.0054 | 0.1775 | 0.2488 | 0.0730 |

There are different numerical methods used in the estimation of aggregate losses, such as Monte Carlo simulation, the Panjer recursive model, Fourier transforms and direct numerical integration. The Panjer recursive model is applicable when the claim frequency distribution belongs to either the Panjer $(a, b, 0)$ class or the $(a, b, 1)$ class. In this study we consider the Discrete Fourier Transform (DFT) in the estimation of the aggregate losses. Robertson (1992) applied Fourier transforms in the computation of aggregate losses [

The DFT algorithm for aggregate losses requires computing the DFT of the frequency and the DFT of the severity separately.

Definition 4 (Discrete Fourier Transform). Let $X(n)$, $n = 0, 1, \cdots, N-1$, be the severity or frequency probabilities of the claim data. The Discrete Fourier Transform of $X(n)$ is the mapping:

$$X(k) = \sum_{n=0}^{N-1} X(n)\, e^{-i 2\pi k n / N}, \quad k = 0, 1, 2, \cdots, N-1 \tag{29}$$

Expression (29) can be expanded using Euler's formula, giving:

$$X(k) = \sum_{n=0}^{N-1} X(n)\left[\cos\!\left(\frac{2\pi k n}{N}\right) - i\sin\!\left(\frac{2\pi k n}{N}\right)\right]$$

$$X(k) = \sum_{n=0}^{N-1} X(n)\, W_N^{kn} \tag{30}$$

where $W_N = e^{-i 2\pi / N}$. This is the DFT of the severity or frequency probabilities. The severity and frequency probabilities are of length 8, hence $W_N$ must be a primitive 8^{th} root of unity and Equation (30) can be rewritten as:

$$X(k) = \sum_{n=0}^{7} X(n)\, W_N^{kn} \tag{31}$$

The frequency and severity probabilities are each padded with as many zeros as they have elements in order to perform a no-wrap convolution. The DFT algorithm is as follows:

1) Multiply the matrix $W_N^{kn}$ with the frequency or severity probabilities to get the DFT of the frequency or severity probabilities.

2) Multiply the DFT of the frequency probabilities elementwise with the DFT of the severity probabilities, and then multiply the resulting vector with the matrix $W_N^{kn}$ again.

3) Take the real parts of the values, divide each value by the number of elements in the vector of the frequency or severity distribution, and arrange the resulting probabilities in reverse order except for the first probability.

4) The values corresponding to the original frequency and severity values are the aggregate loss probabilities.
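The four steps above can be sketched in pure Python. The example follows the steps as listed: it builds the matrix $W_N^{kn}$, pads both vectors with zeros, multiplies the two transforms elementwise, applies the matrix a second time, and then divides and reverses. Two assumptions are made here: the divisor is taken to be the padded length, and the input vectors are short hypothetical probabilities rather than the paper's data. Applying the forward transform twice, dividing by the length and reversing all but the first entry is equivalent to an inverse DFT, so the output equals the no-wrap convolution of the two input vectors.

```python
import cmath
from typing import List, Sequence


def dft_matrix(n: int) -> List[List[complex]]:
    """Matrix with entries W_N^(k*m), where W_N = exp(-2*pi*i / N)."""
    w = cmath.exp(-2j * cmath.pi / n)
    return [[w ** (k * m) for m in range(n)] for k in range(n)]


def mat_vec(matrix: Sequence[Sequence[complex]], vector: Sequence[complex]) -> List[complex]:
    """Matrix times column vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def convolve_via_dft(freq: Sequence[float], sev: Sequence[float]) -> List[float]:
    """Steps 1 to 3 of the listed algorithm for equal-length inputs."""
    size = len(freq)
    n = 2 * size                                   # padded length
    padded_f = list(freq) + [0.0] * size
    padded_s = list(sev) + [0.0] * size
    w = dft_matrix(n)
    f_hat = mat_vec(w, padded_f)                   # step 1
    s_hat = mat_vec(w, padded_s)
    product = [a * b for a, b in zip(f_hat, s_hat)]
    twice = mat_vec(w, product)                    # step 2
    scaled = [value.real / n for value in twice]   # step 3: real part, divide
    return [scaled[0]] + scaled[:0:-1]             # reverse all but the first


if __name__ == "__main__":
    freq = [0.5, 0.3, 0.2]  # hypothetical frequency probabilities
    sev = [0.1, 0.6, 0.3]   # hypothetical severity probabilities
    print(convolve_via_dft(freq, sev))
```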

The values of the aggregate loss probabilities using DFT are:

States | PH-OPPL/Wei | PH-OPPL/Par | PH-OPPL/Gen-par | PH-TPPL/Wei | PH-TPPL/Par | PH-TPPL/Gen-par |
---|---|---|---|---|---|---|
3 state | 0.00117468 | 0.00085722 | 0.00093696 | 0.0132528 | 0.00300027 | 0.00052416 |
4 state | 0.00180072 | 0.00105444 | 0.00104696 | 0.02102016 | 0.00488845 | 0.00082059 |
5 state | 0.00284028 | 0.00240871 | 0.0029879 | 0.00332912 | 0.0102232313 | 0.00201259 |
6 state | 0.00381366 | 0.00388731 | 0.00579889 | 0.04507728 | 0.01749645 | 0.00411375 |

The values of

Mixed phase type distributions are developed to model secondary cancer cases in Kenya. Unlike ordinary distributions, which do not incorporate the transitions between states, the distributions proposed here take the transition states into consideration while modeling claim frequency data. The distributions are based on the Poisson and Lindley distributions, where PH-OPPL-Weibull provided the best fit among the PH-OPPL models while PH-TPPL-Generalized Pareto provided the best fit among the PH-TPPL models. This approach improves the estimation of aggregate losses, as it incorporates the transition probabilities of the different states of cancer as well as the heterogeneity of claim data. This in turn improves the pricing of insurance policies for diseases which transition through different states, such as cancer, and hence improves the financial position of insurance firms through better estimation of their reserves. The model, however, is only applicable in risk theory for diseases which have multiple transition states. Further research can extend this study by factoring in patients who were censored, and the same study can be carried out for other diseases with transition states, such as HIV/AIDS.

The data used to support the findings of this study are available upon request.

The authors declare that there is no conflict of interest regarding the publication of this paper.

Mwende, C., Weke, P., Bundi, D. and Ottieno, J. (2021) Estimation of Aggregate Losses of Secondary Cancer Using PH-OPPL and PH-TPPL Distributions. Open Journal of Statistics, 11, 838-853. https://doi.org/10.4236/ojs.2021.115049