Pareto Distribution: A Probability Model in Social Research

Abstract

This methodological article aims to present the type I Pareto distribution in a clear and illustrative manner for better understanding among social researchers. It also provides R scripts for practical application. This continuous distribution, with its inverted J shape, skewness towards the right side, and heavy right tail, serves as an effective probability model for various social variables, such as wealth and income, as well as behaviors that are highly frequent in a few individuals and infrequent in the majority. The type I distribution, which has a scale parameter xm and a shape parameter α, is introduced, beginning with a brief historical overview. The density, cumulative distribution, tail, moment, and characteristic functions are presented. The article proceeds with descriptive measures, estimators based on the method of moments and maximum likelihood, its relationship with other distributions, and goodness-of-fit tests. This material is applied through two examples: one involving probability and descriptive measure calculations, and the other focused on parameter estimation and fit testing using the Kolmogorov-Smirnov and Anderson-Darling tests. Additionally, scripts were developed to perform the corresponding calculations in R, a freely available software. Simulated data were used in two examples illustrating the application of the distribution. Finally, suggestions for its use are provided.

Share and Cite:

Moral de la Rubia, J. (2025). Pareto Distribution: A Probability Model in Social Research. Open Journal of Social Sciences, 13, 86-121. doi: 10.4236/jss.2025.131007.

1. Introduction

The Pareto distribution is a non-normal continuous distribution that has significant applications in the social and behavioral sciences (Barnoy & Reich, 2022; Feng, Deng, Chen, Perc, & Kurths, 2020). However, its explanation is often theoretical and confusing, primarily due to its presentation under varying parameterizations that are frequently interchanged (Sarabia, Jorda, & Prieto, 2019). These include the classic two-parameter form, which is the most widely used (type I: scale parameter xm and shape parameter α); the three-parameter forms (type II: location parameter μ, xm, and α; and type III: μ, xm, and shape parameter γ); and the four-parameter form (type IV: μ, xm, α, and γ). Additionally, the distribution is predominantly developed with an economic focus (Barczy, Nedényi, & Sütő, 2023). The purpose of this article is, therefore, to present the type I Pareto distribution in a clear, comprehensible, and illustrative manner for social researchers, including psychologists, sociologists, and social workers.

The type I Pareto (1896; 1897) distribution was developed between 1896 and 1897 by the Italian civil engineer, economist, and sociologist Vilfredo Federico Damaso Pareto (1848-1923) during his political economy course at the University of Lausanne (UNIL) in Switzerland. It was initially conceived to describe the distribution of land, wealth, and income within a society. From these studies, the 80/20 principle was derived, positing that 80% of wealth is concentrated among 20% of the population. Notably, both the distribution and the inequality principle are now applied as probabilistic models for similar variables (Charpentier & Flachaire, 2022; Feng et al., 2020), although contemporary inequality levels tend to be lower, with a maximum ratio of 70/30 (McCarthy & Winer, 2019).

Furthermore, the Pareto distribution and inequality rule have found applications beyond economics, serving as models in fields such as psychology (Campbell & Brauer, 2021; Rajeev, 2022), virtual education (Valkanas & Diamandis, 2022), engineering (Chen, Zhang, Wang, Jiang, & Liu, 2019; Sudharson et al., 2022; Sudharson & Prabha, 2019), climatology (Le Gall, Favre, Naveau, & Prieur, 2022), and physics (Rácz et al., 2023; Cheng et al., 2023). For instance, claims to insurers for accidents and illnesses often follow a Pareto distribution, prompting these companies to impose disproportionately high premiums on certain sectors of the population, such as older adults (Diawara, Kane, Dembele, & Lo, 2021; Zhang, Wu, & Yao, 2022).

2. Characterization of the Distribution

2.1. Parameters and Support

This continuous distribution, belonging to the family of power distributions, is defined by two parameters in its simplest or type I form (Ahmad & Almetwally, 2020). The first parameter, a scale parameter often denoted as xm (though its notation varies widely), represents both the peak or mode and the minimum value of the distribution (Chattamvelli & Shanmugam, 2021). The second parameter, a shape parameter denoted as α (Fedotenkov, 2020), is commonly referred to as the Pareto tail index (Andria, 2022). The parameter space for both spans the interval (0, ∞), while the support of the distribution is defined over the interval [xm, ∞).

The profile of its density function depicts a curve that sharply decreases from its maximum value and becomes asymptotic to the horizontal axis, exhibiting positive skewness and leptokurtosis with a long right tail (Figure 1). The value of α that precisely models the 80/20 law is log4(5) = ln(5)/ln(4) ≈ 1.161, which is considered too extreme for wealth distributions, where α is typically greater than 1.5 (Yang & Zhou, 2022).

X~Pareto( x m ,α )

x m andα( 0, )andx [ x m , )

Figure 1. Density function fX(x) and cumulative distribution function FX(x) of hourly income modeled by a type I Pareto distribution with scale parameter xm = 40 and shape α = log4 (5).

2.2. Functions

2.2.1. Density Function

The definite integral of the density function allows the calculation of the probability that a value x of the continuous random variable X falls within a given integration interval, which must lie within the domain of the distribution. It is denoted by fX(x). The analytical expression of this function for the type I Pareto distribution is shown in Equation 1.

f X ( x )=α x m α x α+1 ,x x m (1)

2.2.2. Cumulative Distribution Function

This function provides the probability that a continuous random variable X takes a value between the lower limit of the distribution and a specified value x, where x belongs to the domain of the distribution. it is denoted by FX(x). The analytical expression of this function for the type I Pareto distribution is presented in Equation 2.

F X ( x )=P( Xx )= x m x f X ( x )dx =1 ( x m x ) α ,x x m (2)

2.2.3. Complementary Cumulative Distribution Function

Also known as the tail function, it is denoted by F ¯ X ( x ) . This function provides the probability that a continuous random variable X takes a value between a specified value x and the upper limit of the distribution, where x belongs to the domain of the distribution. The analytical expression for this function in the case of the type I Pareto distribution is presented in Equation 3.

F ¯ X ( x )=P( X>x )= x m f X ( x )dx = ( x m x ) α ,x x m (3)

2.2.4. Quantile Function

It is the inverse of the cumulative distribution function. It is denoted by QX(x) or F X 1 ( x ) . It provides the value of X that accumulates the probability corresponding to the given argument, where the argument takes values in the interval (0, 1). The analytical expression for this function in the case of the type I Pareto distribution is shown in Equation 4.

Q X ( p )= x m 1p α = x m ( 1p )   1 α ,p( 0,1 ) (4)

2.2.5. Moment Generating Function

The k-th derivative of this function, evaluated at the point 0, gives the k-th order moments of the distribution, where the k-th order moment is the mathematical expectation of the variable raised to the k-th power. It is denoted by MX(t). The moment generating function of the type I Pareto distribution is not defined for a non-trivial interval of t around 0, but it is defined for values of t ≤ 0. Its analytical expression is shown in Equation 5.

M X ( t )=E( e tx )= x m + e tx f X ( x )dx ={ α ( x m t ) α Γ( α, x m t ) t<0 1 t=0 Undefined t>0 (5)

Upper incomplete gamma function

Γ( α, x m t )= x m t u α1 e u du

2.2.6. Characteristic Function

The characteristic function corresponds to the Fourier transform (with a sign inversion) applied to the analytical expression of the density function. It allows for the analysis of the behavior of the moments of a distribution. It is denoted by CX(t). The analytical expression of this function for the type I Pareto distribution is shown in Equation 6, where i= 1 and t ∊ R.

C X ( t ):; C X ( t )=E( e itx )= x m e itx f X ( x )dx =α ( i x m t ) α Γ( α,i x m t ) (6)

2.3. Descriptive Measures

2.3.1. Measures of Central Tendency

Mathematical expectation or arithmetic mean: This measure corresponds to the first (non-central) moment. For the type I Pareto distribution, it can be calculated using its two parameters, as shown in Equation 7. Its computation requires the shape parameter α to be greater than 1.

μ( X )=E( X )= x m x f X ( x )dx = α α1 x m ,α>1 (7)

Geometric mean: It is the antilogarithm of the mathematical expectation of the logarithm of the values, typically using the natural base. For the type I Pareto distribution, it can be calculated from its two parameters, as shown in Equation 8.

μ g ( X )=G( X )= e E( ln( X ) ) = x m e α = x m e 1/α (8)

Harmonic Mean: It is the reciprocal of the mathematical expectation of the reciprocals of the values. For the type I Pareto distribution, it can be calculated using its two parameters, as shown in Equation 9.

μ h ( X )=H( X )= 1 E( 1/X ) = x m ( 1+ 1 α ) (9)

Median: It is the 0.5 quantile. For the type I Pareto distribution, it can be calculated using its two parameters, as shown in the Equation 10.

Mdn( X )= Q X ( p=0.5 )= x m 1/2 α = x m 2 α (10)

Mode: It corresponds to the peak of the distribution. The type I Pareto distribution is unimodal, with its single peak occurring at the scale parameter xm (Equation 11).

Mo( X )={ x|max[ f X ( x ) ] }= x m (11)

Non-central k-th moment: It is the mathematical expectation of the values raised to the k-th power. For the type I Pareto distribution, it can be calculated using its two parameters, as shown in Equation 12. For its calculation, the shape parameter α must be greater than the power k to which the values are raised.

E( X k )= x m x k f X ( x )dx = α αk x m k ,α>k (12)

2.3.2. Measures of Variation

Variance: It corresponds to the second central moment or the expected value of the squared differences from the arithmetic mean. For the type I Pareto distribution, it can be calculated using its two parameters, as shown in in Equation 13. Its computation requires the shape parameter α to be greater than 2.

Var( X )= σ 2 ( X )=E[ ( XE( X ) ) 2 ]= x m ( xE( X ) ) 2 f X ( x )dx = α x m 2 ( α1 ) 2 ( α2 ) ,α>2 (13)

Standard deviation: It is the square root of the variance (Equation 14).

DE( X )=σ( X )= Var( X ) = x m α1 α α2 ,α>2 (14)

Entropy: It corresponds to the mathematical expectation of the additive inverses of the logarithms of the densities, or the mathematical expectation of the information of the values. When the natural logarithm is used, entropy is expressed in natural units of information (nats), which is the most common. If base 10 is used, it is expressed in decimal units of information (dits); whereas, if base 2 is used, it is expressed in binary units of information (bits). For the type I Pareto distribution, it can be calculated using its two parameters, as shown in Equation 15.

H( X )=E( ln[ f X ( x ) ] )= x m f X ( x )ln[ f X ( x ) ]dx =ln[ x m α e 1+ 1 α ] =ln( x m )ln( α )+1+1/α (15)

2.3.2. Measures of Shape

Skewness: It corresponds to the third standardized central moment. Karl Pearson’s original notation is used: β 1 (Pearson, 1895). For the type I Pareto distribution, it can be calculated using the shape parameter α, as shown in Equation 16. Its computation requires this parameter to be greater than 3.

β 1 ( X ) = E[ ( XE( X ) ) 3 ] ( E[ ( XE( X ) ) 2 ] ) 3/2 = 2( α+1 ) α3 α2 α ,α>3 (16)

Excess Kurtosis: It corresponds to the standardized fourth central moment minus the kurtosis value of the normal distribution, which is 3. Karl Pearson’s original notation is used: β2 − 3 (Pearson, 1905). For the type I Pareto distribution, it can be calculated using the shape parameter α, as shown in Equation 17. Its computation requires this parameter to be greater than 4.

β 2 ( X )3= E[ ( XE( X ) ) 4 ] ( E[ ( XE( X ) ) 2 ] ) 2 = 6( α 3 + α 2 6α2 ) α( α3 )( α4 ) ,α>4 (17)

3. Parameter Estimation

Let X be a random sample of size n from a continuous quantitative variable that follows a type I Pareto distribution with parameters xm and α. For example, the sample data could consist of records of the monthly salaries of randomly chosen workers in a large firm. Refer to Equation 18 for its notation.

x={ x 1 , x 2 ,, x n }X~Pareto( x m ,α ) (18)

3.1. Estimator of α by the Method of Moments

The estimator of the shape parameter α by the method of moments is obtained from the mathematical expectation or arithmetic mean of X, if the value of the location parameter xm, which is the peak and minimum value of the distribution, is known. This mathematical expectation corresponds to the quotient of the sample mean (numerator) and the difference between the sample mean and the known parameter xm (denominator). It is valid when α > 2. For more details, refer to Equation 19.

E( X )= α α1 x m αE( X )E( X )α x m =0 α( E( X ) x m )=E( X ) α= E( X ) E( X ) x m = μ( X ) μ( X ) x m α ^ = E ^ ( X ) E ^ ( X ) x m = μ ^ ( X ) μ ^ ( X ) x m = x ¯ x ¯ x m ,where μ ^ ( X )= x ¯ = i=1 n x i n (19)

The point estimator converges to a normal distribution, since X has a distribution with finite moments (Rao, 1973), which allows for the computation of an asymptotic interval estimator (Equation 20).

α ^ n = x ¯ x ¯ x m d N( α, α ( α1 ) 2 n( α2 ) ) P( α ^ z 1 α 2 × α ^ ( α ^ 1 ) 2 n( α ^ 2 ) α α ^ + z 1 α 2 × α ^ ( α ^ 1 ) 2 n( α ^ 2 ) ) (20)

3.2. Estimators of xm and α by the Maximum Likelihood Method

The estimator by the maximum likelihood method of the scale parameter xm is the minimum sample value, and the estimator of the shape parameter α is the inverse of the arithmetic mean of the logarithms of the ratios between each value xi and the minimum sample value (Siudem, Nowak, & Gagolewski, 2022). See Equation 21 and Equation 22 for these estimators, respectively.

x ^ m =min( x 1 , x 2 ,, x n )=min( { x i } i=1 n ) (21)

α ^ = n i=1 n ln( x i x ^ m ) = n i=1 n ln( x i min( { x i } i=1 n ) ) (22)

The Fisher information for n data points for these two parameters is given by the 2 × 2 matrix in Equation 23.

I( x m ,α|x ) =n( E[ δ δ x m δ x m ln f X ( x| x m ,α ) ] E[ δ δ x m δα ln f X ( x| x m ,α ) ] E[ δ δαδ x m ln f X ( x| x m ,α ) ] E[ δ δαδα ln f X ( x| x m ,α ) ] )

=n( α/ x m 2 1/ x m 1/ x m 1/ α 2 )=( nα/ x m 2 n/ x m n/ x m n/ α 2 ) (23)

The variance of a parameter θ, Var(θ), is always greater than or equal to its Cramér-Rao lower bound CRLB(θ) or the inverse of its Fisher information for n data, 1/I(θ) (Xu, Sedory, & Singh, 2022). Refer to Equation 24.

Var( θ ^ )= σ θ ^ 2 CICR( θ ^ )=1/ I( θ ) (24)

The estimators obtained by the maximum likelihood method have asymptotic properties of unbiasedness (Equation 25), consistency (Equation 26), efficiency (Equation 27), and normality (Equation 28), which make them very useful (Song, Roung-Park, & Yoon, 2022).

θ ^ n isanasymptoticallyunbiasedestimatorofθ if lim n E( θ ^ n )=θ (25)

θ ^ n isanasymptoticallyconsistentestimatorofθ if lim n P( | θ ^ n θ |>ε )=0 (26)

θ ^ n isanasymptoticallyefficientestimatorofθ if lim n CICR( θ ^ ) Var( θ ^ n ) =1 (27)

θ ^ n asanestimatorθconvergestonormality if lim n n ( θ ^ n θ ) DE( θ ^ ) d N( 0,1 ) (28)

Based on these asymptotic properties, asymptotic standard errors (aes) can be defined for the maximum likelihood estimators of xm (Equation 29) and α (Equation 31), along with asymptotic confidence intervals (Equation 30 for the estimator of xm and Equation 32 for the estimator of α).

ase( θ ^ )= σ θ ^ = 1/ I( θ ) ase( x ^ m )= σ x ^ m = 1 nα/ x m 2 = x m 2 nα = x m nα ; EEA ^ ( x ^ m )= x ^ m n α ^ (29)

P( θ ^ z 1 α n σ θ ^ θ θ ^ + z 1 α n σ θ ^ )=1α P( x ^ m z 1 α n x ^ m n α ^ x m x ^ m + z 1 α n x ^ m n α ^ )=1α (30)

ase( θ ^ )= σ θ ^ = 1/ I( θ ) ase( α ^ )= σ α ^ = 1 n/ α 2 = α 2 n = α n ; EEA ^ ( α ^ )= α ^ n (31)

P( θ ^ z 1 α n σ θ ^ θ θ ^ + z 1 α n σ θ ^ )=1α P( α ^ z 1 α n α ^ n α α ^ + z 1 α n α ^ n )=1α (32)

The estimator of α by the maximum likelihood method is more efficient than that of the method of moments, but it is biased. Given this limitation, a bias-corrected estimator can be defined that has a lower variance than the biased estimator (Rytgaard, 1990) and can be used to achieve more efficient asymptotic estimation (Equation 33). As with the previous definitions, the use of these asymptotic formulas requires a large sample, specifically one larger than 30 and preferably at least 100 (Mateus & Caeiro, 2022). See the variance, standard error, and asymptotic confidence interval of the bias-corrected estimator in Equations 34, 35, and 36, respectively.

α ^ c = n2 i=1 n ln( x i min( { x i } i=1 n ) ) (33)

σ ^ 2 ( α ^ c )= α ^ c 2 / ( n2 ) < σ ^ 2 ( α ^ ) (34)

σ ^ ( α ^ c )= α ^ 2 / ( n2 ) = α ^ c / n2 (35)

P( α ^ c z 1 α n α ^ c n2 α α ^ c + z 1 α n α ^ c n2 )=1α (36)

3.3. Exact Distributions of the Parameters

Let X be a random variable following a type I Pareto distribution with scale parameter xm and shape parameter α. The exact distribution of the sum of the logarithms of each of the n sample data points of X, divided by the minimum sample value, follows an exponential distribution with a rate parameter equal to α (Rytgaard, 1990). See Equation 37.

i=1 n ln( x i min ( { x i } ) i=1 n )= i=1 n ln( x i x ^ m )~Exponential( λ=α ) (37)

The exact distribution of the maximum likelihood estimator of the scale parameter xm is a type I Pareto distribution with a scale parameter xm and a shape parameter n × α, while the exact distribution of the maximum likelihood estimator of the shape parameter α is an inverse gamma distribution with a shape parameter n - 1 and a scale parameter n × α (Qian, Chen, & He, 2021). See Equation 38.

x ^ m =min( { x i } i=1 n )~Pareto( x m ,n×α )

α ^ = n i=1 n ln( x i min( { x i } i=1 n ) ) ~Inv-Gamma( α=n1,β=n×α ) (38)

4. Generalized Form of the Pareto Distribution

There is a generalized form of the four-parameter Pareto distribution with location parameter μ, scale parameter σ, and two shape parameters: the Pareto tail index α and the inequality index γ. Its cumulative distribution function is provided in Equation 39 (Arnold, 2015).

Support: xμ

Parameter space: μ( , ) and σ,α and γ( 0, ) .

F( X )=1 1 [ 1+ xμ σ γ ] α (39)

The location parameter μ is not the mathematical expectation or arithmetic mean of the distribution, and the scale parameter σ is not the standard deviation. The mean depends on the parameters σ, α, and γ, as well as certain gamma functions, and requires that γ < α (Equation 40).

E( X )= σ×Γ( αγ )×Γ( 1+γ ) Γ( α ) ,γ<α (40)

When μ = σ = xm (minimum and mode) and γ = 1, this generalized four-parameter form reduces to the type I Pareto distribution (Equation 41).

F( X )=1 1 [ 1+ ( xμ σ ) 1/γ ] α =1 1 [ 1+ ( x x m x m ) 1/1 ] α =1 1 ( 1+ x x m 1 ) α =1 ( x m /x ) α (41)

In the type II Pareto distribution, γ = 1 (Equation 42). When the parameter μ of the type II Pareto distribution is equal to 0, it is referred to as the Lomax (1954) distribution.

F( X )=1 1 [ 1+ ( xμ σ ) 1/γ ] α =1 1 [ 1+ ( xμ σ ) 1/1 ] α =1 1 [ 1+ xμ σ ] α (42)

In the type III Pareto distribution, α = 1 (Equation 43).

F( X )=1 1 [ 1+ ( xμ σ ) 1/γ ] α =1 1 [ 1+ ( xμ σ ) 1/γ ] 1

=1 1 1+ ( xμ σ ) 1/γ =1 1 1+ xμ σ γ (43)

5. Relationship between Type I Pareto Distribution and Other Distributions

Let X be a random variable following a type I Pareto distribution with parameters xm and α. The product of the scale parameter xm and an exponential distribution based on the number e and exponent X (random variable Y) follows an exponential distribution with a rate parameter or inverse scale λ = α. Conversely, let Y be a random variable with an exponential distribution with rate parameter or inverse scale λ. The natural logarithm of the quotient of the random variable Y and the value xm (random variable X) follows a type I Pareto distribution with a scale parameter xm and a shape parameter α = λ. This implies that the cumulative distribution functions of X and Y are equal and interchangeable when calculating cumulative probabilities. Refer to Equation 44.

X~Pareto( x m ,α ); x m e X =Y~Exponential( λ=α ) Y~Exp( λ );ln( Y/ x m )=X~Pareto( x m ,α=λ ) P( Xx )= F X ( x| x m ,α )= F Y [ y=ln( x x m )|λ=α ]=1 ( x m x ) α =1 e αy (44)

It should be noted that there is a relationship between the exponential and Pareto distributions that is analogous to the one between the normal and lognormal distributions. The Pareto and lognormal distributions can be applied to the same data, although one may provide a better fit than the other. Both are exponential transformations of a more widely known and used variable: the former from an exponential distribution and the latter from a normal distribution (Feng et al., 2020).

The zeta distribution is the discrete equivalent of the Pareto distribution. If the support of the zeta distribution is bounded within an interval of natural numbers, it becomes the so-called Zipf distribution (Arnold, 2015).

6. Relationship between Type I Pareto Distribution and Gini Concentration Index

The Pareto (1897) tail index is related to the Gini concentration index (Safari, Masseran, Ibrahim, & Hussain, 2019). Corrado Gini (1936), created this index to summarize in a single number the information contained in the Lorenz curve (Mojiri & Ahmadi, 2022). The Lorenz (1905) curve is obtained from a two-dimensional diagram that represents the number of people on the horizontal axis (x) and the accumulated income or wealth on the vertical axis, FX(x). A straight line with a 45-degree slope, known as the line of equality, is included in this graph. As the curve, defined by the coordinate pairs (x, FX(x)), diverges from this line, inequality increases. The Gini index is the ratio of the area between the line of equality and the Lorenz curve to the area of the lower right triangle. A value of 0 indicates total equality, which occurs when all individuals have the same income. A value of 1 reflects maximum inequality, occurring when one person receives all the income and the others have no income or are unpaid labor (Sitthiyot & Holasut, 2021).

In countries with more income equality, such as the Scandinavian countries, the Gini index is less than 0.3. In countries with greater inequality, such as some African nations (South Africa, Namibia, Suriname, Zambia, Eswatini, Botswana, Angola, and Zimbabwe) and countries in the Americas (Belize, Brazil, Colombia, and Panama), the Gini index is higher than 0.5 (World Bank, 2022). It should be noted that countries with communist economic systems do not have low Gini indices, and social democratic policies involving increased public spending and debt tend to increase rather than decrease inequality (Tokhirov, 2021). Thus, left-wing populist policies have been described as policies of poverty equalization, leading to the emergence of a wealthy political oligarchy (Benczes, 2022; Landoni & Villegas, 2022).

The value on the Lorenz curve at point x for a continuous random variable X with density function fX(x) is obtained by the following proportion given in Equation 45.

L( F( x ) )= x xf( x )dx xf( x )dx = x xf( x )dx E( X ) (45)

When applied to the type I Pareto distribution, the above formula is transformed into the expression shown in Equation 46.

X~Pareto( x n ,α )andF( x )=1 ( x m /x ) α L( F( x ) )={ 0 0<α<1 1 ( 1F( x ) ) 1 1 α =1 ( x m x ) α1 α1 (46)

The Gini index can be expressed as one minus twice the integral from 0 to 1 of the Lorenz curve. When applied to the type I Pareto distribution, it is the inverse of twice the tail parameter minus one, 1/(2α − 1), whenever α > 1. When α < 1, the Gini index takes its minimum value, which is 1 (Equation 47). Table 1 shows the correspondence between values of the Pareto tail and Gini concentration indices and their interpretation.

G( X )={ 1 0<α<1 12 0 1 L( F( x ) )dx = 1 2α1 α1 α=0.5( 1+1/ G( X ) ) (47)

Table 1. Correspondence between Gini and Pareto indexes and their interpretation.

G

α

Interpretation

0.05

10.5

Very little inequality

0.1

5.5

0.15

3.833

Little inequality

0.2

3

0.25

2.5

0.3

2.167

0.35

1.929

Moderate inequality

0.4

1.75

0.45

1.61

Fairly significant inequality

0.5

1.5

0.55

1.409

A lot of inequality

0.6

1.333

0.65

1.269

0.7

1.214

0.75

1.167

Very much inequality

0.8

1.125

0.85

1.088

0.9

1.056

0.95

1.026

1

1

Note. G = Gini concentration index and α = Pareto tail index.

7. Generation of Type I Pareto Random Samples and Goodness-of-Fit Testing

Let U be a random variable with a standard uniform distribution: U ⊆ U [0, 1]. The transformation xm(1 − U)1/α follows a type I Pareto distribution with scale parameter xm and shape parameter α. This procedure allows for obtaining a random sample with a type I Pareto distribution from a random sample of a standard uniform variable and is called inverse transform sampling (Ross, 2022).

To test whether the sample data fit the type I Pareto distribution, Chu, Dickin and Nadarajah (2019) recommend the Kolmogorov-Smirnov test based on a simulation study. The simplest way to apply this inferential test is to transform the data, assuming a type I Pareto distribution, into values with an exponential distribution by taking the natural logarithm of the ratio of each data point to the minimum sample value. Then, the Kolmogorov-Smirnov test, adapted for exponentially distributed samples, is applied (Stephens, 1974). The inferential test can be supplemented with a graphical assessment using a quantile-quantile plot and a histogram with an overlaid density curve. In the former, an alignment of points along a straight line with a 45-degree slope is sought, and in the latter, an inverted J-shaped profile is expected (Bhoj & Chandra, 2021).

Statistical hypotheses: H0: X ~ Pareto(xm, α) ≡ Y = ln[X/min(X)] ~ Exponential(λ = α) y H1: X ≁ Pareto(xm, α).

Assumptions: Random sample of size n of a continuous quantitative variable X.

Test statistic: The data are sorted in ascending order, assigned ranks or orders, and transformed to follow an exponential distribution under the assumption that the null hypothesis is true.

x(1)

x(2)

… ≤

x(i)

… ≤

x(n-1)

x(n)

1

2

i

n-1

n

ln[x(1)/x(1)]

ln[x(2)/x(1)]

ln[x(i)/x(1)]

ln[x(n-1)/x(1)]

ln[x(n)/x(1)]

The cumulative probability (theorical cumulative relative frequency) under the type I Pareto distribution model (Equation 48) and the (empirical) cumulative relative frequency (Equation 49) are calculated for each transformed value, along with the D+, D, and D statistics (Equation 50).

y i =ln[ x i / x ( 1 ) ]Y~Exponetial( λ ^ = n i=1 n ln( x i ) ) F Y ( y ( i ) )=1 e λ ^ y ( i ) (48)

F n ( x ( i ) )= ( i )/n (49)

D + =max[ F n ( x ( i ) ) F Y ( y ( i ) ) ]=max[ ( i )/n F Y ( y ( i ) ) ] D =max[ F Y ( y ( i ) ) F n ( x ( i1 ) ) ]=max[ ( i1 )/n F Y ( y ( i ) ) ] D=max( D + , D ) (50)

The sample estimation correction to the D statistic, as given by Stephens (1974), is applied. See Equation 51.

D c =( D 0.2/n )( n +0.26+ 0.5/ n ) (51)

The decision is based on the transformed D statistic (Equation 48). If DcDα, H0 holds, and if Dc > Dα, H0 is rejected at a significance level of α. The critical Dα values depend on the significance level α (Stephens, 1974), which is typically set at .05 (D0.05 = 1.094). For small samples (n < 20), it may be increased to .10 (D.10 = .990), and for large samples (n > 500), it may be reduced to .01 (D.01 = 1.308).

Another option for inferential testing is the Anderson & Darling (1952) test. The formula for its test statistic is shown in Equation 52. Similar to the Kolmogorov-Smirnov test, the data are transformed to follow an exponential distribution (Equation 48), and Stephens’ (1986) correction is applied to the test statistic when the parameters are estimated from the sample data (Equation 53).

A 2 =n i=1 n 2i1 n [ ln( F Y ( y i ) )+ln( 1 F Y ( y n+1i ) ) ] (52)

A c 2 =A( 1+ 0.6 n ) (53)

As in the Kolmogorov-Smirnov test, the decision is based on the corrected A2 statistic, denoted as A c 2 . If A c 2 A α 2 (critical value), H0 is retained; if A c 2 > A α 2 , H0 is rejected at a significance level of α. The critical values Aα2 depend on the type of distribution (in this case, exponential), the estimation method (in this case, maximum likelihood estimator), and the significance level α, which is typically set at .05 (A.05 = 1.321). For small samples (n < 20), the significance level α can be increased to .10 (A.1 = 1.062), while for large samples (n > 500), it can be reduced to .01 (A.01 = 1.959). Refer to Stephens (1986).

A modification of the Anderson-Darling test was developed by Sinclair, Spurr and Ahmad (1990) for distributions with positive skewness and highly atypical cases in the right tail. The type I Pareto distribution falls into this category; hence, this modification can be applied. The Sinclair-Spurr-Ahmad statistic is calculated directly from the original sample data, without transformation (Equation 54), and is denoted as A U n 2 .

A U n 2 = n 2 2 i=1 n F X ( x ( i ) ) i=1 n ( 2 2i1 n )ln( 1 F X ( x ( i ) ) ) F X ( x ( i ) )=1 ( x i / x ^ m ) α =1 ( x i / min( x ) ) α (54)

The critical value for decision making is calculated with the formula shown in Equation 55.

A p U n 2 =1 1 1+ e G( p ) G( p )=0.11700.03791×t+0.06318×u+0.09878×t×u +0.009184× t 2 ×u0.00009742× t 4 ×u t=ln p 1p ,u= 1 1+ 0.3/ n (55)

If A U n 2 A p U n 2 , H₀ holds: X ~ Pareto (xm, α) at a significance level of p. If A U n 2 > A p U n 2 , H₀ is rejected. As the sample size approaches infinity, the value of u becomes unitary, and the critical value A p U n 2 reaches its asymptotic value: A p U n 2 = 0.356 for p = .1, A p U n 2 = 0.432 for p = .05, and A p U n 2 = 0.610 for p = .01.

8. Example Calculations Using the Type I Pareto Distribution

Next, simulated data are used in two examples illustrating the application of the distribution. In the first example, probabilities, descriptive statistics, and the representation of the distribution are calculated for a variable 𝑋 that follows a type I Pareto distribution, characterized by the parameters xm = 500 and α = 4.1, representing the biweekly salary (in dollars) within a company. In the second example, the data are generated using the inverse sampling transform in Excel to follow a type I Pareto distribution (xm =500, α = 4.1). Based on these generated data, the distribution parameters are estimated both pointwise and using confidence intervals. Additionally, the goodness of fit is assessed using statistical tests and graphical methods.

8.1 Calculation of Probabilities and Descriptive Measures

Let X be a variable with a type I Pareto distribution characterized by parameters xm = 500 and α = 4.1, representing the biweekly salary in dollars within a company. The task is to calculate the probability of having an income less than or equal to 592, less than 1000, between 800 and 1400, greater than or equal to 600, and greater than 1500. Additionally, the following measures of central tendency are required: the mathematical expectation or arithmetic mean μ(X), geometric mean μg(X), harmonic mean μh(X), median Mdn(X), and mode Mo(X); measures of variation such as variance σ2(X), standard deviation σ(X), and entropy H(X); as well as shape measures based on standardized central moments, including skewness β 1 ( X ) and excess kurtosis β2(X). Furthermore, the Gini concentration index G(X) is to be calculated. Finally, it is considered illustrative to plot its density function fX(x) and cumulative distribution FX(x).

X~Pareto( x n =500,α=4.1 )

Equation 2 is applied for the calculation of the probability of having an income less than or equal to 592.

P( X592 )= 500 592 f X ( x )dx =1 ( x m /x ) α =1 ( 500/ 592 ) 4.1 =0.4997

Equation 2 is also used to calculate the probability of having an income less than 1000. Since it is a continuous distribution, including or excluding the specific point does not affect the value of the cumulative probability.

P( X<1000 )= 500 1000 f X ( x )dx =1 ( 500/ 1000 ) 4.1 =0.9417

Equation 2 is used to calculate the probability of having an income between 800 and 1400 by taking the difference between two cumulative probabilities.

P( 800X1400 )=P( X1400 )P( X800 )=0.98530.8544=1309

P( X1400 )=1 ( 500/ 1400 ) 4.1 =0.9853

P( X800 )=1 ( 500/ 800 ) 4.1 =0.8544

Equation 3 is used to calculate the probability of having an income greater than or equal to 600.

P( X600 )= ( 500/ 600 ) 4.1 =0.4735

Equation 3 is also used to calculate the probability of having an income greater than 1500. Since it is a continuous distribution, including or excluding the specific point does not affect the complementary or right-tailed probability.

P( X>1500 )= ( 500/ 1500 ) 4.1 =0.0111

Measures of central tendency. The arithmetic mean is calculated using Equation 7, the geometric mean using Equation 8, the harmonic mean using Equation 9, the median using Equation 10, and the mode using Equation 11.

μ( X )=E( X )= α α1 x m = 4.1 3.1 ×500=661.29

μ g ( X )= e E( ln( X ) ) = x m e 1/α =500× e 1/ 4.1 =638.11

μ h ( X )=H( X )= 1 E( 1/X ) =( 1+ 1 α ) x m =( 1+ 1 4.1 )×500=621.95

Mdn( X )= Q X ( p=0.5 )= 2 α × x m = 2 1/ 4.1 ×500=592.10

Mo( X )={ x|max( f X ( x ) ) }= x m =500

Measures of variation. Variance is calculated using Equation 13, standard deviation using Equation 14, and entropy using Equation 15.

σ 2 ( X )= μ 2 ( X )=E[ ( XE( X ) ) 2 ]= x m 2 α ( α1 ) 2 ( α2 ) = 500 2 4.1 3.1 2 ×2.1 =50790.35

σ( X )= μ 2 1/2 ( X )= E[ ( XE( X ) ) 2 ] = x m α1 α α2 = 500 3.1 4.1 2.1 =225.37

H( X )=E[ ln( f X ( x ) ) ]=ln[ x m α e 1+ 1 α ]=ln[ 500 4.1 e 1+ 1 4.1 ]=6.05nats

Skewness and Excess Kurtosis Based on Central Moments. These two shape statistics are calculated using Equations 16 and 17, respectively.

β 1 ( X ) = μ 3 ( X ) μ 2 3/2 ( X ) = 2( α+1 ) α3 α2 α = 2×5.1 1.1 2.1 4.1 =6.64

β 2 ( X )3= μ 4 ( X ) μ 2 2 ( X ) 3= 6( α 3 + α 2 6α2 ) α( α3 )( α4 ) = 6( 4.1 3 + 4.1 2 6×4.12 ) 4.1×1.1×0.1 =786.67

The Gini Index is obtained using Equation 47.

G( X )=1/ ( 2α1 ) =1/ ( 2×4.11 ) =0.14

The Gini index is close to 0, indicating fairly equal wages within the firm. Figure 2 displays the density function fX(x) and the cumulative distribution FX(x) of the random variable X, representing biweekly wages (in dollars). The wages follow a type I Pareto distribution with parameters xm = 500 and α = 4.1.

Figure 2. Density function fX(x) and cumulative distribution function FX(x) for X ~ Pareto(xm = 500, α = 4.1).

The R script to calculate these probabilities, descriptive statistics, and the plot of the density and cumulative distribution functions of the type I Pareto distribution (xm = 500 and α = 4.1) is shown below:

# Load required libraries (packages)

library(cascsim)

library(modeest)

cat ("Probability calculations (rounding to four decimal places)", "\n")

cat ("Cumulative probability less than or equal to 592 =", round(ppareto(592, xm = 500, alpha = 4.1), 4), "\n")

cat ("Cumulative probability less than 1000 =", round(ppareto(1000, xm = 500, alpha = 4.1), 4), "\n")

cat ("Probability in the interval [800, 1400] =", round(ppareto(1400, xm = 500, alpha = 4.1) - ppareto(800, xm = 500, alpha = 4.1), 4), "\n")

cat ("Probability greater than or equal to 600 =", round(1 - ppareto(600, xm = 500, alpha = 4.1), 4), "\n")

cat ("Probability greater than 1500 =", round(1 - ppareto(1500, xm = 500, alpha = 4.1), 4), "\n")

# Descriptive measures for type I Pareto distribution (rounding to three decimal places).

x_m <- 500 # Scale parameter

alpha <- 4.1 # Shape parameter

cat ("Measures of central tendency", "\n")

cat ("Mathematical expectation of X: μ(X) =", round(alpha / (alpha - 1) * x_m, 3), "\n")

cat ("Geometric mean of X: μg(X) =", round(x_m * exp(1 / alpha), 3), "\n")

cat ("Harmonic mean of X: μh(X) =", round((1 + 1 / alpha) * x_m, 3), "\n")

cat ("Median de X: Mdn(X) =", round(qpareto(0.5, xm = 500, alpha = 4.1), 3), "\n")

cat ("Mode of X: Mdn(X) =", round(qpareto(0.5, xm = 500, alpha = 4.1), 3), "\n")

cat ("Measures of variation ", "\n")

cat ("Variance of X: σ^2(X) =", round(x_m^2 * alpha /(( alpha -1)^2 * (alpha - 2)), 3), "\n")

cat ("Measures of variation ", "\n")

cat ("Standard deviation of X: σ(X) =", round(sqrt(x_m^2 * alpha /(( alpha -1)^2 * (alpha - 2))), 3), "\n")

cat ("Entropy of X: Η(X) =", round(log(x_m / alpha * exp(1+ 1/ alpha)), 3), "nats", "\n")

cat ("Shape measures", "\n")

cat("Skewness of X: √b1(X) =", round(2 * (alpha + 1) / (alpha - 3) * sqrt((alpha - 2) / alpha), 3), "\n")

cat ("Kurtosis of X: b2(X) =", round(6* (alpha^3 + alpha^2 - 6 * alpha - 2) / (alpha * (alpha - 3) * (alpha - 4)), 3), "\n")

cat ("Gini Index of X: G(X) =", round(1 / (2 * alpha - 1), 3), "\n")

# Plot of density and cumulative distribution functions.

x <- seq(0, 2000, length.out = 2000)

dpareto <- function (x, x_m, alpha) {

ifelse (x < x_m, 0, alpha * x_m^alpha / (x^(alpha + 1)))}

ppareto2 <- function (x, x_m, alpha) {ifelse(x < x_m, 0, 1 - (x_m / x)^alpha)}

density <- dpareto(x, x_m, alpha)

cumulative <- ppareto(x, x_m, alpha)

par (mar = c(4, 4, 1, 2) + 0.1)

plot (x, density, type = "l", col = "blue", lwd = 2, xlab = "Biweekly wages (in dollars)",

ylab = "Density", ylim = c(0, max(density)))

par (new = TRUE)

plot (x, cumulative, type = "l", col = "red", lwd = 2, axes = FALSE, xlab = "", ylab = "")

axis (4)

mtext ("Cumulative Probability", side = 4, line = 3, col = "red")

legend ("right", legend = c("Density", "Cumulative"),

col = c ("darkblue", "red"), lwd = 2, bty = "n", y.intersp = 1.5)

8.2. Random Sample Generation, Parameter Estimation, and Goodness of Fit

In this second example, the aim is to generate a random sample of 40 observations from the variable X ~ Pareto (500, 4.1). Based on this sample, the goal is to estimate the parameters xm and α using maximum likelihood estimators, and to construct 95% confidence intervals for these parameters based on their asymptotic distributions and errors. Finally, the goodness-of-fit will be assessed through a plot of theoretical versus empirical quantiles, a histogram with an overlaid density curve, and the Kolmogorov-Smirnov and Anderson-Darling tests.

In the first column of Table 2, a random sample of 40 ui values is drawn from a standard uniform distribution U ~ U [0, 1], generated by means of the 2021 version of the Excel random number generator. Using the quantile function, the ui values are transformed into a random sample of size 40 for a variable X that follows a type I Pareto distribution with population parameters: xm = 500 (scale) and α = 4.1 (shape). This data generation process is known as inverse transform sampling, as illustrated in Equation 56. The first value of X obtained by applying Equation 56, starting at u1 = 0.226, is 532.239.

x i = x m ( 1 u i ) 1 α =500 ( 1 u i ) 1 4.1 x 1 =500 ( 10.226 ) 1 4.1 =532.239 (56)

Table 2. Random sample generation and fit testing to the generating distribution.

Generation of X

Kolmogorov-Smirnov test

Q-Q plot

ui

xi

(i)

x(i)

y(i)

FY(y(i))

D+

D−

p(i)

xt(i)

0.226

532.239

1

502.595

0

0

0.025

0

0.017

504.716

0.678

659.182

2

506.685

0.008

0.032

0.018

0.007

0.041

507.983

0.754

703.919

3

509.868

0.014

0.055

0.020

0.005

0.066

511.358

0.791

732.467

4

513.713

0.022

0.083

0.017

0.008

0.091

514.847

0.427

572.739

5

516.552

0.027

0.103

0.022

0.003

0.116

518.457

0.989

1502.032

6

519.474

0.033

0.123

0.027

−0.002

0.140

522.196

0.956

1071.113

7

524.641

0.043

0.156

0.019

0.006

0.165

526.073

0.356

556.651

8

529.753

0.053

0.188

0.012

0.013

0.190

530.097

0.651

646.362

9

532.239

0.057

0.203

0.022

0.003

0.215

534.278

0.592

622.201

10

540.797

0.073

0.252

−0.002

0.027

0.240

538.628

0.275

540.797

11

540.979

0.074

0.253

0.022

0.003

0.264

543.159

0.885

847.360

12

550.305

0.091

0.302

−0.002

0.027

0.289

547.886

0.921

928.627

13

551.504

0.093

0.308

0.017

0.008

0.314

552.824

0.524

599.242

14

556.651

0.102

0.333

0.017

0.008

0.339

557.991

0.145

519.474

15

560.951

0.110

0.353

0.022

0.003

0.364

563.406

0.726

685.652

16

569.140

0.124

0.389

0.011

0.014

0.388

569.093

0.053

506.685

17

571.286

0.128

0.398

0.027

−0.002

0.413

575.075

0.325

550.305

18

572.739

0.131

0.404

0.046

−0.021

0.438

581.384

0.421

571.286

19

588.961

0.159

0.466

0.009

0.016

0.463

588.050

0.578

617.102

20

598.325

0.174

0.498

0.002

0.023

0.488

595.114

0.489

588.961

21

599.242

0.176

0.501

0.024

0.001

0.512

602.620

0.021

502.595

22

609.166

0.192

0.533

0.017

0.008

0.537

610.619

0.924

937.437

23

617.102

0.205

0.556

0.019

0.006

0.562

619.175

0.331

551.504

24

622.201

0.213

0.570

0.030

−0.005

0.587

628.359

0.622

633.900

25

633.900

0.232

0.601

0.024

0.001

0.612

638.261

0.723

683.834

26

646.362

0.252

0.630

0.020

0.005

0.636

648.989

0.211

529.753

27

659.182

0.271

0.658

0.017

0.008

0.661

660.674

0.412

569.140

28

683.834

0.308

0.704

−0.004

0.029

0.686

673.482

0.077

509.868

29

685.652

0.311

0.707

0.018

0.007

0.711

687.625

0.276

540.979

30

703.919

0.337

0.736

0.014

0.011

0.736

703.373

0.521

598.325

31

721.765

0.362

0.761

0.014

0.011

0.760

721.090

0.822

761.718

32

732.467

0.377

0.775

0.025

0.000

0.785

741.265

0.125

516.552

33

761.718

0.416

0.807

0.018

0.007

0.810

764.590

0.971

1185.755

34

804.876

0.471

0.845

0.005

0.020

0.835

792.076

0.858

804.876

35

847.360

0.522

0.873

0.002

0.023

0.860

825.283

0.105

513.713

36

928.627

0.614

0.912

−0.012

0.037

0.884

866.783

0.376

560.951

37

937.437

0.623

0.915

0.010

0.015

0.909

921.249

0.555

609.166

38

1071.113

0.757

0.950

0.000

0.025

0.934

998.449

0.179

524.641

39

1185.755

0.858

0.967

0.008

0.017

0.959

1124.364

0.778

721.765

40

1502.032

1.095

0.987

0.013

0.012

0.983

1417.313

Σ

10.108

max

0.046

0.037

Note. Generation of X: ui = sample data in its random order (i = 1, 2, …, 40) drawn from a standard uniform distribution U [0, 1] and xi = 500(1 - ui)-1/4.1 = data transformed to follow a type I Pareto distribution (with scale parameter xm = 500 and shape parameter α = 4.1) via inverse transform sampling. Logarithmic transformation of X and Kolmogorov- Smirnov test: (i) = the order of data xi within the 40-item sample, x(i) = data xi sorted in ascending order (empirical quantiles), y(i) = ln(x(i)/502.595) = the logarithmic transformation of xi normalized by the sample minimum, FY(y(i)) = the cumulative distribution function of the Y variable following an exponential distribution with rate parameter λ = 3.957, D+ = (i)/40 - FY(y(i)) = the difference between the empirical and theoretical cumulative distribution functions and D- = FY(y(i)) - ((i)-1)/40 = the difference between the theoretical distribution function and empirical cumulative distribution function with lap of 1. Quantile-Quantile plot: p(i) = ((i)-1/3)/(n+1/3) = the order of the theoretical quantile and xt(i) = QX[p(i)] = theoretical quantiles calculated using the quantile function of a type I Pareto distribution with estimated parameters. Σ = sum per column, max = maximum value per column.

Next, the two parameters are estimated pointwise using maximum likelihood estimation (Equation 21 for thse estimate of xm and Equation 22 for the estimate of α), and their corresponding standard errors (Equation 29 for x ^ m and Equation 31 for α ^ ) and 95% asymptotic confidence intervals (Equation 30 for x ^ m and Equation 32 for α ^ ) are computed.

Estimation of the scale parameter (xm):

x ^ m =min( { x i } i=1 40 )=502.595

ase( x ^ m )= σ ^ ( x ^ m )= x ^ m n× α ^ = 502.595 40×3.957 =39.948

P( x ^ m z 1 α n x ^ m n× α ^ x m x ^ m + z 1 α n x ^ m n× α ^ )=1α

P( 502.5951.96×39.948 x m 502.595+1.96×39.948 )=0.95

P( x m [ 424.299,580.891 ] )=0.95

Estimation of the shape parameter (α):

α ^ = n i=1 n ln( x i x ^ m ) = n i=1 n ln( x i min( { x i } i=1 n ) ) = 40 i=1 40 ln( x i 502.595 ) = 40 10.108 =3.957

ase( α ^ )= σ ^ ( α ^ )= α ^ / n = 3.957/ 40 =0.626

P( α ^ z 1 α n α ^ n α α ^ + z 1 α n α ^ n )=1α

P( 3.9571.96×0.626α3.957+1.96×0.626 )=0.95

P( α[ 2.731,5.184 ] )=0.95

The bias-corrected formula is also used to estimate the shape parameter α. (Equation 33). Initially, the point estimate of α appears less accurate than the unbiased one; however, the confidence interval is more efficient and appropriate. The standard error is calculated using Equation 35, and the asymptotic confidence interval is determined using Equation 36.

α ^ c = n2 i=1 n ln( x i x ^ m ) = 38 i=1 40 ln( x i 502.595 ) = 38 10.108 =3.759

ase( α ^ c )= σ ^ ( α ^ c )= α ^ c / n2 = 3.759/ 38 =0.610

P( α ^ c z 1 α n α ^ c n2 α α ^ c + z 1 α n α ^ c n2 )=1α

P( 3.7591.96×0.610α3.759+1.96×0.610 )=0.95

P( α[ 2.564,4.955 ] )=0.95

The following is the R script for point estimates and 95% asymptotic confidence intervals for the two parameters of the type I Pareto distribution.

# Vector of scores.

x <- c(532.2385845, 659.1817235, 703.9185616, 732.4668655, 572.7389011, 1502.031724, 1071.11279, 556.6512741, 646.3622704, 622.2013372, 540.796541, 847.3602594, 928.6272967, 599.2422898, 519.4737837, 685.6524203, 506.685297, 550.3047107, 571.2856056, 617.1023659, 588.9614346, 502.594959, 937.4374491, 551.5044233, 633.9000483, 683.8337772, 529.7526944, 569.1404291, 509.8675755, 540.9786305, 598.3247283, 761.718308, 516.5523976, 1185.754777, 804.8759389, 513.7129142, 560.9510722, 609.1662801, 524.6406256, 721.7654816)

# Calculation of statistics.

n <- length(x)

x_m_hat <- min(x)

y <- log(x / x_m_hat)

alpha_hat <- n / sum(y)

ase_x_m_hat <- x_m_hat /sqrt(n * alpha_hat)

LL_x_m <- x_m_hat - qnorm(0.975) * ase_x_m_hat

UL_x_m <- x_m_hat + qnorm(0.975) * ase_x_m_hat

ase_alpha_hat <- alpha_hat / sqrt(n)

LL_alpha_hat <- alpha_hat - qnorm(0.975) * ase_alpha_hat

UL_alpha_hat <- alpha_hat + qnorm(0.975) * ase_alpha_hat

alpha_c <- (n - 2) / sum(y)

ase_alpha_c <- alpha_c / sqrt(n - 2)

LL_alpha_c <- alpha_c - qnorm(0.975) * ase_alpha_c

UL_alpha_c <- alpha_c + qnorm(0.975) * ase_alpha_c

# Display results (rounding to three decimal places).

cat("sample size: n =", n, "\n")

cat("Estimate of the scale parameter: x_m_hat =", round(x_m_hat, 3), "\n")

cat("Asymptotic standard error for x_m_hat: ase(x_m_hat) =", round(ase_x_m_hat, 3), "\n")

cat("95% asymptotic confidence interval for x_m: 95% CI", "[", round(LL_x_m, 3), ",", round(UL_x_m, 3), "]", "\n")

cat("Estimate of shape parameter: alpha_hat =", round(alpha_hat, 3), "\n")

cat("Asymptotic standard error for alpha_hat: ase(alpha_hat) =", round(ase_alpha_hat, 3), "\n")

cat("95% asymptotic confidence interval for alpha: 95% CI", "[", round(LL_alpha_hat, 3), ",", round(UL_alpha_hat, 3), "]", "\n")

cat("Biased-corrected estimate of shape parameter: alpha_bc =", round(alpha_c, 3), "\n")

cat("Asymptotic standard error for alpha_bc: ase(alpha_bc) =", round(ase_alpha_c, 3), "\n")

cat("95% asymptotic confidence interval for alpha_bc: 95% CI", "[", round(LL_alpha_c, 3), ",", round(UL_alpha_c, 3), "]", "\n")

The goodness-of-fit test begins with inferential tests. The 40 data points, xi, following a type I Pareto distribution, are transformed using Equation 48 to follow an exponential distribution, yi. The Kolmogorov-Smirnov test statistic is then calculated by applying Equation 50 to the transformed sample. Subsequently, Stephens’ correction is applied (Equation 51). Table 2 presents the data transformation and the calculations to derive the statistics D+ (the maximum difference between the empirical and theoretical cumulative distribution functions), D (the maximum difference between the theoretical cumulative distribution function and the empirical cumulative distribution function with a lag of 1), and D (the maximum of D+ and D). After correcting D, it is found to be less than the critical value at a 5% significance level, supporting the null hypothesis of goodness-of-fit. Thus, the transformed sample data, yi, follow an exponential distribution, and the original data, xi, follow a type I Pareto distribution.

Statistical hypothesis

H 0 :X~Pareto( x m ,α )Y=ln( X min( X ) )~Exp( λ=α ) H 1 :X ~ Pareto( x m ,α )

Data transformation

y i =ln[ x i / 502.595 ]Y~Exponetial( λ ^ =n/ i=1 40 ln( x i ) =3.957 )

F Y ( y ( i ) )=1 e λ ^ × y ( i ) =1 e 3.957× y ( i )

Test statistic

D + =max[ ( i )/n F Y ( y ( i ) ) ]=0.046

D =max[ F Y ( y ( i ) ) ( ( i )1 )/n ]=0.037

D=max( D + , D )=max( 0.046,0.037 )=0.046

Test statistic with Stephens’ correction

D c =( D 0.2 n )( n +0.26+ 0.5 n )=( 0.046 0.2 40 )( 40 +0.26+ 0.5 40 )=0.275

Decision on the null statistical hypothesis of the type I Pareto distribution

D c =0.275< D α=0.05 =1.094, H 0 isnotrejected.

H 0 :Y~Exp( λ )X=502.595 e Y ~Pareto( x m ,α )

The yi data are also used in the application of the Anderson-Darling test. These data are arranged in ascending order. For the calculation of the statistic A2 (Equation 52), the first value, which is 0, must be excluded because it corresponds to a cumulative probability of 0, and the logarithm of this probability is undefined. To compute the cumulative probabilities, the rate parameter λ is required, and it is estimated using its maximum likelihood estimator, which is the reciprocal of the sample mean of the remaining 39 data points (Table 3). The corrected A2 statistic, calculated using Equation 53, is less than the critical value at a 5% significance level (Stephens, 1986). Consequently, the null hypothesis holds, indicating that the transformed data yi = ln(xi/502.595) follow an exponential distribution and, therefore, the original data x i =502.595× e y i follow a type I Pareto distribution.

Rate parameter estimate

λ ^ = 39 i=1 39 y ( i ) = 39 10.108 =3.858

Expected cumulative probability following an exponential distribution with a rate parameter of 3.858.

F Y ( y ( i ) | λ ^ =3.858 )= λ ^ × e λ ^ y ( i ) =3.858× e 3.858 y ^ ( i )

i=2 40 A i = i=2 40 2i1 39 [ ln( F Y ( y ( i ) ) )+ln( 1 F Y ( y ( n+1i ) ) ) ]=39.080

Anderson-Darling test statistic

n = i=2 40 i =39

A 2 = n i=1 39 A i =39+39.080=0.080

Anderson-Darling test statistic with Stephens’ correction

A c 2 = A 2 ( 1+ 0.6 n )=0.080( 1+ 0.6 39 )=0.081

Table 3. Calculations for deriving the Anderson-Darling A2 statistic from the yi values.

i

y(i)

(2i − 1)/39

FY (y(i))

Ln [FY (y(i))]

y(n + 1 i)

FY[y(n + 1 i)]

Ln [1 − FY(y(n + 1 i))]

Ai

1

0

0

0

2

0.008

0.026

0.031

−3.481

1.095

0.985

−4.224

−0.198

3

0.014

0.077

0.054

−2.920

0.858

0.964

−3.312

−0.479

4

0.022

0.128

0.081

−2.514

0.757

0.946

−2.919

−0.697

5

0.027

0.179

0.100

−2.300

0.623

0.910

−2.405

−0.844

6

0.033

0.231

0.120

−2.123

0.614

0.906

−2.369

−1.037

7

0.043

0.282

0.153

−1.880

0.522

0.867

−2.015

−1.099

8

0.053

0.333

0.184

−1.694

0.471

0.837

−1.817

−1.170

9

0.057

0.385

0.198

−1.618

0.416

0.799

−1.604

−1.239

10

0.073

0.436

0.246

−1.402

0.377

0.766

−1.453

−1.244

11

0.074

0.487

0.247

−1.398

0.362

0.753

−1.396

−1.361

12

0.091

0.538

0.295

−1.220

0.337

0.727

−1.300

−1.357

13

0.093

0.590

0.301

−1.200

0.311

0.698

−1.198

−1.415

14

0.102

0.641

0.326

−1.122

0.308

0.695

−1.188

−1.481

15

0.110

0.692

0.345

−1.063

0.271

0.649

−1.046

−1.460

16

0.124

0.744

0.381

−0.965

0.252

0.621

−0.971

−1.439

17

0.128

0.795

0.390

−0.942

0.232

0.592

−0.896

−1.460

18

0.131

0.846

0.396

−0.927

0.213

0.561

−0.824

−1.481

19

0.159

0.897

0.458

−0.782

0.205

0.547

−0.792

−1.412

20

0.174

0.949

0.490

−0.714

0.192

0.524

−0.742

−1.381

21

0.176

1.000

0.493

−0.708

0.176

0.493

−0.679

−1.387

22

0.192

1.051

0.524

−0.647

0.174

0.490

−0.673

−1.387

23

0.205

1.103

0.547

−0.603

0.159

0.458

−0.612

−1.340

24

0.213

1.154

0.561

−0.578

0.131

0.396

−0.504

−1.248

25

0.232

1.205

0.592

−0.525

0.128

0.390

−0.494

−1.228

26

0.252

1.256

0.621

−0.476

0.124

0.381

−0.480

−1.201

27

0.271

1.308

0.649

−0.433

0.110

0.345

−0.424

−1.120

28

0.308

1.359

0.695

−0.364

0.102

0.326

−0.394

−1.030

29

0.311

1.410

0.698

−0.359

0.093

0.301

−0.358

−1.012

30

0.337

1.462

0.727

−0.318

0.091

0.295

−0.350

−0.977

31

0.362

1.513

0.753

−0.284

0.074

0.247

−0.284

−0.860

32

0.377

1.564

0.766

−0.266

0.073

0.246

−0.283

−0.859

33

0.416

1.615

0.799

−0.224

0.057

0.198

−0.221

−0.720

34

0.471

1.667

0.837

−0.177

0.053

0.184

−0.203

−0.634

35

0.522

1.718

0.867

−0.143

0.043

0.153

−0.166

−0.530

36

0.614

1.769

0.906

−0.098

0.033

0.120

−0.127

−0.399

37

0.623

1.821

0.910

−0.095

0.027

0.100

−0.106

−0.365

38

0.757

1.872

0.946

−0.055

0.022

0.081

−0.084

−0.262

39

0.858

1.923

0.964

−0.037

0.014

0.054

−0.055

−0.178

40

1.095

1.974

0.985

−0.015

0.008

0.031

−0.031

−0.091

Σ

10.108

Note. i = order or rank of the data y = ln[xi/min(x)] where the data are ordered in ascending order, y(i) = y-value at order i, (2i − 1)/39 = first factor of Ai, FY(y(i)) = cumulative probability for the y(i) value, assuming an exponential distribution with a rate parameter λ = 3.858, ln[FY(y(i))] = natural logarithm of the cumulative probability for the value y(i), y(n + 1 i) = y-value at order n + 1 − i, FY[y(n + 1 i)] = cumulative probability for the value y(n+1-i), following the same probability distribution; ln[1 − FY(y(n + 1 i))] = natural logarithm of the complement of the cumulative probability for the value y(n+1-i), Ai = ((2i − 1)/39) × (ln[FY(y(i))]+ln[1-FY(y(n + 1 i))]), Σ = sum by column.

The Anderson-Darling test, modified by Sinclair et al. (1990) for distributions with positive skewness, is applied to the original data sorted in ascending order x(i). Refer to Equation 54. The cumulative distribution function of the type I Pareto distribution, FX(x(i)) = 1 − (x(i)/502.595)3.957, is used for these calculations; however, the probabilities are identical to those given by the cumulative distribution function of an exponential distribution with rate parameter λ = 3.858. The test statistic A U n 2 is less than the critical value for a sample size of 40 and a significance level of 5%, so the null hypothesis of fit to a type I Pareto distribution is not rejected. The detailed calculations are provided below, with part of the results summarized in Table 4.

Test statistic

A U n 2 = n 2 2 i=1 n F X ( x ( i ) ) i=1 n ( 2 2i1 n )ln( 1 F X ( x ( i ) ) )

A U n 2 = n 2 2 i=1 n ( x ^ m x ( i ) ) α ^ i=1 n ( 2 2i1 n )ln( 1 ( x ^ m x ( i ) ) α ^ )

A U 40 2 = 40 2 2×19.885( 19.802 )=2019.969=0.031

Critical value:

t=ln( p 1p )=ln( 0.05 0.95 )=2.944

u= 1 1+ 0.3/ n = 1 1+ 0.3/ 40 = 1 1.0480 =0.955

G( p )=0.11700.03791t+0.06318u+0.09878tu +0.009184 t 2 u0.00009742 t 4 u

G( p=0.05 )=0.11700.03791×( 2.944 )+0.06318×0.955 +0.09878×( 2.944 )×0.955+0.009184× ( 2.944 ) 2 ×0.9550.00009742× ( 2.944 ) 4 ×0.955 =0.080

A p U n 2 =11/ [ 1+ e G( p ) ]

A 0.05 U 40 2 =11/ [ 1+ e 0.080 ] =0.520

A U 40 2 =0.031< A 0.05 U 40 2 =0.520, H 0 isnotrejected:X~Pareto( x m ,α )

Table 4. Calculations for deriving the test statistic of the Anderson-Darling test modified by Sinclair et al. (1990).

i

x(i)

FX(x(i))

2 − [(2i − 1)/40]

ln[1 − FX(x(i))]

AUi

1

502.595

0

1.975

0

0

2

506.685

0.032

1.925

−0.032

−0.062

3

509.868

0.055

1.875

−0.057

−0.107

4

513.713

0.083

1.825

−0.087

−0.158

5

516.552

0.103

1.775

−0.108

−0.192

6

519.474

0.123

1.725

−0.131

−0.225

7

524.641

0.156

1.675

−0.170

−0.285

8

529.753

0.188

1.625

−0.208

−0.338

9

532.239

0.203

1.575

−0.227

−0.357

10

540.797

0.252

1.525

−0.290

−0.442

11

540.979

0.253

1.475

−0.291

−0.430

12

550.305

0.302

1.425

−0.359

−0.511

13

551.504

0.308

1.375

−0.367

−0.505

14

556.651

0.333

1.325

−0.404

−0.536

15

560.951

0.353

1.275

−0.435

−0.554

16

569.140

0.389

1.225

−0.492

−0.603

17

571.286

0.398

1.175

−0.507

−0.596

18

572.739

0.404

1.125

−0.517

−0.582

19

588.961

0.466

1.075

−0.628

−0.675

20

598.325

0.498

1.025

−0.690

−0.707

21

599.242

0.501

0.975

−0.696

−0.679

22

609.166

0.533

0.925

−0.761

−0.704

23

617.102

0.556

0.875

−0.812

−0.711

24

622.201

0.570

0.825

−0.845

−0.697

25

633.900

0.601

0.775

−0.919

−0.712

26

646.362

0.630

0.725

−0.996

−0.722

27

659.182

0.658

0.675

−1.073

−0.724

28

683.834

0.704

0.625

−1.219

−0.762

29

685.652

0.707

0.575

−1.229

−0.707

30

703.919

0.736

0.525

−1.333

−0.700

31

721.765

0.761

0.475

−1.432

−0.680

32

732.467

0.775

0.425

−1.490

−0.633

33

761.718

0.807

0.375

−1.645

−0.617

34

804.876

0.845

0.325

−1.863

−0.606

35

847.360

0.873

0.275

−2.067

−0.568

36

928.627

0.912

0.225

−2.429

−0.547

37

937.437

0.915

0.175

−2.467

−0.432

38

1071.113

0.950

0.125

−2.994

−0.374

39

1185.755

0.967

0.075

−3.397

−0.255

40

1502.032

0.987

0.025

−4.332

−0.108

Σ

19.885

−19.802

Note. i = order or rank of the sample data of X, where the 40 data points are sorted in ascending order, x(i) = sample data of X at order or rank i, FX(x(i)) = cumulative probability of x(i) under a Pareto distribution(xm = 502.595, α = 3.957), 2 − [(2i − 1)/40] = first factor of AUi, ln[1 − FX(x(i))] = natural logarithm of the probability to the right tail of x(i) or second factor of AUi, AUi = (2 − [(2i − 1)/40]) × ln[1 − FX(x(i))], Σ = sum per column.

To obtain the quantile-quantile plot, the data xi (i = 1, 2, …, 40) are sorted in ascending order and each value is assigned its corresponding order (i), representing the empirical quantiles: x(i). Using the orders (i), the theoretical quantiles are computed based on the median of the sampling distribution of the i-th order statistic from a standard uniform continuous distribution: Mdn = (α − 1/3)/(α + β − 2/3). This distribution corresponds to a beta distribution with parameters α = (i) and β = n + 1 – (i). Consequently, the theoretical quantile’s cumulative probability (order of quantile) is given by p(i) = ((i) + 1/3)/(n + 1/3). The theoretical quantiles are then obtained by evaluating the quantile function of a type I Pareto distribution at p(i). The values for its two parameters are taken from the estimates previously calculated using the maximum likelihood estimators (Table 2).

First coordinate pair of the theoretical quantile (on the abscissa axis) and the empirical quantile (on the ordinate axis) plot.

x ( 1 )  = 502.595

p ( 1 ) = ( i )1/3 n+1/3 = 11/3 40+1/3 =0.017

x t ( 1 ) = x ^ m ( 1 p ( i ) ) 1 α =502.595 ( 10.017 ) 1 3.957 =504.716

( x t ( 1 ) x ( 1 ) )=( 502.595 504.716 )

Figure 3 illustrates the coordinate pairs of theoretical quantiles xt(i) and empirical quantiles x(i). The theoretical quantiles are plotted on the horizontal (abscissa) axis as they predict the empirical quantiles, which are positioned on the vertical (ordinate) axis. In the plot, the coordinate points align closely along a 45-degree line, indicating that the empirical data closely adhere to the theoretical model. This alignment can be quantified using the linear correlation coefficient; the square of this coefficient represents the proportion of shared variance. In this example, the linear correlation is 0.999, corresponding to a shared variance of 99.7%. Furthermore, the histogram with an overlaid density curve (Figure 4) exhibits the characteristic inverted J-shaped profile of a sample drawn from a type I Pareto distribution, supporting the conclusion that the sample data follow a type I Pareto distribution pattern.

Figure 3. Quabtile-Quantile plot.

Figure 4. Histogram with an overlaid density curve.

The R script for testing the fit of sample data to a type I Pareto distribution using the Kolmogorov-Smirnov and Anderson-Darling tests, and for displaying the q-q plot and histogram with an overlaid density curve, is shown.

# Sample data.

x <- c(532.2385845, 659.1817235, 703.9185616, 732.4668655, 572.7389011, 1502.031724, 1071.11279, 556.6512741, 646.3622704, 622.2013372, 540.796541, 847.3602594, 928.6272967, 599.2422898, 519.4737837, 685.6524203, 506.685297, 550.3047107, 571.2856056, 617.1023659, 588.9614346, 502.594959, 937.4374491, 551.5044233, 633.9000483, 683.8337772, 529.7526944, 569.1404291, 509.8675755, 540.9786305, 598.3247283, 761.718308, 516.5523976, 1185.754777, 804.8759389, 513.7129142, 560.9510722, 609.1662801, 524.6406256, 721.7654816)

# Load library

Library (EnvStats)

# Kolmogorov-Smirnov test.

x_sorted <- sort(x)

n <- length(x_sorted) # Sample size

i <- 1:n

F_empirical <- i / n

x_m <- min(x_sorted)

y <- log(x_sorted / x_m)

lambda_hat <- n / sum(y)

F_theoretical <- 1 - exp(-lambda_hat * y)

D_plus <- max(F_empirical - F_theoretical)

F_empirical_prev <- c(0, F_empirical[-n]) # Add 0 at the beginning for F_n(x_(i-1))

D_minus <- max(F_theoretical - F_empirical_prev)

D <- max(D_plus, D_minus)

# Stephens’ correction for D (D_c).

Dc <- (D - 0.2 / n) * (sqrt(n) + 0.26 + 0.5 / sqrt(n))

alpha <- 0.05 # Significance level

D_alpha <- 1.094 # Critical value for α = 0.05 (Stephens, 1974)

if (Dc <= D_alpha) {decision <- "H_0 holds: The data follow the theoretical distribution."

} else {decision <- "H_0 is rejected: The data do not follow the theoretical distribution."}

# Display results (rounding to three decimal places).

cat("D⁺:", round(D_plus, 3), "\n")

cat("D⁻:", round(D_minus, 3), "\n")

cat("D:", round(D, 3), "\n")

cat("D_c:", round(Dc, 3), "\n")

cat("Significance level: α =", alpha, "\n")

cat("Decision:", decision, "\n")

# Anderson-Darling test.

y_prime <- y[-1]

n_prime <- length(y_prime)

i_prime <- 1: n_prime

lambda_hat_prime <- n_prime / sum(y_prime)

FY_i <- 1 - exp(-lambda_hat_prime * y_prime)

log_FY_i <- log(FY_i)

FY_complement_log <- log(1 - FY_i[n_prime:i_prime])

A <- -n_prime - sum((2 * i_prime - 1) / n * (log_FY_i + FY_complement_log))

# Apply Stephens’ correction.

Ac <- A * (1 + 0.6 / n_prime)

alpha <- 0.05 # Significance level

A_alpha <- 1.321 # Critical value for α = 0.05 (Stephens, 1974)

if (Ac <= A_alpha) {

decision <- "H_0 holds: The data follow the theoretical distribution."

} else {decision <- "H_0 is rejected: The data do not follow the theoretical distribution."}

# Display results.

cat("A:", A, "\n")

cat("Ac:", Ac, "\n")

cat("Significance level: α =", alpha, "\n")

cat("Decision:", decision, "\n")

# Anderson-Darling test modified by Sinclair et al. (1990).

alpha_hat <- lambda_hat

FX <- 1 - (x_m / x_sorted)^alpha_hat

sum(FX)

sum((2 - (2 * i - 1) / n) * log(1 - FX))

AU <- n / 2 - 2 * sum(FX) - sum((2 - (2 * i - 1) / n) * log(1 - FX))

p <- 0.05 # Significance level

t <- log(p / (1 - p))

u <- 1 / (1 + 0.3 / sqrt(n))

G_p <- 0.1170 - 0.03791 * t + 0.06318 * u + 0.09878 * t * u + 0.009184 * t^2 * u - 0.00009742 * t^4 * u

AU_np <- 1 - 1 / (1 + exp(G_p))

if (AU <= AU_np) {

decision <- "H_0 holds: The data follow the theoretical distribution."

} else {decision <- "H_0 is rejected: The data do not follow the theoretical distribution."}

# Display results (rounding to three decimal places).

cat("Estimate of scale parameter: x_m =", round(x_m, 3), "\n")

cat("Estimate of shape parameter: α = :", round(alpha_hat, 3), "\n")

cat("Test statistic: AU =:", round(AU, 3), "\n")

cat("Significance level: p =:", p, "\n")

cat("Critical value (AU_n,p):", round(AU_np, 3), "\n")

cat("Decision:", decision, "\n")

# Q-Q plot.

u <- (i - 1/3) / (n + 1/3)

q <- eqpareto(x, p = u, method = "mle", plot.pos.con = 1/3)

plot(q$quantiles, x_sorted, pch = 1, xlab = " Theoretical quantiles ", ylab = "Empirical quantiles")

abline(0, 1, col = "darkblue", lwd = 2)

# Histogram with an overlaid density curve.

hist(x, breaks = "fd", col = "lightyellow", border = "black", freq = FALSE, xlab = "Values of X", ylab = "Density", main = "")

lines(density(x), col="darkblue", lwd=4)

R provides a command to generate random samples that follow a Type I Pareto distribution. The EnvStats package must be loaded, and a seed can be set to ensure the results are stable (e.g., 123). It is only necessary to specify the sample size (n), the value of the scale parameter (xm) in “location” (since in the Type I Pareto distribution, the scale and location parameters coincide), and the value of the shape parameter (α) in “shape”.

Library (EnvStats)

set.seed (123)

rpareto (n = 40, location = 500, shape = log (5)/log(4))

9. Strengths and Weaknesses of Type I Pareto Distribution

The Pareto distribution is a valuable tool in social research for analyzing extreme inequality, but its simplicity and specific focus make it less versatile than other models for more balanced or nuanced distributions (Charpentier & Flachaire, 2022). The following are some of its most distinctive features:

Flexibility with inequality measures: The Pareto model is directly linked to measures of inequality, such as the Gini coefficient. It can succinctly model disparities, making it a preferred choice over less interpretable models like the exponential distribution.

Simplicity in application: With only two parameters, the type I Pareto distribution is relatively simple compared to multi-parameter models such as generalized Pareto or Weibull distributions. This simplicity can be advantageous for interpretability but might limit its ability to model more complex phenomena.

Fit for empirical data: While the type I Pareto distribution fits well for data with pronounced inequality, it might not perform as effectively for datasets with moderate or low inequality. In such cases, models like the log-normal or gamma distributions might provide better fits.

Possible combination with other distributions: The type I Pareto distribution typically focuses on the upper tail of a distribution (e.g., the wealthiest individuals). For comprehensive population modeling, hybrid models (e.g., combining Pareto for the tail and log-normal or beta for the body distribution) may be more accurate (Akinsete, Famoye, & Lee, 2008; Martins, Liska, Beijo, Menezes, & Cirillo, 2020). However, this feature can be seen as a limitation of the Pareto distribution that needs to be combined.

Specifically, the Type I Pareto distribution has several limitations that should be considered when applying it to real world data. First, it is primarily suited to modelling the upper tails of distributions and may not accurately represent the entire data set, especially for moderate or low values. Second, its reliance on a power-law relationship imposes strict assumptions about the proportion of large values, which may oversimplify complex phenomena. In addition, the distribution assumes that the scale parameter (xm) represents a minimum threshold, which may not always match the data structure. The Type I Pareto distribution also struggles to capture multimodal distributions or data sets with varying degrees of inequality across different ranges. Finally, while its simplicity makes it attractive in terms of interpretability, this can limit its flexibility in capturing more nuanced patterns, necessitating the use of hybrid or alternative models in certain contexts.

The type I Pareto distribution fails in contexts where the data are light-tailed, multimodal, highly dynamic or dominated by intermediate values, or where structural factors introduce complexity that cannot be accounted for by a simple power-law relationship. For example, in a society with a large and stable middle class, income distributions often peak around the median, which the Pareto model cannot effectively capture due to its emphasis on extremes.

10. Conclusion

The illustrated presentation of the type I Pareto distribution with scale parameter xm and shape parameter α demonstrates that calculating probabilities is straightforward, as is computing descriptive statistics. As with other distributions, the most effective estimators for its parameters are those derived using the maximum likelihood method (Martín, Parra, Pizarro, & Sanjuán, 2022), which are both simple and computationally efficient. The estimator for xm is the sample minimum, while the estimator for α is the inverse of the mean of the log-transformed sample data (Siudem et al., 2022). Naturally, it is necessary to assess the fit of the empirical data to the probability model by employing both a graphical approach—using a histogram with an overlaid density curve (characterized by an inverted J-shape) and a quantile-quantile plot (showing a 45-degree alignment)—and an inferential approach, involving the Kolmogorov-Smirnov (Chu et al., 2019), Anderson-Darling (Stephens, 1986), or modified Anderson-Darling (Sinclair et al., 1990) tests. In the first two tests, the data are transformed to follow an exponential distribution, whereas the third test does not require such a transformation.

All these calculations can be performed using the scripts developed and presented in this article, which can be adapted to other score vectors. Additionally, Excel is also a valuable tool for this purpose. It should be noted that R is not only a free-access program, but it is currently the most comprehensive among statistical software, and it is gaining popularity among researchers in psychology and social sciences (Navarro, 2024).

Within the social sciences, this distribution serves as an effective probability model for various phenomena, including the distribution of income, rent, accumulated resources within a company, region, or country, and insurance claims (Feng et al., 2020). It is also applicable to the average frequency of behaviors that are rare among most individuals but highly prevalent in a few, such as compulsive behaviors, behavioral or substance addictions, and paraphilias (Rajeev, 2022). Notably, it can be applied to the same continuous variables as the lognormal distribution, such as epidemiological data (Beare & Toda, 2020), necessitating an evaluation to determine which probability model provides the best fit (Charpentier & Flachaire, 2022; Feng et al., 2020). For discrete data defined on a bounded set of natural numbers, the Zipf distribution—a variant of the zeta distribution—is the appropriate choice. From a social perspective, a good fit to a Paretian probability model highlights the presence of well-defined funnel rules and inequality in resource allocation. This, in turn, may inspire policies of segregation targeting specific social sectors deemed responsible for disproportionate expenditure or debt, such as health insurance for the elderly (Rodríguez-Abreu, 2021).

Acknowledgements

The author expresses his gratitude to the reviewers and editors for the suggestions and corrections received for the improvement of the manuscript.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

[1] Ahmad, H. A. H., & Almetwally, E. M. (2020). Marshall-Olkin Generalized Pareto Distribution: Bayesian and Non Bayesian Estimation. Pakistan Journal of Statistics and Operation Research, 16, 21-33.
https://doi.org/10.18187/pjsor.v16i1.2935
[2] Akinsete, A., Famoye, F., & Lee, C. (2008). The Beta-Pareto Distribution. Statistics, 42, 547-563.
https://doi.org/10.1080/02331880801983876
[3] Anderson, T. W., & Darling, D. A. (1952). Asymptotic Theory of Certain “Goodness of Fit” Criteria Based on Stochastic Processes. The Annals of Mathematical Statistics, 23, 193-212.
https://doi.org/10.1214/aoms/1177729437
[4] Andria, J. (2022). A Computational Proposal for a Robust Estimation of the Pareto Tail Index: An Application to Emerging Markets. Applied Soft Computing, 114, Article ID: 108048.
https://doi.org/10.1016/j.asoc.2021.108048
[5] Arnold, B. C. (2015). Pareto Distribution (2nd ed.). John Wiley y Sons, Ltd.
https://doi.org/10.1201/b18141
[6] Barczy, M., K. Nedényi, F., & Sütő, L. (2023). Probability Equivalent Level of Value at Risk and Higher-Order Expected Shortfalls. Insurance: Mathematics and Economics, 108, 107-128.
https://doi.org/10.1016/j.insmatheco.2022.11.004
[7] Barnoy, A., & Reich, Z. (2022). Trusting Others: A Pareto Distribution of Source and Message Credibility among News Reporters. Communication Research, 49, 196-220.
[8] Beare, B. K., & Toda, A. A. (2020). On the Emergence of a Power Law in the Distribution of COVID-19 Cases. Physica D: Nonlinear Phenomena, 412, Article ID: 132649.
https://doi.org/10.1016/j.physd.2020.132649
[9] Benczes, I. (2022). Taking Back Control over the Economy: From Economic Populism to the Economic Consequences of Populism. European Policy Analysis, 8, 109-123.
https://doi.org/10.1002/epa2.1134
[10] Bhoj, D. S., & Chandra, G. (2021). Ranked Set Sampling with Lowest Order Statistics for Pareto Distribution. Communications in Statistics-Simulation and Computation, 52, 2327-2335.
https://doi.org/10.1080/03610918.2021.1904143
[11] Campbell, M. R., & Brauer, M. (2021). Is Discrimination Widespread? Testing Assumptions about Bias on a University Campus. Journal of Experimental Psychology: General, 150, 756-777.
https://doi.org/10.1037/xge0000983
[12] Charpentier, A., & Flachaire, E. (2022). Pareto Models for Top Incomes and Wealth. The Journal of Economic Inequality, 20, 1-25.
https://doi.org/10.1007/s10888-021-09514-6
[13] Chattamvelli, R., & Shanmugam, R. (2021). Pareto Distribution. In Continuous Distributions in Engineering and the Applied Sciences-Part II (pp. 179-188). Springer International Publishing.
https://doi.org/10.1007/978-3-031-02435-1_3
[14] Chen, B., Zhang, K., Wang, L., Jiang, S., & Liu, G. (2019). Generalized Extreme Value-Pareto Distribution Function and Its Applications in Ocean Engineering. China Ocean Engineering, 33, 127-136.
https://doi.org/10.1007/s13344-019-0013-9
[15] Cheng, W., Fu, H., Wang, L., Dong, C., Jin, Y., Jiang, M. et al. (2023). Data-Driven, Multi-moment Fluid Modeling of Landau Damping. Computer Physics Communications, 282, Article ID: 108538.
https://doi.org/10.1016/j.cpc.2022.108538
[16] Chu, J., Dickin, O., & Nadarajah, S. (2019). A Review of Goodness of Fit Tests for Pareto Distributions. Journal of Computational and Applied Mathematics, 361, 13-41.
https://doi.org/10.1016/j.cam.2019.04.018
[17] Diawara, D., Kane, L., Dembele, S., & Lo, G. S. (2021). Applying of the Extreme Value Theory for Determining Extreme Claims in the Automobile Insurance Sector: Case of a China Car Insurance. Afrika Statistika, 16, 2883-2909.
https://doi.org/10.16929/as/2021.2883.188
[18] Fedotenkov, I. (2020). A Review of More than One Hundred Pareto-Tail Index Estimators. Statistica, 80, 245-299.
https://doi.org/10.6092/issn.1973-2201/9533
[19] Feng, M., Deng, L., Chen, F., Perc, M., & Kurths, J. (2020). The Accumulative Law and Its Probability Model: An Extension of the Pareto Distribution and the Log-Normal Distribution. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 476, Article ID: 20200019.
https://doi.org/10.1098/rspa.2020.0019
[20] Gini, C. (1936). On the Measure of Concentration with Special Reference to Income and Statistics. Colorado College Publication, General Series, 208, 73-79.
[21] Landoni, J. S., & Villegas, L. (2022). Pagan los Pobres: Consecuencias Negativas de Políticas Públicas con Buenas (y Malas) Intenciones [The Poor Pay: Negative Consequences of Public Policies with Good (and Bad) Intentions]. Editorial Galerna.
[22] Le Gall, P., Favre, A., Naveau, P., & Prieur, C. (2022). Improved Regional Frequency Analysis of Rainfall Data. Weather and Climate Extremes, 36, Article ID: 100456.
https://doi.org/10.1016/j.wace.2022.100456
[23] Lomax, K. S. (1954). Business Failures: Another Example of the Analysis of Failure Data. Journal of the American Statistical Association, 49, 847-852.
https://doi.org/10.1080/01621459.1954.10501239
[24] Lorenz, M. O. (1905). Methods of Measuring the Concentration of Wealth. Publications of the American Statistical Association, 9, 209-219.
https://doi.org/10.2307/2276207
[25] Martín, J., Parra, M. I., Pizarro, M. M., & Sanjuán, E. L. (2022). Baseline Methods for the Parameter Estimation of the Generalized Pareto Distribution. Entropy, 24, Article No. 178.
https://doi.org/10.3390/e24020178
[26] Martins, A. L. A., Liska, G. R., Beijo, L. A., Menezes, F. S. d., & Cirillo, M. Â. (2020). Generalized Pareto Distribution Applied to the Analysis of Maximum Rainfall Events in Uruguaiana, RS, Brazil. SN Applied Sciences, 2, Article No. 1479.
https://doi.org/10.1007/s42452-020-03199-8
[27] Mateus, A., & Caeiro, F. (2022). Confidence Intervals for the Shape Parameter of a Pareto Distribution. AIP Conference Proceedings, 2425, Article ID: 320003.
https://doi.org/10.1063/5.0081541
[28] McCarthy, D. M., & Winer, R. S. (2019). The Pareto Rule in Marketing Revisited: Is It 80/20 or 70/20? Marketing Letters, 30, 139-150.
https://doi.org/10.1007/s11002-019-09490-y
[29] Mojiri, A., & َََAhmadi, K. (2022). Inequality in the Distribution of Resources in Health Care System by Using the Gini Coefficient and Lorenz Curve (A Case Study of Sistan and Baluchestan Province over a Five-Year Period). Health Monitor Journal of the Iranian Institute for Health Sciences Research, 21, 227-236.
https://doi.org/10.52547/payesh.21.3.227
[30] Navarro, D (2024). Learning Statistics with RA Tutorial for Psychology Students and Other Beginners. LibreTexts Libraries. Statistics.
https://stats.libretexts.org/Bookshelves/Applied_Statistics/Learning_Statistics_with_R_-_A_tutorial_for_Psychology_Students_and_other_Beginners_(Navarro)
[31] Pareto, V. F. D. (1896). Cours d’Economie Politique (Vol. 1). F. Rouge éditeur.
[32] Pareto, V. F. D. (1897). Cours d’Economie Politique (Vol. 2). F. Rouge éditeur.
[33] Pearson, K. (1895). Contributions to the Mathematical Theory of Evolution. II. Skew Variation in Homogeneous Material. Philosophical Transactions of the Royal Society of London A, 186, 343-414.
https://doi.org/10.1098/rsta.1895.0010
[34] Pearson, K. (1905). “Das fehlergesetz und seine verallgemeiner-ungen durch fechner und pearson.” A rejoinder. Biometrika, 4, 169-212.
https://doi.org/10.1093/biomet/4.1-2.169
[35] Qian, W., Chen, W., & He, X. (2021). Parameter Estimation for the Pareto Distribution Based on Ranked Set Sampling. Statistical Papers, 62, 395-417.
https://doi.org/10.1007/s00362-019-01102-1
[36] Rácz, E., Spasibko, K., Manceau, M., Ruppert, L., Chekhova, M. V., & Filip, R. (2023). Quantifying Optical Rogue Waves.
https://doi.org/10.48550/arXiv.2303.04615
[37] Rajeev, C. D. S. (2022). Pareto Principle and Compulsive Buying Disorder—An Analysis. Journal of Educational and Social Research, 8, 44-59.
[38] Rao, C. R. (1973). Linear Statistical Inference and Its Applications. Wiley.
https://doi.org/10.1002/9780470316436
[39] Rodríguez Abreu, M. (2021). Gasto de bolsillo y gastos catastróficos en salud en hogares mexicanos. Carta Económica Regional, 34, 59-83.
https://doi.org/10.32870/cer.v0i128.7825
[40] Ross, S. M. (2022). Simulation (6th ed.). Academic Press.
[41] Rytgaard, M. (1990). Estimation in the Pareto Distribution. ASTIN Bulletin, 20, 201-216.
https://doi.org/10.2143/ast.20.2.2005443
[42] Safari, M. A. M., Masseran, N., Ibrahim, K., & Hussain, S. I. (2019). A Robust and Efficient Estimator for the Tail Index of Inverse Pareto Distribution. Physica A: Statistical Mechanics and its Applications, 517, 431-439.
https://doi.org/10.1016/j.physa.2018.11.029
[43] Sarabia, J. M., Jordá, V., & Prieto, F. (2019). On a New Pareto-Type Distribution with Applications in the Study of Income Inequality and Risk Analysis. Physica A: Statistical Mechanics and Its Applications, 527, Article ID: 121277.
https://doi.org/10.1016/j.physa.2019.121277
[44] Sinclair, C. D., Spurr, B. D., & Ahmad, M. I. (1990). Modified Anderson Darling Test. Communications in StatisticsTheory and Methods, 19, 3677-3686.
https://doi.org/10.1080/03610929008830405
[45] Sitthiyot, T., & Holasut, K. (2021). A Simple Method for Estimating the Lorenz Curve. Humanities and Social Sciences Communications, 8, Article No. 268.
https://doi.org/10.1057/s41599-021-00948-x
[46] Siudem, G., Nowak, P., & Gagolewski, M. (2022). Power Laws, the Price Model, and the Pareto Type-2 Distribution. Physica A: Statistical Mechanics and its Applications, 606, Article ID: 128059.
https://doi.org/10.1016/j.physa.2022.128059
[47] Song, I., Ryoung-Park, S., & Yoon, S. (2022). Probability and Random Variables: Theory and Applications. Springer International Publishing.
[48] Stephens, M. A. (1974). EDF Statistics for Goodness of Fit and Some Comparisons. Journal of the American Statistical Association, 69, 730-737.
https://doi.org/10.2307/2286009
[49] Stephens, M. A. (1986). Tests Based on EDF Statistics. In R. B. D’Agostino, & M. A. Stephens (Eds.), Goodness-of-Fit Techniques (pp. 97-193) Marcel Dekker, Inc.
https://doi.org/10.1201/9780203753064-4
[50] Sudharson, D., & Prabha, D. (2019). Retracted Article: A Novel Machine Learning Approach for Software Reliability Growth Modelling with Pareto Distribution Function. Soft Computing, 23, 8379-8387.
https://doi.org/10.1007/s00500-019-04047-7
[51] Sudharson, D., Divya, P., Ratheeshkumar, M., Saravanan, A., Nithiyashree, V. K., & Srinithi, J. (2022). A PD ANN Machine Learning Framework for Reliability Optimization in Application Software. In 2022 Smart Technologies, Communication and Robotics (STCR) (pp. 1-4). Institute of Electrical and Electronics Engineers.
https://doi.org/10.1109/stcr55312.2022.10009626
[52] Tokhirov, A., Harmáček, J., & Syrovátka, M. (2021). Remittances and Inequality: The Post-Communist Region. Prague Economic Papers, 30, 426-448.
https://doi.org/10.18267/j.pep.776
[53] Valkanas, K., & Diamandis, P. (2022). Pareto Distribution in Virtual Education: Challenges and Opportunities. Canadian Medical Education Journal, 13, 102-104.
https://doi.org/10.36834/cmej.73511
[54] World Bank (2022). Gini Index.
https://data.worldbank.org/indicator/
[55] Xu, T., Sedory, S. A., & Singh, S. (2022). Lowering the Cramer-Rao Lower Bounds of Variance in Randomized Response Sampling. Communications in StatisticsSimulation and Computation, 51, 4112-4126.
https://doi.org/10.1080/03610918.2020.1737874
[56] Yang, X., & Zhou, P. (2022). Wealth Inequality and Social Mobility: A Simulation-Based Modelling Approach. Journal of Economic Behavior & Organization, 196, 307-329.
https://doi.org/10.1016/j.jebo.2022.02.012
[57] Zhang, Y., Wu, Y., & Yao, H. (2022). Optimal Health Insurance with Constraints under Utility of Health, Wealth and Income. Journal of Industrial and Management Optimization, 18, 1519-1540.
https://doi.org/10.3934/jimo.2021031

Copyright © 2025 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.