Basic Functions for Computational Implementation of the Box-Cox Symmetric Class of Distributions

A class of distributions called Box-Cox symmetric was proposed for random variables with asymmetric distributions. Through its structure, this class allows the parameters to be interpreted in terms of quantiles (in particular, the median), relative dispersion and skewness. This study presents the initial results of the computational development of basic functions for each of the distributions that make up the Box-Cox symmetric class. So far, four functions have been developed to compose a routine in the software R. These functions cover random number generation, the probability density function, the cumulative distribution function, and the quantile function associated with a given probability. Examples of the implemented functions are presented. The gamlss package was used to check the performance of the developed functions.


Introduction
The Box-Cox symmetric (BCS) class of distributions emerged as an alternative for modeling highly asymmetric data, even when outliers are present [1]. The class was originally motivated by problems in the analysis of nutrient intake data in dietetics. Because intake data typically have an asymmetric distribution, it is common to apply the Box-Cox transformation [2] and then model the transformed data with a normal distribution. This approach works well only when the data are reasonably well behaved. When the distribution is strongly asymmetric or outliers are present, the Box-Cox transformation may fail to yield an adequate distribution for the data. As an alternative, usual nutrient intake distributions and the prevalence of inadequate nutrient intakes have been estimated through a Box-Cox t model with random intercept [3].
Through its structure, the BCS class allows the parameters to be interpreted in terms of quantiles (especially the median), relative dispersion and skewness, which makes it attractive for regression models. This class includes as particular cases the log-normal [4], Box-Cox t [5], Box-Cox Cole-Green [6] and Box-Cox power exponential [7] distributions.
This work presents an initial study of the computational implementation of the distributions that make up the BCS class, through the presentation of four basic functions. Additionally, a comparison with the gamlss package in R is presented to corroborate the proposal.

Methodology
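Let Y be a positive and continuous random variable. The Box-Cox symmetric class of distributions is defined from the transformation

$$Z = h(Y; \mu, \sigma, \nu) = \begin{cases} \dfrac{1}{\sigma\nu}\left[\left(\dfrac{Y}{\mu}\right)^{\nu} - 1\right], & \nu \neq 0, \\ \dfrac{1}{\sigma}\log\left(\dfrac{Y}{\mu}\right), & \nu = 0, \end{cases} \qquad (1)$$

where µ > 0, σ > 0, −∞ < ν < ∞ and Z has a standard symmetric distribution truncated to the interval compatible with Y > 0, that is, (−1/(σν), ∞) if ν > 0, (−∞, −1/(σν)) if ν < 0 and the whole real line if ν = 0, with r(·) being the density generating function.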
Thus, Y has a BCS distribution with parameters µ, σ and ν, with r(·) being the density generating function. The class of symmetric distributions includes a number of well-known distributions as special cases, depending on the choice of r: the normal, Student-t, power exponential, type I logistic, type II logistic and slash distributions, among others. These densities have quite different tail behaviors, and some of them have heavier or lighter tails than the normal distribution [1].
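To make the transformation concrete, the following minimal R sketch implements the Box-Cox map from Y to Z and its inverse, assuming the standard BCS parameterization described in [1]. The names bc_transform and bc_inverse are hypothetical helpers introduced only for illustration; they are not the authors' implementation.

```r
# Hypothetical helpers (not from the paper): Box-Cox map used to define the BCS class.
# z = ((y / mu)^nu - 1) / (sigma * nu) for nu != 0, and log(y / mu) / sigma for nu = 0.
bc_transform <- function(y, mu, sigma, nu) {
  stopifnot(all(y > 0), mu > 0, sigma > 0)
  if (nu == 0) log(y / mu) / sigma else ((y / mu)^nu - 1) / (sigma * nu)
}

# Inverse map from z back to y; valid only for z inside the truncated support,
# i.e. where 1 + sigma * nu * z > 0.
bc_inverse <- function(z, mu, sigma, nu) {
  if (nu == 0) mu * exp(sigma * z) else mu * (1 + sigma * nu * z)^(1 / nu)
}

# Round trip on a few positive values
y <- c(0.5, 1, 5, 20)
z <- bc_transform(y, mu = 5, sigma = 0.5, nu = 0.5)
all.equal(bc_inverse(z, mu = 5, sigma = 0.5, nu = 0.5), y)
```

Note that y = µ is mapped to z = 0, which is why µ is related to the median of Y.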
The first basic function generates random numbers from a distribution of the BCS class. The function name begins with r (random), followed by the required BCS distribution:

rBCS(n, µ, σ, ν) (2)

where:
• n: number of observations to be simulated;
• µ: parameter related to the median in a distribution that belongs to the BCS class;
• σ: parameter related to the relative dispersion in a distribution that belongs to the BCS class;
• ν: parameter related to the skewness in a distribution that belongs to the BCS class.

Note that the proposed function requires only three distribution parameters, but some distributions of the BCS class require a fourth parameter, which is related to the kurtosis of the distribution. For instance, the Box-Cox t distribution has an additional parameter to model the tail decay, which is specified together with the others. Thus, to generate random numbers from a variable that has a Box-Cox Normal distribution (BCN), for example, write rBCN(n, µ, σ, ν), specifying the number of observations to be simulated and the parameter values.
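As an illustration of how such a generator could work, the sketch below draws Box-Cox Normal variates by inversion: a truncated standard normal Z is generated from uniform numbers and mapped back to Y. This is only one possible approach, written under the assumption of the standard BCS parameterization; the name rBCN_sketch is hypothetical and the code is not the authors' implementation.

```r
# Hypothetical sketch of a Box-Cox Normal generator (not the authors' code).
# Z is standard normal truncated to the support implied by Y > 0;
# Y is recovered with the inverse Box-Cox transformation.
rBCN_sketch <- function(n, mu, sigma, nu) {
  stopifnot(mu > 0, sigma > 0)
  u <- runif(n)
  if (nu == 0) {
    return(mu * exp(sigma * qnorm(u)))       # log-normal case, no truncation
  }
  k <- pnorm(1 / (sigma * abs(nu)))          # normal mass kept after truncation
  z <- if (nu > 0) qnorm(1 - (1 - u) * k) else qnorm(u * k)
  mu * (1 + sigma * nu * z)^(1 / nu)
}

set.seed(123)
x <- rBCN_sketch(1000, mu = 1, sigma = 0.5, nu = 0.5)
summary(x)
```

All simulated values are positive by construction, and the sample median should lie close to µ when the truncated mass is small.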
The second basic function returns the probability density function. The function name begins with d (density), followed by the required BCS distribution:

dBCS(y, µ, σ, ν)

where:
• y: value of the observed variable at which the probability density function is evaluated;
• µ, σ and ν: parameters of a distribution that belongs to the BCS class, as described in Equation (2).

Then, by evaluating the function over a range of values of y, the graph of the probability density function of a variable that follows a BCS distribution can be plotted. For instance, to obtain the density function of the Box-Cox Normal distribution (BCN), just specify the parameters in the function dBCN(y, µ, σ, ν).
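A minimal sketch of a Box-Cox Normal density is given below, assuming the standard BCS parameterization: the normal density of the transformed value is multiplied by the Jacobian of the transformation and renormalized for the truncation. The name dBCN_sketch is hypothetical; this is an illustration, not the authors' function.

```r
# Hypothetical sketch of the Box-Cox Normal density (not the authors' code).
dBCN_sketch <- function(y, mu, sigma, nu) {
  stopifnot(all(y > 0), mu > 0, sigma > 0)
  if (nu == 0) {
    z <- log(y / mu) / sigma
    return(dnorm(z) / (y * sigma))           # log-normal case
  }
  z <- ((y / mu)^nu - 1) / (sigma * nu)      # Box-Cox transformed value
  jac <- y^(nu - 1) / (mu^nu * sigma)        # dz/dy
  jac * dnorm(z) / pnorm(1 / (sigma * abs(nu)))  # renormalize for truncation
}

# Sanity check: the density should integrate to approximately 1 over (0, Inf)
integrate(dBCN_sketch, lower = 0, upper = Inf, mu = 5, sigma = 0.5, nu = 0.5)
```

Evaluating dBCN_sketch on a grid such as seq(0.1, 20, length.out = 400) and passing the result to plot(..., type = "l") draws the density curve.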
The third basic function returns the cumulative distribution function, P(Y ≤ y). The function name begins with p (probability), followed by the required BCS distribution:

pBCS(y, µ, σ, ν)

where:
• y: any possible positive value of Y;
• µ, σ and ν: parameters of a distribution that belongs to the BCS class, as described in Equation (2).

For example, to obtain the cumulative probability associated with any value of a random variable with Box-Cox Normal distribution (BCN), just specify the parameters in pBCN(y, µ, σ, ν), for any value of y.
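The cumulative distribution function of a Box-Cox Normal variable can be sketched as the truncated normal CDF evaluated at the transformed value. The code below assumes the standard BCS parameterization; the name pBCN_sketch is hypothetical and the code is not the authors' implementation.

```r
# Hypothetical sketch of the Box-Cox Normal CDF (not the authors' code).
pBCN_sketch <- function(y, mu, sigma, nu) {
  stopifnot(all(y > 0), mu > 0, sigma > 0)
  if (nu == 0) return(pnorm(log(y / mu) / sigma))  # log-normal case
  z <- ((y / mu)^nu - 1) / (sigma * nu)
  k <- pnorm(1 / (sigma * abs(nu)))          # normal mass kept after truncation
  lower <- if (nu > 0) 1 - k else 0          # normal mass cut off below the support
  (pnorm(z) - lower) / k
}

# Cumulative probabilities at a few values of y
pBCN_sketch(c(1, 2.5, 5, 10, 20), mu = 5, sigma = 0.5, nu = 0.5)
```

The returned values increase with y and lie between 0 and 1; at y = µ the value is close to 0.5 whenever the truncated mass is negligible.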
The fourth basic function returns the quantile of a BCS distribution. The function name begins with q (quantile), followed by the required BCS distribution:

qBCS(p, µ, σ, ν)

where:
• p: probability value, which varies between 0 and 1;
• µ, σ and ν: parameters of a distribution that belongs to the BCS class, as described in Equation (2).

For instance, to obtain the quantile associated with a probability value of a Box-Cox Normal distribution (BCN), just write qBCN(p, µ, σ, ν), specifying the probability value and the parameter values.
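A quantile function for the Box-Cox Normal case can be sketched by inverting the truncated normal CDF and then applying the inverse Box-Cox transformation. The code assumes the standard BCS parameterization; the name qBCN_sketch is hypothetical and this is not the authors' implementation.

```r
# Hypothetical sketch of the Box-Cox Normal quantile function (not the authors' code).
qBCN_sketch <- function(p, mu, sigma, nu) {
  stopifnot(all(p > 0 & p < 1), mu > 0, sigma > 0)
  if (nu == 0) return(mu * exp(sigma * qnorm(p)))  # log-normal case
  k <- pnorm(1 / (sigma * abs(nu)))          # normal mass kept after truncation
  # Quantile of the truncated standard normal Z
  z <- if (nu > 0) qnorm(1 - (1 - p) * k) else qnorm(p * k)
  mu * (1 + sigma * nu * z)^(1 / nu)         # back to the scale of Y
}

# Quantiles for a few probabilities
qBCN_sketch(c(0.05, 0.5, 0.95), mu = 5, sigma = 0.5, nu = 0.5)
```

The same inversion, applied to uniform random numbers, yields a random number generator, so the r and q functions can share most of their code.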
To evaluate the implemented functions, the gamlss package [8], available in R, was used. The Box-Cox Normal distribution was chosen to compare the results.
All functions were implemented in the software R, version 4.0.2.

The Use of the Implemented Functions
To generate random numbers, consider a Box-Cox Slash distribution. In this case, the command rBCSlash(n, μ, σ, ν, q) can be used with n = 100, μ = 5, σ = 0.5, ν = −0.5, 0 and 0.5, and q = 2. Figure 1 presents the histogram. Skewness is the main feature of the distributions that make up this class.
To obtain the probability density function of a Box-Cox type II logistic distribution (BCLog2), the command dBCLog2(y, µ, σ, ν) can be used. Using a Box-Cox Cauchy distribution to calculate the cumulative distribution function, the command was pBCCauchy(y, µ = 5, σ = 0.5, ν = 0.5). Considering several values of y, Figure 3 presents the result.
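Because every member of the class differs only in the symmetric distribution chosen for Z, a single CDF routine can in principle serve several members by swapping the base CDF. The sketch below illustrates this idea for the Box-Cox Cauchy case using the base R function pcauchy. The name pBCS_sketch and its base_cdf argument are hypothetical, and the parameterization used by the authors' pBCCauchy may differ.

```r
# Hypothetical generic CDF for a BCS member (not the authors' code).
# base_cdf must be the CDF of a standard symmetric distribution, e.g. pnorm or pcauchy.
pBCS_sketch <- function(y, mu, sigma, nu, base_cdf = pnorm) {
  stopifnot(all(y > 0), mu > 0, sigma > 0)
  if (nu == 0) return(base_cdf(log(y / mu) / sigma))
  z <- ((y / mu)^nu - 1) / (sigma * nu)
  k <- base_cdf(1 / (sigma * abs(nu)))       # base mass kept after truncation
  lower <- if (nu > 0) 1 - k else 0          # by symmetry, mass below the support
  (base_cdf(z) - lower) / k
}

# Box-Cox Cauchy example with the parameter values used in the text
y <- c(1, 2.5, 5, 10, 20)
pBCS_sketch(y, mu = 5, sigma = 0.5, nu = 0.5, base_cdf = pcauchy)
```

With a heavy-tailed base such as the Cauchy, the truncated mass is no longer negligible, so the renormalization by k has a visible effect on the probabilities.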

Comparison between the Implemented Functions and the Gamlss Package
The gamlss package, available in R, was used to check the implemented functions. This package provides some particular cases of the BCS distributions. The Box-Cox Normal distribution was chosen for the comparisons. Figure 4 shows 1000 random numbers generated from a Box-Cox Normal distribution (BCN) with µ = 1, σ = 0.5 and ν = 0.5 using the implemented function and the gamlss package. In gamlss, the Box-Cox Normal is known as Box-Cox Cole-Green (BCCG). Figure 5 shows the probability density function of the Box-Cox Normal (BCN) with parameters µ = 5, σ = 0.5 and ν = 0.5 obtained with the implemented function and with the gamlss package. Table 1 presents a comparison of the quantiles of the BCN obtained with the implemented functions and with the gamlss package. As can be seen, the results were the same.
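A comparison of this kind could be set up as in the sketch below, which places quantiles from a hand-written Box-Cox Normal quantile function next to those of qBCCG from the gamlss.dist package (the distribution package used by gamlss). The function qBCN_sketch is a hypothetical stand-in for the authors' qBCN, and the comparison runs only if gamlss.dist is installed.

```r
# Hypothetical stand-in for the implemented quantile function (not the authors' code).
qBCN_sketch <- function(p, mu, sigma, nu) {
  if (nu == 0) return(mu * exp(sigma * qnorm(p)))
  k <- pnorm(1 / (sigma * abs(nu)))          # normal mass kept after truncation
  z <- if (nu > 0) qnorm(1 - (1 - p) * k) else qnorm(p * k)
  mu * (1 + sigma * nu * z)^(1 / nu)
}

p <- c(0.05, 0.25, 0.5, 0.75, 0.95)
ours <- qBCN_sketch(p, mu = 5, sigma = 0.5, nu = 0.5)

if (requireNamespace("gamlss.dist", quietly = TRUE)) {
  # BCCG is the gamlss name for the Box-Cox Normal distribution
  ref <- gamlss.dist::qBCCG(p, mu = 5, sigma = 0.5, nu = 0.5)
  print(cbind(p, sketch = ours, gamlss = ref))
} else {
  print(cbind(p, sketch = ours))             # package not available: show sketch only
}
```

The same pattern applies to the density and the cumulative distribution function, using dBCCG and pBCCG as references.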

Conclusions
In this work, functions for random number generation, the probability density function, the cumulative distribution function, and the quantile function associated with a given probability were presented for distributions of the BCS class. Comparisons with the gamlss package, which includes some particular cases of the BCS class, provided the same values.
These functions will be used to build a routine for the analysis of data with asymmetric distributions, in order to fit both experimental and regression models. The authors are using R packages and routines to make the functions available.