Bivariate Analysis of Pollutants Monthly Maxima in Mexico City Using Extreme Value Distributions and Copula

Abstract

In the present work, we are interested in studying the joint distributions of pairs of the monthly maxima of the pollutants used by the environmental authorities in Mexico City to classify the air quality in the metropolitan area. In order to obtain the joint distributions a copula will be considered. Since we are analyzing the monthly maxima, the extreme value distributions of Weibull and Fréchet are taken into account. Using these two distributions as marginal distributions in the copula a Bayesian inference was made in order to estimate the parameters of both distributions and also the association parameters appearing in the copula model. The pollutants taken into account are ozone, nitrogen dioxide, sulphur dioxide, carbon monoxide, and particulate matter with diameters smaller than 10 and 2.5 microns obtained from the Mexico City monitoring network. The estimation was performed by taking samples of the parameters generated through a Markov chain Monte Carlo algorithm implemented using the software OpenBugs. Once the algorithm is implemented it is applied to the pairs of pollutants where one of the coordinates of the pair is ozone and the other varies on the set of the remaining pollutants. Depending on the pollutant and the region where they were collected, different results were obtained. Hence, in some cases we have that the best model is that where we have a Fréchet distribution as the marginal distribution for the measurements of both pollutants and in others the most suitable model is the one assuming a Fréchet for ozone and a Weibull for the other pollutant. Results show that, in the present case, the estimated association parameter is a good representation to the correlation parameters between the pair of pollutants analyzed. Additionally, it is a straightforward task to obtain these correlation parameters from the corresponding association parameters.

Share and Cite:

Vazquez-Morales, J. , Rodrigues, E. and Reyes-Cervantes, H. (2024) Bivariate Analysis of Pollutants Monthly Maxima in Mexico City Using Extreme Value Distributions and Copula. Journal of Environmental Protection, 15, 796-826. doi: 10.4236/jep.2024.157046.

1. Introduction

High levels of air pollution are a recurrent problem in many cities/regions around the world (see, for instance, [1] [2]). This may produce harmful effects on the population’s health, as well as the environment. Among the many pollutants measured in Mexico City’s monitoring network are ozone (O3), sulphur dioxide (SO2), carbon monoxide (CO), nitrogen dioxide (NO2), particulate matter with a diameter smaller than 10 microns (PM10) and those with a diameter smaller than 2.5 (PM2.5). Pollutant concentrations are measured in parts per million (ppm) in the cases of O3, SO2, NO2, and CO, and in micrograms per cubic meter (μg/m3) when we consider PM10 and PM2.5. These pollutants are also known as the criterion pollutants which are used to establish the air quality indices in the city. They are chosen because of their possible harmful impact on human health, as well as the environment ([3] [4]). For instance, if we have ozone levels above 0.11 ppm, the ill, newborn, and elderly may experience serious health deterioration (see, for example, [5]-[8], among others). We also know that sulphur and nitrogen dioxides, when in contact with the right level of humidity in the atmosphere, may produce acid rain ([9] [10]). Additionally, exposure of pregnant women to CO, PM10, and PM2.5 may produce adverse effects on the newborn ([11]-[13]), and exposure to PM10 and PM2.5 may cause cardiovascular problems in the population in general and an increase in mortality in at-risk groups ([14]-[17]). Therefore, it is very important to study the behavior of these and other pollutants. It is possible to find in the literature several works studying pollution behavior using data from several regions of the world. Among them are [18]-[20], in particular, it is of interest to study the behavior of concentrations that present extreme values when compared to the remaining measurements. One way of studying this is to use extreme value models.

Extreme value theory ([21]-[23]) has been applied in many areas ([24]) including the study of several environmental problems. For instance, we have [25] where these models are used to study air pollution with application to the Istanbul data [26], in which they are used to study the ozone extreme trends using data from one monitoring station in Mexico City [27], where they are used to predict air pollution threshold exceedances with an application to NO2, O3, and CO data obtained from monitoring sites located in Queensland, Australia [28], which studies the impact of climate change on global flood and precipitation [29], where trends in temperature extremes in some cities in Mexico are analyzed, and [30] where extreme sea levels in coastal China are studied. In the present study, we are interested in analyzing the joint behavior of pairs of pollutants extreme measurements.

When it comes to studying the distribution of multivariate data we may consider, for instance, the direct use of the multivariate distributions when they are available ([31]) or to use of copulas ([32]) to obtain the expressions for the corresponding distributions. Copulas have also been used in works where multivariate extreme value models are considered to study environmental problems, for instance, in [33] a copula approach was considered to study multivariate multi-parameter extreme value models with an application to annual flooding data from the Spey basin in the north of Scotland [34], where a copula model is used to analyze the ranking of the paired data (PM10, CO), (PM10, NO2), (PM10, O3), and (PM10, SO2) with a generalized Pareto distribution and data from the Klang City, Malaysia.

In the present study we consider an extreme value model and a copula to study the association and correlation between pairs of pollutants obtained from the Mexico City monitoring network. We consider the bivariate analysis of the pairs (NO2, O3), (SO2, O3), (CO, O3), (PM10, O3), and (PM2.5, O3). The copula used is the Gumbel-Hougaard ([35] [36]) considering two extreme value distributions, namely, the Weibull and Fréchet as possible marginal distributions. The novelties are the copula used, as well as the data sets. The differences from previous works also using data from Mexico City are the observational period, the copula used, the data taken into account, and the regional analysis performed.

This work is organized as follows. In Section 2, we present the mathematical and the Bayesian formulations of the model. Section 3 gives an application to the case of Mexico City’s monthly maximum measurements of the six criterion pollutants. In Section 4, a discussion of the results, as well as some additional comments are given. Finally, in Section 5 we conclude. An Appendix placed after the list of references, presents the one-dimensional analysis of the monthly maxima for all pollutants and regions considered in this work, some tables and figures mentioned in the main text, the explicit form of the likelihood functions, and some of the computational details.

2. The Mathematical and Bayesian Models

Let d1 and N1 be natural numbers representing, respectively, the number of pollutants considered in the analysis and the number of observations of each one of them during the observational period [ 0,T ] , T>0 . Denote by Z t ( i ) the measurement of the ith pollutant at time t; i=1,2,,d ; t=1,2,,T . Let Z ( i ) ={ Z t ( i ) :t=1,2,,T } be the process recording the ith pollutant concentration values throughout the observational period, i=1,2,,d . We assume that the ith pollutant has, respectively, distribution and density functions F i ( ) and f i ( ) ; i=1,2,,d . Denote by F ij ( , ) and f ij ( , ) the joint distribution and density functions of the ith and jth pollutants, respectively, i,j=1,2,,d ; ij . We assume that F ij ( , ) is given by a copula function denoted by C θ ( , ) with F i ( ) and F j ( ) the marginal distributions, i.e., F ij ( , )= C θ ( F i ( ), F j ( ) ) ; ij ; i,j=1,2,,d . Due to the nature of the correlation between pairs of measurements, in the present case we assume a Gumbel-Hougaard ([35]-[38]) copula for the pairs of distribution functions. Hence, we take

C θ ( u,v )=exp( [ ( logu ) θ + ( logv ) θ ] 1/θ ), (1)

where θ[ 1, ) is the association parameter. (Note that when θ=1 we have independence of the two random variables whose joint distribution we are trying to model).

Remark. Even though there are many different types of copula, we have decided to use the Gumbel-Hougaard because this copula is suitable when we have a positive sample correlation between the two sets of measurements we are studying. This is the case here (see Figure A1 in Appendix A).

When F ij ( , ) is given by (1), the corresponding joint density function f ij ( , ) is given by

f ij ( x,y )= f i ( x ) f j ( y ) c θ ( F i ( x ), F j ( y ) ), (2)

where

c θ ( u,v )= ( logu ) θ1 ( logv ) θ1 uv exp( [ ( logu ) θ + ( logv ) θ ] 1 θ ) ×( [ ( logu ) θ + ( logv ) θ ] 2 θ 2 +( θ1 ) [ ( logu ) θ + ( logv ) θ ] 1 θ 2 ).

In the present study we are using the monthly maximum measurements, hence two well known extreme values distributions will be considered when using a copula in order to obtain the joint distributions of pairs of maxima. These distributions are the Fréchet (α, σ) (F) and the Weibull (α, σ) (W) which are given as follows ([39] [40]). Random variables X and Y are said to have Fréchet (α, σ) and Weibull (α, σ) distributions, respectively, if their distribution and density functions are of the following forms,

F X ( F ) ( x )=exp[ ( x σ ) α ]and f X ( F ) ( x )= α σ ( x σ ) α1 exp[ ( x σ ) α ], (3)

and

F Y ( W ) ( y )=1exp[ ( y σ ) α ]and f Y ( W ) ( y )= α σ ( y σ ) α1 exp[ ( y σ ) α ], (4)

with x,y,α,σ>0 . Taking b= ( 1/σ ) α we may rewrite (4) in the equivalent form,

F Y ( W ) ( y )=1exp[ b y α ]and f Y ( W ) ( y )=bα y α1 exp[ b y α ]. (5)

The latter will be considered in the computer codes used to estimate the parameters of the models. In both distributions, the vector of parameters to be estimated is φ=( α,σ ) .

Denote by ( Z t ( i ) , Z t ( j ) ) the random vector recording the joint pollutants concentrations, i,j=1,2,,d ; ij ; t=1,2,,T . In the present work, ( Z t ( i ) , Z t ( j ) ) will record the monthly maximum concentrations of pollutants i and j. Let ( α i , σ i ) and ( α j , σ j ) denote the vector of parameters of the distribution functions F i ( ) and F j ( ) , respectively. We also assume that F ij ( , ) is given by the copula (1) with f ij ( , ) the associated density function. Hence, the vector of parameters to be estimated is φ=( α i , σ i , α j , σ j ,θ ) .

Estimation of the parameters will be performed under the Bayesian point of view ([41]) using information provided by the so-called posterior distribution of the vector of parameters. If ϕ is the vector of parameters of a model describing a data set D , then the posterior distribution of ϕ , denoted by P( ϕ|D ) is such that P( ϕ|D )L( D|ϕ )P( ϕ ) , where L( D|ϕ ) and P( ϕ ) are, respectively, the likelihood function of the model and the so-called prior distribution of ϕ . Estimation of the parameters is made easier by using the software OpenBugs ([42] [43]) since we only need to specify the likelihood function of the model and the prior distributions of the parameters.

In the present case the observed data are given by D ( i,j ) ={ ( z t ( i ) , z t ( j ) ),t=1,2,,T } , i,j=1,2,,d ; ij . The prior distributions of the parameters will be specified when the model is applied to the Mexico City data. In all cases we will assume prior independence of the parameters. The likelihood function when the ith and jth pollutants are taken into account is L( D ( i,j ) |φ )= t=1 T f ij ( z t ( i ) , z t ( j ) ), with f ij ( , ) the appropriate joint density function given by the copula. The particular forms of the bivariate density functions obtained using the copula are given in Appendix B.

Several versions of the model will be considered. Hence, several criteria will be used to aid in the selection of the best model to describe the data. One of them is the deviance information criterion (DIC) ([44]) and another is the Bayes factor ([45]) through the marginal likelihood function (MLF). They are described as follows. The deviance information criterion is given by DIC=D( ϕ ¯ )+2 p D , with D( ϕ ¯ )=2log[ L( D| ϕ ¯ ) ]+C the deviance evaluated at the posterior mean ϕ ¯ of the parameter ϕ , C a constant, and p D the effective number of parameters in the model which is given by p D = D ¯ ( ϕ )D( ϕ ¯ ) , with D ¯ ( ϕ )=E[ D( ϕ ) ] the posterior mean deviance. The smaller the value of the DIC the better suited is the model to describe a data set D . The Bayes discrimination method may be described as follows. From [45] the marginal likelihood function of a data set D for Model l , l=1,2,,J , is given by V l = L( D| ϕ ( l ) )P( ϕ ( l ) )d ϕ ( l ) , where ϕ [ l ] is the vector of parameters for Model l and P( θ [ l ] ) is the joint prior distribution of ϕ ( l ) . The Bayes discrimination method prefers model i to model j if V j / V i <1 .

Remark. For completeness, in Appendix we also present an individual analysis of each criterion pollutant considered here. In this case only the univariate distributions are taken into account. Hence, in the univariate analysis we will be modeling one set of data at time. Hence, in this case the process is Z ( i ) , i=1,2,,d , and the vector of parameters to be estimated is just φ=( α,σ ) , and the observed data are denoted by the set D ( i ) ={ z t ( i ) t=1,2,,T } . In addition to the selection criteria described above, in the one-dimensional case we will also use the graphical fit between estimated and empirical density functions associated with a data set, as well as the p-value of the Kolmogorov-Smirnov test. The likelihood function of the model when we are using the ith pollutant data is given by L( D ( i ) |φ )= t=1 T f i ( z t ( i ) ) , where φ is the vector of parameters of either the Fréchet or the Weibull distribution with f i ( ) the corresponding density function and D ( i ) is the set of the monthly maxima of the pollutant taken into account.

3. Application to Mexico City Data

Application will be made to measurements of so-called criterion pollutants collected at Mexico City monitoring network (http://www.aire.cdmx.gob.mx/default.php?opc=%27aKBh%27). This network comprises several monitoring stations placed throughout the metropolitan area. Measurements in each monitoring station are obtained minute by minute and the averaged hourly result is reported at each station. The data actually used in the application are the monthly maximum measurements. Depending on the pollutant we have an observational period. Hence, in the cases of ozone, NO2, SO2, and CO the data used were collected from January 1990 to December 2021 giving a total of T=384 monthly maxima; from January 1995 to December 2021 in the case of PM10, giving a total of T=324 measurements; and from August 2003 to December 2021 for the PM2.5 data with a total of T=221 values. When pollutants with different observational periods are used in the bivariate study, the observational period taken into account is the intersection of the observational periods of the pollutants involved. Whenever necessary we will associate the pollutants O3, NO2, SO2, CO, PM10, and PM2.5 with the integer numbers 1, 2, 3, 4, 5, and 6, respectively.

3.1. Data Analysis

Since the metropolitan area of Mexico City is divided into five regions: northwest (NW), northeast (NE), centre (CE), southwest (SW), and southeast (SE), we will analyze data from each region separately. The monthly maximum measurements in a given region in a given month are the maximum over the values reported in all stations located at that region during that month. In Table 1 we have, for all regions, the means and standard deviations (indicated by SD) of the monthly maxima of the pollutants considered in the present study.

Table 1. Means and standard deviations (indicated by SD) of the monthly maxima for all pollutants and regions in ppm in the cases of O3, NO2, SO2, and CO, and in μg/m3 in the cases of PM10 and PM2.5.


NW

NE

CE

SE

SW

Mean

SD

Mean

SD

Mean

SD

Mean

SD

Mean

SD

O3

0.187

0.0641

0.153

0.038

0.185

0.06

0.176

0.049

0.205

0.069

NO2

0.141

0.0696

0.112

0.432

0.142

0.058

0.127

0.051

0.121

0.055

SO2

0.186

0.824

0.178

0.871

0.117

0.062

0.083

0.046

0.086

0.045

CO

9.857

7.947

9.238

5.907

9.217

6.432

6.987

4.321

7.434

5.138

PM10

295.272

150.856

447.343

242.234

265.35

177.69

318.988

189.788

211.333

113.746

PM2.5

106.42

61.336

159.104

120.743

105.792

60.838

108.729

52.912

98.9

51.339

Looking at Table 1, we see that the highest average of the monthly ozone maxima is achieved in region SW. The largest NO2 monthly maxima occurs in region CE. In the cases of SO2 and CO the largest average values are in region NW, and the highest values for PM10 and PM2.5 occur in region NE. The highest value of the ozone monthly maximum average in region SW may be explained by the fact that some of the ozone precursors are produced in regions NE and CE (see values of the averages of the monthly maxima in the cases of NO2, SO2, and CO observed in those regions) and are transported to region SW by the predominant wind direction which is from NE to SW. Those precursors form ozone during their atmospheric travels and ozone stays in region SW trapped by the surrounding mountains which occupy the southwest part of that region. In the case of SO2 and CO, the high values of the means of the monthly maxima in region NW could be the product of the heavy truck traffic in that region since it is one of the gateways to the northern part of Mexico. The high level of carbon monoxide may also be explained by the high number of cars circulating in that area in addition to the truck volume. Note that in region NE there is a large number of factories, hence, this may affect the concentration values of particulate matter, in particular, PM10 and PM2.5. If we look at the values associated with region CE we also see high average values for O3, CO, and PM2.5. We would also like to call attention to the fact that in that region, there is a large amount of cars and busses circulating. Therefore, we are bound to have higher mean values for the monthly maxima of those pollutants.

3.2. Results

Several combinations of the two extreme value distributions considered here are assumed in the copula (1). Due to computational issues when using the ozone, NO2, and SO2 data, the parts per billion unit of measure is used when running the OpenBugs software in the estimation of the parameters present in the models.

Looking at Table 1 we see that the average monthly maximum value of ozone is slightly above the double of the Mexican ozone standard (0.095 ppm [46]) valid at the end of the observational period. Additionally, under the new rule to assign the hourly air quality indices in Mexico City, the pollutant ozone contributes a significant amount of times in the assignation of these air quality indices [19] [47]. Hence, it would be interesting to know how the other criterion pollutants associate with ozone. In order to do that we consider the copula given by (1) and study the pairs of pollutants where ozone is fixed and we vary the other pollutant through the set of the remaining criterion pollutants. Hence, in the present case we have the following pairs (NO2, O3), (SO2, O3), (CO, O3), (PM10, O3), and (PM2.5, O3). In Figure A1 given in Appendix A, we have the plots of each of these pairs of measurements against each other for each region and pair. Looking at Figure A1, we see that for all pairs of pollutants considered here we have a positive correlation (some on a small scale and others in a clearer way). Hence, the Gumbel-Hougaard copula is suitable for analyzing the pairs of pollutants taken into account in this study.

We consider the cases where we have both pollutants with either the Fréchet (denoted by F-F) or the Weibull (denoted by W-W) distribution, and the case where O3 has Fréchet distribution and the other pollutant has a Weibull (W-F). In some cases, i.e., PM2.5 and regions CE, NW, and SW, we also consider the case where ozone has a Weibull distribution and PM2.5 has a Fréchet (F-W). Selection of the models that best fit the data will be performed using the DIC and the MLF, and also the results of the univariate model (given in Appendix C), i.e., if the selected density function in the univariate model we have that ozone has a Fréchet distribution and, for instance, PM2.5 has a Weibull, we consider the pair W-F as one possible model. If a tie occurs, i.e., each criterion chooses a different combination of the marginal distributions, then the tie break is performed by selecting the model suggested by the MLF criterion.

The prior distributions of the parameters, burn-in periods, sampling gap, and sample sizes are given in Appendix D. Table 2 gives the values of the DIC and marginal likelihood functions (MLF) in all cases.

Looking at Table 2 we see that if the DIC is used to choose the model, then the selected models are those that assume a Weibull distributions for both pollutants in all pairs of pollutants and all regions. When we use the MLF criterion to select the model we have that, in most of the cases, the model assuming Fréchet distributions for both pollutants is chosen. The exceptions are the pair (SO2, O3) in all regions and (CO, O3) in regions CE and SW where the W-F model assuming a Weibull distribution for SO2 and CO, and a Fréchet for ozone is selected.

If we take into account the corresponding selected density function for each pollutant in the unidimensional analysis (see Appendix C), we have a more heterogeneous assignation. Hence, we have a F-F model in the cases of (NO2, O3) and regions NW, SE, and SW; (SO2, O3) in region NE; (CO, O3) in region SE; (PM10, O3) in all regions with the exception of region NE; and (PM2.5, O3) and regions SE and SW. In the remaining cases the model assuming a Fréchet for ozone and Weibull for the other pollutant is the selected model.

Table 2. Values of the DIC and marginal likelihood functions (MLF) for all pairs of pollutants, regions, and pairs of distribution functions considered in the present study.

Using the tie break criterion in the cases where a tie has occurred, we have that the selected models are F-F for the pairs (NO2, O3), (PM10, O3), and (PM2.5, O3) for all regions and the pair (CO, O3) in all regions with the exception of regions CE and SW. In the remaining cases the selected model is the one assuming a Fréchet distribution for ozone and a Weibull distribution for the other pollutant.

In Table 3 we report the estimated parameters in all the selected models. (Note that, even though we use the same notation α and σ for the parameters, in some cases they correspond to the Weibull distribution and in others to the Fréchet).

4. Discussion and Comments

In this work we have used a copula model to study the joint behavior of the monthly maximum measurements of pairs of pollutants in terms of their joint density functions. Two distribution functions are assumed as possible marginal distributions to be considered in the copula. They are the Fréchet and the Weibull distributions. We consider a two-dimensional model in order to establish the association between the criterion pollutants and ozone. Results have shown that the selected distributions depend on the pollutants, as well as the region in the Mexico City metropolitan area where they are measured.

When we look at Table 3 we see that of the association parameters θ present in the selected models, higher values are given by the pair (CO, O3) in all regions. The highest is in region NW, followed by regions CE, SW, SE, and NE. Recall that region NW is a region with an extremely high number of cars, buses, and trucks circulation, hence, the CO levels are bound to be high and that has effects on the ozone concentration. Smaller values of the association parameters are found mostly when the pair (PM2.5, O3) is taken into account with the lowest value occurring when data from region CE is considered followed closely by the value in region SW. Note that region SW is the one with the lowest PM2.5 monthly maximum mean. Hence, it seems the concentration present in that region is not high enough to produce any effect on the ozone monthly maxima. The second lowest PM2.5 monthly maximum average occurs in region CE.

Another information the model provides is that we may obtain, from the association parameter θ present in the copula, the Spearman’s ρ correlation between each element of the pair of pollutants using either the formula ([48] [49]) ρ= 12 θ 0 1 [ t( 1t ) ] 1 θ 1 [ 1+ t 1 θ + ( 1t ) 1 θ ] 1 θ dt3, or integrating directly using the copula function, i.e., using (see [32]) ρ=12 0 1 0 1 C( u,v )dudv3

Table 3. Estimated means, standard deviations (indicated by SD), and Monte Carlo errors (indicate by MC Error) of the parameters in all selected two-dimensional models for all regions and pairs of pollutants considered in the analysis.

where, in the present case, C( , ) is the Gumbel-Hougaard copula given by (1). The values of ρ, for the selected models, obtained using the software MATHEMATICA by Wolfram, are given in Table 4 together with the values of the sample correlation for each pair of pollutants and all regions. (To make it easier to compare with the association parameters θ we repeat their values in the cases of the selected models).

Table 4. Values of the association parameters θ, the Spearman’s ρ obtained from them and the sample correlations σ ^ i1 , i=2,3,4,5,6 , for all pairs of pollutants considered and all regions and selected models.



NW

NE

CE

SE

SW

(NO2, O3)

θ

1.685

1.7

1.884

1.849

1.726

ρ

0.571

0.577

0.647

0.635

0.588

σ ^ 21

0.561

0.541

0.64

0.641

0.572

(SO2, O3)

θ

1.126

1.257

1.352

1.243

1.26

ρ

0.166

0.2997

0.378

0.287

0.302

σ ^ 31

0.241

0.351

0.411

0.395

0.385

(CO, O3)

θ

2.865

2.142

2.582

2.259

2.274

ρ

0.835

0.719

0.8

0.745

0.784

σ ^ 41

0.8598

0.72

0.829

0.768

0.825

(PM10, O3)

θ

1.299

1.608

1.49

1.567

1.413

ρ

0.336

0.535

0.471

0.514

0.422

σ ^ 51

0.31

0.52

0.457

0.538

0.413

(PM2.5, O3)

θ

1.228

1.363

1.052

1.153

1.059

ρ

0.273

0.386

0.074

0.201

0.083

σ ^ 61

0.318

0.419

0.061

0.217

0.054

Looking at Table 4 we see that, in general, the association parameters provide a good insight on the behavior of the correlation between pairs of pollutants. For instance, in the case of (CO, O3) which has higher association parameters, also has higher sample correlations (higher than 0.7 in all regions) and higher Spearman’s ρ.

We would also like to call attention to the fact that in the present case not always the most suitable two-dimensional model using copula corresponds to the case where the marginal distributions are the ones selected in the unidimensional analysis. For example, in the case of CO and O3 and region NE, we have that the respective one-dimensional distributions are Weibull and Fréchet (see Appendix C). However, when the two-dimensional model is considered we have that the model assuming Fréchet distribution for both pollutants is the selected model for that region. That could be explained by the fact that the Fréchet distribution also represents well the behavior of CO monthly maximum values in region NE with larger values for the density function when compared to those given by the Weibull distribution.

In order to see how the results given in the one-dimensional case presented in Appendix C may be used, we could, for instance, consider the estimated distributions to generate future values. One way of doing so is for each future month we generate a sample and use the mean of this sample to represent the future monthly maxima. We may also use the theoretical means to obtain information on what to expect in the future. Hence, in Table 5, we present the theoretical estimated means, medians and mode ([50]) of the selected distributions for each dataset and the observed means of the monthly maxima from January 2022 to April 2023, whose values were not used in the estimation of the parameters. Looking at Table 5, we have that in all cases the estimated mode is a more suitable theoretical estimator for the observed future mean values since their values are very close. Therefore, even though the estimated parameters of the selected models are such that the means of the corresponding one dimensional distributions, most of the time, produce an overestimation of future values, we may see that the modes may perform this estimation to a good degree.

Table 5. Theoretical estimated means, medians, and modes of the selected models, as well as the observed means of the monthly maximum measurements for the period ranging from January 2022 to April 2023.



NW

NE

CE

SE

SW

O3

mean

0.193

0.156

0.189

0.178

0.21

median

0.167

0.143

0.167

0.162

0.184

mode

0.14

0.127

0.142

0.143

0.156

obs_mean

0.129

0.124

0.139

0.133

0.17

NO2

mean

0.149

0.111

0.142

0.138

0.133

median

0.119

0.109

0.138

0.112

0.104

mode

0.091

0.104

0.131

0.087

0.079

obs_mean

0.087

0.078

0.094

0.08

0.068

SO2

mean

0.186

0.178

0.117

0.083

0.086

median

0.18

0.169

0.11

0.077

0.081

mode

0.166

0.15

0.094

0.063

0.068

obs_mean

0.072

0.094

0.041

0.28

0.031

CO

mean

9.94

9.29

9.26

8.89

7.45

median

8.3

8.33

8.05

5.38

6.46

mode

4.11

5.98

4.99

3.4

3.96

obs_mean

3.23

3.37

2.47

3.18

1.67

PM10

mean

325

447

289

371

234

median

247

417

204

254

172

mode

183

347

143

173

125

obs_mean

266

304

149

236

141

PM2.5

mean

106

160

106

112

100

median

98

140

97

93

85

mode

78

87

78

75

70

obs_mean

83

102

85

116

91

Selected two-dimensional models provided, in most cases, good approximations, through the Spearman’s ρ, to the sample correlations of the pairs of pollutants (see Table 4). Even in the fewer cases where the approximations are not optimal, their differences are not too large. Hence, they provide information regarding some ozone precursors that have a larger influence on its concentration. For instance, in the case of CO, in all regions, we have large correlations with O3. Therefore, decreasing its emission could result in a decrease in the ozone levels in all regions. On the other hand, the influence of SO2, even though relevant, seems to be not as serious as that of CO in all regions.

Besides estimating future correlations between pairs of pollutants and, hence, how much correlation they might have in the future behavior of the data if no changes occur, we may also obtain the probability that their measurements belong to a given set. For, instance, take the pair (CO, O3) and region SW. Suppose we are interested in knowing the probability of having CO in the interval [11.00; 13.00] which corresponds to having bad air quality in the region according to the “Air and Health” quality index ([47]) and of having O3 in the interval [0.154; 0.204] which corresponds to the interval of Phase II of environmental emergency in Mexico City ([51]). Therefore, we need to obtain,

P( ( CO, O 3 )[ 11.00;13.00 ]×[ 0.154;0.204 ] )=1.99E2.

This is the result of the following properties. For a 1 < b 1 y a 2 < b 2 , real numbers and F( , ) the joint distribution of the two random variables X and Y, we have,

P( a 1 <X b 1 , a 2 <Y b 2 )=F( b 1 , b 2 )F( a 1 , b 2 )F( b 1 , a 2 )+F( a 1 , a 2 ).

Hence, if the joint distribution is given by a copula, as it is in our case, we may write, P( ( X,Y )[ a 1 , b 1 )×[ a 2 , b 2 ] ) = C θ ( F X ( b 1 ), F Y ( b 2 ) ) C θ ( F X ( a 1 ), F Y ( b 2 ) ) C θ ( F X ( b 1 ), F Y ( a 2 ) )+ C θ ( F X ( a 1 ), F Y ( a 2 ) ), where F X ( ) and F Y ( ) are the appropriate marginal distribution functions. In the present example we have that F X ( ) and F Y ( ) are either Fréchet or Weibull.

5. Conclusions

In the present work we have studied the behavior of five pairs of pollutants obtained from the Mexico City monitoring network. The data modeled were the paired ozone monthly maximum measurements and the monthly maxima of four other pollutants part of the so-called criterion pollutants which are used to classify the air quality of the Mexico City metropolitan area. In order to study the joint behavior of these pairs of pollutants a two-dimensional distribution was constructed using a copula. Additionally, analysis was performed using the data collected in the different areas of Mexico City. Results show that depending on the region and pair of pollutants, different marginal distributions should be used.

We may see by the results that the estimated distributions, as well as the estimated association parameters are good approximations to the empirical/observed behavior of the data. Given this good approximation, the probabilities of interest may be approximated by a good degree. Hence, obtaining the odds of having exceedances of certain environmental thresholds (such as those used to declare environmental alerts in the city) and therefore, the possibility of an alert may be made with good accuracy.

The model used here could be considered as a first exploratory action in order to evaluate the air quality in future days and hence, evaluate the measures that could be taken in order to possibly avoid environmental emergencies, as well as inform the population of the risks.

Acknowledgments

The authors thank an anonymous reviewer for the comments that helped to improve the presentation of the results in this work. This work is part of JAVM Ph.D. Thesis developed at the Benemérita Universidad Autónoma de Puebla, Puebla, México. The authors thank CONAHCyT-Mexico.

Appendix

A. Plots of the pairs of measurements considered

In this appendix we present the plots of the monthly maxima of five of the criterion pollutants against the ozone monthly maxima, i.e., the CO, NO2, SO2, PM10, and PM2.5 monthly maxima plotted against the ozone’s.

B. The particular forms of the likelihood functions

In this appendix we present the expressions for the likelihood functions in the cases where both F i ( ) and F j ( ) are Fréchet (α, σ) distributions; both are Weibull (α, σ) distributions; and the case where F i ( ) is Weibull (α, σ) and F j ( ) is Fréchet (α, σ).

Both F i ( ) and F j ( ) are Fréchet (α, σ) distributions

In the case where F i ( ) and F j ( ) are both Fréchet (α, σ) distributions, let ( α i , σ i ) and ( α j , σ j ) be their respective parameters. Their expressions are given by (3). Since the joint distribution function is given by the copula (1), the joint density given by (2) is

f ij ( x,y )=( α i α j σ i σ j ) ( x σ i ) α i 1 ( y σ j ) α j 1 exp( [ ( x σ i ) α i θ + ( y σ j ) α j θ ] 1 θ ) { [ ( x σ i ) α i θ + ( y σ j ) α j θ ] 2 θ 2 +( θ1 ) [ ( x σ i ) α i θ + ( y σ j ) α j θ ] 1 θ 2 }.

Both F i ( ) and F j ( ) are Weibull (α, b) distributions

In the case where F i ( ) and F j ( ) are both Weibull (α, b) distributions, let ( α i , b i ) and ( α j , b j ) be their respective parameters. Their expressions are given by (5). Since the joint distribution function is given by the copula (1), the joint density given by (2) is

f ij ( x,y )=( α i α j )( b i b j )( x α i 1 y α j 1 )exp( b i x α i b j y α j ) ( log[ 1 e b i x α i ] ) θ1 ( log[ 1 e b j y α j ] ) θ1 [ 1 e b i x α i ][ 1 e b j y α j ] exp( [ ( log[ 1 e b i x α i ] ) θ + ( log[ 1 e b j y α j ] ) θ ] 1 θ ) { [ ( log[ 1 e b i x α i ] ) θ + ( log[ 1 e b j y α j ] ) θ ] 2 θ 2 +( θ1 ) [ ( log[ 1 e b i x α i ] ) θ + ( log[ 1 e b j y α j ] ) θ ] 1 θ 2 },

Figure A1. Monthly maximum measurements of the criterion pollutants CO, NO2, SO2, PM10, and PM2.5 (from left to right) plotted against the ozone monthly maximum measurements for regions NW (from (a) to (e)), NE (from (f) to (j)), CE (from (k) to (o)), SE (from (p) to (t)), and SW (from (u) to (y)).

with b k = σ k 1/ α k , k=i,j .

F i ( ) is a Weibull ( α i , b i ) distribution and F j ( ) is a Fréchet ( α j , σ j )

In the case where F i ( ) is a Weibull (α, b) distribution and F j ( ) is a Fréchet (α, σ), let ( α i , b i ) and ( α j , σ j ) be their respective parameters. Their expressions are given by (5) and (3), respectively. Since the joint distribution function is given by the copula (1), the joint density given by (2) is

f ij ( x,y )=( b i α i α j σ j α j θ )( x α i 1 y α j θ1 )exp( b i x α i ) ( log[ 1 e b i x α i ] ) θ1 1 e b i x α i exp( [ ( log[ 1 e b i x α i ] ) θ + ( y σ j ) α j θ ] 1 θ )

{ [ ( log[ 1 e b i x α i ] ) θ + ( y σ j ) α j θ ] 2 θ 2 +( θ1 ) [ ( log[ 1 e b i x α i ] ) θ + ( y σ j ) α j θ ] 1 θ 2 },

with b k = σ k 1/ α k , k=i,j .

C. One-dimensional model

In this appendix we present the models, computational details, results, as well as the observed (empirical) and estimated density functions for all pollutants and regions in the one-dimensional models. In all cases, both Fréchet (α, σ) and Weibull (α, σ) distributions were assumed to describe the behavior of the monthly maximum measurements. Once the parameters have been estimated the graphical representation of the density functions, DIC, MLF, and p-value were used to select the distribution that best fit the observed data.

The prior distributions of the parameters of the models are given as follows. In the cases of both Fréchet and Weibull distributions, the parameter α has as a prior distribution a uniform U(0, 10) in all cases. When we consider the parameter b in the Weibull distribution, it has as a prior distribution a U(0, 10) in all cases. If we take into account the parameter σ of the Fréchet distribution, the values of the hyperparameters of the prior distribution are more heterogeneous. Hence, when we have CO, in all regions, the parameters σ will have as prior distribution a uniform U(0, 10). It has a uniform U(0, 100) in the cases of SO2 and regions CE, SE, and SW; and of PM2.5 in all regions with the exception of region NE. The uniform prior distribution U(0, 150) is used in the cases of NO2 in all regions; and SO2 and PM2.5 in region NE. In the case of ozone in all regions the prior distribution of σ is a U(0, 200) which is the same distribution used in the cases of SO2 in region NW and PM10 in regions CE and SW. If we take into account PM10 in regions NW and SE the prior distribution assigned to σ is a U(0, 250) and it is a U(0, 350) in region NE.

Estimation of the parameters was performed in most of the cases by samples of size 8000 collected after a burn-in period of 20,000 iterations. The exceptions are when the Weibull distribution was used to study ozone in regions NW and SW; NO2 in regions CE, NE, and SE; SO2 in regions NE and NW; PM10 in region NW, where we have burn-in periods of 25,000 steps and samples of 17,500 values; Weibull distribution and NO2 in region SW and CO in region NE where the sample sizes are equal to 7000; and ozone in regions CE with sample size 16,000 and in regions NE and SE with sample size 25,000.

Table A1 gives the values of the DIC, MLF, and the p-value in all cases.

Looking at Table A1 we see that when using the DIC to select the models in all cases the Weibull distribution is chosen. If we consider the MLF criterion, then the Fréchet model is selected in all cases with the exception of SO2 in all

Table A1. Values of the DIC, marginal likelihood functions (ML), and p-values for all pollutants, regions and models.

regions; CO in regions NE, CE, and SW; and PM10 in region NE where the Weibull distribution is the selected one. If the p-value criterion is used to select the model, in the majority of the selected models, the Fréchet distribution as the most suitable. The exceptions are SO2 in all regions; NO2 in region CE; CO in regions CE and SW; and PM10 in region NE.

In Figures A2-A6, we have the plots of the estimated and empirical density functions for each region, pollutant, and extreme value distribution in the univariate analysis. In these figures continuous and dashed lines represent the estimated Fréchet and Weibull distribution, respectively, and the blocks are the observed (empirical) density functions. In all figures we have that the results related to pollutant CO are given in panel (a), those related to NO2 are in panel (b), the SO2 is given in panel (c), in panel (d) we have those related to O3, and in panel (e) and (f) we have the results related to the pollutants PM10 and PM2.5, respectively. Figure A2 and Figure A3 are the plots related to regions NW and NE, respectively, in Figure A4 are the plots when data from region CE are used, and in Figure A5 and Figure A6 are the plots related to data from regions SE and SW, respectively. In what follows we present and describe the contents of each figure.

Figure A2 conveys the results when data from the region northwest are used.

Looking at Figure A2 we may see that depending on the pollutant sometimes a Weibull density function fits better the observed data (for instance, in the cases of CO - panel (a), SO2 - panel (c), and PM2.5 - panel (f)), and in other occasions the Fréchet is the most adequate (for instance, in the case of ozone - panel (d) and NO2 - panel (b)).

Figure A3 presents the results in the case of region NE. The disposition of the panels follows that used in Figure A2.

When we look at Figure A3 displaying the case of region NE, we see that the Weiibull density function adjusts better in the case CO and NO2 - panels (a) and (b), respectively, and PM2.5 - panel (f), respectively. In some other cases, the better fit is given by the Fréchet density function as in the case of PM10 - panel (e).

Next, in Figure A4, we give the plots related to the data collected from region CE. As before the disposition of the panels follows that used in Figure A2.

Looking at Figure A4 we see that, as in the previous figures, depending on the pollutant, different density functions may fit better the data. In the present figure we see that the Weibull density function fits well in most of the cases, we may also see that in the case of ozone - panel (d), the Fréchet density function also gives a good fit as it does in the case of PM10 - panel (e).

Figure A5 gives the adjustment of the estimated density function to the observed in the case of region SE.

In this figure behavior similar to those described in Figures A2-A4 may also be detected in Figure A5 with the difference that in the case of ozone - panel (d) it seems that the Weibull density function does not provide a better fit than those presented in the previous figures. However, the overall adjustment is better when we consider this density function.

Figure A2. Observed (blocks) and estimated density functions when the Fréchet (continuous lines) and Weibull (dashed lines) distributions are taken into account and pollutants CO (panel (a)), NO2 (panel (b)), SO2 (panel (c)), O3 (panel (d)), PM10 (panel (e)), and PM2.5 (panel (f)) from region NW are used.

In Figure A6 we have the plots when data from region SW are used.

It is possible to see by looking at Figure A6 that in the case of region SW, both estimated density functions provide good fits with the Weibull in practically all case (with the exception of ozone - panel (d)) given better adjustment to the observed density function.

When the information provided by the graphical fit is taken into account (Figures A2-A6), we have that the Weibull distribution is selected in the cases of CO and regions NW, NE, and CE; NO2 and regions NE and CE; SO2 in all regions with the exception of region NE where the Fréchet distribution is selected; and PM2.5 and regions NW, NE, and CE. In all the remaining cases the Fréchet distribution is selected.

Figure A3. Observed (blocks) and estimated density functions when the Fréchet (continuous lines) and Weibull (dashed lines) distributions are taken into account and pollutants CO (panel (a)), NO2 (panel (b)), SO2 (panel (c)), O3 (panel (d)), PM10 (panel (e)), and PM2.5 (panel (f)) from region NE are used.

It is possible to see that in some cases we have two selection criteria for choosing a given extreme value distribution whereas the other two select another. In those cases we use the graphical fit as a tie break. We have decided to use the graphical fit, because when quantities of interest (such as mean and variance) are needed, if the estimated density function represents well the observed, then these quantities are well specified. Therefore, the chosen density in the univariate model is the Fréchet in the cases of ozone in all regions; NO2 in regions NW, SE, and SW; CO in region SE; in all regions, with the exception of region NE, in the case of PM10; and when we consider PM2.5 and regions SE and SW. In the remaining cases the selected distribution is the Weibull. Table A2 gives the estimated parameters in all the selected models. (Note that the parameters α and σ in some cases are related to the Fréchet distribution and in others related to the Weibull).

Figure A4. Observed (blocks) and estimated density functions when the Fréchet (continuous lines) and Weibull (dashed lines) distributions are taken into account and pollutants CO (panel (a)), NO2 (panel (b)), SO2 (panel (c)), O3 (panel (d)), PM10 (panel (e)), and PM2.5 (panel (f)) from region CE are used.

D. Computational details

In this section we present the prior distributions used in each version of the two-dimensional model, as well as information related to the burn-in periods, sampling gaps, and sample sizes. In some cases we have run five chains and in others, due to slow convergence, we have run only three chains. In all cases the sampling gap is 50.

In the bivariate case the parameter θ has a uniform U(1, 15) prior distribution in all cases and the prior distributions of α, σ and b are as in the univariate version.

Figure A5. Observed (blocks) and estimated density functions when the Fréchet (continuous lines) and Weibull (dashed lines) distributions are taken into account and pollutants CO (panel (a)), NO2 (panel (b)), SO2 (panel (c)), O3 (panel (d)), PM10 (panel (e)), and PM2.5 (panel (f)) from region SE are used.

Depending on the pair of pollutants, model considered, and region, different burn-in periods, number of chains run, and sample sizes were used. Hence, we have a burn-in period of 20,000 steps and samples of size 8000 drawn from five chains in the case of model F-F in all regions for all pairs of pollutants with the exception of the pair (PM10, O3) in region SE where we have a burn-in period of 30,000 steps with a sample of size 7000. We also have a burn-in period of 20,000 iterations with a sample of size 8000 drawn from five chains in the case of model W-F and the pair (CO, O3) in all regions. A burn-in of 20,000 steps and a sample of size 4800 obtained from three chains, were also used in the cases of model W-F and the pair (NO2, O3) in all regions with the exception of region NE and

Figure A6. Observed (blocks) and estimated density functions when the Fréchet (continuous lines) and Weibull (dashed lines) distributions are taken into account and pollutants CO (panel (a)), NO2 (panel (b)), SO2 (panel (c)), O3 (panel (d)), PM10 (panel (e)), and PM2.5 (panel (f)) from region SW are used.

NW where we have a burn-in period of 40,000 with sample size 3600; the pair (PM2.5, O3) in regions CE, NE, and SE; and (PM10, O3) in regions NW and SW. These same burn-in period and sample size were also used in the case of model W-W and the pair (CO, O3) in regions CE, SW, and SE; and (SO2, O3) in region SW.

Five chains were also used when considering the model W-F for some of the pairs of pollutants in some of the regions. However, the burn-in periods and sample sizes were more heterogeneous. Hence, we have that for the pairs (SO2, O3) in regions SE and CE; (PM2.5, O3) in region SW; and (PM10, O3) in region SW, we have a burn-in period of 25,000 iterations and a sample of size 17,500.

Table A2. Estimated means, standard deviations (indicated by SD), and Monte Carlo errors (indicated by MC Error) of the parameters in the selected one-dimensional models for all regions and pollutants considered in the analysis.

In the remaining cases we have run only three chains due to computational constraints. Different burn-in periods and sample sizes were also present in this case. Hence, a burn-in period of 50,000 iterations and a sample size of 9000 were used in the cases of model W-W and (CO, O3) in regions NE, NW, and SE; (PM10, O3) in region NE; (PM2.5, O3) in regions NW and CE; and (NO2, O3) in region CE. The same burn-in period and sample size were also used in the case of model F-W with the pair (PM2.5, O3) in regions NW and CE. Burn-in periods of 40,000 steps with samples with 3600 values were used in the cases of model W-W and (SO2, O3) in all regions with the exception of region SW; (PM10, O3) in regions SE, CE, and SE; and (NO2, O3) in region NW. These values were also used in the case of model W-F and (PM2.5, O3) in region SE. When we consider model W-W and the pairs (NO2, O3) and (PM10, O3) in regions NE and NW, respectively, and model W-F using the pair (SO2, O3) in regions NE and NW; (PM10, O3) and (PM2.5, O3) in regions NE and NW, respectively, the burn-in period was of 30,000 iterations and the sample size was 4200.

In the remaining cases we have model W-W applied to the pair (PM2.5, O3) in regions SE and NE where we have burn-in periods of 60,000 and 100,000 steps, respectively, with corresponding sample sizes 8400 and 12,000; and also the pair (NO2, O3) in region SW with a burn-in period of 35,000 steps with a sample size 3900. Finally, we have model F-W applied to the pair (PM2.5, O3) in region SW with a burn-in period of 60,000 iterations and a sample of size 8400.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] World Health Organization (2022) Ambient (Outdoor) Air Pollution. Facts Sheet.
https://www.who.int/news-room/fact-sheets/detail/ambient-(outdoor)-air-quality-and-health
[2] Dossou-Gbete, S.C.J., Kpadonou, D., Nonvignon, G.M.A., Elegbede, V., Karim, A.Y.A., Saizonou, K.V.M., et al. (2023) Level of Exposure of Populations to Atmospheric Pollution in Southern Benin. Open Journal of Air Pollution, 12, 160-181.
https://doi.org/10.4236/ojap.2023.124010
[3] World Health Organization (2006) Air Quality Guidelines for Particulate Matter, Ozone, Nitrogen Dioxide and Sulfur Dioxide. European Union, World Health Organization Regional Office for Europe.
https://www.who.int/publications/i/item/WHO-SDE-PHE-OEH-06-02
[4] Gauderman, W.J., Avol, E., Gilliland, F., Vora, H., Thomas, D., Berhane, K., et al. (2004) The Effect of Air Pollution on Lung Development from 10 to 18 Years of Age. New England Journal of Medicine, 351, 1057-1067.
https://doi.org/10.1056/nejmoa040610
[5] Bell, M.L. (2004) Ozone and Short-Term Mortality in 95 US Urban Communities, 1987-2000. JAMA, 292, 2372-2378.
https://doi.org/10.1001/jama.292.19.2372
[6] Dockery, D.W., Schwartz, J. and Spengler, J.D. (1992) Air Pollution and Daily Mortality: Associations with Particulates and Acid Aerosols. Environmental Research, 59, 362-373.
https://doi.org/10.1016/s0013-9351(05)80042-8
[7] Cifuentes, L., Borja-Aburto, V.H., Gouveia, N., Thurston, G. and Davis, D.L. (2001) Assessing the Health Benefits of Urban Air Pollution Reductions Associated with Climate Change Mitigation (2000-2020): Santiago, São Paulo, México City, and New York City. Environmental Health Perspectives, 109, 419-425.
https://doi.org/10.1289/ehp.01109s3419
[8] Loomis, D.P., Borja-Aburto, V.H., Bangdiwala, S.I. and Shy, C.M. (1996) Ozone Exposure and Daily Mortality in Mexico City: A Time Series Analysis. Health Effects Institute Research Report, No. 75, 1-46.
[9] Mohnen, V.A. (1988) The Challenge of Acid Rain. Scientific American, 259, 30-38.
https://doi.org/10.1038/scientificamerican0888-30
[10] Likens, G. (2010) Acid Rain. Environmental Protection Agency.
http://www.eoearth.org/article/Acidrain
[11] Ritz, B. and Yu, F. (1999) The Effect of Ambient Carbon Monoxide on Low Birth Weight among Children Born in Southern California between 1989 and 1993. Environmental Health Perspectives, 107, 17-25.
https://doi.org/10.1289/ehp.9910717
[12] Ritz, B., Yu, F., Chapa, G. and Fruin, S. (2000) Effect of Air Pollution on Preterm Birth among Children Born in Southern California between 1989 and 1993. Epidemiology, 11, 502-511.
https://doi.org/10.1097/00001648-200009000-00004
[13] Ritz, B., Wilhelm, M., Hoggatt, K.J. and Ghosh, J.K.C. (2007) Ambient Air Pollution and Preterm Birth in the Environment and Pregnancy Outcomes Study at the University of California, Los Angeles. American Journal of Epidemiology, 166, 1045-1052.
https://doi.org/10.1093/aje/kwm181
[14] Itô, K. and Thurston, G.D. (1996) Daily PM10/Mortality Associations: An Investigation of At-Risk Subpopulations. Journal of Exposure Analysis and Environmental Epidemiology, 6, 79-95.
[15] Mauderly, J.L. (1997) Relevance of Particle-Induced Rat Lung Tumors for Assessing Lung Carcinogenic Hazard and Human Lung Cancer Risk. Environmental Health Perspectives, 105, 1337-1346.
https://doi.org/10.1289/ehp.97105s51337
[16] Mauderly, J.L. and Oberdörster, G. (1997) Current Understanding of the Health Effects of Particles and Characteristics That Determined Dose Effects. Formation and Characterization of Particles. Report of the 1996 Health Effects Institute Workshop, Cambridge, 3-4 December 1996, 1-5.
[17] Thurston, G.D. (1996) A Critical Review of PM10 Mortality Time Series Studies. Journal of Exposure Analysis and Environmental Epidemiology, 6, 3-21.
[18] Gallegos-Herrada, M.A., Rodrigues, E.R., Tarumoto, M.H. and Tzintzun, G. (2023) A Multi-Dimensional Non-Homogeneous Markov Chain of Order K to Jointly Study Multi-Pollutant Exceedances. Environmental and Ecological Statistics, 30, 157-187.
https://doi.org/10.1007/s10651-023-00557-8
[19] Rodrigues, E.R., Cruz-Juárez, J.A., Reyes-Cervantes, H.J. and Tzintzun, G. (2023) Air Quality Estimation Using Nonhomogeneous Markov Chains: A Case Study Comparing Two Rules Applied to Mexico City Data. Journal of Environmental Protection, 14, 561-582.
https://doi.org/10.4236/jep.2023.147033
[20] Rogers, C. and Bush, E. (2022) Using Tillandsia recurvate (Ball Moss) as a Biological Indicator to Monitor Air Pollution and Retain Oil Pollution. Open Journal of Air Pollution, 11, 62-69.
https://doi.org/10.4236/ojap.2022.113005
[21] De Haan, L. and Ferreira, A. (2006) Extreme Value Theory: An Introduction. Springer.
[22] Reiss, R.D. and Thomas M (1997) Statistical Analysis of Extreme Values. Volume 2, Birkhäuser.
[23] Nadaryah, S. (2003) Chap. 17. Extreme Theory, Models and Simulation. In: Sanbhag, D.N. and Rao, C.R., Eds., Handbook of Statistics, Volume 21, Elsevier, 607-691.
[24] Coles, S. (2001) An Introduction to Statistical Modeling of Extreme Values. Springer.
[25] Ercelebi, S.G. and Toros, H. (2009) Extreme Value Analysis of Istanbul Air Pollution Data. CLEANSoil, Air, Water, 37, 122-131.
https://doi.org/10.1002/clen.200800041
[26] Rodríguez, S., Huerta, G. and Reyes, H. (2016) A Study of Trends for Mexico City Ozone Extremes: 2001-2014. Atmósfera, 29, 107-120.
https://doi.org/10.20937/atm.2016.29.02.01
[27] Surman, P.G., Bodero, J. and Simpson, R.W. (1987) The Prediction of the Numbers of Violations of Standards and the Frequency of Air Pollution Episodes Using Extreme Value Theory. Atmospheric Environment (1967), 21, 1843-1848.
https://doi.org/10.1016/0004-6981(87)90125-9
[28] Tabari, H. (2021) Extreme Value Analysis Dilemma for Climate Change Impact Assessment on Global Flood and Extreme Precipitation. Journal of Hydrology, 593, Article ID: 125932.
https://doi.org/10.1016/j.jhydrol.2020.125932
[29] García-Cueto, O.R., López-Velázquez, J.E., Bojórquez-Morales, G., Santillán-Soto, N. and Flores-Jiménez, D.E. (2020) Trends in Temperature Extremes in Selected Growing Cities of México under a Non-Stationary Climate. Atmósfera, 34, 233-254.
https://doi.org/10.20937/atm.52784
[30] Fang, J., Wahl, T., Zhang, Q., Muis, S., Hu, P., Fang, J., et al. (2021) Extreme Sea Levels along Coastal China: Uncertainties and Implications. Stochastic Environmental Research and Risk Assessment, 35, 405-418.
https://doi.org/10.1007/s00477-020-01964-0
[31] Yue, S. and Wang, C.Y. (2004) A Comparison of Two Bivariate Extreme Value Distributions. Stochastic Environmental Research and Risk Assessment (SERRA), 18, 61-66.
https://doi.org/10.1007/s00477-003-0124-x
[32] Nelsen, R.B. (2006) An Introduction to Copulas. 2nd Edition, Springer.
[33] Salvadori, G. and De Michele, C. (2010) Multivariate Multiparameter Extreme Value Models and Return Periods: A Copula Approach. Water Resources Research, 46, W10501.
https://doi.org/10.1029/2009wr009040
[34] Masseran, N. and Hussain, S.I. (2020) Copula Modelling on the Dynamic Dependence Structure of Multiple Air Pollutant Variables. Mathematics, 8, Article No. 1910.
https://doi.org/10.3390/math8111910
[35] Gumbel, E.J. (1960) Bivariate Exponential Distributions. Journal of the American Statistical Association, 55, 698-707.
https://doi.org/10.1080/01621459.1960.10483368
[36] Hougaard, P. (1986) Survival Models for Heterogeneous Populations Derived from Stable Distributions. Biometrika, 73, 387-396.
https://doi.org/10.1093/biomet/73.2.387
[37] Hougaard, P. (1986) A Class of Multivariate Failure Time Distributions. Biometrika, 73, 671-678.
https://doi.org/10.2307/2336531
[38] Hutchinson, T.P. and Lai, C.D. (1990) Continuous Bivariate Distributions. Emphasizing Applications, Rumsby.
[39] Padgett, W.J. (2011) Weibull Distribution. In: Lovic, M., Ed., International Encyclopedia of Statistical Science, Springer, 1651-1653.
https://doi.org/10.1007/978-3-642-04898-2_611
[40] Tsokos, C.P. (2011) Generalized Extreme Value Family of Probability Distributions. In: Lovric, M., Ed., International Encyclopedia of Statistical Science, Springer, 585-589.
https://doi.org/10.1007/978-3-642-04898-2_427
[41] Carlin, B.P. and Louis, T.A. (2000) Bayes and Empirical Bayes Methods for Data Analysis. 2nd Edition, Chapman and Hall/CRC.
[42] Lunn, D., Spiegelhalter, D., Thomas, A. and Best, N. (2009) The BUGS Project: Evolution, Critique and Future Directions. Statistics in Medicine, 28, 3049-3067.
https://doi.org/10.1002/sim.3680
[43] Spiegelhalter, D.J., Thomas, A., Best, N.G. and Gilks, W.R. (1999) BUGS: Bayesian Inference Using Gibbs Sampling. MRC Biostatistics Unit.
https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=478195f845581114c5dde5a1cf21f5786d7127a1
[44] Spiegelhalter, D.J., Best, N.G., Carlin, B.P. and Van Der Linde, A. (2002) Bayesian Measures of Model Complexity and Fit. Journal of the Royal Statistical Society Series B: Statistical Methodology, 64, 583-639.
https://doi.org/10.1111/1467-9868.00353
[45] Raftery, A.E. (1996) Hypothesis Testing and Model Selection. In: Gilks, W., Richardson, S. and Speigelhalter, D.J., Eds., Markov Chain Monte Carlo in Practice, Chapman and Hall, 163-187
[46] N.O.M. (2014) Norma Oficial Mexicana NOM-020-SSA1-2014. Diario Oficial del a Federación. 19 de agosto de 2014. (In Spanish)
[47] N.O.M. (2019) Norma Oficial Mexicana NOM-172-SEMARNAT-2019. Diario Oficial de la Federación. 20 de noviembre de 2019. Mexico. (In Spanish)
https://www.dof.gob.mx/nota_detalle.php?codigo=5579387fecha=20/11/2019#gsc.tab=0
[48] Hürliman, W. (2005) Properties and Measure of Dependence for Archmax Copula. Advances and Applications in Statistics, 5, 125-143.
[49] Yilmaz, M. and Bekçi, M. (2021) Construction of a Bivariate Copula by Rüschendorf’s Method. Cumhuriyet Science Journal, 42, 201-208.
https://doi.org/10.17776/csj.753556
[50] Johnson, N.L., Koltz, S. and Balakrishnan, B. (1995) Continuous Univariate Distributions. Volume 2, John Wiley and Sons.
[51] N.A.D.F. (2018) Norma Ambiental para el Distrito Federal NADF-009-AIRE-2017. Gaceta Oficial de la Ciudad de México. 14 de noviembre de 2018. Mexico. (In Spanish)
http://www.aire.cdmx.gob.mx/descargas/monitoreo/normatividad/NADF-009-AIRE-2017.pdf

Copyright © 2025 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.