Recurrence plots have been used extensively in various fields. In this work, Recurrence Plots (RPs) are used to investigate changes in the non-linear behaviour of urban air pollution using large datasets of raw hourly data. This type of analysis has not previously been used to extract information from large datasets for this kind of non-linear problem. Two approaches are used: the first presents the results by monitoring network, and the second presents them by particle type. The analysis shows the feasibility of using Recurrence Analysis for pollution monitoring and control.

The states of natural systems typically change over time. Investigating these changes in complex systems helps to understand and describe them. A relatively new method based on non-linear data analysis, the recurrence plot, has become popular for describing such changes [1, 2].

In this contribution, the non-linear behaviour of urban air pollution is quantified and analysed at various sites in Mexico City, using large datasets covering a number of years. This is carried out to show the feasibility of analysing key features embedded in the raw datasets.

In recent times, urban air pollution has been a growing problem, especially for urban communities. Size, shape and chemical properties govern the lifetime of particles in the atmosphere and the site of deposition within the respiratory tract. Air pollution has also been held responsible for various health disorders, especially respiratory complications, resulting in an increase in the number of asthma cases and hospital admissions in some parts of the world; this has been widely documented [3-5].

Most major pollutants can alter pulmonary function, in addition to causing other health effects, when exposure concentrations are high. This is especially severe in vulnerable sectors of the population such as children, asthmatics and the elderly, and has been extensively documented [6-9].

In this work, five particles were chosen on the basis of their availability at the monitoring sites and their toxicity: Ozone (O_{3}), Carbon Monoxide (CO), Nitrogen Dioxide (NO_{2}), Sulphur Dioxide (SO_{2}) and Particulate Matter of less than 10 micrometers (PM10). The datasets are separated according to month of the year and type of particle. There is one data point per hour for each particle at each of the five sites, which makes it difficult to extract information from the datasets using common methods.

Ozone is a natural component of the atmosphere that is found at low concentrations and is crucial for life. Air pollution caused by high concentrations of ozone is a common problem in large cities throughout the world [

In Mexico City, ozone levels have been decreasing, especially since 2009. However, there is still a risk of overexposure, mainly in the southwest region [

Carbon monoxide (CO) is a tasteless, odorless gaseous pollutant ubiquitous in the outdoor atmosphere that is generated by combustion [

Adverse health effects of CO exposure include death from asphyxiation at high exposure levels and, at lower levels, impaired neuropsychological performance and risk for myocardial ischemia and rhythm disturbances in persons with cardiovascular disease. The most definitive evidence on CO comes largely from controlled exposure studies, involving CO inhalation at concentrations to mimic exposures previously typical of urban environments [19, 20].

Carbon monoxide has also been held responsible for many hospital admissions due to poisoning. In the US alone, around 40,000 people are admitted to hospital for this cause each year [

In Mexico, the Official Norm NOM-021-SSA1-1993 sets the maximum level for carbon monoxide at 11.0 parts per million (ppm) as an 8-hour average, which cannot be exceeded more than once a year. By comparison, the federal standard in the United States is 9 ppm for an 8-hour average and 35 ppm for a 1-hour average.

Nitrogen Dioxide (NO_{2}) is a particularly important compound, not only for its health effects, but also because it absorbs visible light and contributes to reduced visibility. It also plays a critical role in the production of ozone, because the photolysis of NO_{2} is the initial step in the photochemical formation of ozone [

In nature, nitrogen dioxide concentrations range from 10 to 50 parts per billion (ppb). Higher levels are due to industrial processes and fossil fuel sources. Furthermore, motor vehicles contribute substantially to urban levels of nitrogen oxides through their engine combustion processes [_{2} is critically important in order to assess the potential effect of NO_{2} on human health and ecosystems, and to develop strategies for the effective control of NO_{2} pollution [23-25].

In Mexico, the official norm: NOM-023-SSA1-1993 [^{3} (0.021 ppm) for annual mean and 200 µg/m^{3} (0.106 ppm) for hourly mean.
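The paired values quoted above (for example 200 µg/m^{3} and 0.106 ppm) are related by the ideal-gas conversion between mass and volume concentrations. The following is a minimal sketch of that arithmetic, assuming a reference state of 25 °C and 1 atm (molar volume 24.45 L/mol) and a molar mass of 46.01 g/mol for NO_{2}; the reference conditions and the function names are assumptions made for illustration and are not stated in the norm as quoted here.

```python
# Assumed reference conditions: 25 degrees C and 1 atm, where one mole of an
# ideal gas occupies about 24.45 litres. Other reference temperatures give
# slightly different factors.
MOLAR_VOLUME_L = 24.45

def ugm3_to_ppm(conc_ugm3, molar_mass):
    """Convert a mass concentration in ug/m^3 to a volume mixing ratio in ppm."""
    return conc_ugm3 * MOLAR_VOLUME_L / (molar_mass * 1000.0)

def ppm_to_ugm3(conc_ppm, molar_mass):
    """Convert a volume mixing ratio in ppm to a mass concentration in ug/m^3."""
    return conc_ppm * molar_mass * 1000.0 / MOLAR_VOLUME_L

NO2_MOLAR_MASS = 46.01  # g/mol

hourly_ppm = ugm3_to_ppm(200.0, NO2_MOLAR_MASS)
print(round(hourly_ppm, 3))  # 0.106
```

Under these assumptions the hourly value of 200 µg/m^{3} converts to roughly 0.106 ppm, in agreement with the figure quoted in the text.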

Sulphur Dioxide gases contribute to the deterioration of air quality. Several epidemiological studies have demonstrated a direct association between atmospheric inhalable Sulphur dioxide and respiratory diseases, pulmonary damage and mortality among population [

In the past three decades in Europe, and more recently in the United States, there have been substantial reductions in SO_{2} emissions [26,27].

The World Health Organization recommends a concentration of 100 to 150 µg/m^{3} as a 24-hour mean and 40 to 60 µg/m^{3} as an annual mean. The official Mexican Norm NOM-022-SSA1-1993 establishes a limit of 341 µg/m^{3} as a 24-hour mean, not to be exceeded more than once a year, and 79 µg/m^{3} as an annual mean, to protect the vulnerable population.

Airborne particulate matter (PM) is a mixture of small particles and liquid droplets suspended in the atmosphere, which contributes significantly to urban air quality problems such as acid rain and visibility degradation [

In airborne pollution, a particle can be any solid or liquid material with a diameter between 0.002 and 500 micrometers (µm). Airborne particulates with a diameter of 10 µm or less are of concern from the perspective of air pollution. A variety of national and worldwide standards, directives and guidelines exist to define acceptable particulate levels in the air.

These types of particles are classified according to their effect on human health and their physical characteristics.

Mexico City is located in the Valley of Mexico. This valley, also known as the Valley of the Damned, is a large valley in the high plateaus at the center of Mexico, at an altitude of 2240 meters (7349 feet). The Federal District of Mexico City is situated in central-south Mexico and is surrounded by the state of Mexico to the west, north and east, and by the state of Morelos to the south. The city covers an area of around 1485 km^{2} (571 sq mi).

The sites used in this work are as follows: Northeast (San Agustín-SAG), Northwest (Acatlán-FAC), Downtown (Merced-MER), Southeast (Iztapalapa-UIZ) and Southwest (Pedregal-PED). The map of the monitoring sites is shown on

The recurrence plot (RP) exhibits characteristic patterns for typical dynamical behavior. A collection of single recurrence points, homogeneously and irregularly distributed over the whole plot, reveals a mainly stochastic process [

The recurrence plot is a graphical tool introduced by Eckmann et al. (1987) in order to extract qualitative characteristics of a time series. The recurrence of a state at time i at a different time j is pictured within a two-dimensional square matrix of black and white dots, where the black dots represent a recurrence and both axes represent time [30,31].

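Such an RP can be mathematically expressed as:

$$R_{i,j} = \Theta\left(\varepsilon - \lVert x_{i} - x_{j} \rVert\right), \qquad i, j = 1, \ldots, N$$

where N is the number of considered states x_{i}, ε is a threshold distance, ‖·‖ is a norm and Θ(·) is the Heaviside function [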

Since R_{i,i} = 1 by definition, the RP has a black main diagonal line called the line of identity (LOI). In this context, a recurrence is a state x_{j} that is sufficiently close to x_{i} (states x_{j} that fall into an m-dimensional neighborhood of size ε centered at x_{i}) [
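The thresholded-distance definition of the RP can be illustrated with a short sketch. The code below is a minimal pure-Python example, assuming the Euclidean norm and treating a distance equal to the threshold as a recurrence; the function name `recurrence_matrix`, the toy states and the threshold value are hypothetical and are not taken from the study.

```python
import math

def recurrence_matrix(states, eps):
    """Return R with R[i][j] = 1 if the Euclidean distance between
    states[i] and states[j] is at most eps, and 0 otherwise."""
    n = len(states)
    return [
        [1 if math.dist(states[i], states[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]

# Toy two-dimensional states (illustrative values only).
states = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 0.1)]
R = recurrence_matrix(states, eps=0.2)

for row in R:
    print(row)
# [1, 1, 0, 1]
# [1, 1, 0, 1]
# [0, 0, 1, 0]
# [1, 1, 0, 1]
```

The main diagonal is always 1 (the LOI) and the matrix is symmetric, because the distance between two states does not depend on their order.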

Using the time series of a single observable variable (a particle concentration, in this case), it is possible to reconstruct a phase space trajectory. Starting from the scalar time series, a sequence of embedded vectors is generated [
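As an illustration of how embedded vectors are generated from a scalar series, the sketch below builds time-delay vectors from consecutive delayed samples. It assumes the usual delay-coordinate construction with an embedding dimension `dim` and a delay `delay` measured in samples; the function name and the toy series are hypothetical.

```python
def delay_embed(series, dim, delay):
    """Build vectors (u[i], u[i + delay], ..., u[i + (dim - 1) * delay])."""
    n_vectors = len(series) - (dim - 1) * delay
    if n_vectors <= 0:
        raise ValueError("series is too short for this dimension and delay")
    return [
        tuple(series[i + k * delay] for k in range(dim))
        for i in range(n_vectors)
    ]

# Toy series (illustrative values only).
u = [1, 2, 3, 4, 5, 6]
vectors = delay_embed(u, dim=3, delay=2)
print(vectors)  # [(1, 3, 5), (2, 4, 6)]
```

A series of length 6 with dimension 3 and delay 2 yields only two vectors, which shows how quickly the usable length shrinks as the embedding parameters grow.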

Determining the embedding parameters must be the first step in analysing nonlinear systems [29,34-37]. For this reason, a search for the best dimension and time delay must be made first. In this contribution, the best dimension value is calculated using the false nearest neighbors (FNN) algorithm, as described in [32,38].
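The idea behind a false nearest neighbors test can be sketched as follows. This is a simplified illustration and not necessarily the exact procedure of [32,38]: it uses only a distance-ratio criterion, a brute-force neighbor search, and an assumed tolerance `rtol` of 10. A neighbor is counted as false when adding one more delay coordinate separates the pair by more than `rtol` times their original distance. The function name and the sine-wave test series are hypothetical.

```python
import math

def fnn_fraction(series, dim, delay, rtol=10.0):
    """Fraction of nearest neighbours in dimension `dim` that become
    'false' when one more delay coordinate is added."""
    n = len(series) - dim * delay  # vectors that also exist in dim + 1
    if n < 2:
        raise ValueError("series is too short for this dimension and delay")
    vecs = [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]
    false_count = 0
    valid = 0
    for i in range(n):
        # Brute-force search for the nearest neighbour of vector i.
        j, d = min(
            ((j, math.dist(vecs[i], vecs[j])) for j in range(n) if j != i),
            key=lambda pair: pair[1],
        )
        if d == 0.0:
            continue  # identical vectors give no usable ratio
        valid += 1
        extra = abs(series[i + dim * delay] - series[j + dim * delay])
        if extra / d > rtol:
            false_count += 1
    return false_count / valid if valid else 0.0

# Synthetic sine wave, used here only as a stand-in for a real series.
series = [math.sin(0.3 * t) for t in range(200)]
fractions = {m: fnn_fraction(series, dim=m, delay=5) for m in (1, 2, 3)}
```

For this synthetic signal the fraction is large at dimension 1 and drops to nearly zero at dimension 2, which is the kind of drop used to select the embedding dimension. For hourly datasets spanning several years the quadratic neighbor search would need to be replaced by a faster method.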

Also, when calculating an RP a norm must be chosen [

Although it is possible to identify each plot from figures 2(c) and (d), some experience is needed to interpret the RPs [

The main idea of this project is to reconstruct the (unknown) system dynamics in phase space using time-delay embedding, and then to compute the distances between all pairs of embedded vectors. This generates a symmetric two-dimensional square matrix for each dataset, as shown in figures 1(c) and (d), to which recurrence quantification analysis (RQA) is then applied.

Zbilut [

In general, the characteristics measured in an RP are recurrence rate, determinism, ratio, entropy and trend. In this contribution, two extensions of these characteristics were also considered: Laminarity and Trapping Time.

The recurrence rate is a measure of recurrences, or density of recurrence points in the RP. This rate gives the mean probability of recurrences in the system [41,43]. The recurrence rate is given by:

$$REC = \frac{1}{N^{2}} \sum_{i,j=1}^{N} R_{i,j}$$

in the case of time series, and;

in the case of spatial data [

The recurrence rate represents the fraction of recurrent points with respect to the total number of possible recurrences. It is a density measure of the RP.
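As a small worked example of this density measure, the sketch below counts the recurrence points in a binary recurrence matrix and divides by the total number of entries. The matrix is a toy example and the function name is hypothetical; the result is a fraction, which can be multiplied by 100 if a percentage is preferred.

```python
def recurrence_rate(R):
    """Fraction of entries of the square binary matrix R that are 1."""
    n = len(R)
    return sum(sum(row) for row in R) / (n * n)

# Toy 4 x 4 recurrence matrix (illustrative values only).
R = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
]
rec = recurrence_rate(R)
print(rec)  # 0.625
```

Here 10 of the 16 entries are recurrence points, so the rate is 0.625. Some implementations exclude the line of identity from the count; that choice is not made here.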

Determinism is a measure for predictability of the system [

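$$DET = \frac{\sum_{l=l_{min}}^{N} l\,P(l)}{\sum_{l=1}^{N} l\,P(l)}$$

where P(l) denotes the probability of finding a diagonal line of length l in the RP and l_{min} is the minimum length counted as a line. This measure quantifies the predictability of a system [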

The average diagonal line length L_{mean} is defined as:

$$L_{mean} = \frac{\sum_{l=l_{min}}^{N} l\,P(l)}{\sum_{l=l_{min}}^{N} P(l)}$$

This characterizes the average time that two segments of a trajectory stay in the vicinity of each other, and is related to the mean predictability time [

The choice of l_{min} can also be used in order to exclude short temporal scales that are not important [
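The diagonal-line measures can be sketched in a few lines once the distribution of diagonal line lengths is available. The code below is an illustrative implementation under two assumptions that the text does not fix: only the diagonals above the LOI are scanned (the matrix is symmetric, so the ratios are unchanged), and l_{min} defaults to 2. All function names and the toy matrix are hypothetical.

```python
from collections import Counter

def diagonal_line_lengths(R):
    """Histogram {length: count} of diagonal lines above the main diagonal."""
    n = len(R)
    hist = Counter()
    for k in range(1, n):  # k is the offset from the line of identity
        run = 0
        for i in range(n - k):
            if R[i][i + k]:
                run += 1
            else:
                if run:
                    hist[run] += 1
                run = 0
        if run:
            hist[run] += 1
    return hist

def determinism(hist, l_min=2):
    """Share of recurrence points that belong to lines of length >= l_min."""
    total = sum(l * c for l, c in hist.items())
    in_lines = sum(l * c for l, c in hist.items() if l >= l_min)
    return in_lines / total if total else 0.0

def mean_line_length(hist, l_min=2):
    """Average length of the diagonal lines of length >= l_min."""
    count = sum(c for l, c in hist.items() if l >= l_min)
    points = sum(l * c for l, c in hist.items() if l >= l_min)
    return points / count if count else 0.0

# Toy symmetric recurrence matrix (illustrative values only).
R = [
    [1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1],
]
hist = diagonal_line_lengths(R)  # one line of length 3, one isolated point
det = determinism(hist)          # 3 of 4 points lie on a line: 0.75
l_mean = mean_line_length(hist)  # 3.0
```

Raising `l_min` removes short lines from the numerator of the determinism and from the mean length, which is how short temporal scales are excluded.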

The Ratio variable is defined as the quotient of determinism (DET) divided by the recurrence rate (REC). It is useful for detecting transitions between states: this ratio increases during transitions but settles down when a new quasi-steady state is achieved [

The entropy measure refers to the Shannon entropy of the frequency distribution of the diagonal line lengths [
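A minimal sketch of this entropy calculation is given below. It assumes that the line-length counts are normalised to probabilities over lines of length at least l_{min} (default 2) and that the natural logarithm is used; the base of the logarithm is an assumption, and base 2 would give the result in bits. The histogram values are hypothetical.

```python
import math

def line_length_entropy(hist, l_min=2):
    """Shannon entropy (natural log) of the diagonal line length distribution."""
    counts = [c for l, c in hist.items() if l >= l_min and c > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts)

# Hypothetical histogram: four lines of length 2, two of length 3, two of length 5.
hist = {2: 4, 3: 2, 5: 2}
entr = line_length_entropy(hist)
print(round(entr, 4))  # 1.0397
```

The probabilities here are 0.5, 0.25 and 0.25, so the entropy equals 1.5 ln 2. A histogram with a single line length gives zero entropy.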

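The trend is a linear regression coefficient over the recurrence point density of the diagonals parallel to the Line of Identity (LOI). The trend measurement is given by:

$$TREND = \frac{\sum_{i=1}^{\tilde{N}} (i - \tilde{N}/2)\,(REC_{i} - \langle REC_{i} \rangle)}{\sum_{i=1}^{\tilde{N}} (i - \tilde{N}/2)^{2}}$$

where REC_{i} is the recurrence rate of the diagonal at distance i from the LOI, ⟨·⟩ denotes the average over these diagonals and Ñ is the number of diagonals considered.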
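The regression over diagonal densities can be sketched as follows. The code assumes the commonly quoted form of the trend, in which the diagonal index is measured relative to half the number of diagonals considered; the parameter `n_tilde`, its default (all diagonals above the LOI) and the toy matrix are assumptions made for illustration.

```python
def trend(R, n_tilde=None):
    """Slope-like coefficient of diagonal recurrence density versus
    distance from the line of identity."""
    n = len(R)
    if n_tilde is None:
        n_tilde = n - 1  # assumed default: use every diagonal above the LOI
    if not 1 <= n_tilde <= n - 1:
        raise ValueError("n_tilde must be between 1 and len(R) - 1")
    # Density of recurrence points on the diagonal at distance i from the LOI.
    rec = [
        sum(R[j][j + i] for j in range(n - i)) / (n - i)
        for i in range(1, n_tilde + 1)
    ]
    mean_rec = sum(rec) / n_tilde
    centre = n_tilde / 2
    num = sum((i - centre) * (r - mean_rec) for i, r in enumerate(rec, start=1))
    den = sum((i - centre) ** 2 for i in range(1, n_tilde + 1))
    return num / den

# Toy matrix whose recurrences are confined to a band around the LOI.
R = [[1 if abs(i - j) <= 2 else 0 for j in range(6)] for i in range(6)]
t = trend(R)
print(round(t, 4))  # -0.2667
```

A negative value indicates that recurrences thin out away from the LOI, as in this banded example, while a matrix with uniform diagonal density gives a trend of zero. In practice the outermost diagonals are often excluded through a smaller `n_tilde`, because they contain very few points.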

Laminarity may be defined as the amount of recurrence points which form vertical lines [

$$LAM = \frac{\sum_{v=v_{min}}^{N} v\,P(v)}{\sum_{v=1}^{N} v\,P(v)}$$

where P(v) is the frequency distribution of the lengths v of the vertical lines, which have at least a length of v_{min}. It is noteworthy that Laminarity is evidence of chaotic transitions and is related to the amount of laminar phases in the system (intermittency) [

Trapping Time (TT) shows the average length of the vertical lines and is given by equation (9):

$$TT = \frac{\sum_{v=v_{min}}^{N} v\,P(v)}{\sum_{v=v_{min}}^{N} P(v)} \qquad (9)$$

where v is the length of the vertical lines, v_{min} is the shortest length that is considered a line segment and P(v) is the distribution of the corresponding lengths. TT indicates the average time that the system remains trapped in the same state [
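Laminarity and Trapping Time can both be computed from the histogram of vertical line lengths. The sketch below is illustrative only: it scans every column of the matrix, includes the points on the LOI in the vertical runs, and uses a default v_{min} of 2, none of which is specified in the text. The function names and the toy matrix are hypothetical.

```python
from collections import Counter

def vertical_line_lengths(R):
    """Histogram {length: count} of vertical runs of 1s in each column."""
    n = len(R)
    hist = Counter()
    for j in range(n):
        run = 0
        for i in range(n):
            if R[i][j]:
                run += 1
            else:
                if run:
                    hist[run] += 1
                run = 0
        if run:
            hist[run] += 1
    return hist

def laminarity(hist, v_min=2):
    """Share of recurrence points in vertical lines of length >= v_min."""
    total = sum(v * c for v, c in hist.items())
    in_lines = sum(v * c for v, c in hist.items() if v >= v_min)
    return in_lines / total if total else 0.0

def trapping_time(hist, v_min=2):
    """Average length of the vertical lines of length >= v_min."""
    count = sum(c for v, c in hist.items() if v >= v_min)
    points = sum(v * c for v, c in hist.items() if v >= v_min)
    return points / count if count else 0.0

# Toy symmetric recurrence matrix (illustrative values only).
R = [
    [1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1],
]
hist = vertical_line_lengths(R)  # lengths: two of 2, two of 3, three of 1
lam = laminarity(hist)           # 10 of 13 points: about 0.769
tt = trapping_time(hist)         # 10 points in 4 lines: 2.5
```

With hourly data, a Trapping Time of 2.5 would correspond to the state persisting for about two and a half hours on average.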

Recurrence Quantification Analysis (RQA) was carried out for the years 2005-2010 for all sites mentioned in section 2.2 and for the particles PM_{10}, CO, SO_{2}, NO_{2} and O_{3}, using the raw hourly data obtained from the monitoring stations. The analysis was carried out for recurrence rate (REC), determinism (DET), Ratio, Trapping Time (TT), Laminarity (LAM) and Trend.

The results were analysed separately and presented in the form of boxplots according to two different approaches: by site and by type of particle. This analysis is complex due to the large size of the datasets. However, much useful information has been extracted from the recurrence plots using RQA.
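The two groupings can be illustrated with a short sketch that summarises one RQA measure first by site and then by particle. The records below are placeholder values invented for the example and are not results from this study; the tuple layout and the function name are also hypothetical. Only the median is computed here, as the central line of a boxplot.

```python
from collections import defaultdict
from statistics import median

# Placeholder records: (site, particle, value of one RQA measure).
# These numbers are invented for illustration only.
records = [
    ("Northwest", "O3", 9.5),
    ("Northwest", "CO", 10.5),
    ("Downtown", "O3", 12.0),
    ("Downtown", "CO", 13.0),
    ("Downtown", "PM10", 11.0),
]

def median_by(records, key_index):
    """Group the values by the chosen field and return the median per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return {key: median(values) for key, values in groups.items()}

by_site = median_by(records, 0)      # first approach: by monitoring network
by_particle = median_by(records, 1)  # second approach: by type of particle
```

The same records produce different summaries depending on the grouping, which is why the two approaches are presented separately.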

This approach explores the feasibility of using RQA to extract information from the Recurrence Plots by site. As explained in section 2.2, the monitoring networks were separated into Northwest, Northeast, Downtown, Southwest and Southeast. This approach does not take into account the type of particle, but rather the location of the monitoring network.

Figure 3 shows the recurrence rate for all monitoring networks. It is worth noting that the median recurrence rate lies between 10 and 13, with the lowest recurrence rate found for the Northwest network. The other sites show a fairly constant spread of the percentiles.

The ratio for particle concentration (

Furthermore, the frequency distribution of the entropy is also stable, as shown in figure 6. The median oscillates between 2 and 3 for all monitoring networks, with the percentiles reaching up to 5. This could also be due to the type of particle rather than the site.