Statistical Air Pollution Index (API) for Trinidad and Tobago Based on Observed Data on Trinidad’s West Coast

Air pollution has been identified as the largest environmental threat facing the world today, estimated to cause 7 to 10 million deaths worldwide annually (World Health Organisation, 2014, 2016; Yale University, 2018). Trinidad and Tobago, with a per capita GDP of USD 16,310 (2019), is the most industrialised of the Caribbean islands and, like the rest of the Caribbean region, is also affected by seasonal Sahara dust (<PM2.5). Air quality was assessed over Trinidad’s west coast. Pollution was measured at four stations from March 2015 to May 2016, representative of rural, urban, mixed background and industrial land uses. Annual mean PM2.5 and PM10 in ambient air exceeded the WHO guidelines for protection of public health (n = 522). PM2.5 and PM10 exceeded the WHO (2006) safe limit guidelines (10 µg/m³ for PM2.5; 20 µg/m³ for PM10) over 70% of the time sampled at urban and industrial sites. Gaseous pollutants found to be in exceedance were CO, NH3, NO2, N2O and C6H6. Nitrogen dioxide and benzene were the most prolific. A collated metric based on measurement of these pollutants yielded a statistically validated algorithm: an Air Pollution Index. The single metric can convey useful and easily understood information on air quality to regulators and the general public.


Introduction
Air pollution is a complex multicomponent mixture that can have varying impacts on human health, ranging from chronic effects on the respiratory and cir- [...] more difficult to establish cause and effect relationships.
Blessed with abundant petroleum and natural gas deposits, Trinidad and Tobago (T & T) has seen successive governments actively court large scale industrial enterprises since the 1960s, with little balancing consideration given to environmental health and pollution management (PLIPDECO, 1999). The natural resources have been a boon to economic development; however, they have also been a disincentive to conserving the environment. Trinidad and Tobago has reaped the economic benefits of its petrochemical resources, but has been ranked as the 3rd highest (if not 2nd) producer of GHGs per capita globally (Boodlal, 2012). Besides locally produced industrial and urban air pollutants, T & T, like the rest of the Caribbean, is also seasonally afflicted by trans-Atlantic fine particulates brought over with the Saharan Air Layer (NASA, 2013). These pollutants are the likely cause of the increased respiratory and allergic ailments observed by health workers locally (Monteil et al., 2005). Air quality at any given time therefore depends on the surrounding land use, traffic levels, weather conditions and time of year. This study characterised the air quality by investigating how various factors affect the types and levels of pollutants at different places over time. This paper is part of a larger study, and outlines the calculation of an air pollution index for Trinidad and Tobago based on measurement of a range of pollutants from March 2015 to May 2016. The characterisation and distribution of the air pollution are described only in summary here and will be discussed in greater detail in a separate publication.

Literature Review
The WHO estimates that, of all air pollutants, fine airborne particulates (PM2.5) have the greatest impact on human health. Worldwide, they are estimated to cause about 25% of lung cancer deaths, 8% of chronic obstructive pulmonary disease (COPD) deaths, and about 15% of ischaemic heart disease and stroke (World Health Organisation, 2017b).
Direct relationships between levels of fine (PM2.5) and respirable (PM10) particulates and mortality rates in affected populations have been well established (Pope III, Rodermund, & Gee, 2007; Pope III, Schwartz, & Ransom, 1992; Balluz et al., 2007; World Health Organisation, 2014). Supporting scientific evidence indicates that, besides worsening existing health conditions in cardiovascular and respiratory disease patients (Bruske et al., 2010), air pollution can preci- [...] (Dockery et al., 1993; Lepeule, Laden, Dockery, & Schwartz, 2012; Pope III, Schwartz, & Ransom, 1992). The level, chemical nature and duration of exposure determine the "dose-response" relationship, and hence the effects on human health. Pope III et al. (2011) found that the exposure-response relationship associated with PM2.5 was qualitatively different for lung cancer versus cardiovascular mortality. At low exposure levels, cardiovascular deaths are projected to account for most of the burden of disease, whereas at high levels of PM2.5, lung cancer becomes proportionately more important.
Air quality is a concept, measurable by the levels of pollution in the air. A composite index is useful because it condenses complex data into a single metric, easing understanding and prompting swift action from regulators and persons who can be impacted. Composite air pollution or air quality indices (APIs and AQIs) have been used for more than three decades, particularly in industrial countries where regulatory management is closely intertwined with public health protection. The specific combination of weighted (or unweighted) variables, averaged over specific time frames (e.g. 1-hr, 8-hr, 24-hr), varies by state or country, directed by that country's own regulatory priorities and policies.
There is some commonality in the pollutant factors included in calculating an API, as it usually incorporates one or more primary (directly emitted) and/or secondary (derived from chemical reactions involving primary pollutants) pollutants. Each method of calculation has its merits and limitations. Several authors have evaluated and compared the methodologies used to calculate APIs/AQIs around the globe (Bishoi, Prakash, & Jain, 2009; Tiwari, 2015; Wong et al., 2012); these are summarised in Table 1.

Methods and Procedures
The metric for "measuring" air quality is inextricably linked to the types and amounts of the pollutants in the air at a given time and location, as well as to the accuracy of the measurement methods used. An Air Pollution Index was calculated based on fifteen months of pollutant data collected during this study on Trinidad's west coast (Figure 1). Pollutants and meteorological parameters were measured simultaneously at four sites classified as urban, rural, industrial and mixed background over a minimum 24 hr period once every 6th day from March 2015 to May 2016. Particulates were measured gravimetrically using a commercially available low volume (Q = 2.3 m³/hr) modular sampler (ISAP 1050e500) compliant with German/EU air quality control regulations for limit values and measurement engineering (2008/50/EC, BimSchV, EN 12341, EN 14907, EN 481, ISO 8756, TA Luft, VDI 2463 BI. 1 + 8 and VDI BI.2). Meteorological parameters (relative humidity, temperature, precipitation, atmospheric pressure, wind speed and direction) were measured using the compatible meteorological sensors for the gravimetric sampler. Pollutant gases were measured using a GASMET DX 4015 Multicomponent Portable Fourier Transform Infra-Red Spectrometer, calibrated to ambient conditions. Data collected were averaged and analysed as daily averages, the smallest sampling unit for which particulates were measured.
Table 1 (recovered row). Taiwan AQI (Lohani, 1984): statistical factor analysis approach for AQI. The authors compared the air quality index based on the factor analysis method with the Pindex method. The ratings (or trends) obtained by both methods are exactly the same, but the AQI based on factor analysis shows a wider range, which indicates it is the superior approach.
The development of a multivariate metric for capturing air quality has remained an elusive goal locally, though it is a common approach used around the world to calculate APIs (Bishoi, Prakash, & Jain, 2009; Wong et al., 2012; Tiwari, 2015). The methodology for compiling and validating the composite index categories, that is, the systematic process of collating and condensing the field data, statistically extracting the factors and weighting coefficients, and compiling these factors into well segregated index categories, was adopted from Chadee & Stoute (2018). This study went a step further by validating the composite index through regression of calculated results against actual field measurements, providing statistical rigour and validation for the Air Pollution Index algorithm formulated. Two algorithms were developed, one using fewer measured parameters than the other, though representing less of the variance.

Experimental/Numerical Setting
Essentially, Exploratory Factor Analysis yields the underlying structure of the input variables in the form of factors, which are then aggregated into the composite index, a linear combination of weighted variables. The index for the air pollution components is built by multiplying the measured value of each pollutant by a coefficient which is the weighted average of the regression factor score coefficients, as illustrated by Equations (1) and (2). In Equation (1), the weight for pollutant $j$, $W_j$, is estimated by

$$W_j = \sum_{i=1}^{p} \frac{V_i}{V}\,\beta_{ij} \qquad (1)$$

where $V_i$ is the variance explained by the $i$-th factor for a solution of $p$ factors, $\beta_{ij}$ is the regression factor score coefficient, and $V$ is the total variance in the values for all the pollutants in the index explained by all the $p$ factors. The index score, $I_T$, for the air pollution at any time, $T$, is obtained by applying the algorithm in Equation (2), where $S_j$ is the value for pollutant $j$ and the sum runs over the $n$ pollutants in the group:

$$I_T = \sum_{j=1}^{n} W_j S_j \qquad (2)$$

One index was calculated using 11 significant input variables (API-11), and a second was calculated by truncating the collinear variables through stepwise regression, reducing the input variables to 6 (API-6).
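The weighting and aggregation steps can be sketched in a few lines of Python. This is a minimal illustration, assuming the weight is read as the variance-weighted average of the factor score coefficients and that the total variance V is the sum of the variances explained by the retained factors. The function names and all numbers are hypothetical and are not taken from the study.

```python
def pollutant_weights(beta, explained_variance):
    """Return one weight per pollutant.

    beta[i][j] is the factor score coefficient of pollutant j on factor i.
    explained_variance[i] is the variance explained by factor i.
    Assumption: V is the sum of the per-factor explained variances.
    """
    n_pollutants = len(beta[0])
    if len(beta) != len(explained_variance):
        raise ValueError("beta needs one row per factor")
    if any(len(row) != n_pollutants for row in beta):
        raise ValueError("every factor needs one coefficient per pollutant")
    total = sum(explained_variance)
    return [
        sum(v * row[j] for v, row in zip(explained_variance, beta)) / total
        for j in range(n_pollutants)
    ]


def index_score(weights, values):
    """Index for one sampling day: sum of weight times measured value."""
    if len(weights) != len(values):
        raise ValueError("one measured value is needed per weight")
    return sum(w * s for w, s in zip(weights, values))


# Hypothetical two-factor, two-pollutant example (illustrative numbers only).
beta = [[0.5, 0.1],
        [0.0, 0.4]]
explained_variance = [3.0, 1.0]
weights = pollutant_weights(beta, explained_variance)
score = index_score(weights, [20.0, 10.0])
print(weights, score)
```

In this toy case the first factor explains three quarters of the retained variance, so its coefficients dominate both weights.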
The Air Pollution Index scores were classified by examining a histogram of the scores to look for natural breaks in frequency, which could be used as the classification boundaries. The effectiveness of the classification was estimated using Discriminant Analysis (SPSS V.22) to obtain cross validation of the selected classes with those estimated by the discriminant function(s), and General Linear Model analysis to determine fit to observed data using pooled pollutant data for all four stations.

1) Pollutant levels
The pollutant levels measured during the study are listed as daily or annual means to conform with reporting and comparison standards. Annual mean particulates measured at each station are listed in Table 2, together with the number of times the sampled data exceeded local and international standards. Particulates measured at the urban and industrial stations (Port-of-Spain and Pt. Lisas respectively) exceeded the WHO (2006) daily safe limit guidelines (10 µg/m³ for PM2.5, 20 µg/m³ for PM10) over 70% of the time sampled, with annual means also exceeding the local Environmental Management Authority's PM2.5 limit value of 15 µg/m³ (Table 2).
Annual mean levels of gaseous pollutants at the four sampling sites (Table 3) indicate a range of pollutants, a few of which (NO2 and C6H6 being the most prolific) exceed the EMA's limit values outlined in Schedule 1 of the regulatory guidelines (Environmental Management Authority of Trinidad and Tobago, 2014). Detailed analysis of the trends for exceedance and of variation by station and time of year is outlined in a separate paper.
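A statement such as "exceeded the guideline over 70% of the time sampled" corresponds to a simple exceedance fraction over the daily samples. The sketch below shows one way to compute it; the function name and the sample values are hypothetical, and the study's own processing may have differed.

```python
def exceedance_fraction(daily_values, limit):
    """Fraction of valid daily samples strictly above the limit value.

    Missing samples are passed as None and are skipped.
    """
    valid = [v for v in daily_values if v is not None]
    if not valid:
        raise ValueError("no valid samples")
    return sum(1 for v in valid if v > limit) / len(valid)


# Hypothetical PM2.5 daily means in µg/m³ (not study data), one day missing.
pm25_daily = [8.0, 12.5, None, 15.1, 9.9, 22.0]
fraction = exceedance_fraction(pm25_daily, limit=10.0)
print(f"{fraction:.0%} of valid samples exceeded the limit")
```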
2) Algorithms for API-11 and API-6
The index had to meet several criteria, namely:
a) Input variables used should be universally recognized and have deleterious health impacts.
b) Input variables should be measurable in real time so that the index itself could be readily generated in real time from the field measurements.
[...] Table 4.
The combined algorithm is the weighted sum over the pollutants; the resulting index, termed API-11, was calculated for each daily measurement. The value of the index is the sum of the products of the weights and the pollutant concentrations at any given time or place (Table 5).
Stepwise regression was used to remove the collinear variables used in API-11. This approach allowed only variables which contributed uniquely to the model to be retained. The regression procedure stops when all significant, non-redundant [...] Table 3. This shows that 94.4% of the variance in the 11-variable Air Pollution Index, API-11, is captured by just six variables in API-6. The truncated algorithm for API-6 was derived by applying the regression: the weights for these six variables in API-11 are multiplied by the respective regression coefficients from Table 4. The variable weights for this new index, API-6, are also given in Table 5.
Comparison of the two indices indicates that API-6, with its six input variables, represents 94.4% of the variability of API-11 (Table 6), making it an acceptable substitute for the index with more input variables.
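The truncation step described above, in which the API-11 weight of each retained variable is multiplied by its regression coefficient, can be illustrated as follows. This is only a sketch: the pollutant names, weights and coefficients are hypothetical placeholders rather than the values in Tables 4 and 5, and any regression intercept is ignored.

```python
def truncate_weights(full_weights, regression_coefficients):
    """Scale the full-index weight of each retained pollutant by its
    regression coefficient; pollutants without a coefficient are dropped."""
    missing = set(regression_coefficients) - set(full_weights)
    if missing:
        raise KeyError(f"no full-index weight for: {sorted(missing)}")
    return {
        name: full_weights[name] * coef
        for name, coef in regression_coefficients.items()
    }


def index_score(weights, concentrations):
    """Weighted sum over the pollutants present in `weights`."""
    return sum(w * concentrations[name] for name, w in weights.items())


# Hypothetical values for illustration only.
full_weights = {"PM2.5": 0.4, "CO2": 0.02, "NO2": 0.3}
retained = {"PM2.5": 1.5, "CO2": 2.0}  # coefficients of the retained variables
reduced_weights = truncate_weights(full_weights, retained)
reduced_score = index_score(reduced_weights, {"PM2.5": 10.0, "CO2": 400.0, "NO2": 5.0})
print(reduced_weights, reduced_score)
```

Keeping the weights keyed by pollutant name makes it harder to misalign a coefficient with the wrong variable when the input set shrinks.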
3) Classification and Validation of API-11 and API-6
Classification gives more utility to the index, as both the absolute score and the "API Class" can be quoted in AAQI management documentation and public advisories. The Air Pollution Index scores were classified by examining a histogram of the scores to look for natural breaks in frequency, which delimit the classification boundaries (Figure 2). API-11 is separated into four classes over its score range of 19.18 to 85.98: Low (0 to 23), Normal (>23 to 30), High (>30 to 40) and Very High (>40 to 100). For API-6 (range 9.44 to 93.72), three classes were chosen for ease of separation and where there were natural breaks: low Normal (<20), high Normal (>20 to 30) and High (>30).
The classification systems were first tested statistically for validity using Discriminant Analysis, which allowed cross validation of the selected classes against those estimated by the discriminant function(s). The validity of the classification boundaries used for both indices was tested using canonical discriminant functions (SPSS V.22) for cross validation in each scheme. Three discriminant functions correctly classified 82.7% of the original (83.5% of the cross-validated) data values for API-11. For the API-6 classification boundaries, the discriminant functions, based on the variables in the index, correctly classified 84.9% (83.5%) of the data. The important issue here is not so much to validate the arbitrary boundaries as to understand which variables discriminate best between the various levels of air pollution. The "discriminating" variables in the regression analysis were found to be PM2.5, CO2 and CH3OH, which were also the three most impactful variables.
Validation of the index comes from the consistency of classification, as outlined above, as well as from its "goodness of fit" to the observed data. This second tier of validation was tested using General Linear Model (GLM) analysis. The GLM analysis for API-11 and API-6 yielded robust models, indicated by the high R² values of 0.901 (adjusted R² = 0.866) and 0.902 (adjusted R² = 0.867) respectively, indicating an excellent fit to the pollutant dataset. Noteworthy was that the daily averaged meteorological conditions did not factor significantly, as the adjusted R² values for API-11 and API-6 were close to the R² values for the respective index models, indicating very little variance inflation. The air quality index calculated with the API-11 algorithm for the four monitoring stations during the study period is illustrated in Figure 3.
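The class boundaries quoted above translate directly into a lookup. The sketch below assumes that each upper boundary belongs to the lower class (so a score of exactly 23 is "Low"); the text does not say how a score of exactly 20 is treated for API-6, so that case is an assumption. The function and constant names are hypothetical.

```python
from bisect import bisect_left

# Upper bounds of every class except the last, taken from the text.
API11_BOUNDS = [23.0, 30.0, 40.0]
API11_LABELS = ["Low", "Normal", "High", "Very High"]

API6_BOUNDS = [20.0, 30.0]
API6_LABELS = ["low Normal", "high Normal", "High"]


def classify(score, bounds, labels):
    """Map an index score to its class label.

    Assumption: a score equal to a boundary falls in the lower class.
    """
    if score < 0:
        raise ValueError("index scores are expected to be non-negative")
    return labels[bisect_left(bounds, score)]


print(classify(19.18, API11_BOUNDS, API11_LABELS))  # lowest API-11 score observed
print(classify(85.98, API11_BOUNDS, API11_LABELS))  # highest API-11 score observed
print(classify(25.0, API6_BOUNDS, API6_LABELS))
```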
The GLM analysis highlighted significant three-way interaction effects among station, period (time of the year) and day of week, indicating an interdependence among these three factors on both API-11 and API-6.
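For readers comparing the R² and adjusted R² values reported for the GLM fits, the standard adjustment penalises R² for the number of model terms relative to the number of observations. The sketch below uses that textbook definition; the sample size and number of terms of the study's models are not given in this section, so the numbers are hypothetical.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Standard adjusted R²: 1 - (1 - R²)(n - 1) / (n - k - 1)."""
    dof = n_observations - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than model terms")
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / dof


# Hypothetical model: 101 observations, 10 predictors, R² = 0.90.
adj = adjusted_r_squared(0.90, n_observations=101, n_predictors=10)
print(round(adj, 4))
```

The gap between the two values grows with the number of terms in the model, which is why a small gap is read as a sign that extra terms add little beyond the variance they explain.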
The air quality plots for the mixed background, industrial, urban and rural stations (Figure 3) indicate similar periods when air quality is the worst (June-July '15 and December '15-January '16), the transition months for the wet and dry [...]
Journal of Geoscience and Environment Protection
[...] November '15. The consistent multipollutant metric used (the API-11 index) allows comparison of air quality across the country, accurately providing a single "score" that can easily be associated with the pollutant levels. This is much more useful for public awareness and public advisories, which is likely why this composite algorithmic approach to air quality indices has been used in Taiwan, Canada and the EU. It has the advantage over single pollutant scores like the USEPA's Pollutant Standards Index (US Environmental Protection Agency, 2019) in that it is more representative of the air quality. The indices in this study (API-11 and API-6) are validated statistically and in the field, are better suited to informing the general public, and take into account additive and synergistic effects where several pollutants impact the air quality, as is almost always the case in actual circumstances.
A single metric that represents air quality by incorporating information from multiple pollutant variables is useful on several levels, which is why so many countries use it as a tool in the air pollution management arsenal. As outlined by the Indian Central Pollution Control Board (Central Pollution Control Board, 2014), the benefits of a single air quality indicator value stem from conveying complex information in a simplified and easily understandable way:
− Politicians and Law/Decision Makers: facilitate rapid action in hot spot areas, assess the effectiveness of pollution management strategies, identify the gaps in regulation to be addressed, use data driven decision making for zonation of land use, and prioritise investment of limited resources in pollution mitigation.
− Researchers and Scientists: provide data that scientists can analyse to give technical input on air pollution science and to inform environmental, climate change and public health issues.
− Public Knowledge: the public can use the information given to adjust behaviour and activities appropriate to their circumstances (illness, exercise, outdoor recreation etc.) and be aware of the state of their environment.
In this digital age of social media and multiple avenues for dissemination of information, the use of an API can have a far reaching and significant impact in communicating critical information to the people who need to use it and act on it. The effectiveness of this strategy explains why so many countries with active programmes to reduce air pollution on a national scale have adopted a publicised Air Quality/Pollution Index.
Combating the global effects of air pollution requires a multipronged strategy in which politicians, scientists and the public are cognisant of the cause and effect dynamics of actions like burning fossil fuels, widespread use of coal, unabated industrial emissions, and the absence of alternative energy policies, monitoring and sustainable environmental policies, and of the ill effects that result. Transfer of information is essential in creating the impetus to force an international mitigatory response to the global problem of air pollution and climate change.

Conclusion and Future Research
The assessment of air quality in Trinidad, and a metric to represent the information in a readily digestible manner, were among the objectives of this research. The indices developed in this study (API-11 and API-6) have been shown to be both representative and statistically valid, with several advantages over the single pollutant indicator currently used by the USEPA, which is directly linked to its own National Ambient Air Quality Standards (NAAQS). The multiparameter approach is also more intuitive, as air quality is affected by a slew of pollutants acting together, not one at a time. It is anticipated that this approach to calculating and reporting air quality will be incorporated into the arsenal of the local advisory and regulatory agencies for assessing air quality status and disseminating it to the general public.
In the immediate future, the API calculated from air pollution measured in real time can be made available to the public on social and digital media platforms, e.g. on the Trinidad and Tobago Meteorological Service website, on the EMA's website and on news programmes, accompanied by advisories for the protection of public health. In the short to medium term, changes in air quality should be the main feedback loop for assessing the effectiveness of environmental and air quality management and remediation strategies and policies, in accordance with the national sustainability goals (Ministry of Planning and Development, GoTT, 2017). It is anticipated that increased public awareness of air quality will build the impetus to revise the current legislation towards air quality standards that are in line with the best available research and the protection of public health.