Application of Multivariate Analysis for Identification of Pollution Sources in the Densu Delta Wetland in the Vicinity of a Landfill Site in Ghana

Surface water samples collected from various sites in the Densu delta wetland, Ghana, were analyzed for pH and other physicochemical parameters, including heavy metals. Statistical analyses such as cluster analysis (CA) and principal component analysis (PCA) were used to identify sources of heavy metal pollution in the wetland area. The site groupings obtained from CA and PCA were in good agreement, and linear correlation analysis suggested similar relationships. All the analyses pointed to a common source for the heavy metals. The hydrochemistry of the area appears to have been influenced, to a large extent, by dissolution/precipitation as well as by the numerous small-scale subsistence agricultural activities that take place in the wetland environment.


Introduction
For a very long time, the aquatic environment has been regarded as a 'free good' available to be exploited for social, cultural and economic gain [1]. This is especially so for wetlands, many of which have suffered degradation resulting from sewage disposal, industrial activities, and agriculture. This has led to biodiversity loss, pollution of surface and ground water, soil pollution and, in some cases, loss of habitats of some birds, mammals, reptiles, amphibians, fish and invertebrate species. Studies on the importance of wetlands to the socio-economic development of countries are gaining attention as more people become familiar with the economic benefits of wetlands.
Owing to the seasonal distribution of rainfall, the Densu delta wetland experiences seasonal flooding, which introduces a lot of detritus, nutrients and possibly pollutants resulting directly or indirectly from housing developments, subsistence agriculture including fishing activities, water pollution and dumping of household and other forms of waste. Accumulation of elements, even those present in trace amounts, could have a negative influence on the health of humans and other forms of life, especially those that depend extensively on resources in the wetland environment. Heavy metals such as cadmium and lead are used in many agricultural applications and, because of their harmful effects, stability and persistence, these environmental contaminants are increasingly becoming a source of concern in many coastal waters and wetland systems. Even trace amounts of such metals can accumulate in the food chain, eventually causing diseases such as cancer, leukemia and lymphoma. Therefore, identification and monitoring of pollutants in the water environment is of critical importance in protecting ecological and human health.
Even though some pollution studies have been undertaken in the Densu wetland [2], there still appears to be inadequate knowledge of the source(s) and fate of contaminants in this ecologically important wetland. Multivariate statistical analyses such as cluster analysis (CA) and principal component analysis (PCA) with factor analysis are becoming important tools in environmental studies dealing with measurements and monitoring [3]. Normally, CA is carried out to reveal specific links between sampling sites; it also helps in interpreting the data and indicating contaminant patterns. PCA is also an exploratory data analysis method that can allow for identification of major contamination sources [4]. Multivariate statistical methods such as PCA and CA have been used to predict potential non-site heavy metal sources in soil on the regional scale [5]. This paper attempts to use these methods to examine probable sources of pollution within the Densu wetland environment, an important ecosystem in Ghana, West Africa.

Materials and Methods
The Densu delta wetland is close to the confluence of the Densu River with the Atlantic Ocean in the Accra Metropolis of Ghana (Figure 1). It is located approximately 11 km to the southwest of Accra and is traversed by the Accra-Takoradi-Axim highway. Some communities of interest in this wetland are Aplaku, Tetegbu, Bortianor, Panbros, and Weija. The wetland is fed mainly by the Densu River, which has been dammed a few kilometres upstream (Weija dam) to supply water to some parts of Accra. In spite of its ecological importance, its proximity to the rapidly developing metropolitan Accra has resulted in massive encroachment on the wetland environment for farming, fishing, salt mining and residential accommodation. An abandoned quarry at Oblogo, which is now being used as a landfill site, is situated very close to the wetland. Together with the cumulative effect of improper urban planning, these developments have put enormous environmental pressure on this ecologically important wetland.

Sample Collection and Analysis
Surface water samples for chemical analysis were collected from five areas in the Densu wetland, namely Aplaku, Weija, Tetegbu, Panbros, and Bortianor. Two sampling sites were chosen in each of four of these areas: Aplaku (SW2a, SW2b), Tetegbu (SW3a, SW3b), Panbros (SW4a, SW4b) and Bortianor (SW5a, SW5b). Site SW1, in Weija, is located upstream of a landfill site. In addition, sites DSW5a and DSW5b in Bortianor and site DSW2a in Aplaku represent samples taken during the dry season. Sampling and analysis were done between October 2007 and March 2008, a period that covered both the wet and dry seasons. Standard methods were followed for the collection, preservation and analysis of samples [6]. Parameters such as pH, conductivity, TDS, salinity and temperature were determined both in situ and in the laboratory using portable equipment.
A microwave-assisted acid digestion method using nitric acid, hydrochloric acid and hydrogen peroxide was used in the metal analysis. 6 ml of HNO3 (65%), 3 ml of HCl (35%) and 0.25 ml of H2O2 (30%) were added to 5 ml of each water sample in a Teflon beaker, and the mixture was digested according to the MILESTONE Microwave Digestion Report code 309 programme. The digested samples were transferred into test tubes and analysed using a Varian AA240FS Atomic Absorption Spectrometer (AAS). The water samples were analysed for Ca, Mg, Cd, Co, Cr, Cu, Fe and Pb, with detection limits of 0.001, 0.003, 0.002, 0.005, 0.001, 0.003, 0.006 and 0.001, respectively. Standards and blanks were read after every five measurements to check the reproducibility of results.

Statistical Analysis
After determination of mean concentrations and standard deviations, the measured parameters were subjected to multivariate analysis, i.e., cluster analysis (CA) and principal component analysis (PCA), using the SPSS 16.0 statistical package. The correlation matrix was computed using Microsoft Excel 2007.

Cluster Analysis (CA)
CA is an exploratory data analysis tool for solving classification problems. Its objective is to sort cases (here, sampling sites) into groups or clusters, so that the degree of association is strong between members of the same cluster and weak between members of different clusters [3]. Cluster analysis may bring out associations and structure in data which, though not previously evident, are nevertheless sensible and useful once found.
Each cluster thus describes, in terms of the data collected, the class to which its members belong; and this description may be abstracted through use from the particular to the general class or type [7]. In this case, prior to CA, the descriptor variables (pollution indicators) were block-standardized by range (autoscaling) to avoid any effects of the scale of units on the distance measurements by applying the equation

$$z_{ji} = \frac{x_{ji} - \bar{x}_j}{s_j}$$

where $x_{ji}$ is the original value of the measured parameter, $z_{ji}$ is the standardized value, $\bar{x}_j$ is the average value of variable j, and $s_j$ is the standard deviation of variable j.
The similarities and dissimilarities were quantified through Euclidean distance measurements; the distance between two objects (sampling sites), i and j, is given as

$$d_{ij} = \sqrt{\sum_{k=1}^{m} \left(z_{ik} - z_{jk}\right)^2}$$

where $d_{ij}$ denotes the Euclidean distance, $z_{ik}$ and $z_{jk}$ are the values of variable k for objects i and j, respectively, and m is the number of variables [8]. Normalized Euclidean distances and Ward's method were used to obtain dendrograms [9]. CA is not a statistical technique; the results obtained are justified according to their value in interpreting data and indicating patterns [10]. In recent times many applications of CA have been reported (see [3,4,11]).
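The autoscaling and Euclidean distance steps described above can be sketched in a few lines of Python. This is a minimal illustration only: the site labels and measurements are hypothetical and are not taken from the study's tables, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which form was used.

```python
import math
import statistics

# Hypothetical data: rows are sampling sites, columns are measured variables.
# These values are invented for illustration and are NOT from the paper.
site_labels = ["site_1", "site_2", "site_3", "site_4"]
raw = [
    [7.1, 250.0, 40.0],
    [8.9, 5200.0, 610.0],
    [7.8, 1200.0, 300.0],
    [6.5, 540.0, 120.0],
]


def autoscale(rows):
    """Standardize each column: z = (x - column mean) / column std. dev."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # Assumption: sample standard deviation (n - 1 in the denominator).
    stdevs = [statistics.stdev(col) for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stdevs)]
        for row in rows
    ]


def euclidean(u, v):
    """Euclidean distance between two standardized site profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(rows):
    """Full symmetric matrix of pairwise distances between sites."""
    return [[euclidean(u, v) for v in rows] for u in rows]


z = autoscale(raw)
d = distance_matrix(z)

for label, row in zip(site_labels, d):
    print(label, [round(value, 2) for value in row])
```

A distance matrix of this kind is the input that a hierarchical clustering routine (for example, one implementing Ward's method) would use to build a dendrogram; that step is omitted here to keep the sketch dependency-free.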

Principal Component Analysis (PCA) and Factor Analysis
These two methods are aimed at finding and interpreting hidden, complex and causally determined relationships between features in a data set [7]. The key idea of PCA is to quantify the significance of the variables that explain the observed groupings and patterns in the inherent properties of the individual objects (in this study, sampling sites). On the basis of the dataset, new factors are calculated as linear combinations of the original parameters. Owing to this, the information about the objects gathered in the original multidimensional dataset can be represented in a reduced space and explained by a reduced set of calculated factors called principal components (PCs). The identified PCs (e.g. by the eigenvalue-one criterion) account for the maximum explainable variance of all original property parameters in descending order [12]. Factor analysis is a useful tool for extracting latent information, such as relationships between variables that are not directly observable [7]. The original data matrix is decomposed into the product of a matrix of factor scores and a matrix of factor loadings, plus a residual matrix. In general, by applying the eigenvalue-one criterion, the number of extracted factors is less than the number of measured features, so the dimensionality of the original data space can be decreased by means of factor analysis. After rotation of the factor loading matrix (e.g. by VARIMAX rotation with Kaiser normalization), the factors can often be interpreted as origins or common sources [8].
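In matrix notation, the decomposition and the eigenvalue-one criterion described above can be sketched as follows. The symbols used here (Z, F, L, E, R, v_k, λ_k, n, p, q) are notation assumed for illustration and do not appear in the original text; the sketch also assumes that the analysis is carried out on autoscaled variables, i.e. on the correlation matrix.

```latex
\begin{align}
  Z &= F L^{\mathsf{T}} + E, \\
  R &= \frac{1}{n-1}\, Z^{\mathsf{T}} Z, \qquad
  R\, v_k = \lambda_k v_k, \qquad
  \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0, \\
  \sum_{k=1}^{p} \lambda_k &= \operatorname{tr}(R) = p, \qquad
  \text{fraction of variance explained by component } k = \frac{\lambda_k}{p}.
\end{align}
```

Here Z is the n × p matrix of autoscaled data (n sampling sites, p variables), F is the n × q matrix of factor scores, L is the p × q matrix of factor loadings, and E is the residual matrix. Under these assumptions the eigenvalue-one criterion retains the components with λ_k > 1, that is, those that explain more variance than any single standardized variable.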

Physicochemical Data for Surface Water
The pH values ranged from 6.2 to 9.5, and most of the surface water samples in the wetland were slightly alkaline. For most natural waters, the accepted range of pH is 6.0-8.5 pH units [13]. Apart from Weija (SW1) and Panbros (SW4a) (Table 1), pH values from the other sites were within the accepted pH range, which provides adequate protection for aquatic life. A lot of salt mining takes place in Panbros, and that might be a contributory factor to the alkaline nature of water from this site.
Conductivity values ranged from 235 to 60000 µS/cm. Sites SW5a and SW5b (Bortianor) recorded the highest conductivity values, suggesting pollution of surface water in these areas. Many activities, including farming and fishing, take place in these wetland areas. In addition, there is a resort centre in Bortianor that is frequented by many tourists, especially on weekends, whose activities could affect the water quality. Other likely contributory factors are the influx of sea water into the river during high tides and possible sea spray, both of which could raise the conductivity values.
Na+ and K+ values exhibited significant spatial variability. For example, the Na+ concentration was 51.7 mg/l at Tetegbu (SW3b) but 6670 mg/l at Aplaku (SW2a) (Table 1). Naturally, Na+ may come from rock-water interactions, as it is one of the most abundant ions found in the earth's crust [13]. However, there is the possibility of an anthropogenic contribution, especially leachate from the landfill site. From the location map (Figure 1), Aplaku lies between Oblogo (where there is a landfill site) and Panbros (which is a salt mining area), so contributions from these sources may account for the high sodium content. Interestingly, Aplaku recorded a very high K+ value (1340 mg/l at SW2a) during the rainy season but a very low value (1.3 mg/l at DSW2a) during the dry season (Table 1). Although most natural waters have low concentrations of K+ because rocks containing potassium are resistant to weathering [13], potassium salts are widely used in the manufacture of fertilizers and enter fresh waters with agricultural run-off, which may account for the high K+ content in the rainy season. Calcium and magnesium concentrations were moderate (< 202.7 mg/l and < 671 mg/l, respectively). The nutrients nitrate (NO3-N), phosphate (PO4-P) and sulphate (SO4) recorded values of > 0.0009 mg/l, > 0.0009 mg/l and > 0.28 mg/l, respectively, among the sampling sites. Most tropical waters have low nutrient values, a feature considered common for natural and polluted waters. The concentrations of NO3-N (> 0.0009 mg/l) and PO4-P (> 0.0009 mg/l) suggest organic pollution and nutrient enrichment [14].

Cluster Analysis (CA)
With cluster analysis, the relationship between the various sites in the wetland and the water chemistry can be explained, and the origin of the ions, that is, whether they are anthropogenic or natural, can be evaluated. The dendrogram of the location pattern resulting from CA of the data measured from October 2007 to March 2008 is presented in Figure 2.
Using CA, the locations of different pollution sites may be clearly distinguished. The dendrogram shows that the sampling sites could be grouped into three main clusters. Cluster I consists of sites SW1 (Weija), SW3a (Tetegbu), SW3b (Tetegbu), SW4b (Panbros) and SW5a (Bortianor); cluster II of sites SW2a (Aplaku), SW2b (Aplaku), SW4a (Panbros) and SW5b (Bortianor); and cluster III of sites DSW2a (Aplaku), DSW5a (Bortianor) and DSW5b (Bortianor). It is seen from the dendrogram that cluster III is characterized by the largest Euclidean distance to the other clusters (high significance of clustering). This cluster could be categorized as highly polluted because sites DSW5a, DSW5b and DSW2a have the highest TDS values. These sites are characterized by a lot of human activities, which include shallow sea fishing and beach seining. Other human activities, including gravel and sand winning, take place here besides crop farming. Cluster II (sites SW2a, SW2b, SW4a and SW5b) was a typical grouping of saline-water-influenced sites (SW4a and SW5b) and pollutant-influenced sites (SW2a and SW2b). Cluster I (sites SW1, SW3a, SW3b, SW4b and SW5a) was characterized by sites close to a landfill (SW3a and SW3b) and sites influenced by sea water intrusion (SW5a and SW4b). Site SW1 is upstream of a landfill site.
Cluster analysis in R-mode (i.e., clustering of variables rather than sites) was also performed on all the measured parameters, and three clusters were obtained (Figure 3). Cluster I consists of BOD, COD, temperature and pH. Cluster II is characterized by turbidity, Pb, NO3, Cu, Na, Cl, K, EC, TDS, Sal, SO4, PO4 and TSS. Alkalinity, HCO3-, Cd, Co, Ca, Mg, Cr and Fe constitute cluster III.

Principal Component Analysis (PCA)
In a PCA, the number of components is equal to the number of variables (in this case water quality parameters), and each component is a linear combination of all the variables used in the study. In this study 25 variables (water quality parameters) were used, and so there were 25 components. Varimax rotation was used to maximize the sum of the variances of the squared factor loadings. According to the scree plot (Figure 4), up to 84% of the total variance of the dataset is captured by the first four new variables. Therefore, the information about pollution at the sampling sites is discussed based on a set of four calculated variables (principal components, or PCs) with eigenvalues > 1. Loadings of the four retained PCs are presented in Table 2.
Principal Component 1 (PC1) explains 27% of the variance and is contributed to by K, Ca, Mg, NO3, BOD, Cd, Cr, Co and Fe. As shown by the factor scores, these elements are concentrated at Bortianor (DSW5a and DSW5b) and Aplaku (DSW2a). PC2 explains 26% of the variance and includes temperature, EC, Sal, TDS, Na, Cl, SO4 and Pb; these are highly concentrated at Aplaku (SW2a and SW2b) and Bortianor (SW5b). PC3 explains 16% of the variance and includes temperature, pH, turbidity, SO4, PO4, BOD and COD, which are high at Bortianor (DSW5a). PC4 explains 15% of the variance and includes turbidity, TSS, alkalinity, HCO3, NO3 and Cu. These are found to be strongly concentrated at Panbros (SW4a) and Bortianor (SW5b).
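The variance figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the percentages and the variable count stated in the text; the conversion to eigenvalues rests on the assumption that the PCA was run on standardized variables (a correlation matrix), so that the total variance equals the number of variables. After varimax rotation these quantities are strictly rotated sums of squared loadings rather than eigenvalues, so the result is indicative only.

```python
# Percentage of total variance reported in the text for each retained PC.
explained_pct = {"PC1": 27, "PC2": 26, "PC3": 16, "PC4": 15}

# Number of water quality parameters (variables) stated in the text.
n_variables = 25

# Cumulative variance captured by the four retained components.
cumulative_pct = sum(explained_pct.values())

# Assumption: PCA on the correlation matrix, so total variance = n_variables
# and each component's variance is its share of that total.
implied_eigenvalues = {
    pc: pct * n_variables / 100 for pc, pct in explained_pct.items()
}

# Eigenvalue-one criterion: keep components whose implied eigenvalue exceeds 1.
retained = [pc for pc, value in implied_eigenvalues.items() if value > 1.0]

print("cumulative variance (%):", cumulative_pct)
for pc, value in implied_eigenvalues.items():
    print(pc, "implied eigenvalue:", value)
print("retained:", retained)
```

Under these assumptions the four percentages sum to the 84% quoted in the text, and each retained component corresponds to an implied eigenvalue well above 1.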

Correlation Matrix
The correlation matrix (Table 3) shows the distribution of positive and negative correlations among some selected variables. There is a strong positive correlation between Ca2+ and SO4^2-, between Ca2+ and NO3-, and between Ca2+ and Mg2+. This suggests a contribution from agricultural activities as sources of these ions in the chemical budget of the water [15]. Pollution from domestic sewage would imply an association of NO3-, Na+ and Cl- ions, since these ions are enriched in sewage, and this is evident in this work from the strong positive correlation between Na+ and Cl-. This relation may also result from sea-freshwater interaction. The exchangeable ion pairs Na-Mg and Na-Ca are weakly correlated, suggesting that ion-exchange reactions arising from rock-water interaction may not be significant. Also, the species pairs Mg-SO4, Ca-SO4, Cl-SO4 and Na-SO4 show strong and low correlations, respectively. Therefore, both dissolution/precipitation reactions and pollution from anthropogenic causes may be influencing the water chemistry of the wetland.
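For readers who wish to reproduce this kind of analysis, a linear (Pearson) correlation matrix can be computed as sketched below. The study itself used Microsoft Excel; this Python version is only an illustrative equivalent, and the concentrations are hypothetical values chosen to mimic a strongly correlated Na-Cl pair, not data from Table 3.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical concentrations in mg/l at five sites (NOT the paper's data).
samples = {
    "Na": [50.0, 120.0, 900.0, 3000.0, 6500.0],
    "Cl": [80.0, 200.0, 1600.0, 5200.0, 11000.0],
    "Ca": [40.0, 150.0, 60.0, 200.0, 90.0],
}

names = list(samples)
corr = {
    a: {b: pearson_r(samples[a], samples[b]) for b in names}
    for a in names
}

for a in names:
    print(a, [round(corr[a][b], 2) for b in names])
```

In such a matrix, a coefficient close to +1 for a pair such as Na-Cl is the kind of pattern the text interprets as a shared source, while values near zero indicate little linear association.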
There was little significant correlation between the physical parameters and the heavy metals, except for pH, which showed a negative correlation with Fe (Table 4).
EC also exhibited positive correlations with Cd, Co and Fe. Among the heavy metals, Cd was positively correlated with Cr, Co and Fe. Cr exhibited a positive correlation with Co and Fe, whilst Pb showed a positive correlation with Cu. This suggests a common source for most of the heavy metals, especially Cd, Cr, Co and Fe. The similarities and differences among the sampling sites were investigated using the Q-mode PCA. Figure 5 shows the factor scores of the sampling sites on the bidimensional plane defined by PC1 and PC2, and three distinct groups are revealed. Group 1 consists of Weija (SW1), Tetegbu (SW3a and SW3b) and Panbros (SW4a and SW4b), whereas group 2 comprises Aplaku (SW2a and SW2b) and Bortianor (SW5a and SW5b). Group 3 consists of Aplaku (DSW2a) and Bortianor (DSW5a and DSW5b), which were samples taken during the dry season.
With the exception of sites SW4a and SW5a, the PCA site groupings match those from CA. This agreement between the PCA and CA multivariate statistical techniques suggests that the grouping of the sampling points is robust [16]. Group 3 consists of samples from Bortianor (DSW5a and DSW5b) and Aplaku (DSW2a); these correspond to relatively high pollution from both natural and anthropogenic sources. Groups 1 and 2 are polluted mainly by anthropogenic influences.
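The agreement between the two groupings can be checked directly from the site lists given in the text. The short Python sketch below encodes the CA clusters and the PCA groups as reported; the pairing of cluster I with group 1, II with 2 and III with 3 is an assumption made here on the basis of their largest overlap.

```python
# Site groupings as listed in the text (CA: Figure 2, PCA: Figure 5).
ca_clusters = {
    "I": {"SW1", "SW3a", "SW3b", "SW4b", "SW5a"},
    "II": {"SW2a", "SW2b", "SW4a", "SW5b"},
    "III": {"DSW2a", "DSW5a", "DSW5b"},
}
pca_groups = {
    "1": {"SW1", "SW3a", "SW3b", "SW4a", "SW4b"},
    "2": {"SW2a", "SW2b", "SW5a", "SW5b"},
    "3": {"DSW2a", "DSW5a", "DSW5b"},
}

# Assumption: each CA cluster is matched to the PCA group it overlaps most.
pairing = {"I": "1", "II": "2", "III": "3"}


def label_of(site, groups):
    """Return the label of the group that contains the given site."""
    for label, members in groups.items():
        if site in members:
            return label
    raise KeyError(site)


all_sites = sorted(set().union(*ca_clusters.values()))

# Sites whose CA cluster and PCA group do not correspond under the pairing.
mismatched = sorted(
    site
    for site in all_sites
    if pairing[label_of(site, ca_clusters)] != label_of(site, pca_groups)
)
agreement = 1 - len(mismatched) / len(all_sites)

print("sites compared:", len(all_sites))
print("mismatched sites:", mismatched)
print("fraction in agreement:", round(agreement, 2))
```

With the groupings as listed, only SW4a and SW5a are assigned differently by the two methods, so 10 of the 12 sites agree.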

Pollution Source Identification with PCA
As can be seen from Table 2, the main contributions to PC1 probably include both natural and anthropogenic elements. PC1, which explains ~27% of the total variance, had high positive loadings on Ca, Mg, NO3-, Cd, Cr, Co and Fe. Most of these ions, especially Ca, Mg and Fe, could have resulted from lithological processes taking place here, which include weathering of rocks such as sandstone and shale. This is also evidence of interaction of the wetland with the sea. However, Cr, Cd, NO3 and Co do not have a significant lithological origin and are thus unlikely to be sourced from the underlying geology, since the bedrock in this area is mostly of the Togo series, which consists mostly of sandstones and shales. Thus, run-off from agricultural activities may be responsible for the high concentrations of nitrates in water in the area [15]. Also, since these sampling sites are in the vicinity of an unengineered landfill, the influence of landfill leachate should not be underestimated.
PC2, PC3 and PC4 are contributed to predominantly by anthropogenic pollution. PC2 explains ~26% of the total variance and had high positive loadings on EC, TDS, Sal, Na, SO4 and Cl. A high concentration of these ions is suggestive of seawater intrusion from the ocean into these areas. From Table 2, temperature contributes negatively to this factor. This indicates that Na and Cl are contributing to the salinity of the water and that there is seawater intrusion. This interaction, however, is not associated with high temperature, which suggests that seawater intrusion is high during the rainy season. PC3 describes ~16% of the total variance and is characterized by high loadings on SO4 and PO4, together with high negative loadings on temperature, pH, turbidity, BOD and COD; this pattern is associated with fertilizer surface run-off. The loadings show that during rainy seasons, precipitation likely causes dissolution of these minerals. PC4 explains ~15% of the total variance and is contributed to positively by alkalinity and HCO3 and negatively by turbidity, NO3, TSS and Cu.

Conclusions
Multivariate statistical techniques applied to surface water samples from the Densu wetland, Ghana, suggest interesting inter-relationships between the various ions determined and their possible sources. Cluster analysis helped to classify the study area into 3 distinct groups that possibly explain observed determinants of the water chemistry of the area. The principal component analysis agreed well with the cluster analysis, which suggests a fairly reasonable grouping of the sampling sites. The results of the factor analysis performed on the data also appear to explain fairly well the factors that may have accounted for the chemistry of the water in the study area. The correlation matrix, together with the other multivariate analyses, seems to point towards a common source for heavy metals. Based on the results of the study, it is suggested that, apart from dissolution and precipitation, anthropogenic activities (mainly small-scale agriculture), together with seawater intrusion, likely have a strong influence on the water chemistry of the ecologically important Densu delta wetland.

Figure 2. Dendrogram of sampling sites using complete linkage method (after Ward's method).

Figure 3. Dendrogram of water parameters using complete linkage method.

Figure 5. Scores of surface water quality sites for the sampling period (October, 2007-March, 2008) on the bidimensional plane defined by the first two varifactors.