Olive Mill Waste Water Management Study by Using Principal Component Analysis

Olive mill waste water (OMWW) is a by-product of olive trituration. In Sfax, its management differs between urban and farming areas. In this paper, we examine it through a statistical analysis covering the 2005-2006 season, using Principal Component Analysis (PCA) and Hierarchical Classification (HC). Applied to variables drawn from an exhaustive questionnaire covering 274 mills, the analysis found four significant Principal Components (PCs), explaining 67% of the total variance. The coordinates of the 13 active variables retained by PCA were used to create a typology of OMWW management, yielding 7 groups of individuals with shared characteristics and explaining 70% of the total between-group variance. This study showed that OMWW management in farming areas could cause environmental problems because oleifactors there do not have controlled tanks and could discharge OMWW onto soil (causing oil deposits, waterproofing and possible asphyxia) or into the public sewage network (causing corrosion and flow reduction). Transferring mills from urban to farming areas in the form of agro-industrial complexes is therefore needed in the Sfax region.


Introduction
The olive oil industry is one of the driving sectors of the agricultural economy of the Mediterranean basin. Every year, about 11 million tons of olives and about 1.7 million tons of olive oil are produced, corresponding to 95% of the world's production [1]. This huge amount is generated between November and February [2].
Olive oil is produced in olive mills either by the discontinuous press method or by the continuous centrifugation method. In recent decades, the continuous centrifugation method has spread. This method has many advantages over the discontinuous press method, such as complete automation and better oil quality [3]. However, it also has drawbacks, such as a higher production of wastewater, i.e. olive mill wastewater (OMWW). In the Mediterranean region, OMWW is produced at a rate above 10 million m³/year [4,5]. Furthermore, with chemical oxygen demand (COD) values in the range 40-220 g·L⁻¹ and biochemical oxygen demand (BOD) values in the range 23-100 g·L⁻¹, which is 25-80 times higher than the pollution level of common municipal wastewater [6], OMWW represents a major environmental problem. Thus, olive oil producing countries face a serious challenge in finding an environmentally and economically viable solution for the handling and disposal of OMWW [3].
To avoid these environmental impacts, olive mills were forced to treat or eliminate this waste. Hence, a wide range of systems has been studied for the disposal [7,8] or use of OMWW, such as aerobic [9,10] and anaerobic [11,12] treatments, composting [2,13,14] and direct watering on fields [15-17]. However, these methods present several drawbacks that make their implementation very difficult and very expensive [5,18,19].
In the present study, statistical methods such as Principal Component Analysis (PCA) and Hierarchical Classification were used to characterize OMWW management in Sfax. Variables were collected using a questionnaire covering 274 mills distributed over the study zone. PCA, a fundamental technique and one of the most popular multivariate statistical monitoring methods [20], is carried out on these data to create a typology of OMWW management and to understand the differences between the procedures used to evacuate OMWW in urban and farming areas. Multivariate analysis provides a methodology to extract and structure information from large amounts of data. The multivariate treatment of environmental data is useful for evidencing temporal and spatial variations caused by natural and anthropogenic factors linked to seasonality [21,22]. It has been used to monitor industrial processes for several decades [23] and has also been applied to wastewater treatment operation [24]. These statistical techniques and exploratory data analysis are appropriate tools for a meaningful reduction and interpretation [25] of the data collected from the questionnaire.

Study Area
Sfax has been chosen for its outstanding contribution to the olive production in the country, its triturating capacity and its OMWW production.
Sfax is located in the South of Tunisia, at 34°43' north latitude and 10°41' east longitude. It is bordered by Mahdia prefecture to the North, by Kairouan, Sidi Bouzid and Gafsa prefectures to the West, by Gabes to the South and by the Mediterranean to the East. The Sfax region is made up of 16 administrative units called delegations (Figure 1).
Sfax belongs to the pre-Saharan part of Tunisia and is therefore characterized by an arid to semi-arid Mediterranean climate. These factors explain the important contribution of this sector to the economy of the country.
Olive groves in Sfax cover 312,000 hectares, representing 44% of the total agricultural area and 19% of the national olive-growing area, and count 6.13 million trees. The olive variety grown in Sfax is 100% Chemlali. Plantations are 83% in full stands, with a density of 20 trees/ha. Consequently, Sfax makes a very important contribution to the economy of the country, especially in the olive production sector.
In the period 2005-2006, there were 405 oil mills. However, only 305 were functional, providing a triturating capacity of 12,000 tons per day, which represented 37% of the national capacity, and generating about 9090 tons per day of OMWW.
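The figures quoted above imply a few simple ratios, worked out in the short Python sketch below. The variable names are illustrative only, and the derived national capacity is an inference from the quoted 37% share, not a value reported in the study.

```python
# Figures quoted in the text for the 2005-2006 season.
total_mills = 405
functional_mills = 305
capacity_t_per_day = 12_000      # olives triturated per day
omww_t_per_day = 9_090           # OMWW generated per day
share_of_national_capacity = 0.37

# Derived ratios (simple arithmetic, not values reported by the authors).
functional_share = functional_mills / total_mills
omww_per_ton_olives = omww_t_per_day / capacity_t_per_day
implied_national_capacity = capacity_t_per_day / share_of_national_capacity

print(f"functional mills: {100 * functional_share:.1f}%")
print(f"OMWW per ton of olives: {omww_per_ton_olives:.4f} t")
print(f"implied national capacity: {implied_national_capacity:.0f} t/day")
```

On these figures, roughly three quarters of the mills were functional, and each ton of olives triturated corresponds to about 0.76 tons of OMWW.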

Collecting Data
Data used in this work were collected through an exhaustive survey of the 305 functional mills in Sfax during the 2005-2006 season [26]. The survey comprises four chapters. The first identified the mill and the oleifactor. The second described the origin of the olive supply. The third dealt with OMWW evacuation, and the last concerned the project of transferring mills to the farming environment.
The statistical study was applied to only 274 individuals. The others were excluded either because they refused the questionnaire or because the mills were newly created and their responses could distort the study.

Multivariate analysis of the variables relative to mills of Sfax was performed through PCA [20,27]. PCA can be described as a method to project a high-dimensional measurement space onto a space with significantly fewer dimensions. The data set was processed by multivariate statistical techniques in order to investigate OMWW management during the 2005-2006 season. The experimental matrix (138 × 274) was analysed by PCA [25].
The PCA method transforms a larger number of non-orthogonal variables into a smaller number of orthogonal variables, which capture the common causes of their variation. It can therefore reduce the dimensionality of a problem by replacing the measured, inter-correlated variables with a smaller number of uncorrelated variables. This can be useful in reducing the amount of basic data to be processed [28,29].
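As an illustration only, the reduction described above can be sketched in a few lines of Python. The sketch assumes a PCA on the correlation matrix (standardized variables), solved by power iteration with deflation. The function names and the toy data are hypothetical and are not taken from the questionnaire, and the software actually used in the study is not stated.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit (population) standard deviation.
    Assumes no column is constant."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]

def dominant_eigenpair(a, iterations=2000):
    """Largest eigenvalue and its unit eigenvector, by power iteration."""
    p = len(a)
    v = [1.0 / (k + 1) for k in range(p)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(a[i][j] * v[j] for j in range(p)) for i in range(p))
    return lam, v

def pca(rows, n_components):
    """Return eigenvalues, axes, individual scores and % of variance per axis."""
    z = standardize(rows)
    a = correlation_matrix(z)
    p = len(a)
    eigenvalues, axes = [], []
    for _ in range(n_components):
        lam, v = dominant_eigenpair(a)
        eigenvalues.append(lam)
        axes.append(v)
        # Deflation: remove the component just found before the next search.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    scores = [[sum(r[j] * v[j] for j in range(p)) for v in axes] for r in z]
    explained = [100.0 * lam / p for lam in eigenvalues]
    return eigenvalues, axes, scores, explained

# Hypothetical toy data: capacity, OMWW volume, distance to a tank (made-up values).
toy_mills = [
    [20, 15, 5], [35, 30, 12], [50, 41, 3],
    [65, 52, 9], [80, 70, 14], [95, 78, 6],
]
eigenvalues, axes, scores, explained = pca(toy_mills, 2)
print([round(e, 2) for e in explained])
```

In this toy example, the first two columns are strongly inter-correlated, so most of the variance collapses onto a single axis, which is the behaviour the paragraph above describes.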
Statistical methods were later applied to complete and refine the analysis. They included, in particular, linear correlations and hierarchical classification.
The correlation coefficient of a variable determines its position in the selected factorial plane. It is given by R(x, y) = cov(x, y) / (σ(x) σ(y)).
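The formula above can be written out directly. The following is a minimal Python sketch using population covariance and standard deviations; the function name and the sample values are hypothetical and serve only to show the two limiting cases, R = 1 and R = -1.

```python
import math

def pearson_r(x, y):
    """R(x, y) = cov(x, y) / (sigma(x) * sigma(y)), population version."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sigma_x * sigma_y)

# Made-up values: the second series is an exact linear function of the first.
capacity = [20.0, 35.0, 50.0, 65.0, 80.0]
linear_in_capacity = [2 * c + 1 for c in capacity]
r_same = pearson_r(capacity, linear_in_capacity)
r_opposite = pearson_r(capacity, [-c for c in capacity])
print(r_same, r_opposite)
```

A variable whose coefficient with an axis is close to 1 or -1 lies near the edge of the correlation circle along that axis, while a coefficient near 0 places it close to the origin.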
Sound interpretation of the outputs of the statistical analysis depends on the right choice of principal axes. This choice is based on a statistical test: the average percentage of inertia per eigenvalue is 100/13 = 7.69%, which implies that we can retain the first four axes, each of which has an inertia greater than this percentage (7.69%) [30].
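The selection rule can be checked against the percentages reported in this paper. The Python sketch below is illustrative: the function name is hypothetical, and only the four reported percentages are used, since the inertia of the remaining nine axes is not given in the text.

```python
def retained_axes(inertia_percent, n_active_variables):
    """Keep the axes whose share of inertia exceeds the average share 100/p."""
    threshold = 100.0 / n_active_variables
    kept = [(k + 1, pct) for k, pct in enumerate(inertia_percent) if pct > threshold]
    return threshold, kept

# Percentages of variance reported in the text for PC1 to PC4.
reported = [25.84, 20.15, 12.72, 8.23]
threshold, kept = retained_axes(reported, 13)
cumulative = sum(pct for _, pct in kept)

print(f"threshold: {threshold:.2f}%")
print(f"axes kept: {[axis for axis, _ in kept]}")
print(f"cumulative variance: {cumulative:.2f}%")
```

All four reported axes exceed the 7.69% threshold, and their cumulative share is 66.94%, consistent with the figure of 67% quoted for the four PCs.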
The objective of the statistical analysis is to create a typology based on variables relative to the triturating system, OMWW evacuation and localization.
This typology was built on 13 active variables (Table 1) for the 2005-2006 season. Hierarchical Classification was also used to create classes of individuals with correlated variables. It was applied to the coordinates of the 274 individuals on the first four axes retained by the PCA. Hierarchical Classification is a group of multivariate techniques whose primary purpose is to assemble objects based on the characteristics they possess [31].
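A classification of this kind can be sketched as follows, under stated assumptions. The text does not name the aggregation criterion, so the Python sketch assumes Ward's criterion (merge the two classes whose union increases within-class inertia the least), a common choice after PCA. The function names and the toy coordinates are hypothetical.

```python
def centroid(points):
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_cost(group_a, group_b):
    """Increase in within-class inertia if the two groups are merged."""
    na, nb = len(group_a), len(group_b)
    return na * nb / (na + nb) * sq_dist(centroid(group_a), centroid(group_b))

def hierarchical_classes(points, n_classes):
    """Agglomerate individuals until n_classes remain (Ward criterion assumed)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_classes:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                cost = ward_cost([points[i] for i in clusters[a]],
                                 [points[i] for i in clusters[b]])
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

def between_class_share(points, clusters):
    """Share of total inertia explained by the partition (between / total)."""
    g = centroid(points)
    total = sum(sq_dist(p, g) for p in points)
    between = sum(len(c) * sq_dist(centroid([points[i] for i in c]), g)
                  for c in clusters)
    return between / total

# Hypothetical coordinates of seven individuals on two retained axes.
coords = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9),
          (9.8, 0.1), (10.0, 0.0)]
classes = hierarchical_classes(coords, 3)
share = between_class_share(coords, classes)
print(classes, round(share, 3))
```

The ratio returned by `between_class_share` is the kind of quantity reported when a partition is said to explain a given percentage of the total inertia.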

Results and Discussion
In the period 2005-2006, Sfax had 405 mills, of which only 305 were functional; their characteristics are gathered in Table 2.
The geographic distribution of mills showed their concentration in the urban environment. These mills triturated 3130 tons per day, representing 34.43% (Figure 2).
Mills generated OMWW with different coefficients according to the triturating system [32]. In Sfax, the largest amount of OMWW was generated by continuous mills (Table 3) and was evacuated mainly to Agareb's tank (70.43%) (Table 4 and Figure 3).

Principal Component Analysis (PCA)
PCA identified four factors responsible for the data structure, explaining 67% of the total variance of the data set. These factors made it possible to group the selected parameters according to common features and to evaluate the incidence of each group on the overall variation in OMWW management.

On PC1, the urban environment and Agareb's tank were negatively correlated with the rural environment, mills located in Mahres, Mahres' tank, Graïba's tank and owned tanks. The variable "project of transfer" is positively correlated with the urban environment. This is explained by the problems encountered by oleifactors, such as those related to traffic and to the high cost of transporting olives and OMWW. For them, the transfer project would be the best solution.

PC2 explains 20.15% of the variance and was mainly defined by mills localisation and triturating system. The continuous centrifugation method was negatively correlated with the discontinuous press method and positively correlated with the peri-urban environment, triturating capacity and use of OMWW as a fertilizer. In fact, the mills concentrated in this zone were essentially continuous, newly created and of high triturating capacity, and generated a large amount of OMWW. This effluent is rich in organic matter, nitrogen (N), phosphorus (P), potassium (K) and magnesium (Mg) [33]. Its use for soil fertilization could therefore be doubly beneficial, mainly in countries with severe water deficiencies and soils poor in organic matter and nutrients.

PC3 explains 12.72% of the variance and defined the OMWW evacuation in Mahres. PC4 explains 8.23% of the variance and defined the OMWW evacuation in Graïba.

To refine the above-mentioned groupings, plane projections are of great interest. The following paragraphs present the factorial distribution of the variables in the 1 × 2 and 1 × 3 planes.

Projection on the Plane 1 × 2
In this factorial representation, the variables fall into two groups, illustrated in Figure 4. The first group was representative of the urban environment and the classic triturating system. The second one was negatively related to axis 1. It was formed by the peri-urban environment, the continuous triturating system, high triturating capacity and the use of OMWW as a fertilizer.

Projection on the Plane 1 × 3
In this factorial representation, the variables fall into three groups, illustrated in Figure 5. The first group was represented by mills located in Mahres, Mahres' tank and the use of OMWW as a fertiliser. Some Mahres oleifactors practise OMWW spreading illegally. The second one was positively related to axis 3. It was formed by the rural environment, owned tanks inside the Sfax region and Graïba's tank. The third group was represented by the rural environment, Agareb's tank and the project of transfer.

Class Identification and Characterization
The hierarchical classification led to the selection of seven classes (Figure 6), accounting for 70% of the total inertia. Table 6 shows the different classes and their sizes.
Class 1 grouped 70 mills located in the urban environment, with a triturating capacity of 2224 tons/day. 63% were classic and 19% were continuous. OMWW was evacuated to Agareb's tank and oleifactors encouraged its use as a fertiliser. 52% of them have a project of transfer.
Class 2 grouped 36 classic mills located in the urban environment, with a triturating capacity of 941 tons/day. OMWW was evacuated to Agareb's tank, but oleifactors did not encourage its use as a fertiliser. 34% of them have a project of transfer.
Class 3 grouped 75 mills located in the peri-urban environment, with a triturating capacity of 4170 tons/day. 64% were continuous. OMWW was evacuated to Agareb's tank and oleifactors encouraged its use as a fertiliser. Only 27% of them have a project of transfer because their mills are generally newly created.
Class 4 grouped 15 mills located in Mahres, with a triturating capacity of 616 tons/day. 50% were classic. OMWW was evacuated to Mahres' tank and oleifactors encouraged its use as a fertiliser. However, oleifactors of this group practised OMWW spreading illegally.
Class 5 grouped 5 mills located in Graïba, with a triturating capacity of 184 tons/day. 75% were classic. OMWW was evacuated to Graïba's tank and oleifactors encouraged its use as a fertiliser.
Class 6 grouped 49 mills located in the rural environment, with a triturating capacity of 1109 tons/day. 45% were classic and 11% were continuous. 20% of the OMWW generated by the mills of this group was evacuated to Agareb's tank and 80% was evacuated to owned tanks, to tanks belonging to other governorates near some mills, or given to transporters. Oleifactors of this class encouraged the use of OMWW as a fertiliser.
Class 7 grouped 24 mills located in the rural environment, with a triturating capacity of 1101 tons/day. 25% were classic, 25% were super-pressure and 25% were continuous. OMWW was evacuated to owned tanks, to tanks belonging to other governorates near some mills, or given to transporters. Oleifactors of this class encouraged the use of OMWW as a fertiliser.
Thus, the statistical study showed that groups 6 and 7, whose mills were located in rural areas, can be a source of pollution; decision makers must therefore give priority to these areas. The other groups contributed to environmental pollution to a lesser degree because they use tanks controlled and managed by the State, although they still deserve special attention.
In urban and peri-urban areas, 40% of oleifactors supported the project of transferring olive mills from urban to farming areas, provided that all infrastructure is in place. For this reason, we propose in another work [34] to select suitable sites for creating agro-industrial complexes.
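The class sizes and capacities given in the class descriptions of this section can be tabulated to check that they account for all 274 individuals and to quantify the weight of the two rural classes. The Python sketch below only transcribes figures from the text. The tuple layout and the location labels are an illustrative simplification, and the derived percentages are plain arithmetic, not results reported by the authors.

```python
# (class id, location label, number of mills, triturating capacity in tons/day),
# transcribed from the class descriptions; labels are simplified for illustration.
classes = [
    (1, "urban", 70, 2224),
    (2, "urban", 36, 941),
    (3, "peri-urban", 75, 4170),
    (4, "Mahres", 15, 616),
    (5, "Graïba", 5, 184),
    (6, "rural", 49, 1109),
    (7, "rural", 24, 1101),
]

total_mills = sum(c[2] for c in classes)
total_capacity = sum(c[3] for c in classes)

rural = [c for c in classes if c[1] == "rural"]
rural_mills = sum(c[2] for c in rural)
rural_capacity = sum(c[3] for c in rural)

rural_mill_share = 100.0 * rural_mills / total_mills
rural_capacity_share = 100.0 * rural_capacity / total_capacity

print(f"mills classified: {total_mills}")
print(f"rural classes 6 and 7: {rural_mills} mills ({rural_mill_share:.1f}%), "
      f"{rural_capacity} t/day ({rural_capacity_share:.1f}% of capacity)")
```

The seven class sizes sum to 274, matching the number of individuals analysed. Classes 6 and 7 together hold 73 mills, a little over a quarter of the sample.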

Conclusions
This study investigated data relative to OMWW management in 274 mills. PCA and HC were applied and yielded four principal components, describing approximately 67% of the total variance, and seven distinct groups of individuals, describing approximately 70% of the total variance. The results show that OMWW management differed between urban and peri-urban areas on the one hand and rural areas on the other.
PCA reduced the 13 variables to four PCs, and HC reduced the 274 individuals to seven classes. PC1 (25.84% of variance) was defined by mills localisation and OMWW evacuation. This correlation can be attributed to the particular behaviour of the decision maker, who proposes distinct solutions for OMWW management. PC2 (20.15% of variance) can be assigned to mills localisation and triturating system. PC3 (12.72% of variance) can be assigned to OMWW evacuation in Mahres. PC4 (8.23% of variance) can be assigned to OMWW evacuation in Graïba.
Thus, multivariate statistical analysis served as an excellent exploratory tool for analysing and interpreting a complex data set on OMWW management, especially given that different disposal methods have been proposed, based on evaporation ponds [35,36], thermal concentration [37,38], physicochemical [39] and biological treatments [40], as well as direct application to agricultural soils as organic fertilizers [17,41-44]. However, in Sfax, which is characterised by its large olive production, the very high cost of implementing these techniques leads oleifactors to abandon them and to evacuate OMWW to tanks instead.

Figure 2. Geographic distribution of mills in urban and peri-urban areas of Sfax region during 2005-2006 season.

Figure 4. Factorial distribution of the variables on the plane 1 × 2.

Figure 5. Factorial distribution of the variables on the plane 1 × 3.

Figure 6. Projection of the 7 classes on the two main components.

Table 5 summarises the PCA results and gives the amount of variance spanned by each principal component (PC). PC1 explained 25.84% of the variance, defined mills localisation and OMWW evacuation, and was mainly driven by the urban environment and Agareb's tank.