Determining the Best Optimum Time for Predicting Sugarcane Yield Using Hyper-Temporal Satellite Imagery

Hyper-temporal satellite imagery provides timely, up-to-date and relatively accurate information for the management of crops. Nonetheless, models which use high-frequency time series satellite data for sugarcane yield estimation remain scant. This study determined the optimum time for predicting sugarcane yield using the normalized difference vegetation index (NDVI) derived from SPOT-VEGETATION images. The study used actual yield data obtained from the mill and related it to NDVI integrated over several two-month periods spread along the sugarcane growing cycle. Findings agreed with previous studies, which indicated that the best acquisition period of satellite images for assessing sugarcane yield is about 2 months preceding the beginning of harvest. Overall, of the five years tested to determine the relationship between actual yield and integrated NDVI, three showed a significant positive relationship, with a highest r² value of 85%. The study, however, warrants further investigation to improve and develop accurate operational sugarcane yield estimation models at the local level, given that the other years had weak results. Such hybrid models may combine different vegetation indices with agro-meteorological models which take into account the crop's broader physiological and growth demands and soil management, which are equally important when predicting yield.


Introduction
The 21st century demands fast-track modernization and diversification of the sugar sector to convert it into an efficient cane industry that produces sufficient stocks for manufacturing sugar, for energy [1], and for other by-products [2,3]. Remote sensing in the form of hyper-temporal satellite imagery is one of the tools that can provide timely, up-to-date and relatively accurate information for the management of the sugarcane crop. Several studies have applied remote sensing techniques in sugarcane monitoring. Particular focus has been on the crop's classification [4], areal extent mapping [5], thermal age group identification [6], varietal discrimination [7], and crop health and nutritional status monitoring [8,15]. Remote sensing has seen limited application to sugarcane yield prediction, yet it has been used successfully on grain crops such as maize and wheat [9][10][11][12]. The paucity of published experimental results has been linked to the difficulty of collating data and the length of the sugarcane growing period. Given the crop's relevance in today's world economy, and the scant models developed thus far, this study uses SPOT VEGETATION multi-temporal images to estimate the optimum time for sugarcane yield prediction.

Sugarcane Growth
Sugarcane is a perennial crop of the Saccharum genus that is grown in the warm tropics and subtropics [13]. In the subtropics the crop is often grown in irrigated areas because of its high water requirements. Its growth is characterized by four development stages, namely sprouting, tillering, stalk growth and maturation [8]. Maturation is triggered by a decrease in soil water content, temperature and nitrogen availability. This stage is characterised by the end of stalk growth and by declining leaf water content and turgor [8,14].

Estimation of Sugarcane Yield
Sugarcane yield estimation is critical for a number of stakeholders, including planters, growers, industrial producers and policy makers. Industrial producers and growers might be interested in avoiding costs through the optimization of harvest campaigns. Policy makers might want to quantify outputs for national statistics and also address food security issues [15].
One of the ways in which remote sensing can be exploited in sugarcane production is yield prediction, expressed in tonnes of sugarcane stalks per hectare [15]. This is enabled by incorporating a sugarcane growth model based on the Normalized Difference Vegetation Index (NDVI), one of the vegetation indices derived from the visible and near-infrared channels [13]. It is a measurement of the "greenness" of a given area; thus, in the long term, NDVI provides an indication of the trend in intensity of any agricultural activity [16].
The NDVI is calculated as shown in Equation (1), where IR and R stand for the near-infrared and visible (red) bands, respectively [13].

$$\mathrm{NDVI} = \frac{IR - R}{IR + R} \qquad (1)$$
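As a minimal sketch, Equation (1) and the digital-number scaling that the study later applies to the SPOT VEGETATION product (DN = (NDVI + 0.1)/0.004) could be written as below. The function names and the sample reflectances are hypothetical and serve only to illustrate the arithmetic; they are not taken from the study's processing chain.

```python
def ndvi(nir, red):
    """Equation (1): NDVI = (IR - R) / (IR + R) for one pixel."""
    total = nir + red
    if total == 0:
        return None  # index is undefined when both bands are zero
    return (nir - red) / total


def ndvi_to_dn(value):
    """Scale NDVI to a digital number: DN = (NDVI + 0.1) / 0.004."""
    return round((value + 0.1) / 0.004)


def dn_to_ndvi(dn):
    """Invert the scaling: NDVI = DN * 0.004 - 0.1."""
    return dn * 0.004 - 0.1


# Hypothetical reflectances for a dense green canopy (illustrative only).
sample = ndvi(nir=0.45, red=0.05)
print(sample, ndvi_to_dn(sample))
```

Storing the index as an integer DN keeps the images compact, and the inverse function recovers NDVI to within the 0.004 step of the scaling.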
The literature indicates that the utility of remote sensing data for forecasting sugarcane yield is limited by: (a) the proprietary nature of past work [17]; (b) the length of the harvest season and the lack of a direct link between sugarcane yield and crop radiometry [8]; and (c) the uncertainty of growth models [18][19][20]. Earlier work applied the conversion of above-ground dry biomass into crop yield for a number of crops; this study, however, estimated fixed-scale sugarcane yield based on NDVI.
This study attempted to answer the following questions. What is the optimum time for predicting sugarcane yield using NDVI derived from hyper-temporal satellite imagery? Is there any significant difference between sugarcane yield predictions over several trimesters throughout the year? Is there any significant relationship between actual sugarcane yield and the Normalized Difference Vegetation Index?

Study Area
This study was conducted at Mkwasine Estate, located in the south-east lowveld of Zimbabwe. This region is characterized by low and erratic summer precipitation of less than 450 mm per annum and high temperatures of around 35 °C. Sugarcane is thus grown under irrigation, as the conditions are conducive to the growth of the crop within a 12-month cycle. The growing cycle starts between September and October, with harvesting occurring between June and August. Out of the 45,000 hectares of sugarcane plantations, 9000 ha are managed by 840 small to medium scale out-growers under the Commercial Sugarcane Farmers Association (CSFA) and the Zimbabwe Sugarcane Farmers Association (ZSFA) [17]. Figure 1 shows the geographic location of the study area, which has a history of contested land resources following the fast track resettlement program of the year 2000.

Sampling, Data Collection and Analysis
Figure 2 provides an overview of the methods applied and described in the next sections. A ground truthing exercise was carried out to collect sugarcane production data in the field using stratified random sampling based on whether the farmers produced sugarcane or other crops. A sample of twenty sugarcane plots with an average area of 20 hectares was selected randomly from the forty identified plots. Coordinates for each plot were taken. All the plots consisted of ratoon crops, which are normally harvested at about 12-month intervals for 4 years or more before the crop is renewed. Actual yield levels were obtained from the Mkwasine milling plant, where the farmers sell their sugarcane output.

Hyper-Temporal Satellite Imagery
Hyper-temporal satellite imagery in the form of 10-day (dekadal) composite NDVI images (S10 products) at 1 km × 1 km resolution, from April 1998 to December 2009, was extracted from SPOT-4 VEGETATION for the study. The SPOT VEGETATION system has a spatial resolution of 1.15 km at nadir and a swath width of 2250 kilometers, and can cover almost all the globe's landmasses while orbiting 14 times a day [21]. It includes bands 2 (red; 0.61 - 0.68 µm) and 3 (near IR; 0.78 - 0.89 µm), which are the main wavelength bands for deriving the NDVI. The NDVI indicates chlorophyll activity and was calculated from (band 3 - band 2)/(band 3 + band 2); the index was then converted to a digital number (DN value) in the 0 - 255 data range using the formula DN = (NDVI + 0.1)/0.004. This made the data convenient for analysis [22]. The NDVI composite consists of 393 images (April 1998 to December 2009) taken over a period of 11 years, which were obtained from [23]. The downloaded NDVI images were geo-referenced and declouded. Declouded means that, using the quality record supplied for each image and pixel, only pixels with a "good" radiometric quality for bands 2 (red; 0.61 - 0.68 µm) and 3 (near IR; 0.78 - 0.89 µm), and with a general quality of "clear" rather than "shadow", "cloud" or "uncertain", were kept (removed pixels were labelled as "missing") [24]. In order to extract the hyper-temporal mean, maximum and minimum NDVI, the study applied zonal statistics to all 20 plots.

Statistical Analysis
The study examined regressions between annual sugarcane production and NDVI integrated over all combinations of continuous time intervals of two to three months (based on the starting date, which is around September and October, the duration, and the burning season, which is the harvest period, often around June to August). Integrating the value of NDVI over a period of time implies amalgamating the condition or "greenness" of the crop over its growth cycle and assuming that this accumulated condition will be correlated with the overall production or yield of the crop. The study tested whether there is any significant difference between the R² values of the various time intervals using analysis of variance, in order to obtain the optimum time for predicting sugarcane yield, as illustrated in Figure 3. This was repeated over 4 harvesting seasons. The latter results were illustrated using box plots of NDVI for the 5 different years. While the study was able to obtain NDVI data for the period 1998 to 2009, the actual yield data obtained from the mill was not consistent for the whole 10-year period; hence the analysis was undertaken for 5 years for each of the tests.
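The zonal statistics and the NDVI integration described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual procedure: each plot is represented as a list of pixel NDVI values per dekad, missing (declouded) pixels are marked as None, and integration is taken as a plain sum over consecutive dekads. The function names and sample values are hypothetical.

```python
def zonal_stats(pixel_values):
    """Mean, maximum and minimum NDVI of one plot, skipping missing pixels."""
    valid = [v for v in pixel_values if v is not None]
    if not valid:
        return None  # every pixel was removed by the cloud screening
    return {"mean": sum(valid) / len(valid), "max": max(valid), "min": min(valid)}


def integrate_ndvi(dekad_means, start, length):
    """Sum the plot-mean NDVI over `length` consecutive dekads."""
    window = dekad_means[start:start + length]
    if len(window) < length or any(v is None for v in window):
        return None  # incomplete window, leave it out of the regression
    return sum(window)


# Hypothetical plot: three dekads, each with a few 1 km pixels (None = missing).
plot_pixels = [[0.52, 0.55, None], [0.61, 0.63, 0.60], [0.70, None, 0.72]]
means = [zonal_stats(p)["mean"] for p in plot_pixels]
print(integrate_ndvi(means, start=0, length=3))
```

A two-month window corresponds to six dekads, so sliding `start` along the series with `length=6` would produce one integrated value per candidate period.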

What Is the Best Optimum Time for Predicting Sugar Cane Yield?
To estimate the time of the year when NDVI is most closely related to sugarcane yield, the correlation coefficient was plotted for each bi-monthly NDVI period. The bi-monthly periods used were April to May, June to July, August to September, October to November, December to January and February to March. This follows from the notion that bi-monthly periods can give a good differentiation of the shooting, growing, and burning or harvesting stages, across which the NDVI also varies. In this regression analysis, the bi-monthly period with the highest correlation coefficient marks the time in the sugarcane growth cycle at which the relationship between NDVI and sugarcane yield is strongest. As such, it indicates the best time for predicting sugarcane yield, as shown in Figure 3.
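The selection of the bi-monthly period with the highest correlation coefficient can be sketched as below. The data are invented for illustration, and the helper names `pearson_r` and `best_period` are hypothetical; the study does not state which software was used for this step.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear signal
    return sxy / math.sqrt(sxx * syy)


def best_period(ndvi_by_period, yields):
    """Return (period label, r) for the period with the highest correlation."""
    scores = {label: pearson_r(values, yields)
              for label, values in ndvi_by_period.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]


# Hypothetical integrated NDVI for four plots and their mill yields (t/ha).
yields = [82.0, 95.0, 101.0, 110.0]
ndvi_by_period = {
    "Oct-Nov": [2.1, 1.9, 2.4, 2.0],
    "Dec-Jan": [3.0, 3.4, 3.5, 3.9],
    "Feb-Mar": [3.2, 3.3, 3.7, 3.8],
}
print(best_period(ndvi_by_period, yields))
```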
The optimum time for estimating sugarcane yields is the period December to March, when the correlation coefficient was between 0.58 and 0.61. The box plots in Figure 4, however, show some outliers in the different months. A possible explanation is that NDVI sometimes peaks during the early stages of the crop's development cycle. Results are thus in line with previous studies, which showed that the best time for predicting yield using NDVI is the pre-harvest period [5,25,26]. Normally sugarcane fields have completely closed canopies at least two months before the harvest period. It is known that the main driving factor of the variations in NDVI is the amount of vegetation, expressed by the Leaf Area Index (LAI). This is in line with the finding of [8], which showed a highest correlation of 0.98 between LAI and NDVI. Presumably this is the same period in which the study would expect a good relationship with actual yield. During the optimum period, the sugarcane crop would have reached its vegetative growth peak, characterized by a high intensity of greenness; thus we anticipate a better correlation with yield. For the other bi-monthly periods, the correlation coefficient was low, possibly because of the different crop development stages, such as shoot development, vegetative growth and harvesting, making these periods not optimum for estimating sugarcane yield. Likewise, during the harvest period the sugarcane crop loses its greenness as leaves are burnt; hence the NDVI decreases, a trend observed over the different years in the analysis.
Arguably, the results show weak correlation coefficients for the optimum prediction trimester over the different years tested. The ANOVA results indicated an F critical value of 2.4, which is less than the computed F statistic of 2.6, showing a significant difference between the means of the different trimesters. This implies that there is no better trimester which can be used to predict yield given the cropping calendar. The low correlation coefficients are explained in the next set of results.
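The ANOVA comparison above rests on a one-way F statistic, which is judged significant when it exceeds the critical value for the relevant degrees of freedom. A minimal sketch of that computation follows, assuming one group of values per trimester. The function name and the sample groups are hypothetical, and the critical value must be looked up separately for the degrees of freedom returned.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n_total - k
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical correlation values for three periods over four years.
groups = [[0.20, 0.25, 0.18, 0.30], [0.58, 0.61, 0.55, 0.60], [0.35, 0.40, 0.28, 0.33]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f_stat, df_b, df_w)
```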

Relations between Sugarcane Yield and Integrated NDVI: Is There Any Significant Relationship between Integrated NDVI and Annual Yield?
The relationship between the integrated NDVI of the pre-harvest season determined in this study and actual sugarcane yield from the mill was estimated using linear regression, with results shown in Table 1. Overall, of the five years tested, three showed a significant positive relationship between NDVI and actual yield, with a highest r-squared (r²) value of 85%. This analysis shows that areas with high NDVI coincide with years of high or better yields. The findings of this study concur with a study by [27], which found a positive relationship between NDVI derived from NOAA AVHRR and sugarcane yield at a regional scale. Notwithstanding, in some instances the standard error of the yield estimate shows a relatively low precision, while some areas showed a weak relationship between NDVI and actual yield. Several possible explanations can account for this. While SPOT VEGETATION has the strength of high temporal resolution, its spatial resolution is coarse (1 km × 1 km), implying that the measured reflectance will be affected by factors other than the condition of the sugarcane. The reliability of the imagery as a means of estimating the yield might also be compromised by the time interval between the date the image was taken and the harvest date.
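The linear regression of actual yield on integrated NDVI, together with the r² and the standard error of the yield estimate discussed above, can be sketched with ordinary least squares as follows. The function name and the sample values are hypothetical and do not reproduce Table 1.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least three points for a standard error")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares give r-squared.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    r2 = 1.0 - sse / sst if sst > 0 else 0.0
    std_err = math.sqrt(sse / (n - 2))  # standard error of the estimate
    return {"slope": slope, "intercept": intercept, "r2": r2, "std_err": std_err}


# Hypothetical integrated NDVI and mill yields (t/ha) for five plots.
ndvi_sum = [3.0, 3.4, 3.5, 3.9, 4.1]
yield_t_ha = [82.0, 95.0, 101.0, 110.0, 113.0]
print(fit_line(ndvi_sum, yield_t_ha))
```

The standard error is expressed in the same units as yield, which makes it a direct measure of how precise a prediction from the fitted line would be.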
It can be argued that the relationship between integrated NDVI and yield depends on a variety of environmental conditions [28] and production-related factors. For example, the canopy condition of sugarcane is mainly determined by moisture and chlorophyll status as measured by NDVI; however, it is not the only factor determining yield. Other external factors, such as droughts and nutrient deficiency, might influence the yield, which might be a reason for the weak relationship shown in Table 1 and Figures 3 and 4. Other factors which might need to be taken into account include the harvesting process and transportation to the mill. Often a substantial amount of cane is lost before it is sold to the mills, and such cane is unaccounted for, as the actual yield used in this study was based on records from the milling company. The fact that our results were based on field measurements of output at the mill negates this notion, since the results indicated a satisfactory relationship. This shows the potential of high-frequency time series data to predict yields, though with room for improvement.

Conclusions
Hyper-temporal satellite imagery (SPOT VEGETATION) can play a significant role in sugarcane management. The main finding of this study is that the two months preceding harvest are the optimum period for predicting yield using NDVI. Despite the weak correlation coefficient for the optimum prediction trimester over the different years tested, there is no better trimester which can be used to predict yield given the cropping calendar. Had this study analysed data for the anticipated 10-year period, better-informed results might have been obtained.
Based on the 20 samples for each of the five years tested to determine the relationship between actual yield and integrated NDVI, three years showed a significant positive relationship, showing the potential of high-frequency time series data in yield prediction.
In a nutshell, the study warrants further investigation to improve and develop accurate operational sugarcane yield estimation models at the local level, given the evidence of weak results for other years. Such hybrid models may combine different vegetation indices with agro-meteorological models which take into account the crop's broader physiological and growth demands and soil management, which are equally important when predicting yield. Factors which may affect the sugarcane after yield prediction also need to be taken into account, for example diseases, poor harvesting processes, and cane transportation losses. It might also be advantageous to consider a model which integrates high spatial resolution imagery with high temporal resolution.

Figure 1. Location of the study area.

Figure 2. Simplified schema of the methodological approach.

Figure 3. Regression coefficient of the relationship between integrated NDVI and actual yield for 20 samples, plotted against different months of the sugarcane plant growth cycle.

Figure 4. Regression coefficient of the relationship between integrated NDVI and actual yield, plotted against different months of the sugarcane plant growth cycle, repeated over 5 years.