Creation of New Global Land Cover Map with Map Integration

We present here a new approach to the development of a global land cover map. We combined three existing global land cover maps (MOD12, GLC2000, and UMD) based on the principle that the majority view prevails and validated the resulting map by using information collected as part of the Degree Confluence Project (DCP). We used field survey information gathered by DCP volunteers from 4211 worldwide locations to validate the new land cover map, as well as the three existing land cover maps that were combined to create it. Agreement between the DCP-derived information and the land cover maps was 61.3% for our new land cover map, 60.4% for MOD12, 58.9% for GLC2000, and 55.2% for UMD. Although some of the improvements we achieved were not statistically significant, this project has shown that an improved land cover map can be developed and well validated globally using our method.


Introduction
Many organizations have developed and distributed global land cover maps. The differences among the various maps hinder their effective use for modeling phenomena such as the carbon cycle and the water cycle, as well as for ecosystem modeling. For example, terrestrial ecosystem models rely on land cover maps to estimate total net primary production and to model its spatial distribution; consequently, the accuracy of existing land cover maps needs to be quantitatively evaluated [1][2][3][4]. Land cover maps are also used to model changes in global land cover. The model outputs of these studies are also hindered by the differences in the available maps used as input [5]. There is a crucial need for systematic validation of land cover maps and improvement of their accuracy.
Studies comparing several land cover maps have found that the total global areas for particular land cover classes are similar, but vary significantly by region [6,7].
These results clearly demonstrate that there has been insufficient progress in validating existing land cover maps. A new validation method for land cover maps was recently proposed by Iwao et al. [8]. They used information compiled by volunteers contributing to the Degree Confluence Project (DCP), a project that aims to collect land cover information at each of the terrestrial intersections of integer degrees of latitude and longitude throughout the world (DCP points hereafter). For each confluence, the DCP provides photographs taken in four directions, together with text describing the point and its surroundings. Registered users may record any number of visits to each confluence, which also makes it possible to estimate land cover change. Based on the text and photographs, Iwao et al. categorized each confluence into six classes (forest, grassland, cropland, wetland, settlements, and other land) and developed validation data. By using information derived from 749 DCP points, Iwao et al. [8] validated existing land cover maps of Eurasia. Their results suggest that further improvement of the accuracy of land cover maps is needed and that their validation method should be applied to global land cover maps. A similar approach, which integrates volunteer input to develop validation data for global land cover maps, is being pursued under the GEO-Wiki project [9].
Several methods have been proposed to improve the accuracy of existing land cover maps. One example is the integration of land-classification methods [10]. The fuzzy agreement technique is another method that has been applied, for example, in the development of the SYNMAP land cover map [4]. Based on an existing ecophysiological model, Jung et al. [4] defined a new legend and related its classes to combinations of the legend classes of the original maps by assigning affinity scores between them based on fuzzy agreement. They merged map data from MODIS Land Cover (MOD12), Global Land Cover 2000 (GLC2000), and Global Land Cover Characteristics (GLCC) to produce SYNMAP, and described the synergies of the different map products they used. However, Jung et al. [4] concluded that there was insufficient reference data available to allow them to thoroughly validate SYNMAP and show that it was more accurate than its predecessors.
In this study, we present a new approach for the development of a global land cover map by combining three existing land cover maps and adopting the land classification favored by the majority of the contributing maps. We then validated the new land cover map, and the three maps that contributed to it, by using newly developed information from 4211 DCP-derived points.

Methodology
In our new approach we compared the land cover classes at corresponding pixels on the three existing land cover maps and adopted the classification favored by the majority of those maps. That is, where two or all three maps agreed on the class at a particular sample point, we used that class. For sample points with three different classifications, we adopted the classification of the existing land cover map with the highest level of accuracy. For our study, we used the three most accurate land cover maps as determined by the validation results of Iwao et al. [8]: MOD12 (Boston University, Land cover and land cover dynamics products user guide, 2003; available at http://geography.bu.edu/landcover/userguidelc/index.html), GLC2000 (Joint Research Centre, Global land cover 2000; available at http://www-gvm.jrc.it/glc2000/), and the University of Maryland's 1-km Global Land Cover product (UMD) [11].
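The majority rule described above can be sketched for a single pixel as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `majority_class`, the ordering of the labels, and the `fallback_index` parameter (which stands for "the existing map with the highest level of accuracy") are introduced here for illustration only.

```python
from collections import Counter


def majority_class(labels, fallback_index=0):
    """Return the class favored by at least two of the three maps.

    labels: classes assigned to one pixel by the three maps, already
            expressed in a common classification scheme.
    fallback_index: position of the map assumed to be most accurate;
            its class is used when all three maps disagree.
    """
    if len(labels) != 3:
        raise ValueError("exactly three labels are expected")
    label, votes = Counter(labels).most_common(1)[0]
    if votes >= 2:
        return label
    return labels[fallback_index]


# Two maps agree: the majority class is adopted.
print(majority_class(("forest", "forest", "cropland")))
# All three differ: the class of the map at fallback_index is adopted.
print(majority_class(("grassland", "forest", "cropland")))
```

With three inputs, a class with two or three votes is always unique, so the only ambiguous case is complete disagreement, which the fallback resolves.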
The simplified IGBP class scheme (14 classes) was previously used to compile MOD12 and UMD (hereafter, MOD12_sigbp and UMD_sigbp, respectively), whereas the LCCS class scheme (22 classes) was used for GLC2000 (GLC2000_lccs hereafter). To properly reach a majority decision, the land cover classification schemes used for the contributing maps must be the same. We therefore adopted the six classes (forest, croplands, grassland, wetlands, settlements, and other land) of the LULUCF (Land Use, Land Use Change and Forestry) classification scheme established by the Intergovernmental Panel on Climate Change (IPCC). This scheme is available at http://www.ipcc-nggip.iges.or.jp/public/gpglulucf/gpglulucf_contents.htm [12]. The relationships we used between the LULUCF scheme and the classification schemes of the three contributing maps were those proposed by Sato and Tateishi [13].
We refer hereafter to the three contributing land cover maps after conversion to the LULUCF scheme as MOD12_6c, GLC2000_6c, and UMD_6c. Because MOD12 had the highest accuracy of the three contributing maps (see Results and Discussion), the MOD12_sigbp class was replaced only when the classes of GLC2000_6c and UMD_6c agreed with each other and differed from the MOD12_6c class. For GLC2000 and UMD, we assumed that the GLC2000_lccs and UMD_sigbp classes were equally accurate at a particular sample point if both pixels corresponded to the same one of the six LULUCF classes. If this was the case, we used UMD_sigbp as the replacement because it used the classification system we needed for our new land cover map. Compared to SYNMAP, which employed a new land cover classification scheme, this map is more user-friendly for existing global land cover map users.
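The replacement rule in this paragraph can be sketched per pixel as shown below. This is a hedged illustration only: the function name `integrate_pixel`, the class-name strings, and both lookup tables are hypothetical placeholders. The actual correspondences between the simplified IGBP, LCCS, and LULUCF schemes are those of Sato and Tateishi [13] and are not reproduced here.

```python
# Hypothetical, partial lookup tables from native classes to the six
# LULUCF classes. Placeholders only; the real tables follow [13].
SIGBP_TO_6C = {
    "evergreen_broadleaf_forest": "forest",
    "grasslands": "grassland",
    "croplands": "cropland",
    "urban_and_built_up": "settlements",
}
LCCS_TO_6C = {
    "tree_cover_broadleaved_evergreen": "forest",
    "herbaceous_cover": "grassland",
    "cultivated_and_managed_areas": "cropland",
    "artificial_surfaces": "settlements",
}


def integrate_pixel(mod12_sigbp, glc2000_lccs, umd_sigbp):
    """Return the simplified IGBP class of the integrated map for one pixel."""
    mod12_6c = SIGBP_TO_6C[mod12_sigbp]
    glc2000_6c = LCCS_TO_6C[glc2000_lccs]
    umd_6c = SIGBP_TO_6C[umd_sigbp]
    # Replace MOD12 only when GLC2000 and UMD agree with each other in the
    # six-class scheme and MOD12 is the only map that differs.
    if glc2000_6c == umd_6c and mod12_6c != umd_6c:
        # UMD supplies the replacement because it is already in the
        # simplified IGBP scheme used for the output map.
        return umd_sigbp
    # Otherwise (MOD12 in the majority, or all three differ) keep MOD12.
    return mod12_sigbp


# MOD12 is outvoted by GLC2000 and UMD, so the UMD class is used.
print(integrate_pixel("croplands", "herbaceous_cover", "grasslands"))
# All three maps differ, so the MOD12 class is kept.
print(integrate_pixel("croplands", "herbaceous_cover", "evergreen_broadleaf_forest"))
```

Comparing in the six-class scheme but writing out the original simplified IGBP label is what lets the output keep an existing legend rather than a new one.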
Using the rules described above, we produced a new land cover map based on the simplified IGBP class scheme (Figure 1). As the new map reflects the simplified IGBP class scheme, past users of MOD12_sigbp and UMD_sigbp can use the new land cover map without the need to convert classification schemes. The agreement between the new map and MOD12_sigbp was 97%.
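An agreement figure such as the 97% above is a label-by-label comparison of two classifications. The sketch below shows that calculation together with Cohen's kappa, a standard chance-corrected agreement statistic. It is assumed here, not confirmed by the text, that the kappa coefficients reported in the results are of this kind. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def agreement_rate(labels_a, labels_b):
    """Fraction of positions at which two label sequences agree."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected) / (1 - expected)."""
    observed = agreement_rate(labels_a, labels_b)
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Agreement expected by chance from the two marginal distributions.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return math.nan  # undefined when both inputs hold one identical class
    return (observed - expected) / (1.0 - expected)


reference = ["forest", "forest", "grassland", "grassland"]
mapped = ["forest", "forest", "grassland", "forest"]
print(agreement_rate(reference, mapped))  # 0.75
print(cohen_kappa(reference, mapped))     # 0.5
```

The same two functions apply whether the second sequence comes from another map or from field-derived reference labels.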

Results and Discussion
We developed a global validation dataset based on the method proposed by Iwao et al. [8]. As of December 2008, 5568 DCP points had been visited at least once and photographed by DCP volunteers. Of those DCP points, 4211 reflect the characteristic land cover over the surrounding square kilometer. We categorized the land cover of each of the 4211 DCP points as forest land (1166), grassland (1250), cropland (721), wetlands (378), settlements (40), or other land (656) (Figure 2).
Comparison of the DCP-derived validation information with the new land cover map and the existing land cover maps produced overall agreement rates of 61.3% for our new land cover map, 60.4% for MOD12, 58.9% for GLC2000, and 55.2% for UMD. The corresponding kappa coefficients were 0.5 for our new land cover map, 0.47 for MOD12, 0.47 for GLC2000, and 0.41 for UMD (Table 1). These rates of agreement are similar to those obtained by Iwao et al. [8] in their validation of Eurasian land cover maps.
We used the 4211 DCP points to determine the rates of agreement of each land cover map with DCP data for six major climatic zones (tropical, arid, temperate, cold, polar, and other) according to the Köppen-Geiger climate classification map [14] (Table 2). In the arid climatic zone, the agreement rate for MOD12 was higher than that of our new land cover map, although the rate for the new map was still higher than those of GLC2000 and UMD. Moreover, there is little DCP-derived validation data for the polar zone. Because this zone is vulnerable to the effects of global warming, much more DCP-derived validation data is required for the polar zone.
Although the overall agreement with DCP data was higher for our new land cover map than for the three existing maps, the accuracy evaluation for each of the six LULUCF land cover classes (Table 3) shows that UMD had the highest agreement for grasslands, GLC2000 for croplands, and MOD12 for settlements. However, the agreement rate of GLC2000 with the 721 cropland DCP validation points is only 46%. This suggests that the areas of grassland shown by UMD, and of cropland shown by GLC2000, are excessive. There are comparatively few incorrect classifications of forest. There are many places in existing land cover maps where grassland that has been validated by DCP data has been misclassified as forest. These findings suggest that further work is required to improve the classification methodology for grassland as well as to clarify the definition of forest. We also compared MOD12_6c, GLC2000_6c, and UMD_6c with one another: the agreement was 87% between MOD12_6c and GLC2000_6c, 86% between GLC2000_6c and UMD_6c, and 90% between MOD12_6c and UMD_6c. According to the report of Giri, the agreement between the original MOD12 and GLC2000 maps is 59%, which suggests that a larger number of classes increases classification uncertainty. Because MOD12_6c and UMD_6c are more similar to each other than MOD12_6c and GLC2000_6c are, the integration of classes warrants further investigation.
The integration and construction of SYNMAP included data from the GLCC Data Base Version 2.0 (U.S. Geological Survey, Global land cover, 1999; available at http://edcsns17.cr.usgs.gov/glcc/globdoc2_2.html) as well as MOD12 and GLC2000. As a further test of our new integration method, we also merged the data from MOD12, GLC2000, and GLCC, and compared both the output of this merged data set and GLCC data with DCP validation points (Table 4). For GLCC, we used the simplified IGBP class scheme (GLCC_sigbp) as a replacement for UMD_sigbp of our previous integration. The overall agreement rate for GLCC with the 4211 DCP validation points was 53.2% (Table 4), which is lower than those of the three land cover maps we had already validated.
Among the 4211 DCP points, there were 492 for which the GLC2000_6c and UMD_6c classes agreed, but disagreed with MOD12_6c. Among these 492 DCP points, 218 agreed with UMD_6c classes, whereas DCP data agreed with MOD12_6c class values at 178 points. Thus, the new land cover map integrated from MOD12, GLC2000, and UMD agreed with the DCP data at 40 more points than MOD12_6c did. There were a total of 523 DCP points for which GLC2000_6c and GLCC_6c classes agreed, but disagreed with MOD12_6c classes. Of these 523 DCP points, 188 agreed with the GLCC_6c classes, whereas 197 agreed with MOD12_6c classes. Thus, the land cover map derived from MOD12, GLC2000, and GLCC agreed with the DCP data at nine fewer points than MOD12_6c did. As a result, the overall agreement rate of the map combining MOD12, GLC2000, and GLCC with the 4211 DCP validation points was 60.2%, which is slightly lower than that of MOD12. These results suggest that the accuracy of the map produced by our new method depends strongly on the accuracy of the input land cover maps, and that the method does not always provide improvement. DCP-derived validation information is indispensable for the assessment of land cover maps.
Our results show statistically significant differences between our new land cover map and both GLC2000 and UMD, as well as an improvement in the kappa coefficient, but no statistically significant difference between our new land cover map and MOD12.
Several new land cover maps are available that could usefully be integrated to produce another DCP-validated land cover map, such as Global Land Cover by National Mapping Organizations (GLCNMO) produced by the International Steering Committee for Global Mapping (ISCGM) (available at http://www.iscgm.org/cgi-bin/fswiki/wiki.cgi) and GlobCover Land Cover produced by the European Space Agency (available at http://ionia1.esrin.esa.int/index.asp).

Conclusions
We developed a new map integration method based on the principle of favoring the majority view to produce a new global land cover map by combining data from three existing land cover maps. The method we have proposed in this paper enables the combination of existing global land cover maps based on different classification schemes and provides a user-friendly map which utilizes an existing land cover classification scheme. We validated the resultant map, and the individual maps merged to produce it, by comparing them to 4211 terrestrial DCP-derived validation points worldwide. The validation data we have developed is one of the best available land cover validation datasets based on field observations in terms of its numbers and its distribution. The validation showed agreement rates of 61.3% for the new land cover map, 60.4% for MOD12, 58.9% for GLC2000, 55.2% for UMD, and 53.2% for GLCC, which is the same tendency as in the previous work on Eurasia using 749 DCP-derived validation points. Our analysis shows statistically significant differences between the new land cover map and both GLC2000 and UMD. The agreements were improved in most of the classes as well as in the major climate zones. Some existing maps might overestimate specific classes, such as cropland in GLC2000, which might appear as high agreement for those classes. Also, our findings suggest that further work is required to improve the classification methodology for grassland as well as to clarify the definition of forest. Moreover, there is little DCP-derived validation data for the polar zone. Because this zone is vulnerable to the effects of global warming, much more DCP-derived validation data is required. DCP-derived validation data will be available in 2011 at the GEO Grid (Global Earth Observation Grid). A map integration system based on the principle of favoring the majority view will also be available as a service at the website.

Figure 1. The new global land cover map developed by merging land cover maps MOD12, GLC2000, and UMD. Spatial resolution: 30 arc seconds; Map projection: Plate Carrée (Geographic); Classification scheme: Simplified IGBP.

Figure 2. Distribution of the 4211 DCP-derived validation points used in this project. Green, forest; yellow, croplands; orange, grasslands; blue, wetlands; red, settlements; and gray, other land.

The validation data we developed is one of the best available global validation datasets for global land-cover maps. For example, in the case of SYNMAP, the author mentioned the insufficient reference data available, compared to the validation information published by Boston University for MOD12 (IGBP land cover validation confi-