An Empirical Investigation of Common Sense of Land Use from a Statistical Approach

Ontological study has recently become one of the key concerns of geographic information science, and a number of studies have been conducted using both philosophical and knowledge-engineering approaches. Some studies have pointed out the importance of human cognition and social context for the development of ontologies. This paper presents an empirical investigation of the common sense of land use categories, with the aim of developing suitable ontologies for each cultural or speech community. Distinctions and characteristics in the perception of land use categories were elicited by a psychological method, a questionnaire administered to Japanese graduate and undergraduate students. The results were then analyzed using correspondence analysis, a statistical technique for categorical data. This analysis serves to clarify the dominant determining factors for land use categories.


Introduction
Semantic issues have always been a key concern in geographic information science (GIScience) because semantic interoperability plays a crucial role in the sharing and integration of geographic information [1]. Although the Open Geospatial Consortium (OGC) and ISO/TC211 provide certain standards supporting the deployment of geospatial web services, these standards address the interoperability issue only at the syntactic level. They are therefore limited in terms of semantics and do not provide a consistent model for the semantic integration of geospatial services [1,2]. An ontology, understood as an explicit specification of a conceptualization, has been identified as contributing to the establishment of semantic interoperability. Until recently, research related to ontologies in GIScience has been broadly divided into the philosophical approach and the knowledge-engineering approach. The philosophical approach has addressed top-level ontologies for the geographic domain; the knowledge-engineering approach has addressed ontologies as application-specific and purpose-driven engineering artifacts [1].
Within the philosophical approach, a number of theories about geospatial ontologies have been discussed (e.g., formalization of ontology [3]). While the conventional approaches to the study of ontologies were based on the objectivist point of view, which presupposes a real world independent of human cognition and social context, the subjectivist point of view has recently received attention in ontology research on geographic information [4,5]. Mapping human cognition or categories to ontologies is the most rational basis for data integration or sharing [6,7]. It is therefore necessary to develop suitable ontologies for each cultural or speech community. Common sense critically reflects the background against which the people of each community think, speak, and perceive in everyday life. Mark and Turk investigated the common sense of landscape categories in the language of the Yindjibarndi people [8]. The series of studies about ontologies presented by these authors constitutes one of the few research efforts to investigate common sense from the perspective of human cognition and cross-linguistic comparison for the development of ontologies [8,9]. In contrast, Mark et al. and Smith and Mark employed a questionnaire method to obtain empirical evidence of the influence of human cognition on geographical categories [10,11].
To develop valid geospatial ontologies, common sense must be investigated in each language domain, because sharing spatial data and attributes requires language stability across cultures and geographies, assumptions that are seldom true [12,13]. In addition, previous studies of geospatial ontologies have targeted natural objects such as landscape [8]; artificial objects have been insufficiently studied. The objective of this study is to empirically investigate the common sense of artificial land use categories, such as public facilities in an urban area, in the Japanese community. The questionnaire method was used to investigate this common sense. The results of the questionnaire were then analyzed, using a statistical technique, to clarify the distinctions and characteristics of this common sense. The questionnaire (Table 1) was administered in a Japanese-language setting.

Natural and Artificial Land Use Categories
One of the best-known land use classification systems is the Land Cover Classification System (LCCS), developed by the Food and Agriculture Organization of the United Nations [15]. The LCCS provides a detailed classification based on natural land use categories, such as natural vegetation or agricultural land. Although classifications of artificial land use are an important factor in describing human activity, they are likely to be more complex and arbitrary than natural land use categories. In this study, "Public facility", "Commercial facility", "Residence", and "Others" were used as artificial land use categories, because these categories often appear in developed urban areas. The common sense underlying these categories is likely to be complex and arbitrary because the categories are deeply related to culture and history. However, because GIS applications or services in urban areas would produce many benefits, it is valuable to investigate the common sense of these artificial land use categories for the development of their ontologies.

Correspondence Analysis
To investigate the common sense about the land use categories used in the questionnaire, correspondence analysis was applied. Correspondence analysis is a multivariate statistical technique for use with categorical data rather than quantitative data. It has become increasingly popular in ecological, marketing, and psychological research [16]. The basic idea of correspondence analysis is to reduce the dimensionality of a data matrix and visualize it in a subspace of low dimensionality, commonly two- or three-dimensional [17]. Both columns and rows can be visualized on the same plot [18]. The main components underlying correspondence analysis are mass, profile, and chi-square distance. Assume that the cross-tabulated data under examination are described formally by an $I \times J$ matrix $F = (f_{ij})$. The correspondence matrix $P = (p_{ij})$ is the matrix in which all elements of $F$ are divided by the grand total $n$:

$$p_{ij} = \frac{f_{ij}}{n}, \qquad n = \sum_{i=1}^{I} \sum_{j=1}^{J} f_{ij}.$$

Next, the row and column sums of the correspondence matrix are defined as

$$r_i = \sum_{j=1}^{J} p_{ij}, \qquad c_j = \sum_{i=1}^{I} p_{ij}$$

(called mass in correspondence analysis). The respective row and column profiles of $P$ are defined as

$$a_{ij} = \frac{p_{ij}}{r_i} \quad \text{(row profile)}, \qquad b_{ij} = \frac{p_{ij}}{c_j} \quad \text{(column profile)}.$$

Here, the chi-square distance between row $i$ and row $k$ is denoted as

$$d(i, k) = \sqrt{\sum_{j=1}^{J} \frac{(a_{ij} - a_{kj})^2}{c_j}}. \qquad (1)$$

The chi-square distance for column elements can be calculated using column profiles in a way analogous to that used to calculate the chi-square distance for row elements.
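The quantities defined above (correspondence matrix, masses, row profiles, and the chi-square distance between two rows) can be illustrated with a short sketch in plain Python. The 3 x 3 table below is hypothetical and is not taken from the questionnaire results; the function names are likewise chosen here for illustration only.

```python
from math import sqrt

# Hypothetical contingency table F (NOT the questionnaire data):
# rows stand for facility classes, columns for land use categories.
F = [
    [20, 5, 0],
    [4, 18, 3],
    [1, 2, 22],
]

def correspondence_matrix(table):
    """Divide every cell by the grand total n to obtain P."""
    n = sum(sum(row) for row in table)
    return [[cell / n for cell in row] for row in table]

def masses(P):
    """Row masses r_i and column masses c_j (row and column sums of P)."""
    r = [sum(row) for row in P]
    c = [sum(col) for col in zip(*P)]
    return r, c

def row_profiles(P, r):
    """Each row of P divided by its mass; every profile sums to 1."""
    return [[p / r_i for p in row] for row, r_i in zip(P, r)]

def chi_square_distance(profile_i, profile_k, c):
    """Chi-square distance between two row profiles.

    Squared differences are weighted by the inverse column masses,
    so every column mass must be strictly positive.
    """
    return sqrt(sum((a - b) ** 2 / c_j
                    for a, b, c_j in zip(profile_i, profile_k, c)))

P = correspondence_matrix(F)
r, c = masses(P)
A = row_profiles(P, r)
d01 = chi_square_distance(A[0], A[1], c)
d02 = chi_square_distance(A[0], A[2], c)
print(f"d(0, 1) = {d01:.4f}, d(0, 2) = {d02:.4f}")
```

Distances between columns follow in the same way after transposing the table, which swaps the roles of the row and column masses.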
In this study, the "ca" function of the ca package in the R language [17] was used for correspondence analysis. This function employs singular-value decomposition (SVD) to solve the correspondence analysis. The ca package outputs the scores for the row and column elements and the cumulative contribution ratio for each axis based on the chi-square distances. As shown by Equation (1), the chi-square distances can be calculated for rows and columns separately. Accordingly, the scores are based on the scalar products of the row vectors and column vectors, which depend on the lengths of the vectors and the angles between them rather than on the absolute distance between the vectors [19].
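The SVD-based solution can be sketched as follows. This is a minimal NumPy illustration of the standard textbook formulation (SVD of the matrix of standardized residuals), not the source code of the ca package; the function name `correspondence_analysis` and the input table are assumptions made for this example, and the scores returned are principal coordinates.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple correspondence analysis via SVD (illustrative sketch).

    Returns row scores, column scores (both in principal coordinates)
    and the contribution ratio of each axis. Assumes all row and column
    totals are positive and the table is not exactly independent.
    """
    F = np.asarray(table, dtype=float)
    P = F / F.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    inertia = sv ** 2                    # principal inertia per axis
    ratio = inertia / inertia.sum()      # contribution ratio per axis
    row_scores = (U / np.sqrt(r)[:, None]) * sv
    col_scores = (Vt.T / np.sqrt(c)[:, None]) * sv
    return row_scores, col_scores, ratio

# Hypothetical 3 x 3 table (NOT the questionnaire data).
F = [[20, 5, 0],
     [4, 18, 3],
     [1, 2, 22]]
row_scores, col_scores, ratio = correspondence_analysis(F)
print("contribution ratios:", np.round(ratio, 4))
print("cumulative (first two axes):", round(float(ratio[:2].sum()), 4))
```

In this formulation the Euclidean distance between two rows, taken over all axes of the principal coordinates, equals the chi-square distance between their row profiles, which is why a two-dimensional plot with a high cumulative contribution ratio approximates those distances well.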

Results of Correspondence Analysis
The result obtained from the questionnaire was used as input data for this analysis (Table 2). This result can be visualized in a two-dimensional scatter plot (Figure 1). In this figure, the top and right axes represent the scores of the land use categories (column elements in Table 2). The bottom and left axes represent the scores of the facility classes (row elements in Table 2). The contribution ratios corresponding to each axis are 53.91% for the first (vertical) axis and 34.02% for the second (horizontal) axis. The cumulative contribution ratio of the two-dimensional visualization is therefore 87.93%. Although the first axis is conventionally plotted horizontally, in this study it is plotted vertically, and the second axis was rescaled for improved visual quality.
In Figure 1, the land use categories "Public facility", "Others", and "Commercial facility" are arranged in a straight line along the horizontal axis. "Residence" is located on the opposite side from these three categories and above them on the vertical axis. The contribution ratio of the vertical axis is larger than that of the horizontal axis. "Residence" is thus clearly distinguished from the other three categories. In other words, "Residence" can be considered far from the other categories with regard to semantic similarity. The facility classes located around "Residence" are "Apartment complex", "Student dormitory", and "Dormitory for Diet members", all of which have a housing function. In addition, "Welfare house for aged" is closer to these classes than are the other facility classes. In Japan, welfare houses for the aged come in a variety of types, such as nursing care facilities or lifetime care service facilities, but the housing function is common to all. However, the primary function of "Welfare house for aged" is not simply housing, whereas "Apartment complex", "Student dormitory", and "Dormitory for Diet members" serve primarily as housing. It can therefore be considered that "Welfare house for aged" is located farther from "Residence" in the plot than these three facility classes because housing has a lower priority among its functions.
The "Others" category was used if a facility class could not be classified into the other three categories. "Others" is located midway between "Public facility" and "Commercial facility". It therefore appears that respondents classified a facility class as "Others" when they could not decide whether it was a "Public facility" or a "Commercial facility". "Grave site", "Botanical garden", and "Child care center" are located around "Others".
Copyright © 2012 SciRes. JGIS

Comparative Analysis Based on "Establishment Agent" and "Establishment Purpose"
The facility classes located around "Public facility" include "Library", "Park", and "Athletic field". These facilities are available to all citizens free of charge or at low cost. In contrast, the facility classes located around "Commercial facility" include "Shopping center", "Wholesale market", and "Japanese style hotel". These facilities have economic activity as their primary function.

Figure 2(a) shows the relationships between the land use categories and the facility classes in terms of "Establishment agent". This figure shows that "public" and "public/private" facilities are concentrated around "Public facility" and "Residence", whereas "private" facilities are located around "Commercial facility" and "Others". In contrast, the relationships in terms of "Establishment purpose" show that several purpose categories are concentrated around particular land use categories (Figure 2(b)). For example, the "Residence" establishment-purpose category is concentrated around the "Residence" land use category, and "transportation" is concentrated around "Public facility".

Categorization of the Facility Classes
To investigate the relationships between the land use categories and the facility classes, the facility classes were categorized in terms of establishment agent and establishment purpose (Table 3). The term "Establishment agent" was defined according to whether taxes were used to establish the facility: "public" indicates that taxes were used, and "private" indicates that taxes were not used. The term "Establishment purpose" is used to categorize the primary function of these facilities. This categorization was developed on the basis of a land use database named Digital Map 5000 (land use), published by the Geospatial Information Authority of Japan (GSI), in order to divide the facilities into more detailed categories.
The dominant determining factor for land use categories can be inferred from the relationships in terms of "Establishment agent" and "Establishment purpose". The three facility classes "Apartment complex", "Student dormitory", and "Dormitory for Diet members" are concentrated around the "Residence" land use category. Although they do not share an "Establishment agent", they do share an "Establishment purpose". This means that "Establishment purpose", not "Establishment agent", is the dominant determining factor for the land use category "Residence". The housing function determines membership in "Residence" more strongly than whether taxes were used to establish the facilities. Similarly, "Wholesale market" is close to "Commercial facility", although its "Establishment agent" is "public". Facilities built for economic activity would therefore be readily classified as "Commercial facility", irrespective of whether taxes were used for their establishment.

Conclusions
This study presented an empirical investigation of the common sense of land use categories using the questionnaire method and a statistical technique. Although the land use categories used in this study are limited, several characteristics and distinctions of these land use categories were clarified. In addition, the dominant determining factors for the land use categories were identified. Although some

Figure 1. Scatterplot of the results from the correspondence analysis.

Figure 2. Scatterplots of the results of the correspondence analysis expressed in terms of "Establishment agent" (a) and "Establishment purpose" (b).

Table 1. Question table used in the questionnaire method.
P: Public facility, C: Commercial facility, R: Residence, O: Others. * Japan's Shinkin banks, commonly known as credit associations, are relatively small financial institutions that are privately held by members living near a bank's headquarters [14]; ** In Japan, a child care center is generally under private management and is different from a kindergarten that is licensed by the Ministry of Health, Labor and Welfare or the Ministry of Education, Culture, Sports, Science and Technology. Their facility functions are, however, almost the same.