Developing Ontology Systems as a Base of an Environmental Quality Management Model in México

Identifying the attributes and relationships to include in an ontology is a complex task, because many factors are involved in the deterioration of environmental quality and because the data sources are diverse and dispersed. This work presents an ontology that integrates the data required by an Environmental Quality Synoptic System (EQSS), data that to date are scattered across different Internet sites and held by different agencies, for example INEGI, CONABIO, SEMARNAT and CNA, among others. The methodology consists of collecting environmental information on Mexico and applying computational techniques to it, resulting in an ontology of environmental knowledge that is processed by the EQSS. Among the main advantages is that the selection and structure of the information allow the automated generation of results in an Environmental Statement. The proposed ontology is based on the knowledge of the EQSS, which in turn is based on the architecture of expert systems; through it, important information is obtained for decision-making on environmental quality and for interaction with Geographic Information Systems (GIS).


Introduction
The constant evolution of technology has enabled us to manipulate information in ways different from the past, when this was done only manually and with a large investment of time. Similarly, it is now possible to automate processes that previously required a lot of human resources, such as statistical calculations, reports, and graphs. Currently there is a great deal of environmental information from different sources, such as documents, databases and files with geographic information, satellite images, aerial photographs and maps, among others. These data sets can provide important information on the qualitative and quantitative characteristics inherent to the environment and on its relative capacity to meet the needs of people and ecosystems [1]. From this information, greenhouse gas emissions, wastewater discharges and solid waste generation can be estimated by indirect methods, so called because they are based on emission factors, activity rates, estimates from historical data, material balances, engineering calculations and mathematical emission models [2]. Doing so requires access to sufficient information on the environment, industrial sources, vehicle sources, sources of waste, emission sources, rankings and current regulations, from which the data required for the study area are then obtained. Determining how to integrate the necessary information is very difficult because of the heterogeneity of the data sources and their differing table and field names.
Meanwhile, in Mexico, policy states the need for inventories of pollutants through a register of emissions and transfers of pollutants, emphasizing three major themes: hazardous waste; solid waste; and air emissions and wastewater discharges; as well as specific sources of pollution such as mining, agriculture, livestock, pet and service. These sources and types of waste represent different dimensions of the environmental problem, depending on the importance of each within the economic orientation of the country's different regions. Ontologies have been developed elsewhere to integrate databases that generate environmental information based on study areas, on methods for environmental impact assessment and on wastewater management, among others [1] [3] [4]; in Mexico, however, none has been developed. Such ontologies, based on environmental information, can be used for the management of environmental systems and applied in environmental policy tools. Their absence has been considered the first of a string of errors in the application of tools such as environmental impact assessment or strategic impact assessment [5] [6].
Hence, the objective of this research is to present a methodology for the design of an environmental ontology and its application in the Environmental Quality Synoptic System (EQSS). Given the complexity of collecting environmental information and its impact on decision-making in environmental public policy, a mechanism is required that can integrate the data and the relations between them so as to form the EQSS knowledge base. This architecture operates according to the methodology of the Rapid Evaluation of Sources of Environmental Contamination technique (ERFCA, by its acronym in Spanish) [5] [7]. The computational strategy for structuring a data warehouse from the databases of the government agencies responsible for environmental information in Mexico is based on defining semantic relationships in the designed context. In a previous review of environmental ontology studies, we did not find an environmental ontology that incorporated the necessary information as posed by the ERFCA technique [7]. The concept of ontology has been used since the late 1990s [8] [9] and has peaked in recent years thanks to advances in information technology [10]-[13]. Web ontologies are implemented mainly with the W3C standards for knowledge management based on RDF, namely OWL and SKOS [14], primarily using the Protégé software [15] [16]. The tendency to access information remotely and in real time has promoted the use of Web systems, developed mainly in languages such as PHP and ASP.NET, that incorporate the benefits of JSON and of JavaScript libraries such as jQuery and Bootstrap, which allow information to be viewed from any mobile device. Ontologies developed in recent years [10] [11] use service-oriented and Semantic Web technologies, in addition to incorporating spatial data by means of a map server [17] [18] accessed through web services [13].
As described, assessing environmental quality in a given study area requires a mechanism to integrate the useful information. The objective of this research is therefore to present the methodology required to design an environmental ontology and to apply it in a Web system for estimating environmental quality. The term ontology is used in knowledge representation systems to denote a knowledge model that represents a particular domain of interest. A knowledge domain is represented formally by a conceptualization: the objects, concepts and other entities that are presumed to exist in some area of interest and the relationships between them. In the Semantic Web, ontologies are a key component for knowledge modeling, enabling interoperability between different systems and the reuse of existing knowledge in new systems [18]. In the area of Artificial Intelligence, on the other hand, an information repository is used whose structure allows the data to be interpreted and inferences to be made within a learning process, so that knowledge is acquired automatically.
The Web ontology developed in this research is based on the ERFCA technique [5] and on its modification by Economopoulos [7] in 2002 for air pollution sources, which allows inventories of pollution sources to be compiled and their results to be expressed through Battelle-type indicators of the state of environmental quality. These indicators establish evaluation criteria and an estimate of the pollution generated according to the production of goods or services by the different sources of contamination. The technique uses available data from the public, social and private sectors, highlighting the most important generating sources, those with a significant impact on the environment. The proposed ontology also provides information for decision-making on environmental quality and is expected to interact with other environmental policy instruments, incorporating Geographic Information System (GIS) tools.
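The indirect estimation that rapid-assessment techniques of this kind rely on can be illustrated as an activity level multiplied by an emission factor, summed per source category. The following is a minimal sketch under stated assumptions: the category labels, activity levels and factor values are hypothetical and are not taken from the ERFCA tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    category: str           # hypothetical source category label
    activity: float         # activity or production level, in activity units per year
    emission_factor: float  # kg of pollutant per activity unit (illustrative value)


def estimate_emission(source: Source) -> float:
    """Indirect estimate: activity level multiplied by the emission factor."""
    return source.activity * source.emission_factor


def total_by_category(sources):
    """Sum the estimated emissions of all sources, grouped by category."""
    totals = {}
    for s in sources:
        totals[s.category] = totals.get(s.category, 0.0) + estimate_emission(s)
    return totals


# Illustrative inventory; none of these numbers come from a real data set.
sources = [
    Source("industrial", activity=1000.0, emission_factor=0.5),
    Source("industrial", activity=200.0, emission_factor=2.0),
    Source("non-industrial", activity=50.0, emission_factor=4.0),
]
totals = total_by_category(sources)
```

Grouping by category mirrors the way the inventory is reported per type of generating source; a real application would take the factors from the published tables of the technique.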

Methodology
The proposed methodology follows the guidelines provided by Noy and McGuinness [18], which state that one must first determine the scope and extent of the ontology, considering whether a similar ontology exists that can be reused. The important terms are then listed in order to define the classes, their properties and their hierarchy. This process makes it possible to create the instances in which the information is stored. The steps presented in Figure 1 are described below.
Defining the scope of the environmental ontology. The goal of the ontology and its functionality within the Web system for evaluating environmental quality are defined; to do so, existing ontologies and the technology used to implement them must be analyzed. At this stage the domain of the environmental ontology is determined, i.e. the set of information to be modeled and the computational tools needed to create it.
Ontology design from the input and expected output. The structure of the ontology is defined by identifying the attributes and relationships, based on the information obtained in the previous step. The structure of the official environmental databases is analyzed, identifying each of their entities and attributes and relating them to the concepts defined in the ontology.
Integration of the proposed ontology instances. A mechanism is defined to feed the required data into the ontology. This process, called instance mapping, unifies the different data sets and integrates them into the ontology.
Implementing the search engine. The interface and the modules required to search for information in the ontology are designed and implemented. This involves the instance mapping, the search process and the display process that presents the data requested by the user.
Retrieving and displaying data. This function is responsible for formatting the data most frequently queried from the implemented ontology. The output format is an executive report, called the Cabinet Study, which may include graphs and thematic maps if these are selected in the EQSS user interface; the system aims to provide environmental impact assessment with consistent material for decision-making.

Ontology Design
The design of the ontology is based on a client-server architecture, as shown in Figure 2, so that users can access the information through a Web browser. MySQL 5.6.14 and SQL Server 2008 were used to import the databases, and the programming language for the Web user interfaces is PHP 5.5.6. The ontology was developed in OWL (Web Ontology Language) with the RDF/XML syntax, using Protégé, an ontology development tool created at Stanford University that facilitates construction and can display all the knowledge of a domain for later analysis.
To define the ontology domain, preliminary work was carried out to identify the domain and scope of the ontology by analyzing the information required by the ERFCA technique, which mainly covers the following criteria:
1) Study area.
2) Industrial and non-industrial pollutant sources.
3) Classification of pollution sources according to the United Nations (UN).
4) Pollutants and factors associated with each of the categories described in the UN classification.
The entities identified were mainly the study area, pollutant sources, UN classification, Environmental Statement and indicators. From this information, and based on propositional logic, a taxonomy was modeled from a hierarchical graph that represents a semantic network, mainly in order to establish the classes. The relationship types and properties let the ontology know which inheritance relationships exist through a class's predecessors in the graph. Figure 3 shows the information considered by the Environmental Statement. Each item represents a class with its properties, related to the polluting sources entity through the study area entity. Figure 4 shows part of the rules that form the taxonomy described and a fragment of the semantic network.
Instance mapping is based on the D2RQ platform [19], which enables unified access to the data sources, and consists principally of two steps. In the first step, the information needed from each data source is filtered, and the information that cannot be obtained from them is identified and set up in a database of our own design, which has been named the ERFCA database. The information required by the system is dispersed, since it is stored and/or generated by the various organizations mentioned above, which makes it difficult to integrate. These organizations are listed below:
• The National Commission for the Knowledge and Use of Biodiversity (CONABIO, by its acronym in Spanish) [20], created in 1992 to provide data, information and advice to multiple users, to comply with international commitments, and to carry out actions for the conservation and sustainable use of biodiversity in Mexico.
• The Secretariat of Environment and Natural Resources (SEMARNAT, by its acronym in Spanish) [21]. Since 2000 it has been the federal government agency responsible for promoting the protection, restoration and conservation of ecosystems, natural resources, and environmental goods and services in Mexico, in order to promote their sustainable use and development.
• The National Institute of Statistics, Geography and Informatics (INEGI, by its acronym in Spanish) [22]. Since 1983 it has been responsible for generating statistical and geographic information, modernizing the capture, processing and dissemination of information about the land, people and economy.
• The National Water Commission (CNA, by its acronym in Spanish) [23]. In 1989 it was given the responsibility to manage, regulate, control and protect the country's national waters, to establish national water policy and strategies, to establish programs that assist municipalities in providing safe drinking water and sanitation in cities and rural communities, and to promote the efficient use of water in irrigation and industry.
• The Federal Attorney for Environmental Protection (PROFEPA, by its acronym in Spanish) [24], which originated from the need to address and control the increasing environmental deterioration in Mexico, to regulate risky industrial activities and the contamination of soil and air, and to care for natural resources.
It was decided to consider primarily the data stores of INEGI and SEMARNAT. All the information collected was downloaded directly from their official websites, considering data, images and maps related to the ontology entities. The structure used by each of the selected data sources, and by the ERFCA database itself, is variable; Figure 5 shows some of the catalogs considered during information filtering.
Once the information relevant to the ontology has been gathered, we proceed to generate the instances (Figure 6 and Figure 7) and integrate them to form the information of the domain ontology, together with the mechanisms needed to configure and update the data sets. The information obtained in the first step follows the principles of relational databases, so a correspondence must be established between the information obtained and the ontology. The following figure shows the tables with information extracted from the databases of INEGI and SEMARNAT, together with the environmental variable to which each belongs.
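The correspondence between relational rows and ontology instances can be pictured as turning each row into a small set of subject-predicate-object statements. The sketch below is only an illustration of that idea: the base URI, the class name, the column names and the row values are all hypothetical, and the actual system delegates this work to D2RQ rather than to hand-written code.

```python
def row_to_triples(row, class_name, key_column, base="http://example.org/eqss/"):
    """Turn one relational row into (subject, predicate, object) triples.

    The key column identifies the instance; every other non-empty column
    becomes a property of that instance.
    """
    subject = f"{base}{class_name}/{row[key_column]}"
    triples = [(subject, "rdf:type", f"{base}{class_name}")]
    for column, value in row.items():
        if column == key_column or value is None:
            continue  # the key is already in the subject; skip empty cells
        triples.append((subject, f"{base}{column}", value))
    return triples


# Hypothetical row; the column names are not the real catalog fields.
row = {"id_municipio": "00001", "nombre": "Ejemplo", "poblacion": None}
triples = row_to_triples(row, "Municipality", "id_municipio")
```

Skipping empty cells keeps the instance free of properties without values, so that only information actually present in the source is asserted.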

Semantic Integration
The goal of integration is to provide unified access, through the ontology, to databases with different and independent structures, avoiding duplicated and outdated information. Unified access means that, once the ontology is connected to the external data stores, a user interface gives access to a set of heterogeneous data sets that are independent of each other but belong to the same domain. Figure 8 depicts the instance mapping process, which in this work uses two databases containing information from INEGI and SEMARNAT respectively. Each database has its own structure, represented by an entity-relationship diagram, from which the entities E1 and E2 were chosen. The instance mapping process relates each attribute of E1 and E2 to its equivalent in the ontology by means of the mapping language of the D2RQ platform, a declarative language for describing relationships between a relational database and an OWL ontology.
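The idea of relating each attribute of E1 and E2 to its equivalent in the ontology can be sketched with two lookup tables and a merge on a shared key. This is a simplified illustration, not the D2RQ mechanism itself: the field names, the ontology property names and the sample values are assumptions made for the example.

```python
# Hypothetical field-to-property correspondences for the two entities.
E1_MAPPING = {"cve_mun": "municipalityCode", "pob_total": "population"}
E2_MAPPING = {"clave_municipio": "municipalityCode", "residuos_ton": "solidWasteTons"}


def to_ontology_terms(record, mapping):
    """Rename the mapped fields to ontology properties; drop unmapped fields."""
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def integrate(e1_records, e2_records, key="municipalityCode"):
    """Merge both sources into one view indexed by the shared ontology key."""
    merged = {}
    tagged = [(r, E1_MAPPING) for r in e1_records] + [(r, E2_MAPPING) for r in e2_records]
    for record, mapping in tagged:
        unified = to_ontology_terms(record, mapping)
        merged.setdefault(unified[key], {}).update(unified)
    return merged


# Illustrative records only.
e1 = [{"cve_mun": "001", "pob_total": 1500, "extra": "x"}]
e2 = [{"clave_municipio": "001", "residuos_ton": 12.5}]
unified_view = integrate(e1, e2)
```

Because both sources are renamed to the same vocabulary before merging, a query phrased in ontology terms does not need to know which database a value came from.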
First, the relationship between the OWL ontology schema and the relational database is set out in a file using D2RQ syntax. For this, basic structures of the D2RQ language are used: d2rq:Database, d2rq:jdbcDSN, d2rq:ClassMap and d2rq:PropertyBridge. Figure 9 and Figure 10 present a view of the code needed to connect the relational database.
The ClassMap structure represents a group of classes of the OWL ontology and, through the map:database statement, is linked to the data for the ontology. The d2rq:uriPattern property, in Tabla.Columna (table.column) notation, specifies the database table and column from which values are taken during the mapping.
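How a URI pattern in table.column notation yields one identifier per row can be sketched as a simple substitution. The example assumes the `@@Table.Column@@` placeholder form used in D2RQ mapping files; the pattern, table name and row are hypothetical, and percent-encoding the value is a choice of this sketch rather than a statement about D2RQ's internals.

```python
import re
from urllib.parse import quote

# Matches placeholders of the form @@Table.Column@@.
PLACEHOLDER = re.compile(r"@@(\w+)\.(\w+)@@")


def expand_uri_pattern(pattern, table, row):
    """Replace each @@Table.Column@@ placeholder with the row's column value."""
    def substitute(match):
        ref_table, column = match.group(1), match.group(2)
        if ref_table != table:
            raise ValueError(f"pattern refers to table {ref_table!r}, not {table!r}")
        return quote(str(row[column]), safe="")
    return PLACEHOLDER.sub(substitute, pattern)


# Hypothetical table and row, for illustration only.
uri = expand_uri_pattern(
    "municipio/@@Municipio.id@@",
    "Municipio",
    {"id": 7, "nombre": "Ejemplo"},
)
```

Each row of the table thus receives a distinct, stable identifier derived from its key column, which is what lets instances from the database be referenced from the ontology.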
As a result of instance mapping, a mapping file is generated in a declarative language that describes the relationship between RDFS or OWL ontologies and a relational database through D2RQ. From this file, the D2RQ platform transforms the database into a semantic network. Once the data are mapped, the user can query the ontology. In the EQSS user interface, the user selects the study area, comprising one or more municipalities in Mexico, and a report called the Baseline or Environmental Statement is generated through the ontology; this process is shown in Figure 11. Once the mapping is completed, the resulting file can be used to implement an Environmental Quality project, in which information is queried only in the terms employed in the ontology and answers are returned only for those classes that have instances.
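The behaviour described here, where the user selects municipalities and only classes that have instances contribute to the report, can be sketched as a filter followed by a grouping. The class names, municipality codes and values below are hypothetical placeholders, and the in-memory list stands in for what the real system retrieves through the mapped ontology.

```python
def environmental_statement(instances, municipalities):
    """Group the values of the instances located in the selected study area.

    Classes with no instance in the study area never appear in the report,
    so only populated classes produce an answer.
    """
    selected = set(municipalities)
    report = {}
    for instance in instances:
        if instance["municipality"] in selected:
            report.setdefault(instance["class"], []).append(instance["value"])
    return report


# Illustrative instances; not real measurements.
instances = [
    {"class": "AirEmission", "municipality": "A", "value": 10.0},
    {"class": "WastewaterDischarge", "municipality": "B", "value": 3.0},
    {"class": "AirEmission", "municipality": "C", "value": 7.0},
]
statement = environmental_statement(instances, ["A", "C"])
```

In this example the study area contains no wastewater instances, so that class is simply absent from the resulting statement.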
The Environmental Statement, generated from an interface and modeled by an ontology, is a critical tool in the process of assessing environmental quality, territorial regulation, planning and management, the development of environmental management plans, waste management and vulnerability analysis, among others [25] [26]. Importantly, depending on the results of the query, the spatial information is rendered with the Google Maps API on a map embedded in the EQSS user interface, generating small custom thematic maps. Figure 12 shows a view of the Environmental Statement generated by the interface.
Studies in the area of Artificial Intelligence have been carried out since 1960 [27], but in recent years advances in software, hardware and, more precisely, telecommunications have allowed the modeling of environmental and geographical systems, automating processes and enabling better data management. It is also important to note a progressive trend to replace conventional process models for information retrieval with cognitive models of human memory and learning models for intelligent agents, proposing ontologies to structure the knowledge base. In this way, information and knowledge can be shared, and services can be generated that offer rich, modern knowledge modeling and terminology management within the contextual framework of any subject. The proposed ontology is based on the knowledge of the EQSS, which requires a set of related data comprising physical-natural information and, mainly, social and economic information. The information is obtained from data sources in Mexico, and government agencies are working to ensure real-time access to the data through the available Web services. Future work includes implementing a Web service based on semantic mapping of geodata, which would allow users to build models according to the needs of the problem to be solved, such as fire prediction models, which require the creation and integration of more geospatial data related to climate, human activities and the flammability of specific regions [28]-[30].

Conclusions and Recommendations
Semantic technologies are among the available assessment methods, and building ontologies as the knowledge storage model for the environmental domain is a dynamic, collaborative and iterative process. This process needs to be understood, evaluated and managed both by domain experts and in an automated manner. Progress in this direction allows the existing arsenal of techniques for analyzing environmental data and evaluating ontologies to be expanded toward more holistic approaches. With the emergence of new devices, technologies and services, GIS applications have become very important and are used in various fields, for example as public policy instruments that can model the dispersion of pollutants.
To determine the system requirements, the process and outputs of the ERFCA technique were considered: the data entered into the system relate to the study area to be assessed, while the expected outputs must be expressed in terms of medium (air, water, soil), sector (industrial, non-industrial) and source (mobile or non-mobile). The proposed environmental ontology, accessed through the EQSS user interface, can produce information for decision-making on environmental quality and is designed to interact with Geographic Information Systems (GIS).

Figure 1. Methodology for the implementation of the environmental ontology.

Figure 2. Architecture of the ontology.

Figure 3. Information considered by the Cabinet Study entity.

Figure 4. View of the semantic network of the environmental ontology.

Figure 6. Generation of instances from the data accessed.

Figure 7. Data obtained from the INEGI database.

Figure 8. Integration of two information entities into the ontology.

Figure 9. Definition of relational databases using.

Figure 11. Generating a baseline study from the environmental ontology.

Figure 12. Report generated by the interface of the environmental ontology.