How to Implement a Governmental Open Source Geoportal

With the emergence of OGC web service technologies, which brought server interoperability to the geospatial field, most governmental enterprises rushed to implement geoweb services. The massive volume of geospatial information produced in various offices over previous years, which had been unavailable to the public because of protocol difficulties, could now be restructured according to OGC specifications and made reachable to a large number of interested clients over the Internet. The growing number of map providers on the web created a need for search facilities in the spatial data domain. A facility of this kind is called a “geoportal”: it provides client applications that use several geoservices, such as catalogs and web map services. Implementing a geoportal that meets these needs has posed a set of hard challenges for governmental owners of geospatial data. In this study we present an overall view of service-oriented architecture, the web service specifications that follow from it, and finally the web catalog services that are the foundation for developing a geoportal. We also report some experiments on importing and exporting data between geoportals, a method known as harvesting.


Introduction
Over the last decade, we have witnessed the evolution of Geographic Information Systems from the traditional model of stand-alone systems, in which geo-data are tightly coupled with the systems used to create them, to an increasingly distributed model based on independently provided, specialized, interoperable GIS Web Services. This evolution is enabled by advances in supporting IT technologies and by the growing demand for GIS in a variety of application domains [1]. Web-based geographic information system tools are increasingly used for basic mapping and data visualization tasks, while more complex web mapping and data analysis tools continue to be developed. In spite of this apparent migration of GIS to the web platform, desktop or client-side GIS tools are likely to remain necessary for a variety of use cases in the foreseeable future. The primary standards body for the GIS community, the Open Geospatial Consortium (OGC), has released web-based GIS specifications that are arguably the most widely adopted of its standards and are used primarily for web-based GIS tools. OGC standards such as the Web Map Service (WMS) and Web Feature Service (WFS) have proven useful for normalizing and improving the manner in which data are shared across the Internet, and as such they are expected to grow in popularity and usage [2].
The technology for data sharing has advanced in the areas of web services and direct data access, rendering obsolete first-generation systems such as the Clearinghouse Network. The second generation attempted to make government more citizen-focused, which is consistent with the long-term goals of creating an easy-to-access inventory of currently available data collected by Federal agencies and of cultivating a planned data investment marketplace that will allow Federal and local governments to pool resources for future data collection and purchase plans [3]. In this study, a GIS server, a catalog server, and a geoportal server were developed by combining several software environments. The system is based mostly on open source technology and can be updated from other defined geoportal servers. It records a wide range of spatial data, including maps at different scales and data in different formats.

Spatial Interoperability
Over the past decade many Geographical Information Systems have been operating all over the world, especially in public administrations, and one recurring problem has been to make them cooperate, particularly at the data level, so that one institution can profit from the data of another. This problem is known as GIS interoperability, since the idea is to share not only data but also programs and services [4]. The software foundation of this ability is Service Oriented Architecture, which enables many valuable forms of interaction that are described in the following sections.

SOA (Service Oriented Architecture)
Service Oriented Architecture comprises a flexible set of design principles used during the phases of system development and integration. The deployment of a SOA-based architecture provides a loosely integrated suite of services that can be used within multiple business domains. The enabling technologies in SOA allow services to be discovered, composed, and executed. For instance, when an end user wishes to accomplish a certain task, a service can be employed to discover the resources required for the task. This is followed by a composition service, which plans the road map for providing the desired functionality and quality of service to the end user [5] [6].

OGC Web Services
The Open Geospatial Consortium (OGC), as is widely known in our community, is an international consortium of 480 members that aims to develop interface standards for the geo-enabled web. The OGC develops different standards for geoweb services [7]. In a Web GIS, spatial data are accessed over the Internet through web services. The data are not directly accessible as they would be on an FTP system; instead, web services are offered to the user [8]. Two major OGC standard services relevant to this paper are briefly described below.
WMS (Web Map Service): renders maps with spatial content dynamically from geographic information. A map is understood as the portrayal of geographic information as a digital image file suitable for display on a computer screen. The maps produced by a WMS are generally rendered in a raster graphic format such as PNG, JPEG, or GeoTIFF, as well as in a vector-based format such as SVG [7].
WMTS (Web Map Tile Service): the specification provides a mechanism for accessing discrete, pre-rendered map tiles in a standardized way, for reasons of tile caching and performance. Similar but proprietary approaches are used by Google and Yahoo for their RESTful services. The WMTS also complements the existing Web Map Service. The specification is split into two parts, defining 1) the resources and 2) the concrete exchange mechanisms [9].
Spatial web services are used to provide interoperability between users and to deliver advanced GIS functions and analysis on web pages [10]. They establish a standardized web-based communication infrastructure between institutions, organizations, and information systems [11] [12]. Web services are interoperable, flexible, compatible, and reusable components of service-oriented architectures [13] [14].
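As an illustration of how a client asks a WMS for a rendered map, the following minimal Python sketch assembles a GetMap request as a key-value URL. It is not taken from the system described in this paper: the endpoint URL and layer names are hypothetical, and WMS version 1.1.1 is assumed so that the bounding box can be given as west, south, east, north without the axis-order rules of later versions.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layers, bbox, width, height,
                     srs="EPSG:4326", image_format="image/png"):
    """Return a WMS 1.1.1 GetMap URL.

    bbox is (west, south, east, north) in the units of `srs`.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),   # layers are drawn in the listed order
        "STYLES": "",                 # empty value requests default styles
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint and layer names, for illustration only.
    print(build_getmap_url("http://example.org/geoserver/wms",
                           ["roads", "rivers"],
                           (44.0, 25.0, 63.5, 40.0), 800, 600))
```

The server answers such a request with an image, so the underlying data never leave the server; this is the property that later makes WMS attractive to data owners who do not wish to expose their features directly.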

Geoserver
In performing GIS analysis tasks, Web GIS resembles a typical three-tier client/server architecture. Geoprocessing is broken down into server-side and client-side tasks. The client is typically a web browser. The server side consists of a web server, Web GIS software, and a database [15]. GeoServer is an open source Web GIS server written in Java that allows users to share and edit geospatial data. Designed for interoperability, it publishes data from any major spatial data source using open standards [16]. GeoServer is the reference implementation of the Open Geospatial Consortium (OGC) Web Feature Service (WFS) and Web Coverage Service (WCS) standards, as well as a high-performance, certified compliant Web Map Service (WMS) and Web Map Tile Service (WMTS). GeoServer forms a core component of the Geospatial Web [17]. GeoServer integrates OpenLayers, a free JavaScript mapping library, into its web interface for previewing maps. GeoServer's goal has always been to make geospatial information as accessible as possible. In the past this has amounted to delivering solid implementations of open standards on top of robust support for a variety of spatial data formats. However, as the more consumer-oriented geoweb emerges, so do new ways of exposing geospatial data. Much GeoServer development of the past year has focused on supporting these new geoweb-oriented technologies. The primary goal is to bridge the gap between the good data provided by SDIs and the quickly growing popularity of web mapping and the geoweb. This includes support for KML, which allows GeoServer to serve data directly to Google Earth and Google Maps. The ability both to consume and to publish data in a wide variety of formats makes GeoServer ideal for pushing data out to the wider web. In essence, GeoServer is the link between an SDI and the rest of the world [18].

Interoperability
Laurini (2002) gives the following example, which illustrates the importance of GIS interoperability. Consider an application aimed at detecting and controlling pollution along the Rhine River in Europe (Figure 1). The Rhine rises in Switzerland, flows along the boundary between France and Germany, traverses Germany, and ends in the Netherlands. Any comprehensive study must be based on several databases located in different countries. The languages and scopes of the databases may differ. Their contents were not acquired to the same specifications, and their last updates do not coincide. In this example, interoperability must be ensured between four GIS (Swiss, German, French, and Dutch), some of which may in reality be federations of different GIS subsystems [4].
Web-based GIS has become a central solution for interacting with spatial data over the Internet. Alongside desktop Geographical Information Systems, web-based geographical information systems have progressed rapidly over the past decade. Web GIS is now preferred because of innovations and developments in software, hardware, and networks. Fast processing units and large amounts of memory allow the analysis of high-resolution raster images and support intensive algorithms, data visualizations, and spatial analysis [19]. Attracted by the prospect of spatial interoperability, which requires no physical exchange of data, most owners of spatial information set out to complete the preparations needed to present their data through OGC web service specifications. Because OGC web services (OWSs) follow the WWW web services framework, any individual can deploy them on their own machines. The availability of OWS implementations is very important for data owners, who can easily install them and share their data in an interoperable manner [20]. According to a crawl report conducted by Skylab Mobilesystems in 2006, there were 994 WMS services and 339,254 layers available on the Internet at the time of the investigation [21]. By composing any desired set of these layers, a user can assemble a suitable map. Figure 2 shows a sample of this kind of GIS interoperability applied to environmental management.
However, although GIS interoperability offers access to worldwide information resources, finding the desired layers among the huge number of published services requires some preparation. This preparation consists of implementing a geoportal based on Catalog Services, which is explained in the next section.

Geoportal
Spatial data become searchable when they are accompanied by the required descriptive information, known as metadata. Metadata help people who use geospatial data to find the data they need and to determine how best to use them. Metadata are a key ingredient in supporting the discovery, evaluation, and application of geographic data beyond the originating organization or project [23]. If data producers want to promote their resources and make them available globally, they need to create metadata according to predefined rules and publish them using the CSW (OGC Catalog Service for Web) standard. A workflow with several main steps keeps a wide range of geospatial services updated and accessible through a CSW catalogue [24]. Protocol adaptation, query dispatching, query criteria translation, and query results integration are the four main challenges in building a catalogue federation. The OpenGIS Catalogue Service specification can be used to define the internal communication protocols between the federation service and the affiliated catalogue services, and between the federation service and its clients. The OpenGIS CSW specification proposes an OGC Common Catalogue Query Language, which must be supported by all compliant OpenGIS Catalogue Services. This query language supports nested Boolean queries, text matching operations, temporal data types, and geospatial operators. One query language implementation that can be transformed into this language is the OGC Filter specification [25].
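To make the query capabilities described above more tangible, the following Python sketch builds a CSW 2.0.2 GetRecords request whose constraint is an OGC Filter combining a text match with a bounding-box operator under a Boolean And. It is an illustrative sketch rather than the request used in this study: the function name is hypothetical, the queryable names AnyText and ows:BoundingBox follow the common CSW core queryables, and the corner ordering inside the envelope (west south, east north) is an assumption that should be checked against the target catalogue.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Template for a GetRecords POST body with a nested Boolean OGC Filter.
GETRECORDS_TEMPLATE = """<csw:GetRecords
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:ogc="http://www.opengis.net/ogc"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:ows="http://www.opengis.net/ows"
    service="CSW" version="2.0.2" resultType="results"
    startPosition="1" maxRecords="{max_records}">
  <csw:Query typeNames="csw:Record">
    <csw:ElementSetName>summary</csw:ElementSetName>
    <csw:Constraint version="1.1.0">
      <ogc:Filter>
        <ogc:And>
          <ogc:PropertyIsLike wildCard="%" singleChar="_" escapeChar="\\">
            <ogc:PropertyName>AnyText</ogc:PropertyName>
            <ogc:Literal>%{text}%</ogc:Literal>
          </ogc:PropertyIsLike>
          <ogc:BBOX>
            <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>
            <gml:Envelope>
              <gml:lowerCorner>{west} {south}</gml:lowerCorner>
              <gml:upperCorner>{east} {north}</gml:upperCorner>
            </gml:Envelope>
          </ogc:BBOX>
        </ogc:And>
      </ogc:Filter>
    </csw:Constraint>
  </csw:Query>
</csw:GetRecords>"""


def build_getrecords(text, bbox, max_records=10):
    """Return a GetRecords XML body for a free-text search inside bbox.

    bbox is (west, south, east, north); the axis order is an assumption.
    """
    west, south, east, north = bbox
    body = GETRECORDS_TEMPLATE.format(
        text=escape(text),            # protect &, < and > in user input
        west=west, south=south, east=east, north=north,
        max_records=int(max_records),
    )
    ET.fromstring(body)               # fail early if the XML is malformed
    return body


if __name__ == "__main__":
    print(build_getrecords("river", (44.0, 25.0, 63.5, 40.0)))
```

The resulting document would be sent by HTTP POST to the CSW endpoint of the catalogue; the sketch stops at building the body so that it does not depend on any particular server.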
A geoportal is an Internet entry point for accessing spatial data infrastructures. By providing web-based network services, a geoportal is an essential part of a spatial information infrastructure. A geoportal offers federal and state agencies, municipal authorities, and private companies the opportunity to make their data and services accessible to the whole community of Internet users. The portal is designed to provide only information about geospatial data and the data owners, not the data itself. The data and the related metadata remain with the data providers, leaving them in full control of all the information they provide [26]. Geoportals can be divided into two main types: 1) "discovery geoportals" and 2) "data delivery geoportals". The only difference is that the second type gives access to the actual data. A data delivery geoportal is set up by the owner of the data. In that sense these geoportals are usually limited to the datasets owned by the provider [27]. An estimated 80 percent of the information supplied to the public and to other government agencies has a spatial dimension, which indicates the importance of a new approach that combines data management and delivery with monitoring and reporting systems across different levels of government and the private sector [28].
Several implementations of CSW-based geo-catalogues already exist, including GeoNetwork, deegree, and Esri Geoportal Server [29]. deegree is a Java-based open source/free software framework for the implementation of Spatial Data Infrastructures (SDI). It contains the services needed for an SDI as well as portal components, mechanisms for handling security and access control, and storage and visualization of 3D geodata. deegree is based, both conceptually and in its interfaces, on the standards of the OGC and ISO/TC211 [30]. ESRI defines a geoportal as "a single point of access to spatial information, regardless of the location, format, or structure of the data source". ESRI packaged this product as a joint technology and services solution. The product consists of a suite of web-based and desktop software components collectively called GIS Portal Toolkit. GIS Portal Toolkit is built on ArcIMS and ArcSDE. Content for a portal catalog can be published and maintained using various harvesting methods. Using ESRI ArcCatalog, users can connect to the portal metadata service and publish metadata directly from the geospatial data or from separate collections of metadata files. Users can also upload XML files or automatically harvest from different kinds of metadata repositories, including ArcIMS metadata services, web-accessible folders, and Open Archives Initiative metadata services [31]. In this study a geoportal was implemented on the GeoNetwork open source system, which is described in more detail in the next section.

Geo Network
GeoNetwork opensource is a standardized and decentralized geospatial information management system, based on the concept of distributed data and information ownership, and is designed to enable access to geo-referenced data and cartographic products through descriptive metadata [32]. GeoNetwork is a catalog application for managing spatially referenced resources. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. It is currently used in numerous Spatial Data Infrastructure initiatives across the world. Some of its main features are listed below [33]:
• Immediate search access to local and distributed geospatial catalogues;
• Uploading and downloading of data, graphics, documents, PDF files, and any other content type;
• An interactive web map viewer to combine Web Map Services from distributed servers around the world;
• Online editing of metadata with a powerful template system;
• Scheduled harvesting and synchronization of metadata between distributed catalogs;
• Support for the OGC-CSW 2.0.2 ISO Profile, OAI-PMH, and Z39.50 protocols;
• Fine-grained access control with group and user management;
• Multilingual user interface.
GeoNetwork has a metadata template system, which allows new metadata entries to be created quickly. A template can be fully customized online and can be pre-filled with repetitive content (contact information, for example). Templates can be searched in the same way as normal metadata, but only editors and administrators have access to them. In addition, further templates can be created for specific user groups [34]. GeoNetwork had its own XML/XSL-based user interface for creating and editing metadata records, and could also harvest metadata from a harvesting node using the WFS service. Communication between its UI layer and the RESTful application layer was based on XSL/XML and JSON technologies. Moreover, GeoNetwork had a robust XSL-based mechanism for harvesting WFS layers and converting them into records compliant with the ANZLIC metadata profile [35]. All metadata entries are subject to GeoNetwork's group and user management mechanisms, which allow access rights to be specified within the application. Users with appropriate credentials can access metadata entries and download the associated physical data through secured connections. GeoNetwork open source offers a user-friendly interface divided into two parts. In the upper part of the interface, users can query the data catalogue using free text, geothermal-oriented keywords, and categories. They can also search by location, by date of publication in the catalogue, or by date of acquisition of the data. The results of the query are displayed in the lower part of the interface. Users can click through the different metadata entries and download all associated data [36].
The thesaurus tool enables metadata creators to use thesauri to fill in some metadata elements. The use of controlled keywords facilitates the mapping between a selected vocabulary and a large collection of records [37]. A thesaurus is a list of words (or concepts) from a specialized field of knowledge. In a metadata catalog, words from a thesaurus can be assigned to a metadata record (as keywords) as a way of associating it with one or more concepts from a field of knowledge. In GeoNetwork, keywords are assigned to a metadata record in the metadata editor. The user can choose words from one or more thesauri to associate the record with the concepts described by those words. This process is supported for both ISO 19115/19139 and Dublin Core metadata records through an ExtJS-based thesaurus picker. In GeoNetwork, thesauri are represented in SKOS (Simple Knowledge Organization System). GeoNetwork supports multilingual thesauri (e.g., AGROVOC). Search and editing take place in the current user interface language [33].
GeoNetwork open source supports language localization. It is highly modular, and adding a further language to the existing ones is a very simple task. All texts are stored in XML files, so one may simply copy the files of an existing localization, translate them, and place them in a directory named for the new language, such as "cs". Access to GeoNetwork is controlled through groups, roles, and users. The editor and controller roles are well suited to the management of metadata. An editor has the right to create, edit, and delete metadata. A controller has only one, yet more important, task: to inspect and approve metadata for publication. For example, students will typically have editor rights, and their metadata records will be checked by teachers [38].

Implementation
Once the optimizations required for storing and retrieving spatial data have been implemented, and the necessary web services such as WMS and WMTS have been developed in each branch of the organization, it is a good time to launch a catalog server and a suitable geoportal over all of them. This relationship is shown as a block diagram in Figure 3.
As Figure 3 illustrates, each branch of the organization should equip its geospatial server with two different capabilities. The first is a mapping server (in this case GeoServer), which presents OGC web services such as WMS, WFS, and WMTS. The second is a catalog server (in this case GeoNetwork), which works according to the CSW (Catalog Service for Web) specification. A server-side application, written in ASPX, acts as a proxy between the servers and the clients. A smaller program of the same kind was also written to isolate the mapping server from clients. Several JavaScript programs were written for the processing required on the client side, and a JavaScript OpenLayers program is used to display and manage maps. These elements are described below.

Proxy
Although both GeoServer and GeoNetwork offer diverse facilities for defining permissions and constraints for users, it is always useful to have a program through which the scope of access of each user can be defined in finer detail.
A good illustration of this ability is shown in Figure 4. GeoServer supports common web services such as WMS, WFS, and WPS. Whereas WMS does not transfer the underlying data to the client side, WFS allows the client to access the whole data, and system owners are usually reluctant to grant WFS permission to clients. On the other hand, clients need WFS permission to be able to use the WPS functions, which is an undesirable constraint.
By using a proxy and writing a few lines of code, it becomes possible to respond to a client's WPS requests without granting that client WFS permission. Another use of the proxy in this study was to manage several catalog servers as if they were one. As indicated in Figure 3, each office of the organization has its own catalog server; the proxy propagates the client's request to all defined catalog servers and presents the combined answers to the client. To isolate each GeoServer instance from its clients, a dedicated proxy was assigned to each one.
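The two proxy roles described above can be sketched in a few lines. The study implemented its proxy in ASPX; the Python below is only an illustrative stand-in, and every name in it (the allowed-service set, the function names, the shape of a record) is hypothetical. The first function decides whether a client request may be forwarded, so that WPS is reachable while direct WFS access is refused; the second sends one query to several catalogue servers and merges the answers, skipping any server that fails.

```python
from urllib.parse import parse_qs

# Services a client may call through the proxy (assumption for this sketch).
# WFS is deliberately absent: only the proxy itself would use it.
ALLOWED_SERVICES = {"WMS", "WMTS", "WPS"}


def is_request_allowed(query_string, allowed=ALLOWED_SERVICES):
    """Return True if the SERVICE parameter names a permitted service."""
    # OGC parameter names are case-insensitive, so normalise the keys.
    params = {k.upper(): v for k, v in parse_qs(query_string).items()}
    service = params.get("SERVICE", [""])[0].upper()
    return service in allowed


def federated_search(catalogs, query):
    """Send one query to every catalogue and merge the answers.

    `catalogs` maps a catalogue name to a callable that takes the query
    and returns a list of record dictionaries.
    """
    merged = []
    for name, search in catalogs.items():
        try:
            records = search(query)
        except Exception:
            # One unreachable office should not break the whole search.
            continue
        for record in records:
            # Tag each record with the catalogue it came from.
            merged.append({"catalog": name, **record})
    return merged
```

In a deployment, each callable passed to `federated_search` would wrap an HTTP call to one office's CSW endpoint; keeping them as plain callables here keeps the sketch independent of any network library.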

Harvesting
To reduce the number of linked catalog servers, it is useful to apply the harvesting method, in which the metadata of several catalog servers are merged into one aggregated server (Figure 5). Harvesting then acts as a periodic merge of the records of the defined servers into their corresponding records on the aggregated server.
GeoNetwork can harvest from various sources, such as another GeoNetwork node, a WebDAV server, a CSW or other catalogue server, or an OGC service through its GetCapabilities document. The supported OGC services include WMS, WFS, WPS, and WCS.
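The periodic merge performed by harvesting can be pictured with the following minimal Python sketch. It does not reproduce GeoNetwork's harvester: the record fields `id` and `modified` and the function name are assumptions made for illustration, and the rule applied (a remote record replaces the local copy only when its modification stamp is newer) is one plausible policy among several. Timestamps are assumed to be ISO 8601 strings in a single uniform format, so that string comparison matches chronological order.

```python
def harvest(aggregated, source_name, remote_records):
    """Merge records from one remote catalogue into the aggregated store.

    `aggregated` is a dict keyed by (source_name, record id) and is
    updated in place. Returns a tuple (added, updated).
    """
    added = updated = 0
    for record in remote_records:
        key = (source_name, record["id"])
        current = aggregated.get(key)
        if current is None:
            aggregated[key] = dict(record)      # first time we see it
            added += 1
        elif record["modified"] > current["modified"]:
            aggregated[key] = dict(record)      # remote copy is newer
            updated += 1
        # Otherwise the local copy is already up to date.
    return added, updated
```

Running `harvest` on a schedule for each defined source gives the behaviour described above. Records deleted at the source are not removed by this sketch; a fuller version would also compare the set of remote identifiers with the stored ones.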

Client Side Programming
Client-side programming may range from a thin, simple HTML page to a thick desktop application. As shown in Figure 6, thin clients run in any web browser and are machine-independent, whereas thick clients must be installed on the client's machine and depend on its operating system. In this study a release of the uDig application was used as the thick client. uDig is open source software written in Java, with a structure well suited to extension through plug-ins. It is available in both Windows- and Linux-compatible versions.
In general, a web browser can handle HTML documents and embedded raster images in standard formats. To deal with other data formats, such as vector data, video clips, or music files, the browser's functionality has to be extended. To overcome this problem, most browsers offer a mechanism that allows third-party programs to work together with the browser as plug-ins [15]. Generally, there are two reasons for providing processing capability on the client side. The first is to provide facilities for managing the delivered maps (such as zooming and…), and the second is to move part of the management of requests to the client. JavaScript is one of the most popular languages that are platform-independent and lightweight to deploy.
In this study the OpenLayers JavaScript library was used as the map management tool. It is open source and well known in this domain. Custom code can be added to it, for example to define map management buttons. JavaScript code may be kept in external files with the .js extension and loaded with a tag such as the following: <script type="text/javascript" src="scripts/search.js"></script> More than six JavaScript files were used in this study, some of which provide accessories such as a decorative clock. Another technology that improves HTML pages is Cascading Style Sheets (CSS). It is a style sheet language used for describing the look and formatting of a document written in a markup language. While most often used to style web pages and interfaces written in HTML and XHTML, the language can be applied to any kind of XML document, including plain XML, SVG, and XUL. CSS is a cornerstone specification of the web, and almost all web pages use CSS style sheets to describe their presentation [39]. CSS code can be defined in three different ways: inline, internal, and external. External CSS code may be referenced with a line such as the following: <link rel="stylesheet" href="main.css" type="text/css"/>

Server Side Programming
On the server side, besides the JavaScript files that are delivered to the client for processing there, two Active Server Pages (ASPX) programs were developed to manage the whole workflow. Figure 7 shows the workflow for providing geoportal services. The process begins with the client's request text. Using the XMLHttpRequest function of OpenLayers, a standard request statement is generated from the user's request text.
The generated request statement is sent to GeoNetwork, and its answers (a list of metadata records) are stored in a DBMS for future use as well as presented to the client. Whenever the user clicks on a row of the listed records, the related metadata are retrieved from the DBMS and presented to the user on a separate page. On this page the user can ask for the boundary of the layer to be shown on the base map. The layer itself can also be displayed there, provided that one of its WMS or WMTS services has been registered. The interactions between the JavaScript code and the ASPX applications were implemented with AJAX technology.
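The middle of this workflow, in which the catalogue's answer is stored in a DBMS and later read back when the user selects a row, can be sketched as follows. The study used ASPX and an unspecified DBMS; this Python version with an SQLite database is only an illustrative assumption, as are the table layout and function names. The parser assumes a CSW 2.0.2 GetRecordsResponse containing summary records with Dublin Core identifier and title elements.

```python
import sqlite3
import xml.etree.ElementTree as ET

NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def parse_records(response_xml):
    """Extract (identifier, title) pairs from a GetRecordsResponse."""
    root = ET.fromstring(response_xml)
    records = []
    for rec in root.iterfind(".//csw:SummaryRecord", NS):
        identifier = rec.findtext("dc:identifier", default="", namespaces=NS)
        title = rec.findtext("dc:title", default="", namespaces=NS)
        records.append((identifier, title))
    return records


def cache_records(conn, records):
    """Store the records so later clicks need no new catalogue request."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata (id TEXT PRIMARY KEY, title TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO metadata (id, title) VALUES (?, ?)", records
    )
    conn.commit()


def lookup(conn, record_id):
    """Return the cached title for a record, or None if it is unknown."""
    row = conn.execute(
        "SELECT title FROM metadata WHERE id = ?", (record_id,)
    ).fetchone()
    return row[0] if row else None
```

A real cache would hold the full metadata document rather than the title alone; the two-column table is kept deliberately small so that the store-then-retrieve pattern stays visible.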

Conclusions
The objective of this work has been to show that, by using appropriate open source software and a small amount of programming, governmental enterprises can implement a suitable geoportal that covers several geospatial servers. By optimizing spatial data and defining them in GeoServer, many useful web services such as WMS, WFS, and WMTS become accessible for each office of the organization. These services are suitable for presenting spatial data on the web, and since they are accepted as standard services, it is possible to compose a map whose layers are gathered from many different servers. GeoNetwork open source was used to register the metadata of the layers in a geoportal.
With this arrangement, each department responsible for spatial data not only gained facilities for presenting its data on the web in a standard way, but also made its information spatially searchable through the standard catalogue service.
To allow the organization to make all the facilities of its sub-departments available to users through a single portal, several server applications were written that manage all servers in the organization and act as a proxy between users and servers.
In keeping with the global trend toward open source facilities, it would be better to develop these small server applications in JavaServer Pages (JSP) or PHP instead of ASPX technology. It is also advisable to use the GeoNetwork thesaurus to improve searching in the local language.

Figure 2. Example of a client accessing multiple WMS enabled servers [22].

Figure 4. Using Proxy for responding to WPS requests without granting WFS permission.

Figure 5. Collecting remote metadata as a periodic process.