TITLE:
Visual Composition of Complex Queries on an Integrative Genomic and Proteomic Data Warehouse
AUTHORS:
Francesco Pessina, Marco Masseroli, Arif Canakoglu
KEYWORDS:
SQL Query Composition; Visual Interface; Integrated Data Extraction; Data Warehousing; Bioinformatics Database
JOURNAL NAME:
Engineering, Vol.5 No.10B, October 25, 2013
ABSTRACT:
Biomedical questions are usually complex and involve
several different aspects of the life sciences. Numerous valuable and heterogeneous
data are increasingly available to answer such questions. Yet, they are
stored in dispersed sources and are difficult to query comprehensively. We created a
Genomic and Proteomic Data Warehouse (GPDW) that integrates data provided by
some of the main bioinformatics databases. It adopts a modular integrated data
schema and several metadata to describe the integrated data, their sources and
their location in the GPDW. Here, we present the Web application that we
developed to enable any user to easily compose queries, even complex ones, on
all data integrated in the GPDW. It is publicly available at
http://www.bioinformatics.dei.polimi.it/GPKB/. Through a visual interface, the
user is only required to select the types of data to be included in the query and
the conditions on their values to be retrieved. Then, the Web application
leverages the metadata and modular schema of the GPDW to automatically compose
an efficient SQL query, run it on the GPDW and show the extracted
data, enriched with links to external data sources. The tests performed
demonstrated the efficiency and usability of the developed Web application, and
showed its relevance, and that of the GPDW, in helping to answer biomedical questions,
including difficult ones.
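
The metadata-driven composition described above can be illustrated with a minimal sketch. It is not the GPDW implementation: the metadata dictionaries (FEATURES, ASSOCIATIONS), the table and column names, the compose_query function and the toy data are all hypothetical, and SQLite stands in for the actual database engine. The sketch only shows the general idea of turning a user's selection of data types and value conditions into a parameterized SQL query by looking up tables and join keys in metadata.

```python
import sqlite3

# Hypothetical metadata (not the actual GPDW schema): for each data type,
# the table that stores it, the column shown to the user, and the columns
# on which conditions are allowed.
FEATURES = {
    "gene": {"table": "gene", "label": "symbol", "columns": {"symbol"}},
    "pathway": {"table": "pathway", "label": "name", "columns": {"name"}},
}

# Hypothetical metadata: association table and join keys linking two data types.
ASSOCIATIONS = {
    ("gene", "pathway"): ("gene2pathway", "gene_id", "pathway_id"),
}


def compose_query(source, target, conditions):
    """Build a parameterized SQL query joining two data types.

    conditions is a list of (data_type, column, value) equality filters.
    Table and column names come only from the metadata; values are bound
    as parameters rather than interpolated into the SQL text.
    """
    assoc, src_key, tgt_key = ASSOCIATIONS[(source, target)]
    src, tgt = FEATURES[source], FEATURES[target]
    sql = (
        f"SELECT {src['table']}.{src['label']}, {tgt['table']}.{tgt['label']} "
        f"FROM {src['table']} "
        f"JOIN {assoc} ON {assoc}.{src_key} = {src['table']}.id "
        f"JOIN {tgt['table']} ON {assoc}.{tgt_key} = {tgt['table']}.id"
    )
    clauses, params = [], []
    for data_type, column, value in conditions:
        feature = FEATURES[data_type]
        if column not in feature["columns"]:
            raise ValueError(f"unknown column {column!r} for {data_type!r}")
        clauses.append(f"{feature['table']}.{column} = ?")
        params.append(value)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def run_demo():
    """Run the composed query on a toy in-memory database (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gene (id INTEGER PRIMARY KEY, symbol TEXT);
        CREATE TABLE pathway (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE gene2pathway (gene_id INTEGER, pathway_id INTEGER);
        INSERT INTO gene VALUES (1, 'TP53'), (2, 'BRCA1');
        INSERT INTO pathway VALUES (10, 'Apoptosis');
        INSERT INTO gene2pathway VALUES (1, 10), (2, 10);
        """
    )
    sql, params = compose_query("gene", "pathway", [("gene", "symbol", "TP53")])
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows
```

In this sketch the user-facing selection reduces to three arguments (source type, target type, conditions), while all schema knowledge lives in the metadata dictionaries, so adding a data type would only require a new metadata entry under these assumptions.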