Paper Menu >>
Journal Menu >>
![]() J. Software Engineering & Applications, 2010, 3, 629-643 doi:10.4236/jsea.2010.37073 Published Online July 2010 (http://www.SciRP.org/journal/jsea) Copyright © 2010 SciRes. JSEA 629 Using an Ontology to Help Reason about the Information Content of Data Shuang Zhu1, Junkang Feng2 1,2Database Research Group, School of Computing, University of the West of Scotland, Paisley, UK; 2Business College, Beijing Union University, Beijing, China. Email: {shuang.zhu, junkang.Feng}@uws.ac.uk Received March 25th, 2010; revised April 23rd, 2010; accepted April 25th, 2010. ABSTRACT We explore how an ontology may be used with a database to support reasoning about the “information content” of data whereby to reveal hidden information that would otherwise not derivable by using conventional database query lan- guages. Our basic ideas rest with “ontology” and the notions of “information content”. A public ontology, if available, would be the best choice for reliable domain knowledge. To enable an ontology to work with a database would involve, among others, certain mechanism thereby the two systems can form a coherent whole. This is achieved by means of the notion of “information content inclusion relation”, IIR for short. We present what an IIR is, and how IIR can be identified from both an ontology and a database, and then reasoning about them. Keywords: Ontology, Information Content of Data 1. Introduction Data mining techniques and tools are developed for finding otherwise hidden knowledge from data, and little seems to have been done on bringing “standard” domain knowledge into such a process, which we envisage would be helpful. Ontologies as domain knowledge have been used in many fields. We want to explore how an ontology may help find hidden information from data. In this paper, the focus is on how to link an ontology with a relation data- base in order to reason about informational relationships between data constructs in the database and those be- tween domain objects captured by an ontology. This may represent an innovative approach to knowledge discovery in a database. Ontology [1] as a term used in computer science was started in the 1990’s. Compared with the development of relational databases, it is a new scientific field. Ontology offers an opportunity to give an open and standardized description of database semantics with which we can substantially improve the quality and utilization of data. That is, Ontology + Database = (Standards + Explicit Seman- tics) + Database, which leads to improved data utilization and data quality [2]. Futhermore, semantic web [3] is a popular topic. Through semantic web we attempt to provide users with far better machine assistance than it is available now for their queries. Semantically annotated web pages with ontologies may assist reachers to achieve this purpose [4]. Through our work, we obtain an ontology from the DAML library, which represents some additional com- mon knowledge, and link it with an existing database. In terms of linking an ontology and a database though, in the literature, we find a few different methods in using an ontology to assist a query process. It appears that one way to achieve this is that an on- tology is invoked at the very beginning of a query proc- ess [5] as shown in Figure 1. That is, it is through re-writing a query in order to get more information. A Figure 1. Invoking an ontology in the query processing ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 630 user query is translated into a set of queries with the help of the ontology, which better fits the structure of the data source. After query optimization strategies having been applied on them, the resultant transformed queries are equivalent to the submitted ones. Although seemingly a promising approach, it is not concerned explicitly with the information content of data, in which we are particularly interested and wish to explore and make use of. Another approach that we have investigated is where an ontology is invoked in formulating a query process by Munir et al. [6]. In their approach, firstly, an ontology is generated based upon domain metadata including rela- tionships between data in a relational database. Then such an ontology is enriched with domain knowledge. Secondly, ontology statements are translated into expres- sions in the OWL-DL language. Thirdly, the expressions are transformed into relational query statements. Finally, map the domain ontology to a relational database (as shown in Figure 2). Munir et al. [6] said little about the mapping between the created ad hoc ontology and the “standard” domain ontology if any, which we suspect is done intuitively. This is however one of the topics in which we are particularly interested. We give an outline of our approach in Section 2. The key notion is informational relationship and its formali- sation IIR [7]. We describe in details how IIR may be derived from a relational database and from an ontology in Section 3, which make use of inherent and ad hoc constraints between data constructs in a database and between concepts in an ontology. We present a full ac- count on how our ideas are tested by using some imple- mentation in Oracle in Section 4. Finally we give con- cluding remarks in Section 5. 2. Outline of our Approach Our approach is to invoke an ontology when we work on a database. Namely, when a user submits a query, we do not change the query, but rather we involve the ontology in the reasoning process per se that is required for an swering the query (shown in Figure 3). Furthermore and most importantly for us, the reasoning is carried out on the basis of the notion of “information content” of data. This notion is the work of Xu, Feng and Crowe in 2008 [8], which extends substantially Dretske’s [9] definition of “information content” of a signal. In this paper, they introduce another notion called IIR, as a formulation of the notion of “information content” of data. Xu et al. [8] define IIR as follows: “Let X and Y be an event respectively, there exists an IIR, from X to Y, if every possible particular of Y is in the information con- tent of at least one particular of X”. Furthermore, they define that “Let X be a event, the information content of X, denoted I(X), is the set of events with each of which X has an information content inclusion relation”. More- over, they present a sound and complete set of inference rules (IIR rules) for reasoning about information content of data (states of affairs, or events in general). The six inference rules are cited below. 1) Sum If 12 n YX XX then i IX Yfor i = 1, …, n This rule says if it is the disjunction of a number of events, then an event X is in the information content of any of the latter. A trivial case is where X and Y above are not distinct. 2) Product If 12 , ni X XX XYX for i = 1, …, n then I XY This rule says that if an event X is the conjunction of a number of events, then any of the latter is in the informa- tion content of the former. A trivial case is where X and Y above are not distinct. 3) Transitivity If , I XYIYZ then IX Z This rule says that if the information content of an event X includes another event Y, and the information Figure 2. Ontology assisting the formulation of a query ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 631 Figure 3. Ontology enhances reasoning about the information content in a database content of Y includes yet another event Z, then the infor- mation content of X includes Z. 4) Union If ,IX YIXZ , then IXYZ This rule says that if the information content of an event X includes another two events Y and Z respectively, then the information content of X includes event Y∩Z that is the product of Y and Z. And it is in this sense that Y and Z are in a “union”. 5) Augmentation If 12 n WW WW, Z is the product of a subset of 12 ,,, n WW W then IWXZ Y This rule says that if 12 , n WW W event Z is the product of a subset of 12 ,,, , n WW W and the infor- mation content of event X includes event Y, then the in- formation content of the event W∩X formed by the product of W and X includes the event Z∩Y formed by the product of Z and Y. 6) Decomposition If IXYZthen ,IX YIXZ This rule says that if the information content of event X includes event Y∩Z that is the product of event Y and event Z, then Y and Z, as separate events, are in the in- formation content of X, respectively. In this paper, we exploit the ideas above. That is, in a way, we translate both the ontology and the database into IIR and then reason about them as a whole. Put another way, as what matters is information and IIR captures and formulates it, so we look at both an ontology and a data- base from the same perspective of IIR, and this enables the two different things to work together. The overview of our approach is illustrated in Figure 3. On the very top of Figure 3, there is a block called “information collection from the real world”. From this information, knowledge about a domain of interest in- cluding explicit business rules is arrived at. Domain knowledge is then formulated as an ontology by using software tools and languages. Two different routes are there to deal with user queries. If it is in a conventional query language then a query is handled in a normal way. The dotted line indicates this route. If that does not work, we would invoke the other route, i.e., to invoke ontology and reasoning about IIR. The second solution is the primary goal of this project, which is indicated by the solid line arrow in Figure 3 of “Customer query”→“IIR closures”→“Query results”. The only difference between these two solutions lies in the middle part of the procedure, on which we concen- trate. Within the “Integration of IIR” section, there are three different resources required to derive the “IIR”, indicated by three arrows from “ontology”, “business rule” and “database”, which are the origins of initial IIR. Then there is a reasoning mechanism implemented in PL/SQL of Oracle. The result of the reasoning is IIR ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 632 closures. Given an event A, the IIR closure of A, denoted as A+, is the set of all events that are in the information content of A, that is, if IIR(A, B), then B A+. IIR clo- sures are the basis of answering queries in our approach. Our work thus far shows that it is the additional rela- tionships between data constructs especially “entities” that are revealed and made available through using an ontology that give us more and enlarged IIR closures than those that would otherwise be based on the database alone. This is how our approach makes a difference. One of the main tasks is to derive IIR from the ontol- ogy, the database and business rule, and then integrate them as a whole. For instance, suppose that we have IIR(A, B) (meaning the information content of A in- cludes B) and IIR(C, D) from a relational database, and IIR(E, F) and IIR(G, H) from an ontology. If we also know that A and E are equivalent, then with Transitivity, we get IIR(E, B) and IIR(A, F). Consequently A+ and E+ are enlarged. We use Oracle [10] to implement this approach. An ontology in OWL [11] can be translated into relational tables [12]. Such tables do not hold data values however, if the ontology is an unpopulated one. In such a case, the involvement of an ontology results in additional objects and additional relationships between objects that are rep- resented by data in the original relational database. This way, a query that does not have exact match with data in the database may be answered. An ontology may add an additional hierarchical structure to data in the database. Furthermore, as said earlier, we use ontologies in a spe- cial type of reasoning, i.e., reasoning about the informa- tion content of data through a kind of special relationship between data items and between data items and real world objects, namely informational relationships, which is captured and formulated as IIR between events (in terms of probability theory). Thus, how to identify IIR from an ontology becomes a key factor in our approach. 3. Deriving IIR An IIR is a relationship between two states of affairs (i.e., events) such that one’s existence results in the certainty that the other exists, and without the former, the latter is not certain. Following Dretske 81 [9], we say that the latter is in the “information content” of the former. It would appear that to express IIR(X, Y) must be based on and revolved around two elements. One is two individual values (two individual parts or two sets of groups) captured as X and Y, and the other is relation- ships between X and Y. We use part of a “university” database and part of on- tology “Academic” to present how IIR can be derived from a database and an ontology. Then the IIR are rea- soned about by applying aforementioned Inference Rules. The reasoning is implemented by a program. 3.1 Deriving IIR from an Ontology According to characteristics of ontologies, these are two different sources that may help the derivation of IIR. One is concerned with relationships between “Classes” in an ontology. The other is “ObjectProperty”. 3.1.1 IIR Derived from Classes Generally, there are two different types of relationships between classes from which IIR exist. One is “subClas- sOf”, and the other “equivalentClass”. The syntax for these two in an OWL ontology is as follows: A relationship between “Class” and “subClassOf”, <owl:Class rdf:ID=”Lecturer”> <rdfs:subClassOf rdf:resource=”# Faculty”> </owl:Class> A relationship between “Class” and “equivalent- Class”, <owl:Class rdf:ID=”Teachers”> <owl:equivalentClass rdf:resource=”#Faculty”/> </owl:Class> The IIR could be derived from these two relationships thusly: IIR(Class, subClassOf), IIR(Class, equivalentClass) and IIR (equiva- lentClass, Class). As shown above, we have a relationship “Lecturer is a subclass of Faculty, and Teachers is an equivalent class to Faculty”. Hence we have IIR(Lecturer, Faculty), IIR(Teacher, Faculty), IIR(Faculty, Teacher), and IIR(Lecturer, Teacher). 3.1.2 IIR Derived from ObjectProperty There are four different types of ObjectProperty rela- tionships, which capture relationships between classes in an OWL ontology. These are: “ObjectProperty”, “sub- PropertyOf”, “equivalentProperty” and “inverseOf”. As aforementioned, to create IIR needs two classes (X and Y) from the ontology. As ObjectProperty represents a relation for connecting two classes of “domain” and “range” in an OWL ontology, an ObjectProperty already contains a set of classes, which can be expressed as “Ob- jectProperty=(‘domain’, ‘range’)”. Accordingly, we ob- tain IIR(domain, range). That is, the IIR that can be de- rived from these four types of ObjectProperty is all of the form: IIR(domain, range) Note that IIR must be of a many-to-one relationship (including one-to-one). How to handle many-to-many is to be addressed shortly. The relevant syntax of OWL is as follows: A relationship between “ObjectProperty”, <owl:ObjectProperty rdf:ID=”research_by ”> <rdfs:domain rdf:resource=”Professors”/> <rdfs:range rdf:resource=”Projects”/> </owl:ObjectProperty> ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 633 Then we have “IIR(domain, rang)”, for example, IIR(Professors, Projects). A relationship between “ObjectProperty” and “subPropertyOf”, <owl:ObjectProperty rdf:ID=”research_in”> <rdfs:domain rdf:resource=”Postgraduates”/> <rdfs:range rdf:resource=”Projects”/> </owl:ObjectProperty> <owl:ObjectProperty rdf:ID=”study_in”> <rdfs:subPropertyOf rdf:resource=”research_in”/> <rdfs:domain rdf:resource=”Postgraduates”/> <rdfs:range rdf:resource=”Projects”/> </owl:ObjectProperty> Then we have “IIR(domain, range)”, for example, IIR(Postgraduates, Projects). A relationship between “ObjectProperty” and “equivalentProperty”, <owl:ObjectProperty rdf:ID=”attend_course”> <rdfs:domain rdf:resource=”Student”/> <rdfs:range rdf:resource=”Course”/> </owl:ObjectProperty> <owl:ObjectProperty rdf:ID=”join_course ”> <owl:equivalentProperty rdf:resource=”attend_course”/> <rdfs:domain rdf:resource=”Students”/> <rdfs:range rdf:resource=”Course”/> </owl:ObjectProperty> Then we have “IIR(domain, range)”’, for example, IIR(Students, Course). A relationship between “ObjectProperty” and “in- verseOf”, <owl:ObjectProperty rdf:ID=”teache_of”> <rdfs:domain rdf:resource=”Faculty”/> <rdfs:range rdf:resource=”Course”/> </owl:ObjectProperty> <owl:ObjectProperty rdf:ID=”instruct_by”> <owl:inverseOf rdf:resource=”teaches_of”/> <rdfs:domain rdf:resource=”Course”/> <rdfs:range rdf:resource=”Faculty”/> </owl:ObjectProperty> Then we have “IIR(domain, rang)”, for example, IIR(Course, Faculty). Moreover, there are relationships between classes and “Not Null” FunctionalProperty, the syntax of which is as follows: A relationship between “Class” and “Functional- Property”, <owl:Class rdf:ID=”Course”> <owl:DatatypeProperty rdf:ID=”courseNo”> <rdfs:type rdf:resource=”&owl;FunctionalProperty”/> <!-- NOT NULL --> <rdfs:domain rdf:resource=”Course”/> <rdfs:range rdf:resource=”&xsd;short”/> </owl:DatatypeProperty> Then we have “IIR(Class, DatatypeProperty)”, for example, IIR(Course, courseNo). Furthermore, to handle a many-to-many relationship, we transform it into two many-to-one relationships. Con- sider firstly this scenario: “one course is taken by more than one students and one student takes more than one course”. This is a many-to-many relationship. We de- compose such a relationship into two many-to-one by creating a new class and then they are treated in the same way as the second method for the ObjectProperty trans- formation. We create an intermediate table and use the ObjectProperty name as the new class name (as well as the table name which will be the transformation of this class). In details, the relationship “StudentTakeCourse” between the class “student” and the class “course” is many-to-many. We create a new class CourseLearning, which contains two ObjectProperty relationships as shown below: Class (Student) DatatypeProperty (studentNo domain (Student) range (xsd: short) Functional) DatatypeProperty (studentName domain (Student) range (xsd: string)) DatatypeProperty (major domain (Student) range (xsd: string)) DatatypeProperty (enrollmentDate domain (Student) range (xsd: date)) Class (Course) DatatypeProperty (courseNo domain Course) range (xsd: short) Functional) DatatypeProperty (courseName domain (Course) range (xsd: string)) DatatypeProperty (creditHour domain (Course) range (xsd: integer)) Class (CourseLearning) ObjectProperty (takenBy domain (CourseLearning) range (Student)) ObjectProperty (inv - takenBy domain (Student) range (CourseLearning) inverseOf (takenBy)) ObjectProperty (takesCourse domain (CourseLearning) range (Course)) ObjectProperty (inv - takesCourse domain (Course) range (CourseLearning) inverseOf (takesCourse)) Accordingly the IIR obtained in this process are IIR(CourseLearning, Student) and IIR(CourseLearning, Course) (Figure 4). The paragraphs that follow illustrate details at the “in- stances” (data values) level of the above example. The original class CourseLearning is divided into two parts takenBy and takesCourse (as ObjectProperty), ei- ther of which only shows a one-way relationship. These combined however form the relationship between stu- dents and courses. Table 1 shows some instances. ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 634 Figure 4. The overview for IIR relationship of CourseLearning Table 1. Table transformation from CourseLearning CourseLearning Student Course d1 S1 C1 d2 S2 C1 d3 S1 C2 d4 S4 C1 d5 S3 C2 d6 S4 C2 3.2 Deriving IIR from a Database In a database, initial IIR (i.e., IIR that is not implied by others) come from three sources: relationships between tables, relationships between attributes and relationships between individual data values. Two different ways can be used to derive such IIR. A relationship between a “subclass” and a “super class” Figure 5 shows part of a “university” database schema. An IIR exists between two tables if one is a super class of the other, for example, IIR(postgraduate, student). Note that, IIR is a relationship between events as we said earlier. For tables, we define events as follows: if we randomly chose a tuple from a database, that the tuple happens to be in a particular table is an event. Thus the above IIR(postgraduate, student) means that the exis- tence of a tuple in table Postgraduate makes certain that a tuple that corresponds to the former exists in table Stu- dent. A many-to-one relationship between two tables Similar to deriving IIR from an “ObjectProperty” with an ontology, we obtain IIR(table 1, table 2) if they have a many-to-one relationship for example, IIR(undergraduate, course). Constraints of the relational data model and busi- ness rules on data A third source of IIR is constraints of a relational da- tabase and business rules on data, for example, IIR(table 1, PK) and ‘IIR(PK, attribute1). For Figure 5, we have IIR(courses, courseID) and IIR(courseID, courseName). The former means that the existence of a tuple of table Courses makes certain that a corresponding course ID (a value) exists. The latter means that the existence of a course ID makes certain that a corresponding course name exists. Another type of IIR is IIR(FK1, PK1), for example: ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 635 Figure 5. 6 partial EER diagram of “University” relational database IIR(courseIDoftableStudents, courseIDoftableCourses), which means that the existence of a course ID in table Students makes certain that a corresponding course ID exists in table Courses. A database is normally populated with data values. Inaddition to the above IIR on the “table” level and “at- tribute” level, there could be IIR identifiable at the “data value” level. The information that each individual “data value” holds in a relational database comes from the semantics of the attributes to which the data value belongs. This is due to the capacity of a concept’s “giving meaning to its instances” [9]. An attribute may be seen as representing a concept. For example, “student name” is seen as a con- cept. Relationships between entities in a database can be seen as “complex concepts” [9] and therefore also give meaning to data values that are instances of the relation- ships. That is, data in the relational database already hold relationships upon which there are constraints imposed. We now use a simple example to summarise how IIR may be derived on the three levels. Suppose three tables shown in Figure 6 in the “University” database. Table level According to Figure 6, table administration staff is a subclass of table staff, which gives the following IIR between these two tables: IIR(administration_staff, staff) As previously mentioned, the meaning of IIR indicates that first arguments existence results in the certainty that the other exists, and without the former, the latter is not certain. Therefore, the meaning of the two part relation- ship in this particular IIR, may be explained as: “if there is a member of administration staff then a corresponding member of staff must exist, otherwise the latter is not certain”. In this particular case, the IIR is true because any member of administration staff is a member of staff. Attribute level If an attribute “A” is in a table which includes attrib- utes “A”, “B” and “C”. Then, any combination of “A”, “B” and “C” that includes “A” would have “A” in its information content. For example, IIR(A∩B, A), which means that if an instance, say (a, b), of “A∩B” exists, then there must be an corresponding instance of “A” ex- isting - in this case, it is a. In general, this type of IIR is IIR(“a set of attributes”, “a subset of the attributes”). Using the values in Figure 6, an example is shown below. The attributes in table administration staff include sno, position, and deptNo. So the IIR are: IIR(sno∩position∩deptNo, sno) IIR(sno∩position∩deptNo, sno∩position) IIR(sno∩position∩deptNo, sno∩ deptNo) IIR(sno∩position∩deptNo, position) IIR(sno∩position∩deptNo, position∩deptNo) IIR(sno∩position∩deptNo, deptNo) These IIR are derived on the attributes level, which may be seen as based on the aforementioned “Product” Rule, i.e., if an event X is the conjunction of a number of events, then any of the latter is in the information content of the former. Data value level ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 636 DB-staff sno fnamelname sexAddress tel office s02923 John Key M 6 Lawrence St, Glasgow2384 E110 s02933 Julie Lee F 8 George St, Glasgow 2234 G203 s04885 Ann White F 18 Taylor St, Glasgow 5112 G133 s04995 SusanBrand F 28 High St, Paisley 3001 G229 s06465 Mary TregearF 7 George St, Paisley 7754 F232 s06883 DavidFord M 64 Well St, Paisley 8772 F231 DB-administration staff sno position deptNo s04885 secretary d01 s04995 accountant d03 DB-departments deptID departmentName d01 administration d03 finance Figure 6. Three tables within the “University” relational database Unlike the unpopulated ontology in use for this project, data values are a very important part in relational data- bases and it also the largest constituent of a relational database. Before explaining how to derive IIR on the data value level, let us re-cap the meaning of the terms we have been using, i.e., “random variable” “event” and “particular of an event”. A random variable is an entity used mainly to describe chance and probability in a mathematical way. An event is a set of outcomes (a subset of the sample space) to which a probability is assigned. Typically, when the sample space is finite, any subset of the sample space is an event (i.e., all elements of the power set of the sample space are defined as events) (A WorldViewer.com, 2009). Moreover, a specific event at a particular time and in a particular space is called a particular of an event. For example, consider the following situation. For an electric circuit, two random variables can be identified: one is Table 2. IIR between data values Random variables IIR iir('s04885', 'secretary') 'sno', 'position' iir('s04995', 'accountant') iir('s04885', 'd01') 'sno', 'deptNo' iir('s04995', 'd03') “the condition of the lamp”, and the other “the condition of the switch”. There are two states about the lamp: “lit” and “unlit”, and two for the switch: “closed” and “open”. There are 22 events for either. Moreover, “unlit” at 10:30 am and “lit” at 10:30 pm, are two particulars. Table 2 shows some random variables and associated for the university database given in Figure 6. In a rela- -tional database, an attribute, e.g., sno, can be seen as a random variable, and then each possible data value in this column is an event. That is, randomly picking up a tuple in this column, then its value could be any one of all those that are allowed. An attribute is therefore a variable. The variable holding a particular value is an event. 3.3 Deriving IIR from Business Rules Business rules are domain dependent, established by an individual organisation and they are ad hoc logical limi- tations on data. Business rules may be embedded in an ontology and also could be in a database. In order to de- rive IIR from these business rules, we treat an Object- Property in an OWL ontology as if it were a constraint in a database. Both could be represented as additional rela- tionships. For example, in a university, there might be a rule: “Any newly recruited lecturer must hold a PhD”. Then we have an IIR(newly recruited lecturer, PhD), which means that if someone is a newly recruited lecturer, then she/he must hold a PhD corresponding to him/her. ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 637 4. Testing our Idea In this section, we show a case study that verifies our idea. We created an ontology entitled “Academic” and a relational database “University”. The program runs in Oracle using PL/SQL. This case study elucidates the dif- ferent results when reasoning is based on the database in question only, and on both the database and the ontology integrated through IIR. There are 11 tables in the “University” database shown in Figure 7. And as we mentioned in previous sections, in the OWL ontology, a “Class” is transformed into a “table”. “subClassOf” is treated as a “Class”. A “DatatypeProp- erty” is changed to an “attribute”. An “ObjectProperty” is a relationship between classes, which are transformed into constraints upon these tables. So, we arrive at 10 tables shown in Figure 8 from the “Academic” ontology. Thus the schema of the “University” is substantially extended as shown in Figure 9, from which more IIR are derived. Figure 10 shows the 10 tables in SQL Plus of Oracle. 4.1 Original IIR and Derived IIR A few business rules are defined for this case study. They specify correspondences between the “Academic” OWL ontology and the “University” relational database. For example, there is a table in the “University” database called “staff”. There is a class in the “Academic” OWL ontology named “Person”, and “staff” is a subclass of “Person”. Other 18 business rules are concerned with equivalent classes between ontology and the database at class level. There is one on the ObjectProperty. The original IIR derived from the ontology and the ‘database and business rules are shown in Table 3. Applying the IIR inference rules listed earlier to the IIR identified, more IIR are derived. 4.2 Implementation and Results As we previously mentioned, firstly, we created tables for the relational database (named 0db.sql), the ontology (named 0onto.sql) and IIR (named 0IIR.sql), used for storing both original IIR from the ontology and the data- base and 0IIR_DB.sql is used for storing the original IIR from the relational database alone. Secondly data values are inserted into the database ta- bles and all the original IIR are entered into the IIR tables. Then all single attributes and class names and attribute names from the ontology and the database are obtained and inserted into the attributes table with duplicate com- ponents removed. Thirdly, three intermediate tables are created. The ta- bles named fo1 and fo2 store the former and the latter part of original IIR respectively, and the table t1 stores intermediate results. Fourthly, a procedure is invoked. For instance, when a user asks a question, relevant IIR closures will be looked at. They embody relevant information for the query. There is a string match function in this procedure. In order to find out the difference that the ontology makes, we compare the two results. One was obtained by using both the “Academic” OWL ontology and the “University” database, and the other obtained using the database alone. They are shown in Table 4. As Table 4 shows, the first column “Attributes” indi- cate all attributes that are extracted from “Academic” OWL ontology and the “University” database. The col- umn “Closures from both ONTO and DB” shows the IIR closures that we derive by running our prototype when the “Academic” OWL ontology is involved, The column “Closures from DB only” consists of the IIR closures that we derive by running our prototype when only the “Uni- versity” database is involved. We use the same five questions in the testing. As Ta- ble 4 shows, the attributes that are included in the results are ticked. When the database alone is used, a query for “sno” gives 12 results and a query for “matricNo” gives 7 results. When both the ontology and the database are used, the same query for “sno” gets 14 results and the same query for “matricNo’ gives 9 results. That is, two more attributes are found to be included in the respective IIR closures when the ontology is involved, which means that more information is made available due to the on- tology. 5. Conclusions We have described how an ontology may be linked with database in order to derive hidden information. A proto- type in Oracle was developed to verify our ideas. We use the notion of IIR (Information content Inclusion relation) and inference rules for IIR. We have found that if we do not invoke a relevant on- tology, a query may be unanswerable. After invoking an ontology, more relationships between objects become available, and therefore more elements can connect to one another, and as a result, a query may become answer- able, and as a result, more information can be derived from data in a database. To achieve this, a key is to be- able to identify IIR from both a database and an ontology. We have presented a way of doing so. More work need to be done in the future, for instance, to display correspondences between a query and the an- swers in a more accurate and specific way, i.e., not just listing the answers. One issue that is not aesthetic is how to achieve semantic alignment between an ontology and a database, on which we are currently working. ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 638 sno fname lname sex Address tel s02923 John Key M 6 Lawrence St, Glasgow 2384 s02933 Julie Lee F 8 George St, Glasgow 2234 s04885 Ann White F 18 Taylor St, Glasgow 5112 s04995 Susan Brand F 28 High St, Paisley 3001 s06465 Mary Tregear F 7 George St, Paisley 7754 s06883 David Ford M 64 Well St, Paisley 8772 1. DB-staff sno title school sno school pno s02923 lecturer computing s06465 business p00203 s02933 professor business s06883 engi- neering p00334 2. DB-faculty 3. DB-researcher sno position deptNo matricNopno s04885 secretary d01 ts030283 p00334 s04995 accountant d03 tm051083p00203 4. DB-administration staff 6. DB-postgraduates matricNo fname lname sex Address ts030283 Tony Shaw M 20 George St, Paisley tm051083 Tina Murphy F 16 George St, Paisley rn050385 Robert Nielson M 11 George St. Paisley hf151186 Henry Ford M 7 Well St. Paisley jw010483 John White M 5 Novar Dr, Glasgow sb210682 Susan Brand F 2 Manor Rd, Glasgow cp020381 Chris Paul M 6 Lawrence St, Glasgow 5. DB-student matricNo creditsSoFar matricNo rn050385 155 M050385 8. DB-projects 7. DB-undergraduates courseID courseName creditHour lecturerNo school c0054 Oracle Development 24 s02923 computing c0021 International Finance Planning 24 s06465 business c0154 Advanced Oracle Development 24 s02923 computing c0155 Networking Principles 16 s06883 computing c0220 Software Development 24 s06883 computing 9. DB-courses matricNo courseID results deptID depart- mentName rn050385 c0054 A d01 administration hf151186 c0021 C1 d03 finance cp020381 c0154 B1 cp020381 c0155 C2 sb210682 c0220 B2 10. DB-achievements 11. DB-departments Figure 7. Tables in the “University” database matricNo creditsSoFar rn050385 155 hf151186 65 ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 639 name age sex specialty educationDegree back- ground 1. onto-Person 2. onto-Worker 3. onto-Faculty 4. onto-Administration staff school background supervisor credits 5. onto-Assistants 7. onto-Postgraduates 8. onto-Undergraduates 6. onto-Student projectNo (PK) projectName courseNo (PK) courseName creditHour 9. onto-Projects 10. onto-Course Figure 8. Table transformations from the “Academic” ontology Figure 9. “University” database extended due to an ontology school title background department Position background studentNo (PK) studentName major address E-mail sex ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 640 Figure 10. The “Academic” ontology represented in SQL Plus ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 641 Table 3. IIR derived from the “Academic” OWL ontology and the “University” relational database IIR derived from the ‘Academic’ OWL ontology IIR derived from the ‘University’ Relational Database IIR derived from Business Rules (corresponding relations) class, subclass (17) and equivalent class (1) class, subclass (5) and equivalent class (0) class, subclass (1) and equivalent class, attributes (20) 1.IIR(Worker, Person) 1.IIR(faculty, staff) 1.IIR(staff, Person) 2.IIR(Faculty, Worker) 2.IIR(administration_staff, staff) 2.IIR(Worker, staff) 3.IIR(Professors, Faculty) 3.IIR(researcher, staff) 3.IIR(Faculty, faculty) 4.IIR(Lecturer, Faculty) 4.IIR(postgraduates, student) 4.IIR(Administration_staff, admini- stration_staff) 5.IIR(Postdoc, Faculty) 5.IIR(undergraduates, student) 5.IIR(Projects, projects) 6.IIR(Administration_staff, Worker) 6.IIR(Student, students) 7.IIR(Dean, Administration_staff) 7.IIR(Postgraduates, postgraduates) 8.IIR(Chair, Administration_staff) 8.IIR(Undergraduates, undergraduates) 9.IIR(Clerical_staff, Administration_staff) 9.IIR(Course, courses) 10.IIR(System_staff, Administration_ staff) 10.IIR(courseNo, courseID) 11.IIR(Director, Administration_staff) 11.IIR(projectNo, pno) 12.IIR(Assistants, Worker) 12.IIR(staff, Worker) 13.IIR(Reacher_assistants, Assistants) 13.IIR(faculty, Faculty) 14.IIR(Teaching_assistants, Assistants) 14.IIR(administration_staff, Ad- ministration_staff) 15.IIR(Student, Person) 15.IR(projects, Projects) 16.IIR(Postgraduates, Student) 16.IIR(students, Student) 17.IIR(Undergraduates, Student) 17.IIR(postgraduates, Postgraduates) 18.IIR(Teachers, Faculty) 18.IIR(undergraduates, Undergradu- ates) 19.IIR(courses, Course) 20.IIR(courseID, courseNo) According to the ‘University’ relational database EER diagram (Figure 8), these 5 IIR could be derived from it. 21.IIR(pno, projectNo) ObjectProperty (7) and equivalent Ob- jectProperty (2) ObjectProperty (5) and equivalent ObjectProperty (0) ObjectProperty (0) and equivalent ObjectProperty (3) 1.---teache_of IIR(Faculty, Course) 1.---work_in IIR(administration_staff, departments 1.IIR(has, teache_of) 2.---attend_course IIR(Student, Course) 2.---has IIR(faculty, courses) 2.IIR(research_in, work_on) 3.---research_by IIR(Professors, Projects) 3.---employed_on IIR(researcher, projects) 3 IIR(study_in, work_on) 4.---instruct_by IIR(Course, Faculty) 4.---work_on IIR(postgraduates, projects) 5.---research_in IIR(Postgraduates, Pro- jects) 5.---take IIR(undergraduates, courses) 6.---study_in IIR(Postgraduates, Projects) 7.---join_course IIR(Student, Course) 8. IIR(study_in, research_in) 9.IIR(join_course, attend_course) Constraints----NOT NULL (6) constraints----PK (22) constraints (0) 1.IIR(studentNo, ‘studentNo, student- Name,major,address,E-mail,sex’) 1.IIR(sno, ‘sno,fname,lname,sex,address,tel,office’) 2.IIR(courseNo, ‘courseNo, courseName, creditHour’) 2.IIR(sno, ‘sno,title,school’) 3.IIR(projectNo, ‘projectNo, projectName’) 3.IIR(sno, ‘sno,school,pno’) 4.IIR(Student, studentNo) 4.IIR(sno, ‘sno,position,deptNo’) 5.IIR(Projects, projectNo) 5.IIR(matricNo, ‘matricNo,fname,lname,sex,address’) 6.IIR(Course, courseNo) 6.IIR(matricNo, ‘matricNo,pno’) 7.IIR(matricNo, ‘matricNo,creditsSoFar’) 8.IIR(pno, ‘pno,projectName’) 9.IIR(courseID, ‘courseID, courseName, creditHour, lec- turerNo, school’) 10.IIR(‘matricNo,courseID’, ‘matricNo,courseID,results’) 11.IIR(deptID, ‘deptID,departmentName’) 12.IIR(staff, sno) 13.IIR(faculty, sno) 14.IIR(researcher, sno) 15.IIR(administration_staff, sno) 16.IIR(student, matricNo) 17.IIR(postgraduates, matricNo) 18.IIR(undergraduates, matricNo) 19.IIR(projects, pno) 20.IIR(courses, courseID) 21.IIR(achievements, ‘matricNo,courseID’) 22.IIR(departments, deptID) ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 642 Table 4. IIR closures compared Attributes Closures from both ONTO and DB (the number of results) Closures from DB only (the number of results) Worker(3) faculty(25) student(22) sno(14)matricNo(9)Worker(1)faculty(16)students(1) sno(12) matricNo(7) Person √ √ √ Worker √ √ √ √ Student √ Faculty √ √ Course √ √ deptNo √ √ √ √ sex √ √ √ √ √ √ √ school √ √ √ √ √ title √ √ √ √ position √ √ √ √ studentNo √ studentName √ major √ address √ √ √ √ √ √ √ E-mail √ courseNo √ √ courseName √ √ creditHour √ √ projectNo √ √ projectName √ √ √ √ sno √ √ √ √ √ fname √ √ √ √ √ √ lname √ √ √ √ √ √ tel √ √ √ √ office √ √ √ √ pno √ √ √ √ √ √ matricNo √ √ creditSoFar √ √ courseID √ √ lecturerNo √ √ staff √ √ √ √ faculty √ √ √ students √ √ courses √ √ √ 6. Acknowledgements This work is partly sponsored by the a grant for Distrib- uted Information Systems Research from the Carnegie Trust for Universities of Scotland, 2007, a grant for re- search on Semantic Interoperability between Distributed Digital Museums from the Carnegie Trust for Universi- ties of Scotland, 2009, and a PhD studentship of the University of the West of Scotland, UK. REFERENCES [1] T. R. Gruber, “A Translation Approach to Portable On- tologies,” Knowledge Acquisition, Vol. 5, No. 2, 1993, pp. 199-220. [2] M. West, “Database and Ontology [Online],” 2008. Wiki HomePage. http://ontolog.cim3.net/cgi-bin/wiki.pl?Data baseAndOntology [3] T. Berners-Lee, J. Hendler and O. Lassila, “The Semantic Web,” Scientific American, Vol. 284, No. 5, 2001, pp. 34- 43. [4] Z. M. Xu, S. C. Zhang and Y. S. Dong, “Mapping be- tween Relational Database Schema and OWL Ontology for Deep Annotation,” Proceedings of the 2006 IEEE/ WIC/ACM International Conference on Web Intelligence, IEEE Computer Society, 2006, pp. 548-552. http://por- tal.acm.org/citation.cfm?id=1248823.1249215&coll=AC ![]() Using an Ontology to Help Reason about the Information Content of Data Copyright © 2010 SciRes. JSEA 643 M&dl=ACM&CFID=16616566&CFTOKEN=44022427 [5] C. B. Necib and J. C. Freytag, “Query Processing Using Ontologies,” Proceedings of 17th International Confer- ence on Advanced Information Systems Engineering, Springer, Porto, Portugal, 13-17 June 2005. [6] K. Munir, M. Odeh and R. McClatchey, “Ontology As- sisted Query Reformulation Using the Semantic and As- sertion Capabilities of OWL-DL Ontologies,” Proceed- ings of the 2008 International Symposium on Database Engineering & Applications, ACM, Coimbra, Portugal, 2008, pp. 81-90. [7] J. Feng, “The ‘Information Content’ Problem of a Con- ceptual Data Schema,” Systemist, Vol. 20, No. 4, 1998, pp. 221-233. [8] K. Xu, J. Feng and M. Crowe, “Defining the Notion of ‘Information Content’ and Reasoning about it in a Data- base,” Knowledge and Information Systems, Vol. 18, No. 1, 1 January 2009, pp. 29-59 [9] F. I. Dretske, “Knowledge and the Flow of Information,” MIT Press, Cambridge, 1981. [10] K. Loney, “Oracle Database 10g: The Complete Refer- ence,” McGraw-Hill Companies, Inc., NY, 2004. [11] B. C. Grau and B. Motik, “OWL 1.1 Web Ontology Lan- guage: Model-Theoretic Semantics. W3C Working Draft [Online],” 8 January 2008. W3C. http://www.w3.org/TR/ owl11-semantics/ [12] Z. M. Xu and Y. J. Huang, “Conversion from OWL On- tology to Relational Database Schema,” College of Computer and Information Engineering, Hohai University, Nanjing, 2006. |