Semantic Enrichment of XML Schema to Transform Association Relationships in ODL Schema
1. Introduction
Extensible Markup Language (XML), a metalanguage that allows users to define their own customized markup languages, is characterized by its flexibility and extensibility. These qualities have made it a widely used format for describing and interchanging data between different systems over the Internet.
Database migration is attracting growing interest and encourages organizations to move towards new technology. Since information is a valuable resource for organizations, a mapping process must be carried out before any shift to a new technology [1] . Furthermore, the characteristics of the XML Schema standard [W3C, 2008] are supported by the ODMG 3.0 standard [2] , and its query languages are more powerful, which encourages attempts to migrate existing databases into the new environment.
Database migration is a process wherein all the components of a source database are converted to their equivalents in the environment of the target one.
The Object Definition Language (ODL) is designed to support the semantic constructs of the ODMG object model. It is used to define the schema of an ODMG-compliant database independently of any programming language.
In this article, we present techniques for enriching XML Schema. Our goal is to introduce the richness of the ODL formalism in order to facilitate the migration of XML schemas to ODL schemas. The ODL formalism is introduced into XML Schema by extension. To do this, we proceed in several stages:
・ Defining the concepts supported by XML Schema;
・ Defining XML Schema extensions to take into account all the specificities of ODL objects. These extensions exploit the extension mechanisms supported by XML Schema to remain compliant with the W3C standard.
This paper gives a brief introduction to XML Schema and ODL and is organized as follows. Section 2 reviews the most closely related work. Section 3 presents the conceptual enrichment of XML Schema and explains how XML and ODL implement association relationships. Section 4 describes several rules for transforming an enriched XML schema into an ODL schema, focusing on association relationships. Section 5 presents the processing steps and gives an example for each type of association relationship. Section 6 evaluates our approach by comparing query results, and Section 7 concludes the paper.
2. Related Work
Many works address the mapping between XML and object-oriented databases. The work in [3] discusses the modeling of XML and the need for transformation. It presents a number of generic rules for transforming an object-oriented conceptual model into XML Schema, with emphasis on inheritance and aggregation relationships.
Most existing work focuses on mapping an object database into an XML database for the interoperability of databases. In [4] , the schema translation process is supplied with a UML class model. That paper presents a set of rules to translate a simple database schema specified in ODL into XML Schema, focusing on the transformation of (1:M) and (M:M) association relationships.
The paper in [5] covers XML modeling and the need for transformation. It presents a number of steps for transforming an XML schema into an ORDB, focusing on association relationships. The different types of these conceptual relationships (one-to-one, one-to-many and many-to-many) and their transformations are the main topics discussed.
The work in [6] addresses the mapping of the contents of an existing object-oriented database into XML using an object graph; the reverse process is also proposed to store XML data in an object-oriented database. The author uses an object graph for the transformation, but the approach does not cover all possible types of relationships.
3. Conceptual Enrichment of XML Schema
Associations complete the modeling of object states. The ODMG model supports bidirectional binary associations with cardinality (1:1), (1:N) or (N:M). An association from A to B defines two opposite traversal paths, A -> B and B -> A. Each path must be declared in the ODL definition of its source object type with the relationship keyword.
A class in ODL is specified using the class keyword, an attribute is specified using the attribute keyword, and a relationship is specified using the relationship keyword. Although these concepts can be represented implicitly in an XML schema, we propose an enrichment that embodies them explicitly in XML Schema.
To do this, we use the extension mechanism provided by XML Schema. The addition incorporates the ODL-specific concepts and highlights the relationships between concepts. In our mapping process, these new concepts help to preserve the semantics of relationships.
3.1. ODL
A Class in ODL is defined as follows:
class <name>
(extent <names> key <attribute>…)
{
<list of elements = attributes, relationships, methods>
};
A relationship between a class C1 and class C2 is defined by attributes in both classes of types according to the cardinality of the relationship [7] :
・ One-to-one: an attribute of type C2 * is included in C1 and another one of type C1 * is included in C2.
・ One-to-many: an attribute of type collection<C2 *> (a set or a bag) is contained in C1, and C2 contains an attribute of type C1 *.
・ Many-to-many: C1 contains an attribute of type collection< C2 *> and vice-versa.
A relationship must be specified in both directions. In ODL, the inverse keyword designates the relationship in the opposite direction. For example, if we delete an object of C1, its links to all C2 objects are automatically dereferenced, and the links from those C2 objects back to the C1 object are dereferenced as well. If we relate an object of C1 to a set of C2 objects, the reverse links are created automatically; in other words, a link is created from each of the C2 objects to the C1 object.
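The automatic maintenance of inverse links described above can be illustrated with a minimal Python sketch. It is not ODL and not the behavior of any particular ODMG product; the class names `Department` and `Employee` and the method names are hypothetical, chosen only to play the roles of C1 and C2 in a one-to-many relationship.

```python
class Department:
    """Plays the role of C1: holds a set of Employee references."""

    def __init__(self, name):
        self.name = name
        self.has = set()

    def add_employee(self, employee):
        # Detach the employee from a previous department, if any.
        if employee.works_in is not None:
            employee.works_in.has.discard(employee)
        # Creating the forward link also creates the inverse link.
        self.has.add(employee)
        employee.works_in = self

    def delete(self):
        # Deleting the C1 object dereferences every inverse link.
        for employee in self.has:
            employee.works_in = None
        self.has.clear()


class Employee:
    """Plays the role of C2: holds a single Department reference."""

    def __init__(self, name):
        self.name = name
        self.works_in = None


sales = Department("Sales")
alice, bob = Employee("Alice"), Employee("Bob")
sales.add_employee(alice)
sales.add_employee(bob)
```

In an ODMG-compliant database this bookkeeping is performed by the system once the inverse clause is declared; the sketch only makes the two directions of the link visible.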
3.2. XML Schema
XML Schema represents integrity constraints using XPath expressions [8] . It is possible to specify constraints that correspond to unique values, primary keys and relationships in ODBS.
The unique tag defines a uniqueness constraint, the key tag defines a primary key, and the keyref tag defines a key reference, whose refer attribute names the key or unique constraint that the referencing attribute or element corresponds to. The selector XPath expression defines the scope of a constraint, and the field XPath expression identifies the elements or attributes that make up the constraint.
For two complex types CT1 and CT2 participating in an association relationship (see Figure 1), the XML schema is defined as follows:
Figure 1. Example of association relationship.
<xsd:element name="CT1">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="EL1" type="xsd:EL1_type"/>
      ...
    </xsd:sequence>
    <xsd:attribute name="attr1" type="xsd:att1_type" use="required"/>
  </xsd:complexType>
</xsd:element>
<xsd:element name="CT2">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="EL2" type="xsd:EL2_type"/>
      ...
    </xsd:sequence>
    <xsd:attribute name="attr2" type="xsd:att2_type" use="required"/>
  </xsd:complexType>
</xsd:element>
<xsd:key name="CT1_K">
  <xsd:selector xpath=".//CT1"/>
  <xsd:field xpath="@attr1"/>
</xsd:key>
<xsd:key name="CT2_K">
  <xsd:selector xpath=".//CT2"/>
  <xsd:field xpath="@attr2"/>
</xsd:key>
<xsd:keyref name="CT1_CT2_Ref" refer="CT1_K">
  <xsd:selector xpath="CT2"/>
  <xsd:field xpath="@attr1"/>
</xsd:keyref>
<xsd:keyref name="CT2_CT1_Ref" refer="CT1_K">
  <xsd:selector xpath="CT1"/>
  <xsd:field xpath="@attr1"/>
</xsd:keyref>
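As a hedged illustration of how such key and keyref declarations can be read programmatically, the following Python sketch collects them with the standard `xml.etree.ElementTree` module. The wrapper element `root`, the function name `read_constraints` and the shape of the returned dictionary are assumptions made for this example; they are not part of the approach described in this paper.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# A small fragment holding one key and one keyref; "root" is a made-up wrapper.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="root">
    <xsd:key name="CT1_K">
      <xsd:selector xpath=".//CT1"/>
      <xsd:field xpath="@attr1"/>
    </xsd:key>
    <xsd:keyref name="CT1_CT2_Ref" refer="CT1_K">
      <xsd:selector xpath="CT2"/>
      <xsd:field xpath="@attr1"/>
    </xsd:keyref>
  </xsd:element>
</xsd:schema>
"""


def read_constraints(xsd_text):
    """Return the key and keyref declarations found in an XSD document."""
    root = ET.fromstring(xsd_text)
    found = {"key": {}, "keyref": {}}
    for kind in found:
        for node in root.iter(XSD + kind):
            entry = {
                # selector: the scope of the constraint
                "selector": node.find(XSD + "selector").get("xpath"),
                # field(s): the elements or attributes forming the constraint
                "fields": [f.get("xpath") for f in node.findall(XSD + "field")],
            }
            if kind == "keyref":
                entry["refer"] = node.get("refer")
            found[kind][node.get("name")] = entry
    return found


constraints = read_constraints(SCHEMA)
```

Each keyref entry carries the name of the key it refers to, which is the information later used to recognize a relationship between two complex types.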
3.3. Enrichment of semantics in the XML schema
In this section, we conceptually enrich XML Schema in order to establish correspondences between the two technologies, XML and ODL. These correspondences allow us to specify mappings between the two schemas.
The semantic enrichment of an XML schema involves extracting its data semantics, which are enriched and converted into a canonical data model (CDM). To do this, we apply the approach in [9] . The process starts by extracting the basic metadata of an existing XML schema, including relation types, element|attribute properties (i.e., names, types, occurrence, required or not), keys (K) and keyrefs (KR). We assume that data dependencies are represented by keys and keyrefs: each keyref tag holds a reference to a key of a complex type, which can be considered a value reference.
We extend the semantics of the XML schema above as follows:
We add to each of the two complex types (CT1 and CT2) an element that we call “elementrole” (ele_rol). Its name expresses the relationship role, its type is the same as that of the key element of the other complex type, and its cardinality is the one expressed in the relationship. For example, we add to the element CT2 an element “ele_rol” with the cardinality shown next to CT2 (*):
<xsd:element name="ele_rol" type="xsd:attr1_type" maxOccurs="unbounded"/>
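The enrichment step can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper `add_role_element` is hypothetical, the role element is appended at the end of the sequence, and `xsd:string` stands in for the type of the other complex type's key.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xsd", XSD)

SCHEMA = f"""
<xsd:schema xmlns:xsd="{XSD}">
  <xsd:element name="CT2">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="EL2" type="xsd:string"/>
      </xsd:sequence>
      <xsd:attribute name="attr2" type="xsd:string" use="required"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""


def add_role_element(schema_root, owner, role_name, key_type, max_occurs):
    """Append a role element to the sequence of the element named `owner`."""
    for element in schema_root.iter(f"{{{XSD}}}element"):
        if element.get("name") != owner:
            continue
        sequence = element.find(f"{{{XSD}}}complexType/{{{XSD}}}sequence")
        if sequence is None:
            raise ValueError(f"{owner!r} has no complexType/sequence")
        role = ET.SubElement(sequence, f"{{{XSD}}}element")
        role.set("name", role_name)      # expresses the relationship role
        role.set("type", key_type)       # type of the other type's key
        role.set("maxOccurs", max_occurs)  # cardinality of the relationship
        return role
    raise ValueError(f"no element named {owner!r}")


root = ET.fromstring(SCHEMA)
role = add_role_element(root, "CT2", "ele_rol", "xsd:string", "unbounded")
print(ET.tostring(role, encoding="unicode"))
```

Because the addition is an ordinary element declaration, the enriched schema remains a regular XML Schema document.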
4. Rules of mapping from enriched XML schema to ODL schema
Now we present the mapping rules between XML Schema elements and ODL, including concepts describing dependency relationships.
Rule 1: An XML element (<xsd:element>) with a complex structure, or a global complex type (<xsd:complexType>), is transformed into a class in ODL with the same name.
Rule 2: Simple XML elements (<xsd:element>) with basic data types (string, short, date, float, etc.) that are enclosed in an <xsd:sequence> are converted into attributes of the ODL class resulting from Rule 1. XML attributes (<xsd:attribute>) are also converted into attributes of the corresponding class, with the same name and the same type. The exception is the “elementrole”, which is transformed into a relationship in the corresponding class. The minOccurs and maxOccurs attributes carried by the element determine the cardinality of the association (see Rule 4).
Rule 3: Each field XPath in the element key is transformed into key attribute of the corresponding class.
Rule 4: Referring to the CDM, each relationship is written as follows:
relationship set|bag<CTname_referred_to> elan inverse CTname_referred_to::field_xpath_of_CTname_referred_to;
Depending on the cardinality, the referred-to type is wrapped in set or bag, or used directly.
Definition of CDM: The CDM is defined as a set of complex types: CDM := {CT | CT := ⟨ctn, AEcdm, Rel⟩}, where each complex type CT has a name ctn, a set of elements|attributes AEcdm, and a set of relationships Rel.
Attributes (AEcdm): A complex type CT has a set of elements|attributes AEcdm. AEcdm := {ela | ela := ⟨elan, t, tag⟩}, where each element|attribute ela has a name elan, a data type t and a tag, which classifies ela as a non-key “NK”, a key “K”, or a relationship “R”.
Relationships (Rel): A complex type CT has a set of relationships Rel. Each relationship rel ∈ Rel between CT1 and a complex type CT2 is defined in CT1 to represent an association. Rel := {rel | rel := ⟨CTn_referred_to, Occ, F_xpath_of_CTn_referred_to⟩}, where CTn_referred_to is the name of CT2, Occ := minOccurs..maxOccurs is the cardinality constraint of rel from the CT1 side, and F_xpath_of_CTn_referred_to denotes the elementrole name representing the inverse relationship from the CT2 side.
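The CDM definition and Rule 4 can be sketched in Python as follows. This is a minimal illustration, not the prototype's code: the class and function names are hypothetical, the cardinality is encoded as the string "minOccurs..maxOccurs", and `set` is chosen over `bag` by default as an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ElementAttr:
    name: str  # elan
    type: str  # t
    tag: str   # "K" (key), "NK" (non-key) or "R" (relationship)


@dataclass
class Relationship:
    referred_to: str   # CTn_referred_to
    occ: str           # Occ, written "minOccurs..maxOccurs"
    inverse_role: str  # F_xpath_of_CTn_referred_to


@dataclass
class ComplexType:
    name: str                                           # ctn
    elements: list = field(default_factory=list)        # AEcdm
    relationships: list = field(default_factory=list)   # Rel


def relationship_line(role_name, rel, collection="set"):
    """Render one relationship in the form given by Rule 4."""
    upper = rel.occ.split("..")[1]
    many = upper == "unbounded" or int(upper) > 1
    # A collection type is used only when the upper bound exceeds one.
    target = f"{collection}<{rel.referred_to}>" if many else rel.referred_to
    return (f"relationship {target} {role_name} "
            f"inverse {rel.referred_to}::{rel.inverse_role};")


department = ComplexType(
    "department",
    [ElementAttr("departmentId", "xsd:string", "K"),
     ElementAttr("has", "xsd:string", "R")],
    [Relationship("employee", "1..unbounded", "worksin")],
)
print(relationship_line("has", department.relationships[0]))
```

The role name is passed separately because, in the definition above, it is stored in AEcdm with tag R rather than inside the relationship triple.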
Since we are focusing on association relationship, we don’t discuss the relation type.
5. Application Mapping of Association Relationship from an Enriched XML Schema to ODL
In the previous section we specified the mappings of the basic elements of an XML schema. In this section, we apply these mappings to an enriched XML schema. An association expresses a bidirectional semantic connection between two types. Each instance takes part in a relationship with others, which may be one-to-one, one-to-many or many-to-many. By default, an association is navigable in both directions [10] .
An association is described by three properties. The active verb specifies the main reading direction of the association; roles specify the function of a type in a given association; cardinality specifies the number of instances that participate in a relationship.
5.1. One-to-one Association: (rarely applied in practice)
In this section we use an example of a (1:1) relationship between professor and class: we assume that each professor teaches at most one class and vice versa. Keep in mind that this kind of relationship is not very common (see Figure 2).
The steps below explain how to transform the one-to-one association relationship from enriched XML Schema to ODL.
Enriched XML Schema for one-to-one relationship:
<xsd:element name="professor">
  <xsd:complexType>
    <xsd:sequence>
      ....
      <xsd:element name="teaches" type="xsd:string" minOccurs="0" maxOccurs="1"/>
    </xsd:sequence>
    <xsd:attribute name="professorId" type="xsd:string" use="required"/>
  </xsd:complexType>
</xsd:element>
<xsd:element name="class">
  <xsd:complexType>
    <xsd:sequence>
      ....
      <xsd:element name="teachedby" type="xsd:string" minOccurs="0" maxOccurs="1"/>
    </xsd:sequence>
    <xsd:attribute name="classId" type="xsd:string" use="required"/>
  </xsd:complexType>
</xsd:element>
<xsd:key name="professor_K">
  <xsd:selector xpath=".//professor"/>
  <xsd:field xpath="@professorId"/>
</xsd:key>
<xsd:key name="class_K">
  <xsd:selector xpath=".//class"/>
  <xsd:field xpath="@classId"/>
</xsd:key>
<xsd:keyref name="professorRefclass" refer="class_K">
  <xsd:selector xpath=".//professor"/>
  <xsd:field xpath="teaches"/>
</xsd:keyref>
<xsd:keyref name="classRefprofessor" refer="professor_K">
  <xsd:selector xpath=".//class"/>
  <xsd:field xpath="teachedby"/>
</xsd:keyref>
5.2. One-to-many Association
Let’s consider the example below: a department may have one or many employees. But an employee works in only one department (see figure 3).
Enriched XML Schema for one-to-many relationship:
<xsd:element name="department">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="departmentId" type="xsd:string"/>
      ....
      <xsd:element name="has" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:element name="employee">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="employeeId" type="xsd:string"/>
      ....
      <xsd:element name="worksin" type="xsd:string" maxOccurs="1"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:key name="department_K">
  <xsd:selector xpath=".//department"/>
  <xsd:field xpath="departmentId"/>
</xsd:key>
<xsd:key name="employee_K">
  <xsd:selector xpath=".//employee"/>
  <xsd:field xpath="employeeId"/>
</xsd:key>
<xsd:keyref name="departmentRefemployee" refer="employee_K">
  <xsd:selector xpath=".//department"/>
  <xsd:field xpath="has"/>
</xsd:keyref>
<xsd:keyref name="employeeRefdepartment" refer="department_K">
  <xsd:selector xpath=".//employee"/>
  <xsd:field xpath="worksin"/>
</xsd:keyref>
5.3. Many-to-many Association
In the example below, a student is taught by one or more professors, and the same professor teaches many students (see Figure 4).
Enriched XML Schema for many-to-many relationship:
<xsd:element name="professor">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="professorId" type="xsd:string"/>
      ....
      <xsd:element name="teaches" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:element name="student">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="studentId" type="xsd:string"/>
      ....
      <xsd:element name="teachedby" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:key name="professor_K">
  <xsd:selector xpath=".//professor"/>
  <xsd:field xpath="professorId"/>
</xsd:key>
<xsd:key name="student_K">
  <xsd:selector xpath=".//student"/>
  <xsd:field xpath="studentId"/>
</xsd:key>
<xsd:keyref name="professorRefstudent" refer="student_K">
  <xsd:selector xpath=".//professor"/>
  <xsd:field xpath="teaches"/>
</xsd:keyref>
<xsd:keyref name="studentRefprofessor" refer="professor_K">
  <xsd:selector xpath=".//student"/>
  <xsd:field xpath="teachedby"/>
</xsd:keyref>
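The three examples above differ mainly in the maxOccurs values of their two role elements. As a hedged illustration, the kind of association can be derived from that pair of values as in the Python sketch below; the function names are hypothetical, and any upper bound greater than one is treated as "many".

```python
def is_many(max_occurs):
    """True when a role element may occur more than once."""
    return max_occurs == "unbounded" or int(max_occurs) > 1


def association_kind(max_occurs_a, max_occurs_b):
    """Classify an association from the maxOccurs of its two role elements."""
    many_a, many_b = is_many(max_occurs_a), is_many(max_occurs_b)
    if many_a and many_b:
        return "many-to-many"
    if many_a or many_b:
        return "one-to-many"
    return "one-to-one"


# Role pairs taken from the three examples of this section.
print(association_kind("1", "1"))                  # teaches / teachedby (5.1)
print(association_kind("unbounded", "1"))          # has / worksin (5.2)
print(association_kind("unbounded", "unbounded"))  # teaches / teachedby (5.3)
```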
5.4. Generation of the Canonical Data Model (CDM) of an XML Schema
Consider the XML schema of the example in Section 5.2 above; the corresponding CDM is shown in Table 1.
Table 1. CDM of XML schema for one-to-many association relationship.
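As a hedged sketch of how a CDM could be derived from an enriched schema, the Python code below applies the definitions of Section 4 to the one-to-many example. It is not the prototype's implementation: the wrapper element `company`, the function names and the dictionary layout are assumptions, and the printed structure is what this sketch computes, not a reproduction of Table 1.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# One-to-many example, wrapped in a made-up "company" element.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="company">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="department" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="departmentId" type="xsd:string"/>
              <xsd:element name="has" type="xsd:string" maxOccurs="unbounded"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="employee" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="employeeId" type="xsd:string"/>
              <xsd:element name="worksin" type="xsd:string" maxOccurs="1"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
    <xsd:key name="department_K">
      <xsd:selector xpath=".//department"/>
      <xsd:field xpath="departmentId"/>
    </xsd:key>
    <xsd:key name="employee_K">
      <xsd:selector xpath=".//employee"/>
      <xsd:field xpath="employeeId"/>
    </xsd:key>
    <xsd:keyref name="departmentRefemployee" refer="employee_K">
      <xsd:selector xpath=".//department"/>
      <xsd:field xpath="has"/>
    </xsd:keyref>
    <xsd:keyref name="employeeRefdepartment" refer="department_K">
      <xsd:selector xpath=".//employee"/>
      <xsd:field xpath="worksin"/>
    </xsd:keyref>
  </xsd:element>
</xsd:schema>
"""


def constraint(node):
    """Return (complex type name, field name) for a key or keyref node."""
    owner = node.find(XSD + "selector").get("xpath").split("/")[-1]
    return owner, node.find(XSD + "field").get("xpath")


def build_cdm(xsd_text):
    root = ET.fromstring(xsd_text)
    keys = {k.get("name"): constraint(k) for k in root.iter(XSD + "key")}
    # (owner type, role element) -> referred-to type, resolved through the key
    refs = {constraint(r): keys[r.get("refer")][0]
            for r in root.iter(XSD + "keyref")}
    keyed_types = {owner for owner, _ in keys.values()}
    cdm = {}
    for element in root.iter(XSD + "element"):
        name = element.get("name")
        sequence = element.find(XSD + "complexType/" + XSD + "sequence")
        if sequence is None or name not in keyed_types:
            continue  # simple element, or the wrapper element
        ct = cdm[name] = {"AEcdm": [], "Rel": []}
        for child in sequence:
            child_name, tag = child.get("name"), "NK"
            if (name, child_name) in keys.values():
                tag = "K"
            elif (name, child_name) in refs:
                tag = "R"
                target = refs[(name, child_name)]
                occ = child.get("minOccurs", "1") + ".." + child.get("maxOccurs", "1")
                # Inverse role: the target's role element that refers back here.
                inverse = next(f for (o, f), t in refs.items()
                               if o == target and t == name)
                ct["Rel"].append((target, occ, inverse))
            ct["AEcdm"].append((child_name, child.get("type"), tag))
    return cdm


cdm = build_cdm(SCHEMA)
print(cdm)
```

The sketch relies on the convention used throughout Section 5 that a selector ends with the name of the complex type and that each constraint has a single field.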
5.5. Algorithm for Schema Translation
Figure 5 shows the algorithm map XML_ODL, which maps an XML schema into an ODL schema. The algorithm reads the XML complex types one by one and maps each of them to ODL. In line 4, the complex type is mapped to a class; the algorithm then maps all its elements to attributes of the class and forms the relationships with other classes. Specifically, if the relationship is one-to-one, in line 13 the algorithm adds the relationship to the corresponding class as follows:
If the relationship is one-to-many, in line 17 the algorithm adds the relationship as:
Figure 5. An algorithm for mapping XML schema into ODL.
The output ODL schema for the one-to-many association relationship example is shown in Figure 6.
Figure 6. Sample output ODL schema of one-to-many association relationship example.
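Since the algorithm of Figure 5 is not reproduced in the text, the following Python code is only a sketch of a translation that follows Rules 1 to 4, not the algorithm itself. The CDM literal, the type table `ODL_TYPES`, the pluralized extent names and the choice of `set` for collections are all assumptions made for this illustration.

```python
# Assumed correspondence between XSD and ODL type names.
ODL_TYPES = {"xsd:string": "string", "xsd:short": "short",
             "xsd:float": "float", "xsd:date": "date"}

# Hand-written CDM for the one-to-many example (department / employee).
CDM = {
    "department": {
        "AEcdm": [("departmentId", "xsd:string", "K"),
                  ("has", "xsd:string", "R")],
        "Rel": [("employee", "1..unbounded", "worksin")],
    },
    "employee": {
        "AEcdm": [("employeeId", "xsd:string", "K"),
                  ("worksin", "xsd:string", "R")],
        "Rel": [("department", "1..1", "has")],
    },
}


def is_many(occ):
    upper = occ.split("..")[1]
    return upper == "unbounded" or int(upper) > 1


def map_cdm_to_odl(cdm):
    lines = []
    for ct_name, ct in cdm.items():
        keys = [name for name, _, tag in ct["AEcdm"] if tag == "K"]
        key_clause = f" key {', '.join(keys)}" if keys else ""
        # Rule 1: complex type -> class (the extent name is a made-up plural).
        lines.append(f"class {ct_name} (extent {ct_name}s{key_clause})")
        lines.append("{")
        # Rules 2 and 3: key and non-key elements become attributes.
        for name, xsd_type, tag in ct["AEcdm"]:
            if tag != "R":
                lines.append(f"    attribute {ODL_TYPES.get(xsd_type, 'string')} {name};")
        # Rule 4: role elements become relationships; roles and Rel share one order.
        roles = [name for name, _, tag in ct["AEcdm"] if tag == "R"]
        for role, (target, occ, inverse) in zip(roles, ct["Rel"]):
            target_type = f"set<{target}>" if is_many(occ) else target
            lines.append(f"    relationship {target_type} {role} inverse {target}::{inverse};")
        lines.append("};")
    return "\n".join(lines)


odl = map_cdm_to_odl(CDM)
print(odl)
```

For a one-to-one relationship both sides take the direct form `relationship <type> <role> inverse ...`, while in the one-to-many case only the "many" side is wrapped in a collection.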
6. Experimental Study
To demonstrate the validity of our method, we developed a prototype that realizes the algorithm above, implemented using Java and EyeDB. To evaluate our approach, we examined the differences between the source XML schema and the ODL schema generated by the prototype, and we compared the query results provided by OQL in EyeDB with those of XQuery in Stylus Studio. The queries returned the same results: the source XML database is transformed into the target object database without loss of data.
This section presents two queries applied to the XML schema shown in Section 5.3 and to the equivalent ODL schema generated by the prototype. Table 2 shows the description and the result of each query.
Table 2. Description and result of each query.
7. Conclusion
In this article, we presented a method for translating an XML schema into an ODL schema, focusing on the mapping of association relationships. We extended the semantics of XML Schema, and the proposed method describes a process that goes from the conceptual model to the implementation in classes. With this method, the results preserve the semantics specified at the conceptual level, in both XML and ODL. Our future work will develop a more complete mapping that takes into account the concepts not discussed in this paper, such as inheritance relationships.