A Stable and Consistent Document Model Suitable for Asynchronous Cooperative Edition

Complex structured documents can be intentionally represented as trees decorated with attributes. Attributes relate to semantic aspects, which can be treated separately from the purely structural aspects that interest us here, so we ignore them. In the context of cooperative editing, legal structures are characterized by a document model (an abstract grammar), and each intentional representation can be manipulated independently, and possibly asynchronously, by several co-authors through various editing tools that operate on its “partial replicas”. To edit a partial replica without synchronization, a co-author must have a local syntactic document model that constrains him to keep the local representation he handles minimally consistent with the global model. This consistency is synonymous with the existence of one or more (global) intentional representations conforming to the global model that have the current local representation as their partial replica. The purpose of this paper is to present grammatical structures: grammars that make it possible not only to specify a (global) model for documents edited cooperatively, but also to derive automatically, via a so-called projection operation, consistent (local) models for each co-author involved in the cooperative editing. We also show some properties satisfied by these grammatical structures.


Introduction
With the rise of XML technologies and Web services, structured documents have become important tools for the publication and exchange of information between applications that are most often heterogeneous and remote. The ever-increasing power of communication networks, in terms of throughput and security as well as efficiency, has revolutionized the way such documents are edited. Indeed, to the classical model of an author editing his document locally and autonomously has been added (asynchronous) cooperative editing, in which several authors located on geographically distant sites coordinate to edit the same structured document asynchronously (Figure 1).
Cooperative structured editing is a research field related to computer-supported cooperative work (CSCW) [1], which Baecker et al. [2] defined as a set of activities performed on computers and coordinated by a group of collaborating entities. Structured cooperative editing is hierarchically organized group editing work that operates according to a schedule involving deadlines and task sharing (coordination). When it is asynchronous, each co-author participating in the edition has, on his site, a replica of the structured document (intentionally represented as an abstract tree) on which he acts. For reasons of safety (footnote 1), efficiency (footnote 2) and so on, it is generally preferable that this copy be only a partial replica of the global document, i.e. that it consist only of the parts of the document containing information relevant to the co-author concerned. In this case, in order to minimize the inconsistencies that can be introduced into the partial replica when it is edited locally, and to ensure that at the end of the edition (or at specific times) the different contributions can be structurally merged [3] [4], each co-author must have on his local editing site a local document model (a grammar) that is consistent with the global model. Intuitively, a local document model is consistent with the global model when any partial document t' that conforms to it is the partial replica of at least one document t that conforms to the global model.
The central issue addressed in this paper can be presented simply by means of an example of an unsynchronized cooperative structured editing process (Figure 1). One can easily imagine an editing process in which several authors work together to produce a multidisciplinary book such that each of them, according to his own field of expertise, contributes to more or less disjoint parts of the same document.
It may be useful for these authors to specify beforehand (perhaps together) the overall hierarchical structure of the document via a grammatical model, which we hereafter call the global model of the document. From it, a dedicated (local) model, hereafter called a local model, is derived for each co-author. This local model can be regarded as a "view" on the global model; it is obtained by means of a projection operation performed on the global model, which retains only the syntactic categories of demonstrated interest to the author concerned.
For example, Figure 1 presents an overview of a cooperative edition distributed over three sites. Site 1 is dedicated to the edition and the merging of the (global) document according to the (global) document model $G$ hosted on it. Two projections of $G$ are made to obtain $G_1$ and $G_2$, the local models hosted by sites 2 and 3 respectively and used to syntactically constrain the desynchronized edition of the partial replicas of the global document on those sites. Note that documents edited on these sites can be saved (serialized) and then restored by parsing. The overall document is subsequently obtained on site 1 by performing a consistent expansion (footnote 3) of the various documents edited on sites 2 and 3.

Footnote 1. For a given co-author, some parts of the document may contain sensitive information. It is preferable that he is not even informed of the presence of this information in the document. As we shall see later, the projection operation resolves this confidentiality problem.

Footnote 2. Handled documents pass through the network. They circulate all the more quickly as their size is reduced. For this reason, the replica of the document sent to a co-author must contain only the parts that are of obvious interest to him: it is a partial replica. Here too, the projection operation addresses this concern.
The purpose of this paper is to propose a generic document model that makes it possible to specify syntactically both the global model and the derived local models, which are consistent with the global model.
To this end, we propose grammatical structures (a subset of the extended context free grammars) as well as a projection operation which, from a grammatical structure (global model) and a set of syntactic categories relevant to a given co-author, derives a local grammatical structure dedicated to him.
Organization of the manuscript: Section 2 presents some concepts and definitions used thereafter. Section 3 presents grammatical structures, the projection algorithm on grammatical structures and some properties of this model. Section 4 is devoted to the conclusion.

Extended Context Free Grammars, Documents and Compliances
It is usual to represent the abstract structure of a document by a tree (derivation tree) and its model by an extended context free grammar (ECFG). In an ECFG, the right member of each production is a regular expression, as opposed to the sequence of terminals and non-terminals that constitutes the right hand side of productions in a classical context free grammar. More formally, an extended context free grammar $\mathbb{G} = (\mathcal{S}, \mathcal{P})$ is given by a finite set of syntactic categories $\mathcal{S}$ and a finite set of production rules $\mathcal{P}$ written as $s \to P_s$, such that $s \in \mathcal{S}$ and $P_s$ is a regular expression defined on $\mathcal{S}$.
Footnote 3. The problem of a posteriori re-synchronization (consistent expansion) is presented and resolved in [4], where many of the basic definitions reused here can also be found.
The dependency graph $D_{\mathbb{G}}$ of grammar $\mathbb{G}$ is a graph whose set of node tags is included in $\mathcal{S}$ and such that, for every rule $s \to P_s$ in $\mathcal{P}$, there is an arrow from $s$ to $b$ for every $b$ occurring in a word of the language denoted by $P_s$, written $L(P_s)$. An ECFG is said to be non recursive if and only if $D_{\mathbb{G}}$ is acyclic, and recursive otherwise.
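The dependency graph and the recursivity test can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the paper: a grammar is a dictionary mapping each category to a regular expression written as a string, category names are identifiers, and a symbol is taken to occur in some word of the language as soon as it occurs in the expression (true for expressions without an empty-set operator). The names `dependency_graph` and `is_recursive` and the sample grammars are hypothetical.

```python
import re

def dependency_graph(productions):
    """Map each category s to the set of categories occurring in its body P_s."""
    graph = {s: set() for s in productions}
    for s, regex in productions.items():
        # Assumption: every identifier in the expression is a category name.
        for b in re.findall(r"[A-Za-z_]\w*", regex):
            graph[s].add(b)
            graph.setdefault(b, set())
    return graph

def is_recursive(graph):
    """Return True if the graph has a cycle (depth-first search with colours)."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in graph}

    def visit(n):
        colour[n] = GREY
        for m in graph[n]:
            # A grey successor is on the current DFS path: a cycle is closed.
            if colour[m] == GREY or (colour[m] == WHITE and visit(m)):
                return True
        colour[n] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)

# Hypothetical grammars, for illustration only.
book = {"Book": "Chapter*", "Chapter": "Title Section*", "Section": "Title Paragraph*"}
nested = {"List": "(Item List)?"}
```

With these samples, `is_recursive(dependency_graph(book))` is false, while the `nested` grammar is recursive because `List` depends on itself.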
A document $t$ conforms to a grammar $\mathbb{G}$, and we write $t \therefore \mathbb{G}$, if it is a derivation tree of this grammar: this is the case if, for any node $n$ of $t$ labeled $s \in \mathcal{S}$ with children nodes $n_1, \ldots, n_m$ labeled respectively $s_1, \ldots, s_m$, $s_i \in \mathcal{S}$, the word $s_1 \cdots s_m$ belongs to $L(P_s)$.
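The conformance relation can be sketched in Python under two simplifying assumptions that are ours, not the paper's: every syntactic category is a single letter, so that a rule body can be used directly as a Python regular expression over the word of children labels, and a tree is a pair `(label, children)`. The function name `conforms` and the sample grammar are hypothetical.

```python
import re

def conforms(tree, productions):
    """Check that `tree` is a derivation tree of the grammar `productions`.

    Each node labeled s must have children whose labels, read left to right,
    form a word of the language denoted by the regular expression P_s.
    """
    label, children = tree
    if label not in productions:
        return False
    word = "".join(child[0] for child in children)
    if re.fullmatch(productions[label], word) is None:
        return False
    return all(conforms(child, productions) for child in children)

# Hypothetical grammar: A -> B* C, B -> D?, C and D have no children.
grammar = {"A": "B*C", "B": "D?", "C": "", "D": ""}
good = ("A", [("B", [("D", [])]), ("B", []), ("C", [])])
bad = ("A", [("C", []), ("B", [])])
```

Here `good` conforms to the grammar, whereas `bad` does not, because the word `CB` is not in the language of `B*C`.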

View, Projection, Partial Replica and Consistency
The derivation tree giving a (global) representation of a structured document edited cooperatively makes all the grammatical symbols of the grammar visible.
As mentioned in Section 1, a co-author handling such a document with a structured editor dedicated to his area of expertise does not necessarily have access to all of these grammatical symbols; only a subset of them corresponds to syntactic categories perceptible as such by this tool, hence the notion of "view" [4]. A view $\mathcal{V}$ is a subset of the grammatical symbols ($\mathcal{V} \subseteq \mathcal{S}$). Intuitively, these are the symbols associated with the syntactic categories that are visible in the representation (derivation tree) under consideration.
Each view $\mathcal{V}$ is associated with a projection operation on derivation trees, noted $\pi_{\mathcal{V}}$, which erases the nodes labeled by invisible symbols while retaining the subtree structure. A partial replica is the result of the projection of a document (derivation tree) with respect to a given view. For example, in Figure 2, from the global document $t$ in the center and two views $\mathcal{V}_1$ and $\mathcal{V}_2$, we obtain on the left the partial replica $\pi_{\mathcal{V}_1}(t)$ and on the right the partial replica $\pi_{\mathcal{V}_2}(t)$. The edition type considered in this paper is asynchronous. On a site $i$ hosting […]
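The projection of a derivation tree along a view can be sketched as follows. This is a minimal illustration, assuming trees are pairs `(label, children)` and a view is a Python set of labels; the function name `project` and the sample tree are hypothetical. Because an invisible node is erased and replaced by the projections of its children, the result is in general a forest, returned here as a list of trees.

```python
def project(tree, view):
    """Erase the nodes whose label is not in `view`, keeping subtree structure.

    Returns a list of trees (a forest): an erased node is replaced, in place,
    by the projections of its children.
    """
    label, children = tree
    projected = [p for child in children for p in project(child, view)]
    if label in view:
        return [(label, projected)]
    return projected

# Hypothetical document, for illustration only.
t = ("A", [("B", [("C", []), ("D", [("C", [])])]), ("C", [])])

replica_1 = project(t, {"A", "C"})
# → [("A", [("C", []), ("C", []), ("C", [])])]
replica_2 = project(t, {"B", "D"})
# → [("B", [("D", [])])]
```

In the second replica the root `A` is invisible, so the forest consists of the projection of its visible descendants only.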

Some Definitions and Notations
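Let $\mathbb{G} = (\mathcal{S}, \mathcal{P})$ be an extended context free grammar and $\mathcal{V} \subseteq \mathcal{S}$ a view. $\mathbb{G}$ is said to be of finite type if and only if it is non recursive, i.e. $D_{\mathbb{G}}$ is acyclic. $\mathbb{G}$ is said to be of finite type with respect to $\mathcal{V}$ if the restriction of the dependency graph $D_{\mathbb{G}}$ to the symbols belonging to $\mathcal{V}$ is not recursive.
We note $nt(t) \subseteq \mathcal{S}$ the set of labels of the nodes of $t$, and $root(t)$ the label of the root node of $t$.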
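These notations translate directly into small helper functions. The sketch below is only illustrative and makes several assumptions of ours: trees are pairs `(label, children)`, a dependency graph is a dictionary from a symbol to the set of symbols it depends on, and the restriction to a view is read as the induced subgraph. Cycle detection relies on the standard `graphlib` module (Python 3.9 or later); the names `restrict` and `is_finite_type` are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

def root(tree):
    """Label of the root node of the tree."""
    return tree[0]

def nt(tree):
    """Set of the labels of all nodes of the tree."""
    label, children = tree
    labels = {label}
    for child in children:
        labels |= nt(child)
    return labels

def restrict(graph, view):
    """Subgraph induced by the symbols of the view."""
    return {s: {b for b in succ if b in view}
            for s, succ in graph.items() if s in view}

def is_finite_type(graph):
    """True if the dependency graph is acyclic (a topological order exists)."""
    try:
        TopologicalSorter(graph).prepare()
    except CycleError:
        return False
    return True

# Hypothetical dependency graph with a cycle between A and B.
graph = {"A": {"B"}, "B": {"A", "C"}, "C": set()}
tree = ("A", [("B", [("C", [])]), ("C", [])])
```

The sample graph is not of finite type, but it is of finite type with respect to the view `{"A", "C"}`, since the cycle goes through `B`.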
The notation "$p\,@\,X \to \alpha$" means that "$p$ has the form $X \to \alpha$". We introduce the function $lhs(p)$ (resp. $rhs(p)$), which returns the symbol (resp. the symbols) in the left hand side (resp. right hand side) of its argument $p$, a production rule.
For example, if […]. The notation […] means the substitution, in the right hand side of $p$, of all occurrences of each symbol $X_i \in \mathcal{S}$ by the corresponding $\alpha_i$. For example, with […]. Finally, […] is the language generated by grammar $\mathbb{G}$ from symbol $A_i \in \mathcal{S}$.

A Document Model Stable by Projection Operation, for Cooperative Asynchronous Edition
In this section, we present grammatical structures, which are a particular form of non-recursive extended context free grammars (ECFG). Indeed, to make the projection (defined below, Section 3.2) possible, recursive grammar symbols are not permitted in this model. Grammatical structures are therefore models for documents of bounded depth (a consequence of the non-recursivity of the symbols) but of unbounded width. Moreover, they make it possible to specify in a homogeneous way both the global model for the global document and the local models for its various partial replicas.

Defining (Abstract) Grammatical Structures
A grammatical structure $\mathbb{G} = (\mathcal{S}, \mathcal{P})$ is given as:
• a set $\mathcal{S}$ of non recursive grammatical symbols, and
• a set $\mathcal{P}$ of production rules. Each rule in $\mathcal{P}$ has one of the two forms: $A \to B_1 \cdots B_n$ (classical form of context free grammar rules), or $A \to B^*$ ($A$ is built up by a list of $B$).
We recall that an equivalent ECFG can evidently be derived from a grammatical structure.
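As an illustration of this definition, the following Python sketch represents the two rule forms as small data classes and derives an equivalent ECFG by grouping the alternatives of each symbol into one regular expression. Everything here is an assumption made for the example: the class names `Seq` and `Star`, the function `to_ecfg`, the textual regular-expression syntax, and the phone book rules, which are invented and are not those of Figure 7.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Seq:
    """Rule of the form A -> B1 ... Bn (n may be 0)."""
    lhs: str
    rhs: Tuple[str, ...]

@dataclass(frozen=True)
class Star:
    """Rule of the form A -> B* (A is built up by a list of B)."""
    lhs: str
    item: str

def to_ecfg(rules):
    """One regular expression per symbol; alternatives are joined by '|'."""
    bodies = {}
    for r in rules:
        if isinstance(r, Star):
            body = f"{r.item}*"
        else:
            body = " ".join(r.rhs) or "ε"   # ε stands for the empty word
        bodies.setdefault(r.lhs, []).append(body)
    return {lhs: " | ".join(alts) for lhs, alts in bodies.items()}

# Hypothetical phone book model, for illustration only.
rules = [
    Star("PhoneBook", "Entry"),
    Seq("Entry", ("Name", "Phones")),
    Star("Phones", "Phone"),
    Seq("Name", ()),
    Seq("Phone", ()),
]
ecfg = to_ecfg(rules)
# → {"PhoneBook": "Entry*", "Entry": "Name Phones", "Phones": "Phone*",
#    "Name": "ε", "Phone": "ε"}
```

Keeping the two forms as distinct types makes it easy to check that a rule set stays within the grammatical structure format after a transformation.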

Projection of a Grammatical Structure
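Let $\mathbb{G} = (\mathcal{S}, \mathcal{P})$ be a grammatical structure and $\mathcal{V} \subseteq \mathcal{S}$ a view; let also $\bar{\mathcal{V}} = \mathcal{S} \setminus \mathcal{V}$ be the set of symbols that do not belong to the view. The projection of $\mathbb{G}$ according to $\mathcal{V}$ is $\mathbb{G}_{\mathcal{V}} = (\mathcal{S}_{\mathcal{V}}, \mathcal{P}_{\mathcal{V}})$ where:
• $\mathcal{P}_{\mathcal{V}}$ is obtained from $\mathcal{P}$ by successive rewriting of the symbols in $\bar{\mathcal{V}}$ in terms of those in $\mathcal{V}$, then by substituting the result of this rewriting in the subset of rules of $\mathcal{P}$ having symbols of $\mathcal{V}$ on the left hand side: […]
• $\mathcal{S}_{\mathcal{V}}$: the syntactic categories of the projected grammar contain the symbols of the view together with, possibly, new symbols introduced for structuring purposes and belonging to a set $\mathcal{S}_{new}$.
Since the production rules of the projected model are obtained by successive rewriting of symbols that do not belong to the view, it can happen during this rewriting that new symbols must be added for formatting (or decomposition) purposes, in order to bring some rules back to the form of production rules adopted for grammatical structures (cf. Section 3.1).
The algorithm for deriving $\mathcal{P}_{\mathcal{V}}$ and $\mathcal{S}_{\mathcal{V}}$ proceeds in two steps:
Step 1: consider the subset $Prod_{\bar{\nu}} \subseteq \mathcal{P}$ of the rules of $\mathbb{G}$ whose left hand side does not belong to the view […]. Indeed, $\mathcal{P}_{\bar{\nu}}$ can be considered as the production rules of a concrete context free grammar with the symbols of $\bar{\mathcal{V}}$ as non terminal symbols and those of $\mathcal{V}$ as terminal symbols; then […] (Equation (1)). From Equation (1), one easily deduces that $\mathcal{P}_{\bar{\nu}}$ is in fact the union of the rewritings of the productions of $\mathbb{G}$ having a symbol belonging to $\bar{\mathcal{V}}$ in their left hand side. Thus, for every symbol $X'_i$ belonging to $\bar{\mathcal{V}}$, if we note $\mathcal{P}_{\bar{\nu}}(X'_i)$ the set obtained by rewriting the rules of $\mathbb{G}$ having $X'_i$ as left hand side, we have […] with $X'_i \in \bar{\mathcal{V}}$. Recall that the symbols in $\mathcal{V}$ are considered as terminal symbols when rewriting. Algorithm 1 describes the construction process of $\mathcal{P}_{\bar{\nu}}(X'_i)$ […]: when a rewriting yields a rule that is not in an acceptable form, a new restructuring symbol is created and rule $p$ is decomposed into two new rules as follows […]. The sets $\mathcal{P}_{\bar{\nu}}(X'_i)$ should be built according to a topological sorting of the dependency graph of the $X'_i$: a symbol is evaluated after the symbols on which it depends.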
Step 2: consider the subset $Prod_{\nu} \subseteq \mathcal{P}$ of the rules of $\mathbb{G}$ with view symbols in their left hand side […]; for every rule in this set, replace, in all possible ways, all occurrences of elements of $\bar{\mathcal{V}}$ in the right hand side by their right hand side counterparts in $\mathcal{P}_{\bar{\nu}}$; we finally obtain the set $\mathcal{P}_{\mathcal{V}}$ of production rules of the projected grammatical structure (Equation (2)). As for $\mathcal{P}_{\bar{\nu}}$ (Equation (1)), we deduce from Equation (2) that $\mathcal{P}_{\nu}$ is the union of the sets obtained by rewriting, using $\mathcal{P}_{\bar{\nu}}$, the productions of $\mathbb{G}$ having a symbol $X_i$ belonging to $\mathcal{V}$ in their left hand side; each such set is denoted […]. […] The algorithm explicitly shows when restructuring symbols are created (line 5) and when they are used (line 5 and line 8) in the generated production rules.
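The two steps can be illustrated with a simplified Python sketch. It is not the paper's Algorithms 1 to 3: instead of creating restructuring symbols to bring the result back to the two rule forms, it leaves the rewritten right hand sides as regular expressions over view symbols, which is enough to see the rewriting at work. The grammar encoding (a dictionary from a symbol to a list of `("seq", [...])` or `("star", B)` rules), the function name `project_grammar` and the phone book example are all assumptions made for this illustration.

```python
def project_grammar(grammar, view):
    """Rewrite invisible symbols in terms of view symbols (simplified sketch)."""
    cache = {}

    def expand(sym):
        # Step 1: an invisible symbol is rewritten as an expression over view
        # symbols. Memoised recursion evaluates a symbol only after the symbols
        # it depends on; it terminates because symbols are non recursive.
        if sym in view:
            return sym          # view symbols act as terminals
        if sym not in cache:
            alts = []
            for rule in grammar[sym]:
                b = body(rule)
                if b not in alts:
                    alts.append(b)
            if len(alts) == 1:
                cache[sym] = alts[0]
            else:
                cache[sym] = "(" + " | ".join(a or "ε" for a in alts) + ")"
        return cache[sym]

    def body(rule):
        kind, arg = rule
        if kind == "star":
            inner = expand(arg)
            return f"({inner})*" if inner else ""
        return " ".join(e for e in map(expand, arg) if e)

    # Step 2: substitute the rewritings into the rules headed by view symbols.
    return {s: " | ".join(body(r) or "ε" for r in grammar[s])
            for s in grammar if s in view}

# Hypothetical global model, for illustration only.
phone_book = {
    "PhoneBook": [("star", "Entry")],
    "Entry": [("seq", ["Name", "Contact"])],
    "Contact": [("seq", ["Phones", "Address"])],
    "Phones": [("star", "Phone")],
    "Name": [("seq", [])],
    "Phone": [("seq", [])],
    "Address": [("seq", [])],
}

local_1 = project_grammar(phone_book, {"PhoneBook", "Name", "Phone"})
# → {"PhoneBook": "(Name (Phone)*)*", "Name": "ε", "Phone": "ε"}
local_2 = project_grammar(phone_book, {"PhoneBook", "Entry", "Address"})
# → {"PhoneBook": "(Entry)*", "Entry": "Address", "Address": "ε"}
```

In the first local model the rule for `PhoneBook` is no longer of the form $A \to B^*$ or $A \to B_1 \cdots B_n$; this is precisely the situation in which the paper's algorithm introduces a restructuring symbol.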

Grammatical Structures Properties
Let $\mathbb{G}$ be a grammatical structure and $\mathcal{V}$ a view; $\mathbb{G}$ satisfies the properties below:
Property 1: $\mathbb{G}_{\mathcal{V}}$ is a grammatical structure (stability property); this property is guaranteed by Algorithm 3.
[…]
We present below the proof of Property 2. The proof of Property 3 can be obtained from the proof of Theorem 3.3 given in [7].
Proof. Let $t \therefore \mathbb{G}$ be such that […].
In order to do this, consider an internal node $n$ of $t_{\mathcal{V}}$ labeled $A_i$, with […]; it suffices to show that the word […] belongs to the language denoted by the grammar $\mathbb{G}_{\mathcal{V}}$ admitting the symbol $A_i$ as axiom, i.e. […].
Note that one can define a partition $(\mathcal{V}, \bar{\mathcal{V}})$ of $\mathcal{S}$ so that every tree $t \therefore \mathbb{G}$ (Figure 3(a)) can be uniquely partitioned into a finite set of maximal subtrees […]; we say that such a subtree is of type $t_{\nu}$ […]. Considering the decomposition of $t$ into subtrees of type $t_{\nu}$ and $t_{\bar{\nu}}$ as described above (Figure 3), a node of $t$ can be found either in a subtree of type $t_{\nu}$ or in a subtree of type $t_{\bar{\nu}}$. Moreover, focusing on a node $n$ of $t_{\mathcal{V}}$ and its children $n_1, \ldots, n_k$, they can either:
1) all belong to the same subtree of type $t_{\nu}$ (Figure 4), or
2) belong to different subtrees of type $t_{\nu}$ in $t$; in this case, $n$ is a leaf in the subtree in which it appears, and the […] are the labels of the root nodes (Figure 5) of other subtrees of type $t_{\nu}$, or
3) $n$ and some of its children are in the same subtree and the others are each in their own subtree (Figure 6).
Three cases are therefore to be considered.
Case 1: […] belong to the same subtree $t_j$ such that $nt(t_j) \subseteq \mathcal{V}$. In this case, according to the construction algorithm of $\mathcal{P}_{\mathcal{V}}$, […] and therefore to […] with some of its children belonging to subtrees of another type.
Case 2: node $n$, labeled $A_i$, is a leaf node of a subtree $t_j$ such that […] be the labels of the $m$ children of $n$ in $t$. $n$ has therefore been developed using a production rule of $\mathbb{G}$ of one of the two forms […]. We develop below the second form, the treatment of the first being similar.
There are therefore $m$ sub-terms of $t$, say […] in $m$ sub-words: […] according to the construction process of the production rules of $\mathbb{G}_{\mathcal{V}}$ (modulo restructuring symbols).

Conclusion
[…] version management tools like CVS for unstructured documents (textual merge) [13].
In the case of structured editing, all co-authors have the same document model, and the merging of complete replicas relies on this model (syntactic merging software) [14] [15]. In this paper we were interested in an innovative case, for which we did not find any existing study: the co-authors act on partial replicas of the overall document, each with a local model allowing him to validate locally the updates made to his (partial) local replica.
As a document model for this context, we proposed grammatical structures, which make it possible to specify both the model for the global document and the local models (for partial replicas) dedicated to each co-author. Furthermore, we defined a projection operation to derive automatically the local models (grammatical structures) of documents from the global one.
Stability and consistency are among the major properties enjoyed by grammatical structures. Consistency ensures that every document validated locally against the local grammatical structure is always the projection of at least one document that is valid with respect to the overall grammatical structure. Grammatical structures thus offer the different co-authors a suitable means of carrying out local syntactic validation of the asynchronously edited documents while ensuring consistency.
This study can be extended by focusing on the bottom-up construction of grammatical structures. The goal would be to propose a "grammatical structures merger" similar to the "documents merger" presented in [4].

Figure 1. The desynchronized cooperative editing of partial replicas of a structured document.
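[…] $\mathcal{V} \subseteq \mathcal{S}$ a view, $t$ a derivation tree for $\mathbb{G}$ ($t \therefore \mathbb{G}$), $D_{\mathbb{G}}$ the dependency graph of $\mathbb{G}$ and $p \in \mathcal{P}$ a production rule.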

Figure 2. One document (center) and two partial replicas obtained by projections.

[…] and $\beta$ containing only symbols of $\mathcal{S}_{\mathcal{V}}$. Hence the set $\mathcal{P}_{\bar{\nu}}$ is given as: […] can be obtained after successive rewriting of a rule; this is not an acceptable form of rule. So a new restructuring symbol […]

Figure 4. Case where an inner node $n$ and its children $n_1, \ldots, n_k$ belong to the same subtree $t_j$ such that $nt(t_j) \subseteq \mathcal{V}$.

Figure 5. Case of a leaf node $n$, labeled $A_i$, of a subtree $t_j$ ($nt(t_j) \subseteq \mathcal{V}$), with all its […]

Figure 6. Case of an internal node $n$, labeled $A_i$, of a subtree $t_j$ such that […]

Figure 7. A grammatical structure $\mathbb{G}_{an}$ of a phone book.

Figure 8. Two local models resulting from the projection of the global model of Figure 7 according to view $\mathcal{V}_1$ (a) and view $\mathcal{V}_2$ (b).
[…] and the labels of the successor nodes of the leaf nodes of $t_i$ in $t$, if they exist, do not belong to $\mathcal{V}$, or $nt(t_i) \subseteq \bar{\mathcal{V}}$ […]