Counting Types of Runs in Classes of Arborescent Words

An arborescence is a directed rooted tree in which all edges point away from the root. An arborescent word is obtained by replacing each element of the underlying set of an arborescence by an arbitrary letter of a given alphabet (with possible repetitions). We define a run in an arborescent word as a maximal sub-arborescent word whose letters are all identical. Various types of runs (e.g., runs of size ≤ k, linear runs, etc.) are studied in the context of R-enriched arborescent words, where R is a given species of structures.


Introduction
Classically, a word over an alphabet Ω is a finite sequence of letters taken from Ω, and a run in a word is a maximal subsequence of consecutive identical letters; see, for example, Feller [1] or Mood [2]. A slightly different point of view consists of interpreting a word over Ω as the result of the replacement of each element of a linear digraph by an arbitrary letter of Ω. A run in a word is then interpreted as a maximal linear sub-digraph whose letters are all identical. Under this interpretation, a 9-letter word may, for example, decompose into four runs.
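The classical notion of a run can be illustrated with a minimal Python sketch. The word `"aabbbccaa"` is a hypothetical 9-letter example chosen here for illustration; it is not the word displayed in the original figure.

```python
from itertools import groupby

def runs(word):
    """Split a word into its runs (maximal blocks of consecutive identical letters)."""
    return ["".join(block) for _, block in groupby(word)]

# Hypothetical 9-letter word with four runs.
print(runs("aabbbccaa"))  # ['aa', 'bbb', 'cc', 'aa']
```

`groupby` groups consecutive equal items, which is exactly the maximality condition in the definition of a run.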
Using this point of view, words can now be easily "structured" in various ways. For example, starting from the digraph of a circular permutation (instead of a linear digraph, as above) gives rise to the concept of a circular word, studied probabilistically in the basic paper of Koutras et al. [3]. In a similar way, any combinatorial species F, in the sense of Joyal [4], leads to a corresponding notion of F-structured word (or F-word, for short). Technically speaking, an F-word of length $n$ over an alphabet Ω is an orbit of the natural action (2) of the symmetric group $S_n$ on pairs $(s, w)$, where $s$ is an F-structure on $[n] = \{1, 2, \ldots, n\}$ and $w$ is an ordinary word of length $n$ over Ω. In the example of Figure 1, $s$ is a graph on $[5]$ and $w_i$, the $i$-th letter of the word $w$, is written just under its associated vertex.

It is important to note that under this action, the labels of the underlying set $\{1, 2, \ldots, 5\}$ of the structure $s$ are permuted while each letter is kept in a fixed position. This is a general fact, and it is the reason why the orbits of action (2) can be thought of as unlabelled F-structures whose nodes are replaced (or filled) by letters of the alphabet, that is, F-words. Summing the orbits of (2) over all $n$ gives rise to the concept of an analytic functor in the sense of Joyal [5], from the category of sets into itself, sending each set Ω (= alphabet) to the set of all F-words over Ω.
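As an illustration of F-words as orbits, the following Python sketch counts circular words (the case where F is the species of circular permutations) in two ways: by brute-force enumeration of rotation classes, and by the classical Burnside count. The function names are hypothetical and the sketch is only a sanity check on small cases, not part of the paper's method.

```python
from itertools import product
from math import gcd

def circular_words_bruteforce(n, k):
    """Count words of length n over k letters up to rotation (n >= 1)."""
    classes = set()
    for w in product(range(k), repeat=n):
        # Canonical representative of the orbit: smallest rotation.
        classes.add(min(w[i:] + w[:i] for i in range(n)))
    return len(classes)

def circular_words_burnside(n, k):
    """Burnside count: the rotation by i positions fixes k**gcd(i, n) words."""
    return sum(k ** gcd(i, n) for i in range(n)) // n

print(circular_words_bruteforce(4, 2), circular_words_burnside(4, 2))  # 6 6
```

Each orbit corresponds to one unlabelled cycle whose nodes are filled with letters, which is the picture of a circular word described above.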
There exist special classes of F-words for which a natural notion of "run" can be defined. This is the case when F is a graphical species, whose structures are of the form of a simple graph possibly equipped with extra structures. Standard examples of graphical species include the species of digraphs, of circular permutations, of tree-like structures (e.g., when F is the species of R-enriched rooted trees or the species of R-enriched trees in the sense of Labelle [6]), etc. Figure 2 shows the runs in a circular word and in a tree-like word.

 
The study of runs in graphical words is far from trivial in such a general setting. For example, a simple graph word whose runs are all singletons is equivalent to a proper coloring of the graph, so that, even in this special case, the whole theory of chromatic polynomials must be used. In the present text, our run analysis concentrates on classes of arborescent words.
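The link between singleton runs and proper colorings can be checked on a small case. The sketch below works with a fixed labelled graph (so it counts pairs $(s, w)$ for one given $s$, not orbits), and compares a brute-force count with the well-known chromatic polynomial of a cycle, $(k-1)^n + (-1)^n (k-1)$. The graph encoding and function names are assumptions made for this illustration.

```python
from itertools import product

def count_singleton_run_words(edges, n, k):
    """Count fillings of n labelled vertices by k letters in which
    adjacent vertices never carry the same letter (all runs are singletons)."""
    return sum(
        1
        for w in product(range(k), repeat=n)
        if all(w[a] != w[b] for a, b in edges)
    )

def cycle_chromatic_polynomial(n, k):
    """Chromatic polynomial of the cycle on n >= 3 vertices, evaluated at k."""
    return (k - 1) ** n + (-1) ** n * (k - 1)

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(count_singleton_run_words(cycle4, 4, 3), cycle_chromatic_polynomial(4, 3))  # 18 18
```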
In Section 2, we study various types of runs in F-words over an alphabet of k letters, where F is the graphical species of R-enriched rooted trees, called here R-arborescences, for short. To do so, we make use of functional equations on multisort species and introduce a special 2-sort species, which we call a runs selector species, to classify the types of runs under study. Various enumerative results about runs (e.g., number of F-words whose runs are all of a given type, distribution of the biggest run, etc.) then follow by making use of appropriate underlying cycle index series. Section 3 illustrates the theory of R-arborescent words by analysing specific examples and their associated series. Section 4 indicates some extensions to be developed in the future. For an introduction to species, the reader can consult the basic paper of Joyal [4] or the book by Bergeron, Labelle, and Leroux [7].

Runs in Enriched Arborescent Words
An arborescence is a directed rooted tree in which all edges point away from the root. Let $R$ be a given species of structures having at least one structure on the empty set. Recall that an R-arborescence is an arborescence in which the (possibly empty) set of direct successors of each vertex is equipped with an R-structure. Figure 3(a) shows an R-arborescence, where the black dots represent the elements of its underlying set. The small arc at each vertex represents the R-structure placed on the set of successors of the vertex. The species $A = A_R$ of R-arborescences is characterized by the functional equation

$$A = X \cdot R(A), \tag{3}$$

and, depending on the choice of the "enriching" species $R$, various species of tree-like structures are obtained.

Now take a $k$-letter alphabet $\{a_1, \ldots, a_k\}$ and associate a sort $X_i$ of singletons to each letter $a_i$. The $A(X_1 + \cdots + X_k)$-structures are R-arborescences whose vertices are of arbitrary sorts taken among the $X_i$'s. The first step in our run analysis is to classify these multisort structures according to the runs of sorts they contain. From (3), we obviously have

$$A(X_1 + \cdots + X_k) = A^{(1)} + \cdots + A^{(k)}, \tag{4}$$

where $A^{(i)}$ is the species of the $A(X_1 + \cdots + X_k)$-structures having a root of sort $X_i$. Note that this root of sort $X_i$ is part of a run of vertices of sort $X_i$. To reflect this fact, we now give an alternate expression for $A^{(i)}$.

Lemma 2. Let $B = B(X, T)$ be the two-sort species characterized by $B = X \cdot R(B + T)$. Then the species $A^{(i)}$ satisfy

$$A^{(i)} = B\Big(X_i, \sum_{j \neq i} A^{(j)}\Big), \qquad i = 1, \ldots, k.$$

Proof. Figure 3(b) shows a $B$-structure, where singletons of sort $X$ (resp., $T$) are drawn as black dots (resp., white triangles). The result follows from the fact that an $A^{(i)}$-structure is a $B$-structure in which the black dots are replaced by singletons of sort $X_i$ and the white triangles are replaced by arbitrary $A^{(j)}$-structures, $j \neq i$ (see Figure 4(a), where $X_1$-singletons are black dots and $X_3$-singletons are gray dots).
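The run decomposition can be sanity-checked numerically in the simplest labelled setting. The sketch below assumes that R is the species E of sets (Cayley arborescences), that the auxiliary species satisfies $B = X \cdot R(B + T)$, and that $A^{(i)} = B(X_i, \sum_{j \neq i} A^{(j)})$. Identifying all sorts with a single variable $x$, each $A^{(i)}$ then has the same exponential generating series $h$, with $h = x \exp(h + (k-1)h) = x \exp(kh)$. The expected labelled counts are $k^{n-1} n^{n-1}$ (Cayley's $n^{n-1}$ rooted trees, with the root letter fixed and the other $n-1$ letters free). All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def series_exp(a, n_max):
    """exp of a power series a with a[0] == 0, truncated at degree n_max.
    Uses e' = a' e, i.e. n*e[n] = sum_{j=1..n} j*a[j]*e[n-j]."""
    e = [Fraction(0)] * (n_max + 1)
    e[0] = Fraction(1)
    for n in range(1, n_max + 1):
        e[n] = sum(j * a[j] * e[n - j] for j in range(1, n + 1)) / n
    return e

def root_letter_fixed_egf(k, n_max):
    """Solve h = x * exp(k*h) by fixed-point iteration on truncated series.
    Each pass fixes at least one more coefficient, so n_max passes suffice."""
    h = [Fraction(0)] * (n_max + 1)
    for _ in range(n_max):
        e = series_exp([k * c for c in h], n_max)
        h = [Fraction(0)] + e[:n_max]  # multiplication by x
    return h

def labelled_counts(k, n_max):
    """Numbers n! * [x^n] h for n = 1..n_max."""
    h = root_letter_fixed_egf(k, n_max)
    return [int(h[n] * factorial(n)) for n in range(1, n_max + 1)]

print(labelled_counts(1, 5))  # [1, 2, 9, 64, 625]
print(labelled_counts(2, 4))  # [1, 4, 36, 512]
```

The case $k = 1$ recovers Cayley's numbers $n^{n-1}$, and $k = 2$ gives $2^{n-1} n^{n-1}$, in agreement with the direct count.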

An R-arborescent word having only runs of type $\hat{B}$ over the alphabet $\{a_1, \ldots, a_k\}$ is the result of the replacement of each vertex of sort $X_i$ in an $\hat{A}$-structure by the letter $a_i$, for $i = 1, \ldots, k$.

It is worthwhile to note that a runs selector species $\hat{B}(X, T)$ not only selects the shape of the runs of interest but also specifies how these runs should be interconnected. For example, …

The main interest in Lemma 2 and Definition 3 is that they give rise to a variety of enumerative results about runs of various types in R-arborescent words by the introduction of special weight counters in (5)-(7). For example, taking $R = 1 + C$, where $C$ is the species of cyclic permutations, Equation (3) takes the form $M = X \cdot (1 + C(M))$, whose solution is the species $M$ of mobiles, in the terminology of Bergeron, Labelle, and Leroux [7].

Proposition 4. Let $f(u, z)$ be the generating series in which the coefficient of $z^n$ is the total weight of the $\hat{A}$-words of length $n$ having only runs of type $\hat{B}$ over the $k$-letter alphabet $\{a_1, \ldots, a_k\}$. Then the generating series $f(u, z)$ is given by …, where $Z_{\hat{B}}$ is the cycle index series of $\hat{B}$.

Proof. The series $f(u, z)$ is the total weight of all unlabelled $\hat{A}$-structures in which each node is replaced by the node counter variable $z$ and each run is given a weight $u$. Here $Z_F(x_1, x_2, x_3, \ldots)$ denotes the cycle index series of a species $F(X)$. Now consider the species obtained from the $\hat{A}^{(i)}$ by the substitutions $X_i := X$, $i = 1, \ldots, k$. By symmetry, all these species are equal, so that they can all be identified with a single species, $H(X)$, say. By (4), taking into account the run counter $u$, the species $H$ satisfies a combinatorial equation that determines it uniquely, by the implicit species theorem [4]. The result follows by taking cycle index series on both sides of this combinatorial equation.
The following corollary is immediate from Proposition 4.

Corollary 5. The generating series $f(z)$ is given by …. In the special case $\hat{B} = B$, corresponding to unrestricted runs, $f(z)$ is explicitly given by ….

The use of cycle index series above comes from the fact that R-enriched arborescences have non-trivial automorphisms, in general. This occurs when R-structures have non-trivial automorphisms. In the asymmetric case, however, that is, when the automorphism group of each R-structure is trivial, it is well known that cycle index series reduce to exponential generating series.

Corollary 6. If the species $R$ is asymmetric, then …, where $\hat{B}(x, t)$ is the exponential generating series of $\hat{B}$. Moreover, if $\hat{B} = B$, then ….

Runs in Ordinary Words
Here the R-arborescences form the species $L_+$ of non-empty linear orders. In this case, $L_+$-words are simply (ordinary non-empty) words and Equation (5) takes the form ….

Take the runs selector …. Solving for $g$, we get the well-known g.f.'s of words whose runs are all of size $\le m$, …. Note that taking $k = 2$, $m = 2$ in (20) gives what can be called the Fibonacci case, …, from which one can compute or estimate the expected size of the largest run (see, for example, Erdős and Révész [10], Guibas and Odlyzko [11], Kong [12]).

Take any subset $S \subseteq$ … and the selector ….
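The count of ordinary words whose runs all have size at most $m$ can be checked with a short Python sketch. It assumes the classical closed form $f(z) = k\,g/(1 - (k-1)\,g)$ with $g = z + z^2 + \cdots + z^m$, which is a standard result and is presumed (not confirmed by the surviving text) to be the formula referred to as (20). Function names are hypothetical.

```python
from itertools import groupby, product

def count_bruteforce(n, k, m):
    """Count words of length n over k letters whose runs all have size <= m."""
    return sum(
        1
        for w in product(range(k), repeat=n)
        if all(len(list(block)) <= m for _, block in groupby(w))
    )

def count_series(n_max, k, m):
    """Coefficients of k*g/(1-(k-1)*g), g = z + ... + z^m, up to z^n_max.
    h[n] counts words of length n with a fixed first letter: a first run of
    size j <= m, then either nothing or a word starting with another letter."""
    h = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        single_run = 1 if n <= m else 0
        h[n] = single_run + (k - 1) * sum(h[n - j] for j in range(1, min(m, n - 1) + 1))
    return [k * c for c in h]

print(count_series(6, 2, 2)[1:])  # [2, 4, 6, 10, 16, 26]
```

For $k = 2$, $m = 2$ the counts are twice the Fibonacci numbers, which is consistent with calling this the Fibonacci case.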

Runs in Binary Arborescent Words
Planar case. Here $R$ is defined in terms of a species of linearly ordered sets, and Eq. (3) takes the form …. Take the linear runs selector …. Then, by (16), the g.f.'s of plane binary arborescent words having only linear runs are given by …, where … are the Catalan numbers. The Fibonacci case, $k = 2$, $m = 2$, corresponds, in the present context, to case (iii) above with a 2-letter alphabet. Here, …, so that ….
Non-planar case. With $R$ = …, where $E_2$ is the species of 2-element sets, Eq. (3) takes the form … and, by (8), …. For example, the first terms of $f(z)$ are given by …. In the Fibonacci case, $k = 2$, $m = 2$, the runs selector (iii) above is given by …. The first terms of $f(u, z)$ and of $f(z)$ are given by ….

… in the implicit species theorem, we have …. Take the runs selector $\hat{B} = B$. Then, since …, (14) can be rewritten as …. For example, the first terms of …. More generally, the runs selector …. For Cayley arborescences, the Fibonacci case, $k = 2$, $m = 2$, corresponds to the runs selector …. For singleton runs (case (iv) above), …, where … is the ordinary generating series of unlabelled Cayley arborescences.

Runs in Mobile Words
Recall that with $R = 1 + C$, where $C$ is the species of cyclic permutations, Eq. (3) takes the form …, where $\varphi$ is the Euler function. For example, the first terms of $f(z)$, for a 2-letter alphabet ($u = 1$), are given by ….

Extensions
The present approach to runs in structured words can be extended in various directions. For example, a probabilistic analysis of runs can be made by attaching a probability of occurrence to each letter of the alphabet, leading to an analogue of Proposition 4 for R-enriched tree words. These extensions will be developed elsewhere.

In the action (2), $F[n]$ is the set of F-structures on $[n]$, $\sigma \cdot s$ is the result of relabelling the F-structure $s$ along the permutation $\sigma$, and $w$ is an ordinary word of length $n$ over the alphabet Ω. In Figure 1, the letter $w_i$ is written just under its associated vertex in the graph, for $i = 1, \ldots, 5$.

Figure 1. The pairs $(s, w)$ and …

Figure 2. Four runs in a circular word and five runs in a tree-like word.

The species $A^{(i)}$ is expressed in terms of an auxiliary two-sort species, $B(X, T)$, where $T$ is an extra sort of singletons.

Figure 5. Selectors for (a) linear runs; (b) singleton runs; and (c) …

The name "mobiles" follows the terminology of Bergeron, Labelle, and Leroux [7]. The reason is the following: putting the root on top, the descendants of each node can be thought of as turning around along horizontal circles. In this case, M-words are called mobile words. Figure 4(b) illustrates a mobile word on a 2-letter alphabet.

The case of linear runs, for example, corresponds to the runs selector …, and the generating series of the words having only linear runs, weighted by the run counter $u$, is given by …