Cairene Arabic Syllable Structure through Different Phonological Theories
R. Aquil

Since the time of the early Arab grammarians, the syllable has played a major role in the phonology of classical as well as colloquial Arabic. In the 1970s, phonologists investigated Cairene Arabic (CA) syllable structure and found the syllable to be the domain of several phonological processes and restrictions: emphatic spread, the treatment of CVC syllables as light word-finally but heavy word-internally, limitations on consonant clusters in certain positions of a word, and epenthesis of a vowel to break up consonant clusters when word concatenation yields more than two consonants. This paper discusses some CA phonological phenomena as investigated through different theories in generative phonology, i.e., rule-based, autosegmental, and Optimality Theory (OT). An overview of early theories is given. Early generative theories contributed substantially to the theory of the syllable in CA; however, each theory was able to explain only a given phonological phenomenon. It is through the generative Optimality Theoretic approach that more than one phenomenon can be described and analyzed. The paper's aim is not to compare the different theories, but to describe the progression that CA syllable structure analysis took in generative phonology. Unlike earlier research, which based its conclusions on some CA words mixed with classical Arabic words pronounced by CA native speakers, this paper presents an Optimality Theoretic analysis that is based on uniquely CA phonetic outputs. The analysis finds that some syllable structure constraints, such as ONSET and *[µµµ]σ, are high ranked and inviolable. The study also shows that an OT analysis can illustrate and explain in one representation, i.e., a tableau, two different phonological phenomena, insertion and deletion of a vowel in consonant clusters, despite their relatedness to separate prosodic domains: the syllable, the prosodic word, and the phrase.
This is carried out by analyzing the ranking, relationship, and interaction between the following constraints: ONSET, MAX-IO, *[µµµ], *COMPLEX CODA, DEP-IO >> NOCODA, *APPENDIX; -*V,+hi]$:, ALIGNR (σ, PrWd), and LINEARITY. The study analyzes data that is mainly from spoken Cairene Arabic, attempting to fill a gap created by one of the contentious issues in studies of the phonology of CA, namely the mixing of colloquial Cairene and classical Arabic.


Introduction
The syllable is a universal unit found in all languages and perceptually accessible to preschool children and illiterate adults. It plays an important role in the phonology of classical as well as colloquial Arabic (Bird & Blackburn, 1990; Kay, 1987; McCarthy, 1981). The description and analysis of the syllable in Arabic has deep roots, going back to the old Arab grammarians (ElSaaran, 1951). The metrics of Arabic have been described in terms of sequences of vowels and consonants or, in other words, in syllables (Maling, 1973; Prince, 1989). Additionally, stress assignment interacts not only with the number of syllables in a word but also with the internal prosodic structure of the syllable, so an analysis of stress must depend on an analysis of the syllable (Hayes, 1995; McCarthy, 1979; Mitchell, 1960; Watson, 2002; Welden, 1980). I discuss the syllable in Cairene Arabic (henceforth CA). I focus in particular on the importance of well-formedness in syllable structure and on the phonological processes of vowel epenthesis and deletion that secure syllable well-formedness. I present different generative phonological theories that have analyzed and represented the syllable, e.g., the rule-based approach, the autosegmental approach, and the Optimality Theoretic approach (henceforth OT). The aim of the present paper is not to compare these theories; each contributed substantially to the theory of the syllable structure of Arabic, classical as well as colloquial. The focus is on the Optimality Theoretic approach because it is the most recent theory and because it is able to illustrate phonological phenomena related to the syllable and beyond in one representation, i.e., a tableau, demonstrating the constraints at play and their ranking in the language.
However, I focus in this paper on colloquial Egyptian Arabic, i.e., data from spoken CA.¹ Most of the phonological investigations of CA (Aljarah, 2008; De Lacy, 1998; Halle & Vergnaud, 1987; Hayes, 1995; Kenstowicz, 1980; McCarthy, 1979; Selkirk, 1981, 1982; Youssef & Mazurkewich, 1998) are based on data from Mitchell (1956, 1960), which is classical rather than colloquial. It is referred to in the literature as Cairene classical Arabic. These linguists claimed to analyze CA prosody, but more precisely they looked at Cairene pronunciation norms of classical forms. Only a few studies have looked into colloquial Arabic per se (Broselow, 1979; Watson, 2002; Welden, 1980). I consider basing conclusions on data that is not mainly CA rather confusing. My aim is to give a theoretical explanation of generalizations in CA data rather than to validate a theory through certain generalizations in some data. I claim that CA has its own systematic prosodic pattern, and I present a number of uniquely CA phonetic outputs that buttress the analysis presented in this paper.
The paper is organized as follows: Section 1 discusses the data used in the literature and syllable structure in CA, as well as CA-specific syllable structure phenomena, i.e., the weight of closed CVC syllables in word-final versus word-internal position and the theory of the appendix. Section 2 gives an overview of some of the theories in generative phonology, e.g., rule-based and autosegmental representations of the syllable in CA. Section 3 examines the Optimality Theoretic framework and presents the constraints at play in securing licit syllables. The section ends with an analysis of CA syllable structure that demonstrates the syllable structure constraint hierarchy for CA.
The same applies to the data used in De Lacy (1998, footnote 1: p. 1), where he states that the data is from Mitchell (1960), a study of words of Classical Arabic as pronounced by Azhar-trained Egyptians. The famous data, e.g., [ʔadwiyatuhu] "his medicine" and [šajaratuhuma] "their (dual) trees", is entirely Classical Arabic: the CA version of the former word is [ʔadwiytu] and that of the latter is [šagarithum], since the dual form exists only for the nominal in CA and not for the genitive or accusative⁴ (Brustad, 2000; Holes, 2004).
In addition, the syllable structure Aljarah (2008) assumes for stress assignment does not represent important phonological alternations that CA undergoes, e.g., shortening of a long vowel. The actual CA word for "he wrote it" is /(ka.táb.ha)/ and not /(ka.ta) (BA).ha/ because CA does not allow two open syllables (Broselow, 1979). Moreover, the above words have CA equivalents, which are /(mak)(túub)/ "a letter" and /mar.su.míin/ "they are drawn". As observed, the syllable structures of these words are completely different from the ones maintained by Aljarah (2008).
Based on the above assumed syllable structure, which is not CA, Aljarah (2008) claims that the following constraints in the given ranking are responsible for CA stress: LX=Pr, TROCHAIC, MAIN-RIGHT >> PARSEσμμμ >> NONFINAL >> PARSEσμμ, FOOT-BINARITYμ >> ALL-FEET-LEFT >> ALL-FEET-RIGHT. This is in spite of the fact that the data he used to support the constraints and their ranking is partly classical Arabic pronounced by CA native speakers. In contrast to Aljarah's (2008) constraints and their ranking, Aquil (2012b) arrived at a different set and ranking of prosodic constraints: FTBIN, ALIGN Hd/R >> TR >> WSP >> PARSEσ, PARSE SG >> AFL. See Aquil (2012), where she analyzed a number of uniquely CA phonetic outputs.

Syllable Structure in CA
Syllables in CA fall into three categories: light CV, heavy CVC and CVV, and super-heavy CVCC and CVVC, which occur only word- and phrase-finally. A CA syllable contains an obligatory onset and nucleus, and an optional coda. Consonant clusters (CC) are allowed, but only word- and phrase-finally. This is represented by a consonant in parentheses (c), as illustrated in the representation of a core syllable in structure (1).

1) CA Syllable Structure

Arabic in general disallows onsetless syllables, and so does CA: a syllable must start with a consonant. If a syllable happens to start with a vowel, a glottal stop must be inserted before the vowel. The ban on onsetless syllables is a language-specific rule. For example, words like "America", whose initial syllable starts with a vowel in English, are not permissible in CA, and a glottal stop is inserted: [ʔam.rii.ka]. In OT terms, the insertion takes place to obey the ONSET constraint, which stipulates that syllables must have onsets and that vowels cannot start a syllable (Prince & Smolensky, 1993). Sections 2 and 3 expand this discussion.

However, when words are concatenated in connected speech, the concatenation can result in a series of consonant clusters. Since consonant clusters are disallowed word- or phrase-internally, re-syllabification of the second coda element must take place. Re-syllabification is discussed below.

The position of a CVC syllable in a CA word also determines whether the syllable is interpreted as heavy or light. If the CVC syllable is word-internal it is heavy, but if it is in word-final position it is not heavy. Consider structure (3).

The theoretical justification for considering CVC in final position as light is the extra-syllabic or extra-prosodic nature of the coda consonant in the syllable. The extra-prosodic consonant in CVC, CVCC, and CVVC is regarded as an appendix. Vaux (2004), following Kiparsky (2003: p. 156), contends that an appendix or extra-prosodic element is prosodically invisible and is attached directly to a higher-level prosodic node (usually a PrWd), as shown in structure (4) for the CA word /bint/ "girl".⁵ Vaux (2004) provides evidence for the appendix from facts that range from the phonological, such as sonority sequencing, epenthesis, and prosodic phenomena like stress assignment, lengthening and shortening, reduplication and infixation, to morphological processes like syllable-counting rules and truncation, to other external linguistic phenomena.

4) A word-final appendix (adapted from Kiparsky, 2003: pp. 157, 162)

Kiparsky (2003) postulates that the stray consonant /t/ in structure (4) constitutes a mora, and that in order to avoid gratuitous Prosodic Licensing (Ito, 1986, 1989)⁶ this mora is affiliated with the prosodic word. As shown in structure (4), not only does a mora become affiliated with the prosodic word, but a stray consonant and syllable do as well. Studies that analyzed other Arabic dialects have also considered the second consonant of the coda consonant cluster a stray unsyllabified consonant (Abu-Mansour, 1991).
The above phonological phenomenon is considered internal evidence for the appendix and, in turn, explains the process that re-syllabifies the second element of the coda to avoid the illicit syllables that would result from multiple consonant clusters when words concatenate in connected speech (see data in (1) & (2) below).
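The positional weight facts described above can be made concrete with a small sketch. The Python below is illustrative only and is not part of the original analysis: it assumes that syllables are supplied as CV skeleton strings, that every nucleus and coda position counts as one mora, and that a word-final consonant is an appendix contributing no mora to its syllable. The function name `syllable_moras` and the skeleton encoding are hypothetical.

```python
def syllable_moras(skeletons):
    """Return the mora count of each syllable in a word.

    `skeletons` is a list of CV skeleton strings, one per syllable,
    e.g. ["CVC", "CV"]. Assumptions (illustrative, not from the source):
    onsets are weightless, each V and each coda C counts as one mora,
    and a word-final C is an extra-prosodic appendix that adds no mora
    to its syllable.
    """
    moras = []
    last = len(skeletons) - 1
    for i, syl in enumerate(skeletons):
        body = syl
        if i == last and body.endswith("C"):
            body = body[:-1]  # set the word-final appendix aside
        first_v = body.index("V")  # everything before the nucleus is onset
        moras.append(len(body) - first_v)
    return moras


# Word-internal CVC is heavy (two moras); word-final CVC is light (one mora).
print(syllable_moras(["CVC", "CV"]))
print(syllable_moras(["CV", "CVC"]))
# Word-final CVCC, as in /bint/, stays within two moras once the appendix is set aside.
print(syllable_moras(["CVCC"]))
```

Under these assumptions no syllable exceeds two moras, which is the pattern the appendix analysis is meant to capture.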

Resyllabification and Vowel Epenthesis
Unlike English, where the domain of syllabification is the phonological word (Nespor & Vogel, 1986), the domain of syllabification in CA is the utterance. Syllabification and resyllabification in CA, as seen in data (1), occur across words, and are in fact obligatory across words if a sequence of more than two consonants results from morpheme or word concatenation. Note the presence of the epenthetic vowel, represented here as [ɨ],⁷ in the phonological word, clitic group, phonological phrase, and utterance.

ʕamr-resolved (3rd sg, past) yesterday's loss

As mentioned above, CA does not allow a cluster of three consonants; therefore, if such a cluster is generated through concatenation of words, an epenthetic vowel is inserted (Broselow, 1979, 1988). The following examples illustrate the process of re-syllabification with epenthesis. Data is adapted from Broselow (1979).

5 See Vaux (2004) for external (e.g., psycholinguistic experiments, child language acquisition, aphasic speech, and language games) and internal evidence (e.g., phonological rules, phonotactics and sonority, syllable weight, reduplication and infixation).
6 The Prosodic Licensing Principle requires that every segment be assigned to a higher-level prosodic constituent.
7 Barred [ɨ] is used here to differentiate it from the short [i], which is a phoneme in the language. This vowel (barred [ɨ]) is not psychologically real, and CA native speakers are not aware of inserting it. See Aquil (2012a), where the vowel is investigated acoustically and found to be inserted in CA native speakers' output of English compound words in order to break clusters of three or more consonants.

Words in isolation → Words concatenated (gloss; translation)
a) bínt simíina → bintɨsmíina (girl fat; "a fat girl")
b) katábt lu → katabtɨ́lu (write (past 1st) to him; "I wrote to him")
c) katábt gawá:b → katabtɨgawá:b (write (past 2nd sg, m) letter; "you wrote a letter")
d) katábti gawá:b → katabtigawá:b (write (past 2nd sg, fem) letter; "you wrote a letter")

No epenthetic vowel is added in (d) since the word katábti already ends with a vowel.
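The epenthesis pattern in these data can be approximated procedurally. The following Python sketch is a simplification under stated assumptions, not the author's formalization: words are written without stress marks, every character other than the listed vowels and the length mark ":" is treated as a single consonant, and the epenthetic [ɨ] is inserted after the second consonant of any run of three or more consonants created by concatenation. The function names are hypothetical, and the sketch does not model the high-vowel deletion that also applies in example (a).

```python
VOWELS = set("aiuɨ")   # assumed vowel inventory for this sketch
LENGTH_MARK = ":"      # marks a long vowel, as in gawa:b


def is_consonant(seg):
    """Treat anything that is not a vowel or a length mark as a consonant."""
    return seg not in VOWELS and seg != LENGTH_MARK


def concatenate_with_epenthesis(words, epenthetic="ɨ"):
    """Join words and break up any run of three or more consonants.

    The epenthetic vowel is placed between the second and third consonant
    of the run, which matches the data adapted from Broselow (1979).
    """
    out = []
    run = 0  # length of the current consonant run
    for seg in "".join(words):
        if is_consonant(seg):
            run += 1
            if run == 3:
                out.append(epenthetic)  # insert before the third consonant
                run = 1                 # the third consonant starts a new run
        else:
            run = 0
        out.append(seg)
    return "".join(out)


print(concatenate_with_epenthesis(["katabt", "gawa:b"]))   # example (c)
print(concatenate_with_epenthesis(["katabti", "gawa:b"]))  # example (d)
```

Because only epenthesis is modeled, the sketch yields bintɨsimiina for example (a), where the attested form bintɨsmíina additionally lacks the high vowel of the second word.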

CA Syllable Structure in Generative Phonology
Syllable structure rules factor a word into trees corresponding to syllables.¹² The formulation successfully illustrates that the syllable is the domain of emphatic¹³ distribution in CA. As observed in (d), emphatic spread does not take place beyond the syllable.
One of the main contributions of generative phonology is the specification of prosodic domains. In the above examples, the domain of emphatic spread is illustrated. Emphatic spread is a phonological phenomenon in which an emphatic phoneme spreads its emphatic phonetic nature and alters the pronunciation of adjacent sounds. The psychologically real emphatic sounds in CA are (tˤ, dˤ, sˤ, zˤ, qˤ); in addition, CA has some emphatic allophones (lˤ and rˤ). From the examples above, we see that spread takes place within the domain of the syllable and not beyond. For example, in (5a) the emphatic spreads to the second syllable and, hence, the [f] becomes emphatic [fˤ], whereas in (5b) the [f] is not emphatic. The same applies in (6c & d), where the [gˤ] is emphatic in (6c) but not in (6d). These rules give evidence that the domain of emphasis in CA is the syllable. Although the process described here is a segmental process and is not discussed further in this paper within the OT framework, it is necessary to give credit to the early theories that captured important processes related to the syllable in CA.
However, rules could not capture many phonological processes, such as epenthesis between consonant clusters; in some cases two distinct rules are required to account for this single process (Spencer, 1996). This is because rule-based formulations describe all relevant phonological information about a word in a linear representation in which strings of segments are grouped together. This led researchers to adopt non-linear phonological theories such as autosegmental phonology (Goldsmith, 1976), as illustrated in (7).

7) [ʔultɨlu<h>] "I told him" (adapted from Watson, 2002: p. 65)

Through autosegmental phonology, phenomena such as extrametricality are successfully explained and represented. As demonstrated, these approaches successfully represented an intricate phonological phenomenon serially and in steps. However, no single approach was able to represent all relevant processes and phenomena concerning CA structure in a parallel way. In other words, each theory was able to describe a certain phenomenon. To explain intricate information related to the syllable, such as the ban on onsetless syllables, extrametricality and the appendix, syllable weight by position, and epenthesis and deletion of vowels, generative phonology has produced the OT approach. OT discusses, analyzes, and represents phonological phenomena and processes in parallel by invoking the concepts of constraints, violability of constraints, and constraint ranking. It is parallel because in one tableau candidates are examined against certain constraints ranked according to a language-specific ranking. The candidate that best satisfies the high-ranking constraints is the optimal candidate, as evidenced by phonetic outputs in the language. I discuss OT below.

OT (McCarthy & Prince, 1993; Prince & Smolensky, 1993) is a constraint-based approach to phonological well-formedness. It posits that Universal Grammar has a set of violable universal constraints (CON). These constraints encompass universal properties of languages. All universal constraints are available in every language in the world. However, each language has its particular ranking of these constraints, i.e., a certain hierarchy. Some languages may rank a certain constraint high in the hierarchy while others may rank the same constraint very low. This difference in constraint ranking explains the variation that arises between languages. In addition, OT adopts a representational format, i.e., the tableau, in which the candidate that optimally satisfies a given constraint ranking wins over all other candidates produced by GEN (the generator that creates linguistic candidates). The grammar decides on the winner through EVAL, which selects the candidate that best satisfies the high-ranked constraints. Note that a given language may have constraints that are high ranked and yet violable. What matters is that the winning candidate incurs the fewest violations of a given high-ranked constraint.

12 Adapted from Broselow (1979: p. 347).
13 Emphatic sounds are stops, fricatives and laterals [tˤ, dˤ, sˤ, zˤ, lˤ, rˤ], which are characterized acoustically by a lowered second formant and articulatorily by a constriction in the pharyngeal cavity due to the retraction of the tongue root.
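The way EVAL selects a winner under a ranked hierarchy can be sketched as a lexicographic comparison of violation profiles. The Python below is a minimal illustration, not a definitive implementation of OT: the function name `eval_ot` is hypothetical, the candidate set stands in for the output of GEN, and the violation counts for the "America" example are simplified assumptions restricted to three constraints. The sketch also treats the hierarchy as a strict total order, whereas the paper leaves ONSET and MAX-IO unranked with respect to each other.

```python
def eval_ot(candidates, ranking):
    """Pick the optimal candidate under a strictly ranked constraint list.

    `candidates` maps each output form to a dict of violation counts;
    `ranking` lists constraint names from highest to lowest ranked.
    Profiles are compared constraint by constraint from the top, so a
    single violation of a higher constraint outweighs any number of
    violations of lower ones.
    """
    def profile(form):
        return tuple(candidates[form].get(c, 0) for c in ranking)

    return min(candidates, key=profile)


# Assumed, simplified violation profiles for the loanword "America".
ranking = ["ONSET", "MAX-IO", "DEP-IO"]
candidates = {
    "ʔam.rii.ka": {"DEP-IO": 1},  # glottal stop inserted
    "am.rii.ka": {"ONSET": 1},    # faithful but onsetless
    "mrii.ka": {"MAX-IO": 1},     # initial vowel deleted
}

print(eval_ot(candidates, ranking))
```

With DEP-IO ranked below ONSET and MAX-IO, the candidate with the inserted glottal stop is selected, in line with the attested form [ʔam.rii.ka].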

Optimality Theory and Constraints
OT offers an approach to linguistic theory that aims to combine universality and markedness. In terms of universality, Universal Grammar provides the theory with a set of constraints that are universal and universally present in all grammars. As for markedness, it aims to present a precise formal sense of what it means to be "unmarked." In OT, forms are marked with respect to some constraint if they violate it. These forms are literally marked in that they incur violation marks for the constraint as part of their grammatical derivation. In other words, OT postulates that both Constraint-unmarked-structure and the Constraint-marked-structure are present in the grammar of a language. Constraint ranking decides which of these structures surfaces in the language. Low ranked constraints are dominated by other high ranked constraints.

Syllable Structure Constraints
The constraints that are at play in CA syllable structure are the following.
3. Syllable structure constraints
MAX-IO: Input segments must have output correspondents. No deletion. Every segment in the input should correspond to a segment in the output (McCarthy & Prince, 1995).
DEP-IO: Output segments must have input correspondents. No epenthesis. Every segment in the output should correspond to a segment in the input (McCarthy & Prince, 1995).
ONSET: Syllables must have onsets. A syllable must start with a consonant (Prince & Smolensky, 1993).
NOCODA: Syllables must not have codas (Prince & Smolensky, 1993).
*COMPLEX CODA: Syllables can have a coda, but no more than one consonant may associate to the syllable node (Prince & Smolensky, 1993).
*APPENDIX (*APP): An unsyllabified segment is forbidden (Prince & Smolensky, 1993).
ALIGNR (σ, PrWd): Align the right edge of each syllable with the right edge of some prosodic word (McCarthy & Prince, 1993).

As discussed above, syllable structures in CA are of three categories: light CV, heavy CVC and CVV, and super-heavy CVCC and CVVC, which occur only word- and phrase-finally. However, based on the premise that the final coda consonant of a word is an appendix (see structure 4), I propose that syllable structures are maximally CVC and CVV. Consider the following generalizations.
4. Syllable structure generalizations
a) Syllable structures are maximally CVC and CVV.
b) Complex codas are not allowed in syllable-final position.
c) Consonant clusters of more than two consonants are prohibited, not only tautosyllabically, but also across a syllable boundary.
d) A vowel is epenthesized to break a consonant cluster resulting from morpheme or word concatenation.
e) A vowel is deleted if it is high and in a two-sided open syllable.
f) A vowel never starts a syllable, and hence an onsetless syllable is forbidden.

Tableaux 1-16 exemplify the above generalizations by means of constraint ranking and interaction, which determine optimal syllable structures.

ONSET, MAX-IO >> DEP-IO
I adopt the comparative tableau of Prince (2002) and the combination tableau of McCarthy (2008).¹⁴ The combination tableau illustrates the ranking between constraints, as well as violation marks. In the tableau, each losing candidate is compared to the winning candidate with regard to each constraint. (W) denotes that the constraint in question prefers the winner to the losing candidate. This is because the winner satisfies the constraint but the losing candidate does not, as specified by the violation mark (*). (L), by contrast, denotes that the given constraint prefers the losing candidate to the winner. Observe the relationship between ONSET, MAX-IO, and DEP-IO in the following tableaux.

In Tableau 2, candidate (a) wins at the expense of DEP-IO, which stipulates that the output (i.e., surface form) should correspond to the input (i.e., underlying form), and thus that nothing be inserted. The competition between MAX-IO and DEP-IO is won by MAX-IO, as illustrated in Tableau 3. Candidate (a) is the optimal candidate, as it satisfies MAX-IO but violates DEP-IO, because it inserts a vowel to syllabify the consonants [t.l].

Tableau 3
katab-t-l-ha "he wrote to her"    MAX-IO    DEP-IO
→ a. ka.tab.til.ha                          *
  b. ka.tabl.ha                   **W       L

14 This tableau is a combination tableau adapted from McCarthy (2008a: pp. 46-47). According to McCarthy, the tableau ensures that the first two requirements of a valid ranking are met, which are constraint conflict and a winner. The cells with W and L show that these constraints compete over the choice of the winner. For the winner to win, the constraint with the W must be ranked higher than the one with the L. According to McCarthy: "The comparative or combination format is best for the ranking problem." (2008a: p. 48). McCarthy advocates the combination tableau since it includes violations as well as the W and L annotations of the comparative tableau.
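The W and L annotations of a combination tableau follow mechanically from the violation counts, which a short sketch can make explicit. The Python below is illustrative and uses hypothetical function names; it assumes that each candidate is given as a dict of violation counts and that a ranking is a strict order from highest to lowest constraint. The violation counts are those of Tableau 3.

```python
def comparative_row(winner, loser, constraints):
    """Annotate one losing candidate against the winner.

    A constraint is marked "W" if it prefers the winner (the loser has
    more violations), "L" if it prefers the loser, and "" if it does not
    distinguish the two candidates.
    """
    row = {}
    for c in constraints:
        w, l = winner.get(c, 0), loser.get(c, 0)
        row[c] = "W" if l > w else "L" if l < w else ""
    return row


def ranking_supports_winner(row, ranking):
    """Check that the highest-ranked constraint that distinguishes
    the two candidates is marked W rather than L."""
    for c in ranking:
        if row[c] == "W":
            return True
        if row[c] == "L":
            return False
    return False  # no constraint separates the winner from the loser


# Violation counts taken from Tableau 3.
winner = {"DEP-IO": 1}   # a. ka.tab.til.ha
loser = {"MAX-IO": 2}    # b. ka.tabl.ha

row = comparative_row(winner, loser, ["MAX-IO", "DEP-IO"])
print(row)
print(ranking_supports_winner(row, ["MAX-IO", "DEP-IO"]))
print(ranking_supports_winner(row, ["DEP-IO", "MAX-IO"]))
```

Only the ranking MAX-IO >> DEP-IO places the W above the L, which is the ranking argument Tableau 3 is meant to establish.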

DEP-IO >> NOCODA
Coda consonants are optional, as demonstrated by structure (1) above. This suggests that the NOCODA constraint must be low ranked. Tableau 4 demonstrates the ranking of NOCODA in relation to DEP-IO; the relationship between NOCODA and the rest of the constraints is discussed below. Tableau 4 shows that DEP-IO, which is a faithfulness constraint, dominates NOCODA, which is a syllable structure markedness constraint. The winning candidate (a) obeys DEP-IO by inserting nothing, although it thereby violates NOCODA. Candidate (b) satisfies NOCODA but violates DEP-IO by inserting a vowel. This tableau shows a direct ranking between the two constraints: DEP-IO >> NOCODA.
Based on the analysis so far, we can conclude the following proposed hierarchy.

Syllables in CA Are Maximally CVC or CVV
Analysis of CA syllables demonstrates that syllables are maximally CVC or CVV. In OT terms, *[µµµ]σ is a high-ranked constraint and dominates the constraints *APP and NOCODA, as the tableaux demonstrate.

CVCC, CVVC Complex Codas
Complex codas (CVCC), as mentioned earlier, are not allowed word- or phrase-medially because, as established above, CA syllables are maximally CVC or CVV. However, word- and phrase-finally, CVCC and CVVC syllables are allowed; they violate *APP but satisfy other high-ranked constraints, namely *COMPLEX CODA. See the following tableau. *COMPLEX CODA has to be highly ranked because complex codas are never permitted. NOCODA, on the other hand, has been shown to be low ranked (see 11 above). DEP-IO has to be ranked lower than *COMPLEX CODA but higher than NOCODA, as shown in Tableau 4. Tableaux 13 and 14 demonstrate the interaction between MAX-IO, *COMPLEX CODA, and DEP-IO. In word and morpheme concatenation, a repair strategy occurs in which vowels are inserted and also deleted in order to achieve a licit syllable structure. The following data illustrates that, if there is a cluster of three (CCC) or four consonants (CCCC), a vowel is inserted between the second and third consonant (Davis & Zawaydeh, 1997: p. 47, as in 7a), and deleted (Broselow, 1979: p. 349) [...] and correspondence constraints (McCarthy & Prince, 1995) are invoked to supplement the syllable structure constraints mentioned above.

Epenthesis and Deletion of Vowels in OT in the Syllable, Prosodic Word and Phrase
Tableau 15 shows the role of ALIGNR (σ, PrWd) in accounting for epenthesis in CA. No direct ranking relationship is found between this constraint and the rest of the constraints. The mora symbol, Greek mu [μ], is used to represent violations of ALIGNR. Candidate (a) is the winner, despite its violations of ALIGNR. The candidate has three mora violations because the right edge of its first syllable is two moras away from the right edge of the word and the right edge of its second syllable is one mora away from the right edge of the word. Candidate (a) wins because it obeys *COMPLEX CODA. Candidate (b) violates NOCODA and *COMPLEX CODA.
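The mora count for ALIGNR can be reproduced with a few lines of arithmetic. The sketch below is an illustration under assumptions: it takes the mora count of each syllable as given, and it measures the distance of a syllable's right edge from the right edge of the word as the number of moras in all following syllables. The function name and the sample mora profile are hypothetical, since the candidate form of Tableau 15 is not reproduced here.

```python
def align_r_violations(moras_per_syllable):
    """Count ALIGNR (σ, PrWd) violations in moras.

    For each syllable, the violation is the number of moras that separate
    its right edge from the right edge of the word, i.e. the moras of all
    syllables to its right. The word-final syllable incurs no violation.
    """
    total = 0
    for i in range(len(moras_per_syllable)):
        total += sum(moras_per_syllable[i + 1:])
    return total


# Assumed profile: three syllables whose last two syllables are monomoraic.
# The first syllable is two moras from the right edge, the second one mora.
print(align_r_violations([2, 1, 1]))
```

For this assumed profile the total is three, matching the three mora violations reported for candidate (a).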
As for the deletion of the vowel in /bint kibiira/ → [bintikbiira], I would like to invoke a phrasal syncope rule noted by Broselow (1976: p. 3; reported by McCarthy, 2008: p. 18). This rule deletes unstressed high vowels in two-sided open syllables. I would also like to use the definition of the constraint as postulated by Abu-Mansour (1991).

Candidate (a) wins because it satisfies the high-ranked constraints, i.e., *COMPLEX CODA, NOCODA, and -*V, +hi]$:, despite its violations of ALIGNR and LIN.¹⁷ Notice that the sequence of the vowels is metathesized in the winning candidate: /bint kibiira/ → [bintikbiira].
To conclude this section, I provide in Tableau 17 a summary of the syllable structure constraints and their ranking within the spoken Cairene Arabic data set, followed by a hierarchy of constraint ranking in (11).

Conclusion
CA syllable structure has been the focus of investigation and analysis in generative phonology since the 1970s. Theories have investigated several phonological phenomena specific to the syllable in CA. For example, as discussed in this paper, emphatic spread takes the syllable as its domain, CVC syllables are considered heavy word-internally but not word-finally, clusters of more than two consonants are forbidden word-internally, and epenthesis applies across the board to ensure that syllable well-formedness is obeyed. Theories of generative phonology analyzed these phenomena piecemeal. No earlier theory was able to capture all these phenomena in a parallel representation as Optimality Theory does. In this paper, OT formulations explain the following: 1) why CA syllables are maximally of two moras, namely because *[µµµ] is ranked high; 2) why CVC syllables are considered heavy word-internally but not word-finally, namely because of the interaction between *[µµµ], NOCODA, and *APP; and 3) how to account for epenthesis and deletion of vowels in one representation, namely by invoking constraints such as -*V,+hi]$:, ALIGNR, and LIN. This paper joins the few studies (Broselow, 1976, 1979, 1988; Watson, 2002; Welden, 1980) that analyze mainly CA data without referring to data mixed with classical Arabic. As mentioned above, analyses presented in the literature are largely based on data from Mitchell (1956), referred to as Cairene classical Arabic. The OT analysis and formulations presented in this paper are provided to explain the spoken CA data rather than the reverse, i.e., to explain formulations through data.