Auditory ERP Differences in the Perception of Anticipated Speech Sequences in 5- to 6-Year-Old Children and Adults

The perception of complex auditory information such as complete speech sequences develops during human ontogeny. To explore age differences in the auditory perception of predictable speech sequences, we compared event-related potentials (ERPs) recorded in 5- to 6-year-old children (N = 15) and adults (N = 15) in response to anticipated speech sequences presented as successive and reverse digital series with randomly omitted digits. The ERPs obtained from the omitted digits differed significantly between adults and children in the amplitude and latency of the N200 and P300 components, whereas for the presented digits the N400 and LPC components were more pronounced in children in the right frontal area. These findings indicate that the perception of a successive speech structure is less automated in children and requires a detailed analysis of the successive structure and error detection. These differences in auditory ERPs reflect developmental changes in the auditory perception of speech sequences and can serve as indicators of the maturity of cognitive functions in children.


Introduction
The present study aims to investigate the perception of successive auditory stimuli, such as anticipated speech sequences, and its dynamics during human ontogenesis. The processing of auditory information is a complex system consisting of different elements that are required to continuously extract crucial information from the total informational flow. The most important features of this system are the perception of the sound sequence, the extraction of speech phonemes and the emotional components of speech, and the perception of everyday noises and music. Children are not born with all of the mechanisms necessary for the perception of auditory information [1]; they have to learn to hear as they grow and develop.
It is well known that event-related potentials (ERPs) in response to auditory stimuli exhibit different features in children and adults. These developmental features are reflected in changes in the amplitude, latency and localisation of auditory potentials, which can be observed as children grow older. Both genetic and environmental factors contribute to these changes. Details in the formation of these networks are, to a great extent, affected by environmental factors; for example, synaptic connections receiving concurrent input are strengthened, while others are weakened or eliminated [1] [2]. Thus, specific experiences in the perception of pitch sounds and sounds containing fine timing differences affect later auditory development. At a higher level, the specific languages to which children are exposed also contribute substantially to auditory maturation, enabling efficient processing of specific pitch systems, rhythmic structures and phonemic categories [3] [4].
Thus, during development, specific complex brain mechanisms form and become attuned to the perception of contextual components of auditory information [1] [5] [6]. For this purpose, a coordinated network of different brain structures is necessary, which can be achieved through the development of a specific mechanism of regulation. Further, the perception and assimilation of speech stimuli organised sequentially in time require the automation of motor acts; simpler sequences are automated and then incorporated into more complex hierarchical sequences. Disorders of this regulation may be observed in cases of brain lesions and may appear elementary, similar to systematic perseverations [7] [8]. During education, a child begins to perceive language sequences not as separate sound components, but as a whole sound structure that is organised sequentially. Auditory speech sequences learned in childhood, such as a numeric series or the months, are perceived by adults as a whole entity, and the modification or omission of some part of such a sequence disturbs the whole and is thus perceived as an error. Therefore, contextually organised auditory information is perceived as an integral structure, and changes in its individual elements are perceived as a violation of its integrity. For example, an omission or change in the order of digits is perceived as a subjective disturbance. Owing to a sustained educational process, the perception of meaningful auditory sequences changes with age. Only a coordinated network of various brain structures provides an adequate perception of complex speech information, and this requires training during ontogenesis.
This study examines age differences in the perception of successive components of auditory information, using the example of anticipated speech sequences such as the digital series 1-10 and 10-1. ERPs to simple speech sounds arranged in the form of complex auditory information describe not only the perception of a single acoustic element but also the perception of the whole structure of the auditory information. In particular, the familiarity of the auditory information can cause a variation in the N400 component of the ERP, which is associated with a violation of a logic-imposed stimulus presentation, with unfamiliar words or words with vague meaning [9], and with the assessment of relationships between words [10]. In children, the stimuli in a speech sequence can be perceived separately from each other, whereas in adults, the perception of sequential stimuli is inseparably linked with the perception of the speech sequence as a whole. Based on this assumption, we expect the ERP components to reflect an age-specific auditory perception of the successive characteristics of speech information.

Participants
Fifteen healthy adults (7 males and 8 females) and 15 children (8 boys and 7 girls) participated in this study. The adult subjects were aged 20 to 55 years (mean age = 29.8 years), and the children were aged 5 to 6 years (mean age = 5.7 years). All of the participants were right-handed, and none of the subjects were reported to have any seizure-related disorders, to have vision or hearing problems, or to be taking any medication. The Human Research Ethics Committee of the Institute of Higher Nervous Activity and Neurophysiology of RAS approved this research protocol.

Stimuli and Procedure
The subjects sat in a comfortable position in an armchair in front of a computer screen with a fixation symbol. Acoustic stimuli were presented binaurally via earphones (Philips O'Neill SHO9575) with a loudness of 70 dB. Sound generation was synchronised using Presentation Software (Neurobehavioural Systems).
Ascending and descending digital series with omitted digits were presented to simulate the disturbance of a successive order of auditory information. The series of digits were presented in a pseudo-randomised sequence. The digits of the ascending and descending digital series (from 1 to 10 and from 10 to 1) were articulated by a female speaker with a duration that varied from 400 to 700 ms depending on the digit delivered and were presented with 800 ms inter-stimulus intervals. During the presentation, one digit was omitted randomly from each sequence, and the omitted digit was replaced with a fragment of the natural sound. In total, each digit from 1 to 10 was presented 54 times, and different digits were omitted 50 times.
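The construction of such a stimulus list can be illustrated with a minimal Python sketch. It is not the authors' Presentation script: the function names, the balanced number of ascending and descending series, and the uniform choice of the omitted position are all assumptions made for illustration, since the text does not specify how the pseudo-randomisation was constrained.

```python
import random

def make_series(ascending, rng):
    """Return one 10-item series as (digit, is_omitted) pairs with exactly one omission."""
    digits = list(range(1, 11)) if ascending else list(range(10, 0, -1))
    # Assumption: the omitted position is drawn uniformly from the ten positions.
    omitted_pos = rng.randrange(len(digits))
    # An omitted digit would be replaced by the filler sound at playback time.
    return [(digit, pos == omitted_pos) for pos, digit in enumerate(digits)]

def make_block(n_series, seed=0):
    """Return a shuffled list of ascending and descending series (hypothetical helper)."""
    rng = random.Random(seed)
    # Assumption: directions are balanced, then shuffled into a pseudo-random order.
    directions = [i % 2 == 0 for i in range(n_series)]
    rng.shuffle(directions)
    return [make_series(ascending, rng) for ascending in directions]

block = make_block(n_series=4)
for series in block:
    print(" ".join("_" if omitted else str(digit) for digit, omitted in series))
```

Each printed row shows one series, with an underscore marking the position where the digit was replaced.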

Electrophysiological Recordings
This study was performed using an EEG recording device "Encephalan" with the recording of polygraphic channels (Poly4, Medikom MTD, Taganrog, Russian Federation).
A continuous EEG was recorded using AgCl electrodes that were placed at 19 scalp sites (Fp1, Fp2, F7, F3, Fz, F4, F8, T7, C3, Cz, C4, T8, P7, P3, Pz, P4, P8, O1, O2) according to the International 10-20 system. A vertical EOG was measured with AgCl cup electrodes that were placed 1 cm above and below the left eye, and a horizontal EOG was measured from electrodes that were placed 1 cm lateral to the right eye. The electrodes placed on the left and right mastoids served as references, and the electrode impedance was kept below 5 kΩ. The EEG was amplified 1000 times, band-pass filtered between 0.1 and 50 Hz and sampled at 500 Hz.

Data Analysis
The EEG data were analysed with Matlab 7.11.0 (MathWorks Inc.) and the EEGLAB 9_0_4_6s plugin for Matlab. Ocular artefacts were removed from the data using independent component analysis (ICA). The blink-related components were identified on the basis of their waveform, scalp topography and frequency spectra. These components were then rejected from the data. Next, epochs ranging from -400 to +1000 ms around the stimulus onset were separately extracted for each condition in each session. The ERPs were time-locked to the stimulus onset, and a baseline correction was applied using a 400 ms pre-stimulus baseline.
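The epoching and baseline-correction step can be sketched as follows. This is a simplified, single-channel Python illustration under stated assumptions, not the EEGLAB pipeline used in the study: the function names are hypothetical, the signal is a plain list of samples, and epochs that would run past the edges of the recording are simply skipped. The 500 Hz sampling rate and the -400 to +1000 ms window are taken from the text.

```python
from statistics import fmean

FS = 500        # sampling rate in Hz, as stated in the recording description
PRE_MS = 400    # pre-stimulus interval in ms, also used as the baseline
POST_MS = 1000  # post-stimulus interval in ms

def extract_epochs(signal, onsets, fs=FS, pre_ms=PRE_MS, post_ms=POST_MS):
    """Cut baseline-corrected epochs around stimulus onsets given as sample indices."""
    pre = pre_ms * fs // 1000    # 200 samples before onset
    post = post_ms * fs // 1000  # 500 samples after onset
    epochs = []
    for onset in onsets:
        if onset - pre < 0 or onset + post > len(signal):
            continue  # skip epochs that do not fit inside the recording
        segment = signal[onset - pre:onset + post]
        baseline = fmean(segment[:pre])  # mean of the pre-stimulus samples
        epochs.append([value - baseline for value in segment])
    return epochs

def average_erp(epochs):
    """Average the epochs sample by sample to obtain the ERP waveform."""
    return [fmean(samples) for samples in zip(*epochs)]
```

With these settings each epoch holds 700 samples, and sample index 200 corresponds to the stimulus onset.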
The ERPs to the digital series were averaged across 50 artefact-free epochs (48 presentations of each digit and 50 omitted digits) for each subject separately.
The latencies and peak amplitudes of the negative and positive ERP components were determined for the corresponding maximal deflections from baseline in the following time windows from the stimulus onset: N100 at 50 to 150 ms, P200 at 160 to 260 ms, N200 at 180 to 230 ms, P300 at 250 to 350 ms, N400 at 350 to 450 ms, P600 at 500 to 650 ms, and a late positive component (LPC) at 750 to 900 ms.
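The peak measurement described above can be sketched in a few lines of Python. The time windows are those listed in the text; everything else is an assumption for illustration: the function names are hypothetical, the ERP is a single-channel list sampled at 500 Hz with a 400 ms pre-stimulus interval, and the peak is taken as the most negative (N components) or most positive (P components and LPC) sample in the window.

```python
FS = 500      # sampling rate in Hz
PRE_MS = 400  # pre-stimulus interval contained at the start of the ERP

# component: (window start in ms, window end in ms, polarity)
WINDOWS = {
    "N100": (50, 150, -1),
    "P200": (160, 260, +1),
    "N200": (180, 230, -1),
    "P300": (250, 350, +1),
    "N400": (350, 450, -1),
    "P600": (500, 650, +1),
    "LPC": (750, 900, +1),
}

def peak_in_window(erp, start_ms, end_ms, polarity, fs=FS, pre_ms=PRE_MS):
    """Return (amplitude, latency in ms) of the maximal deflection in the window."""
    lo = (start_ms + pre_ms) * fs // 1000
    hi = (end_ms + pre_ms) * fs // 1000
    window = erp[lo:hi + 1]
    # Multiplying by the polarity turns the search for a negative peak into a maximum search.
    best = max(range(len(window)), key=lambda i: polarity * window[i])
    latency_ms = (lo + best) * 1000 / fs - pre_ms
    return window[best], latency_ms

def measure_components(erp):
    """Measure every component listed in WINDOWS for one ERP waveform."""
    return {name: peak_in_window(erp, *spec) for name, spec in WINDOWS.items()}
```

Because the P200 and N200 windows overlap, the polarity argument is what separates the two measurements within the same stretch of data.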
Statistical analyses were performed using "Statistica 6" (StatSoft) and Matlab 7.11.0 (MathWorks Inc.). First, the data were analysed using t-tests for each channel in Matlab 7.11.0, and significant differences were verified by repeated-measures ANOVAs. Further, the averaged amplitudes and latencies of the ERP components were subjected to ANOVA. A separate mixed-model repeated-measures ANOVA was performed to examine the effects of lateralization (left: Fp1, F3, C3, T3, P3; right: Fp2, F4, C4, T4, P4) and topography (prefrontal: Fp1/Fp2; frontal: F3/F4; central: C3/C4; temporal: T3/T4; parietal: P3/P4) on the different ERP components in both groups. Greenhouse-Geisser corrections are reported. Post hoc analysis was performed using the Tukey honestly significant difference (HSD) test. The ERPs for the conditions were compared using a cluster randomisation test (Matlab 7.11.0), and the significance level of the temporal and spatial clusters was set at p < 0.05. The ERPs in response to the different types of stimuli were investigated for both adults and children. The repeated-measures ANOVA was performed on the grand-averaged ERPs to all digits and to the omitted digits in the sequence. Differences within and between factors/groups were considered statistically significant at p < 0.05. Significant main effects and interactions were further analysed with post hoc Tukey multiple-range tests (alpha = 0.05).

ERPs in Response to a Numerical Series
In adults, the following ERP components were found during the presentation of all digits in the series: N100 with an amplitude of up to 0.5 μV at latencies of 80-100 ms in the frontal, central, temporal and parietal regions; P200 with an amplitude of up to 2.5 μV at latencies of 180-230 ms in the frontal, central, temporal and parietal regions; N400 at latencies of 380-450 ms in the frontal, parietal and central areas; and LPC with an amplitude of up to 0.7 μV at latencies of 750-900 ms in the central, temporal and parietal regions.
In the "omitted digit" condition in adults, the following ERP components were found: N200 with an amplitude of up to 1 μV at latencies of 200-260 ms in the frontal and central regions; P300 with an amplitude of up to 2.5 μV at latencies of 300-350 ms in the frontal and parietal areas; and a late negative component with an amplitude of up to 1 μV at latencies of 700-800 ms.
The averaged ERP to each separate digit in the sequence was compared with the ERP to the same omitted digit for each subject. There were no significant differences in the amplitudes and latencies of the described ERP components between different digits, nor were there significant differences between the ERPs for the different omitted digits.
In adults, significant differences between the presence and absence of digits were observed at latencies of 70-280 ms in the right frontal and central regions, where the evoked response to the digits of the numerical series was significantly more positive. At latencies of 260-480 ms in the left frontal and central regions, the evoked response to digits was more negative than the evoked response to omitted digits. In contrast, at latencies of 750-950 ms, the evoked response to digits in adults was bilaterally more positive in the frontal and temporal areas (t-test, p < 0.05).
In children, the following ERP components were found during the presentation of a numerical series: N100 with an amplitude of up to 5.5 μV at latencies of 70-130 ms in the frontal regions; P200 with an amplitude of up to 1.5 μV at latencies of 180-220 ms in the frontal, central, temporal and parietal regions; N400 with an amplitude of up to 2 μV at latencies of 320-400 ms in the frontal, parietal and central areas; P600 with an amplitude of up to 6.5 μV at latencies of 500-650 ms in the frontal and central regions; and LPC at latencies of 850-900 ms in the frontal areas (t-test, p < 0.05).
In children, the absence of an expected digit elicited the following ERP components: N200 with an amplitude of up to 2.5 μV at latencies of 180-220 ms in the frontal and central regions; P300 with an amplitude of up to 7.5 μV at latencies of 280-350 ms in the parietal, frontal and temporal areas; and a late negative component with an amplitude of up to 1 μV at latencies of 700-800 ms (t-test, p < 0.05).
In children, significant differences between the presence and absence of digits were observed bilaterally at latencies of 70-190 ms in the frontal regions, where the evoked response to the digits was significantly more positive. Significant differences between the presence and absence of digits were also observed at latencies of 260-480 ms in the frontal regions. At latencies of 750-950 ms, the evoked response to digits was more positive in the frontal areas and more negative in the temporal areas (t-test, p < 0.05).

T-Test of Differences in Response to the Digits
Differences in the ERPs between adults and children were even more pronounced in response to digits. At latencies of 50-200 ms in the frontal, central, and left temporal areas, the ERPs of adults were significantly more positive (p < 0.01). Moreover, the ERPs at latencies of 350-650 ms in the frontal and central regions were significantly more positive in children (p < 0.01).

T-Test of Differences in Response to the Omitted Digits
The ERPs in response to the omitted digits were similar in their components but differed in amplitude and latency in children compared with adults. The ERP was significantly more positive in the left frontal areas at latencies of 150-200 ms in adults compared to children (p < 0.01). In contrast, the ERP was significantly more positive in children at later latencies of 250-600 ms in the central, parietal and temporal regions (p < 0.05) (Table 1).

ANOVA Repeated-Measures Effects in Response to the Digits
The N1 amplitude to the digits was larger in children than in adults (the main effect of Age: F(1, 29) = 48.883, p < 0.00001) and was most pronounced in the frontal areas (the mixed effect of Age * Topography: F(4, 116) = 62.375, p < 0.0001). Moreover, the N1 amplitude was larger in the left hemisphere in children but in the right hemisphere in adults (the mixed effect of Age * Lateralization: F(1, 29) = 19.918, p = 0.00011). The N1 latency was longer in the central, temporal and parietal regions in adults (the mixed effect of Age * Topography: F(4, 116) = 10.375, p = 0.0089).
The P2 amplitude was larger in children than in adults (the main effect of Age: F(1, 29) = 4.7568, p = 0.03744). At the same time, the P2 amplitude was larger in the right hemisphere than in the left in adults, but it showed no lateralization differences in children (the mixed effect of Age * Lateralization: F(1, 29) = 64.539, p < 0.00001). The mixed effect of Topography * Lateralization was also found for the P2 amplitude (Figure 2). Lateral differences of the P2 amplitude in children were found in the frontal and temporal areas: it was higher in the left hemisphere than in the right hemisphere. In adults, lateral differences of the P2 amplitude were found at all electrodes: F(4, 116) = 7.7088, p = 0.00002. The P2 latency was longer in adults (the main effect of Age: F(1, 29) = 24.3, p = 0.0025).
Table 1. Appearance of the ERP components in children and adults in response to the digits and to the omitted digits in a digital series. The latency of each component was determined from the stimulus onset to the maximal peak.
The N400 amplitude to the digits was larger in the frontal and parietal areas (the main effect of Topography: F(4, 116) = 26.789, p < 0.00001) and was larger in the right hemisphere (F(1, 29) = 33.220, p < 0.00001). Although the N400 amplitude did not show a main effect of Age, the interaction effect of Hemisphere and Topography was found for the N400 amplitude in both adults and children. Adults showed a higher N400 amplitude in the parietal area of the left hemisphere, and children had a higher N400 amplitude in the right prefrontal areas (the mixed effect of Topography * Lateralization * Age: F(4, 116) = 2.5058, p = 0.04588). The N400 latency was longer in adults (the main effect of Age: F(1, 29) = 15.789, p = 0.0007), and this was most pronounced in the left temporal and parietal areas (the mixed effect of Topography * Lateralization * Age: F(4, 116) = 3.0979, p = 0.00978).
The late positive component (LPC) in response to the digits was larger in children than in adults (the main effect of Age: F(1, 29) = 36.345, p < 0.00001), and it was more prominent in the frontal areas in children and in the parietal areas in adults (the mixed effect of Age * Topography: F(4, 116) = 77.165, p < 0.0001) (Figure 2). The LPC latency was longer in children (the main effect of Age: F(1, 29) = 12.985, p = 0.00001).

ANOVA Repeated-Measures Effects in Response to the Omitted Digits
The N2 amplitude was larger in children than in adults (the main effect of Age: F(1, 29) = 27.648, p = 0.00001) and was most pronounced in the frontal and central areas (the mixed effect of Age * Topography: F(4, 116) = 14.934, p < 0.00001). The N2 amplitude was larger in the left hemisphere in children and in the right hemisphere in adults (the mixed effect of Age * Lateralization: F(1, 29) = 7.8990, p = 0.00877). The N2 latency was longer in adults (the main effect of Age: F(1, 29) = 16.9873, p = 0.00018).
The P300 amplitude in response to the omitted digits was larger in children than in adults (the main effect of Age: F(1, 29) = 50.176, p < 0.00001) and was smaller in the central regions in both children and adults (the main effect of Topography: F(4, 116) = 65.569, p < 0.0001). However, P300 was more prominent in the left hemisphere in both children and adults (the main effect of Lateralization: F(1, 29) = 21.305, p = 0.00007). The mixed effect of Topography * Lateralization * Age for P300 (F(4, 116) = 11.507, p < 0.00001) showed lateral differences in adults in the frontal and parietal areas (Figure 3). The lateralization differences in children were less developed and were found only in the frontal areas. The P300 latency was longer in adults in the left temporal and parietal areas (the mixed effect of Topography * Lateralization * Age: F(4, 116) = 8.2139, p = 0.00008). The amplitude of the late negative component was larger in the temporal and parietal regions in children and in the central regions in adults (the mixed effect of Topography * Age: F(4, 116) = 25.707, p < 0.00001).

Discussion
Adults can perceive a successive digital sequence as a whole cognitive image with a predictable meaning. An omission of digits or a change in their order disturbs the integrity of the speech sequence and changes its meaning. The present study aimed to find age-specific differences in the ERP components reflecting the auditory processing of complete speech images or sequences. Our results showed that during the presentation of successive speech sequences such as the digital series 1-10 and 10-1, the ERPs of adults and children differed in their components.
As shown previously, the N100 component of auditory ERPs is sensitive to the predictability of an auditory stimulus: it is weaker when the stimuli are repetitive and stronger when the stimuli are random [26]. In the present study, the N100 component was larger in children than in adults in response to the digits. At the electrophysiological level, the increased N100 amplitude suggests that the digits in the successive series are less expected by preschool children than by adults. The P200 component is known to be associated with sensory assessment and expectation of speech stimuli [11] [12]. In our study, an asymmetry of this component was registered only in adults, in whom it was larger in the right frontal area. Thus, the P200 asymmetry indicates that sound detection mechanisms differ between children and adults [13].
The N400 component was significant in response to the digits in both children and adults. However, the present study demonstrates an age-related difference in the distribution of N400 over the brain areas. In adults, the N400 amplitude was smaller in the right frontal areas and larger in the parietal areas of the left hemisphere, a pattern that was not observed in children. The N400 latency was also shorter in children. It has been shown that in adults N400 is typically recorded in the centro-parietal areas of the left hemisphere [10]. An influence of age on the N400 latency has also been reported [14], whereas the N400 amplitude did not show significant age differences [15]. As suggested previously, N400 represents the binding of information obtained from stimulus input with representations from short- and long-term memory [14]. In children learning new words, N400 appeared in the right frontal areas, as well as in response to congruent sentences [16]. The same increase of N400 in the right frontal areas of children was observed in our study in response to successive digits. Additionally, the N400 component is routinely recorded when unfamiliar words or words with vague meaning are presented [17]-[19]. A reduction in the N400 amplitude is also related to the degree of relatedness between the words in a priming paradigm [10]. The shorter N400 latency in children and the different distribution of N400 found in the present study may also reflect age-specific features of the perception of anticipated speech sequences.
In children, we also found a positive P600 component in response to the digits. This component was previously recorded in adults when grammatical errors were detected in sentences or when a sentence was uncoordinated and had a defective structure [20]. P600 was also elicited by errors in musical harmony, such as when a chord is played out of key with the rest of a musical phrase [21]. Thus, the appearance of P600 in the ERPs of children may be explained by a more detailed analysis of the successive structure and error detection. This analysis may be driven by a more active search for errors in the auditory sequence, which is required because the perception of auditory sequences is less automated in children. The LPC found in both adults and children in the parietal region within 500-900 ms indicates that previously memorised words are recognised as "old" [22]. This component was significantly more pronounced for old words than for new words [23]-[25]. In the present study, the LPC was larger in the frontal areas in children than in adults, indicating that children presumably actively evaluated the appearance of each subsequent digit as "old" and relevant to the context of the numerical series. Together with P600, the LPC may reflect a more active process of word evaluation and recognition in children (Table 2).
The omitted digits in the digital series from 1 to 10 and from 10 to 1 induced similar components in the evoked responses of both adults and children. Previous studies have shown that emitted potentials occur in the absence of any evoking stimulus and may be associated with a psychological process, such as recognition of a stimulus that has been omitted from a regular train, or with preparation for an upcoming perceptual or motor act. Emitted potentials, which are related to psychological processes rather than physical stimuli, comprise endogenous components that are related only to the context, rather than the features, of the stimulus [26]. In terms of the stages of information processing, the components of the ERP trace are thought to represent high-level cognitive processes that are required in conscious attention. The most common endogenous components are the N2 and P3 components [27] [28]. In the present study, a negative N200 component was found in both adults and children in the frontal and central regions. This component is associated with a sudden appearance of stimuli. Its amplitude is correlated with the unpredictability of the stimulus; it is lower when the stimuli are repeated and higher when the stimuli are presented randomly [6]. In the present study, the N200 amplitude was higher in children than in adults, and significant differences were observed in the left frontal area. As reported previously, the left frontal negative response at a latency of 220 ms in children can be a marker of the automatic mismatch-detection mechanism [3]. Thus, the increase of N200 in children in the present study may reflect a more active process of detecting mistakes in the digital series than in adults, related to a less automated counting process. The P300 component also indicated that the expected stimulus was not presented [29] [30]. In this study, the amplitude of P300 was higher in children than in adults, suggesting that when the stimulus is omitted from the regular train, the event is more unexpected for children. Both groups of subjects also exhibited a late negative component at latencies of 750-800 ms in the central and parietal regions. This component is potentially related to the analysis of mistakes in a sequence.
All of the described differences in the ERP components in response to both the presented and the omitted digits are consistent with the assumption that adults, to a greater extent than children, bind the digits in the sequence according to the anticipated successive order. The perception of a speech sequence consisting of successive digital ranges is more automated and consolidated in the life experience of adults. Therefore, the next digit in the sequence is more predictable for adults, while the omission of digits disturbs the perception of the whole sequence. Thus, the presented results indicate that auditory ERPs reflect age-specific differences in the processing of successive parameters of auditory information and can serve as indicators of the maturity of cognitive functions in children.

Conclusions
• In response to the successive digits, the N400 and LPC components were more pronounced in the right frontal area in children compared with adults, while the P600 component was found only in children. The increase in these ERP components in children indicates more active error detection. The processes of evaluation and recognition of each digit of the anticipated sequence are more active in children than in adults and require a detailed analysis of the successive structure and error detection.
• If one digit was omitted from the digital sequence, the N200 amplitude was higher in the left frontal area in children than in adults. The amplitude of P300 in response to the omitted stimuli was also higher in children in the frontal and temporal areas. These ERP components are connected with a higher level of attention. Their increase also reflects the age differences associated with the heightened expectation and more active recognition of each successive stimulus in the sequence in children compared with adults.

Figure 1 presents the obtained grand-averaged ERP waveforms and localisation of significant differences in

Figure 2. ANOVA repeated measures differences of the ERP amplitude between children and adults, the left and right hemispheres and the brain areas (prefrontal, frontal, central, temporal and parietal) for the P2 component in response to the digits. The mixed effect of Topography * Lateralization * Age for the P2 amplitude was F(4, 116) = 7.708, p = 0.00002.

Figure 3. ANOVA repeated measures differences of the ERP amplitude between children and adults, the left and right hemispheres and the brain areas (prefrontal, frontal, central, temporal and parietal) for the P300 component in response to the omitted digits. The mixed effect of Topography * Lateralization * Age for the P300 amplitude was F(4, 116) = 11.507, p < 0.00001.

Table 2. Significant differences in amplitudes, latencies of the ERP components and their localization in response to the stimuli presented monaurally into the right or into the left ear in children.