A Review of Models in Experimental Studies of Implicit Language Learning

The present review analyzes experimental research on implicit learning using linguistic stimuli, and proposes five key procedures of a framework for empirical studies of implicit learning. Our review begins with a brief overview of the current state of research on implicit learning, and then presents the procedures in detail: 1) choosing theoretical assumptions from psychology; 2) designing stimuli; 3) exposing subjects to information; 4) testing implicit learning; and 5) measuring subjects’ state of awareness. This framework is intended to assist researchers in designing experiments on implicit learning both more comprehensively and with fewer flaws.


Introduction
Shanks (2005) uses nine examples of earlier research to give a more explicit concept of implicit learning, and concludes that implicit learning can generally be characterized as learning that takes place both unintentionally and unconsciously. Interest in implicit learning has lasted about 50 years. Since Reber first coined the term "implicit learning" in 1967, numerous experiments have been done in this field. Until now, the central issue in implicit learning studies seems to have been to prove that what researchers thought had been learned implicitly really was acquired by implicit learning, and then to identify the cognitive processes involved, rather than, more fundamentally, to establish whether implicit learning in fact exists (Frensch & Rünger, 2003; Williams, 2009). Both psychologists and linguists are interested in the matter. Psychologists study it to learn more about human psychological mechanisms; linguists study it to learn more about the mechanisms of human language development. In this review, we focus on clinical research with linguistic features.
Researchers such as Williams (2004, 2005, 2009) endeavor to develop clinical methodologies and models that will make studies of implicit learning more reliable and persuasive. Clinical models are important gains from clinical studies. There are three general kinds of models: first, models based on offline methodology (Jiménez et al., 1996); second, models based on online methodology, mostly using reaction time (RT), ERP or fMRI (Cleeremans & McClelland, 1991; Clegg et al., 1998; Leung & Williams, 2006; Williams, 2004, 2005); and third, models based on computational methodology, mostly constructed according to constructivist and emergentist views (Cleeremans & McClelland, 1991; Dienes, 1992; Estes, 1957; Hintzmann, 1986; Perruchet & Vinter, 1998). Within these three general kinds, more detailed models have been developed through experimental practice. One of the most popular RT models is the one developed by Williams in 2004, which examined implicit learning through a series of explicit training sessions that controlled subjects' attention, recorded RT, and drew conclusions about which items had been learned implicitly (Chen et al., 2011; Leung & Williams, 2006; Williams, 2004, 2005).
Numerous models have been developed, but none of them is beyond dispute. On the one hand, almost all contain some elements or procedures that make them less reliable; on the other hand, no more scientific criterion has yet been found to guide researchers in planning their experimental procedures. In contrast, a surgeon follows a series of detailed and standardized preparation procedures before entering the operating room. We therefore seek to offer suggestions for developing and standardizing such necessary procedural steps for researchers in the clinical field of implicit learning.

Literature Search Strategy
We sought to identify published studies through searches of Elsevier, ScienceDirect, Springer, Google Scholar and Google using keyword, title and abstract information. Each of these databases allows searches of articles published before July 2013. The following search terms were used: implicit learning, implicit knowledge, artificial grammar learning, sequence learning, unconscious learning and learning without attention. We also searched manually through the reference lists of other relevant reviews and book chapters to identify further items.

Inclusion and Omission
Only English-language articles are included in the present review. To keep the review critical and manageable, we focus on clinical studies of implicit learning in relation to artificial grammar learning (AGL) and sequence learning (SL); the review is not exhaustive. Other paradigms, such as probability learning (Millward & Reber, 1968), melody learning (Rohmeier & Cross, 2010), visual search in complex stimulus environments (Chun & Jiang, 1999) and dynamic system control, have not been considered.

Controversial Theoretical Issues in Implicit Learning
Though this review focuses on experimental models of implicit learning, this section gives a very brief summary of three theoretical issues that are quite controversial and need to be settled, because they seem to be the sources of the inconsistency in experimental results on implicit learning. The first issue is the definition of implicit learning. In the introduction, we mentioned Shanks' conclusion that implicit learning is learning that takes place both unintentionally and unconsciously (Shanks, 2005). Definitions elsewhere (Cleeremans & McClelland, 1991; Clegg et al., 1998; Jiménez et al., 1996; Leung & Williams, 2006; Reber, 1967) give similar descriptions. The description itself is quite vague, because "unintentionally" and "unconsciously" are words without a settled definition. Another difficulty in defining "implicit learning" is whether it should include only learning that occurs implicitly or all kinds of learning except those that occur explicitly (Frensch & Rünger, 2003), since "implicit" is not strictly equivalent to "unaware", nor is "explicit" equivalent to "aware". This inconsistency leads researchers to design experiments with different concepts of implicit learning in mind (Cleeremans & McClelland, 1991; Clegg et al., 1998; Jiménez et al., 1996; Leung & Williams, 2006), and consequently makes the results of their experiments incomparable (Frensch & Rünger, 2003).
The second theoretical problem is that the processing mechanisms of implicit learning and explicit learning are unsettled. There is a dispute between the multiple-system hypothesis and the single-system hypothesis (Frensch & Rünger, 2003). The former holds that implicit learning and explicit learning use different processing systems, whereas the latter holds that the two use the same processing system; some even hypothesize that explicit learning develops from implicit learning. Because studies start from different concepts of the processing mechanisms, their results are again incomparable (see Frensch & Rünger, 2003, for detail).
The third theoretical issue is also very troublesome: the uncertainty of attention mechanisms. In experiments, researchers need to make the acquisition of stimuli implicit, or free of awareness, by controlling subjects' attention. This problem is discussed in more detail in Sections 4.1 and 4.3.
Though these theoretical issues do exist and do have negative consequences for research on implicit learning, they are unlikely to be settled any time soon. To some extent, however, we might be able to compensate for this by adopting more controllable models in clinical studies.

A Critical Review on Experimental Models in Clinical Studies
Our analysis shows that experimental models in clinical studies of implicit learning usually involve the following essential procedures: (1) choosing theoretical assumptions from psychology; (2) designing stimuli; (3) exposing subjects to information; (4) testing implicit learning; and (5) measuring subjects' state of awareness.

Choosing Theoretical Assumptions from Psychology
Researchers have conceived of various presuppositions about implicit learning. The two most famous are the following: (1) the shadow theory (Searle, 1992), which holds that there is an unconscious mind and a conscious mind, and that the two are just the same, only with consciousness absent in the former; and (2) the not-really-existing theory (Shanks & St. John, 1994), which holds that results in experiments are about instances rather than rules, and thus that learning of any kind of knowledge is explicit rather than implicit. Though there is still much to say about such presuppositions, we will not focus on them, but rather on the psychological suppositions adopted in clinical experiments.
In designing experiments that test implicit learning, all or at least most researchers (Cleeremans & McClelland, 1991; Clegg et al., 1998) try to ground their work in the achievements of psychology, since implicit learning is thought to be an integral part of psychology. In the training phase, researchers usually try to create conditions that promote implicit learning by controlling how subjects allocate attention; thus the most commonly cited supposition pertains to attention. "In psychology, the basic assumptions concerning attention have been that it is limited, that it is selective, that it is partially subject to voluntary control, that attention controls access to consciousness, and that attention is essential for action control and for learning" (Schmidt, 2001: 11). These assumptions are used mainly in the design of training; we therefore review their roles in a later section on exposure.

Designing Stimuli
Clinical research on implicit learning has focused essentially on two stimulus paradigms: artificial grammar learning (AGL) and sequence learning (SL). The following sections examine the two paradigms critically.

Artificial Grammar in Stimuli
Artificial grammar learning is arguably the most influential paradigm (Cleeremans & Dienes, 2008). In studies adopting artificial grammar, subjects are usually asked to memorize or look at a series of materials, then to select from test materials the ones that conform to the materials they have seen before, and to describe the rules on which they based their selection. Reber (1967) was one of the first researchers to adopt AGL as experimental information in the study of implicit learning. He asked subjects to learn a series of letter strings within a limited time and then told them that these strings were all constructed according to a particular set of rules (an artificial grammar that he had created). He then tested the subjects with new strings, asking which strings conformed to the rules referred to earlier. Subjects made decisions with better-than-chance accuracy, but their descriptions of the rules were rarely correct. Hence Reber concluded that the learning of the artificial rules was a phenomenon of implicit learning. Though Reber's conclusion was heavily criticized, many researchers have since taken to using artificial grammar to study implicit learning. Later versions of artificial grammar, however, have undergone many modifications (e.g., Reber, 1989; Berry & Dienes, 1993; Cleeremans et al., 1998; Pothos, 2007; Shanks, 2005; Wan et al., 2008).
Arguably more noteworthy are the following experiments, which try to bring the clinical stimuli closer to natural language. Williams (2004) used artificial nouns, artificial determiners and an artificial determiner-noun relationship as stimuli of implicit learning, but the determiners used had strong characteristics of gendered language determiners. Leung & Williams (2006) used artificial determiners, artificial syntax structure and an artificial determiner-agent/patient relationship as stimuli in Experiment 1. In Experiment 2, they used nouns, artificial determiners and an artificial determiner-noun relationship as stimuli of implicit learning, having removed the features of gendered language determiners, using English nouns instead of artificial nouns, and using pictures to make up for the lack of context. Rebuschat & Williams (2009) adopted a semi-artificial grammar consisting of English words and German syntax. Chen et al. (2011) conducted experiments on implicit learning in Chinese, based on Williams' (2004) model, using extremely low-frequency Chinese characters as determiners, together with Chinese nouns and an artificial determiner-noun relationship.
Although closer to natural language, these stimuli still have their own defects. The defect of Williams' (2004) stimuli is the gender features of the determiners; in Leung & Williams (2006), the defect results from the use of pictures, which might arouse other visual processing with the same effect as implicit learning. The stimuli in Rebuschat & Williams' (2009) experiments, drawn from German syntax, may be too close to those of English. In Chen et al. (2011), the stimuli themselves seemed good, but Chen classified them in Chinese as "structure": in fact, the stimuli, though in the position of determiner, were more likely to be elements of adjectives belonging to a semantic field in Chinese that is completely ideographic. More modifications, therefore, are expected in future experiments. One new direction is likely to call for stimuli closer to natural language in a natural context, with semantic and pragmatic features taken into consideration.
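The logic of the classical AGL test can be illustrated with a minimal Python sketch. The grammar below is a hypothetical four-state toy grammar, not the one Reber (1967) used, and the function names are illustrative choices of ours. The exact one-sided binomial test is one common way to check better-than-chance accuracy; we do not claim it is the analysis used in the studies cited.

```python
import random
from math import comb

# Hypothetical finite-state grammar (an assumption for illustration only).
# Each state maps to (letter, next_state) options; next_state None is a legal exit.
GRAMMAR = {
    0: [("T", 1), ("V", 2)],
    1: [("P", 1), ("T", 3)],
    2: [("X", 2), ("V", 3)],
    3: [("S", None), ("X", 1)],
}


def generate(rng, max_len=8):
    """Random walk through the grammar; retry until a string of at most max_len letters exits."""
    while True:
        state, letters = 0, []
        while state is not None and len(letters) <= max_len:
            letter, state = rng.choice(GRAMMAR[state])
            letters.append(letter)
        if state is None and len(letters) <= max_len:
            return "".join(letters)


def is_grammatical(string):
    """True if the string follows a legal path and ends exactly at an exit."""
    state = 0
    for i, ch in enumerate(string):
        options = dict(GRAMMAR[state])
        if ch not in options:
            return False
        state = options[ch]
        if state is None:
            return i == len(string) - 1
    return False  # letters ran out before an exit was reached


def binomial_p_above_chance(correct, total, chance=0.5):
    """Exact one-sided p-value: probability of at least `correct` hits by guessing."""
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    print([generate(rng) for _ in range(5)])          # training strings
    print(is_grammatical("TTS"), is_grammatical("TPS"))
    print(binomial_p_above_chance(correct=69, total=100))
```

A small p-value here only indicates that classification was better than guessing; it says nothing about whether the underlying knowledge was implicit, which is the question the awareness measures are meant to address.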

Sequence
In experiments in the sequence learning paradigm, subjects are usually meant to learn the order of elements in a sequence during a training course that asks them to react as fast as possible to the elements that appear. A subject who has learned the sequential feature of the elements needs much less time to respond to the elements coming up (Clegg et al., 1998). Nissen and Bullemer (1987), the first to adopt sequence learning in the clinical study of implicit learning, demonstrated an effect of learning without awareness of the sequential rules. Cleeremans and McClelland (1991) used a sequence of stimuli whose locations were determined by a finite-state grammar. Fu et al. (2008, 2010) adopted two second-order conditional sequences of numbers in a target-location task, in which the location of each number was determined by the locations of the previous two numbers. Implicit sequence learning has also been studied frequently in psychological studies of aging and other issues, as a window onto human brain function (Rieckmann & Bäckman, 2009).
Artificial sequences are popular in today's implicit learning studies, but they are often too artificial to engage subjects, and they cannot carry varied meanings. This makes those experiments resemble a mathematical or logic test. Even specialists in mathematics and logic believe that language is what we depend on to think. We believe that more linguistic features, particularly semantic and pragmatic features, should be added to the sequences in the future.
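The idea of a second-order conditional sequence can be made concrete with a short Python sketch. The 12-item sequence below is a hypothetical example over four locations that we constructed for illustration; it is not the material used by Fu et al. (2008, 2010), and the helper names are our own. The check treats the sequence as repeating cyclically, which is an assumption about how such sequences are presented.

```python
from collections import defaultdict


def successor_table(sequence, order):
    """Map each run of `order` consecutive items to the set of items that follow it.

    The sequence is treated as cyclic, i.e. it repeats without a break.
    """
    n = len(sequence)
    table = defaultdict(set)
    for i in range(n):
        context = tuple(sequence[(i + k) % n] for k in range(order))
        table[context].add(sequence[(i + order) % n])
    return table


def is_deterministic(sequence, order):
    """True if every context of length `order` has exactly one possible successor."""
    return all(len(nexts) == 1 for nexts in successor_table(sequence, order).values())


# Hypothetical sequence of screen locations 1-4 (illustration only).
SOC_EXAMPLE = [1, 2, 1, 3, 4, 2, 3, 1, 4, 3, 2, 4]

if __name__ == "__main__":
    # One preceding location does not predict the next one ...
    print(is_deterministic(SOC_EXAMPLE, order=1))
    # ... but the two preceding locations do.
    print(is_deterministic(SOC_EXAMPLE, order=2))
```

In a sequence of this kind, knowing only the previous location leaves the next one ambiguous, so faster responses are usually taken to reflect knowledge of two-item contexts rather than of simple pairwise transitions.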

Models of Exposing Subjects to Information
We now discuss two crucial methodological problems in the exposure phase. The first problem is the balance of exposure: researchers are expected to provide an environment that helps implicit learning happen while reducing the probability that implicit learning becomes explicit. In other words, any break in the balance of exposure, whether too much or too little, would render the experiments questionable. The second problem is the control of attention allocation. As we discussed in Section 3.1, psychological presuppositions about attention are the theoretical foundation on which researchers depend to design their training course. Attention and awareness are two inseparable sides of the same coin (Carr & Curran, 1994; James, 1890; Posner, 1994). Discussing the development of knowledge, Schmidt (2001) said, "perhaps the only role for attention is that, presumably, at least the crucial evidence that triggers changes in the unconscious system must be attended." In clinical experiments, therefore, researchers need to control any kind of attention to implicit features, to reduce all likelihood of arousing attention to implicit features, or even to try to distract subjects' attention from implicit features. In terms of these two problems, we can see the strengths and the weaknesses of the most commonly adopted exposure paradigms.
Chiefly, there are four kinds of exposure paradigms: (1) implicit goal not mentioned + activities connected to implicit features; (2) explicit goal + explicit goal training + activities connected to implicit features + implicit goal not mentioned; (3) explicit goal + explicit goal training + implicit goal not mentioned; (4) only stimuli + implicit goal not mentioned.
Most sequence learning studies belong to the first type (Clegg et al., 1998; Cleeremans & McClelland, 1991; Nissen & Bullemer, 1987; Rieckmann & Bäckman, 2009): subjects are not told anything about the existence of rules, but are only asked to respond, for example by pressing a fixed key when seeing an element, or to memorize sequences in order. These kinds of inductive activities, however, are very likely to lead attention to order and bring about the construction of hypotheses about the sequence. For example, a person who has taken the GRE test would easily tend to look for rules in an exposure like the one used by Fu et al. (2008). Likewise, clinical studies that follow the design of Reber (1967), which is based on sequential rules, might also fall into this kind of trap.
Paradigms 2 and 3 have become popular since Williams (2004) adopted Paradigm 3 in his experiment on implicit learning of a four-determiner artificial grammar. Both Williams (2004) and Chen et al. (2011), which replicated the model of Williams (2004), asked subjects to study the explicit features of four determiners without mentioning anything about the implicit features of the stimuli. Between Williams (2004) and Chen et al. (2011), Leung & Williams (2006) replicated Williams (2004) by following Paradigm 2, adding activities about implicit features but still not mentioning the implicit goal. They used pictures to help subjects build the implicit connection between the target words and the implicit features by asking them to decide whether the objects in the pictures were near or far. This activity was strongly connected with the implicit feature in that experiment, because the target words also functioned to mark "near" and "far". In Paradigm 2, activities connected to the implicit features might easily draw subjects' attention to those features, leading them to form hypotheses. Though the later debriefing gave no obvious sign that hypotheses had been formed, we still cannot be sure that the subjects knew about the existence of their subconscious hypotheses.
We argue, however, that both Paradigms 2 and 3 seem more reasonable than Paradigm 1, because they set up an explicit goal to draw subjects' attention away from implicit features, and training on explicit features may leave subjects no attentional resources with which to become aware of implicit features. Would the paradigm work in the way the researchers expect? We doubt it, since for experimental subjects training is a passive way to obtain knowledge, and some of them, being weak in passive learning, might be inclined instead to explore knowledge by themselves. In that case, explicit training would fail as a distracter; a better way to engage subjects might be to let them allocate their attention to explicit features initially, by presenting more meaning-focused tasks in text form.
Paradigm 4 has been used more commonly in computational models (Elman, 1990; Perruchet & Vinter, 1998; Sun, 2002). Though computational models have demonstrated the implicit learning ability of computer programs, we still wish to ask how one can determine whether a computational model provides a good explanation of human learning, which is so complicated and multi-determined (Cleeremans & Dienes, 2008).

Models of Testing Implicit Learning
In this section, we discuss three main measures used in testing the effect of subjects' implicit learning: (1) classical tests, (2) SRT, and (3) measures in computational models.
Classical tests are the ones adopted most widely by researchers in clinical studies of implicit learning. Commonly, they test only the accuracy of subjects' judgments based on what was learned implicitly. For example, in clinical experiments with artificial grammars and sequences as stimuli, subjects' knowledge of the artificial rules or sequential rules was tested by their accuracy rate in picking out elements conforming to the rules from new strings shown to them as testing materials (e.g., Chen et al., 2011; Dienes & Altmann, 1997; Reber, 1967; Wan et al., 2008; Williams, 2004). A considerable number of experiments adopt the classical test model with modifications. For example, Wan et al. (2008) added familiarity ratings to the tests; Kinder & Shanks (2003) added visual noise and string movements in their AGL experiment. These types of tests, however, may give subjects hints, or may draw subjects' attention to implicit features, which would make test results less reliable.
Serial reaction time (SRT) measurements are considered more convincing than classical ones, since they allow retrieval cues to be observed while subjects take the test. Usually two facets of learning effects are recorded: accuracy rate and reaction time on test items (e.g., Cleeremans & McClelland, 1991; Clegg et al., 1998; Jiménez et al., 1996; Leung & Williams, 2006; Nissen & Bullemer, 1987). To show that the reaction time results reflect qualities of implicit learning, both control (grammatical) items and violation (ungrammatical) items are randomly distributed in the test (Leung & Williams, 2006). If the reaction time for control items is significantly shorter than that for violation items, the target implicit knowledge is thought to have been learned. Whether it was learned implicitly depends on the results of the awareness measures, which we discuss in the next section. Leung & Williams (2006) designed an artificial grammar expressing the meanings "near" and "far". The test section asked subjects to indicate whether the phrases containing "near" or "far" elements of the artificial grammar conformed to the picture on the screen. If a phrase containing a "far" element was shown under a picture whose target object was in the foreground, it was a violation item, and the reaction time to it should have been longer than that to control items. The design in Leung & Williams (2006) was an improvement, but it still left a further step to be taken toward a more scientific and convincing test: adding another dimension that distinguishes explicit knowledge from implicit knowledge, rather than only learned from unlearned. How can this step be taken? More experiments and research are needed. For example, researchers could conduct another experiment immediately afterwards with a small group drawn from the same subjects to find a time scale for an explicit reaction and for an implicit reaction, and then carry out their analysis of implicit learning.
Another sub-model of RT was developed by adding familiarity as a variable to measure memory strength (e.g., Shanks & Perruchet, 2002; Shanks et al., 2003). Researchers following this model assume that greater familiarity or priming effects lead to faster reactions; thus the test items that need less time are considered more familiar to subjects and more likely to belong to the learned group. This assumption is supported by standard signal detection theory models for recognition judgments (Pike, 1973; Ratcliff & Murdock, 1976). However, if we adopt a measurement like this in a clinical study of implicit learning, we must first admit that the distinction between implicit and explicit is graded rather than dichotomous (Cleeremans, 1997). The conclusions drawn by researchers under this model then become difficult to defend.
Models of tests in computational studies usually focus on measuring the learnability of the computer programs. Most of the results are positive (e.g., Cleeremans & McClelland, 1991; Perruchet & Vinter, 1998; Sun, 2002); however, the design of a computational model is itself what might put its results in doubt. Shanks (2005) argued that, of the two dominant kinds of computational model of implicit learning, symbol processing models (O'Brien & Opie, 1999; Shanks, 1997) were more successful than distributed models (Dienes et al., 1999; Kinder & Shanks, 2001), since the former were able to provide information that distinguishes implicit representational states from explicit ones. Until now, however, experiments using distributed models have seemed more successful in learning, which might delay the development of symbol processing models.
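The comparison between control and violation items can be sketched in Python as follows. This is a minimal illustration under our own assumptions: the reaction times are invented numbers, the function name is hypothetical, and Welch's t statistic is used as one reasonable way to compare two sets of reaction times, not necessarily the analysis reported by Leung & Williams (2006).

```python
from math import sqrt
from statistics import mean, variance


def rt_effect(control_rts, violation_rts):
    """Return (mean RT difference, Welch t statistic) for violation minus control items.

    Each list needs at least two values, and the two lists must not both have
    zero variance, otherwise the standard error is zero.
    """
    diff = mean(violation_rts) - mean(control_rts)
    standard_error = sqrt(
        variance(control_rts) / len(control_rts)
        + variance(violation_rts) / len(violation_rts)
    )
    return diff, diff / standard_error


if __name__ == "__main__":
    # Invented reaction times in milliseconds (illustration only).
    control = [500.0, 520.0, 480.0, 500.0]
    violation = [560.0, 580.0, 540.0, 560.0]
    difference, t_value = rt_effect(control, violation)
    print(difference, t_value)
```

A positive difference with a large t value would suggest that the regularity was learned; judging significance still requires a reference distribution (for example a t table or a permutation test), and, as noted above, the result alone cannot show whether the knowledge was implicit or explicit.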

Measuring Subjects' State of Awareness
This is usually the last phase of a clinical experiment on implicit learning, and it reveals the subjects' awareness states. It is used to find out whether the subjects learned the target implicit features implicitly or explicitly. Models for measuring awareness enjoy much more attention from researchers than models of the other phases discussed above, because researchers in psychology have joined this literature. Models have been updated and renewed from time to time, and new models are published almost whenever new discoveries or related inventions appear.
Researchers (Rebuschat, 2008) essentially divide the awareness measurement models into three groups. Table 1 presents a clear classification of these models.

Summary
We identified the five key procedures that are necessary to an implicit learning experiment. For each procedure, we carried out a two-way analysis: finding flaws in a given procedure design and comparing the designs used in different experiments. In doing so, we gave detailed comments on each procedure of the framework. Table 2 summarizes the main message of our comments. Merikle & Reingold (1991: 226) argue strongly that one measure is hardly enough to identify learning knowledge and awareness. This is true. It is exactly why we need to maintain a whole framework, so that when one step has a flaw, the steps before or after it can make up for it. This resembles what a food safety department does as a pig becomes pieces of pork in meat stores: if the farm fails to find disease in a pig, the slaughterhouse may still be able to stop the pig from entering the market; if the slaughterhouse fails, the quarantine inspection still has a chance. Of course, the framework of clinical experiments cannot be as standardized as one set up by official departments, because even today no task is process-pure and completely exclusive, since a clear and comprehensive theory of awareness has not yet been settled. At the least, however, a framework can be set up as guidance and advice for researchers to avoid design flaws or omissions. That is what we have endeavored to do by conducting a detailed comparison and searching a large amount of literature, though the comments and suggestions we bring forward still await empirical verification in studies where implicit learning can be examined exclusively and comprehensively.

[Table rows recovered from the page layout; column headings not preserved]
Abrams & Reber, 1988; Dienes et al., 1991; Leung & Williams, 2006; Payne, 1994; Williams, 2004 | Interview; open questions | Subjects can say what they want; sounds like with no information omitted.
Assumptions of each procedure had better be discussed.
Awareness states (continued)
Verbal reports | Abrams & Reber, 1988; Berry & Broadbent, 1984; Broadbent, 1977; Dienes et al., 1991; Leung & Williams, 2006; Payne, 1994; Williams, 2004 | Computerized/pen & paper/recording | Reducing dissociation between acquired knowledge and its verbalizability; improving insensitivity to awareness
Objective tests | Holender, 1986; O'Brien & Opie, 1999; Shanks, 1997; Stadler, 1998 | Computerized/pen & paper | Increasing exclusivity; improving sensitivity to unconscious knowledge
Subjective tests | Chen et al., 2011; Dienes, 2008; Dienes & Berry, 1997; Dienes & Scott, 2005 | Computerized/pen & paper | Finding a proper and standardized confidence scale

Conclusion and Future Directions
Our recommendations for future studies on implicit learning are as follows: (1) developing more valid control of attention allocation to ensure that implicit learning takes place; (2) using materials or stimuli closer to natural language in a natural context, with semantic and pragmatic features taken into consideration, to gain more understanding of human implicit learning in real situations; (3) adopting or developing new techniques to increase sensitivity to implicit learning and explicit learning; (4) leaving room for researchers in computational simulation to pursue symbol-processing models; (5) urging more effort in online research using ERP or fMRI technologies; (6) exploring implicit learning in second language acquisition.
In conclusion, by furthering a comprehensive understanding of the procedural mechanisms that contribute to improvement in research designs, we may be able to gain a better understanding of implicit learning. In turn, these gains in understanding may contribute to new suppositions that later help in designing more effective empirical studies. Thus, even though the theoretical and empirical difficulties are far from resolution in the near future, there is an unprecedented opportunity for advancing our understanding of implicit learning.

Table 1. Summary of awareness measurement models.

Table 2. Summary of clinical procedure related findings in implicit learning.