Objective Structured Clinical Examination (OSCE) with Immediate Feedback in Early (Preclinical) Stages of the Dental Curriculum

The Objective Structured Clinical Examination (OSCE) is a test of skills, behaviour, attitudes and application of knowledge, which has been an integral part of medical and dental curricula since the late 1970s. As a procedural exam, it has successfully supplemented other outcome-oriented assessment methods in order to measure students’ preparedness to practice. In addition to making a judgement about competence, the aim of assessment is to guide future learning and shape values. These added benefits can be harnessed by providing feedback on the students’ performance and encouraging reflection. As a tool to reinforce or modify behaviour by focusing on actual performance (compared with the intended performance level), feedback is central to supporting cognitive and professional development, especially in the early stages of the clinical curriculum. OSCE with immediate feedback combines a work-based assessment method and a short, immediate feedback discussion with the examiner on the student’s performance before they move to the next station. To get the most out of this modified assessment, it should be implemented early in the clinical curriculum. Giving feedback immediately after a task is an efficient learning method and valuable tool for improving student experience. The aim of this paper is to assess the validity and reliability of the OSCE with immediate feedback as a valuable learning opportunity for students and a new, informative assessment method for teachers.


Introduction
The purpose of dental education and training in the UK is to produce a safe beginner: an individual who can demonstrate that they have met the outcomes required for registration as a dental professional with the General Dental Council (GDC). The role of the GDC is to protect patients by ensuring that those who join their register are fit to practice (Firmstone, Bullock, Frame, & Wilson, 2007). To do this, future registrants must prove they have met the GDC defined outcomes through their education, training, and assessment.
To award the Bachelor of Dental Surgery (BDS) qualification, dental schools must be assured that students have demonstrated competence across the full range of learning outcomes (clinical, communication, professionalism, management and leadership). This assurance should be underpinned by coherent aggregation of all of the assessment methods and principles (Manogue et al., 2011), and the most appropriate assessment methods selected. Work-based assessment (WBA) is potentially the best way of assessing professional competence as it encompasses requisite attributes including communication, clinical reasoning, judgement, emotions, values and reflection (Govaerts & van der Vleuten, 2013), which theoretical or paper-based assessment might not.
The Objective Structured Clinical Examination (OSCE) is a type of WBA that has been an integral part of medical and dental curricula since the late 1970s (Townsend, McLlvenny, Miller, & Dunn, 2001), and as a procedural exam has successfully supplemented other outcome-oriented assessment methods (Townsend et al., 2001). An OSCE usually comprises a circuit of short, 5-10 minute stations in which each candidate is examined on a one-to-one basis by one or two examiners. Some stations have either a real or a simulated patient (actor). Each station has a different examiner (as opposed to the traditional method of clinical examinations where a candidate would be assigned to one examiner for the entire test). Candidates rotate through the stations, completing all the stations on their circuit (Gupta, Dewan, & Singh, 2010). This method of assessment allows examiners to ensure a student is clinically competent and safe to practice.
As well as making a judgement about competence, the aim of assessment is to guide future learning and shape values (Van Der Vleuten, 1996). This can be harnessed by providing feedback on the students' performance and encouraging reflection. As a tool to reinforce or modify behaviour by focusing on actual performance (compared with the intended performance level), feedback is central to supporting cognitive and professional development, especially in the early stages of the clinical curriculum (Archer, 2010). To this end, giving feedback immediately after a task is an efficient learning method and valuable tool for improving student experience (Napankangas, Harila, & Lahti, 2012).
In this paper, we propose a modified version of the traditional OSCE to include immediate feedback and discuss its validity. OSCE with immediate feedback (OSCE IF) combines a work-based assessment method and a short, immediate feedback discussion with the examiner on the student's performance before they move to the next station. To get the most out of this modified assessment, it should be implemented early in the clinical curriculum. Traditional OSCEs tend to be used as part of high-stakes, qualifying assessments. By including OSCE IF in the early stages of the dental curriculum, the assessment aims to provide students with a standardised, nonthreatening clinical skills assessment before they start treating patients, and offer resources to help them build their strengths and address any weaknesses. Moreover, OSCE is well accepted by students and has been shown to be a better predictor of final examination results than concurrent examinations (Ratzmann, Wiesmann, & Kordass, 2012).

Why OSCE over Other Methods of Assessment?
According to the General Dental Council's document, Preparing for Practice, UK dental schools have a responsibility to "… prepare all potential registrants for safe and independent practice, from the first day of registration".
To ensure that this level of clinical competence is attained by graduating dental students, it is imperative that the appropriate method of assessment is chosen.
According to Miller's pyramid (Figure 1), there are four levels of clinical competence: knows, knows how, shows how, and does. In the early years of the dental curriculum, a large proportion of teaching time is devoted to the biomedical sciences which underpin clinical dentistry. These areas of knowledge are demonstrated at the "knows" and "knows how" levels, and most appropriately assessed by knowledge-based tests, such as Multiple Choice Questions (MCQ). However, it has been shown that MCQ assessment results do not reliably predict whether a student will become a competent clinician (Dennehy, Susarla, & Karimbux, 2008). Rather, the psychomotor skills required for safe clinical practice are tested at the "shows how" and "does" levels of Miller's pyramid. Work-based assessments such as OSCE are specifically designed to test these areas. Therefore, if the aim of the assessment is to measure clinical competence, then OSCE is a more valid method than a purely knowledge-based test.
As well as being a tool to measure clinical competence, assessment is also a useful learning tool. Although students have limited clinical skills at the beginning of their study, it seems reasonable to integrate OSCEs into the early stages of the dental curriculum as a supplement to assessment in other areas of knowledge. Dental education providers should always seek to improve student performance by encouraging reflection and providing feedback, and, where possible, patient and peer feedback should also contribute to the assessment process. This feedback, received in the early stages, gives the students plenty of time to react and adapt their skills appropriately, before they reach high-stakes, qualifying exams.
OSCE IF may have further benefits over more traditional methods of assessment, such as MCQ. Whilst the assessment process is integral to the dental undergraduate curriculum, there may be unintended effects. The assessment process is often considered "the hidden curriculum", where the unintended effects often include substitution of beneficial reflective learning with superficial knowledge caused by the tendency to study very hard just before examination (Epstein, 2007). This may be particularly apparent in knowledge-based assessments, such as Multiple Choice Questions. With the OSCE, however, it is more difficult for students to anticipate what will be assessed, which makes cramming more difficult.

Critical Review of the Validity of OSCE with Immediate Feedback
All assessments in dental education require scientific evidence of validity to be interpreted meaningfully. Sometimes assessment methods are evaluated on the basis of "face validity" or so-called "look and feel" validity, which contrasts sharply with the true, scientific validity evidence required to support or refute the meaning and interpretation of assessment scores (Downing, 2006). Construct validity, on the other hand, looks to five discrete sources of practical evidence in order to establish meaning for assessment results: content, response process, internal structure, relationship to other variables and consequences of assessment (Downing, 2003). This paper will review the validity of OSCE IF under these five headings.

Content Evidence
Content evidence relates to test specifications and domains that will be tested in a particular exam. To improve the validity of the test, OSCE planning includes: the design of stations by an expert committee based on reviewed blueprints and checklists, inclusion of interdisciplinary stations within a circuit, standard setting, the analysis of quality criteria, and post-test evaluation of its effects on students' learning and behaviour (Schoonheim-Klein, Walmsley, Habets, van der Velden, & Manogue, 2005). For reliable decision making, a minimum of 17 stations needs to be included in a summative OSCE (Schoonheim-Klein et al., 2008). Graham et al. described four types of station which could be used for dental OSCEs. In the first type, the students are presented with a standardised patient whom they are required to interview and elicit an appropriate response. In the second type of station, students are given a patient scenario or record for review and are asked to evaluate this information. The third type of station involves the performance of a procedure on a mannequin, and the fourth type concerns instrument selection and appropriate disposal (Graham, Zubiaurre Bitzer, & Anderson, 2013).
The content validity of OSCEs is assured by the intricate procedures used to design the stations, including the use of expert groups and pre-reviewed blueprints (Mohtadi, Harasym, Pipe, Strother, & Mah, 1995). The OSCE blueprint should be sufficiently detailed and should contain subcategories and subclassification of content. In addition, the proportion of test items (in this case OSCE stations) and the cognitive level required to pass them should be precisely defined (Hammad et al., 2013). Early-stage dental OSCEs take place at the end of the preclinical part of the curriculum and assess the transition from preclinical to clinical education. The objectives of each station and marking criteria should closely match the learning outcomes of the preclinical stage. The stations and the marking criteria are created by faculty members who are experts or specialists in the topics covered and who have taught the students. OSCE stations should be authentic, match clinical situations and should feel realistic (Brown, Manogue, & Martin, 1999).
The concept of standardised/simulated patients (SP) was first introduced into medical and dental education by Harden in the late seventies (Harden, 1979). More recently, SPs have proven to be a very important human resource and a means of involving other stakeholders in communication training and assessment in dental education (Logan, Muller, Edwards, & Jakobsen, 1999). SPs are trained not only to act as patients during dental interviews but also to mark and evaluate students' behaviour, speech and interviewing skills. Development of SPs includes rigorous training so they can express their symptoms and emotional affect in a standardised manner. Evidence that stations which involve SPs are completely edited and that SPs are appropriately trained by experienced trainers is also a crucial source of content-related validity evidence (Amano et al., 2004).

Response Process
Response process is an important part of validity evidence and denotes the administrative procedures before, during and after assessment, quality control of all data generated by assessment, and accuracy of the SP rating. Response process evidence shows that all sources of error associated with the assessment delivery and results interpretation are either eliminated or controlled to the maximum extent (Downing, 2003).
Standard scoring protocols are used in order to ensure objective scoring of all candidates. OSCE is an objective method of assessment since candidates are assessed using exactly the same stations with the same marking scheme. It excludes examiners' subjective feelings about candidates' skills because candidates receive marks for each step on the mark scheme that they have performed correctly. It is also structured, since stations in OSCEs have a very specific task. Simulated patients are given detailed scripts to ensure that the information they give is the same to all candidates. Instructions are written to ensure that the candidate is given a very specific task to complete within the time limit of 5-10 minutes (Kurz, Mahoney, Martin-Plank, & Lidicker, 2009). Some of the OSCE stations are monitored by academic members of staff while other stations do not have an assessor, and the student performs the task on their own. Each member of staff remains at their station for the duration of the OSCE to help ensure consistent assessment. In addition, all remaining stations which are not directly monitored are graded by the same faculty member (e.g. paperwork, referral letters) (Hodges, 2003).
Each OSCE station should start with a construct (what the station aims to assess), written as a single sentence starting with "This station tests the ability of a candidate to…". Furthermore, in order to provide evidence for response process, each station should also contain the candidate instructions, examiner instructions, item checklist or appropriate domain (depending on what marking scheme is used), instructions for patients and equipment list (Davis, 2003).
Two different marking schemes could be used to mark OSCE stations: item checklist and domain-based marking. The item checklist includes all items that are key to the construct. Items should reflect content and process and always include communication items in any type of station that involves an SP. The number of items should be manageable for the time given and each item can be scored with 0 (not competent), 1 (does it adequately) or 2 (does it well) (Schoonheim-Klein et al., 2009). Alternatively, domain-based marking identifies only specific domains appropriate to the station construct. Each domain will receive a single mark (very good, good, acceptable, poor, very poor). Each domain can be weighted and the final weightings are agreed by a panel of experts at an OSCE review meeting in advance of the exam. In addition, both of these types of marking can be complemented by a global rating for overall competence on a 5-point scale (clear fail, borderline, clear pass, very good and excellent) (Association for Medical Education in Europe, 2011).
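The two marking schemes described above can be illustrated with a short sketch. It is not taken from any of the cited studies: the function names are hypothetical, and the numeric values attached to the five domain grades (0 to 4) are an assumption, since the text names the grades but does not assign numbers to them. Both functions return the mark as a fraction of the maximum available.

```python
from typing import Dict, List

# Assumed numeric mapping for the five domain grades (not given in the text).
DOMAIN_GRADE_POINTS = {
    "very poor": 0,
    "poor": 1,
    "acceptable": 2,
    "good": 3,
    "very good": 4,
}


def checklist_score(item_marks: List[int]) -> float:
    """Checklist marking: each item is 0, 1 or 2; return fraction of maximum."""
    if not item_marks:
        raise ValueError("at least one checklist item is required")
    if any(mark not in (0, 1, 2) for mark in item_marks):
        raise ValueError("each item must be marked 0, 1 or 2")
    return sum(item_marks) / (2 * len(item_marks))


def domain_score(grades: Dict[str, str], weights: Dict[str, float]) -> float:
    """Domain marking: one grade per domain, combined using agreed weights."""
    if set(grades) != set(weights):
        raise ValueError("every graded domain needs exactly one weight")
    max_points = max(DOMAIN_GRADE_POINTS.values())
    weighted = sum(weights[d] * DOMAIN_GRADE_POINTS[grades[d]] for d in grades)
    return weighted / (max_points * sum(weights.values()))


# Illustrative station: four checklist items, two weighted domains.
print(checklist_score([2, 1, 0, 2]))
print(domain_score(
    {"communication": "good", "technique": "acceptable"},
    {"communication": 1.0, "technique": 3.0},
))
```

Expressing both schemes as a fraction of the maximum keeps stations with different numbers of items or domains comparable; the global rating is deliberately kept separate because it is recorded alongside, not derived from, these marks.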

Internal Structure
OSCEs have the ability to measure skills and knowledge necessary for competent clinical practice and are often instituted as a comprehensive test to ensure minimum capabilities among medical and dental students (Hamann et al., 2002). However, a number of issues are related to how well OSCEs function in this regard, which is in turn dependent on their psychometric structure. Internal structure of an OSCE relates to the statistical characteristics of the examination questions such as difficulty, discrimination, reproducibility, generalizability, reliability and how well the stations that measure the same or similar construct correlate among themselves. Put simply, internal consistency assesses the consistency of results across items within a test, and reliability is the correlation of a test with itself (Tavakol & Dennick, 2011).
According to Downing, reproducibility of assessment results over time is a major source of validity evidence. In the split-half methodology of reliability estimation, items in a test are split into two tests that are equivalent in content and difficulty. If each station could be considered as a test, the items in the checklist could be analysed by calculating the Pearson product moment correlation coefficient, which indicates internal consistency. Another measure of internal consistency is Cronbach's alpha, which estimates the stability of a particular station and its influence on the overall exam result (Downing, 2004). Brown et al. found that the internal consistency of a total OSCE as measured by Cronbach's alpha was 0.5, and such unreliability of a clinical exam may arise from several sources: the fact that OSCEs assess a broad range of clinical skills and abilities, the students are assessed by different patients, multiple circuit sessions need to be organised, and the fatigue of students, patients and examiners (Brown et al., 1999). In contrast, Silva et al. found that in their OSCE, which evaluated psychomotor, cognitive and attitudinal skills, the overall consistency was good (Cronbach's alpha 0.7) with especially good correlation between cognitive and attitudinal skills, while psychomotor skills showed greater variation between stations (Silva, Lunardi, Mendes, Souza, & Carvalho, 2011). In addition, history taking and communication skills measured by OSCE have been shown to be consistent, with Cronbach's alpha higher than 0.75 (Al-Naami, 2008).
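For readers unfamiliar with the statistic, Cronbach's alpha can be computed from a candidates-by-stations table of scores using the standard formula alpha = k/(k-1) × (1 − sum of station variances / variance of total scores), where k is the number of stations. The following is a minimal sketch using only the Python standard library; the function name and the example scores are hypothetical and are not data from any of the studies cited above.

```python
from statistics import variance
from typing import List


def cronbach_alpha(scores: List[List[float]]) -> float:
    """Cronbach's alpha for a table with one row per candidate
    and one column per station (or checklist item)."""
    n_candidates = len(scores)
    if n_candidates < 2:
        raise ValueError("at least two candidates are required")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("each candidate needs the same number (>= 2) of stations")

    # Sample variance of each station's scores across candidates.
    station_variances = [
        variance([row[j] for row in scores]) for j in range(k)
    ]
    # Sample variance of each candidate's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (k / (k - 1)) * (1 - sum(station_variances) / total_variance)


# Hypothetical scores for four candidates on three stations.
example = [
    [2, 1, 2],
    [1, 1, 0],
    [2, 2, 2],
    [0, 1, 1],
]
print(round(cronbach_alpha(example), 3))  # 0.765
```

Values closer to 1 indicate that stations rank candidates in a similar way; the reported figures of 0.5, 0.7 and above 0.75 can be read on this scale.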
Classifying all the stations into "dental materials" and "non dental materials" may strengthen evidence of validity based on internal structure, as Eberhard et al. showed that students who performed well in the dental material stations achieved poorer results in non-material stations and vice versa (Eberhard et al., 2011).
Internal quality control of an OSCE, especially of those which involve SPs, can be complemented by using generalizability theory, which is concerned with how well the specific sample of behaviour can be generalized to the population. Iramaneerat et al. examined the communication skills OSCE of 79 residents from one Midwestern university in the United States. The generalizability study revealed that the largest source of error variance, besides the residual error variance, was the SP. SPs were significantly different in their levels of severity/difficulty (Iramaneerat, Yudkowsky, Myford, & Downing, 2008).

Relationship to Other Variables
Understanding how OSCEs relate to other performance measures in medical and dental schools could help educators effectively design curricula and optimise student instruction and assessment. Predictive validity of OSCEs is very important for remedial purposes and early identification of students who will need extra support with their clinical skills development (Martin & Jolly, 2002).
Moderate correlation between OSCE scores and a test of pre-clinical basic science knowledge has been reported by Simon and Volkan (Simon, Volkan, Hamann, Duffey, & Fletcher, 2002), while modest correlation between early-stage OSCE results and the final clinical skills exams has been reported by Muller et al. (2003). A study by Auewarakul et al., in which the correlations between different types of assessment were analysed, found that OSCE scores were more strongly related to other components, such as MCQ and other types of workplace-based assessment, than those of any other method (Auewarakul, Downing, Jaturatamrong, & Praditsuwan, 2005). Nevertheless, Dong et al. found that the second- and third-year OSCE scores were weakly correlated. Neither second- nor third-year OSCE score was strongly correlated with clinical reasoning MCQ scores or medical school grade point average (Dong et al., 2012). Similarly, Dennehy et al. have reported that performance on OSCE examinations is not highly correlated with performance on NBDE Parts I and II (National Board Dental Examination) and Harvard Dental School-administered MCQ examinations (Dennehy et al., 2008). These findings suggest that OSCE examinations are more likely to measure qualities such as problem-solving ability, critical thinking and communication skills, and capture a viewpoint that is different from typical assessment measures that largely reflect multiple choice questions.

Consequences
Assessment plays a key role in the learning process. It is central rather than peripheral to the instruction and teaching sessions provided to students. Moreover, learning depends on the goals provided by assessment and on the consequences of its results. According to Cronbach: "It is not the test or the test score that is validated, but a proposed interpretation of the score" (Cronbach, 1971).
Such interpretations of an OSCE might be that a student is fit to qualify as a dental surgeon, or that they are ready to progress to the clinical stage of their studies, or they may use the experience as a tool for reflection and further learning. Thus, the interpretation of results has a major impact on examinees, and false positive and false negative outcomes are not acceptable. In the case of pre-clinical dental OSCEs, their aim is to ensure patient safety and select candidates who are good enough to proceed with patient treatment. Furthermore, they should identify candidates' strengths and weaknesses in terms of clinical performance and guide further learning effectively; teaching and learning are guided by the types of assessments incorporated into dental and medical curricula, and the feedback provided to examinees.
Given the significant implications of OSCE and the potentially damaging effects on students' learning experience, robust methods for standard setting are needed. Methods of setting standards in clinical examinations remain problematic and can be classified into three categories: judgemental, empirical and combination. Judgemental methods inspect individual test items to judge how a minimally competent person would perform on each item. Empirical methods, in contrast, require examinee test data as part of the standard-setting process. Combination methods use both empirical and judgemental data (Kaufman, Mann, Muijtjens, & van der Vleuten, 2000). It has been shown that the combination method, or so-called Borderline Regression, provides the most defensible pass/fail standards and seems to be the optimal choice for OSCEs in health education (Schoonheim-Klein et al., 2009). Using structured marking sheets, examiners score student performance at each OSCE station. Examiners also provide a global rating of overall performance. The actual scores of any borderline candidates at each station are averaged to provide a passing score for each station. The passing scores for all stations are combined to become the passing score for the whole exam (Wilkinson, Newble, & Frampton, 2001).
As already mentioned, the main interpretation of early-stage OSCE results in clinical education, apart from ensuring patient safety, is as a teaching tool, providing students with invaluable guidance on the development of their communication, professionalism and basic clinical skills. OSCE with immediate feedback (OSCE IF) combines all of the advantages of OSCEs with immediate feedback given by the examiner and based on the marking criteria and student performance. This immediate 3-minute feedback is one to one, individual, and recognizes the diversity of the student population (Cushing, Abbott, Lothian, Hall, & Westwood, 2011). It is given in a timely manner, immediately after the candidate has finished the task, so they can reflect on their work and apply the examiner's comments in subsequent related exams. It is supportive and meaningful and clearly illustrates student strengths and weaknesses (Sloan et al., 1996). In addition, it facilitates learning and improves students' retention of the information being tested. Comments given in this way are not seen in isolation; they do not ruin student morale and definitely improve student performance. Students are explicitly informed how well they met specific assessment criteria and are told how to improve future work. Immediate feedback encourages students to reflect critically on their skills and motivates them to seek to deepen their understanding of the problem and to improve its management (Hodder, Rivington, Calcutt, & Hart, 1989). OSCE IF can be developed from existing OSCE materials to provide direct observation and feedback to students on their dentist-patient relationship skills, their ability to take a focused history and perform a physical examination, and inter-professional engagement (Brazeau, Boyd, & Crosson, 2002).
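The averaging step described above can be sketched in a few lines. This is an illustration only, under stated assumptions: the function names, station names and scores are hypothetical, and the code implements the averaging of borderline candidates' scores exactly as described in the paragraph (commonly referred to as the borderline group approach). A regression variant, which instead fits station scores against global ratings and reads off the predicted score at the borderline rating, is not shown.

```python
from statistics import mean
from typing import Dict, List, Tuple

# One result per candidate: (station score, examiner's global rating).
Result = Tuple[float, str]


def station_pass_mark(results: List[Result],
                      borderline_label: str = "borderline") -> float:
    """Average the station scores of candidates rated borderline."""
    borderline_scores = [
        score for score, rating in results if rating == borderline_label
    ]
    if not borderline_scores:
        raise ValueError("no borderline candidates at this station")
    return mean(borderline_scores)


def exam_pass_mark(stations: Dict[str, List[Result]]) -> float:
    """Combine station pass marks (here: by summing) into the exam pass mark."""
    return sum(station_pass_mark(results) for results in stations.values())


# Hypothetical data for two stations.
stations = {
    "history taking": [
        (14, "clear pass"), (9, "borderline"),
        (11, "borderline"), (5, "clear fail"),
    ],
    "impression taking": [
        (18, "excellent"), (12, "borderline"), (7, "clear fail"),
    ],
}
print(station_pass_mark(stations["history taking"]))  # 10
print(exam_pass_mark(stations))  # 22
```

Summing the station pass marks is one possible way of combining them; the text does not specify the combination rule, so this choice is an assumption. A station with no borderline candidates raises an error here, which in practice is one reason a regression-based variant may be preferred for small cohorts.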

Synthesis
The OSCE can be carefully structured to include parts from all elements of the curriculum as well as a wide range of skills. An especially desirable characteristic of OSCE is that it takes into account the simulated patients' views on how they have been treated by the candidates, which can influence the final mark (Homer & Pell, 2009). Reliability of the OSCEs is ensured by well-structured marking criteria and examiner briefings before each examination diet. Students have the right to receive feedback on all their work, which will help them clearly understand how well they did and how to enhance future achievements. In order for learners to receive maximum benefit from feedback, it should be supplied as soon as possible after performing the activity. The immediate feedback assessment methods are based on two psychological principles (Chandra, Chaturvedi, & Desai, 2009):
• Immediate feedback is beneficial for learning (and is superior to delayed feedback).
• The best test/quiz/homework/assignment doesn't just assess; it also teaches.

Conclusion
The OSCE appears to be valid and reliable in the context of preclinical dentistry. On the negative side, OSCE requires substantial facilities and staff engagement, and is costly. The improvement of students' results with ascending circuit number should be looked into and addressed. Verbal, immediate feedback during OSCE is practical and can improve competency in clinical skills. Nevertheless, negative feedback could trigger long-lasting emotional responses and lead examinees to think that it is too critical and useless.