
Creative Education
2012. Vol.3, Special Issue, 943-945
Published Online October 2012 in SciRes (http://www.SciRP.org/journal/ce) http://dx.doi.org/10.4236/ce.2012.326143
Copyright © 2012 SciRes. 943
A Systematic Error Leading to Overoptimistic Item Analysis of
a Medical Admission Test
Gilbert Reibnegger1, Hans-Christian Caluba2, Daniel Ithaler2, Simone Manhal3,
Heide Maria Neges2
1Institute of Physiological Chemistry, Center of Physiological Chemistry, Medical University of Graz,
Graz, Austria
2Organisational Unit for Studies and Teaching, Medical University of Graz, Graz, Austria
3Office of the Vice Rector for Studies and Teaching, Medical University of Graz, Graz, Austria
Email: gilbert.reibnegger@medunigraz.at
Received July 31st, 2012; revised August 26th, 2012; accepted September 12th, 2012
During the course of the admission procedure for the diploma programs Human Medicine and Dentistry at
the Medical University of Graz in July 2009, a serious error occurred in the evaluation process resulting
in the publication of an erroneous provisional list of successful applicants. Under considerable public interest, this incorrect list had to be withdrawn and corrected. Its publication had been encouraged by a preceding item analysis that had yielded falsely optimistic results because of the same systematic error. The source of the error and its consequences are described in detail, and a simple recipe for avoiding similar errors in the future is provided.
Keywords: Item Analysis; Index of Difficulty; Index of Discrimination; Admission Test; Medical Studies
Introduction
Item analysis examines the responses of students or, more
generally, of test subjects to individual test items, and it is one
of the standard tools for assessing the quality of test items and
of a test as a whole. The basic statistics used in item analysis
are the indices of difficulty and of discrimination (Lienert &
Raatz, 1988). The index of difficulty of a test item is simply the
proportion of correct answers among all tested subjects. Thus,
if 60 percent of all test subjects give the correct answer to an
item, the index of difficulty of this item is 0.60. Normally, a range of item difficulties between 0.20 and 0.80 is considered desirable. Some item analysts define the index of difficulty as the proportion of wrong answers instead, but this does not alter the substance of the index.
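The computation of the difficulty index can be illustrated with a short sketch; the data and function names below are hypothetical, not taken from the actual admission test:

```python
# Responses of 5 subjects to 3 items: 1 = correct, 0 = wrong.
responses = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
    [1, 0, 0],
]

def difficulty_index(responses, item):
    """Proportion of subjects who answered the given item correctly."""
    return sum(row[item] for row in responses) / len(responses)

# Item 0 is answered correctly by 4 of the 5 subjects:
print(difficulty_index(responses, 0))  # 0.8
```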
The computation of the index of discrimination is mathematically a bit more complex; briefly, this index measures whether correct answers to a given test item are associated with the overall ability of the tested subjects, as estimated from their response behavior on the complete test. The index of discrimination should be positive; in practice, it seldom exceeds 0.50. A value above 0.30 is judged as “good”, between 0.10 and 0.30 as “fair”, and below 0.10 as “poor”. A negative index of discrimination indicates that the item under scrutiny is answered correctly by a higher proportion of subjects who perform worse on the test as a whole, and by a lower fraction of those who perform better globally. Such items are undesirable and should be revised or removed before the test is used again.
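One widely used way of computing a discrimination index, the upper-lower group method, compares an item's difficulty among the strongest and the weakest test takers. The sketch below splits the subjects into median halves; other analysts use the top and bottom 27 percent. The data and names are illustrative and do not describe the authors' actual implementation:

```python
def discrimination_index(responses, item, fraction=0.5):
    """Upper-lower discrimination index: the item's difficulty in the
    best-scoring group minus its difficulty in the worst-scoring group.
    A common refinement excludes the item itself from the total scores."""
    # Pair each subject's total score with their answer to the item,
    # then sort by total score (Python's sort is stable for ties).
    pairs = sorted(((sum(row), row[item]) for row in responses),
                   key=lambda t: t[0])
    n = max(1, round(fraction * len(pairs)))
    p_lower = sum(correct for _, correct in pairs[:n]) / n
    p_upper = sum(correct for _, correct in pairs[-n:]) / n
    return p_upper - p_lower

# Item 0 is solved mainly by strong subjects, item 2 mainly by weak ones.
responses = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
]
print(discrimination_index(responses, 0))  # positive (about 0.33)
print(discrimination_index(responses, 2))  # negative (about -0.33)
```

A negative value, as for item 2 above, is exactly the warning sign described in the text: the weaker half of the subjects answers the item correctly more often than the stronger half.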
In Austria, admission to university studies has generally been open, but for a few programs, among them human medicine and dentistry, admission has been regulated by admission tests since 2005. The Medical University of Graz has developed an admission test based on secondary-school-level knowledge in biology, chemistry, mathematics and physics, and
on comprehension of scientific texts. Recent studies have
shown a strong improvement of study progress as well as a
dramatic reduction of study dropout rate after introducing this
admission test (Reibnegger, Caluba, Ithaler, Manhal, Neges, &
Smolle, 2010, 2011).
The admission procedure consists of three steps: after an electronic preregistration period during February, applicants have to submit their written application material by the end of April. The admission test takes place during the first days of July as a paper-and-pencil multiple-choice test. Evaluation is performed electronically after the answer sheets are scanned in. At this stage, test quality is assured by item analysis. The aim of this important step is to check the quality of the test items on the basis of the applicants' response behavior. If item analysis detects one or more items with, for example, a negative index of discrimination, these items can be removed from the test, and a fair test result is obtained by re-evaluation.
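The re-evaluation step after removing a flawed item can be sketched as follows; this is a hypothetical illustration, as the actual evaluation software is not described in the source:

```python
def rescore_without(responses, flagged_items):
    """Recompute each subject's total score after dropping flagged items,
    e.g. items with a negative index of discrimination."""
    flagged = set(flagged_items)
    return [sum(v for j, v in enumerate(row) if j not in flagged)
            for row in responses]

responses = [
    [1, 1, 0],
    [0, 1, 1],
]
# Scores recomputed without item 2:
print(rescore_without(responses, [2]))  # [2, 1]
```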
By the end of July, a provisional list of results is published
via the internet. At this time each applicant is provided an elec-
tronic copy of her or his answer sheet, and is entitled to raise an
objection if she or he believes something might be wrong with
test evaluation. For example, she or he might think that the sum
of correct answers had been counted incorrectly. By mid-August, after due consideration of each objection, a final list of results is published via the internet.
The admission test is clearly a high-stakes test: at the Medi-
cal University of Graz, the number of available study places is
360 per year, and there are many more applicants. For example,
in 2011 and 2012, there were between 1700 and 1800 appli-
cants. Importantly, the applicants are ranked according to their
test achievements, and only the 360 top ranking applicants are