Development and Validation of Items to Measure Knowledge in a Basic Nutrition Course

The purpose of this paper is to present the steps of planning and monitoring a teaching-learning process proposed for a basic nutrition course offered to health students enrolled at a Public University in Brazil. The theoretical framework discusses the fundamentals of Bloom's Taxonomy as revised by Anderson and colleagues (2001), which outlines how to establish, organize and classify educational objectives for the cognitive domain. The paper then describes the methodology used to implement this process in a real learning environment, with a strong focus on the construction and validation of a learning assessment. Finally, some recommendations are made regarding the management of the teaching and learning process in undergraduate courses.


Introduction
The health workforce has been on the health sector agenda (Haddad et al., 2010), since the global problems currently observed in the area will only be remedied if the "right" professionals, doing the "right" things, in the "right" places and with the "right" skills are available (WHO, 2013). Among the major global health problems are nutritional issues such as obesity (Seidell, 2014), malnutrition (WHO, 2012) and micronutrient deficiencies (Black, 2014).
These nutritional problems are linked to political, social, economic and cultural factors. Such complexity requires the engagement of skilled multidisciplinary teams (Ceccim & Feuerwerker, 2004; Saar & Trevizan, 2007). Thus, healthcare professionals of any field and level will only be able to address these problems if training strategies, such as including a basic nutrition course, are adopted. To that end, decisions need to be made concerning the objectives, content, course loads, teaching strategies (Boog, 2002) and opportunities to teach.
The ability to analyze the individual and collective needs of future health professionals is necessary for the modern health professor, as the choice of the best strategies for training and education may be related to a number of characteristics (prior knowledge, preferences for certain learning methods, demographics and applicability of knowledge). This initial analysis allows the definition of the educational objectives to be achieved, the methods and techniques that best fit these objectives, the sequencing of the content, the media to be used, and the strategies and teaching modality (Abbad, Walnut, & Walter, 2006; Filatro, 2007; Abbad & Zerbini, 2012) that best support student learning.
This paper adopts the theoretical definition that learning is the demonstration of observable behaviors defined in the educational objectives and described in terms of knowledge, skills and attitudes (Pilati & Abbad, 2005; Nogueira, 2006). For Gagné (1976), learning involves a change in human abilities that is caused not by a normal maturation process but by the strategies adopted.
Once the learning objectives are described, it becomes easier to build items to test the knowledge of students at the beginning, during or at the end of an educational action. The construction of evaluation measures is part of instructional planning and involves the definition of what is to be measured, the actors who will participate in the process, the choice and application of data collection techniques, the analysis of results, the definition of the evaluative scale and the dimensions used for judgment, as well as item writing and instrument formatting (Mourão & Meneses, 2012: p. 51).
Test formats capable of eliciting and capturing the achievements of students and, therefore, educational effectiveness have been tested (Sordi & Silva, 2010). Educational effectiveness is defined as the extent to which the proposed objectives are achieved (ABNT, 2002), the degree of educational innovation, the obtaining of satisfactory results in the educational process and the meeting of the needs of society in general and of individual students in particular (Pereira, Fish, & Staron, 2010). Performance appraisals should be derived from ideal educational practices and concepts based on cognitive psychology (Baker, O'Neil, & Linn, 1993). According to McTighe and Ferrara (1994), students can be assessed through constructed responses (essays, short answers), products (brochures, posters, software, video), performances or processes (oral descriptions, observation).
The goal is to establish a judgment about an action or event (Pilati & Borges-Andrade, 2006) in order to generate feedback for the actors involved in a course about its design, implementation and evaluation (Meneses, Zerbini, & Abbad, 2010: p. 127), leading to improvements in the educational process.

Description of Educational Objectives
During the learning assessment it is important to check the achievement of instructional objectives, defined as descriptions of observable behaviors, many of which are necessary for professional practice (Queiroga et al., 2012). For this reason, the instructional systems approach calls for a clear definition of learning objectives, which should target certain learning outcomes (Abbad et al., 2006). To classify the different types of learning outcomes in the cognitive domain, Bloom et al. (1956) proposed a taxonomy whose axis is the degree of complexity of the cognitive processes. These learning objectives facilitate the recall and resolution of intellectual tasks. For these researchers the complexity scale ranges from knowledge (less complex) to evaluation (more complex), as shown in Figure 1.
In this classification system, the learning outcomes depend on each other to be acquired. Therefore, students must first acquire the simplest skills, which are the prerequisite for acquiring the subsequent ones. The taxonomy provides a framework on which instruction can be based, allowing the preparation of valid and reliable assessment forms and instruments (Rodrigues Jr., 2006). Action verbs must be used to describe the learning objectives at different levels of complexity (Krau, 2011).
Bloom's taxonomy was adapted by Anderson et al. (2001). The category names were replaced with action verbs in order to express more clearly the progressive complexity of the cognitive processes required for each learning outcome. Moreover, the last two categories had their positions reversed, so as to assign greater complexity to creation (called synthesis in Bloom's taxonomy) and lower complexity to evaluation (Ferraz & Belhot, 2010).
This new taxonomy can be seen in Table 1, which helps the instructional designer and the teacher classify educational goals, from the most basic (recall/remember, understand and apply) to the most complex (analyze, evaluate and create). Thus, as in Bloom's taxonomy, the processes are cumulative, with the simplest being prerequisites for the subsequent ones. As for the content or nature of knowledge, the items can address knowledge of facts, concepts, procedures and metacognitive strategies, the latter related to the knowledge that individuals have of their own thinking and learning processes, learning strategies, memorization, reasoning and self-assessment (Anderson et al., 2001). This table functions as a reference matrix, and should be structured around the skills and abilities that students are expected to have developed at a particular stage of education. These goals become a reference for professors when planning lessons and assessments, as well as for students, making the process transparent because it allows them to prepare adequately (INEP, 2010). A reference matrix and its educational objectives do not cover the entire curriculum. It is a sample of what can be measured with the instrument used, representing the curriculum (BRAZIL, 2011).
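As an illustration only, a reference matrix of this kind could be represented as a grid that crosses the four kinds of knowledge with the six cognitive processes and counts how many test items fall in each cell. The sketch below is a minimal Python example; the function name `build_matrix` and the three classified items are hypothetical and are not taken from the paper's instrument.

```python
from collections import Counter

# Cognitive processes of the revised taxonomy, from simplest to most complex.
PROCESSES = ["remember", "understand", "apply", "analyze", "evaluate", "create"]
# Kinds of knowledge named in the text.
KNOWLEDGE = ["factual", "conceptual", "procedural", "metacognitive"]


def build_matrix(classified_items):
    """Count items per (knowledge, process) cell.

    classified_items: iterable of (item_id, knowledge, process) tuples.
    Returns one row per kind of knowledge, one column per cognitive process.
    """
    counts = Counter()
    for item_id, knowledge, process in classified_items:
        if knowledge not in KNOWLEDGE or process not in PROCESSES:
            raise ValueError(f"item {item_id}: unknown cell ({knowledge}, {process})")
        counts[(knowledge, process)] += 1
    return [[counts[(k, p)] for p in PROCESSES] for k in KNOWLEDGE]


# Hypothetical classification of three items (illustrative, not the paper's data).
example = [
    (1, "factual", "remember"),
    (2, "conceptual", "understand"),
    (3, "conceptual", "analyze"),
]
matrix = build_matrix(example)
for kind, row in zip(KNOWLEDGE, matrix):
    print(f"{kind:14s}", row)
```

Empty cells in such a grid make it easy to see which combinations of knowledge and cognitive process an instrument does not yet cover.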

Item Construction
To assess learning, Abbad and Borges-Andrade (2004) suggest applying pre- and post-tests to verify the acquisition of skills. The competencies that the learner is expected to acquire, described in the formulated objectives, become the criteria (Koshino, 2010) for constructing the items that will be used to determine who has learned.
The test item is the basic unit of a data collection instrument. It is a constructed situation that requires the student to give an answer or a set of responses to a presented stimulus. Five item formats are usually used, according to the way students are allowed to express their responses: multiple choice, right or wrong, closed response, short answer and open essay.
The elaboration of items is a complex task that requires the mastery of specific techniques (Rabelo, 2013) and can be a major difficulty for health professors. Constructing items at different levels of complexity is recommended. Questions that require from the student the skills of recognition, classification, comparison, description and exemplification are generally considered simple. Items requiring analysis (differentiation, discrimination, detection, selection, organization, integration and identification of the components of structures), evaluation and creation (generating new ideas or solutions, inventing products) are more complex and selective.
It is crucial to keep in mind that the evaluative process is limited and that a single test cannot adequately measure student learning or performance. Therefore, multiple tests are recommended (Oermann & Gaberson, 2014: p. 4). Furthermore, in order to increase the validity of assessments, the items must cover a large and representative sample of the major and most relevant topics or subjects taught in a course.
The complexity of an item can be estimated initially by using taxonomies such as Bloom's or that of Anderson and coworkers (2001). It is essential that the item address its subject matter in a non-superficial way. Wrong choices, called distractors, must be carefully constructed and analyzed, since they need to be plausible, i.e., part of the context of the item and a possible response for a respondent who does not know the topic or has not developed the competence being evaluated. Distractors cannot stray from the proposed topic or constitute clearly unreasonable assertions, even for those who have not mastered the subject. Distractors are different from "gotchas". The latter are answers that mislead even individuals whose skills are already developed (Rabelo, 2013).
Even when these rules are followed, it is difficult to determine the complexity of an item and its degree of discrimination before a test is applied. One way to minimize the problem is to have a peer review in which other teachers in the area critique the test. Even so, ideally, the items should be validated and measured at least via the difficulty index and the discrimination index by testing them with a smaller group of individuals (Oermann & Gaberson, 2014).

Item Assessment
Besides evaluating the complexity of the items through taxonomies, statistics can be used to evaluate the degree of difficulty of a test. The measure most commonly used in this case is the difficulty test (DT), which is the proportion of subjects who answered the item correctly (Haladyna & Rodriguez, 2013), calculated as the number of students who answered the item correctly (A) divided by the total number of respondents (n), times one hundred:

$$DT = \frac{A}{n} \times 100 \qquad (1)$$

In general, it is assumed that items with difficulty <33% are very difficult and items with difficulty >67% are very easy. Items with degrees of difficulty between these values are the most appropriate for use in tests (Cohen, Manion, & Morrison, 2004). However, when the professor needs to make judgments based on certain criteria, or when students must reach certain standards, more difficult questions may be used (Oermann & Gaberson, 2014).
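The difficulty calculation and the 33% and 67% cut-offs described above can be sketched in a few lines of Python. This is a minimal illustration: the function names and the example counts are assumptions made here, not values from the study.

```python
def difficulty(correct, respondents):
    """Return DT: the percentage of respondents who answered the item correctly."""
    if respondents <= 0:
        raise ValueError("respondents must be positive")
    return 100.0 * correct / respondents


def classify_difficulty(dt):
    """Apply the <33% and >67% cut-offs quoted in the text."""
    if dt < 33:
        return "very difficult"
    if dt > 67:
        return "very easy"
    return "appropriate"


# Hypothetical item: 20 correct answers out of 81 respondents.
dt = difficulty(20, 81)
print(round(dt, 2), classify_difficulty(dt))
```

Note that a low DT means a hard item, since the value is the percentage of correct answers.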
Another index commonly used to assess the quality of an item is the discrimination index (D). An item with a positive discrimination index is answered correctly more often by students with higher grades. An item with a negative discrimination index is answered correctly more often by students with lower grades. A simple way to calculate the discrimination index is the formula:

$$D = P_u - P_l \qquad (2)$$

where $P_u$ is the fraction of students with higher scores who answered the item correctly and $P_l$ is the fraction of students with lower scores who answered the item correctly. Another formula used for calculating the discrimination index is the one indicated by the World Health Organization (WHO, 1998), referred to in this paper as Equation (3). According to the World Health Organization, items with an index of less than 0.20 are considered poor discriminators, while results between 0.20 and 0.24 are acceptable, 0.25 to 0.35 are good, and values greater than 0.35 are excellent (WHO, 1998).
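The simple discrimination formula (the difference between the fractions of correct answers in the higher-scoring and lower-scoring groups) and the quality bands quoted above can be sketched as follows. This is a hedged illustration, not the WHO formula used later in the paper. The function names and counts are hypothetical, and treating values between 0.24 and 0.25 as "acceptable" is an assumption, since the text does not cover that gap.

```python
def discrimination(upper_correct, upper_size, lower_correct, lower_size):
    """Return D = P_u - P_l for one item."""
    if upper_size <= 0 or lower_size <= 0:
        raise ValueError("both groups must be non-empty")
    p_u = upper_correct / upper_size  # fraction correct among higher scorers
    p_l = lower_correct / lower_size  # fraction correct among lower scorers
    return p_u - p_l


def classify_discrimination(d):
    """Map D to the bands quoted in the text (WHO, 1998)."""
    if d < 0.20:
        return "poor"
    if d < 0.25:  # assumption: the 0.24-0.25 gap is treated as acceptable
        return "acceptable"
    if d <= 0.35:
        return "good"
    return "excellent"


# Hypothetical item: 18 of 24 higher scorers and 9 of 24 lower scorers were correct.
d = discrimination(18, 24, 9, 24)
print(d, classify_discrimination(d))
```

A negative value falls in the "poor" band here, which matches the reading that such an item favors students with lower grades.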
This paper presents the process of building objectives and items, as well as their validation with a group of students from two Higher Education Institutions in Brazil.

Method
The analysis of a course allows the identification of the competencies to be developed during it, enabling its reformulation as well as the creation of support materials (Bonfa et al., 2011). This paper aimed to validate items for a basic nutrition course offered to undergraduate health students in Brazil. The syllabus of the course and the materials provided by professors (class slides and scientific papers) were read, and 59 objectives for the discipline were written. They were all semantically validated at a meeting of a research group consisting of one professor, four graduate students and six undergraduate students.
For this discipline the assessment included a variety of methods such as paper reviews, presentations of group work, field research and multiple-choice tests. Multiple evaluation methods are common in higher education, since a single method, in general, does not adequately measure students' learning or performance. This paper describes the construction and validation of the items to be used as a pre-test for students enrolled in the basic nutrition course.
For the preparation of items, Rabelo's (2013) recommendations were observed, especially: 1) response alternatives should have similar length and structure; 2) statements and alternatives should be in the affirmative form to avoid confusing the respondent; 3) alternatives should be independent of one another, so that the correctness of one does not depend on another; 4) items cannot provide clues that facilitate the response to any other item; 5) the response alternatives should not be too long or repetitive.
Sixteen questions were prepared (see Appendix). The test was administered online during March 2014 at a private faculty and at a Public University in Brazil. At the private faculty, 115 students enrolled in the nutrition course were invited to take the test and 81 (70.43%) responded. All of them were nutrition students; most were female (n = 72, 88.9%), had attended a public high school and were in the first period of higher education (Table 2).
The same test was applied at a Public University. Of the 49 students invited, 13 (37.69%) took the online test. Table 3 shows their profile. As in the first institution, more than 80% of the students were young women. Six of them (46.15%) were enrolled in the nursing course and seven (53.85%) in the physiotherapy course.
The outcome of the tests was assessed as to the degree of difficulty, according to the method indicated by Cohen, Manion, & Morrison (2004). For the calculation of the discrimination index, the formula indicated by the World Health Organization was used (WHO, 1998).

Results
The two following tables show the percentage of correct answers for each item. The test was first applied at a private faculty with nutrition students. Table 4 shows the results of the 81 respondents. Six questions obtained a low percentage of correct answers and were considered very difficult. One question was considered very easy. At the Public University, only 13 students responded to the test. All of them were nursing or physiotherapy students and were studying nutrition for the first time. The number of items evaluated as very difficult remained the same, but the number of items considered very easy increased from 1 to 4 (Table 5).
To calculate the discrimination index (D), a descriptive analysis of the number of correct responses was performed with all students grouped together (n = 94). The results are shown in Table 6.
The 25th percentile (students who answered fewer than 25% of the questions correctly) and the 75th percentile (students who answered more than 57.81% correctly) were used to calculate the discrimination index (Table 7).
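The grouping step can be sketched as follows: students are split into a lower and an upper group using the 25th and 75th percentiles of their total scores, and each item is then compared across the two groups. This is only an assumed illustration. It uses the simple difference between the fractions of correct answers in the two groups rather than the WHO formula applied in the study, and the response matrix is made up for the example.

```python
from statistics import quantiles


def item_discrimination(responses):
    """Compute a simple discrimination value per item.

    responses: one list of 0/1 answers per student, all in the same item order.
    """
    totals = [sum(student) for student in responses]
    q1, _, q3 = quantiles(totals, n=4)  # 25th, 50th and 75th percentiles
    lower = [r for r, t in zip(responses, totals) if t < q1]
    upper = [r for r, t in zip(responses, totals) if t > q3]
    if not lower or not upper:
        raise ValueError("a percentile group is empty; more respondents are needed")
    result = []
    for i in range(len(responses[0])):
        p_u = sum(r[i] for r in upper) / len(upper)  # fraction correct, upper group
        p_l = sum(r[i] for r in lower) / len(lower)  # fraction correct, lower group
        result.append(p_u - p_l)
    return result


# Hypothetical answers of eight students to four items (1 = correct).
answers = [
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
print(item_discrimination(answers))
```

With very small classes the percentile groups may contain only a handful of students, so the resulting values should be read with caution.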

Discussion
The analysis of knowledge gain is an important step in the process of professional health education, and tests are a common way to evaluate undergraduates. Nevertheless, it is important to make sure that the items are able to measure student knowledge. According to the method indicated by Cohen, Manion, & Morrison (2004), several items were very easy or too difficult for those students. Very difficult items indicate that the tested subject is inappropriate for the level of the students, who must attend educational activities in order to acquire such knowledge. For students who have already taken a course, a high degree of difficulty indicates that the educational process was not effective. Another problem may be the poor quality of the test itself, which may contain many distractors or "gotchas". In the case presented this explanation does not seem to hold, since 81.25% of the items (n = 14) were considered excellent discriminators between students with better and worse performance. Equation (3), suggested by the World Health Organization for the calculation of the discrimination index, is quite simple and provides important feedback on the validity of the items used. However, it should be remembered that the power of the test can only be calculated with a significantly larger sample.
The development of perfect items seems an unattainable goal, even when good educational practices are followed. Therefore, validating the items by applying the test to a sample of students and then calculating the difficulty and discrimination indexes facilitates assessment improvement and increases professors' ability to make inferences about student learning.
Finally, it is important to remember that the study sample was small and the response rate at the Public University was quite low. Thus, this test will still need to be applied in other classes to ensure the quality of the items.

Conclusion
It is suggested that pre- and post-tests be held in order to evaluate the effectiveness of an educational action. To do so, the test items should be validated in order to discard those that do not measure the application and problem-solving skills previously described in the educational objectives.
In this study we observed the usefulness of the revised Bloom's taxonomy (Anderson et al., 2001) in describing objectives. Even so, the use of the taxonomy does not guarantee control over the degree of difficulty or the discrimination power of an item. Thus, ideally all items should be validated beforehand so that the less discriminating or easier ones can be reworked or discarded.
It is also important to remember that tests with closed answers have limitations: they cannot measure all the knowledge worked on in a classroom, especially knowledge of greater complexity and metacognitive knowledge. Therefore, these need to be measured in different contexts.
As health teachers in Brazil generally have no training in teaching, training programs that help them with educational planning are recommended. During the current work it was observed that a large number of the tested items showed a high degree of complexity for the current level of knowledge of the healthcare students who participated in the survey. Some of the items will need to be rewritten for future use. Those items will also need to be tested after an educational action so that the knowledge gain can be assessed.
Our future research includes the improvement of learning materials in a face-to-face nutrition course supported by new technology media. Pre- and post-tests will be applied to evaluate the effectiveness of educational strategies and media in improving students' achievement.

Appendix
Values are based on a 2000 kcal diet. (*) Daily Value Percentage. (**) Value not established. From this information and considering the nutritional functions assigned to product ingredients, judge the following items.
I) The consumption of gluten protein is recommended for celiac disease patients.
II) Wheat flour enriched with iron and folic acid complies with Brazilian legislation that aims to combat anemia in the country.
III) This product contains trans fat, considered a risk factor for coronary disease.
IV) Based on the quantities of carbohydrates, protein and total fat described on the label, it can be said that the calories informed are correct.
V) This product contains a high density of sodium, a nutrient easily absorbed by the body that should be consumed in moderation.
These items are correct: a) I, II and III; b) I, II and IV; c) I, IV and V; d) II, III and V; e) III, IV and V.
5) Nutritional status assessment aims to identify nutritional disorders, enabling adequate interventions in order to assist in the recovery or maintenance of a person's nutritional status. The Body Mass Index (BMI) is a simple indicator of nutritional status and is calculated by the formula: weight (kg)/height (m)². About this index evaluate the alternatives below:
I) Given that BMI does not distinguish between weight associated with muscle or body fat, it is important to investigate the body composition, especially when BMI values are within the limits or outside the normal range.
II) The World Health Organization (WHO) considers obese an individual with a BMI above 25 kg/m².
III) The World Health Organization (WHO) considers obese an individual with a BMI above 30 kg/m².
IV) It is recommended to interpret the cutoff points of BMI in combination with other risk factors.
These items are correct: a) I; b) I and III; c) II and III; d) II and IV; e) I, III and IV.
6) Considering fat distribution, obesity can be classified as gynoid, android or generalized. Which of the following descriptions refers to android obesity? (Check only one of the alternatives)
a) It is characterized by an increased deposition of fat in the hips, compared to an apple and a higher risk of diabetes.
b) Also called central, it resembles the shape of an apple and is related to high cardiovascular risk.
c) It is characterized by a greater deposition of fat in the hips, resembling the shape of a pear and a greater risk of varicose veins.
d) The fat is distributed evenly throughout the body and is associated with metabolic syndrome.
7) Breastfeeding is vital to the health of mother and child throughout life. Regarding this theme rate the following statements:
I) The current recommendation is that children should be breastfed exclusively until 6 months of age.
II) After 6 months of age the child should receive complementary foods, but breastfeeding should be continued at least until 2 years of life.
III) The protection against diarrhea and infections of the respiratory and gastrointestinal tracts is greater when the child is breastfed exclusively and for a long time.
IV) There is a positive relationship between breastfeeding and increased incidence of breast cancer in women.
These items are correct: a) I and II; b) I and III; c) I, II and III; d) I, III and IV; e) II, III and IV.
8) Dietary fiber is the edible part of plants, or analogous carbohydrates that are resistant to digestion and absorption in the human small intestine with complete or partial fermentation in the large intestine. Fibers can be classified as soluble or insoluble in water, playing different physiological roles in the body. Evaluate the following alternatives about dietary fibers:
I) Fibers are the main instrument in the treatment of constipation.
II) Soluble fiber is an important adjuvant in the treatment of osmotic diarrhea.
III) Soluble fibers are recommended in the treatment of glucose intolerance and dyslipidemia because they decrease the rate of absorption of carbohydrates and lipids.
IV) Insoluble fibers are fermented in the colon and have a prebiotic effect, helping to maintain the equilibrium of the intestinal bacterial flora.
These items are correct: a) I and II; b) I and III; c) II and III; d) I, II and III; e) II, III and IV.
9) According to the principles of a healthy diet, all food groups should make up the daily diet. About this theme evaluate the following items:
I) A healthy diet should provide water, carbohydrates, proteins, lipids, vitamins, fiber and minerals, which are irreplaceable and indispensable to the proper functioning of the body.
II) Fruit and vegetables are important sources of vitamins. An individual who does not like vegetables can offset them by increasing the consumption of fruit.
III) The dietary diversity that underlies the concept of healthy eating assumes that no food or specific group consumed alone is sufficient to provide all the necessary nutrients to the body.
IV) A healthy diet should preferably be financially affordable, tasty, varied, colorful and safe from the health point of view.
These items are correct: a) I and II; b) I and III; c) I, II and III; d) I, III and IV; e) II, III and IV.
10) Carbohydrates constitute the largest percentage of the nutrients consumed, ranging from 55% to 65% of the total caloric value of the diet. Of this percentage it is recommended that the consumption of simple carbohydrates is (check only one):
I) <10% of calories consumed daily.
II) 10% to 20% of calories consumed daily.
III) 20% to 30% of calories consumed daily.
IV) >30% of calories consumed daily.
11) Fats and oils are high-energy foods (900 kcal/100 g). Their differences in terms of physical and chemical properties may be more or less beneficial to human health. On this theme evaluate the following items:
I) Saturated fats increase the risk of dyslipidemia.
II) Unsaturated fats are divided into two types: monounsaturated (solid at room temperature) and polyunsaturated (liquid at room temperature).
III) Olive oil, avocado and nuts are good sources of monounsaturated fatty acids.
IV) Omega-3 fat in fish is an example of monounsaturated fat.
These items are correct: a) I and II; b) I and III; c) II and III; d) I, III and IV; e) II, III and IV.
12) A high amount of processed food contains too much fat, especially the hydrogenated type, also called trans fat. Hydrogenation is a chemical process used by the industry to increase the shelf life of products. It is recommended that the consumption of this type of fat is (check only one):
a) Increased, as it is a type of lipid that takes longer to spoil.
b) Reduced to less than 1% of the energy value of daily food intake, because of its atherogenic properties.
c) Stimulated, as it is less atherogenic than saturated fat in foods such as butter and lard.
d) Balanced with the intake of other fats in the diet.
13) The most relevant deficiencies of micronutrients in Brazil are (check only one):
a) Vitamins C and D, Iron; b) Zinc, Iron, Calcium; c) Iron, Calcium, Vitamin D; d) Vitamin A, Iron, Folic Acid.
14) Among the methods used to assess dietary intake are the 24-hour recall, the Food Frequency Questionnaire and the Food Record. The last one, also called food diary, has as advantages the fact that it eliminates recall bias and obtains with relative accuracy information on the quantity and quality of food consumed. But the method also has disadvantages such as:
I) A single record does not describe the average energy consumption of nutrients and population groups.
II) The act of registering may lead an individual to change the choice and consumption of food.
III) The method requires good skills of reading and writing.
IV) There is need of time for obtaining, processing and analyzing data.
These items are correct: a) I and II; b) I and III; c) II and III; d) I, II and IV; e) I, II, III and IV.
15) Enteral nutritional therapy (ENT) refers to a set of procedures used for therapeutic maintenance or restoration of nutritional status through food or product administered orally or by gavage. The main indications for ENT are:
I) The existence of risk of malnutrition due to low intake compared to the daily needs.
II) The failure of stomach and intestine.
III) The impossibility of oral feeding, due to factors such as coma, presence of oral lesions, cancer, stroke or pain.
IV) Severe abdominal trauma.
These items are correct: a) I and II; b) I and III; c) II and III; d) I, III and IV; e) II, III and IV.
16) The glycemic index (GI) is the potential that a food has for raising blood glucose. This index has been proposed as a method to facilitate glycemic control, especially in patients with pre-diabetes, diabetes and metabolic syndrome. About this theme analyze the following items:
I) Foods with higher fiber content in general have a higher glycemic index.
II) Meat and eggs are among the foods with the highest glycemic index.
III) White rice, white bread and pasta should be eaten in smaller quantities by individuals who wish to lose weight, as they have high GI, stimulating the secretion of insulin and fat storage.
IV) Foods with a high glycemic index should be avoided by athletes, especially before workouts.
These items are correct: a) I; b) II; c) III; d) II and IV; e) III and IV.

Table 1. The cognitive process complexity.

Table 5. Students' outcomes at the Public University (n = 13).

Table 6. Descriptive analysis of the correct responses of all students (n = 94).

Table 7. Discrimination index of test items.