Designing Accessible Learning Outcomes Assessments through Intentional Test Design

Postsecondary students represent a diverse population with unique learning needs that may impact their ability to accurately demonstrate their knowledge and skills on traditional forms of learning outcomes assessments. Designing assessments is a complex process that includes specifying the assessed content and designing the actual test. Both components should work together to accurately measure key learning outcomes and generate reliable results. This process, however, is often difficult to execute. Although faculty have considerable content expertise, many lack experience in designing high-quality assessments that are also accessible to students with diverse needs and preferences. This article discusses the importance of designing accessible learning outcomes assessments, provides a framework for designing accessible assessments that are grounded in universal design principles, and lays out step-by-step actions faculty can take to improve the accessibility of their assessments. Examples of accessible assessments are provided.


Introduction
The demographics of students in higher education today are changing, and with those demographic differences come varied learning needs. Inclusive learning environments can support students' varied needs, and thus should be a priority for administrators and policymakers. Developing inclusive educational settings that are accessible to all students is only possible with the intentional design of learning environments and instructional materials, including assessments.

Accessibility of Assessments
In educational testing, the term "assessment" refers to the process of gathering and subsequently evaluating various sources of information for a given purpose. Information can be gathered through standardized tests, interviews, observations, or other relevant approaches. "Accessibility" of assessments refers to students' ability to engage with the test in a way that allows them to accurately demonstrate their knowledge, skills, and abilities on the tested content. Accessibility is influenced by the interplay between students' personal characteristics and test design features (Ketterlin-Geller & Crawford, 2011). An accessible assessment is one in which students can meaningfully interact with the tested content and generate responses that reflect their proficiency in the targeted learning outcomes. Less accessible assessments result in certain students not being able to engage with the tested content, formulate a response, and/or produce the intended product. Poorly designed assessments may cause students' observed scores to be an inaccurate reflection of their proficiency in the targeted learning outcomes, thereby compromising the validity of the score-based interpretations and uses. As such, creating accessible assessments that generate trustworthy data for making decisions must emerge as a key consideration of all faculty members when designing assessments.
Because accessibility is influenced by the interaction between students' personal characteristics, including their needs and preferences, and test design features, we define and provide examples of each component. Next, we discuss how they interact to impact accessibility.

Personal Characteristics Impact Accessibility
When creating accessible assessments, it is essential to understand the role students' personal characteristics play in making tests more or less accessible.
Students' personal characteristics include stable traits of an individual that are not likely to change dramatically over time (e.g., visual acuity, physical mobility) as well as fluid characteristics that may vary throughout the day or semester based on changes in preferences or reactions to other stimuli (e.g., attentiveness, stress levels). These characteristics may or may not be related to a student's disability (e.g., learning disability or physical disability), language status (e.g., experience in the language of instruction), age, or other demographic variables; however, these characteristics may impact the ways in which students engage with the tested content.
L. R. Ketterlin-Geller, M. Ellis, Creative Education, DOI: 10.4236/ce.2020.117089
Common characteristics that influence accessibility include cognitive processing, attention, language or linguistic processing, and physical characteristics (Ketterlin-Geller, Crawford, & Huscroft-D'Angelo, 2014). Students' cognitive processing refers to their ability to store and retrieve information in long-term memory, as well as the capacity and processing speed of their working memory. Attention refers to students' sensory processing, which may include attending to or focusing attention on specific stimuli. Language or linguistic processing involves generating and/or comprehending text or spoken language, often in a designated language. Finally, physical characteristics include attributes such as sensory perception (e.g., vision, hearing) and physiological functioning (e.g., neuromuscular, cardiovascular, orthopedic).
Consider how the following examples illustrate the impact of students' personal characteristics on their ability to accurately demonstrate their knowledge, skills, and abilities. A student with low visual acuity may have difficulty perceiving a test delivered via paper with small font. A student with limited prior experience with common classroom technology, such as a learning management system (LMS) or certain computer skills, may have difficulty accessing course content or completing a test with a computer-based format. A student who is easily distracted by superfluous ambient noise or activity may not be able to concentrate on a test that is administered in a group setting in which other students are moving around the classroom. A student with fine motor difficulties (whether a result of a disability, broken arm, or arthritis) may have difficulty responding to test questions that require handwritten answers. In many of these examples, other features of test administration (e.g., timing or speededness of tests) may further influence the student's ability to accurately demonstrate their knowledge, skills, and abilities.
Although faculty members may have access to some information about students' personal characteristics (e.g., through disability services offices or international student affairs offices), many characteristics may be unobservable or unknown to faculty. Faculty may not understand how these characteristics impact students' ability to interact with tests. For example, consider the student who has low visual acuity. This physical characteristic may cause the student to have difficulty perceiving information presented visually. During instruction, this student may use self-initiated strategies to meaningfully engage with the material (e.g., screen magnifying software, audio recording of lectures), with or without instructor consent or knowledge. However, during testing, these same strategies may not be feasible or permissible, and/or may require additional time. Without these same strategies during testing, the student may not be able to accurately demonstrate his or her knowledge, skills, and abilities in the targeted learning outcomes.

Preferences Impact Accessibility
Accessibility of assessments is also impacted by students' preferences. Preferences are influenced by a variety of factors including students' previous experiences with assessments, prior successes and failures with specific testing approaches, personality, level of anxiety, and other factors (Bevitt, 2015). Specific assessment practices interact with students' preferences to impact their perception of the accessibility of the assessment. A student with a negative emotional reaction to a specific assessment approach (e.g., standardized multiple-choice tests) may become anxious during testing, which in turn impacts the student's ability to demonstrate his or her knowledge. In some situations, these negative emotional responses to assessment practices may cause enough stress and anxiety that the student alters his or her course selection and/or retention decisions (Bevitt, 2015). Additionally, students with varied backgrounds of educational experience and preparation may not have previous exposure to certain assessment methods, as may be the case in the example of a student with limited experience with classroom technology asked to complete an innovative technology-based assessment.
Faculty can support accessibility by understanding their students' needs and preferences, and then using this information to design more accessible learning environments, including instructional materials and assessments. In some cases, students may be reluctant to disclose personal characteristics to faculty for fear of being stereotyped and stigmatized (Magnus & Tossebro, 2014). Faculty should be sensitive to students' concerns and be respectful of students' rights to privacy. Information can be collected by administering anonymous surveys about students' needs and preferences. Direct observations can provide additional information about students' personal characteristics. In cases where preferences may not be observable, analysis of student work samples may provide insight. By understanding students' personal characteristics, faculty can begin the process of intentionally designing assessments that are maximally accessible.

The Impact of Test Design Features on Accessibility
Test design features can be broadly defined as aspects of test administration (e.g., context, standardization), delivery (e.g., paper-pencil, computerized), item format (e.g., multiple choice, essay), and duration (e.g., timing, number of items) that make up the assessment. Test design features are selected for various reasons. Although obtaining accurate measurements about what students know and are able to do in the targeted learning outcomes should be the primary concern, additional considerations such as constraints associated with administration of the operational test (e.g., pre-specified testing platform, duration of testing session) and historical precedent influence these decisions. For example, the selected-response item format (e.g., multiple-choice, true-false, matching) is often used for large courses because it is an efficient mechanism for delivering and scoring items for large groups of people (Douglas, Wilson, & Ennis, 2012). Although efficiency is an important consideration, implications for accessibility should also be considered.

The Interaction Impacts Accessibility
Accessibility is impacted when students' personal characteristics interact with these test design features. In the event that a test design feature interferes with a student's ability to demonstrate his or her knowledge, skills, and abilities on the tested content, accessibility (and therefore, validity) is compromised (AERA, APA, & NCME, 2014). Referring to the previously described examples, consider how varying the test design features changes the accessibility of the assessment for these students: The student with low vision may have difficulty perceiving a test delivered via paper with small font, but may have less difficulty perceiving the same information when delivered via a computer in which the size of the font can be adjusted. The student who is easily distracted during group-administered tests may be better able to concentrate when taking a test in a quiet location. The student with limited classroom technology experience may have issues with completing a computer-based examination but may find success with the same examination in a paper format, or over time as familiarity with the technology develops. The student with fine motor difficulties may have difficulty responding to test questions that require written answers, but may have less difficulty responding to the same test questions orally.
Faculty time is a precious commodity; therefore, it should not be inferred that faculty must design assessments with limitless variability to improve accessibility. The central proposition is that through an intentional design process that recognizes the role students' personal characteristics and preferences play in their ability to interact with the tested content and generate responses, faculty can develop assessments that align with their students' needs and preferences. The intended outcome of this process is to gain a more accurate reflection of students' knowledge, skills, and abilities in the tested content to support valid decision making.

Intentional Test Design to Improve Accessibility
In the remainder of this article, we present a series of concrete steps that can support faculty members' efforts to design accessible assessments of learning outcomes. These steps are informed by the work published by Ketterlin-Geller, Johnstone, and Thurlow (2015), and are illustrated in Figure 1.
Step 1: Identify the targeted learning outcome(s)
The first step in designing tests to improve accessibility is to clearly articulate the targeted learning outcomes about which student performance will be evaluated. By clearly specifying the knowledge, skills, and abilities that the test is intended to measure, faculty can determine which test design features can and cannot be adjusted to improve accessibility, while at the same time preserving the validity of the interpretations and uses of the scores. For example, consider an assessment for an introductory mathematics course. The instructor wants to assess students' ability to collect, organize, and analyze data from a variety of sources. Although this is the targeted learning outcome, additional competencies may be assessed that are associated with these skills, including knowledge of different graphical representations of data, ability to differentiate between types of data (e.g., ordinal and interval), and software skills to generate graphical representations. Students' proficiency in these skills may contribute to the instructor's interpretation and/or uses of the test scores, thereby contributing to the specification of the targeted outcomes.
Step 2: Determine test design features
The second step in designing an accessible assessment is to determine the test design features that will best allow students to demonstrate their competencies.
The test design features that are selected for an assessment of specific learning outcomes should align with the purpose of the assessment and the intended interpretations and uses (Lane et al., 2016). Some common features and sample options are provided in Figure 2.
Step 3: Specify the access skills
The third step in increasing the accessibility of assessments is to identify plausible access skills. In addition to the knowledge, skills, and abilities that a test is intended to measure, most assessments also require students to have proficiency in unrelated skills (AERA, APA, & NCME, 2014). These skills, often called access skills, are needed to engage with the tested content, formulate a response, and/or produce the intended product, but are not intended to contribute to the instructor's inferences about students' proficiency in the targeted learning outcomes (Ketterlin-Geller, Yovanoff, & Tindal, 2007). If students lack proficiency in these skills and are unable to accurately demonstrate their proficiency in the targeted learning outcomes, their scores might include systematic error known as construct-irrelevant variance (CIV; Haladyna & Downing, 2004). This error is considered systematic because it occurs for predictable reasons, and it is construct-irrelevant because the resulting variability in scores is related to something other than differences in students' proficiency in the targeted learning outcome. For example, consider the skills involved in responding to an item formatted as multiple choice. In addition to assessing the targeted content, multiple-choice items require students to read and interpret information presented in the stem and response options, discriminate between alternate responses, and hold information presented in distractors in working memory. Although some faculty may argue that these are fundamental skills that all postsecondary students should possess (Lindholm et al., 2005), these skills require competencies that may extend beyond the targeted learning outcomes.
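One way to picture construct-irrelevant variance is with a simple score decomposition in the spirit of classical test theory. The notation below is an illustrative assumption and does not appear in the sources cited above: X is a student's observed score, T is the student's true proficiency in the targeted learning outcome, E_CIV is the systematic error introduced by access skills, and E_random is random measurement error.

```latex
% Illustrative notation only (an assumption, not taken from the cited sources)
X = T + E_{\mathrm{CIV}} + E_{\mathrm{random}}
```

Under this sketch, a student who knows the content but struggles with an access skill (e.g., reading dense item stems) would have a consistently negative E_CIV, so X would understate T. Intentional test design aims to keep E_CIV as close to zero as possible for every student without altering T.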
Step 4: Apply principles of universal design
Using the list of plausible access skills generated in step three, the fourth step in designing accessible assessments is to consider how to minimize the impact of these access skills on the measurement of the targeted learning outcomes by applying the principles of universal design during test development. Universal design is a design framework for supporting access to goods, services, and physical environments that emerged as a result of the Americans with Disabilities Act of 1990 (Story, Mueller, & Mace, 1998). Universal design supports access for most users by considering the users' needs and preferences during the design and development phases, and has been applied in a variety of contexts including architecture (e.g., graded entrances to buildings) and city planning (e.g., curb cuts in sidewalks). When applied to assessments, universal design increases the accessibility of tests for all students through intentional design that attends to students' needs and preferences (Ketterlin-Geller, Johnstone, & Thurlow, 2015). Universal design can be applied to assessments in several ways (see Figure 3), including: 1) enhancing the comprehensibility and legibility of the test, 2) varying the setting and timing of administration, and 3) providing students with choices in how they demonstrate their knowledge, skills, and abilities. Many of the recommendations noted below can be implemented with little or no advanced technology.
Step 4a: Enhance comprehensibility and legibility of the test
Students' access to the assessed content can be improved by enhancing the comprehensibility and legibility of the test. Improving comprehensibility by offering students different mechanisms through which they can engage with the material may improve access (Thompson, Johnstone, & Thurlow, 2002). Presenting test items using different forms of media may support students' ability to attend to the information. For example, instructions and/or items with lengthy descriptions can be presented as a video vignette or audio recording that can be shared electronically with students (e.g., through a learning management system). Moreover, access for some students may be supported by providing simple, clear, and intuitive instructions. Ambiguously worded items may cause some students to misinterpret the prompt and/or generate poor responses, leading to inaccurate representations of their knowledge and skills (Sadler, 2016). By clearly articulating the expected student behaviors, faculty can improve the accuracy of measurement for all students.
Step 4b: Vary the setting or timing of administration
Because students' ability to formulate a response may be influenced by the environment in which the test is administered, accessibility may be improved by minimizing extraneous features of the testing situation and/or providing flexibility for the setting and timing of test administration. Students may be sensitive to certain stimuli (e.g., fluorescent lights) or may become over-stimulated under certain conditions (e.g., loud background noise), both of which may impact their ability to concentrate on tests. Other students may have slower processing speeds and require additional time to attend to the information presented on tests. To support these students, faculty can provide options for the setting (e.g., large or small group administration) and timing (e.g., time of day, amount of time) of the test. Again, these options should only be provided if they do not change the targeted learning outcomes.
Step 4c: Provide students with choices
To support students' ability to demonstrate their competencies in the targeted learning outcomes, access can be improved by providing students with multiple means of expression (Rose et al., 2006). Students may be more adept at, or may prefer, one method of expression over another. In the event that the way in which students express their knowledge, skills, and abilities is not systematically linked to the inferences about students' proficiency in the targeted learning outcomes, faculty can consider providing students with options for how they generate a product. For example, in the mathematics assessment designed to measure students' ability to collect, organize, and analyze data from a variety of sources, faculty can consider allowing students to write an essay, present the information orally, or prepare a poster. As long as the expectations of what knowledge, skills, and abilities are demonstrated remain consistent across the different options, the same scoring criteria (i.e., rubric) could be applied to each method of expression.
Step 5: Evaluate accessibility
The fifth step in improving accessibility through intentional test design is to evaluate the assessment prior to administration. As noted earlier, adjustments made to increase the accessibility of an assessment should not change the targeted learning outcomes. Similarly, if students are offered choices in the ways in which they engage with the tested content, formulate a response, or generate a product, the variations should not offer an unfair advantage or disadvantage. In the end, the assessment should provide meaningful and reliable data for making valid decisions about students' proficiency in the targeted learning outcomes. The appropriateness of the assessment can be evaluated in a number of ways. First, other faculty members with disciplinary expertise can review the test and provide input on the comparability of the variations. In designing items that assess higher-order thinking skills, Sadler (2016) also notes the value of faculty members, individually or in groups, imagining themselves as students responding to assessment items and thinking through how they might respond. Second, the test can be administered to a representative sample of students with similar personal characteristics, and their perceptions of the relative ease or difficulty of the accessible assessments can be solicited. This sample can be drawn from another course, or from a group of students who have already taken the course in which the test will be administered. This group of students can provide input about the clarity of the instructions, and describe the thought processes they used to complete the tasks. These data can be used to verify that the items are eliciting the intended outcomes (Sadler, 2016). Third, and finally, disability services offices may be able to provide input on the accessibility for students with disabilities. Some students with disabilities may require additional accommodations to support access to the test; disability services offices can verify that the test is amenable to accommodations. Again, revisions to the assessment should not compromise the validity of the intended interpretations and uses of the test scores. Faculty should critically examine the test to verify that it will provide trustworthy information for making valid decisions for all students.
Step 6: Expose students to the assessment approach
The sixth and final step before implementing accessible assessments is to provide students with ample exposure to the assessment approaches being implemented. In a qualitative study of innovative assessment approaches in higher education, Bevitt (2015) noted that students' prior experience with innovative assessment approaches impacted their perception of the assessment, and thus their experience as a student. Student participants expressed difficulty with novel approaches, explaining that additional time and effort were required for interpreting unfamiliar features that distracted them from completing the tasks. Some participants also expressed heightened negative emotions resulting from the implementation of new assessment approaches. Moreover, international students may experience additional burdens from novel assessment approaches because of the accumulation of cultural differences. As these findings suggest, students' unfamiliarity with variations in assessments may introduce CIV. In these instances, accessibility is decreased instead of enhanced.
To avoid these negative outcomes, students need to become familiar with the assessment approaches that will be implemented. For example, if the comprehensibility of a test is enhanced by presenting the prompts via video vignettes delivered through a learning management system, students should have prior experience interpreting information presented in video vignettes, and should have sufficient experience using the learning management system before the testing situation. Within a course, faculty introducing a novel assessment approach (e.g., analysis of video vignettes) may want to first introduce the approach during classroom instruction. Once students are comfortable interacting with the novel approach without being evaluated, faculty can then consider using the approach for assessment purposes. For assessment approaches that may extend beyond one course (e.g., making oral presentations), faculty can work together to determine in which courses students have the opportunity to develop the skills needed to engage with the approach prior to being evaluated. Again, the purpose of providing students with opportunities to become familiar with these assessment approaches is to increase the accessibility of assessments.

Concluding Remarks
The purpose of this article was to identify factors that impact accessibility of learning outcomes assessments and describe a framework and process for improving accessibility, with the primary goal of obtaining accurate and meaningful measurements of student proficiency. However, accessibility of assessments is best viewed within the broader framework of enhancing the inclusivity of educational settings. The diverse characteristics and preferences of postsecondary students necessitate reconsideration of many instructional and assessment practices to ensure that we are meeting the needs of all students. As such, focusing on improving the accessibility of assessments is only part of the solution. Instructional practices, including content delivery mechanisms, activities, and assignments, should also be subjected to similar methods for improving accessibility. Just as was described in this article in relation to assessments, the interaction between students' characteristics and preferences and instructional practices should be evaluated for accessibility. Only when instruction is accessible will students have equitable opportunities to learn the targeted content.