Justin  Olmanson   Action Research  in  Assessment


The impact of the placement of semi-dense items in a criterion-referenced test of subtraction.

 By J. Olmanson.

 Question: Do distracters derived from common alternative or erroneous algorithms mislead students who might otherwise reconstruct the correct algorithm, thereby lowering a student’s ability to correctly answer subsequent questions that contain distracters based on such erroneous algorithms on a multiple-choice, criterion-referenced test?

 

At the beginning of the curriculum planning process stands the vision of a student empowered through the internalization of the knowledge, insight and skills the curriculum embodies. The vision is broken into goals that are subsequently grouped into objectives. Along the way the state, district, principal and/or educator must determine the best method or methods for assessing student achievement and the elements of the course. Such assessment seeks to understand the effect the learning experience has had on the learner’s thinking process. Methods of assessing such thinking hold varying degrees of authenticity, feasibility, validity and reliability.

 Among the assessment options of the past decades, standardized multiple-choice tests (both norm- and criterion-referenced) have emerged as the measurements of choice in the state of Texas and beyond for determining a student’s breadth of learning and level of mastery, an educator’s professional worth and a school’s quality (Black, 1998). No child in the mainstream educational system makes his or her way through school without completing hundreds of such official and practice assessments.

 

…American students are the most frequently tested and least often examined students in the world. (Resnick and Resnick, 1985)

 The question stated above asks whether the presence of certain types of test item distracters at the beginning of a standardized criterion-referenced test actually impedes the learner’s ability to recall learned algorithms for the tested instructional objectives, and consequently distorts the assessment instrument’s ability to accurately measure a learner’s understanding or potential to reconstruct his or her learning.

 The need for semi-dense test items (Bart, 1994), or quality distracters, derives from the diagnostic advantages they offer. Inasmuch as feedback is requisite for the realization of educational goals, quality distracters, more so than correct answers, have the potential to guide post-test instruction through the analysis of wrong-response data. Tatsuoka and Birenbaum (1980) found such data useful:

 Searching for the algorithm reflected in the student’s response-patterns may therefore become the gate to more accurate measures of activity.

 

It seems clear…that a valid measure of achievement needs to consider information from wrong as well as from correct responses.

 

Therefore, quality distracters are not only educationally beneficial but also increase a test item’s ability to discriminate between learners of varying degrees of understanding. The question put forth here is whether such distracters also inhibit a learner’s ability to accurately recall algorithms from prior instruction.

 The possibility for such an occurrence is related to Resnick’s findings (1976) and Tatsuoka and Birenbaum’s conclusions (1980):

 Children seek simplifying procedures that lead them to construct or ‘invent’ more efficient routines that might be quite difficult to teach directly. (1976)

 …the student is most likely to modify that algorithm. This modification can result in a wrong algorithm, which may yield correct answers occasionally, depending on factors such as syntactic attributes of the task presented to the student, that may have led him/her to construct their modified algorithm. (1980)

 These statements speak to the learner’s propensity to modify presented treatment or instruction. What is uncertain is when the modification occurs. If such modifications occur each time a task (test item) is placed in front of a learner (based on a number of variables), then it is possible that the presence of alternative algorithm distracters early in a test influences the reconstruction of the previously presented “learned” algorithms, thereby affecting responses on subsequent, similar items.

  In essence, the learner approaches initial test items with a potentially fuzzy understanding of the taught algorithm. The test taker then constructs a tentative model of the algorithm from memory and applies it to the task. Provided the item’s answer choices include the result of his or her working algorithm (correct or erroneous), the learner’s confidence in the algorithm increases and it is used to answer future items.

 According to Dr. Steven L. Wise, Senior Assessment Specialist and Professor of Psychology for the Center for Assessment and Research Studies at James Madison University, there is a need for research in this area. In light of this, a study was conducted to test for the possibility of this phenomenon.

  

Test construction

 

A 15-item multiple-choice test of two-digit by two-digit subtraction requiring regrouping was constructed. Each item contained one correct response and four distracters. The distracters were constructed by applying simplified erroneous algorithms to each task. In the event that a simplified algorithm produced the correct response, an alternative algorithm was used (for an explanation of each algorithm see Appendix 1).

 A 5-item multiple-choice test of two-digit by two-digit subtraction requiring regrouping was also constructed. Each item contained one correct response and four distracters. The distracters in this group did not follow any known algorithm; however, they were close in value to the results that would have been given had the same alternative algorithms been employed (if an alternative algorithm resulted in 26, then 24, 28 or the like was used).
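 One way such near-value distracters could be generated is sketched below in Python. The offsets, the function name and the selection rule are assumptions made for illustration; the paper does not state how the actual values were chosen, and the sketch is not expected to reproduce the distracters printed in Appendix 2.

```python
def near_value_distracters(algorithm_results, correct_answer, offsets=(2, -2, 3, -3)):
    """Pick one value close to each erroneous-algorithm result.

    Hypothetical rule: try each offset in turn and keep the first candidate
    that is not the correct answer, not itself an algorithm result, and not
    already chosen.
    """
    forbidden = set(algorithm_results) | {correct_answer}
    chosen = []
    for result in algorithm_results:
        for offset in offsets:
            candidate = result + offset
            if candidate >= 0 and candidate not in forbidden and candidate not in chosen:
                chosen.append(candidate)
                break
    return chosen


# 77 - 28: the four erroneous algorithms of Appendix 1 give 51, 59, 105 and 41.
print(near_value_distracters([51, 59, 105, 41], correct_answer=49))
```

 Excluding the algorithm results themselves matters here: a near-value distracter that happened to equal the output of an erroneous algorithm would defeat the purpose of the 5-item group.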

 Test A placed the 15-item group first, followed by the 5-item group. Test B placed the 5-item group first, followed by the 15-item group.
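 The two orderings can be expressed as a short Python sketch. The item stems are taken from Appendix 2; the variable and function names are illustrative assumptions only.

```python
# Item stems as (top number, bottom number), in the order printed in Appendix 2.
ALGORITHM_GROUP = [
    (30, 11), (40, 13), (35, 29), (70, 52), (86, 27),
    (81, 12), (45, 19), (50, 18), (53, 24), (31, 15),
    (22, 13), (31, 12), (20, 18), (28, 19), (33, 16),
]
NEAR_VALUE_GROUP = [(77, 28), (63, 35), (71, 24), (22, 14), (50, 17)]


def build_forms(algorithm_group, near_value_group):
    """Return (test_a, test_b): the same 20 items in two different orders."""
    test_a = list(algorithm_group) + list(near_value_group)
    test_b = list(near_value_group) + list(algorithm_group)
    return test_a, test_b


test_a, test_b = build_forms(ALGORITHM_GROUP, NEAR_VALUE_GROUP)
print(len(test_a), test_a[0])  # Test A opens with an algorithm-based item
print(len(test_b), test_b[0])  # Test B opens with a near-value item
```

 Because both forms contain identical items, any difference in scores between them can only be attributed to item order (or to chance and group differences), not to item content.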

  

Data Collection Procedures

  The data presented in this report were collected on December 5th, 2000 at an elementary school in the Greater Houston Metropolitan Area. The participants were 60 second graders in three classes, taught by three teachers who were using the same instructional objectives and materials.

  A 20-item multiple-choice test of two-digit over two-digit subtraction involving regrouping was administered as a paper-and-pencil test (see Appendix 2 for a copy of each version of the test). Each of the 60 students received Test A or Test B at random.
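  The paper does not describe the randomization procedure. As a minimal sketch, assuming each student independently had an equal chance of receiving either form, the assignment could look like this in Python (the function name and seed are hypothetical):

```python
import random
from collections import Counter


def assign_forms(n_students, seed=None):
    """Assign 'A' or 'B' to each student with equal probability (assumed rule)."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_students)]


assignments = assign_forms(60, seed=2000)
print(Counter(assignments))  # group sizes need not be exactly 30 and 30
```

  Independent assignment of this kind generally produces slightly unequal groups, which is at least consistent with the unequal group sizes reported in the Results.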

 

 Results

 While only a very rudimentary analysis of the data has been done, initial results show that the 31 students who took Test A received a grouped raw score of 164 out of a possible 620, or 26.45% correct. The 29 students who took Test B received a grouped raw score of 252 out of a possible 580, or 43.45% correct.
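 The arithmetic behind these group percentages can be checked with a few lines of Python. The function name is an illustrative assumption; the counts and raw scores are those reported above.

```python
ITEMS_PER_TEST = 20  # both forms contain the same 20 items


def percent_correct(raw_score, n_students, items_per_test=ITEMS_PER_TEST):
    """Return (points possible, percent correct) for a group of students."""
    possible = n_students * items_per_test
    return possible, 100 * raw_score / possible


possible_a, pct_a = percent_correct(164, 31)  # Test A: algorithm-based items first
possible_b, pct_b = percent_correct(252, 29)  # Test B: near-value items first

print(possible_a, round(pct_a, 2))
print(possible_b, round(pct_b, 2))
print(round(pct_b - pct_a, 2))  # gap between the groups, in percentage points
```

 These are pooled group figures only. They say nothing about the spread of individual scores, so they cannot by themselves establish whether the difference is statistically reliable.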

  

Conclusions

 While a more sophisticated analysis of the test data is necessary, and further studies are called for, the results point to the possibility of a strong correlation between the presence or absence of alternative algorithm distracters in initial test questions and test scores. If confirmed, such a correlation would give educators and test creators a means to reinforce instruction and increase the learner’s ability to demonstrate his or her understanding.



Appendix 1

 

Algorithms used for distracters.

 

Example:

 30
-11

 Alternative Algorithm #1  Result: 21.  [Always subtract the smaller digit from the larger in each column: (1-0) (3-1)]

 Alternative Algorithm #2  Result: 29.  [Correct ones-place subtraction, but the tens digit of the top number is not reduced after regrouping: (0 + 10 - 1) (3-1)]

 Alternative Algorithm #3  Result: 41.  [Addition: (0+1) (3+1)]

 Alternative Algorithm #4  Result: 11.  [Subtract the smaller digit from the larger in the ones column (1-0), then regroup in the tens column and subtract (3 regrouped to 2, then 2-1)]
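 The four alternative algorithms can be written out as a short Python sketch. This is one reading of the descriptions above, with function names chosen for illustration; it assumes a two-digit problem that requires regrouping (the ones digit on top is smaller than the ones digit below) and reproduces the results of the worked example.

```python
def split_digits(n):
    """Return (tens, ones) for a two-digit number."""
    return divmod(n, 10)


def algorithm_1(top, bottom):
    """Always subtract the smaller digit from the larger in each column."""
    top_tens, top_ones = split_digits(top)
    bot_tens, bot_ones = split_digits(bottom)
    return abs(top_tens - bot_tens) * 10 + abs(top_ones - bot_ones)


def algorithm_2(top, bottom):
    """Regroup for the ones column but never reduce the tens digit."""
    top_tens, top_ones = split_digits(top)
    bot_tens, bot_ones = split_digits(bottom)
    return (top_tens - bot_tens) * 10 + (top_ones + 10 - bot_ones)


def algorithm_3(top, bottom):
    """Add instead of subtracting."""
    return top + bottom


def algorithm_4(top, bottom):
    """Smaller-from-larger in the ones column, yet still reduce the tens digit."""
    top_tens, top_ones = split_digits(top)
    bot_tens, bot_ones = split_digits(bottom)
    return (top_tens - 1 - bot_tens) * 10 + abs(top_ones - bot_ones)


def algorithm_distracters(top, bottom):
    """Return the correct answer and the four erroneous results, in order."""
    algorithms = (algorithm_1, algorithm_2, algorithm_3, algorithm_4)
    return top - bottom, [alg(top, bottom) for alg in algorithms]


print(algorithm_distracters(30, 11))  # (19, [21, 29, 41, 11])
```

 A test constructor would still need to check each item by hand, since for some problems two algorithms can yield the same value or coincide with the correct answer.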

 

 Appendix 2

 

Test A

Subtract. Circle the correct answer.

 30
-11
a) 21   b) 41   c) 19   d) 11   e) 29

 40
-13
a) 27   b) 23   c) 37   d) 33   e) 53

 35
-29
a) 4   b) 14   c) 64   d) 6   e) 16

 70
-52
a) 18   b) 22   c) 12   d) 135   e) 28

 86
-27
a) 51   b) 113   c) 69   d) 61   e) 59

 81
-12
a) 69   b) 71   c) 79   d) 93   e) 61

 45
-19
a) 22   b) 64   c) 24   d) 34   e) 26

 50
-18
a) 42   b) 68   c) 48   d) 32   e) 28

 53
-24
a) 77   b) 29   c) 31   d) 21   e) 39

 31
-15
a) 14   b) 26   c) 24   d) 16   e) 46

 22
-13
a) 9   b) 11   c) 35   d) 19   e) 1

 31
-12
a) 19   b) 11   c) 21   d) 29   e) 43

 20
-18
a) 18   b) 12   c) 2   d) 38   e) 8

 28
-19
a) 47   b) 19   c) 11   d) 9   e) 1

 33
-16
a) 23   b) 49   c) 13   d) 27   e) 17

 77
-28
a) 53   b) 100   c) 49   d) 48   e) 37

 63
-35
a) 20   b) 92   c) 36   d) 25   e) 28

 71
-24
a) 47   b) 44   c) 56   d) 98   e) 40

 22
-14
a) 4   b) 8   c) 13   d) 32   e) 9

 50
-17
a) 44   b) 42   c) 65   d) 36   e) 33

Test B

Subtract. Circle the correct answer.

 77
-28
a) 53   b) 100   c) 49   d) 48   e) 37

 63
-35
a) 20   b) 92   c) 36   d) 25   e) 28

 71
-24
a) 47   b) 44   c) 56   d) 98   e) 40

 22
-14
a) 4   b) 8   c) 13   d) 32   e) 9

 50
-17
a) 44   b) 42   c) 65   d) 36   e) 33

 30
-11
a) 21   b) 41   c) 19   d) 11   e) 29

 40
-13
a) 27   b) 23   c) 37   d) 33   e) 53

 35
-29
a) 4   b) 14   c) 64   d) 6   e) 16

 70
-52
a) 18   b) 22   c) 12   d) 135   e) 28

 86
-27
a) 51   b) 113   c) 69   d) 61   e) 59

 81
-12
a) 69   b) 71   c) 79   d) 93   e) 61

 45
-19
a) 22   b) 64   c) 24   d) 34   e) 26

 50
-18
a) 42   b) 68   c) 48   d) 32   e) 28

 53
-24
a) 77   b) 29   c) 31   d) 21   e) 39

 31
-15
a) 14   b) 26   c) 24   d) 16   e) 46

 22
-13
a) 9   b) 11   c) 35   d) 19   e) 1

 31
-12
a) 19   b) 11   c) 21   d) 29   e) 43

 20
-18
a) 18   b) 12   c) 2   d) 38   e) 8

 28
-19
a) 47   b) 19   c) 11   d) 9   e) 1

 33
-16
a) 23   b) 49   c) 13   d) 27   e) 17

 

 

Bibliography

 Bart, William M. A Diagnostic Analysis of a Proportional Reasoning Test Item: An Introduction to the Properties of a Semi-Dense Item. Focus on Learning Problems in Mathematics; v16 n3 Summer 1994.

 Birenbaum and Tatsuoka. The Use of Information from Wrong Responses in Measuring Students’ Achievement. Illinois Univ., Urbana. Computer-Based Education Research Lab. 1980.

 Black, Paul. Testing: Friend or Foe? Theory and Practice of Assessment and Testing. The Falmer Press, London, 1998.

 Burton and Brown. An Investigation of Computer Coaching for Informal Learning Activities. International Journal of Man-Machine Studies; 1978.

  Haladyna, Thomas M. Developing and Validating Multiple-Choice Test Items. Lawrence Erlbaum Associates, New Jersey, 1999.

 Resnick and Resnick. Standards, Curriculum and Performance: A Historical and Comparative Perspective. Educational Researcher, 14, 1985.

Tatsuoka, Kikumi. A Probabilistic Model for Diagnosing Misconceptions by the Pattern Classification Approach. Journal of Educational Statistics; v10 n1 Spring 1985.

 Wise, Steven L. Senior Assessment Specialist and Professor of Psychology at the Center for Assessment and Research Studies, James Madison University, Harrisonburg, VA. E-mail correspondence. wisesl@jmu.edu.

 

 



i teach i learn.com © 1999-2003
Educators. Technology. Connected.