The impact of placement of semi-dense items in a criterion-referenced test of
subtraction.
By J. Olmanson.
Question: Do distracters derived from common alternative or erroneous
algorithms mislead students who might otherwise reconstruct the correct
algorithm, thereby lowering a student’s ability to correctly answer subsequent
questions that contain such erroneous algorithms on a multiple-choice,
criterion-referenced test?
At the beginning of the curriculum planning process stands
the vision of a student empowered through the internalization of the knowledge,
insight and skills it embodies. The vision is broken into goals that are
subsequently grouped into objectives. Along the way, the state, district,
principal and/or educator must determine the best method or methods to use in
the assessment of student achievement and the elements of the course. Such
assessment seeks to gain an understanding of the effect the learning experience
has had on the learner’s thinking process. Methods of assessing such thinking
hold varying degrees of authenticity, feasibility, validity and reliability.
Within the milieu of assessment options over the past
decades, standardized multiple-choice tests (both norm- and
criterion-referenced) have emerged as the measurements of choice, in the state
of Texas and beyond, for determining a student’s breadth of learning and level of
mastery, an educator’s professional worth and a school’s quality (Black,
1998). No child in the mainstream educational system makes his or her way
through school without completing hundreds of such official and practice
assessments.
…American students are the most frequently
tested and least often examined students in the world. (Resnick and Resnick,
1985)
The question stated above asks whether the presence of certain types of test
item distracters at the beginning of standardized criterion-referenced tests
actually impedes the learner’s ability to successfully recall learned
algorithms for tested instructional objectives, consequently distorting the
assessment instrument’s ability to accurately measure a learner’s
understanding or potential to reconstruct his or her learning.
The need for semi-dense test items (Bart, 1994), or quality distracters,
derives from the diagnostic advantages they offer. Inasmuch as feedback is
requisite for the realization of educational goals, quality distracters, more
so than correct answers, have the potential to guide post-test instruction
through the analysis of wrong-response data. Tatsuoka and Birenbaum (1980)
found such data useful:
Searching for the algorithm reflected in the
student’s response-patterns may therefore become the gate to more accurate
measures of activity.
It seems clear…that a valid
measure of achievement needs to consider information from wrong as well as from
correct responses.
Therefore quality distracters are not only educationally
beneficial but also increase a test item’s ability to discriminate between
learners of varying degrees of understanding. The question put forth here is
whether such distracters actually inhibit a learner’s ability to accurately
recall algorithms from prior instruction.
The possibility for such an occurrence is related to
Resnick’s findings (1976) and Tatsuoka and Birenbaum’s conclusions (1980):
Children seek simplifying procedures that lead them
to construct or ‘invent’ more efficient routines that might be quite
difficult to teach directly. (1976)
…the student is most likely to modify that
algorithm. This modification can result in a wrong algorithm, which may yield
correct answers occasionally, depending on factors such as syntactic attributes
of the task presented to the student, that may have led him/her to construct
their modified algorithm. (1980)
These statements speak to the learner’s propensity to modify presented
treatment or instruction. What is uncertain is when the modification occurs.
If such modifications occur each time a task (test item) is placed in front of
a learner (based on a number of variables), then it is possible that the
presence of alternative algorithm distracters early in a test influences the
reconstruction of the previously presented “learned” algorithms, thereby
affecting responses on subsequent, similar items.
In essence, the learner approaches initial test items with a potentially fuzzy
understanding of the taught algorithm. The test taker then constructs a
tentative model of the algorithm from memory and applies it to the task.
Provided the item contains the result of his or her working algorithm (correct
or erroneous), the learner’s confidence in the algorithm increases and it is
used to answer future items.
According to Dr. Steven L. Wise, Senior Assessment
Specialist and Professor of Psychology for the Center for Assessment and
Research Studies at James Madison University, there is a need for research in
this area. In light of this, a study was conducted to test for the possibility
of this phenomenon.
Test construction
A 15-item multiple-choice test of two-digit by two-digit subtraction requiring
regrouping was constructed. Each item contained one correct response and four
distracters. The distracters were constructed by applying simplified erroneous
algorithms to each task. In the event that a simplified algorithm produced the
correct response, an alternative algorithm was used (for an explanation of each
algorithm see Appendix 1).
A 5-item multiple-choice test of two-digit by two-digit subtraction requiring
regrouping was also constructed. Each item contained one correct response and
four distracters. The distracters in this group did not follow any known
algorithm; however, they were close in value to the results that would have
been given had the same alternative algorithms been employed (if an
alternative algorithm resulted in 26, then 24, 28 or the like was used).
Test A placed the 15-item group mentioned above first on the test, followed by
the 5-item group. Test B placed the 5-item group first, followed by the
15-item group.
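The ordering of the two forms can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to prepare the tests: the names ALGORITHM_GROUP, NEUTRAL_GROUP and build_forms are hypothetical, and each item is represented simply as a (minuend, subtrahend) pair taken from the copy of the test in Appendix 2.

```python
# Hypothetical representation: each item is a (minuend, subtrahend) pair.
# The pairs are the problems listed in the copy of the test in Appendix 2.
ALGORITHM_GROUP = [  # 15 items whose distracters follow erroneous algorithms
    (30, 11), (40, 13), (35, 29), (70, 52), (86, 27),
    (81, 12), (45, 19), (50, 18), (53, 24), (31, 15),
    (22, 13), (31, 12), (20, 18), (28, 19), (33, 16),
]
NEUTRAL_GROUP = [  # 5 items whose distracters follow no known algorithm
    (77, 28), (63, 35), (71, 24), (22, 14), (50, 17),
]


def build_forms(algorithm_group, neutral_group):
    """Return the item order for Test A and Test B.

    Test A: algorithm-based group first, then the neutral group.
    Test B: neutral group first, then the algorithm-based group.
    """
    test_a = list(algorithm_group) + list(neutral_group)
    test_b = list(neutral_group) + list(algorithm_group)
    return test_a, test_b


test_a, test_b = build_forms(ALGORITHM_GROUP, NEUTRAL_GROUP)
print("Test A starts with:", test_a[0])
print("Test B starts with:", test_b[0])
```

Both forms contain the same 20 items, so any difference in scores between them can only come from the order in which the two groups are met.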
Data Collection Procedures
The data presented in this report were collected on December 5, 2000, at an
elementary school in the Greater Houston Metropolitan Area. The participants
were 60 second graders in three classes, taught by three teachers who were
using the same instructional objectives and materials.
A 20-item multiple-choice test of two-digit by two-digit subtraction involving
regrouping was administered as a paper-and-pencil test (see Appendix 2 for a
copy of each version of the test). Each of the 60 students received, at
random, either Test A or Test B.
Results
While only a very rudimentary analysis of the data has been done, initial
results show that the 31 students who took Test A earned a combined raw score
of 164 out of a possible 620 (26.45% correct). The 29 students who took Test B
earned a combined raw score of 252 out of a possible 580 (43.45% correct).
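The group percentages follow directly from the raw scores and the number of students on each form. The short Python sketch below reproduces that arithmetic, rounded to two decimal places; the function name percent_correct is hypothetical, and the only assumption is that every student answered a 20-item test.

```python
def percent_correct(raw_score, students, items_per_test=20):
    """Return (points possible, percent correct) for one group of students."""
    possible = students * items_per_test
    return possible, 100.0 * raw_score / possible


# Raw scores and group sizes as reported for each form.
for label, raw, n in (("Test A", 164, 31), ("Test B", 252, 29)):
    possible, pct = percent_correct(raw, n)
    print(f"{label}: {raw}/{possible} = {pct:.2f}% correct")
```

This is a descriptive summary only. It says nothing about statistical significance, which would require the per-student scores.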
Conclusions
While a more sophisticated analysis of the test data is necessary, and further
studies are called for, the possibility of a strong correlation between the
presence or absence of alternative algorithm distracters in initial test
questions and test scores gives educators and test creators a means to
reinforce instruction and increase the learner’s ability to demonstrate his or
her understanding.
Appendix 1
Algorithms used for distracters.
Example: 30 - 11 (correct result 19)
Alternative Algorithm #1
Result 21.
[Always subtract the smaller digit from the larger in each column (1-0) (3-1)]
Alternative Algorithm #2
Result 29.
[Correct ones-place subtraction, but the tens digit is not reduced after
regrouping (0 + 10 – 1) (3-1)]
Alternative Algorithm #3
Result 41.
[Addition (0+1) (3+1)]
Alternative Algorithm #4
Result 11.
[Subtract the smaller digit from the larger in the ones column (1-0), regroup
in the tens column and subtract (3 regrouped to 2, then 2-1)]
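The four alternative algorithms can be written out as small Python functions. This is a sketch of the descriptions above, not the author's own procedure: the function names are hypothetical, and the code assumes a two-digit problem a - b with a > b in which the ones digit of a is smaller than the ones digit of b, so that regrouping is required.

```python
def digits(n):
    """Split a two-digit number into (tens, ones)."""
    return divmod(n, 10)


def alg1_smaller_from_larger(a, b):
    """#1: in every column, subtract the smaller digit from the larger."""
    (at, ao), (bt, bo) = digits(a), digits(b)
    return 10 * abs(at - bt) + abs(ao - bo)


def alg2_no_tens_reduction(a, b):
    """#2: regroup correctly in the ones column, but leave the tens digit as is."""
    (at, ao), (bt, bo) = digits(a), digits(b)
    return 10 * (at - bt) + (ao + 10 - bo)


def alg3_addition(a, b):
    """#3: add instead of subtracting."""
    return a + b


def alg4_flip_ones_then_regroup(a, b):
    """#4: smaller from larger in the ones column, yet still reduce the tens digit."""
    (at, ao), (bt, bo) = digits(a), digits(b)
    return 10 * (at - 1 - bt) + abs(ao - bo)


def distracters(a, b):
    """Return the results of algorithms #1 to #4, in that order."""
    return [
        alg1_smaller_from_larger(a, b),
        alg2_no_tens_reduction(a, b),
        alg3_addition(a, b),
        alg4_flip_ones_then_regroup(a, b),
    ]


print(distracters(30, 11))  # [21, 29, 41, 11], matching the example above
```

Applied to other items from the 15-item group, for instance 53 - 24, the same functions give 31, 39, 77 and 21, which are the four wrong options printed for that item in Appendix 2.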
Appendix 2
Test A
Subtract. Circle the correct answer.
30 - 11    a) 21   b) 41   c) 19   d) 11   e) 29
40 - 13    a) 27   b) 23   c) 37   d) 33   e) 53
35 - 29    a) 4    b) 14   c) 64   d) 6    e) 16
70 - 52    a) 18   b) 22   c) 12   d) 135  e) 28
86 - 27    a) 51   b) 113  c) 69   d) 61   e) 59
81 - 12    a) 69   b) 71   c) 79   d) 93   e) 61
45 - 19    a) 22   b) 64   c) 24   d) 34   e) 26
50 - 18    a) 42   b) 68   c) 48   d) 32   e) 28
53 - 24    a) 77   b) 29   c) 31   d) 21   e) 39
31 - 15    a) 14   b) 26   c) 24   d) 16   e) 46
22 - 13    a) 9    b) 11   c) 35   d) 19   e) 1
31 - 12    a) 19   b) 11   c) 21   d) 29   e) 43
20 - 18    a) 18   b) 12   c) 2    d) 38   e) 8
28 - 19    a) 47   b) 19   c) 11   d) 9    e) 1
33 - 16    a) 23   b) 49   c) 13   d) 27   e) 17
77 - 28    a) 53   b) 100  c) 49   d) 48   e) 37
63 - 35    a) 20   b) 92   c) 36   d) 25   e) 28
71 - 24    a) 47   b) 44   c) 56   d) 98   e) 40
22 - 14    a) 4    b) 8    c) 13   d) 32   e) 9
50 - 17    a) 44   b) 42   c) 65   d) 36   e) 33

Test B
Subtract. Circle the correct answer.
77 - 28    a) 53   b) 100  c) 49   d) 48   e) 37
63 - 35    a) 20   b) 92   c) 36   d) 25   e) 28
71 - 24    a) 47   b) 44   c) 56   d) 98   e) 40
22 - 14    a) 4    b) 8    c) 13   d) 32   e) 9
50 - 17    a) 44   b) 42   c) 65   d) 36   e) 33
30 - 11    a) 21   b) 41   c) 19   d) 11   e) 29
40 - 13    a) 27   b) 23   c) 37   d) 33   e) 53
35 - 29    a) 4    b) 14   c) 64   d) 6    e) 16
70 - 52    a) 18   b) 22   c) 12   d) 135  e) 28
86 - 27    a) 51   b) 113  c) 69   d) 61   e) 59
81 - 12    a) 69   b) 71   c) 79   d) 93   e) 61
45 - 19    a) 22   b) 64   c) 24   d) 34   e) 26
50 - 18    a) 42   b) 68   c) 48   d) 32   e) 28
53 - 24    a) 77   b) 29   c) 31   d) 21   e) 39
31 - 15    a) 14   b) 26   c) 24   d) 16   e) 46
22 - 13    a) 9    b) 11   c) 35   d) 19   e) 1
31 - 12    a) 19   b) 11   c) 21   d) 29   e) 43
20 - 18    a) 18   b) 12   c) 2    d) 38   e) 8
28 - 19    a) 47   b) 19   c) 11   d) 9    e) 1
33 - 16    a) 23   b) 49   c) 13   d) 27   e) 17
Bibliography
Bart, William M. A Diagnostic Analysis of a
Proportional Reasoning Test Item: An Introduction to the Properties of a
Semi-Dense Item. Focus on Learning Problems in Mathematics; v16 n3 Summer 1994.
Birenbaum and Tatsuoka. The Use of Information
from Wrong Responses in Measuring Student’s Achievement. Illinois Univ.,
Urbana. Computer-Based Education Research Lab. 1980.
Black, Paul. Testing: Friend or Foe? Theory
and Practice of Assessment and Testing. The Falmer Press, London, 1998.
Burton and Brown. An Investigation of Computer
Coaching for Informal Learning Activities. International Journal of Man-Machine
Studies; 1978.
Haladyna, Thomas M. Developing and
Validating Multiple-Choice Test Items. Lawrence Erlbaum Associates, New Jersey,
1999.
Resnick and Resnick. Standards, Curriculum and
Performance: A Historical and Comparative Perspective. Educational Researcher;
v14 1985.
Tatsuoka, Kikumi. A Probabilistic Model for
Diagnosing Misconceptions by the Pattern Classification Approach. Journal of
Educational Statistics; v10 n1 Spring 1985.
Wise, Steven L. Senior Assessment Specialist
and Professor of Psychology at the Center for Assessment and Research Studies.
James Madison University. Harrisonburg, VA. E-mail correspondence.
wisesl@jmu.edu.