EDUR 7130
Educational Research On-Line

Practice Exercise for Internal Validity


Exercise 1

Wendy Martin provides an exercise for threats to internal validity at: Name That Threat

Exercise 2

Instructions

Determine which of the seven threats to internal validity may apply to each example below (history, maturation, regression, differential selection, testing, mortality, instrumentation).

1. A researcher decides to try a new mathematics curriculum in a nearby elementary school and to compare student achievement in math with that of students in another elementary school using the regular curriculum. The researcher is not aware, however, that the students in the "new curriculum" school have computers to use in their classrooms.

2. A researcher wishes to compare two different kinds of textbooks in two high school chemistry classes over a semester. She finds that 20% of one group and 10% of the other group are absent during the administration of unit tests.

3. Teachers of an experimental English curriculum as well as teachers of the regular curriculum administer both pre- and posttests to their own students.

4. Eighth-grade students who volunteer to tutor third-graders in reading show greater improvement in their own reading scores than a comparison group that does not participate in tutoring.

5. Those students who score in the bottom 10% academically in a school in an economically depressed area are selected for a special program of enrichment. The program includes special games, extra materials, special "snacks," specially colored materials to use, and new books. The students score substantially higher on achievement tests 6 months after the program is instituted.

6. A researcher designs a study to investigate the effects of simulation games on ethnocentrism. She plans to select two high schools to participate in an experiment. Students in both schools will be given a pretest designed to measure their attitudes toward minority groups. School A will then be given the simulation games during their social studies classes over a three-day period while school B sees travel films. Both schools will then be given the same test to see if their attitudes toward minority groups have changed. The researcher conducts the study as planned, but a special, unplanned documentary on racial prejudice is shown in school A between the pretest and the posttest.

7. A researcher uses pre- and posttests of "anxiety level" to compare students given relaxation training with students in a control group. Lower scores in the experimental group result.

8. In an experiment on survey methods, several people failed to return the control group survey.

9. Concerned about pretest sensitization, a researcher constructs a test that is extremely difficult, and that is not content valid, and administers it to both the experimental and control groups. The posttest used to measure gains in achievement is not as difficult, and the experimental group shows a slightly larger improvement than the control group.

10. A researcher uses the same set of problems to measure change over time in student ability to solve mathematics word problems. The first administration is given at the beginning of a unit of instruction; the second administration is given at the end of the unit of instruction, three weeks later. Improvement scores result.

11. The achievement scores of five elementary schools whose teachers use a cooperative learning approach are compared with those of five schools whose teachers do not use this approach. During the course of the study, the faculty of one of the schools where cooperative learning is not used is engaged in a disruptive conflict with the school principal.

12. A researcher tests a group of students enrolled in a special class for "students with artistic potential" every year for six years, beginning when they are aged five. She finds that their drawing ability improves markedly over the years.

13. The researcher uses a self-made test to compare the experimental and control groups.

14. In an experimental test of alternative forms of the SAT, a group took the traditional SAT form, which lasted approximately four hours, and then immediately took the shortened version, which lasted about one hour.


Answers

1. A researcher decides to try a new mathematics curriculum in a nearby elementary school and to compare student achievement in math with that of students in another elementary school using the regular curriculum. The researcher is not aware, however, that the students in the "new curriculum" school have computers to use in their classrooms.

A: History is the most likely threat, since the computers in the "new curriculum" school may affect the achievement variable in the study. Differential selection could also pose a problem, since students are unlikely to have been assigned randomly to the schools in which the experiment occurs. Any time groups in an experiment are not randomly formed, differential selection is likely to pose a problem.

2. A researcher wishes to compare two different kinds of textbooks in two high school chemistry classes over a semester. She finds that 20% of one group and 10% of the other group are absent during the administration of unit tests.

A: Mortality, since some students did not take the unit tests and the absence rates differ between the groups. Differential selection is also possible, since intact classes were used.

3. Teachers of an experimental English curriculum as well as teachers of the regular curriculum administer both pre- and posttests to their own students.

A: A pretest was used, so testing is a possible threat. Also, it is not clear whether the groups were randomly formed, so differential selection may be a threat as well.

4. Eighth-grade students who volunteer to tutor third-graders in reading show greater improvement in their own reading scores than a comparison group that does not participate in tutoring.

A: Differential selection is a problem here, since the groups in the study were not selected in the same manner: the tutors volunteered. Maturation is not a problem, since there is a comparison group.

5. Those students who score in the bottom 10% academically in a school in an economically depressed area are selected for a special program of enrichment. The program includes special games, extra materials, special "snacks," specially colored materials to use, and new books. The students score substantially higher on achievement tests 6 months after the program is instituted.

A: Regression to the mean is a possible threat, since only low-achieving students were selected. Also, since there is no comparison group, gains in this study will be difficult to distinguish from simple maturation effects.
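
The regression threat in this answer can be illustrated with a small simulation. The sketch below is not part of the original exercise. It assumes a hypothetical score scale (true achievement with mean 100 and standard deviation 10, plus independent measurement error on each test occasion) and applies no treatment at all. The function name and all parameter values are illustrative assumptions.

```python
import random

def simulate_regression_to_mean(n_students=2000, bottom_fraction=0.10, seed=0):
    """Return (pretest mean, posttest mean) for the bottom-scoring pretest group.

    No program or treatment is applied; any change is due to measurement error.
    """
    rng = random.Random(seed)
    # Hypothetical stable "true" achievement for each student.
    true_scores = [rng.gauss(100, 10) for _ in range(n_students)]
    # Each observed score is the true score plus independent random error.
    pretest = [t + rng.gauss(0, 10) for t in true_scores]
    posttest = [t + rng.gauss(0, 10) for t in true_scores]
    # Select the students who scored lowest on the pretest.
    n_selected = int(n_students * bottom_fraction)
    ranked = sorted(range(n_students), key=lambda i: pretest[i])
    selected = ranked[:n_selected]
    pre_mean = sum(pretest[i] for i in selected) / n_selected
    post_mean = sum(posttest[i] for i in selected) / n_selected
    return pre_mean, post_mean

pre_mean, post_mean = simulate_regression_to_mean()
print(f"Selected group pretest mean:  {pre_mean:.1f}")
print(f"Selected group posttest mean: {post_mean:.1f}")
```

Under these assumptions the selected group's posttest mean is higher than its pretest mean even though nothing was done for the students, which is why a gain in a group chosen for extreme low scores cannot be attributed to the program without a comparison group.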

6. A researcher designs a study to investigate the effects of simulation games on ethnocentrism. She plans to select two high schools to participate in an experiment. Students in both schools will be given a pretest designed to measure their attitudes toward minority groups. School A will then be given the simulation games during their social studies classes over a three-day period while school B sees travel films. Both schools will then be given the same test to see if their attitudes toward minority groups have changed. The researcher conducts the study as planned, but a special, unplanned documentary on racial prejudice is shown in school A between the pretest and the posttest.

A: Differential selection is possible since intact groups will be used. History is a threat since the unplanned documentary will likely impact the dependent variable. A pretest was given, so it is difficult to rule out testing effects.

7. A researcher uses pre- and posttests of "anxiety level" to compare students given relaxation training with students in a control group. Lower scores in the experimental group result.

A: Nothing is clearly a threat in this example, although the pretest could cause testing effects.

8. In an experiment on survey methods, several people failed to return the control group survey.

A: Mortality is the clear problem with this study.

9. Concerned about pretest sensitization, a researcher constructs a test that is extremely difficult, and that is not content valid, and administers it to both the experimental and control groups. The posttest used to measure gains in achievement is not as difficult, and the experimental group shows a slightly larger improvement than the control group.

A: Instrumentation is a problem for two reasons. First, the pretest is not content valid, so its scores cannot be interpreted accurately. Second, since the pretest and posttest are not of the same difficulty, they lack equivalent-forms reliability, which is also an instrumentation threat.

10. A researcher uses the same set of problems to measure change over time in student ability to solve mathematics word problems. The first administration is given at the beginning of a unit of instruction; the second administration is given at the end of the unit of instruction, three weeks later. Improvement scores result.

A: Maturation effects cannot be eliminated as a rival explanation for the improved scores since there is no control group.

11. The achievement scores of five elementary schools whose teachers use a cooperative learning approach are compared with those of five schools whose teachers do not use this approach. During the course of the study, the faculty of one of the schools where cooperative learning is not used is engaged in a disruptive conflict with the school principal.

A: History and differential selection may be problems. History results from the conflict as this could affect students' performance; differential selection may be a problem since intact groups--schools--were used.

12. A researcher tests a group of students enrolled in a special class for "students with artistic potential" every year for six years, beginning when they are aged five. She finds that their drawing ability improves markedly over the years.

A: Maturation may be a problem, since the lack of a control group means changes over time could be explained as maturation effects.

13. The researcher uses a self-made test to compare the experimental and control groups.

A: The only threat that presents itself here is instrumentation, since the self-made test may lack validity.

14. In an experimental test of alternative forms of the SAT, a group took the traditional SAT form, which lasted approximately four hours, and then immediately took the shortened version, which lasted about one hour.

A: Maturation is a problem here. After taking the SAT for four hours, the participants are very likely to be mentally exhausted, so their performance on the shortened version may be poorer not because of a weakness in the shortened SAT (which the researchers are testing for), but because of a reduced ability to concentrate.