Deck 9: Validity and Reliability
1
A test has a reliability coefficient of .84. What percent of test variance is error?
A) 4
B) 16
C) 32
D) 71
E) 84
B
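The arithmetic behind this card can be sketched as follows, assuming the classical test theory reading in which the reliability coefficient is the proportion of observed-score variance that is true variance, so the remainder is error. The function name is illustrative and does not come from the deck.

```python
def error_variance_percent(reliability: float) -> float:
    """Percent of observed-score variance attributed to error: (1 - r) * 100.

    Assumes the classical model: observed variance = true variance + error variance.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return (1.0 - reliability) * 100.0


print(round(error_variance_percent(0.84), 1))  # 16.0
```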
2
Exhibit 9-1: Choose the type of evidence for validity that is referred to in the following questions.
Refer to Exhibit 9-1. The test analysis shows that the test is measuring a single trait.
A) evidence based on content
B) evidence based on internal structure
C) evidence based on relationship to other variables
B
3
On a standardized mathematics test, the reliability coefficient is reported to be .76. From this information, one could best determine
A) the extent to which the test scores are correlated to classroom achievement in mathematics.
B) the extent to which errors of measurement have influenced the test scores.
C) the extent to which the test is a representative sample of relevant concepts in math.
D) on the average, the number of points students' scores will change when given an equivalent test.
B
4
Exhibit 9-3: Choose the type of evidence (a-c) that would be of primary importance for the validity of the following tests.
Refer to Exhibit 9-3. An employment test for selecting data processors
A) evidence based on content
B) evidence supporting construct
C) evidence based on relationship to other variables
5
Exhibit 9-2: An achievement test for advanced math was developed for high school students. In the process of preparing the test, a number of steps were carried out. Choose the option that most directly relates to each of the following steps.
Refer to Exhibit 9-2. For 5,000 cases, scores on the test were correlated with grades in freshman college mathematics. This step is concerned with
A) reliability.
B) validity evidence based on content.
C) validity evidence based on relationship to the other variables.
D) validity evidence based on response processes.
6
The standard error of measurement is based on the test's
A) validity.
B) reliability.
C) objectivity.
D) difficulty.
E) discriminability.
7
Exhibit 9-1: Choose the type of evidence for validity that is referred to in the following questions.
Refer to Exhibit 9-1. The scores on Test A correlate highly with subjects' performance on Test B.
A) evidence based on content
B) evidence based on internal structure
C) evidence based on relationship to other variables
8
Exhibit 9-3: Choose the type of evidence (a-c) that would be of primary importance for the validity of the following tests.
Refer to Exhibit 9-3. A quantitative aptitude test used for screening applicants for admission to the School of Engineering.
A) evidence based on content
B) evidence supporting construct
C) evidence based on relationship to other variables
9
Exhibit 9-1: Choose the type of evidence for validity that is referred to in the following questions.
Refer to Exhibit 9-1. The test predicts performance in foreign language study.
A) evidence based on content
B) evidence based on internal structure
C) evidence based on relationship to other variables
10
Exhibit 9-3: Choose the type of evidence (a-c) that would be of primary importance for the validity of the following tests.
Refer to Exhibit 9-3. An inventory designed to measure depression in adolescents.
A) evidence based on content
B) evidence supporting construct
C) evidence based on relationship to other variables
11
Exhibit 9-3: Choose the type of evidence (a-c) that would be of primary importance for the validity of the following tests.
Refer to Exhibit 9-3. An end-of-year test in junior high school science
A) evidence based on content
B) evidence supporting construct
C) evidence based on relationship to other variables
12
Exhibit 9-2: An achievement test for advanced math was developed for high school students. In the process of preparing the test, a number of steps were carried out. Choose the option that most directly relates to each of the following steps.
Refer to Exhibit 9-2. The test outline was prepared by a committee of experts in the high school math curriculum. This step is concerned with
A) reliability.
B) validity evidence based on content.
C) validity evidence based on relationship to the other variables.
D) validity evidence based on response processes.
13
The reliability of a spelling test is .80. If the test is doubled in length, the reliability of the test will be approximately
A) .80.
B) .84.
C) .89.
D) .90.
E) 1.60.
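The effect of lengthening a test is conventionally estimated with the Spearman-Brown prophecy formula. A minimal sketch, assuming the added items are comparable to the existing ones; the function and parameter names are illustrative only:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by `length_factor`.

    Spearman-Brown: r_new = k * r / (1 + (k - 1) * r), where k is the ratio
    of new length to old length (k = 2 means the test is doubled).
    """
    k = length_factor
    return (k * reliability) / (1.0 + (k - 1.0) * reliability)


# Doubling a test whose reliability is .80
print(round(spearman_brown(0.80, 2), 2))  # 0.89
```

Because the denominator grows with the reliability itself, doubling the length never doubles the coefficient, and the prediction cannot exceed 1.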
14
Exhibit 9-1: Choose the type of evidence for validity that is referred to in the following questions.
Refer to Exhibit 9-1. The sample of items is representative of the course subject matter.
A) evidence based on content
B) evidence based on internal structure
C) evidence based on relationship to other variables
15
Exhibit 9-2: An achievement test for advanced math was developed for high school students. In the process of preparing the test, a number of steps were carried out. Choose the option that most directly relates to each of the following steps.
Refer to Exhibit 9-2. For 5,000 cases, two forms of the test were given and scores on Form A were correlated with scores on Form B. This step is concerned with
A) reliability.
B) validity evidence based on content.
C) validity evidence based on relationship to the other variables.
D) validity evidence based on response processes.
16
The correlation between two equivalent forms of an objective test administered in immediate succession enables one to determine the proportion of error variance due to
A) item homogeneity.
B) temporal fluctuation.
C) item sampling.
D) scorer reliability.
E) a and b
17
A teacher split a 50-item test into odd and even items, scored each half, and then correlated odd scores with even scores. The correlation coefficient was .55. What is the estimated reliability for the full 50-item test?
A) .80
B) .71
C) .69
D) .65
E) .60
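The odd-even procedure described in this card can be sketched end to end: score the two halves, correlate them, then step the half-test correlation up to full length with the Spearman-Brown correction. This is an illustrative sketch; the helper names and the idea of passing a matrix of item scores are assumptions, not part of the deck.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def step_up(r_half):
    """Spearman-Brown correction from a half-test correlation to full length."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability; each row holds one examinee's item scores."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return step_up(pearson(odd, even))


# The card reports a half-test correlation of .55
print(round(step_up(0.55), 2))  # 0.71
```

The correction is needed because each half is only 25 items long, and a shorter test yields a lower coefficient than the full 50-item test would.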
18
An individual's score on an achievement test is 82. The standard error of measurement for the test is reported to be 3 points. What are the chances that the individual's true score is between 76 and 88?
A) About 1 chance in 3
B) About 1 chance in 6
C) About 2 chances in 3
D) About 9 chances in 10
E) About 95 chances in 100
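The usual textbook reasoning treats measurement error as normally distributed around the observed score, with the standard error of measurement as its standard deviation. A rough sketch under that assumption, taking the observed score as 82 (the reading consistent with the 76-to-88 interval); the function name is hypothetical:

```python
from math import erf, sqrt


def chance_within(observed, sem, low, high):
    """Normal-theory chance that the true score falls between `low` and `high`.

    Assumes errors are normal with standard deviation equal to the SEM and
    centers the band on the observed score, as introductory texts do.
    """
    def normal_cdf(z):
        return 0.5 * (1.0 + erf(z / sqrt(2.0)))

    return normal_cdf((high - observed) / sem) - normal_cdf((low - observed) / sem)


# 76 to 88 is the observed score plus or minus two SEMs of 3 points each
print(round(chance_within(82, 3, 76, 88), 2))  # 0.95
```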
19
A student received a score of 85 on a classroom test where the mean was 70 and the standard deviation was 10. The test was found to have a reliability coefficient of .64. Which of the following statements would correctly express the accuracy of the student's obtained score? There are
A) 50 chances out of 100 that the true score is included by the range of scores from 72 to 78.
B) 68 chances out of 100 that the true score is included by the range of scores from 75 to 95.
C) 68 chances out of 100 that the true score is included by the range of scores from 79 to 91.
D) 75 chances out of 100 that the true score is included by the range of scores from 79 to 91.
E) 95 chances out of 100 that the true score is included by the range of scores from 79 to 91.
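Cards like this one rest on the standard error of measurement, conventionally computed as the standard deviation times the square root of one minus the reliability. A minimal sketch of that formula and the resulting score band; the function names are illustrative, and the one-SEM band is read as roughly a 68 percent interval under the usual normal-error assumption:

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1.0 - reliability)


def score_band(observed, sd, reliability, n_sem=1.0):
    """Band of `n_sem` standard errors of measurement around an observed score."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - n_sem * sem, observed + n_sem * sem


low, high = score_band(85, 10, 0.64)
print(round(low), round(high))  # 79 91
```

Note that the test mean (70) plays no part in the calculation; only the standard deviation and the reliability determine the SEM. The same formula shows that a lower reliability gives a larger SEM when the standard deviation is held fixed.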
20
A test was given to a group of sixth-grade children. The score reliability of the test was found to be .70. Later the test was given to the seventh and eighth grades in the same school system and the results for the three grades were combined. What would you expect the reliability coefficient of the test to be?
A) Above .70
B) Below .70
C) Very close to .70
D) Impossible to determine from the information given.
21
Exhibit 9-5: Choose the method of estimating score reliability that would be most useful for each of the types of tests listed below.
Refer to Exhibit 9-5. An achievement test to be given as a pretest and a posttest in a classroom research study
A) equivalent forms
B) split-half
C) test-retest
22
The textbook defines test reliability as the ratio between the variance of the
A) error scores and true scores.
B) true scores and observed scores.
C) error scores and observed scores.
D) two sets of scores from identical or equivalent tests.
23
Exhibit 9-5: Choose the method of estimating score reliability that would be most useful for each of the types of tests listed below.
Refer to Exhibit 9-5. A physical fitness test administered in junior high school
A) equivalent forms
B) split-half
C) test-retest
24
Exhibit 9-5: Choose the method of estimating score reliability that would be most useful for each of the types of tests listed below.
Refer to Exhibit 9-5. A high school math test used to predict success in college math.
A) equivalent forms
B) split-half
C) test-retest
25
To estimate changes in test reliability that would result from changing the length of a test, one would use the
A) phi coefficient.
B) Kuder-Richardson 20.
C) Spearman-Brown.
D) coefficient alpha.
26
A researcher wishes to increase the score reliability of a reading test being used in a study. The score reliability could be most easily increased by increasing the
A) number of items on the test.
B) number of persons tested.
C) homogeneity of the group tested.
D) number of types of items on the test.
27
Exhibit 9-4: Following is a list of testing practices. Assuming other things are equal, indicate whether the practice would lower reliability, raise reliability, or neither.
Refer to Exhibit 9-4. Adding 10 items to the test that everyone answers correctly.
A) lower reliability
B) raise reliability
C) neither raise nor lower reliability
28
Split-half reliability will be higher than Kuder-Richardson reliability in
A) a long test.
B) a very difficult test.
C) a heterogeneous test.
D) a test with a large error variance.
E) an internally consistent test.
29
An investigator computed the split-half reliability coefficient of a test and found it to be .85. Later an equivalent form of the test was prepared and the reliability computed by administering the two forms to a group. One might expect that the reliability coefficient obtained by the latter procedure would be
A) lower than .85.
B) higher than .85.
C) also .85.
30
It has been determined that 30 percent of a test's total variance is error. What is the test's reliability coefficient?
A) .30
B) .60
C) .70
D) .75
31
A test designed to measure achievement motivation proved to be very consistent but to have no relation to other measures of motivation or teachers' ratings of motivation. The test is
A) valid and reliable.
B) reliable but not valid.
C) valid but not reliable.
D) neither valid nor reliable.
32
Exhibit 9-5: Choose the method of estimating score reliability that would be most useful for each of the types of tests listed below.
Refer to Exhibit 9-5. A classroom achievement test used for grading purposes
A) equivalent forms
B) split-half
C) test-retest
33
The Kuder-Richardson 21 formula for estimating the score reliability of a classroom achievement test has the disadvantage that
A) it is difficult to compute.
B) it assumes that all test items are of equal difficulty.
C) it is inappropriate for a power test.
D) the resulting reliability coefficient must be adjusted by using another formula.
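For reference, Kuder-Richardson formula 21 needs only the number of items, the mean total score, and the variance of total scores, which is why it is easy to compute by hand. A sketch of the standard formula follows; the example numbers (50 items, mean 35, variance 64) are invented for illustration and do not come from the deck.

```python
def kr21(n_items, mean_score, variance):
    """Kuder-Richardson formula 21.

    KR-21 = (k / (k - 1)) * (1 - M * (k - M) / (k * s^2)),
    where k is the number of dichotomously scored items, M the mean total
    score, and s^2 the variance of total scores. No item-level data are used.
    """
    k = n_items
    return (k / (k - 1)) * (1.0 - (mean_score * (k - mean_score)) / (k * variance))


# Hypothetical class results: 50 items, mean 35, variance 64
print(round(kr21(50, 35, 64), 2))  # 0.85
```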
34
Exhibit 9-4: Following is a list of testing practices. Assuming other things are equal, indicate whether the practice would lower reliability, raise reliability, or neither.
Refer to Exhibit 9-4. Changing from a multiple-choice test to an essay test covering the same materials.
A) lower reliability
B) raise reliability
C) neither raise nor lower reliability
35
Exhibit 9-4: Following is a list of testing practices. Assuming other things are equal, indicate whether the practice would lower reliability, raise reliability, or neither.
Refer to Exhibit 9-4. Adding 10 items similar to those already in the test
A) lower reliability
B) raise reliability
C) neither raise nor lower reliability
36
A language aptitude test is administered to 50 students who are entering an honors French course. After one semester, their scores from the final examination in the French course are correlated with the scores from the language aptitude test. The resulting correlation coefficient would provide what kind of evidence of the test's score validity?
A) Content
B) Test-retest
C) Concurrent
D) Criterion-related
37
Which of the following represents the most direct evidence that a test is a valid measure of a construct?
A) The test predicts success in college.
B) The test scores correlate highly with teachers' ratings.
C) The test includes a representative sample of the skills taught in the course.
D) The test of trait Y places individuals into categories that were predicted from the theory of trait Y.
38
Estimates of the score reliability of speeded tests calculated by the split-half procedure are generally
A) too high.
B) too low.
C) accurate.
D) statistically unstable.
E) impossible to interpret.
39
Exhibit 9-4: Following is a list of testing practices. Assuming other things are equal, indicate whether the practice would lower reliability, raise reliability, or neither.
Refer to Exhibit 9-4. Removing ambiguous items from the test
A) lower reliability
B) raise reliability
C) neither raise nor lower reliability
40
A test has a reliability coefficient of .75 and a standard deviation of 10. What is the standard error of measurement of this test?
A) 5.00
B) 8.00
C) 10.00
D) 20.00
E) 32.00
41
Cronbach's coefficient alpha is preferred to the Kuder-Richardson 21 when
A) the items are all of the same level of difficulty.
B) the items are scored right or wrong (dichotomously).
C) the responses may receive a varying number of points.
D) a and c
42
High score reliability for a test indicates that
A) the test has validity.
B) random errors of measurement are not a serious problem.
C) the test is measuring what it is supposed to be measuring.
D) the proportion of error variance to true variance is very high.
43
The major disadvantage of the coefficient of stability and equivalence as a measure of reliability is
A) getting the cooperation of the subjects.
B) obtaining the two forms of the test.
C) finding a suitable criterion.
D) establishing the appropriate time interval.
E) the statistics involved in calculating the coefficient.
44
A test has a reliability coefficient of .75. This coefficient indicates that
A) 56 percent of the variance in scores is true variance.
B) 75 percent of the variance in scores is true variance.
C) 25 percent of the variance in scores is true variance.
D) 75 percent of the test items are valid.
45
Exhibit 9-6: Indicate which type of evidence is being gathered in the following questions.
Refer to Exhibit 9-6. A teacher compares the science test results with the results of a standardized test in science.
A) evidence based on content
B) evidence based on relationship to other variables
C) evidence based on construct-relevant variance
46
Exhibit 9-6: Indicate which type of evidence is being gathered in the following questions.
Refer to Exhibit 9-6. A teacher correlates students' scores on the test with the final semester grade received in the course.
A) evidence based on content
B) evidence based on relationship to other variables
C) evidence based on construct-relevant variance
47
Exhibit 9-6: Indicate which type of evidence is being gathered in the following questions.
Refer to Exhibit 9-6. A teacher tries to determine if the numerical reasoning test is really measuring numerical reasoning skills rather than ability to read the test items.
A) evidence based on content
B) evidence based on relationship to other variables
C) evidence based on construct-relevant variance
48
A researcher has constructed an attitude scale in which each item is scored from 1 to 5 depending on the extent of agreement or disagreement. The estimate of score reliability that would be recommended for this scale is
A) Kuder-Richardson formula 20.
B) Kuder-Richardson formula 21.
C) Cronbach's coefficient alpha.
D) biserial correlation coefficient.
E) phi coefficient.
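Of the coefficients listed, Cronbach's alpha is the one defined directly on item variances, so it accommodates items that earn a range of points. A small sketch of the standard formula; the function name and the tiny response matrix are made up for illustration.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Coefficient alpha for a matrix of item scores (one row per respondent).

    alpha = (k / (k - 1)) * (1 - sum(item variances) / variance of total scores)
    Population variances are used throughout; using sample variances
    consistently gives the same ratio.
    """
    k = len(responses[0])
    item_variances = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses of four people to three items scored 1 to 5
likert = [[5, 4, 5], [4, 4, 4], [2, 3, 2], [1, 2, 1]]
alpha = cronbach_alpha(likert)
```

When every item is scored 0 or 1, each item variance reduces to p(1 - p) and the same expression gives Kuder-Richardson formula 20.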
49
If the score reliability of a test decreases, the standard error of measurement will
A) decrease.
B) increase.
C) stay the same.
D) decrease only if the standard deviation decreases.
50
A personality test has been carefully prepared to measure a single trait. Which of the following coefficients would probably be lowest in size?
A) Kuder-Richardson
B) Split-half coefficient
C) Coefficient of stability
D) Coefficient of stability and equivalence
51
The coefficient of correlation between a scholastic aptitude test and GPA was found to be 1.05. This coefficient indicates that
A) the scholastic aptitude test has good reliability.
B) the scholastic aptitude test has high validity.
C) the scholastic aptitude test is useful as a predictor of school achievement.
D) there is an error in the computation.
E) None of these are true.
52
A researcher is constructing a test for use in a study. In order to ensure the highest reliability, the researcher should
A) increase the number of items on the test.
B) plan to increase item homogeneity.
C) try out the test on a heterogeneous group.
D) All of these are true.
53
Exhibit 9-6: Indicate which type of evidence is being gathered in the following questions.
Refer to Exhibit 9-6. A teacher examines an achievement test to see how well it covers the content and objectives covered in the course.
A) evidence based on content
B) evidence based on relationship to other variables
C) evidence based on construct-relevant variance
54
The scores made by a group of students on a classroom chemistry test are correlated with their scores on a standardized chemistry test. The coefficient of correlation between the two sets of scores would be called a
A) coefficient alpha.
B) validity coefficient.
C) coefficient of equivalence.
D) coefficient of stability and equivalence.
E) coefficient of internal consistency.
55
Four groups of sixth-grade students took the same test in reading. The reliability coefficient will be highest for
A) students with similar intelligence test scores and a similar background in reading.
B) students with similar intelligence test scores but varying in their background in reading.
C) students with varying intelligence test scores, all of whom have a similar background in reading.
D) students with varying intelligence test scores and a varying background in reading.
56
Reliability is to random error as validity is to
A) standard error.
B) systematic error.
C) internal error.
D) error of measurement.
E) None of these are true.
57
The greatest difficulty for one doing research on teacher effectiveness is most likely to be
A) formulating the research question.
B) obtaining subjects to participate in the study.
C) operationally defining teacher effectiveness.
D) developing statistical procedures for analyzing the data.
58
Miss Smith compares the items on the Powers Chemistry Test to the content and objectives in her chemistry course. Miss Smith is most concerned with
A) validity evidence based on relationship to other variables.
B) validity evidence based on content.
C) validity evidence based on internal structure.
D) stability reliability.
E) equivalent forms reliability.
59
Form A of a standardized reading test is administered, and then Form B of the same test is administered one month later. The coefficient of correlation between the two sets of test scores for a group of students would be known as a
A) coefficient alpha.
B) coefficient of stability.
C) coefficient of equivalence.
D) coefficient of internal consistency.
E) coefficient of stability and equivalence.
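Whatever name the coefficient carries, the computation behind it is an ordinary Pearson correlation between the two sets of scores. A minimal sketch follows; the function name and the Form A and Form B scores are hypothetical and are not taken from the deck.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for six students on two forms given a month apart
form_a = [42, 35, 50, 28, 39, 46]
form_b = [40, 33, 48, 31, 41, 44]
print(round(pearson_r(form_a, form_b), 2))
```

The function requires some variation in both score lists; if either list is constant, the denominator is zero and the correlation is undefined.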
60
A test has a standard deviation of 4 and a reliability of .91. What is the standard error of measurement?
A) 1.20
B) 1.82
C) 3.64
D) 3.82
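The standard error of measurement is conventionally computed as the standard deviation times the square root of one minus the reliability coefficient. The sketch below uses hypothetical values (a standard deviation of 10 and a reliability of .84) rather than the values in the card, and the function name is an assumption.

```python
from math import sqrt


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


# Hypothetical test: SD = 10, reliability = .84
print(round(standard_error_of_measurement(10, 0.84), 2))  # 4.0
```

With a reliability of 1.00 the expression under the square root is zero, so the function returns zero regardless of the standard deviation.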
61
Three individuals report the same information about a student's personal characteristics. Their agreement is indicative of a high degree of
A) validity.
B) interrater reliability.
C) halo effect.
D) standardization.
62
The score validity of a test is related highly to the
A) test format.
B) purpose of the test.
C) number of items.
D) availability of equivalent forms.
63
Conventional reliability coefficients computed for criterion-referenced tests are typically quite low because criterion-referenced tests
A) provide wide variation in scores.
B) cover a broad sample of content.
C) cover a limited sample of behavior.
D) yield criterion scores which tend to be variable.
E) provide little variation in scores.
64
A researcher has a highly speeded test. The recommended way to estimate the score reliability of this test is to use
A) Cronbach alpha.
B) Kuder-Richardson 21.
C) test-retest or equivalent forms.
D) the split-half procedure.
65
Which of the following sources of error variance is not accounted for in a coefficient of stability and equivalence?
A) Content sampling
B) Procedures in administering or scoring the test
C) Ability differences among the subjects
D) Time between the equivalent forms
E) Changes in the examinees' emotional state
66
Exhibit 9-8: The following data were obtained when two forms of a criterion-referenced test were given to a group of 10 students. There were 40 items on each form, and a student had to receive 75 percent correct on each form.

Refer to Exhibit 9-8. Express the score reliability of this test in terms of the simple agreement coefficient.
A) .40
B) .55
C) .75
D) .80
E) .82
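The exhibit's data table did not survive in this copy, so the card cannot be worked from the text alone. As a sketch of the technique only, the agreement coefficient is the proportion of examinees who receive the same mastery classification on both forms. The scores below are invented for illustration; only the cut score of 30 (75 percent of 40 items) comes from the exhibit text.

```python
def agreement_coefficient(scores_form1, scores_form2, cut_score):
    """Proportion of examinees classified the same way (master or nonmaster) on both forms."""
    pairs = list(zip(scores_form1, scores_form2))
    consistent = sum((a >= cut_score) == (b >= cut_score) for a, b in pairs)
    return consistent / len(pairs)


# Hypothetical scores for 10 students; NOT the data from Exhibit 9-8
form1 = [35, 28, 31, 25, 38, 30, 27, 33, 29, 36]
form2 = [33, 31, 32, 24, 37, 30, 26, 34, 27, 35]
print(agreement_coefficient(form1, form2, cut_score=30))
```

Only the classification matters here: two students with very different raw scores count as consistent as long as both fall on the same side of the cut score on each form.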

67
Exhibit 9-8: The following data were obtained when two forms of a criterion-referenced test were given to a group of 10 students. There were 40 items on each form, and a student had to receive 75 percent correct on each form.

Refer to Exhibit 9-8. What would the score reliability of the test be if we take chance agreement into consideration (Kappa)?
A) .50
B) .52
C) .58
D) .60
E) .72
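Kappa adjusts the observed agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). Because the exhibit's table is missing from this copy, the sketch below uses made-up cell counts for a two-by-two mastery table; the function and parameter names are assumptions.

```python
def cohen_kappa(both_master, both_nonmaster, master_form1_only, master_form2_only):
    """Cohen's kappa for a 2x2 master/nonmaster classification table."""
    n = both_master + both_nonmaster + master_form1_only + master_form2_only
    # Observed proportion of consistent classifications
    p_observed = (both_master + both_nonmaster) / n
    # Marginal proportions of masters on each form
    p_master_1 = (both_master + master_form1_only) / n
    p_master_2 = (both_master + master_form2_only) / n
    # Agreement expected by chance from the marginals
    p_chance = p_master_1 * p_master_2 + (1 - p_master_1) * (1 - p_master_2)
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical counts for 40 examinees; NOT the data from Exhibit 9-8
print(round(cohen_kappa(20, 15, 3, 2), 2))
```

Kappa is undefined when chance agreement equals 1 (for example, when every examinee is a master on both forms), so a production version would need to guard against division by zero.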

68
Which of the following would contribute the best evidence for the score validity of a new group intelligence test?
A) The correlation between Form A and Form B of the test
B) The correlation between test scores and grades in reading
C) The correlation between test scores and scores from the Stanford-Binet intelligence test
D) An examination of the homogeneity of scores on the test
69
Ordinary correlational estimates of the score reliability of criterion-referenced tests may be quite low because
A) it is difficult to find suitable criteria.
B) the scores have little variability.
C) the tests are typically speeded.
D) the content is too heterogeneous.
70
In validating an inventory measure of need to achieve, a researcher correlated the scores from the inventory with a projective technique known to measure need to achieve. This is an example of
A) content evidence.
B) external evidence.
C) convergent evidence.
D) divergent evidence.
71
A split-half reliability coefficient of .60 means
A) 60 percent of the students can be expected to vary from their true score.
B) on retesting, 60 percent of the students can be expected to vary about 68 percent from their true score.
C) the measurements are 60 percent reliable.
D) 40 percent of the measurements are reliable.
E) None of these are true.
72
Which of the following is not an internal consistency measure of reliability?
A) Split-half
B) Coefficient alpha
C) Kuder-Richardson
D) Coefficient of stability
73
Miss Jones has constructed a mastery test to give after a unit on fractions. All students must have at least 85 percent of the items correct. Which of the following reliability coefficients would you recommend as being most appropriate for Miss Jones to use?
A) Test-retest
B) Kuder-Richardson 20
C) Agreement coefficient
D) Coefficient alpha
74
Exhibit 9-7: The following results were obtained when two forms of a criterion-referenced test were administered to a sample of 50 students:
Form 1
Nonmaster Master

Refer to Exhibit 9-7. Calculate the index that shows the extent of agreement beyond that expected by chance (phi).
A) .61
B) .64
C) .68
D) .76
E) .78
75
If the reliability coefficient of a test was found to be 1.00, the standard error of measurement would be
A) zero.
B) .50.
C) 1.00.
D) equal to the standard deviation of the test.
76
One way to describe the score reliability of a criterion-referenced test is in terms of the consistency with which individuals
A) obtain the same scores on a retest.
B) maintain their same relative position in the group on a retest.
C) obtain the same scores on the odd and even halves of the test.
D) achieve mastery on two different forms of the test.
77
Test validity is to ____ error as reliability is to ____ error.
A) systematic, random
B) random, systematic
78
Exhibit 9-7: The following results were obtained when two forms of a criterion-referenced test were administered to a sample of 50 students:
Form 1
Nonmaster Master

Refer to Exhibit 9-7. Calculate the agreement coefficient for this test.
A) .76
B) .78
C) .80
D) .86
E) .90
79
Which of the following score reliability procedures requires that the items all have the same level of difficulty?
A) Kuder-Richardson 20
B) Kuder-Richardson 21
C) Split-half
D) Cronbach alpha
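The two Kuder-Richardson formulas differ in the information they use. KR-20 works from each item's proportion correct, while KR-21 needs only the number of items, the mean, and the variance of total scores. The sketch below computes both from a small, invented matrix of right/wrong (1/0) responses; it uses the population variance throughout, which is one common convention and an assumption here.

```python
def _total_score_stats(item_matrix):
    """Return (number of items, mean, population variance) of total scores."""
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean = sum(totals) / len(totals)
    variance = sum((t - mean) ** 2 for t in totals) / len(totals)
    return k, mean, variance


def kr20(item_matrix):
    """KR-20: uses the proportion correct (p) for every item."""
    k, _, variance = _total_score_stats(item_matrix)
    n = len(item_matrix)
    p = [sum(row[j] for row in item_matrix) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    return (k / (k - 1)) * (1 - sum_pq / variance)


def kr21(item_matrix):
    """KR-21: uses only k, the mean, and the variance of total scores."""
    k, mean, variance = _total_score_stats(item_matrix)
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))


# Hypothetical responses: 5 examinees (rows) by 4 items (columns)
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 2), round(kr21(responses), 2))
```

Both functions require at least two items and some variation in total scores. When item difficulties differ, as in this invented data, the two formulas give different values, and KR-21 is never the larger of the two.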
80
A test has high score reliability if
A) it measures what it is supposed to measure.
B) it has a high correlation with an appropriate criterion.
C) individuals maintain their relative ranks when the test is administered a second time.
D) it is a representative sample of the content domain.