Deck 7: Reliability of Selection Measures

Question
A split-half reliability overestimates actual reliability. Therefore, a special formula, the Spearman-Brown prophecy formula, is used to make the correction.
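For reference, the Spearman-Brown prophecy formula named in this card can be sketched as follows. This is a minimal illustration, assuming the standard general form r_new = n·r / (1 + (n − 1)·r), where n is the factor by which the measure is lengthened (n = 2 when stepping up from a half to the full measure). The function name `spearman_brown` and the half-test correlation of 0.70 are hypothetical and do not come from the deck.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Project the reliability of a measure lengthened by `length_factor`.

    r_half        -- correlation between the two halves (hypothetical input)
    length_factor -- how many times longer the full measure is (2 for split-half)
    """
    return (length_factor * r_half) / (1.0 + (length_factor - 1.0) * r_half)


# Hypothetical example: the two halves of a measure correlate at 0.70.
r_half = 0.70
print(round(spearman_brown(r_half), 3))  # 0.824
```

In this sketch the stepped-up value (0.824) is larger than the half-test correlation (0.70), which shows the direction in which the formula adjusts a split-half estimate.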
Question
If all respondents on a selection measure remember their previous answers to an initial administration of a measure and then on the retest respond according to their memory, the reliability coefficient will decrease.
Question
With a long time interval between administrations of a measure, test-retest reliability may underestimate reliability.
Question
When a measure is perfectly reliable, its obtained score is higher than its true score.
Question
With increasing time intervals, test-retest reliability coefficients will generally decrease.
Question
The higher the value of a reliability coefficient, the less the measurement error.
Question
The higher the test-retest reliability coefficient, the greater the true score and the less the error.
Question
Reliability coefficients computed between parallel forms tend to be conservative estimates.
Question
A split-half reliability estimate is NOT a pure measure of internal consistency.
Question
Because of the way it is calculated, a higher reliability coefficient is desirable.
Question
To control the effects of memory on test-retest reliability estimates, the same measure should be used the second time.
Question
Surprisingly, increasing the length of time between administrations does not reduce the impact of memory effects on reliability.
Question
In general, the amount of measurement error has little effect on how high the reliability of a measure will be.
Question
Reliability is generally determined by examining the relationship between two sets of measures measuring the same thing.
Question
Selection measures that are designed to assess job-related characteristics are more precise than measures of physical characteristics.
Question
Error score represents errors of measurement.
Question
Selection measures involving traits of personality, attitudes, or interests are usually considered to be fairly static, yielding high reliability coefficients.
Question
To achieve a parallel forms reliability estimate, at least two equal versions of a measure must exist.
Question
Reliability of measurement in selection is synonymous with dependability, consistency, or stability of measurement.
Question
A selection measure is internally consistent or homogeneous when individuals' responses on one part of the measure are unrelated to their responses on other parts.
Question
Tests with many items that are very difficult are more reliable than tests containing many items of moderate difficulty.
Question
If variability or individual differences increase among respondents while variation within individuals remains the same, reliability will increase.
Question
If coefficient alpha reliability is unacceptably low, then the items on the selection measure may be assessing more than one characteristic.
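As a point of reference for this card, coefficient alpha can be computed directly from item-level scores. The sketch below assumes the usual formula, alpha = k/(k − 1) · (1 − Σ item variances / variance of total scores), with k items and sample variances throughout. The function name `cronbach_alpha` and the response data are hypothetical and are not taken from the deck.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a respondents-by-items table of scores.

    scores -- list of rows, one per respondent, each a list of item scores.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose rows into per-item columns and sum their sample variances.
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)


# Hypothetical data: 5 respondents answering 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(responses), 3))
```

When items track one another closely, as in this made-up data, the summed item variances are small relative to the total-score variance and alpha is high; items that tap different characteristics covary less, which pulls alpha down.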
Question
Reliability is a group-based statistic.
Question
Interrater agreement indices are generally restricted to interval or ratio data.
Question
Kuder-Richardson reliability procedures are rarely used.
Question
The standard error of measurement is affected by variability within the group of respondents to whom a measure has been administered.
Question
In the context of personnel selection, the reliability of criterion measures need not be as high as predictor measures.
Question
As the number of response options or categories on a measure increases, reliability also increases.
Question
Reliability is a necessary but not sufficient condition for validity.
Question
Selection measures are not simply "reliable" or "not reliable"; there are degrees of reliability.
Question
Unreliable performance by a respondent on a reliable measure is possible, but reliable performance on an unreliable measure is impossible.
Question
In general, as the length of a measure decreases, its reliability increases.
Question
A good rule of thumb is that reliability must be .90 or higher.
Question
Split-half reliability procedures tend to produce a conservative estimate of reliability.
Question
Interrater reliability estimates test the hypothesis that ratings are determined by characteristics of the rater rather than by what is being rated.
Question
Kuder-Richardson reliability estimates are usually lower than those obtained from split-half estimates.
Question
If our standard error is 3.16 and the difference between two applicants' scores is 3, then it is possible that the difference in scores is due to chance.
Question
Although interrater agreement indices have their limitations, they are still widely used in selection research.
Question
The standard error of measurement is another approach for estimating reliability.
Question
With a long time interval between administrations of a measure (test-retest), what could cause scores to change resulting in an underestimate of the reliability?

A)reasoning
B)thinking
C)memory
D)learning
Question
What is the difference between interclass and intraclass correlations (reliability estimates)?

A)minimum number of targets being rated
B)minimum number of equivalent forms being used
C)minimum number of raters needed for calculation
D)minimum number of attributes measured
Question
For a test with a time limit (i.e., a speed test), which reliability estimation procedure is not appropriate?

A)test-retest
B)split-half
C)parallel or equivalent forms (immediate administration)
D)parallel or equivalent forms (long-term administration)
Question
Calculation of reliability estimates results in a coefficient ranging from _____ to _____.

A)0, 1.96
B)0.00; 1.00
C)-1.00; 1.00
D)-1.00; 0.00
Question
Among the most popular internal consistency methods are all of these EXCEPT:

A)Cronbach's coefficient alpha reliability
B)Kuder-Richardson reliability
C)Split-half reliability
D)Guion's measurement
Question
What reliability estimate consists of administering the same selection measure twice and correlating the two sets of scores?

A)parallel forms
B)internal consistency
C)split-half
D)test-retest
Question
Which of the following is not a method for estimating internal consistency reliability?

A)parallel or equivalent forms reliability
B)Kuder-Richardson reliability
C)Cronbach's coefficient alpha reliability
D)split-half reliability
Question
An obtained score consists of which two components?

A)controllable and uncontrollable
B)systematic and unsystematic
C)true and error
D)true and predictive
Question
Which of the following is NOT one of the categories of statistical procedures for estimating interrater reliability?

A)interclass correlation
B)intraclass correlation
C)interrater agreement
D)underclass correlation
Question
Test-retest reliability estimation is most appropriate for which of the following?

A)mental ability
B)attitudes
C)self-esteem
D)self-concept
Question
In order to calculate an interclass correlation, how many raters are necessary?

A)2
B)2 or more
C)more than 2
D)more than 3
Question
How many test administrations do you need in order to calculate a split-half reliability estimate?

A)1/2
B)1
C)2
D)1/4
Question
For which of the following selection measures is it most appropriate to use equivalent forms for reliability estimation?

A)vocabulary
B)biographical inventory
C)personality inventory
D)physical fitness
Question
Generally speaking, the greater the variability or standard deviation of scores on the characteristic measured, the higher the reliability of the measure of that characteristic.
Question
What is a correlation coefficient calculated between two sets of scores over time called?

A)coefficient of stability
B)coefficient alpha
C)coefficient of equivalence
D)coefficient of dependability
Question
What impact does memory have on a test-retest reliability estimate?

A)It is not possible to determine its effect on reliability.
B)It has no effect on test-retest reliability.
C)It will underestimate the true reliability of obtained scores.
D)It will overestimate the true reliability of obtained scores.
Question
Which of the following would NOT be a likely cause of interrater disagreement?

A)Raters view the same behavior differently.
B)Raters interpret the same behavior differently.
C)Error in rating or recording each impression.
D)Length of time behavior is displayed.
Question
As the coefficient approaches 1.00, the set of measures is viewed as:

A)equivalent.
B)identical.
C)very different.
D)unrelated.
Question
In order to calculate an intraclass correlation, how many raters are necessary?

A)2
B)2 or more
C)more than 2
D)any number will do
Question
What is a true score?

A)the score obtained for a person under normal conditions
B)the score obtained because of the presence of external factors
C)the mean/average score made by a person on many different administrations of tests
D)the standard deviation on many different administrations of the same test on the same individual
Question
If rxx = .85, and the standard deviation of x is 10, then the standard error of measurement for measure x is

A)3.873
B)3.16
C)3.50
D)3.30
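The arithmetic behind this card can be checked with a short sketch. It assumes the standard definition of the standard error of measurement, SEM = SD · sqrt(1 − rxx); the function name `standard_error_of_measurement` is hypothetical, and the inputs are the values given in the card.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r_xx), assuming 0 <= r_xx <= 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Values from the card: r_xx = .85, standard deviation of x = 10.
sem = standard_error_of_measurement(sd=10.0, reliability=0.85)
print(round(sem, 3))  # 3.873
```

Under the same formula, a reliability of .90 with the same standard deviation gives roughly 3.16, the standard error quoted in the earlier card about two applicants' scores.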
Question
If we have a test called "x" and rxx = .80, this means

A)80% of the differences in test scores are due to error and only 20% are due to true variance.
B)20% of the test scores were used to obtain the reliability estimate.
C)20% of the differences in test scores are due to error and 80% are due to true variance.
D)the test average is in the low 'B' range.
Question
The difference between two individuals' scores should not be considered significant unless the difference is at least ___________ the standard error of measurement of the measure.

A)equal to
B)twice
C)three times
D)four times
Question
Research has shown that the reliability of rating scales can be improved by offering from _____ to _____ rating categories:

A)1, 4
B)1, 5
C)3, 7
D)5, 9
1
A split-half reliability overestimates actual reliability. Therefore, a special formula, the Spearman-Brown prophecy formula, is used to make the correction.
False
2
If all respondents on a selection measure remember their previous answers to an initial administration of a measure and then on the retest respond according to their memory, the reliability coefficient will decrease.
False
3
With a long time interval between administrations of a measure, test-retest reliability may underestimate reliability.
True