Deck 5: Reliability
1
According to Chin, it is common for research findings to be
A) rejected for not having met the general acceptance standard.
B) accepted as having met the general acceptance standard.
C) accepted by a judge but then rejected by an expert witness.
D) accepted by an expert witness but then rejected by a judge.
B
2
Unreliable findings that reach general acceptance in the academic community
A) tend to self-correct.
B) tend to linger too long.
C) become exposed through social media.
D) are not admissible in a court of law.
B
3
According to Gil et al. (2016), which of the following is a source of error in scores on psychological tests?
A) whether or not the examiner has a beard
B) whether the testtaker's country is at war or peace
C) the body weight of the testtaker two weeks prior to the test
D) None of these
B
4
Makel et al. (2012) observed that only about ____ of the published literature replicated previous work.
A) 1%
B) 3%
C) 5%
D) 7%
5
Generally, diagnostic reliability is necessary. However, which of the following is NOT a reason that diagnostic reliability is necessary?
A) It is necessary for accurate diagnosis.
B) It is necessary for any double-blind study.
C) It is necessary to determine the effectiveness of treatments.
D) It is necessary to track changes in a disorder over time.
6
Hawkins et al. (2016) found that subjects with ____ fasting glucose levels made nearly ____ times as many errors as subjects with fasting glucose levels in the normal range.
A) low; one-quarter
B) high; one-quarter
C) high; four times
D) high; twice
7
In the test-retest method to estimate reliability
A) the time frame between interviews must be relatively short.
B) separate interviews are conducted by certified raters.
C) a minimum of two re-tests are required.
D) All of these
8
What has been called a "replicability crisis" in psychology emerged as a result of a number of factors. Which is NOT one of those factors?
A) a general lack of published attempts to replicate research
B) editorial preferences for papers with positive findings
C) questionable research practices on the part of study authors
D) unwillingness or inability of original study authors to share data
9
Berman et al. (2015) observed that one source of error in evaluating the suicide risk of patients is
A) whether or not the patient has previously attempted suicide.
B) whether or not the clinician previously had a patient attempt suicide.
C) how religious the evaluating clinician is.
D) how religious the evaluated patient is.
10
In 2015, a group of researchers attempted to replicate 100 peer-reviewed, published psychological studies. This group of researchers was called the
A) Society for the Replication of Psychological Studies.
B) Scientists for the Abolition of Irreproducible Results.
C) Open Science Collaboration.
D) Coalition for Responsible Science.
11
Prior to research on inter-rater reliability for DSM-5, DSM inter-rater reliability estimates were obtained using the ____ method.
A) test-retest
B) paired-paragraph
C) audio-recording
D) one-way mirror
12
Which is an example of what is referred to as "QRP" in your textbook?
A) collecting additional data to reach statistical significance
B) over-reporting of data with excessive detail
C) telling subjects in a control group that they need not participate
D) requesting detailed data from the original study author
13
With critical variables in a research study held constant, different methods used to estimate reliability will typically yield
A) virtually no differences in the magnitude of the estimate.
B) sizable differences in the magnitude of the estimate.
C) skewed estimates of reliability.
D) identical estimates of reliability.
14
According to Chin, as cited in your textbook, a lack of replicability in psychology affects the work of
A) the police.
B) judges.
C) court clerks.
D) All of these
15
Field trials of DSM-5 demonstrated a mean kappa that was indicative of a ______ level of agreement among raters.
A) poor
B) fair
C) good
D) "kinder and gentler"
16
In 2015, a report on attempted replications published in Science noted that, depending on the criteria used, ___ of the replications found the same results as the original study.
A) 0%
B) 20 to 40%
C) 40 to 60%
D) 100%
17
Replication of research by independent parties provides for
A) confidence in study findings.
B) confirmation that the study findings were not an anomaly.
C) confidence that the study findings were not the result of the original experimenter's biases.
D) All of these
18
Prior to DSM-5, a problem with the primary method used to estimate reliability of the DSM was that the method
A) did not allow for truly independent judgments.
B) resulted in overestimates of reliability.
C) artificially constrained information provided to clinicians.
D) All of these
19
Which of the following is the best remedy for QRPs?
A) pre-registration
B) registration
C) post-registration
D) self-correction
20
As compared to what was business as usual in the past, more researchers are coming to the realization that replication is
A) really not as necessary as what researchers once thought.
B) not something that can ever completely "right" past wrongs.
C) mandatory given the influence of social media.
D) needed if published findings are to be relied on.
21
Which is TRUE of measurement error?
A) Like error in general, measurement error may be random or systematic.
B) Unlike error in general, measurement error may be random or systematic.
C) Measurement error is always random.
D) Measurement error is always systematic.
22
In an illustrative scenario described in Chapter 5 of your text, a group of 12th grade "whiz kids" in math, newly arrived in the United States from China, perform poorly on a test of 12th grade math. According to the text, what probably accounted for this?
A) lower standards in China as compared to the US for measuring math ability.
B) higher standards in the US as compared to China for earning high grades.
C) the ability of the Chinese students to read what was required in English.
D) the reliability of the instrument used to test 12th grade math skills.
23
In classical test theory, an observed score on an ability test is presumed to represent the testtaker's
A) true score.
B) true score less the variance.
C) true score combined with extraneous factors.
D) true score and error.
24
The standard error of measurement is
A) used to infer how far an observed score is from the true score.
B) also known as the standard error of a score.
C) used in the context of classical test theory.
D) All of these
25
Item response theory is to latent trait theory as observer reliability is to
A) generalizability theory.
B) domain sampling theory.
C) odd-even reliability.
D) inter-scorer reliability.
26
Error in the reporting of spousal abuse may result from
A) one partner simply forgetting all of the details of the abuse.
B) one partner misunderstanding the instructions for reporting.
C) one partner being ashamed to report the abuse.
D) All of these
27
A research study entails behavioral observation and rating of front desk clerks in the hospitality industry to determine whether or not they greet guests with a smile. Which type of error is this test most susceptible to?
A) test administration error
B) test construction error
C) examiner-related error
D) polling error
28
This variety of error has also been referred to as "noise." It is
A) systematic error.
B) random error.
C) measurement error.
D) background error.
29
One of the problems associated with classical test theory has to do with
A) the notion that there is a "true score" on a test has great intuitive appeal.
B) the fact that CTT assumptions are often characterized as "weak."
C) its assumptions concerning the equivalence of all items on a test.
D) its assumptions allow for its application in most situations.
30
A Wall Street securities firm that is actually located on Wall Street is testing a group of candidates for their aptitude in finance and business. As the testing begins, an unexpected "Occupy Wall Street" sit-in takes place. From a psychometric perspective in the context of this testing, the sit-in is viewed as
A) systematic error.
B) random error.
C) test administration error.
D) background error.
31
In their study of the diagnostic reliability of DSM-IV diagnoses, Chmielewski et al. (2015) used the "gold standard" in diagnostic instruments. The tool they used was the
A) MAST-2.
B) SCID I/P.
C) SCI-5.
D) Semi-Structured Diagnostic Interview (SSDI).
32
Stanley (1971) wrote that in classical test theory, a so-called "true score" is "not the ultimate fact in the book of the recording angel." By this, Stanley meant that
A) it would be imprudent to trust in Divine influence when estimating variance.
B) the amount of test variance that is true relative to error may never be known.
C) it is near impossible to separate fact from fiction with regard to "true scores."
D) All of these
33
The more homogeneous a test is, the
A) less inter-item consistency it can be expected to have.
B) more utility the test has for measuring multifaceted variables.
C) more inter-item consistency it can be expected to have.
D) None of these
34
Which would NOT be useful in estimating a test's inter-item consistency?
A) Cronbach's alpha
B) the Kuder-Richardson formulas
C) the average proportional distance
D) a coefficient of equivalence
35
A confidence interval is a range or band of test scores that
A) has proven test-retest reliability.
B) is calculated using the standard error of the difference.
C) is likely to contain the true score.
D) None of these
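Study note (not part of the original card): in classical test theory, a confidence interval is usually built around an observed score using the standard error of measurement, commonly computed as SD * sqrt(1 - reliability). The sketch below is a minimal illustration under assumed values (a hypothetical test with SD = 15 and reliability = .91); the function names are illustrative and not taken from the textbook.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r), the usual classical test theory formula."""
    return sd * math.sqrt(1.0 - reliability)


def confidence_interval(observed: float, sd: float, reliability: float,
                        z: float = 1.96) -> tuple[float, float]:
    """Band around the observed score; z = 1.96 gives roughly 95% confidence."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    low, high = confidence_interval(observed=100, sd=15, reliability=0.91)
    print(f"SEM = {standard_error_of_measurement(15, 0.91):.2f}")
    print(f"95% band: {low:.2f} to {high:.2f}")
```

With these assumed values the SEM is about 4.5, so the band runs from roughly 91.2 to 108.8. A more reliable test (larger r) gives a smaller SEM and a narrower band.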
36
The multiple-choice test items on this examination (yes, the one that you're taking right at this moment) are all examples of
A) dichotomous test items.
B) latent trait test items.
C) polytomous test items.
D) None of these
37
Which of the following is NOT an alternative to classical test theory cited in your text?
A) generalizability theory
B) representational theory
C) domain sampling theory
D) latent trait theory
38
Cronbach's alpha is to similarity of scores on test items as average proportional distance is to
A) difference in scores on test items.
B) inter-item consistency.
C) test-retest reliability.
D) parallel forms reliability.
39
The term test heterogeneity BEST refers to the extent to which test items measure
A) different factors.
B) the same factor.
C) a unifactorial trait.
D) a nonhomogeneous trait.
40
Which of the following terms is used in your textbook to describe the test-retest method of estimating diagnostic reliability?
A) methodologically sound
B) artificially constrained
C) psychometrically balanced
D) ecologically valid
41
A reliability coefficient is
A) an index.
B) a proportion of the total variance attributed to true variance.
C) unaffected by a systematic source of error.
D) All of these
42
Which of the following is true of systematic error?
A) It significantly lowers the reliability of a measure.
B) It insignificantly lowers the reliability of a measure.
C) It increases the reliability of a measure.
D) It has no effect on the reliability of a measure.
43
Test-retest estimates of reliability are referred to as measures of ________, and split-half reliability estimates are referred to as measures of ________.
A) true scores; error scores
B) internal consistency; stability
C) inter-scorer reliability; consistency
D) stability; internal consistency
44
Which of the following might lead to a decrease in test-retest reliability?
A) the passage of time between the two administrations of the test.
B) coaching designed to increase test scores between the two administrations of the test.
C) practice with similar test materials between the two administrations of the test.
D) All of these
45
What term refers to the degree of correlation between all the items on a scale?
A) inter-item homogeneity
B) inter-item consistency
C) inter-item heterogeneity
D) parallel-form reliability
46
Reliability, in a broad statistical sense, is synonymous with
A) consistently good.
B) consistently bad.
C) consistency.
D) validity.
47
Which of the following factors may influence a split-half reliability estimate?
A) fatigue
B) anxiety
C) item difficulty
D) All of these
48
As the degree of reliability increases, the proportion of
A) total variance attributed to true variance decreases.
B) total variance attributed to true variance increases.
C) total variance attributed to error variance increases.
D) None of these
49
Why might ability test scores among testtakers most typically vary?
A) because of the true ability of the testtaker
B) because of irrelevant, unwanted influences
C) Both because of the true ability of the testtaker and because of irrelevant, unwanted influences
D) None of these
50
Which source of error variance affects parallel- or alternate-form reliability estimates but does not affect test-retest estimates?
A) fatigue
B) learning
C) practice
D) item sampling
51
Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that measures a trait that is presumed to be relatively stable over time?
A) parallel-forms
B) alternate-forms
C) test-retest
D) split-half
52
Internal-consistency estimates of reliability are inappropriate for
A) reading achievement tests.
B) scholastic aptitude/intelligence tests.
C) word processing tests based on speed.
D) tests purporting to measure a single personality trait.
53
Computer-scorable items have tended to eliminate error variance due to
A) item sampling.
B) scorer differences.
C) content sampling.
D) testtakers' reactions to environmental variables.
54
An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval between the test and retest is more than
A) 30 days.
B) 60 days.
C) 3 months.
D) 6 months.
55
Which of the following types of reliability estimates is the most expensive due to the costs involved in test development?
A) test-retest
B) parallel-form
C) internal-consistency
D) Spearman's rho
56
Which type of reliability estimate is obtained by correlating pairs of scores from the same person (or people) on two different administrations of the same test?
A) a parallel-forms estimate
B) a split-half estimate
C) a test-retest estimate
D) an au-paire estimate
57
A source of error variance may take the form of
A) item sampling.
B) testtakers' reactions to environment-related variables such as room temperature and lighting.
C) testtaker variables such as amount of sleep the night before a test, amount of anxiety, or drug effects.
D) All of these
58
Which of the following is TRUE for estimates of alternate- and parallel-forms reliability?
A) Two test administrations with the same group are required.
B) Test scores may be affected by factors such as motivation, fatigue, or intervening events like practice, learning, or therapy.
C) Item sampling is a source of error variance.
D) All of these
59
Which of the following is usually minimized when using split-half estimates of reliability as compared with test-retest or parallel/alternate-form estimates of reliability?
A) time and expense
B) reliability and validity
C) reliability only
D) time spent in scoring and interpretation
60
Which of the following is TRUE for parallel forms of a test?
A) The means of the observed scores are equal for the two forms.
B) The variances of the estimated scores are equal for the two forms.
C) The means and variances of the observed scores are equal for the two forms.
D) The means and variances of the estimated scores are equal for the two forms.
61
Error variance for measures of inter-item consistency comes from
A) fatigue.
B) motivation.
C) a testtaker practice effect.
D) heterogeneity of the content.
62
Which of the following statements is TRUE about coefficient alpha?
A) Kuder thought it to be the single best measure of reliability.
B) It was first conceived by Alfalfa Alpha.
C) It is a characteristic of a particular set of scores, not of the test itself.
D) None of these
63
The "20" and "21" in KR-20 and KR-21 represent
A) numbers held constant in the denominator.
B) numbers held constant in the numerator.
C) the order in which the formulas were created.
D) the age of Fred Kuder's sons when the formulas were developed.
64
For a heterogeneous test, measures of internal-consistency reliability will tend to be ________ compared with other methods of estimating reliability.
A) higher
B) lower
C) very similar or higher
D) more robust
65
The Spearman-Brown formula is used for
A) correcting for one half of the test by estimating the reliability of the whole test.
B) determining how many additional items are needed to increase reliability up to a certain level.
C) determining how many items can be eliminated without reducing reliability below a predetermined level.
D) All of these
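For reference, the Spearman-Brown formula projects the reliability of a test whose length changes by a factor n: r_SB = n * r / (1 + (n - 1) * r). The sketch below is illustrative only; the function names are assumptions and are not part of the deck.

```python
def spearman_brown(r, n):
    """Projected reliability when test length is multiplied by n.

    r: reliability of the current test (or of one half, with n = 2).
    n: ratio of new length to current length (n < 1 means shortening).
    """
    return n * r / (1 + (n - 1) * r)


def length_factor(r_current, r_target):
    """Factor by which the test length must change to reach r_target.

    Obtained by solving the Spearman-Brown formula for n.
    """
    return r_target * (1 - r_current) / (r_current * (1 - r_target))
```

With a half-test correlation of .70, spearman_brown(0.70, 2) gives roughly .82 for the whole test. Going the other way, length_factor(0.60, 0.90) comes out at about 6, meaning the test would need to be about six times as long.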
66
Many assumptions must be met when using KR-21 to estimate reliability. Which is NOT such an assumption?
A) Items should be dichotomous.
B) Items should be of equal difficulty.
C) Items should be homogeneous.
D) Items should be scorable by computer.
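As background for this card, KR-20 uses each item's proportion correct (p) and incorrect (q), while KR-21 replaces the item-level term with one based on the mean total score, which amounts to treating all items as equally difficult. The sketch below assumes dichotomous (0/1) items and population variances; the function names and data layout are assumptions made for illustration.

```python
from statistics import mean, pvariance


def kr20(scores):
    """KR-20 for 0/1 item scores; scores is one row of items per testtaker."""
    k = len(scores[0])
    # Proportion passing each item.
    p = [mean(row[i] for row in scores) for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_pq / total_var)


def kr21(scores):
    """KR-21: needs only the number of items, the mean, and the variance."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    m = mean(totals)
    total_var = pvariance(totals)
    return k / (k - 1) * (1 - m * (k - m) / (k * total_var))
```

When item difficulties differ, KR-21 will come out no higher than KR-20 on the same data; the two agree when every item has the same proportion correct.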
67
A synonym for inter-scorer reliability is
A) inter-judge reliability
B) observer reliability
C) inter-rater reliability
D) All of these
68
Coefficient alpha is an expression of
A) the mean of split-half correlations between odd- and even-numbered items.
B) the mean of split-half correlations between first- and second-half items.
C) the mean of all possible split-half correlations.
D) the mean of the best or "alpha" level split-half correlations.
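For reference, coefficient alpha is usually computed from item and total-score variances rather than by enumerating splits: alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). A minimal sketch, assuming a list-of-rows data layout (one row of item scores per testtaker); the function name is hypothetical.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Coefficient alpha from a table of item scores.

    scores: list of rows, one per testtaker, each a list of k item scores.
    Items may be dichotomous or carry partial credit.
    """
    k = len(scores[0])
    # Variance of each item across testtakers.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the total (summed) scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Because the result depends on the variances in the particular sample, the same test can yield different alpha values with different groups of testtakers.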
69
Which of the following is NOT an acceptable way to divide a test when using the split-half reliability method?
A) Randomly assign items to each half of the test.
B) Assign odd-numbered items to one half and even-numbered items to the other half of the test.
C) Assign the first half of the items to one half of the test and the second half of the items to the other half of the test.
D) Assign easy items to one half of the test and difficult items to the other half of the test.
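To make the split-half procedure concrete, the sketch below uses an odd-even split: it sums odd- and even-numbered items separately, correlates the two half scores, and then applies the Spearman-Brown correction for the full-length test. The helper names and the data layout are assumptions made for illustration.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def odd_even_split_half(scores):
    """Split-half reliability using an odd-even split of the items."""
    odd = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    # Spearman-Brown correction from half length to full length.
    return 2 * r_half / (1 + r_half)
```

Other ways of forming the halves only change how the odd and even lists are built; the correlation and correction steps stay the same.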
70
When more than two scorers are used to determine inter-scorer reliability, the statistic of choice is
A) Pearson r.
B) Spearman's rho.
C) KR-20.
D) coefficient alpha.
71
The KR-21 reliability estimate was developed
A) to yield greater consistency in reliability coefficients.
B) to facilitate computation by hand.
C) for use with less homogeneous items.
D) because Kuder wanted to "one-up" Richardson's 20.
72
If items on a test are measuring very different traits, estimates of reliability yielded from split-half methods will typically be ________ as compared with estimates from KR-20.
A) higher
B) lower
C) similar
D) approximately the same
73
Typically, adding items to a test will have what effect on the test's reliability?
A) Reliability will decrease.
B) Reliability will increase.
C) Reliability will stay the same.
D) Reliability will first increase and then decrease.
74
KR-20 is the statistic of choice for tests with which types of items?
A) multiple-choice
B) true-false
C) All of these
D) None of these
75
If items from a test are measuring the same trait, estimates of reliability yielded from split-half methods will typically be ________ as compared with estimates from KR-20.
A) higher
B) lower
C) similar
D) approximately the same
76
Which BEST conveys the meaning of an inter-scorer reliability estimate of .90?
A) Ninety percent of the scores obtained are reliable.
B) Ninety percent of the variance in the scores assigned by the scorers was attributed to true differences and 10% to error.
C) Ten percent of the variance in the scores assigned by the scorers was attributed to true differences and 90% to error.
D) Ten percent of the test's items are in need of revision according to the majority of the test's users.
77
For determining the reliability of tests scored using nominal scales of measurement, the statistic of choice is
A) Kendall's Tau.
B) the Kappa statistic.
C) KR-20.
D) coefficient alpha.
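As an illustration of agreement on nominal categories, the sketch below computes Cohen's kappa for two raters: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each rater's category frequencies. The function name and the label lists are assumptions; this two-rater version is only one form of the kappa statistic.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal categories."""
    n = len(rater_a)
    # Observed agreement: proportion of cases given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category counts.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)
```

For example, with ratings ["y", "y", "n", "n"] and ["y", "n", "n", "n"], the raters agree on 3 of 4 cases (p_o = .75), chance agreement is .50, and kappa is .50.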
78
Coefficient alpha is appropriate to use with all of the following test formats EXCEPT
A) multiple-choice.
B) true-false.
C) short-answer for which partial credit is awarded.
D) essay exam with no partial credit awarded.
79
Which of the following is, generally speaking, the preferred statistic for obtaining a measure of internal-consistency reliability?
A) KR-20
B) KR-21
C) Kendall's Tau
D) coefficient alpha
80
A coefficient alpha over .9 may indicate that
A) the items in the test are too dissimilar.
B) the test is not reliable.
C) the items in the test are redundant.
D) the test is biased against low-ability individuals.