Deck 5: Reliability

Question
According to Chin, it is common for research findings to be

A) rejected for not having met the general acceptance standard.
B) accepted as having met the general acceptance standard.
C) accepted by a judge but then rejected by an expert witness.
D) accepted by an expert witness but then rejected by a judge.
Question
Unreliable findings that reach general acceptance in the academic community

A) tend to self-correct.
B) tend to linger too long.
C) become exposed through social media.
D) are not admissible in a court of law.
Question
According to Gil et al. (2016), which of the following is a source of error in scores on psychological tests?

A) whether or not the examiner has a beard
B) whether the testtaker's country is at war or peace
C) the body weight of the testtaker two weeks prior to the test
D) None of these
Question
Makel et al. (2012) observed that only about ____ of the published literature replicated previous work.

A) 1%
B) 3%
C) 5%
D) 7%
Question
Generally, diagnostic reliability is necessary. However, which of the following is NOT a reason that diagnostic reliability is necessary?

A) It is necessary for accurate diagnosis.
B) It is necessary for any double-blind study.
C) It is necessary to determine the effectiveness of treatments.
D) It is necessary to track changes in a disorder over time.
Question
Hawkins et al. (2016) found that subjects with ____ fasting glucose levels made nearly ____ times as many errors as subjects with fasting glucose levels in the normal range.

A) low; one-quarter
B) high; one-quarter
C) high; four times
D) high; twice
Question
In the test-retest method to estimate reliability

A) the time frame between interviews must be relatively short.
B) separate interviews are conducted by certified raters.
C) a minimum of two re-tests are required.
D) All of these
Question
What has been called a "replicability crisis" in psychology emerged as a result of a number of factors. Which is NOT one of those factors?

A) a general lack of published attempts to replicate research
B) editorial preferences for papers with positive findings
C) questionable research practices on the part of study authors
D) unwillingness or inability of original study authors to share data
Question
Berman et al. (2015) observed that one source of error in evaluations of the suicide risk of patients is

A) whether or not the patient has previously attempted suicide.
B) whether or not the clinician previously had a patient attempt suicide.
C) how religious the evaluating clinician is.
D) how religious the evaluated patient is.
Question
In 2015, a group of researchers attempted to replicate 100 peer-reviewed, published psychological studies. This group of researchers was called the

A) Society for the Replication of Psychological Studies.
B) Scientists for the Abolition of Irreproducible Results.
C) Open Science Collaboration.
D) Coalition for Responsible Science.
Question
Prior to research on inter-rater reliability for DSM-5, DSM inter-rater reliability estimates were obtained using the ____ method.

A) test-retest
B) paired-paragraph
C) audio-recording
D) one-way mirror
Question
Which is an example of what is referred to as "QRP" in your textbook?

A) collecting additional data to reach statistical significance
B) over-reporting of data with excessive detail
C) telling subjects in a control group that they need not participate
D) requesting detailed data from the original study author
Question
With critical variables in a research study held constant, different methods used to estimate reliability will typically yield

A) virtually no differences in the magnitude of the estimate.
B) sizable differences in the magnitude of the estimate.
C) skewed estimates of reliability.
D) identical estimates of reliability.
Question
According to Chin, as cited in your textbook, a lack of replicability in psychology affects the work of

A) the police.
B) judges.
C) court clerks.
D) All of these
Question
Field trials of DSM-5 demonstrated a mean kappa that was indicative of a ______ level of agreement among raters.

A) poor
B) fair
C) good
D) "kinder and gentler"
Question
In 2015, a report on attempted replications published in Science noted that, depending on the criteria used, ___ of the replications found the same results as the original study.

A) 0%
B) 20 to 40%
C) 40 to 60%
D) 100%
Question
Replication of research by independent parties provides for

A) confidence in study findings.
B) confirmation that the study findings were not an anomaly.
C) confidence that the study findings were not the result of the original experimenter's biases.
D) All of these
Question
Prior to DSM-5, a problem with the primary method used to estimate reliability of the DSM was that the method

A) did not allow for truly independent judgments.
B) resulted in overestimates of reliability.
C) artificially constrained information provided to clinicians.
D) All of these
Question
Which of the following is the best remedy for QRPs?

A) pre-registration
B) registration
C) post-registration
D) self-correction
Question
As compared to what was business as usual in the past, more researchers are coming to the realization that replication is

A) really not as necessary as what researchers once thought.
B) not something that can ever completely "right" past wrongs.
C) mandatory given the influence of social media.
D) needed if published findings are to be relied on.
Question
Which is TRUE of measurement error?

A) Like error in general, measurement error may be random or systematic.
B) Unlike error in general, measurement error may be random or systematic.
C) Measurement error is always random.
D) Measurement error is always systematic.
Question
In an illustrative scenario described in Chapter 5 of your text, a group of 12th grade "whiz kids" in math, newly arrived to the United States from China, perform poorly on a test of 12th grade math. According to the text, what probably accounted for this?

A) lower standards in China as compared to the US for measuring math ability.
B) higher standards in the US as compared to China for earning high grades.
C) the ability of the Chinese students to read what was required in English.
D) the reliability of the instrument used to test 12th grade math skills.
Question
In classical test theory, an observed score on an ability test is presumed to represent the testtaker's

A) true score.
B) true score less the variance.
C) true score combined with extraneous factors.
D) true score and error.
Question
The standard error of measurement is

A) used to infer how far an observed score is from the true score.
B) also known as the standard error of a score.
C) used in the context of classical test theory.
D) All of these
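Study note for the card above: in classical test theory the standard error of measurement is commonly computed as SEM = SD × sqrt(1 − r), where SD is the standard deviation of test scores and r is the test's reliability coefficient. The Python sketch below is illustrative only; the function names and the example values (SD of 15, reliability of .91, observed score of 100) are assumptions, not figures from the textbook.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability), the usual classical test theory formula."""
    return sd * math.sqrt(1.0 - reliability)


def confidence_band(observed: float, sem: float, z: float = 1.96) -> tuple:
    """Band around an observed score; z = 1.96 gives roughly a 95% band."""
    return (observed - z * sem, observed + z * sem)


# Hypothetical test: SD = 15, reliability = .91, so the SEM is about 4.5.
sem = standard_error_of_measurement(15.0, 0.91)
low, high = confidence_band(100.0, sem)
print(round(sem, 2), round(low, 2), round(high, 2))
```

The higher the reliability, the smaller the SEM and the narrower the band around any observed score.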
Question
Item response theory is to latent trait theory as observer reliability is to

A) generalizability theory.
B) domain sampling theory.
C) odd-even reliability.
D) inter-scorer reliability.
Question
Error in the reporting of spousal abuse may result from

A) one partner simply forgets all of the details of the abuse.
B) one partner misunderstands the instructions for reporting.
C) one partner is ashamed to report the abuse.
D) All of these
Question
A research study entails behavioral observation and rating of front desk clerks in the hospitality industry to determine whether or not they greet guests with a smile. Which type of error is this test most susceptible to?

A) test administration error
B) test construction error
C) examiner-related error
D) polling error
Question
This variety of error has also been referred to as "noise." It is

A) systematic error.
B) random error.
C) measurement error.
D) background error.
Question
One of the problems associated with classical test theory has to do with

A) the notion that there is a "true score" on a test has great intuitive appeal.
B) the fact that CTT assumptions are often characterized as "weak."
C) its assumptions concerning the equivalence of all items on a test.
D) its assumptions allow for its application in most situations.
Question
A Wall Street securities firm that is actually located on Wall Street is testing a group of candidates for their aptitude in finance and business. As the testing begins, an unexpected "Occupy Wall Street" sit-in takes place. From a psychometric perspective in the context of this testing, the sit-in is viewed as

A) systematic error.
B) random error.
C) test administration error.
D) background error.
Question
In their study of the diagnostic reliability of DSM-IV diagnoses, Chmielewski et al. (2015) used the "gold standard" in diagnostic instruments. The tool they used was the

A) MAST-2.
B) SCID I/P.
C) SCI-5.
D) Semi-Structured Diagnostic Interview (SSDI).
Question
Stanley (1971) wrote that in classical test theory, a so-called "true score" is "not the ultimate fact in the book of the recording angel." By this, Stanley meant that

A) it would be imprudent to trust in Divine influence when estimating variance.
B) the amount of test variance that is true relative to error may never be known.
C) it is near impossible to separate fact from fiction with regard to "true scores."
D) All of these
Question
The more homogeneous a test is, the

A) less inter-item consistency it can be expected to have.
B) more utility the test has for measuring multifaceted variables.
C) more inter-item consistency it can be expected to have.
D) None of these
Question
Which would NOT be useful in estimating a test's inter-item consistency?

A) Cronbach's alpha
B) the Kuder-Richardson formulas
C) the average proportional distance
D) a coefficient of equivalence
Question
A confidence interval is a range or band of test scores that

A) has proven test-retest reliability.
B) is calculated using the standard error of the difference.
C) is likely to contain the true score.
D) None of these
Question
The multiple-choice test items on this examination (yes, the one that you're taking right at this moment) are all examples of

A) dichotomous test items.
B) latent trait test items.
C) polytomous test items.
D) None of these
Question
Which of the following is NOT an alternative to classical test theory cited in your text?

A) generalizability theory
B) representational theory
C) domain sampling theory
D) latent trait theory
Question
Cronbach's alpha is to similarity of scores on test items as average proportional distance is to

A) difference in scores on test items.
B) inter-item consistency.
C) test-retest reliability.
D) parallel forms reliability.
Question
The term test heterogeneity BEST refers to the extent to which test items measure

A) different factors.
B) the same factor.
C) a unifactorial trait.
D) a nonhomogeneous trait.
Question
Which of the following terms is used in your textbook to describe the test-retest method of estimating diagnostic reliability?

A) methodologically sound
B) artificially constrained
C) psychometrically balanced
D) ecologically valid
Question
A reliability coefficient is

A) an index.
B) a proportion of the total variance attributed to true variance.
C) unaffected by a systematic source of error.
D) All of these
Question
Which of the following is true of systematic error?

A) It significantly lowers the reliability of a measure.
B) It insignificantly lowers the reliability of a measure.
C) It increases the reliability of a measure.
D) It has no effect on the reliability of a measure.
Question
Test-retest estimates of reliability are referred to as measures of ________, and split-half reliability estimates are referred to as measures of ________.

A) true scores; error scores
B) internal consistency; stability
C) inter-scorer reliability; consistency
D) stability; internal consistency
Question
Which of the following might lead to a decrease in test-retest reliability?

A) the passage of time between the two administrations of the test.
B) coaching designed to increase test scores between the two administrations of the test.
C) practice with similar test materials between the two administrations of the test.
D) All of these
Question
What term refers to the degree of correlation between all the items on a scale?

A) inter-item homogeneity
B) inter-item consistency
C) inter-item heterogeneity
D) parallel-form reliability
Question
Reliability, in a broad statistical sense, is synonymous with

A) consistently good.
B) consistently bad.
C) consistency.
D) validity.
Question
Which of the following factors may influence a split-half reliability estimate?

A) fatigue
B) anxiety
C) item difficulty
D) All of these
Question
As the degree of reliability increases, the proportion of

A) total variance attributed to true variance decreases.
B) total variance attributed to true variance increases.
C) total variance attributed to error variance increases.
D) None of these
Question
Why might ability test scores among testtakers most typically vary?

A) because of the true ability of the testtaker
B) because of irrelevant, unwanted influences
C) Both because of the true ability of the testtaker and because of irrelevant, unwanted influences
D) None of these
Question
Which source of error variance affects parallel- or alternate-form reliability estimates but does not affect test-retest estimates?

A) fatigue
B) learning
C) practice
D) item sampling
Question
Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that measures a trait that is presumed to be relatively stable over time?

A) parallel-forms
B) alternate-forms
C) test-retest
D) split-half
Question
Internal-consistency estimates of reliability are inappropriate for

A) reading achievement tests.
B) scholastic aptitude/intelligence tests.
C) word processing tests based on speed.
D) tests purporting to measure a single personality trait.
Question
Computer-scorable items have tended to eliminate error variance due to

A) item sampling.
B) scorer differences.
C) content sampling.
D) testtakers' reactions to environmental variables.
Question
An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval between the test and retest is more than

A) 30 days.
B) 60 days.
C) 3 months.
D) 6 months.
Question
Which of the following types of reliability estimates is the most expensive due to the costs involved in test development?

A) test-retest
B) parallel-form
C) internal-consistency
D) Spearman's rho
Question
Which type of reliability estimate is obtained by correlating pairs of scores from the same person (or people) on two different administrations of the same test?

A) a parallel-forms estimate
B) a split-half estimate
C) a test-retest estimate
D) an au-paire estimate
Question
A source of error variance may take the form of

A) item sampling.
B) testtakers' reactions to environment-related variables such as room temperature and lighting.
C) testtaker variables such as amount of sleep the night before a test, amount of anxiety, or drug effects.
D) All of these
Question
Which of the following is TRUE for estimates of alternate- and parallel-forms reliability?

A) Two test administrations with the same group are required.
B) Test scores may be affected by factors such as motivation, fatigue, or intervening events like practice, learning, or therapy.
C) Item sampling is a source of error variance.
D) All of these
Question
Which of the following is usually minimized when using split-half estimates of reliability as compared with test-retest or parallel/alternate-form estimates of reliability?

A) time and expense
B) reliability and validity
C) reliability only
D) time spent in scoring and interpretation
Question
Which of the following is TRUE for parallel forms of a test?

A) The means of the observed scores are equal for the two forms.
B) The variances of the estimated scores are equal for the two forms.
C) The means and variances of the observed scores are equal for the two forms.
D) The means and variances of the estimated scores are equal for the two forms.
Question
Error variance for measures of inter-item consistency comes from

A) fatigue.
B) motivation.
C) a testtaker practice effect.
D) heterogeneity of the content.
Question
Which of the following statements is TRUE about coefficient alpha?

A) Kuder thought it to be the single best measure of reliability.
B) It was first conceived by Alfalfa Alpha.
C) It is a characteristic of a particular set of scores, not of the test itself.
D) None of these
Question
The "20" and "21" in KR-20 and KR-21 represent

A) numbers held constant in the denominator.
B) numbers held constant in the numerator.
C) the order in which the formulas were created.
D) the age of Fred Kuder's sons when the formulas were developed.
Question
For a heterogeneous test, measures of internal-consistency reliability will tend to be ________ compared with other methods of estimating reliability.

A) higher
B) lower
C) very similar or higher
D) more robust
Question
The Spearman-Brown formula is used for

A) correcting for one half of the test by estimating the reliability of the whole test.
B) determining how many additional items are needed to increase reliability up to a certain level.
C) determining how many items can be eliminated without reducing reliability below a predetermined level.
D) All of these
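Study note for the card above: the general Spearman-Brown formula predicts the reliability of a test whose length is changed by a factor k, r_new = k·r / (1 + (k − 1)·r). The Python sketch below is a minimal illustration under that standard formula; the function names and the example reliabilities are assumptions chosen for the demonstration.

```python
def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability when test length is multiplied by k.

    k = 2 steps a half-test correlation up to a whole-test estimate;
    k < 1 models shortening the test.
    """
    return (k * r) / (1.0 + (k - 1.0) * r)


def length_factor(r: float, target: float) -> float:
    """Factor by which test length must change to move reliability r to target."""
    return (target * (1.0 - r)) / (r * (1.0 - target))


# Hypothetical half-test correlation of .70 stepped up to the full test.
full_test = spearman_brown(0.70, 2)

# Hypothetical test with reliability .80 that should reach .90.
k_needed = length_factor(0.80, 0.90)
print(round(full_test, 3), round(k_needed, 2))
```

Multiplying the current number of items by the returned factor gives the approximate test length needed, assuming the added items are comparable to the existing ones.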
Question
Many assumptions must be met when using KR-21 to estimate reliability. Which is NOT such an assumption?

A) Items should be dichotomous.
B) Items should be of equal difficulty.
C) Items should be homogeneous.
D) Items should be scorable by computer.
Question
A synonym for inter-scorer reliability is

A) inter-judge reliability
B) observer reliability
C) inter-rater reliability
D) All of these
Question
Coefficient alpha is an expression of

A) the mean of split-half correlations between odd- and even-numbered items.
B) the mean of split-half correlations between first- and second-half items.
C) the mean of all possible split-half correlations.
D) the mean of the best or "alpha" level split-half correlations.
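Study note for the card above: in practice coefficient alpha is usually computed from item and total-score variances rather than by enumerating split halves, using alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The Python sketch below applies that standard formula; the function name and the toy score matrix are assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a score matrix.

    scores: list of rows, one row per testtaker, one column per item.
    """
    k = len(scores[0])  # number of items
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: three testtakers, two items (hypothetical scores).
toy = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy), 3))
```

Because the value depends on the scores supplied, the same test can yield different alphas in different samples.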
Question
Which of the following is NOT an acceptable way to divide a test when using the split-half reliability method?

A) Randomly assign items to each half of the test.
B) Assign odd-numbered items to one half and even-numbered items to the other half of the test.
C) Assign the first-half of the items to one half of the test and the second half of the items to the other half of the test.
D) Assign easy items to one half of the test and difficult items to the other half of the test.
Question
When more than two scorers are used to determine inter-scorer reliability, the statistic of choice is

A) Pearson r.
B) Spearman's rho.
C) KR-20.
D) coefficient alpha.
Question
The KR-21 reliability estimate was developed

A) to yield greater consistency in reliability coefficients.
B) to facilitate computation by hand.
C) for use with less homogeneous items.
D) because Kuder wanted to "one-up" Richardson's 20.
Question
If items on a test are measuring very different traits, estimates of reliability yielded from split-half methods will typically be ________ as compared with estimates from KR-20.

A) higher
B) lower
C) similar
D) approximately the same
Question
Typically, adding items to a test will have what effect on the test's reliability?

A) Reliability will decrease.
B) Reliability will increase.
C) Reliability will stay the same.
D) Reliability will first increase and then decrease.
Question
KR-20 is the statistic of choice for tests with which types of items?

A) multiple-choice
B) true-false
C) All of these
D) None of these
Question
If items from a test are measuring the same trait, estimates of reliability yielded from split-half methods will typically be ________ as compared to estimates from KR-20.

A) higher
B) lower
C) similar
D) approximately the same
Question
Which BEST conveys the meaning of an inter-scorer reliability estimate of .90?

A) Ninety percent of the scores obtained are reliable.
B) Ninety percent of the variance in the scores assigned by the scorers was attributed to true differences and 10% to error.
C) Ten percent of the variance in the scores assigned by the scorers was attributed to true differences and 90% to error.
D) Ten percent of the test's items are in need of revision according to the majority of the test's users.
Question
For determining the reliability of tests scored using nominal scales of measurement, the statistic of choice is

A) Kendall's Tau.
B) the Kappa statistic.
C) KR-20.
D) coefficient alpha.
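Study note on one of the statistics listed above: Cohen's kappa for two raters compares observed agreement with the agreement expected by chance, kappa = (p_o − p_e) / (1 − p_e). The Python sketch below is a minimal two-rater version; the function name and the category labels are assumptions made for illustration, and it does not cover the multi-rater variants.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two raters assigning nominal categories to the same cases."""
    n = len(rater1)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement from each rater's marginal category counts.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = set(rater1) | set(rater2)
    expected = sum(counts1[c] * counts2[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical diagnoses assigned by two clinicians to four patients.
clinician_a = ["mdd", "mdd", "gad", "gad"]
clinician_b = ["mdd", "gad", "gad", "gad"]
print(cohen_kappa(clinician_a, clinician_b))
```

A kappa of 1 indicates perfect agreement, and a kappa near 0 indicates agreement no better than chance.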
Question
Coefficient alpha is appropriate to use with all of the following test formats EXCEPT

A) multiple-choice.
B) true-false.
C) short-answer for which partial credit is awarded.
D) essay exam with no partial credit awarded.
Question
Which of the following is, generally speaking, the preferred statistic for obtaining a measure of internal-consistency reliability?

A) KR-20
B) KR-21
C) Kendall's Tau
D) coefficient alpha
Question
A coefficient alpha over .9 may indicate that

A) the items in the test are too dissimilar.
B) the test is not reliable.
C) the items in the test are redundant.
D) the test is biased against low-ability individuals.
Deck 5: Reliability
1
According to Chin, it is common for research findings to be

A) rejected for not having met the general acceptance standard.
B) accepted as having met the general acceptance standard.
C) accepted by a judge but then rejected by an expert witness.
D) accepted by an expert witness but then rejected by a judge.
Answer: B
2
Unreliable findings that reach general acceptance in the academic community

A) tend to self-correct.
B) tend to linger too long.
C) become exposed through social media.
D) are not admissible in a court of law.
Answer: B
3
According to Gil et al. (2016), which of the following is a source of error in scores on psychological tests?

A) whether or not the examiner has a beard
B) whether the testtaker's country is at war or peace
C) the body weight of the testtaker two weeks prior to the test
D) None of these
Answer: B
4
Makel et al. (2012) observed that only about ____ of the published literature replicated previous work.

A) 1%
B) 3%
C) 5%
D) 7%
5
Generally, diagnostic reliability is necessary. However, which of the following is NOT a reason that diagnostic reliability is necessary?

A) It is necessary for accurate diagnosis.
B) It is necessary for any double-blind study.
C) It is necessary to determine the effectiveness of treatments.
D) It is necessary to track changes in a disorder over time.
6
Hawkins et al. (2016) found that subjects with ____ fasting glucose levels made nearly ____ times as many errors as subjects with fasting glucose levels in the normal range.

A) low; one-quarter
B) high; one-quarter
C) high; four times
D) high; twice
7
In the test-retest method to estimate reliability

A) the time frame between interviews must be relatively short.
B) separate interviews are conducted by certified raters.
C) a minimum of two re-tests are required.
D) All of these
8
What has been called a "replicability crisis" in psychology emerged as a result of a number of factors. Which is NOT one of those factors?

A) a general lack of published attempts to replicate research
B) editorial preferences for papers with positive findings
C) questionable research practices on the part of study authors
D) unwillingness or inability of original study authors to share data
9
Berman et al. (2015) observed that one source of error in evaluations of the suicide risk of patients is

A) whether or not the patient has previously attempted suicide.
B) whether or not the clinician previously had a patient attempt suicide.
C) how religious the evaluating clinician is.
D) how religious the evaluated patient is.
10
In 2015, a group of researchers attempted to replicate 100 peer-reviewed, published psychological studies. This group of researchers was called the

A) Society for the Replication of Psychological Studies.
B) Scientists for the Abolition of Irreproducible Results.
C) Open Science Collaboration.
D) Coalition for Responsible Science.
11
Prior to research on inter-rater reliability for DSM-5, DSM inter-rater reliability estimates were obtained using the ____ method.

A) test-retest
B) paired-paragraph
C) audio-recording
D) one-way mirror
12
Which is an example of what is referred to as "QRP" in your textbook?

A) collecting additional data to reach statistical significance
B) over-reporting of data with excessive detail
C) telling subjects in a control group that they need not participate
D) requesting detailed data from the original study author
13
With critical variables in a research study held constant, different methods used to estimate reliability will typically yield

A) virtually no differences in the magnitude of the estimate.
B) sizable differences in the magnitude of the estimate.
C) skewed estimates of reliability.
D) identical estimates of reliability.
14
According to Chin, as cited in your textbook, a lack of replicability in psychology affects the work of

A) the police.
B) judges.
C) court clerks.
D) All of these
15
Field trials of DSM-5 demonstrated a mean kappa that was indicative of a ______ level of agreement among raters.

A) poor
B) fair
C) good
D) "kinder and gentler"
16
In 2015, a report on attempted replications published in Science noted that, depending on the criteria used, ___ of the replications found the same results as the original study.

A) 0%
B) 20 to 40%
C) 40 to 60%
D) 100%
17
Replication of research by independent parties provides for

A) confidence in study findings.
B) confirmation that the study findings were not an anomaly.
C) confidence that the study findings were not the result of the original experimenter's biases.
D) All of these
18
Prior to DSM-5, a problem with the primary method used to estimate reliability of the DSM was that the method

A) did not allow for truly independent judgments.
B) resulted in overestimates of reliability.
C) artificially constrained information provided to clinicians.
D) All of these
19
Which of the following is the best remedy for QRPs?

A) pre-registration
B) registration
C) post-registration
D) self-correction
20
As compared to what was business as usual in the past, more researchers are coming to the realization that replication is

A) really not as necessary as what researchers once thought.
B) not something that can ever completely "right" past wrongs.
C) mandatory given the influence of social media.
D) needed if published findings are to be relied on.
21
Which is TRUE of measurement error?

A) Like error in general, measurement error may be random or systematic.
B) Unlike error in general, measurement error may be random or systematic.
C) Measurement error is always random.
D) Measurement error is always systematic.
22
In an illustrative scenario described in Chapter 5 of your text, a group of 12th grade "whiz kids" in math, newly arrived to the United States from China, perform poorly on a test of 12th grade math. According to the text, what probably accounted for this?

A) lower standards in China as compared to the US for measuring math ability.
B) higher standards in the US as compared to China for earning high grades.
C) the ability of the Chinese students to read what was required in English.
D) the reliability of the instrument used to test 12th grade math skills.
23
In classical test theory, an observed score on an ability test is presumed to represent the testtaker's

A) true score.
B) true score less the variance.
C) true score combined with extraneous factors.
D) true score and error.
24
The standard error of measurement is

A) used to infer how far an observed score is from the true score.
B) also known as the standard error of a score.
C) used in the context of classical test theory.
D) All of these
25
Item response theory is to latent trait theory as observer reliability is to

A) generalizability theory.
B) domain sampling theory.
C) odd-even reliability.
D) inter-scorer reliability.
26
Error in the reporting of spousal abuse may result from

A) one partner simply forgets all of the details of the abuse.
B) one partner misunderstands the instructions for reporting.
C) one partner is ashamed to report the abuse.
D) All of these
27
A research study entails behavioral observation and rating of front desk clerks in the hospitality industry to determine whether or not they greet guests with a smile. Which type of error is this test most susceptible to?

A) test administration error
B) test construction error
C) examiner-related error
D) polling error
28
This variety of error has also been referred to as "noise." It is

A) systematic error.
B) random error.
C) measurement error.
D) background error.
29
One of the problems associated with classical test theory has to do with

A) the notion that there is a "true score" on a test has great intuitive appeal.
B) the fact that CTT assumptions are often characterized as "weak."
C) its assumptions concerning the equivalence of all items on a test.
D) its assumptions allow for its application in most situations.
30
A Wall Street securities firm that is actually located on Wall Street is testing a group of candidates for their aptitude in finance and business. As the testing begins, an unexpected "Occupy Wall Street" sit-in takes place. From a psychometric perspective in the context of this testing, the sit-in is viewed as

A) systematic error.
B) random error.
C) test administration error.
D) background error.
31
In their study of the diagnostic reliability of DSM-IV diagnoses, Chmielewski et al. (2015) used the "gold standard" in diagnostic instruments. The tool they used was the

A) MAST-2.
B) SCID I/P.
C) SCI-5.
D) Semi-Structured Diagnostic Interview (SSDI).
32
Stanley (1971) wrote that in classical test theory, a so-called "true score" is "not the ultimate fact in the book of the recording angel." By this, Stanley meant that

A) it would be imprudent to trust in Divine influence when estimating variance.
B) the amount of test variance that is true relative to error may never be known.
C) it is near impossible to separate fact from fiction with regard to "true scores."
D) All of these
33
The more homogeneous a test is, the

A) less inter-item consistency it can be expected to have.
B) more utility the test has for measuring multifaceted variables.
C) more inter-item consistency it can be expected to have.
D) None of these
34
Which would NOT be useful in estimating a test's inter-item consistency?

A) Cronbach's alpha
B) the Kuder-Richardson formulas
C) the average proportional distance
D) a coefficient of equivalence
35
A confidence interval is a range or band of test scores that

A) has proven test-retest reliability.
B) is calculated using the standard error of the difference.
C) is likely to contain the true score.
D) None of these
36
The multiple-choice test items on this examination (yes, the one that you're taking right at this moment) are all examples of

A) dichotomous test items.
B) latent trait test items.
C) polytomous test items.
D) None of these
37
Which of the following is NOT an alternative to classical test theory cited in your text?

A) generalizability theory
B) representational theory
C) domain sampling theory
D) latent trait theory
38
Cronbach's alpha is to similarity of scores on test items as average proportional distance is to

A) difference in scores on test items.
B) inter-item consistency.
C) test-retest reliability.
D) parallel forms reliability.
39
The term test heterogeneity BEST refers to the extent to which test items measure

A) different factors.
B) the same factor.
C) a unifactorial trait.
D) a nonhomogeneous trait.
40
Which of the following terms is used in your textbook to describe the test-retest method of estimating diagnostic reliability?

A) methodologically sound
B) artificially constrained
C) psychometrically balanced
D) ecologically valid
41
A reliability coefficient is

A) an index.
B) a proportion of the total variance attributed to true variance.
C) unaffected by a systematic source of error.
D) All of these
42
Which of the following is true of systematic error?

A) It significantly lowers the reliability of a measure.
B) It insignificantly lowers the reliability of a measure.
C) It increases the reliability of a measure.
D) It has no effect on the reliability of a measure.
43
Test-retest estimates of reliability are referred to as measures of ________, and split-half reliability estimates are referred to as measures of ________.

A) true scores; error scores
B) internal consistency; stability
C) inter-scorer reliability; consistency
D) stability; internal consistency
44
Which of the following might lead to a decrease in test-retest reliability?

A) the passage of time between the two administrations of the test.
B) coaching designed to increase test scores between the two administrations of the test.
C) practice with similar test materials between the two administrations of the test.
D) All of these
45
What term refers to the degree of correlation between all the items on a scale?

A) inter-item homogeneity
B) inter-item consistency
C) inter-item heterogeneity
D) parallel-form reliability
46
Reliability, in a broad statistical sense, is synonymous with

A) consistently good.
B) consistently bad.
C) consistency.
D) validity.
47
Which of the following factors may influence a split-half reliability estimate?

A) fatigue
B) anxiety
C) item difficulty
D) All of these
48
As the degree of reliability increases, the proportion of

A) total variance attributed to true variance decreases.
B) total variance attributed to true variance increases.
C) total variance attributed to error variance increases.
D) None of these
49
Why might ability test scores among testtakers most typically vary?

A) because of the true ability of the testtaker
B) because of irrelevant, unwanted influences
C) Both because of the true ability of the testtaker and because of irrelevant, unwanted influences
D) None of these
50
Which source of error variance affects parallel- or alternate-form reliability estimates but does not affect test-retest estimates?

A) fatigue
B) learning
C) practice
D) item sampling
51
Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that measures a trait that is presumed to be relatively stable over time?

A) parallel-forms
B) alternate-forms
C) test-retest
D) split-half
52
Internal-consistency estimates of reliability are inappropriate for

A) reading achievement tests.
B) scholastic aptitude/intelligence tests.
C) word processing tests based on speed.
D) tests purporting to measure a single personality trait.
53
Computer-scorable items have tended to eliminate error variance due to

A) item sampling.
B) scorer differences.
C) content sampling.
D) testtakers' reactions to environmental variables.
54
An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval between the test and retest is more than

A) 30 days.
B) 60 days.
C) 3 months.
D) 6 months.
55
Which of the following types of reliability estimates is the most expensive due to the costs involved in test development?

A) test-retest
B) parallel-form
C) internal-consistency
D) Spearman's rho
56
Which type of reliability estimate is obtained by correlating pairs of scores from the same person (or people) on two different administrations of the same test?

A) a parallel-forms estimate
B) a split-half estimate
C) a test-retest estimate
D) an au-paire estimate
57
A source of error variance may take the form of

A) item sampling.
B) testtakers' reactions to environment-related variables such as room temperature and lighting.
C) testtaker variables such as amount of sleep the night before a test, amount of anxiety, or drug effects.
D) All of these
58
Which of the following is TRUE for estimates of alternate- and parallel-forms reliability?

A) Two test administrations with the same group are required.
B) Test scores may be affected by factors such as motivation, fatigue, or intervening events like practice, learning, or therapy.
C) Item sampling is a source of error variance.
D) All of these
59
Which of the following is usually minimized when using split-half estimates of reliability as compared with test-retest or parallel/alternate-form estimates of reliability?

A) time and expense
B) reliability and validity
C) reliability only
D) time spent in scoring and interpretation
60
Which of the following is TRUE for parallel forms of a test?

A) The means of the observed scores are equal for the two forms.
B) The variances of the estimated scores are equal for the two forms.
C) The means and variances of the observed scores are equal for the two forms.
D) The means and variances of the estimated scores are equal for the two forms.
61
Error variance for measures of inter-item consistency comes from

A) fatigue.
B) motivation.
C) a testtaker practice effect.
D) heterogeneity of the content.
62
Which of the following statements is TRUE about coefficient alpha?

A) Kuder thought it to be the single best measure of reliability.
B) It was first conceived by Alfalfa Alpha.
C) It is a characteristic of a particular set of scores, not of the test itself.
D) None of these
63
The "20" and "21" in KR-20 and KR-21 represent

A) numbers held constant in the denominator.
B) numbers held constant in the numerator.
C) the order in which the formulas were created.
D) the age of Fred Kuder's sons when the formulas were developed.
64
For a heterogeneous test, measures of internal-consistency reliability will tend to be ________ compared with other methods of estimating reliability.

A) higher
B) lower
C) very similar or higher
D) more robust
65
The Spearman-Brown formula is used for

A) correcting for one half of the test by estimating the reliability of the whole test.
B) determining how many additional items are needed to increase reliability up to a certain level.
C) determining how many items can be eliminated without reducing reliability below a predetermined level.
D) All of these
66
Many assumptions must be met when using KR-21 to estimate reliability. Which is NOT such an assumption?

A) Items should be dichotomous.
B) Items should be of equal difficulty.
C) Items should be homogeneous.
D) Items should be scorable by computer.
67
A synonym for inter-scorer reliability is

A) inter-judge reliability
B) observer reliability
C) inter-rater reliability
D) All of these
68
Coefficient alpha is an expression of

A) the mean of split-half correlations between odd- and even-numbered items.
B) the mean of split-half correlations between first- and second-half items.
C) the mean of all possible split-half correlations.
D) the mean of the best or "alpha" level split-half correlations.
69
Which of the following is NOT an acceptable way to divide a test when using the split-half reliability method?

A) Randomly assign items to each half of the test.
B) Assign odd-numbered items to one half and even-numbered items to the other half of the test.
C) Assign the first half of the items to one half of the test and the second half of the items to the other half of the test.
D) Assign easy items to one half of the test and difficult items to the other half of the test.
70
When more than two scorers are used to determine inter-scorer reliability, the statistic of choice is

A) Pearson r.
B) Spearman's rho.
C) KR-20.
D) coefficient alpha.
71
The KR-21 reliability estimate was developed

A) to yield greater consistency in reliability coefficients.
B) to facilitate computation by hand.
C) for use with less homogeneous items.
D) because Kuder wanted to "one-up" Richardson's 20.
72
If items on a test are measuring very different traits, estimates of reliability yielded from split-half methods will typically be ________ as compared with estimates from KR-20.

A) higher
B) lower
C) similar
D) approximately the same
73
Typically, adding items to a test will have what effect on the test's reliability?

A) Reliability will decrease.
B) Reliability will increase.
C) Reliability will stay the same.
D) Reliability will first increase and then decrease.
74
KR-20 is the statistic of choice for tests with which types of items?

A) multiple-choice
B) true-false
C) All of these
D) None of these
75
If items from a test are measuring the same trait, estimates of reliability yielded from split-half methods will typically be ________ as compared with estimates from KR-20.

A) higher
B) lower
C) similar
D) approximately the same
76
Which BEST conveys the meaning of an inter-scorer reliability estimate of .90?

A) Ninety percent of the scores obtained are reliable.
B) Ninety percent of the variance in the scores assigned by the scorers was attributed to true differences and 10% to error.
C) Ten percent of the variance in the scores assigned by the scorers was attributed to true differences and 90% to error.
D) Ten percent of the test's items are in need of revision according to the majority of the test's users.
77
For determining the reliability of tests scored using nominal scales of measurement, the statistic of choice is

A) Kendall's Tau.
B) the Kappa statistic.
C) KR-20.
D) coefficient alpha.
78
Coefficient alpha is appropriate to use with all of the following test formats EXCEPT

A) multiple-choice.
B) true-false.
C) short-answer for which partial credit is awarded.
D) essay exam with no partial credit awarded.
79
Which of the following is, generally speaking, the preferred statistic for obtaining a measure of internal-consistency reliability?

A) KR-20
B) KR-21
C) Kendall's Tau
D) coefficient alpha
80
A coefficient alpha over .9 may indicate that

A) the items in the test are too dissimilar.
B) the test is not reliable.
C) the items in the test are redundant.
D) the test is biased against low-ability individuals.