Question 1

To evaluate the content validity of a portfolio assessment, it should be determined that the student's work that is included in the portfolio represents

A) the best work that the student has done in the domain.
B) all important dimensions within the domain.
C) all of the student's work during the previous year.
D) just those areas where the student continues to have difficulty.

Accepted Answer

Content validity refers to how well an assessment or measurement tool covers the domain it's supposed to measure. For a portfolio assessment to have content validity, it should include work that represents all important dimensions within the domain, ensuring a comprehensive evaluation of the student's abilities and knowledge across the entire domain of interest.

Question 2

Which of the following statements concerning test validity is most accurate?

A) A test cannot be valid unless it is reliable.
B) A test cannot be reliable unless it is valid.
C) A test cannot be standardized unless it is valid.
D) A test cannot be reliable unless it is standardized.

Accepted Answer

Validity refers to how well a test measures what it is supposed to measure, while reliability refers to the consistency of the test results. A test must be reliable to be valid because if it does not produce consistent results, it cannot accurately measure what it is supposed to. However, a test can be reliable without being valid; it can consistently measure something, but not necessarily what it is intended to measure. Standardization is about the consistent administration and scoring of the test, which is separate from its validity and reliability.

Question 3

The extent to which a person's score on a test is related to performance on a criterion measure is best described as evidence based on

A) test content.
B) test structure.
C) relations to other variables.
D) response processes.

Accepted Answer

The extent to which a person's score on a test is related to performance on a criterion measure is described as evidence based on relations to other variables. This type of evidence examines how test scores correlate with outcomes or criteria that the test is supposed to predict, reflecting the test's validity in terms of its applicability to real-world situations or outcomes.

Question 4

When different groups of test takers consistently experience disparate levels of success on specific items, there is a problem in

A) differential item effectiveness.
B) group selection.
C) administration errors.
D) reliability.

Accepted Answer

Differential item effectiveness, more commonly called differential item functioning (DIF) in the measurement literature, occurs when items on a test systematically favor one group of test takers over another, producing differences in item performance that are unrelated to the construct being measured. This can indicate bias in the test items.

Question 5

If a test measures something consistently but does not measure what it was designed to measure, then the test is

A) reliable but not valid.
B) reliable but not standardized.
C) standardized but not reliable.
D) valid but not reliable.

Accepted Answer

A test that measures something consistently but fails to measure what it's supposed to is considered reliable (because it produces consistent results) but not valid (because it doesn't measure what it was designed to measure).

Question 6

The higher the reliability coefficient, the lower the

A) coefficient of regression.
B) standard error of measurement.
C) validity.
D) standard deviation.

Accepted Answer

The reliability coefficient is a measure of the consistency of a test. A higher reliability coefficient indicates that the test produces stable and consistent results across different administrations. The standard error of measurement, which quantifies the amount of error in the scores of a test, decreases as the reliability of the test increases. This is because a more reliable test has less random measurement error, leading to a lower standard error of measurement.
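The inverse relationship described above can be illustrated with the classical test theory formula SEM = SD × sqrt(1 − r), where SD is the standard deviation of the test scores and r is the reliability coefficient. The formula is standard but is not stated in the original answer, and the function name and the SD value of 15 below are assumptions chosen only for illustration.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: sd * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical test with a score standard deviation of 15 (illustrative value).
sd = 15.0
for r in (0.70, 0.90, 0.97):
    sem = standard_error_of_measurement(sd, r)
    print(f"reliability={r:.2f}  SEM={sem:.2f}")
```

With the standard deviation held fixed, each increase in the reliability coefficient yields a smaller SEM, and a reliability of 1.00 gives an SEM of zero.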

Question 7

In the context of assessment, "enabling behaviors" are those behaviors that

A) help the tester attract the subject's attention.
B) are extraneous to the requirements of the test situation.
C) focus the assessment on qualitative analyses of performance.
D) are required by the assessment to demonstrate the target knowledge.

Accepted Answer

Enabling behaviors are the skills and knowledge a test taker must have in order to demonstrate the target knowledge on the assessment. They are built into the test's requirements, so a person who lacks them cannot show what he or she actually knows.

Question 8

Because a person's true abilities can change between two administrations of a test, it is generally true that

A) test-retest procedures cannot produce good reliability estimates.
B) the shorter the time between the two administrations, the higher the reliability.
C) the length of time between two administrations has little effect on reliability.
D) a test developer needs to calculate coefficient alpha to estimate stability.

Accepted Answer

The shorter the interval between two test administrations, the less opportunity there is for a person's true abilities or external factors to change, so test-retest reliability estimates tend to be higher.

Question 9

An individual reported a reliability coefficient of 1.25 for an intelligence test. It was obtained by correlating the results of a given group on Form A with the group's results on Form B. This coefficient indicates that

A) the test is unusually reliable.
B) the test is unusually valid.
C) there are no errors of measurement.
D) a mistake was made in computing the coefficient.

Accepted Answer

A reliability coefficient is a statistical measure that ranges from 0 to 1, where 1 indicates perfect reliability and 0 indicates no reliability. A coefficient greater than 1, such as 1.25, is not possible, indicating a computational error.
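An alternate-forms coefficient like the one in this question is a correlation between two sets of scores, and a correlation can never exceed 1. The sketch below computes a Pearson correlation by hand for a small made-up group; the scores and the function name are hypothetical and serve only to illustrate the bound.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (dev_x * dev_y)


# Made-up scores for the same five examinees on Form A and Form B.
form_a = [98.0, 105.0, 110.0, 121.0, 130.0]
form_b = [101.0, 103.0, 113.0, 118.0, 133.0]

r = pearson_r(form_a, form_b)
print(f"alternate-forms reliability estimate: {r:.3f}")
```

Whatever scores are entered, the result stays within the range of a correlation, so a reported value of 1.25 can only come from an arithmetic or data-entry mistake.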

Question 10

Coefficient alpha is most linked to

A) test-retest reliability.
B) percentage of agreement.
C) stability.
D) internal consistency.

Accepted Answer

Coefficient alpha, often referred to as Cronbach's alpha, is a measure of internal consistency, which assesses how closely related a set of items are as a group. It is not directly related to test-retest reliability, percentage of agreement, or stability, which assess different aspects of reliability and agreement.
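As a rough illustration of why alpha reflects internal consistency, the sketch below applies the usual formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), to a small made-up response matrix. The data, the function name, and the choice of sample variance are assumptions made for this example.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[float]]) -> float:
    """Coefficient alpha for rows of respondents and columns of items."""
    k = len(responses[0])  # number of items
    # Transpose so that each tuple holds every respondent's score on one item.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)
    total_scores = [sum(row) for row in responses]
    return (k / (k - 1)) * (1 - item_variance_sum / variance(total_scores))


# Made-up responses: 5 respondents (rows) answering 4 items (columns).
responses = [
    [4.0, 5.0, 4.0, 5.0],
    [2.0, 3.0, 2.0, 2.0],
    [5.0, 5.0, 4.0, 4.0],
    [1.0, 2.0, 1.0, 2.0],
    [3.0, 3.0, 4.0, 3.0],
]

alpha = cronbach_alpha(responses)
print(f"coefficient alpha: {alpha:.3f}")
```

The calculation uses a single administration of the test: alpha rises as the items covary more strongly with one another, which is why it says nothing about stability over time.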

Question 11

To determine the stability of a test, the recommended interval between administrations of the test is __________.

A) 2 days
B) 2 weeks
C) 2 months
D) 2 years

Accepted Answer

The answer of To determine the stability of a test,...

Question 12

A statistic that enables an examiner to establish confidence for the true scores of examinees is the

A) Kuder-Richardson predictive index.
B) validity coefficient.
C) standard error of measurement.
D) mode.

Accepted Answer

The answer of A statistic that enables an examiner to...

Question 13

Kiana is going to evaluate the concurrent criterion-related validity of a self-report assessment of classroom problem behaviors. The most appropriate criterion measure would be

A) a test of intelligence.
B) classroom observation.
C) grades in math.
D) court records.

Accepted Answer

The answer of Kiana is going to evaluate the concurrent...

Question 14

T-scores for student X on an achievement test battery standardized on the same population are spelling 35, math 62, social studies 50, and English grammar 52. Each test has a SEM of 2; the tests are not intercorrelated. We conclude that

A) X is strongest in spelling.
B) X is strongest in math.
C) there are no substantial differences in X's achievements in these four areas.
D) it is not possible to compare X's performance on these subtests.

Accepted Answer

The answer of T-scores for student X on an achievement...

Question 15

A stability coefficient is used for measuring the reliability of

A) a test administered at two different times.
B) the first 50 items, compared with the last 50 items in a 100-item test.
C) standard error of measurement.
D) alternate forms of a test.

Accepted Answer

The answer of A stability coefficient is used for measuring...

Question 16

The results of an achievement test are considered to be invalid if

A) reliability is less than 0.95.
B) the teacher has not taught the content being tested.
C) the student did not listen when the subject matter was taught.
D) validity is less than 0.95.

Accepted Answer

The answer of The results of an achievement test are...

Question 17

The reliability of a test refers to its relative

A) validity.
B) power.
C) consistency.
D) inappropriateness.

Accepted Answer

The answer of The reliability of a test refers to...

Question 18

Method of measurement, enabling behaviors, and administrative errors are all considered to be

A) types of reliability.
B) signs of validity.
C) sources of systematic bias.
D) test development problems.

Accepted Answer

The answer of Method of measurement, enabling behaviors, and administrative...

Question 19

The means for both Test A and Test B are 50. A 50% confidence interval for a score at the mean is 44-55 for Test A and 42-58 for Test B. Which of the following statements is true?

A) Test A is more reliable than Test B.
B) Test B is more reliable than Test A.
C) Test A has a larger SEM than Test B.
D) Test B has a larger SEM than Test A.

Accepted Answer

The answer of The means for both Test A and...

Question 20

The Dairy County School District appropriately uses a test that has a reliability of 0.89 to

A) place children in special education if they earn a score below an established criterion.
B) move children to another school building to receive services for gifted children if they score above a certain point.
C) decide whether students should be placed in the Rainbow reading group or the Rainstorm reading group.
D) decide to conduct further assessment procedures.

Accepted Answer

The answer of The Dairy County School District appropriately uses...

Question 21

The absence of __________ required for performance on a test invalidates the test results.

Accepted Answer

The answer of The absence of __________ required for performance...

Question 22

For individual test data, where a test score is used to make a tracking or placement decision for an individual student, the recommended level of required reliability is __________.

Accepted Answer

The answer of For individual test data, where a test...

Question 23

For group test data that are used for administrative purposes and reported only by group, the recommended level of required reliability is __________.

Accepted Answer

The answer of For group test data that are used...

Question 24

If one wants to generalize to different times, one should examine the test's __________.

Accepted Answer

The answer of If one wants to generalize to different...

Question 25

Failure to administer a test according to standardized procedures is considered

A) appropriate if the subject is young.
B) a form of rapport building that may be necessary.
C) a source of systematic bias.
D) a random error that varies from one subject to the next.

Accepted Answer

The answer of Failure to administer a test according to...

Question 26

A test has a norm sample that is not representative of the population. Inferences made on the basis of a student's performance on this test are

A) likely to indicate lower performance than the true score.
B) invalid.
C) unreliable.
D) considered to be reasonable for qualitative comparisons.

Accepted Answer

The answer of A test has a norm sample that...

Question 27

If one wants to generalize to different item samples, one should examine the test's __________.

Accepted Answer

The answer of If one wants to generalize to different...

Question 28

Both unreliability (unsystematic error) and systematic error (bias) threaten __________.

Accepted Answer

The answer of Both unreliability (unsystematic error) and systematic error...

Question 29

The most likely explanation for items having __________ for different groups of people is differential exposure to test content.

Accepted Answer

The answer of The most likely explanation for items having...

Question 30

The validity of a particular test can never exceed the __________ of that test.

Accepted Answer

The answer of The validity of a particular test can...

Question 31

When we test, we are interested in __________ what we see today under one set of conditions to other occasions.

Accepted Answer

The answer of When we test, we are interested in...

Question 32

An estimate of the likelihood that a person's true score may be found within a range of scores is provided by the __________.

Accepted Answer

The answer of An estimate of the likelihood that a...

Question 33

The completeness of the item sample is one of the factors to consider in determining __________ validity.

Accepted Answer

The answer of The completeness of the item sample is...

Question 34

For individual test data, where a test score is used to make a screening decision, the recommended level of required reliability is __________.

Accepted Answer

The answer of For individual test data, where a test...

Question 35

In order for evidence of high concurrent validity to be meaningful, the criterion measures must be __________.

Accepted Answer

The answer of In order for evidence of high concurrent...

Question 36

A reliability coefficient of 1.00 indicates __________ reliability.

Accepted Answer

The answer of A reliability coefficient of 1.00 indicates __________...

Question 37

Criteria for how high a test's reliability must be are determined in part by the specific __________ of assessment.

Accepted Answer

The answer of Criteria for how high a test's reliability...

Question 38

Validity evidence based on _________ ___________ reflects the extent to which a test's items represent the domain or universe to be measured.

Accepted Answer

The answer of Validity evidence based on _________ ___________ reflects...

Question 39

A method of estimating the reliability of a test that does not have two forms is to calculate the __________.

Accepted Answer

The answer of A method of estimating the reliability of...

Question 40

A test with a reliability coefficient of .97 has relatively little __________.

Accepted Answer

The answer of A test with a reliability coefficient of...

Question 41

The test manual for the Culture-Fair Intelligence Test reports correlations with the Stanford-Binet and the Goodenough-Harris tests. What type of validity is the author trying to demonstrate?

Accepted Answer

The answer of The test manual for the Culture-Fair Intelligence...

Question 42

Dr. Qubert has developed a test for which there is not an adequate criterion measure or construct with which to evaluate validity. She therefore decides to present complete content validity data. What three factors must she consider when determining content validity?

Accepted Answer

The answer of Dr. Qubert has developed a test for...

Question 43

To the extent that a norm sample is systematically unrepresentative, the inferences based on such scores are incorrect and __________.

Accepted Answer

The answer of To the extent that a norm sample...

Question 44

Sixty percent of a test's variance is caused by the variance of true scores, whereas 40% of the variance is caused by error. What is the test's reliability?

Accepted Answer

The answer of Sixty percent of a test's variance is...

Question 45

Test results would be of little value if we were unable to generalize what was observed in one situation to other situations. Identify and discuss three types of generalizations that can be made from reliable test results.

Accepted Answer

The answer of Test results would be of little value...

Question 46

Joseph was tested on an instrument for which the SEM was relatively high. How sure can we be of Joseph's score?

Accepted Answer

The answer of Joseph was tested on an instrument for...

Question 47

Explain in your own words the relationship between reliability and validity.

Accepted Answer

The answer of Explain in your own words the relationship...

Question 48

Validity evidence based on the consequences of testing is a concept adopted by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education. However, it has not been widely accepted in education. Discuss the reason evidence based on the consequences of testing has not been accepted in education.

Accepted Answer

The answer of Validity evidence based on the consequences of...

Question 49

Compare and contrast the two major approaches to estimating the extent to which we can generalize from different samples of items.

Accepted Answer

The answer of Compare and contrast the two major approaches...

Question 50

Unless a test is administered according to the __________ the results are invalid.

Accepted Answer

The answer of Unless a test is administered according to...

Question 51

Annette was tested on an instrument for which the SEM was quite small. How sure can we be of Annette's score?

Accepted Answer

The answer of Annette was tested on an instrument for...

Deck 5: Technical Adequacy: Reliability and Validity