# Quiz 4: Sampling, Measurement, and Hypothesis Testing

Most research in psychology uses ________ sampling.
A)convenience
B)simple random
C)stratified
D)cluster

A

The purpose of random sampling is to obtain a sample that is
A)large enough to be valid
B)representative of the population
C)smaller than the population
D)significantly different from the population

B

In order to generalize from the results obtained with a sample to the population as a whole,
A)all members of the population must be tested eventually
B)each member of the population must have exactly the same probability of being selected, especially if stratified sampling is being used
C)the sample must be representative of the population
D)the exact values (on the trait being measured) for the population must be known

C

The basic definition of ____________ is that all members of the population have exactly the same chance of being selected as participants.
A)cluster sampling
B)nonprobability sampling
C)convenience sampling
D)simple random sampling

If it is not feasible to have a complete listing of the members of the population, which probability sampling method can be used?
A)stratified
B)convenience
C)cluster
D)simple random

When should a stratified sample be used?
A)when probability sampling is not necessary
B)when identifiable subgroups of the population are of interest
C)when the population is too large for all of it to be tested
D)when a list of all population members is not available

To study math achievement in West Virginia's third graders, a researcher randomly selects 5% of the state's school districts and gives all the students in each district a math test. What sampling procedure is being used here?
A)quota
B)stratified
C)cluster
D)none of the above (all the children in the selected districts are tested - therefore the entire population is being tested, not just a sample)

A researcher who selects a probability sample that is 40% male and 60% female is most likely to be using __________ sampling.
A)cluster
B)stratified
C)convenience
D)purposive
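
A 40/60 split in a probability sample is what proportional allocation within strata produces. The sketch below is only an illustration: the population of 400 males and 600 females, the sample size of 50, and the helper name `stratified_sample` are all assumptions, not part of the quiz item.

```python
import random

def stratified_sample(population, stratum_of, sample_size, rng):
    """Draw a proportional stratified sample (hypothetical helper)."""
    # Group population members by stratum.
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    # Randomly sample within each stratum, in proportion to its size.
    sample = []
    for members in strata.values():
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Assumed population: 40% male, 60% female.
population = [("male", i) for i in range(400)] + [("female", i) for i in range(600)]
sample = stratified_sample(population, lambda m: m[0], 50, rng)
n_male = sum(1 for sex, _ in sample if sex == "male")
print(n_male, len(sample) - n_male)  # 20 30
```

Which individuals are drawn varies with the seed, but the subgroup counts are fixed by the allocation rule.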

All of the following are examples of probability sampling except
A)simple random
B)quota
C)cluster
D)stratified

Which of the following is an example of a construct?
A)entering arm #3 of a radial maze
B)using fingers when adding
C)social effectiveness
D)naming letters

Which of the following is not an example of a construct?
A)perceived social support
B)letter identification
C)habituation
D)social effectiveness

Which of the following sequences of "time (in seconds) spent looking" suggests that habituation is occurring?
A)14, 10, 8, 12
B)6, 6, 6, 6
C)10, 6, 10, 6
D)10, 8, 6, 4

Which of the following sequences of "time (in seconds) spent looking" suggests that habituation occurs initially, but is followed by the perception of "something new"?
A)12, 10, 7, 11
B)6, 6, 6, 6
C)10, 6, 10, 6
D)10, 8, 6, 4

In a sequence of trials, an infant looks at a stimulus for 10 seconds, then 8, then 6, then 4. On the next trial, the infant looks for 12 seconds. What has occurred on this last trial?
A)the infant has noticed a change in the stimulus
B)habituation has occurred
C)the infant has lost interest in the stimulus
D)the infant is afraid of the stimulus

A gradual decline in responding in the face of a repeated stimulus is known as
A)inhibition
B)habituation
C)extinction
D)reaction time

In the mental rotation studies, Shepard and Metzler (1971) predicted that
A)participants would make more errors with a 30° rotation than with a 60° rotation
B)participants would make more errors with a 60° rotation than with a 30° rotation
C)participants would take more time with a 30° rotation than with a 60° rotation
D)participants would take more time with a 60° rotation than with a 30° rotation

Suppose a child in Kim and Spelke's (1992) habituation experiment showed a gradual decrease in looking time when shown ten examples of balls rolling down a plane while accelerating. The child then sees (trial 11) a ball rolling down a plane while decelerating. If the child has grasped the concept of gravity, what will happen to the behavior?
A)they will look longer on trial 10 than on trial 11
B)they will look longer on trial 11 than on trial 10
C)they will look for the same amount of time on both trials 10 and 11
D)not enough information to decide

Suppose a child in Kim and Spelke's (1992) habituation experiment showed a gradual decrease in looking time when shown ten examples of balls rolling down a plane while accelerating. The child then sees (trial 11) a ball rolling up a plane while decelerating. If the child understands the concept of gravity, what will happen to the behavior?
A)they will no longer look at the display
B)they will look much longer on trial 11 than on trial 10
C)they will look for the same amount of time on both trials 10 and 11
D)not enough information to decide

If simple reaction time takes an average of 0.17 seconds and discrimination reaction time takes an average of 0.26 seconds, then according to Donders' method, how long does the mental event of discrimination take?
A)0.43 seconds
B)0.26 seconds
C)0.09 seconds
D)0.20 seconds

Using Donders' method, if the mental event called discrimination takes 0.07 seconds and discrimination reaction time takes 0.23 seconds, what is the person's basic reaction time?
A)0.07 seconds
B)0.30 seconds
C)0.16 seconds
D)0.93 seconds
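
The arithmetic behind the two Donders items can be sketched as follows. This is a minimal illustration that assumes the simple additive model, in which discrimination reaction time equals simple reaction time plus the duration of the discrimination stage. The function names are hypothetical.

```python
def discrimination_duration(simple_rt, discrimination_rt):
    """Time attributed to the mental event, under the additive assumption."""
    # Rounding hides floating-point noise in the subtraction.
    return round(discrimination_rt - simple_rt, 3)

def simple_reaction_time(discrimination_rt, stage_duration):
    """Basic (simple) reaction time, recovered by the same subtraction."""
    return round(discrimination_rt - stage_duration, 3)

print(discrimination_duration(0.17, 0.26))  # 0.09
print(simple_reaction_time(0.23, 0.07))     # 0.16
```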

In the early reaction time research, reaction times for seemingly complex events were occasionally equal to the reaction times for simpler events. How could this have happened?
A)the simple additive model was inadequate
B)the equipment must have malfunctioned
C)the complex event was really much simpler
D)the experimenters did not have sufficient training

A test with a minimum amount of measurement error is said to be
A)reliable
B)valid
C)both alternatives a. and b.
D)none of the above

On a reaction time test, which of the following factors could contribute to measurement error?
A)subject attentiveness
B)equipment irregularities
C)increased boredom if the task lasts too long
D)all of the above

If an IQ test is reliable and a child scores 115, what is known?
A)the child will only be an average student in school
B)the student will perform at a level in school that is about 15% higher than others
C)the IQ test is a good measure of intellect
D)the person has a higher IQ than someone who scores 95

Which of the following is true about measures of behavior?
A)they are more likely to be valid than reliable
B)they all include some degree of measurement error
C)measurement error can be eliminated completely by careful researchers
D)if a measure has content validity, it is almost certain to be reliable

Magazine surveys about your mental health
A)have been shown to be highly reliable
B)have criterion validity but not construct validity
C)have construct validity but not criterion validity
D)have face validity but not construct validity

A test might not appear to be a good test of intelligence and yet it might do a very good job of predicting how well someone does in school. That is, this test
A)has both face validity and predictive validity
B)has criterion validity but not face validity
C)is reliable but not valid
D)has criterion validity but lacks reliability

If a personality test is able to produce results similar to results produced by other valid measures of personality, then we can say the test has good
A)predictive validity
B)concurrent validity
C)face validity
D)reliability

For each of the following, a construct is paired with a measure. Which measure has the least content validity?
A)creativity - crossword puzzle completion
B)delay of gratification - choosing to wait for a larger reward
C)verbal intelligence - vocabulary
D)short-term memory - recall of nonsense syllables

On the "Connectedness to Nature" scale, convergent validity was established when it was found that a correlation existed between scores on the scale and
A)SAT scores
B)scores on a test of social desirability
C)scores on the NEP ("New Ecological Paradigm") test
D)scores on a shyness test

Which of the following is true about construct validity?
A)it is never established in a single study
B)it is concerned with the question of whether the construct being measured is a meaningful construct
C)it is concerned with the question of whether a tool developed to measure a construct is the best one available
D)all of the above

On the "Connectedness to Nature" scale, divergent validity was established when it was found that no correlation existed between scores on the scale and
A)SAT scores
B)scores on a test of social desirability
C)both alternatives a. and b.
D)none of the above - the outcomes in alternatives a. and b. supported convergent validity

A study examines scores on an employment test and job performance six months later. This study is most likely attempting to establish
A)criterion validity
B)face validity
C)reliability
D)construct validity

When phrenologists assessed the trait of "destructiveness" by measuring skull contour, their measurements were
A)reliable and valid
B)reliable but not valid
C)valid but not reliable
D)neither reliable nor valid

The results of an inkblot test might be quite different when given to the same person on two different occasions. If this is the case, then based on this fact alone, the inkblot test is
A)not reliable but probably valid
B)not reliable
C)not valid
D)neither reliable nor valid

A test is said to be reliable if ___________, and valid if it _____________.
A)its results are repeatable; measures what it is supposed to measure
B)has a sufficiently high amount of measurement error; measures what it is supposed to
C)its results are repeatable; is low in measurement error
D)measures what it is supposed to measure; is low in measurement error

Classification is the major purpose of a(n) ________ scale of measurement.
A)nominal
B)ordinal
C)interval
D)ratio

Guéguen and Ciccotti (2008) tested whether having a dog present would lead women to provide their phone numbers to inquiring men. In this study, a nominal scale of measurement was used for which variable?
A)gender
B)whether or not a dog was present
C)whether or not phone numbers were provided
D)a nominal scale of measurement was not used in this study

When considering a student's overall standing in a class (first, second, third, etc.), which measurement scale is being used?
A)nominal
B)ordinal
C)interval
D)ratio

When using a(n) ______ measurement scale, the most that can be said is that one score is greater than another.
A)nominal
B)ordinal
C)interval
D)ratio

All of the following are examples of ratio scale measures except
A)reaction time
B)number of errors in maze running
C)Grade Point Average (GPA)
D)number of words recalled on a memory test

Consider the experiment on multiple choice answer changing. What measurement scale was used in reporting the results of this study?
A)nominal
B)ordinal
C)interval
D)ratio

The main difference between an interval and a ratio scale is that an interval scale
A)is used only for placing participants into categories
B)does not have a true zero point
C)does not preserve a rank order in the assignment of numbers
D)has equal intervals between numbers

Which of the following is true about interval and ratio scales?
A)in a ratio scale, a score of zero means the absence of the phenomenon being measured
B)in an interval scale, it is not possible to achieve a score of zero
C)equal intervals exist in interval scales, but such is not the case in ratio scales
D)equal intervals exist in ratio scales, but such is not the case in interval scales

In Sheldon's (1940) research, 7-point ______ scales were used to measure body type and temperament.
A)nominal
B)ordinal
C)interval
D)ratio

In the study by Korn, Davis, and Davis (1991), it was determined that department chairs rated B. F. Skinner higher on their "all time" list than historians did. The study featured a(n) _______ scale of measurement.
A)nominal
B)ordinal
C)interval
D)ratio

Psychologists generally assume that most personality and IQ tests use a(n) _____ scale.
A)nominal
B)ordinal
C)interval
D)ratio

Descriptive statistics
A)enable the researcher to determine the significance of results
B)summarize the data of an experiment
C)both alternatives a. and b.
D)none of the above

Descriptive statistic is to inferential statistic as _________ is to ________.
A)mean; standard deviation
B)central tendency; variability
C)sample; population
D)median; range

When is the median a better measure of central tendency than the mean?
A)when several of the scores are the same score
B)when there are a few scores that are much higher or lower than the others
C)when the scores are normally distributed
D)none of the above; the mean is always preferred

Five children are tested for IQ and their scores are: 110, 160, 100, 100, 110. What is the best way to describe the central tendency of these scores?
A)the mode
B)the median
C)the mean
D)the range
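
To see how the single high score affects each summary, the three measures of central tendency can be computed for the five scores in this item. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

scores = [110, 160, 100, 100, 110]

mean_iq = statistics.mean(scores)             # pulled upward by the one score of 160
median_iq = statistics.median(scores)         # middle score after sorting
modes = sorted(statistics.multimode(scores))  # every value tied for most frequent

print(mean_iq, median_iq, modes)  # 116 110 [100, 110]
```

The mean sits above four of the five scores, while the median does not move with the extreme value.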

Five children are tested for IQ. For which sets of scores will the median and the mode both be the same?
A)110, 150, 100, 110, 115
B)90, 100, 120, 110, 90
C)100, 180, 90, 110, 80
D)90, 90, 100, 120, 100

What is the relationship between a frequency distribution (FD) and a normal distribution (ND)?
A)FD uses the median as the primary measure of central tendency; ND uses the mean
B)FD is a hypothetical distribution; ND is based on actual data
C)FD is a distribution of actual scores, while ND is a hypothetical distribution
D)FD is always bell shaped, while ND may or may not be bell shaped

A graph in which each vertical bar corresponds to the frequency of some score is called a
A)normal curve
B)Gee Whiz graph
C)histogram
D)line graph

In a normal distribution, what (approximate) percentage of scores likely fall within one standard deviation of the mean?
A)50%
B)68%
C)75%
D)95%
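
The proportion of a normal distribution lying within one standard deviation of the mean can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library and assumes a standard normal curve (mean 0, standard deviation 1); the proportion is the same for any normal distribution.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve between one SD below and one SD above the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(round(within_one_sd, 4))  # 0.6827
```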

Knowing the standard deviation of a set of scores, it is possible to calculate
A)range
B)variance
C)the mean
D)the frequency distribution
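
The link between standard deviation and variance is a single squaring step. A minimal sketch, using a small set of hypothetical scores that are not taken from the quiz:

```python
import math
import statistics

scores = [4, 8, 6, 5, 7]  # hypothetical data for illustration

sd = statistics.pstdev(scores)           # population standard deviation
variance = statistics.pvariance(scores)  # population variance

# Squaring the standard deviation recovers the variance.
print(math.isclose(sd ** 2, variance))  # True
```

The mean, range, and frequency distribution cannot be recovered from the standard deviation alone.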

If a researcher wishes to establish groups of individuals based on low and high scores on some measure, they may use the ___________ to find the 25th and 75th percentiles of the set of scores to create their groups.
A)variance
B)standard deviation
C)range
D)interquartile range
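
Forming low and high groups from the 25th and 75th percentiles might look like the sketch below. The scores are hypothetical, and `statistics.quantiles` with its default method is only one of several conventions for locating quartiles, so other software may give slightly different cut points.

```python
import statistics

scores = [55, 61, 64, 68, 70, 73, 77, 80, 84, 90, 95]  # hypothetical data

# n=4 returns the three cut points that divide the data into quarters.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # interquartile range: spread of the middle half of the scores

low_group = [s for s in scores if s <= q1]   # at or below the 25th percentile
high_group = [s for s in scores if s >= q3]  # at or above the 75th percentile

print(q1, q3, iqr)
print(low_group, high_group)
```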

When summarizing data, why is it important to report both the mean and the standard deviation?
A)two sets of data could have the same mean but different amounts of variability
B)this way both descriptive and inferential statistics are covered
C)this way the null hypothesis can be evaluated
D)this enables the researcher to avoid Type I and Type II errors
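
A short example shows why the mean alone can mislead. Both data sets below are hypothetical and were chosen so that their means match while their spreads differ.

```python
import statistics

set_a = [48, 49, 50, 51, 52]  # hypothetical: scores clustered near the mean
set_b = [10, 30, 50, 70, 90]  # hypothetical: scores widely spread

mean_a, mean_b = statistics.mean(set_a), statistics.mean(set_b)
sd_a, sd_b = statistics.pstdev(set_a), statistics.pstdev(set_b)

print(mean_a, mean_b)                # 50 50
print(round(sd_a, 2), round(sd_b, 2))  # 1.41 28.28
```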

Normally, which of the following outcomes is most desired by the researcher?
A)reject Ho; Ho is true
B)reject Ho; Ho is false
C)fail to reject Ho; Ho is true
D)fail to reject Ho; Ho is false

A Type I error occurs when the researcher
A)rejects Ho, but Ho is true
B)rejects Ho, but Ho is false
C)fails to reject Ho, but Ho is true
D)fails to reject Ho, but Ho is false

In a study examining gender differences in verbal fluency in children, the null hypothesis is that
A)girls and boys perform equally
B)girls will most likely outperform boys
C)boys will most likely outperform girls
D)could be either alternative b. or c., depending on the researcher's prediction

A Type II error occurs when the researcher
A)rejects Ho, but Ho is true
B)rejects Ho, but Ho is false
C)fails to reject Ho, but Ho is true
D)fails to reject Ho, but Ho is false

In a maze learning study, a researcher compares the performance of laboratory-bred rats and wild rats, hoping to find that the wild rats are better. Which of the following is true?
A)the null hypothesis is that wild rats will learn faster than lab rats
B)a Type II error would be to find a difference in the study when no true difference exists
C)a Type I error would be to find no difference in the study when a true difference exists
D)if wild rats really are better, but the researcher fails to reject the null hypothesis, then a Type II error has occurred

In a maze learning study, a researcher compares the performance of laboratory-bred rats and wild rats, hoping to find that the wild rats are better. Which of the following would be a Type II error?
A)the null hypothesis is rejected when it is in fact true
B)the wild rats outperform the lab rats in the study
C)no difference is found in the study, but wild rats are in truth better maze learners
D)lab rats learn faster in the study, but in truth there is no difference

In a "Gee whiz" graph,
A)the differences are so obvious that an inferential analysis is not needed
B)the hoped-for differences fail to materialize
C)apparent differences are exaggerated by failing to label the Y-axis appropriately
D)there are too many lines, making it impossible to interpret

Which of the following is true about Type I errors?
A)the probability of one occurring is equal to the alpha level
B)they cannot occur if the statistical test is powerful enough
C)they occur when a true effect exists, but we fail to discover it in our study
D)if one occurs, there is no chance that your study will be published

Which of the following is true about Type II errors?
A)the probability of one occurring is equal to the alpha level
B)they cannot occur if the statistical test is powerful enough
C)they occur when a true effect exists, but we fail to discover it in our study
D)they occur when we reject the null hypothesis, when we really should not do so

Researchers are happy whenever
A)systematic variance is large
B)error variance is small
C)both alternatives a. and b.
D)none of the above

A set of data has a mean of 12 and a 95% confidence interval of 10-14. What does this mean?
A)the standard deviation will be 14-10, or 4
B)you can be 95% sure that 12 is the population mean
C)in order for the mean to be significantly different from some other mean, the scores producing the other mean cannot be between 10 and 14
D)you can be quite sure that the population mean falls somewhere between 10 and 14

What is accomplished by a meta-analysis?
A)this analysis determines the probability of making both type I and type II errors
B)this is the statistical technique used to measure power
C)this is the term used to describe the complete statistical analysis of data, both the descriptive and the inferential analyses
D)this is a statistical procedure that combines effect sizes of several studies

The power of a statistical analysis refers to
A)the chances of rejecting a false null hypothesis
B)the chances of rejecting a true null hypothesis
C)the chances of rejecting any null hypothesis
D)whether the analysis involves descriptive or inferential statistics (inferential are more powerful)

Null hypothesis significance testing answers the question ________, while an effect size analysis answers the question ________.
A)how much of an effect did one factor have on another?; is the difference significant?
B)is the difference significant?; how much of an effect did one factor have on another?
C)can we reject Ho; is the sample large enough?
D)have we made a Type I error?; have we made a Type II error?

The tendency for studies with statistically significant results to be published more so than studies with nonsignificant results is called ________.
A)a publication bias
B)an effect size
C)a meta-analysis
D)power

Suppose there are 100 studies that failed to demonstrate an effect of gender on false memories, but 8 studies that showed a gender difference. One may conclude from reading the published studies that there is a gender difference, but one may be incorrect due to
A)incomplete confidence intervals.
B)error variance.
C)systematic variance.
D)a phenomenon called a file drawer effect.

Researchers often report effect sizes to demonstrate
A)the relationship between systematic variance and error variance.
B)confidence intervals.
C)the size or magnitude of the effect.
D)statistical power.

