Deck 11: Regression Analysis: Statistical Inference

Question
One method of dealing with heteroscedasticity is to try a logarithmic transformation of the data.
Question
One of the potential characteristics of an outlier is that the value of the dependent variable is much larger or smaller than predicted by the regression line.
Question
In order to estimate with 90% confidence a particular value of Y for a given value of X in a simple linear regression problem, a random sample of 20 observations is taken. The appropriate t-value that would be used is 1.734.
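A rough check of the value quoted in this card, assuming the usual two-sided interval with n - 2 degrees of freedom in simple linear regression:

```latex
\mathrm{df} = n - 2 = 20 - 2 = 18, \qquad
\frac{\alpha}{2} = \frac{1 - 0.90}{2} = 0.05, \qquad
t_{0.05,\,18} \approx 1.734
```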
Question
In a multiple regression problem involving 30 observations and four explanatory variables, SST = 800 and SSE = 240. The value of the F-statistic for testing the significance of this model is 14.583.
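The arithmetic behind this card can be sketched as follows. This is a minimal illustration, assuming the standard ANOVA definitions F = MSR / MSE with k and n - k - 1 degrees of freedom; the function name `overall_f` is hypothetical.

```python
def overall_f(sst, sse, n, k):
    """Return (F, df_num, df_den) for the overall-significance test."""
    ssr = sst - sse          # explained variation
    df_num = k               # one degree of freedom per explanatory variable
    df_den = n - k - 1       # residual degrees of freedom
    msr = ssr / df_num       # mean square due to regression
    mse = sse / df_den       # mean square error
    return msr / mse, df_num, df_den


# Numbers taken from the card: 30 observations, 4 explanatory variables.
f_stat, df1, df2 = overall_f(sst=800, sse=240, n=30, k=4)
print(round(f_stat, 3), df1, df2)
```

With 40 observations and four explanatory variables, the same function returns 4 and 35 as the numerator and denominator degrees of freedom.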
Question
In time series data, errors are often not probabilistically independent.
Question
In multiple regression with k explanatory variables, the t-tests of the individual coefficients allow us to determine whether $B_i \neq 0$ (for i = 1, 2, …, k), which tells us whether a linear relationship exists between $x_i$ and Y.
Question
If exact multicollinearity exists, that means that there is redundancy in the data.
Question
Multicollinearity is a situation in which two or more of the explanatory variables are highly correlated with each other.
Question
Suppose that one equation has 3 explanatory variables and an F-ratio of 49. Another equation has 5 explanatory variables and an F-ratio of 38. The first equation will always be considered a better model.
Question
In simple linear regression, if the error variable $\varepsilon$ is normally distributed, the test statistic for testing $H_0: B_1 = 0$ is t-distributed with n - 2 degrees of freedom.
Question
In order to test the significance of a multiple regression model involving 4 explanatory variables and 40 observations, the numerator and denominator degrees of freedom for the critical value of F are 4 and 35, respectively.
Question
In regression analysis, the total variation in the dependent variable Y, measured by $\sum (y_i - \bar{y})^2$ and referred to as SST, can be decomposed into two parts: the explained variation, measured by SSR, and the unexplained variation, measured by SSE.
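The decomposition can be checked numerically on a small example. The sketch below fits a least-squares line with one explanatory variable; the data and the helper names `fit_line` and `variation_parts` are made up for illustration.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one explanatory variable."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def variation_parts(x, y):
    """Return (SST, SSR, SSE) for the fitted least-squares line."""
    intercept, slope = fit_line(x, y)
    y_bar = sum(y) / len(y)
    fitted = [intercept + slope * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)          # explained
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained
    return sst, ssr, sse


# Hypothetical data, not taken from the deck.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sst, ssr, sse = variation_parts(x, y)
print(round(sst, 6), round(ssr, 6), round(sse, 6))
```

Because SSE is a sum of squares and cannot be negative, the identity SST = SSR + SSE also shows why SSR can never exceed SST.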
Question
In multiple regression, the problem of multicollinearity affects the t-tests of the individual coefficients as well as the F-test in the analysis of variance for regression, since the F-test combines these t-tests into a single test.
Question
In a multiple regression analysis involving 4 explanatory variables and 40 data points, the degrees of freedom associated with the sum of squared errors, SSE, is 35.
Question
In multiple regression, if there is multicollinearity between independent variables, the t-tests of the individual coefficients may indicate that some variables are not linearly related to the dependent variable, when in fact they are.
Question
In a simple linear regression problem, if the standard error of estimate $S_e$ = 15 and n = 8, then the sum of squares for error, SSE, is 1,350.
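The figure in this card follows from rearranging the usual definition of the standard error of estimate in simple linear regression (a worked sketch, assuming n - 2 residual degrees of freedom):

```latex
S_e = \sqrt{\frac{SSE}{n - 2}}
\quad\Longrightarrow\quad
SSE = S_e^2 \,(n - 2) = 15^2 \times (8 - 2) = 225 \times 6 = 1350
```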
Question
A multiple regression model involving 40 observations and 4 explanatory variables produces SST = 1000 and SSR = 804. The value of MSE is 5.6.
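The value quoted here can be traced step by step, assuming MSE is defined as SSE divided by n - k - 1:

```latex
SSE = SST - SSR = 1000 - 804 = 196, \qquad
MSE = \frac{SSE}{n - k - 1} = \frac{196}{40 - 4 - 1} = \frac{196}{35} = 5.6
```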
Question
Multiple regression represents an improvement over simple regression because it allows any number of response variables to be included in the analysis.
Question
Heteroscedasticity means that the variability of Y values is larger for some X values than for others.
Question
When there is a group of explanatory variables that are in some sense logically related, all of them must be included in the regression equation.
Question
In a simple linear regression model, testing whether the slope $\beta_1$ of the population regression line could be zero is the same as testing whether or not the linear relationship between the response variable Y and the explanatory variable X is significant.
Question
One method of diagnosing heteroscedasticity is to plot the residuals against the predicted values of Y, then look for a change in the spread of the plotted values.
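A numeric stand-in for that visual check is sketched below: it compares the average size of the residuals for the lower and upper halves of the predicted values. This is only an informal illustration, not a formal test; the data and the helper names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one explanatory variable."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def spread_by_half(x, y):
    """Mean absolute residual for the lower and upper halves of the fitted values."""
    intercept, slope = fit_line(x, y)
    # Pair each fitted value with its residual, ordered by fitted value.
    pairs = sorted((intercept + slope * xi, yi - (intercept + slope * xi))
                   for xi, yi in zip(x, y))
    half = len(pairs) // 2
    lower = [abs(res) for _, res in pairs[:half]]
    upper = [abs(res) for _, res in pairs[half:]]
    return sum(lower) / len(lower), sum(upper) / len(upper)


# Hypothetical fan-shaped data: the scatter around the line widens as x grows.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.7, 4.6, 5.1, 9.2, 8.5, 13.8, 11.9, 18.4]

low_spread, high_spread = spread_by_half(x, y)
print(low_spread < high_spread)
```

A clearly larger spread at high predicted values is the numeric counterpart of the widening pattern one would look for in the residual plot.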
Question
In regression analysis, homoscedasticity refers to constant error variance.
Question
In testing the overall fit of a multiple regression model in which there are three explanatory variables, the null hypothesis is $H_0: B_1 = B_2 = B_3$.
Question
The residuals are observations of the error variable $\varepsilon$. Consequently, the minimized sum of squared deviations is called the sum of squared error, labeled SSE.
Question
The Durbin-Watson statistic can be used as a measure of autocorrelation.
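For reference, the statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below computes it directly; the residual series are made up for illustration and the function name is hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of time-ordered residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals: the first series drifts slowly (positive lag 1
# autocorrelation), the second flips sign every period (negative).
drifting = [1.0, 1.2, 0.9, 0.8, -0.5, -0.9, -1.1, -0.7]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_drifting = durbin_watson(drifting)
dw_alternating = durbin_watson(alternating)
print(dw_drifting < 2 < dw_alternating)
```

Values near 2 suggest little lag 1 autocorrelation; values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation.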
Question
The value of the sum of squares due to regression, SSR, can never be larger than the value of the sum of squares total, SST.
Question
Homoscedasticity means that the variability of Y values is the same for all X values.
Question
A confidence interval constructed around a point prediction from a regression model is called a prediction interval, because the actual point being estimated is not a population parameter.
Question
Which of the following would be considered a definition of an outlier?

A) An extreme value for one or more variables
B) A value whose residual is abnormally large in magnitude
C) Values for individual explanatory variables that fall outside the general pattern of the other observations
D) All of these options
Question
When determining whether to include or exclude a variable in regression analysis, if the p-value associated with the variable's t-value is above some accepted significance value, such as 0.05, then the variable:

A) is a candidate for inclusion
B) is a candidate for exclusion
C) is redundant
D) does not fit the guidelines of parsimony
Question
The assumptions of regression are: 1) there is a population regression line, 2) the dependent variable is normally distributed, 3) the standard deviation of the response variable remains constant as the explanatory variables increase, and 4) the errors are probabilistically independent.
Question
Which of the following is not one of the guidelines for including/excluding variables in a regression equation?

A) Look at t-value and associated p-value
B) Check whether t-value is less than or greater than 1.0
C) Variables are logically related to one another
D) Use economic or physical theory to make decision
E) All of these options are guidelines
Question
A backward procedure is a type of equation building procedure that begins with all potential explanatory variables in the regression equation and deletes them two at a time until further deletion would reduce the percentage of variation explained to a value less than 0.50.
Question
A scatterplot that exhibits a "fan" shape (the variation of Y increases as X increases) is an example of:

A) homoscedasticity
B) heteroscedasticity
C) autocorrelation
D) multicollinearity
Question
Suppose you run a regression of a person's height on his/her right and left foot sizes, and you suspect that there may be multicollinearity between the foot sizes. What types of problems might you see if your suspicions are true?

A) "Wrong" values for the coefficients for the left and right foot size
B) Large p-values for the coefficients for the left and right foot size
C) Small t-values for the coefficients for the left and right foot size
D) All of these options
Question
In multiple regression, a large value of the test statistic F indicates that most of the variation in Y is unexplained by the regression equation and that the model is useless. A small value of F indicates that most of the variation in Y is explained by the regression equation and that the model is useful.
Question
The ________ can be used to test for autocorrelation.

A) regression coefficient
B) correlation coefficient
C) Durbin-Watson statistic
D) F-test or t-test
Question
In regression analysis, the unexplained part of the total variation in the response variable Y is referred to as sum of squares due to regression, SSR.
Question
A forward procedure is a type of equation building procedure that begins with only one explanatory variable in the regression equation and successively adds one variable at a time until no remaining variables make a significant contribution.
Question
Another term for constant error variance is:

A) homoscedasticity
B) heteroscedasticity
C) autocorrelation
D) multicollinearity
Question
The t-value for testing $H_0: B_i = 0$ is calculated using which of the following equations:

A) $n - k - 1$
B) $\sum (X_i / Y_i)$
C) $B_j / s_i$
D) $b_i / s_{b_i}$
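As an illustration of how a coefficient's t-value is formed, the sketch below divides an estimated slope by its standard error in the simple (one-variable) case. It assumes the textbook formula for the slope's standard error, $s_e / \sqrt{\sum (x_i - \bar{x})^2}$; the data and the function name are hypothetical.

```python
import math


def slope_t_value(x, y):
    """Return (t, df) for testing a zero slope in simple linear regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # estimated slope
    b0 = y_bar - b1 * x_bar             # estimated intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))      # standard error of estimate
    s_b1 = s_e / math.sqrt(sxx)         # standard error of the slope
    return b1 / s_b1, n - 2


# Hypothetical data, not taken from the deck.
t_value, df = slope_t_value([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(t_value, 4), df)
```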
Question
The appropriate hypothesis test for a regression coefficient is:

A) $H_0: B \neq 0$, $H_\alpha: B = 0$
B) $H_0: B = 0$, $H_\alpha: B \neq 0$
C) $H_0: B = 1$, $H_\alpha: B \neq 1$
D) None of these options
Question
The objective typically used in the three types of equation-building procedures is to:

A) find the equation with a small $s_e$
B) find the equation with a large $R^2$
C) find the equation with a small $s_e$ and a large $R^2$
D) find the equation with the largest F-statistic
Question
The value k in the number of degrees of freedom, n - k - 1, for the sampling distribution of the regression coefficients represents:

A) the sample size
B) the population size
C) the number of coefficients in the regression equation, including the constant
D) the number of independent variables included in the equation
Question
In the standardized value $(b_j - B_j) / s_{b_j}$, the symbol $s_{b_j}$ represents the:

A) mean of $b_j$
B) variance of $b_j$
C) standard error of $b_j$
D) degrees of freedom of $b_j$
Question
The appropriate hypothesis test for an ANOVA test is:

A) $H_0$: all $B \neq 0$, $H_\alpha$: at least one $B = 0$
B) $H_0$: all $B = 0$, $H_\alpha$: at least one $B \neq 0$
C) $H_0$: at least one $B \neq 0$, $H_a$: all $B = 0$
D) $H_0$: at least one $B = 0$, $H_\alpha$: all $B \neq 0$
Question
Suppose you forecast the values of all of the independent variables and insert them into a multiple regression equation and obtain a point prediction for the dependent variable. You could then use the standard error of the estimate to obtain an approximate

A) confidence interval
B) prediction interval
C) hypothesis test
D) independence test
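One common rule of thumb builds such an approximate interval as the point prediction plus or minus about two standard errors of estimate. The sketch below assumes that multiplier of 2 (roughly 95% coverage) and ignores the extra uncertainty from estimating the coefficients; the numbers and the function name are hypothetical.

```python
def approx_prediction_interval(point_prediction, s_e, multiplier=2.0):
    """Rough interval: point prediction +/- multiplier * standard error of estimate."""
    half_width = multiplier * s_e
    return point_prediction - half_width, point_prediction + half_width


# Hypothetical point prediction of 1000 with a standard error of estimate of 25.
low, high = approx_prediction_interval(1000.0, 25.0)
print(low, high)
```

The approximation is most reasonable when the forecast values of the explanatory variables lie near the middle of the sample; far from it, the true interval is wider.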
Question
Which of the following is the relevant sampling distribution for regression coefficients?

A) Normal distribution
B) t-distribution with n-1 degrees of freedom
C) t-distribution with n-1-k degrees of freedom
D) F-distribution with n-1-k degrees of freedom
Question
The ANOVA table splits the total variation into two parts. They are the

A) acceptable and unacceptable variation
B) adequate and inadequate variation
C) resolved and unresolved variation
D) explained and unexplained variation
Question
Which of the following is not one of the assumptions of regression?

A) There is a population regression line
B) The response variable is normally distributed
C) The standard deviation of the response variable increases as the explanatory variables increase
D) The errors are probabilistically independent
Question
A point that "tilts" the regression line toward it is referred to as a(n):

A) magnetic point
B) influential point
C) extreme point
D) explanatory point
Question
Determining which variables to include in regression analysis by estimating a series of regression equations by successively adding or deleting variables according to prescribed rules is referred to as:

A) elimination regression
B) forward regression
C) backward regression
D) stepwise regression
Question
The test statistic in an ANOVA analysis is:

A) the t-statistic
B) the z-statistic
C) the F-statistic
D) the Chi-square statistic
Question
In regression analysis, extrapolation is performed when you:

A) attempt to predict beyond the limits of the sample
B) have to estimate some of the explanatory variable values
C) have to use a lag variable as an explanatory variable in the model
D) don't have observations for every period in the sample
Question
The term autocorrelation refers to:

A) analyzed data refers to itself
B) sample is related too closely to the population
C) data are in a loop (values repeat themselves)
D) time series variables are usually related to their own past values
Question
In regression analysis, multicollinearity refers to:

A) the response variables being highly correlated
B) the explanatory variables being highly correlated
C) the response variable(s) and the explanatory variable(s) are highly correlated with one another
D) the response variables are highly correlated over time
Question
Time series data often exhibits which of the following characteristics?

A) homoscedasticity
B) heteroscedasticity
C) autocorrelation
D) multicollinearity
Question
Many statistical packages have three types of equation-building procedures. They are:

A) forward, linear and non-linear
B) forward, backward and stepwise
C) simple, complex and stepwise
D) inclusion, exclusion and linear
Question
The error term represents the vertical distance from any point to the

A) estimated regression line
B) population regression line
C) value of the Y's
D) mean value of the X's
Question
A researcher can check whether the errors are normally distributed by using:

A) a t-test or an F-test
B) the Durbin-Watson statistic
C) a frequency distribution or the value of the regression coefficient
D) a histogram or a Q-Q plot
Question
In regression analysis, the ANOVA table analyzes:

A) the variation of the response variable Y
B) the variation of the explanatory variable X
C) the total variation of all variables
D) All of these options
Question
If you can determine that the outlier is not really a member of the relevant population, then it is appropriate and probably best to:

A) average it
B) reduce it
C) delete it
D) leave it
Question
Which of the following definitions best describes parsimony?

A) Explaining the most with the least
B) Explaining the least with the most
C) Being able to explain all of the change in the response variable
D) Being able to predict the value of the response variable far into the future
Question
Which of the following is true regarding regression error, e?

A) it is the same as a residual
B) it can be calculated from the observed data
C) it cannot be calculated from the observed data
D) it is unbiased
Question
Forward regression:

A) begins with all potential explanatory variables in the equation and deletes them one at a time until further deletion would do more harm than good.
B) adds and deletes variables until an optimal equation is achieved.
C) begins with no explanatory variables in the equation and successively adds one at a time until no remaining variables make a significant contribution.
D) randomly selects the optimal number of explanatory variables to be used
Question
Which of the following is not one of the assumptions of regression?

A) There is a population regression line
B) The response variable is not normally distributed
C) The response variable is normally distributed
D) The errors are probabilistically independent
The manager of a commuter rail transportation system was recently asked by his governing board to predict the demand for rides in the large city served by the transportation network. The system manager has collected data on variables thought to be related to the number of weekly riders on the city's rail system. The table shown below contains these data.

Year  Weekly Riders  Price per Ride  Population  Income  Parking Rate
   1           1200           $0.15        1800  $2,900         $0.50
   2           1190           $0.15        1790  $3,100         $0.50
   3           1195           $0.15        1780  $3,200         $0.60
   4           1100           $0.25        1778  $3,250         $0.60
   5           1105           $0.25        1750  $3,275         $0.60
   6           1115           $0.25        1740  $3,290         $0.70
   7           1130           $0.25        1725  $4,100         $0.75
   8           1095           $0.30        1725  $4,300         $0.75
   9           1090           $0.30        1720  $4,400         $0.75
  10           1087           $0.30        1705  $4,600         $0.80
  11           1080           $0.30        1710  $4,815         $0.80
  12           1020           $0.40        1700  $5,285         $0.80
  13           1010           $0.40        1695  $5,665         $0.85
  14           1010           $0.40        1695  $5,800         $1.00
  15           1005           $0.40        1690  $5,900         $1.05
  16            995           $0.40        1630  $5,915         $1.05
  17            930           $0.75        1640  $6,325         $1.05
  18            915           $0.75        1635  $6,500         $1.10
  19            920           $0.75        1630  $6,612         $1.25
  20            940           $0.75        1620  $6,883         $1.30
  21            950           $0.75        1615  $7,005         $1.50
  22            910           $1.00        1605  $7,234         $1.55
  23            930           $1.00        1590  $7,500         $1.65
  24            933           $1.00        1595  $7,600         $1.75
  25            940           $1.00        1590  $7,800         $1.75
  26            948           $1.00        1600  $8,000         $1.90
  27            955           $1.00        1610  $8,100         $2.00

The variables "weekly riders" and "population" are measured in thousands, and the variables "price per ride", "income", and "parking rate" are measured in dollars.
Question
If residuals separated by one period are autocorrelated, this is called:

A) simple autocorrelation
B) redundant autocorrelation
C) time 1 autocorrelation
D) lag 1 autocorrelation
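To make the idea concrete, the sketch below correlates each residual with the one from the previous period. It uses a common sample formula for the lag 1 coefficient (one of several conventions); the residual series and the function name are hypothetical.

```python
def lag1_autocorrelation(residuals):
    """Sample correlation between residuals one period apart."""
    n = len(residuals)
    mean = sum(residuals) / n
    numerator = sum((residuals[t] - mean) * (residuals[t - 1] - mean)
                    for t in range(1, n))
    denominator = sum((e - mean) ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals: a steady drift versus a sign flip every period.
r_trend = lag1_autocorrelation([-3, -2, -1, 0, 1, 2, 3])
r_flip = lag1_autocorrelation([1, -1, 1, -1])
print(round(r_trend, 3), round(r_flip, 3))
```

A clearly positive value means neighbouring residuals tend to share a sign; a clearly negative value means they tend to alternate.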
Question
When the error variance is nonconstant, it is common to see the variation increase as the explanatory variable increases (you will see a "fan shape" in the scatterplot). There are two ways you can deal with this phenomenon. These are:

A) the weighted least squares and a logarithmic transformation
B) the partial F and a logarithmic transformation
C) the weighted least squares and the partial F
D) stepwise regression and the partial F
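The effect of a logarithmic transformation on a fan shape can be illustrated with made-up numbers. The sketch assumes errors that are proportional to the level of Y (here Y is 25% above or 20% below X); weighted least squares is not shown.

```python
import math

# Hypothetical values: at each x, one observation 25% above and one 20% below.
x = [10, 20, 40, 80]
upper = [1.25 * xi for xi in x]
lower = [0.80 * xi for xi in x]

# On the raw scale the vertical gap widens with x (the "fan").
raw_gaps = [u - l for u, l in zip(upper, lower)]

# On the log scale the gap is the same at every x.
log_gaps = [math.log(u) - math.log(l) for u, l in zip(upper, lower)]

print(raw_gaps)
print([round(g, 4) for g in log_gaps])
```

When the spread grows roughly in proportion to the level of Y, as assumed here, taking logs turns it into a roughly constant spread.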
Deck 11: Regression Analysis: Statistical Inference
1
One method of dealing with heteroscedasticity is to try a logarithmic transformation of the data.
True
2
One of the potential characteristics of an outlier is that the value of the dependent variable is much larger or smaller than predicted by the regression line.
True
3
In order to estimate with 90% confidence a particular value of Y for a given value of X in a simple linear regression problem, a random sample of 20 observations is taken. The appropriate t-value that would be used is 1.734.
True
4
In a multiple regression problem involving 30 observations and four explanatory variables,SST = 800 and SSE = 240.The value of the F-statistic for testing the significance of this model is 14.583.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
5
In time series data,errors are often not probabilistically independent.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
6
In multiple regression with k explanatory variables,the t-tests of the individual coefficients allows us to determine whether Bi0B _ { i } \neq 0
(for i = 1,2,…. ,k),which tells us whether a linear relationship exists between xx
and Y.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
7
If exact multicollinearlity exists,that means that there is redundancy in the data.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
8
Multicollinearity is a situation in which two or more of the explanatory variables are highly correlated with each other.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
9
Suppose that one equation has 3 explanatory variables and an F-ratio of 49.Another equation has 5 explanatory variables and an F-ratio of 38.The first equation will always be considered a better model.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
10
In simple linear regression,if the error variable ε\varepsilon
is normally distributed,the test statistic for testing H0:B1=0H _ { 0 } : B _ { 1 } = 0
is t-distributed with n - 2 degrees of freedom.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
11
In order to test the significance of a multiple regression model involving 4 explanatory variables and 40 observations,the numerator and denominator degrees of freedom for the critical value of F are 4 and 35,respectively.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
12
In regression analysis,the total variation in the dependent variable Y,measured by (yiyˉ)2\sum \left( y _ { i } - \bar { y } \right) ^ { 2 }
and referred to as SST,can be decomposed into two parts: the explained variation,measured by SSR,and the unexplained variation,measured by SSE.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
13
In multiple regression,the problem of multicollinearity affects the t-tests of the individual coefficients as well as the F-test in the analysis of variance for regression,since the F-test combines these t-tests into a single test.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
14
In a multiple regression analysis involving 4 explanatory variables and 40 data points,the degrees of freedom associated with the sum of squared errors,SSE,is 35.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
15
In multiple regression,if there is multicollinearity between independent variables,the t-tests of the individual coefficients may indicate that some variables are not linearly related to the dependent variable,when in fact they are.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
16
In a simple linear regression problem,if the standard error of estimate SeS _ { e }
= 15 and n = 8,then the sum of squares for error,SSE,is 1,350.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
17
A multiple regression model involves 40 observations and 4 explanatory variables produces SST = 1000 and SSR = 804.The value of MSE is 5.6.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
18
Multiple regression represents an improvement over simple regression because it allows any number of response variables to be included in the analysis.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
19
Heteroscedasticity means that the variability of Y values is larger for some X values
than for others.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
20
When there is a group of explanatory variables that are in some sense logically related,all of them must be included in the regression equation.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
21
In a simple linear regression model,testing whether the slope β1\beta _ { 1 }
of the population regression line could be zero is the same as testing whether or not the linear relationship between the response variable Y and the explanatory variable X is significant.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
22
One method of diagnosing heteroscedasticity is to plot the residuals against the predicted values of Y,then look for a change in the spread of the plotted values.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
23
In regression analysis,homoscedasticity refers to constant error variance.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
24
In testing the overall fit of a multiple regression model in which there are three explanatory variables,the null hypothesis is H0:B1=B2=B3H _ { 0 } : B _ { 1 } = B _ { 2 } = B _ { 3 }
.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
25
The residuals are observations of the error variable ε\varepsilon
.Consequently,the minimized sum of squared deviations is called the sum of squared error,labeled SSE.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
26
The Durbin-Watson statistic can be used to measure of autocorrelation.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
27
The value of the sum of squares due to regression,SSR,can never be larger than the value of the sum of squares total,SST.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
28
Homoscedasticity means that the variability of Y values is the same for all X values.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
29
A confidence interval constructed around a point prediction from a regression model is called a prediction interval,because the actual point being estimated is not a population parameter
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
30
Which of the following would be considered a definition of an outlier?

A) An extreme value for one or more variables
B) A value whose residual is abnormally large in magnitude
C) Values for individual explanatory variables that fall outside the general pattern of the other observations
D) All of these options
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
31
When determining whether to include or exclude a variable in regression analysis,if the p-value associated with the variable's t-value is above some accepted significance value,such as 0.05,then the variable:

A) is a candidate for inclusion
B) is a candidate for exclusion
C) is redundant
D) not fit the guidelines of parsimony
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
32
The assumptions of regression are: 1)there is a population regression line,2)the dependent variable is normally distributed,3)the standard deviation of the response variable remains constant as the explanatory variables increase,and 4)the errors are probabilistically independent.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
33
Which of the following is not one of the guidelines for including/excluding variables in a regression equation?

A) Look at t-value and associated p-value
B) Check whether t-value is less than or greater than 1.0
C) Variables are logically related to one another
D) Use economic or physical theory to make decision
E) All of these options are guidelines
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
34
A backward procedure is a type of equation building procedure that begins with all potential explanatory variables in the regression equation and deletes them two at a time until further deletion would reduce the percentage of variation explained to a value less than 0.50.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
35
A scatterplot that exhibits a "fan" shape (the variation of Y increases as X increases)is an example of:

A) homoscedasticity
B) heteroscedasticity
C) autocorrelation
D) multicollinearity
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
36
Suppose you run a regression of a person's height on his/her right and left foot sizes,and you suspect that there may be multicollinearity between the foot sizes.What types of problems might you see if your suspicions are true?

A) "Wrong" values for the coefficients for the left and right foot size
B) Large p-values for the coefficients for the left and right foot size
C) Small t-values for the coefficients for the left and right foot size
D) All of these options
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
37
In multiple regressions,a large value of the test statistic F indicates that most of the variation in Y is unexplained by the regression equation and that the model is useless.A small value of F indicates that most of the variation in Y is explained by the regression equation and that the model is useful.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
38
The can be used to test for autocorrelation.

A) regression coefficient
B) correlation coefficient
C) Durbin-Watson statistic
D) F-test or t-test
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
39
In regression analysis,the unexplained part of the total variation in the response variable Y is referred to as sum of squares due to regression,SSR.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
40
A forward procedure is a type of equation building procedure that begins with only one explanatory variable in the regression equation and successively adds one variable at a time until no remaining variables make a significant contribution.
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
41
Another term for constant error variance is:

A) homoscedasticity
B) heteroscedasticity
C) autocorrelation
D) multicollinearity
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
42
The t-value for testing H0:Bi=0H _ { 0 } : B _ { i } = 0
Is calculated using which of the following equations:

A) n - k - 1
B)
(Xi/Yi)\sum \left( X _ { i } / Y _ { i } \right)
C)
Bj/siB _ { j } / s _ { i }
D)
bi/sbib _ { i } / s _ { b _ { i } }
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
43
The appropriate hypothesis test for a regression coefficient is:

A)
H0:B0,Hα:B=0H _ { 0 } : B \neq 0 , H _ { \alpha } : B = 0
B)
H0:B=0,Hα:B0H _ { 0 } : B = 0 , H _ { \alpha } : B \neq 0
C)
H0:B=1,Hα:B1H _ { 0 } : B = 1 , H _ { \alpha } : B \neq 1
D) None of these options
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
44
The objective typically used in the tree types of equation-building procedures are to:

A) find the equation with a small se
B) find the equation with a large R2
C) find the equation with a small se and a large R2
D) find the equation with the largest F-statistic
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
45
The value k in the number of degrees of freedom,n-k-1,for the sampling distribution of the regression coefficients represents:

A) the sample size
B) the population size
C) the number of coefficients in the regression equation,including the constant
D) the number of independent variables included in the equation
Unlock Deck
Unlock for access to all 69 flashcards in this deck.
Unlock Deck
k this deck
46
In the standardized value $\left( b_j - B_j \right) / s_{b_j}$, the symbol $s_{b_j}$ represents the:

A) mean of $b_j$
B) variance of $b_j$
C) standard error of $b_j$
D) degrees of freedom of $b_j$
47
The appropriate hypothesis test for an ANOVA test is:

A) $H_0$: all $B \neq 0$, $H_a$: at least one $B = 0$
B) $H_0$: all $B = 0$, $H_a$: at least one $B \neq 0$
C) $H_0$: at least one $B \neq 0$, $H_a$: all $B = 0$
D) $H_0$: at least one $B = 0$, $H_a$: all $B \neq 0$
48
Suppose you forecast the values of all of the independent variables and insert them into a multiple regression equation and obtain a point prediction for the dependent variable. You could then use the standard error of the estimate to obtain an approximate:

A) confidence interval
B) prediction interval
C) hypothesis test
D) independence test
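One common rule of thumb, stated here as an assumption rather than as this deck's own formula, treats the standard error of estimate $s_e$ as the approximate standard deviation of an individual prediction error. Under that assumption, a rough 95% interval around the point prediction $\hat{Y}$ would be:

```latex
\[
\hat{Y} \pm 2 s_e
\]
```

This shortcut ignores the extra uncertainty from estimating the coefficients and from forecasting the explanatory variables, so an exact interval would be somewhat wider.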
49
Which of the following is the relevant sampling distribution for regression coefficients?

A) Normal distribution
B) t-distribution with n-1 degrees of freedom
C) t-distribution with n-1-k degrees of freedom
D) F-distribution with n-1-k degrees of freedom
50
The ANOVA table splits the total variation into two parts. They are the:

A) acceptable and unacceptable variation
B) adequate and inadequate variation
C) resolved and unresolved variation
D) explained and unexplained variation
51
Which of the following is not one of the assumptions of regression?

A) There is a population regression line
B) The response variable is normally distributed
C) The standard deviation of the response variable increases as the explanatory variables increase
D) The errors are probabilistically independent
52
A point that "tilts" the regression line toward it is referred to as a(n):

A) magnetic point
B) influential point
C) extreme point
D) explanatory point
53
Determining which variables to include in regression analysis by estimating a series of regression equations by successively adding or deleting variables according to prescribed rules is referred to as:

A) elimination regression
B) forward regression
C) backward regression
D) stepwise regression
54
The test statistic in an ANOVA analysis is:

A) the t-statistic
B) the z-statistic
C) the F-statistic
D) the Chi-square statistic
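To make the ANOVA test statistic concrete, the sketch below recomputes the F-ratio from an earlier card in this deck (30 observations, four explanatory variables, SST = 800, SSE = 240). The function name `f_statistic` and its parameter names are illustrative choices, not part of the deck.

```python
def f_statistic(sst: float, sse: float, n: int, k: int) -> float:
    """Return the ANOVA F-ratio for a regression with k explanatory variables."""
    ssr = sst - sse              # explained variation (SSR = SST - SSE)
    msr = ssr / k                # mean square regression, k numerator df
    mse = sse / (n - k - 1)      # mean square error, n - k - 1 denominator df
    return msr / mse


# Figures from the earlier card: 30 observations, 4 explanatory variables
f_value = f_statistic(sst=800, sse=240, n=30, k=4)
print(round(f_value, 3))  # 14.583
```

The numerator and denominator degrees of freedom (here 4 and 25) are the same pair used to look up the critical F value.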
55
In regression analysis, extrapolation is performed when you:

A) attempt to predict beyond the limits of the sample
B) have to estimate some of the explanatory variable values
C) have to use a lag variable as an explanatory variable in the model
D) don't have observations for every period in the sample
56
The term autocorrelation refers to:

A) analyzed data refers to itself
B) sample is related too closely to the population
C) data are in a loop (values repeat themselves)
D) time series variables are usually related to their own past values
57
In regression analysis, multicollinearity refers to:

A) the response variables being highly correlated
B) the explanatory variables being highly correlated
C) the response variable(s) and the explanatory variable(s) are highly correlated with one another
D) the response variables are highly correlated over time
58
Time series data often exhibits which of the following characteristics?

A) homoscedasticity
B) heteroscedasticity
C) autocorrelation
D) multicollinearity
59
Many statistical packages have three types of equation-building procedures. They are:

A) forward, linear and non-linear
B) forward, backward and stepwise
C) simple, complex and stepwise
D) inclusion, exclusion and linear
60
The error term represents the vertical distance from any point to the:

A) estimated regression line
B) population regression line
C) value of the Y's
D) mean value of the X's
61
A researcher can check whether the errors are normally distributed by using:

A) a t-test or an F-test
B) the Durbin-Watson statistic
C) a frequency distribution or the value of the regression coefficient
D) a histogram or a Q-Q plot
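As an illustration of what a Q-Q plot compares, the sketch below pairs sorted, standardized residuals with the matching quantiles of a standard normal distribution. The residual values are hypothetical, and the plotting positions (i - 0.5) / n are one common convention assumed here; statistical packages may use a slightly different one.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Pair each standardized residual with its theoretical normal quantile."""
    n = len(residuals)
    m, s = mean(residuals), stdev(residuals)
    ordered = sorted((r - m) / s for r in residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are an assumed convention.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical residuals, for illustration only
example_residuals = [-1.8, -0.9, -0.4, -0.1, 0.2, 0.5, 1.0, 1.5]
points = qq_points(example_residuals)
for theo, obs in points:
    print(f"{theo:6.2f}  {obs:6.2f}")
```

If the residuals are roughly normal, the printed pairs fall close to a 45-degree line when plotted against each other.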
62
In regression analysis, the ANOVA table analyzes:

A) the variation of the response variable Y
B) the variation of the explanatory variable X
C) the total variation of all variables
D) All of these options
63
If you can determine that the outlier is not really a member of the relevant population, then it is appropriate and probably best to:

A) average it
B) reduce it
C) delete it
D) leave it
64
Which of the following definitions best describes parsimony?

A) Explaining the most with the least
B) Explaining the least with the most
C) Being able to explain all of the change in the response variable
D) Being able to predict the value of the response variable far into the future
65
Which of the following is true regarding the regression error, $\varepsilon$?

A) it is the same as a residual
B) it can be calculated from the observed data
C) it cannot be calculated from the observed data
D) it is unbiased
66
Forward regression:

A) begins with all potential explanatory variables in the equation and deletes them one at a time until further deletion would do more harm than good.
B) adds and deletes variables until an optimal equation is achieved.
C) begins with no explanatory variables in the equation and successively adds one at a time until no remaining variables make a significant contribution.
D) randomly selects the optimal number of explanatory variables to be used
67
Which of the following is not one of the assumptions of regression?

A) There is a population regression line
B) The response variable is not normally distributed
C) The response variable is normally distributed
D) The errors are probabilistically independent
The manager of a commuter rail transportation system was recently asked by his governing board to predict the demand for rides in the large city served by the transportation network. The system manager has collected data on variables thought to be related to the number of weekly riders on the city's rail system. The table below contains these data.

$$
\begin{array}{cccccc}
\text{Year} & \text{Weekly Riders} & \text{Price per Ride} & \text{Population} & \text{Income} & \text{Parking Rate} \\
1 & 1200 & \$0.15 & 1800 & \$2{,}900 & \$0.50 \\
2 & 1190 & \$0.15 & 1790 & \$3{,}100 & \$0.50 \\
3 & 1195 & \$0.15 & 1780 & \$3{,}200 & \$0.60 \\
4 & 1100 & \$0.25 & 1778 & \$3{,}250 & \$0.60 \\
5 & 1105 & \$0.25 & 1750 & \$3{,}275 & \$0.60 \\
6 & 1115 & \$0.25 & 1740 & \$3{,}290 & \$0.70 \\
7 & 1130 & \$0.25 & 1725 & \$4{,}100 & \$0.75 \\
8 & 1095 & \$0.30 & 1725 & \$4{,}300 & \$0.75 \\
9 & 1090 & \$0.30 & 1720 & \$4{,}400 & \$0.75 \\
10 & 1087 & \$0.30 & 1705 & \$4{,}600 & \$0.80 \\
11 & 1080 & \$0.30 & 1710 & \$4{,}815 & \$0.80 \\
12 & 1020 & \$0.40 & 1700 & \$5{,}285 & \$0.80 \\
13 & 1010 & \$0.40 & 1695 & \$5{,}665 & \$0.85 \\
14 & 1010 & \$0.40 & 1695 & \$5{,}800 & \$1.00 \\
15 & 1005 & \$0.40 & 1690 & \$5{,}900 & \$1.05 \\
16 & 995 & \$0.40 & 1630 & \$5{,}915 & \$1.05 \\
17 & 930 & \$0.75 & 1640 & \$6{,}325 & \$1.05 \\
18 & 915 & \$0.75 & 1635 & \$6{,}500 & \$1.10 \\
19 & 920 & \$0.75 & 1630 & \$6{,}612 & \$1.25 \\
20 & 940 & \$0.75 & 1620 & \$6{,}883 & \$1.30 \\
21 & 950 & \$0.75 & 1615 & \$7{,}005 & \$1.50 \\
22 & 910 & \$1.00 & 1605 & \$7{,}234 & \$1.55 \\
23 & 930 & \$1.00 & 1590 & \$7{,}500 & \$1.65 \\
24 & 933 & \$1.00 & 1595 & \$7{,}600 & \$1.75 \\
25 & 940 & \$1.00 & 1590 & \$7{,}800 & \$1.75 \\
26 & 948 & \$1.00 & 1600 & \$8{,}000 & \$1.90 \\
27 & 955 & \$1.00 & 1610 & \$8{,}100 & \$2.00 \\
\end{array}
$$

The variables "weekly riders" and "population" are measured in thousands, and the variables "price per ride", "income", and "parking rate" are measured in dollars.
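As a small illustration of how a coefficient's t-value is formed from data like these, the sketch below regresses weekly riders on price per ride alone and divides the estimated slope by its standard error. Using a single explanatory variable is a simplification chosen for this example (the deck's questions concern the full multiple regression), the function name `simple_regression` is hypothetical, and the year-27 price is read as 1.00.

```python
from math import sqrt

# Weekly riders (thousands) and price per ride (dollars), years 1 to 27
riders = [1200, 1190, 1195, 1100, 1105, 1115, 1130, 1095, 1090, 1087, 1080,
          1020, 1010, 1010, 1005, 995, 930, 915, 920, 940, 950,
          910, 930, 933, 940, 948, 955]
price = [0.15, 0.15, 0.15, 0.25, 0.25, 0.25, 0.25, 0.30, 0.30, 0.30, 0.30,
         0.40, 0.40, 0.40, 0.40, 0.40, 0.75, 0.75, 0.75, 0.75, 0.75,
         1.00, 1.00, 1.00, 1.00, 1.00, 1.00]


def simple_regression(x, y):
    """Return intercept, slope, slope standard error, and slope t-value."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # estimated slope
    b0 = y_bar - b1 * x_bar             # estimated intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = sqrt(sse / (n - 2))           # standard error of estimate
    s_b1 = s_e / sqrt(sxx)              # standard error of the slope
    return b0, b1, s_b1, b1 / s_b1


b0, b1, s_b1, t_value = simple_regression(price, riders)
print(f"slope={b1:.1f}, se={s_b1:.1f}, t={t_value:.2f}, df={len(price) - 2}")
```

The resulting t-value would be compared with a t-distribution on n - 2 = 25 degrees of freedom, since only one explanatory variable is used here.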
68
If residuals separated by one period are autocorrelated, this is called:

A) simple autocorrelation
B) redundant autocorrelation
C) time 1 autocorrelation
D) lag 1 autocorrelation
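To show what correlation between residuals one period apart looks like numerically, the sketch below computes a lag 1 autocorrelation and the related Durbin-Watson statistic for a short series. The residual values are hypothetical, and the function names are illustrative rather than taken from any particular package.

```python
def lag1_autocorrelation(residuals):
    """Correlation between each residual and the one from the previous period."""
    n = len(residuals)
    mean_e = sum(residuals) / n
    dev = [e - mean_e for e in residuals]
    numerator = sum(dev[t] * dev[t - 1] for t in range(1, n))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest little lag 1 autocorrelation."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residuals that alternate in sign (negative autocorrelation)
alternating = [1.0, -1.0, 1.0, -1.0]
print(lag1_autocorrelation(alternating))  # -0.75
print(durbin_watson(alternating))         # 3.0
```

Values of the Durbin-Watson statistic well below 2 point toward positive lag 1 autocorrelation, and values well above 2 toward negative autocorrelation, as in this alternating example.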
69
When the error variance is nonconstant, it is common to see the variation increase as the explanatory variable increases (you will see a "fan shape" in the scatterplot). There are two ways you can deal with this phenomenon. These are:

A) the weighted least squares and a logarithmic transformation
B) the partial F and a logarithmic transformation
C) the weighted least squares and the partial F
D) stepwise regression and the partial F
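For readers who want to see how weighting changes a fit when the spread grows with X, here is a minimal weighted least squares sketch for a single explanatory variable. The data are hypothetical, and the weights 1/x² rest on the assumption that the error standard deviation is proportional to x; other variance patterns would call for other weights.

```python
def weighted_least_squares(x, y, weights):
    """Fit y = a + b*x by minimizing the weighted sum of squared residuals."""
    w_total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_total  # weighted mean of x
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / w_total  # weighted mean of y
    sxy = sum(w * (xi - x_bar) * (yi - y_bar)
              for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data whose spread grows with x (a "fan shape")
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.1, 4.8, 7.6, 8.1, 12.5, 11.2]
# Assumption: error standard deviation proportional to x, so weight by 1 / x^2
weights = [1.0 / xi ** 2 for xi in x]
a, b = weighted_least_squares(x, y, weights)
print(f"intercept={a:.3f}, slope={b:.3f}")
```

Observations with small x, where the assumed error variance is smallest, get the most weight. A logarithmic transformation of the variables is an alternative that changes the scale of the data instead of the weights.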