Question 1

You are given the following data, where X₁ (Pretest score) and X₂ (Hours spent in the program) are used to predict Y (Posttest score):

\begin{array}{ccc}\hline \boldsymbol{Y} & \boldsymbol{X}_{\mathbf{1}} & \boldsymbol{X}_{\mathbf{2}} \\\hline 65 & 60 & 7.5 \\82 & 62 & 9.0 \\94 & 75 & 8.5 \\80 & 78 & 7.0 \\87 & 65 & 10.0 \\66 & 60 & 8.0 \\\hline\end{array}

Determine the following values: intercept, b₁, b₂, SS_res, SS_reg, F, s_res², s(b₁), s(b₂), t₁, t₂.

Accepted Answer

Intercept = -70.662, b₁ = 1.235, b₂ = 8.077. SS_reg = 619.504, SS_res = 44.496, F(2,3) = 20.884 (p = .017, reject at .05), s²_res = 14.832. s(b₁) = .227, s(b₂) = 1.660, t₁ = 5.437 (p = .012, reject at .05), t₂ = 4.866 (p = .017, reject at .05).

Procedure: Create a data set with three variables: Posttest (Y), Pretest (X₁), and Hour (X₂). The data set should have six cases.

1) Go to Analyze $\rightarrow$ Regression $\rightarrow$ Linear.
2) Move Posttest to the Dependent list. Move Pretest and Hour to the Independent(s) list.
3) Click OK.

Selected SPSS Output:

Model Summary

$\begin{array}{ccccc} \hline \text{Model} & R & R \text{ Square} & \text{Adjusted } R \text{ Square} & \text{Std. Error of the Estimate} \\ \hline 1 & .966^{a} & .933 & .888 & 3.851 \\ \hline \end{array}$

a. Predictors: (Constant), Hour, Pretest

ANOVA$^{b}$

$\begin{array}{clccccc} \hline \text{Model} & & \text{Sum of Squares} & df & \text{Mean Square} & F & \text{Sig.} \\ \hline 1 & \text{Regression} & \mathbf{619.504} & 2 & 309.752 & \mathbf{20.884} & \mathbf{.017}^{a} \\ & \text{Residual} & \mathbf{44.496} & 3 & \mathbf{14.832} & & \\ & \text{Total} & 664.000 & 5 & & & \\ \hline \end{array}$

a. Predictors: (Constant), Hour, Pretest
b. Dependent Variable: Posttest

Results: The results of the multiple linear regression suggest that a significant proportion of the total variation in posttest scores was predicted by pretest scores and hours spent in the program, F(2,3) = 20.884, p = .017.

For Pretest, the unstandardized partial slope (1.235) and standardized partial slope (.846) are statistically significantly different from 0 (t = 5.437, df = 3, p = .012); with every one-point increase in pretest score, posttest score is expected to increase by 1.235 points when controlling for Hour.

For Hour, the unstandardized partial slope (8.077) and standardized partial slope (.757) are statistically significantly different from 0 (t = 4.866, df = 3, p = .017); with every additional hour spent in the program, posttest score is expected to increase by 8.077 points when controlling for Pretest.

Thus, Pretest and Hour were shown to be statistically significant predictors of Posttest, both individually and collectively. Multiple R² indicates that 93.3% of the variation in Posttest was predicted by Pretest and Hour. This suggests a large effect size. The intercept was -70.662, which is not statistically significantly different from 0 at the .05 level (t = -3.042, df = 3, p = .056).
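The values reported in this answer can be cross-checked without SPSS. The following is a minimal sketch in plain Python, assuming the standard two-predictor least-squares formulas written in terms of deviation sums of squares and cross-products. The variable names (`s11`, `s12`, `se_b1`, and so on) are illustrative and do not come from the original answer.

```python
from math import sqrt

# Data from Question 1: posttest (Y), pretest (X1), hours in the program (X2)
y = [65, 82, 94, 80, 87, 66]
x1 = [60, 62, 75, 78, 65, 60]
x2 = [7.5, 9.0, 8.5, 7.0, 10.0, 8.0]
n, m = len(y), 2  # number of cases, number of predictors


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y, ss_total = cross(x1, y), cross(x2, y), cross(y, y)

# Partial slopes and intercept from the normal equations
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
intercept = mean(y) - b1 * mean(x1) - b2 * mean(x2)

# Sums of squares, residual variance, and overall F
ss_reg = b1 * s1y + b2 * s2y
ss_res = ss_total - ss_reg
ms_res = ss_res / (n - m - 1)  # s^2_res
f_ratio = (ss_reg / m) / ms_res

# Standard errors of the slopes and their t ratios
r12_sq = s12 ** 2 / (s11 * s22)
se_b1 = sqrt(ms_res / (s11 * (1 - r12_sq)))
se_b2 = sqrt(ms_res / (s22 * (1 - r12_sq)))
t1, t2 = b1 / se_b1, b2 / se_b2

for name, value in [("intercept", intercept), ("b1", b1), ("b2", b2),
                    ("SS_reg", ss_reg), ("SS_res", ss_res), ("F", f_ratio),
                    ("s2_res", ms_res), ("s(b1)", se_b1), ("s(b2)", se_b2),
                    ("t1", t1), ("t2", t2)]:
    print(f"{name:>9} = {value:.3f}")
```

The sketch stops at the test statistics; the p values quoted above would additionally require the t and F distributions, which the Python standard library does not provide.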

Question 2

Complete the missing information for this regression model (df = 25).

Accepted Answer

t₁ = b₁/s(b₁) = 16/4 = 4; t₂ = b₂/s(b₂) = .4/.05 = 8; t₃ = b₃/s(b₃) = 70/10 = 7. The critical t value is $\pm\,{}_{\alpha/2}t_{df} = \pm\,{}_{.025}t_{25} = \pm 2.06$. |t₁|, |t₂|, and |t₃| all exceed the critical t, so X₁, X₂, and X₃ are all significant predictors of Y.
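The same arithmetic can be written as a short check. This is a minimal sketch that uses the slopes and standard errors quoted in the answer and takes the critical value 2.06 as given rather than computing it; the dictionary names are illustrative.

```python
# (slope, standard error) pairs as used in the accepted answer
slopes = {"X1": (16.0, 4.0), "X2": (0.4, 0.05), "X3": (70.0, 10.0)}

# Two-tailed critical t for alpha = .05 and df = 25, as quoted in the answer
critical_t = 2.06

# t ratio for each predictor: slope divided by its standard error
t_values = {name: b / se for name, (b, se) in slopes.items()}

# A predictor is significant when |t| exceeds the critical value
significant = {name: abs(t) > critical_t for name, t in t_values.items()}

for name in slopes:
    print(f"{name}: t = {t_values[name]:.2f}, significant = {significant[name]}")
```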

Question 3

A researcher would like to predict GPA from a set of three predictor variables for a sample of 34 college students. Multiple linear regression analysis was utilized. Complete the following summary table ($\alpha$ = .05) for the test of significance of the overall regression model:

Accepted Answer

There are three independent variables, so m = 3. There are 34 students, so n = 34.

df_reg = m = 3, df_res = n - m - 1 = 34 - 3 - 1 = 30, df_total = n - 1 = 34 - 1 = 33.

SS_reg = MS_reg × df_reg = 6.5 × 3 = 19.5, SS_res = SS_total - SS_reg = 66 - 19.5 = 46.5.

MS_res = SS_res/df_res = 46.5/30 = 1.55.

F = MS_reg/MS_res = 6.5/1.55 = 4.19; critical value = ${}_{.05}F_{3,30}$ = 2.92 < F, so reject H₀.
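The table completion follows mechanically from the two given entries. The sketch below assumes, as the answer does, that MS_reg = 6.5 and SS_total = 66 are the values supplied in the summary table, and it takes the critical F of 2.92 as quoted rather than computing it.

```python
m, n = 3, 34                   # predictors, sample size
ms_reg, ss_total = 6.5, 66.0   # entries used in the accepted answer
critical_f = 2.92              # critical F(3, 30) at alpha = .05, as quoted

# Degrees of freedom
df_reg = m
df_res = n - m - 1
df_total = n - 1

# Sums of squares and mean squares
ss_reg = ms_reg * df_reg
ss_res = ss_total - ss_reg
ms_res = ss_res / df_res

# Overall F test
f_ratio = ms_reg / ms_res
reject_h0 = f_ratio > critical_f

print(f"df: reg={df_reg}, res={df_res}, total={df_total}")
print(f"SS: reg={ss_reg:.1f}, res={ss_res:.1f}, total={ss_total:.1f}")
print(f"MS_res={ms_res:.2f}, F={f_ratio:.2f}, reject H0: {reject_h0}")
```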


Question 4

You are given the following data, where X₁ (attendance rate) and X₂ (average SAT score) are to be used to predict Y (average score in graduation test). Each case represents one school.

\begin{array}{|c|c|c|}\hline Y & X_{\mathbf{1}} & X_{\mathbf{2}} \\\hline 78.4 & 93.4 & 1010 \\\hline 81.3 & 94.6 & 1020 \\\hline 81.3 & 95.4 & 1024 \\\hline 82.5 & 91.1 & 1136 \\\hline 77.8 & 91.6 & 952 \\\hline 84.5 & 94.2 & 1042 \\\hline 88.2 & 94.5 & 1106 \\\hline 88.7 & 93.4 & 1004 \\\hline 72.5 & 92.1 & 880 \\\hline 85.4 & 94.9 & 1124 \\\hline 82.9 & 94.3 & 1124 \\\hline 81.4 & 94.7 & 996 \\\hline\end{array}

Determine the following values: intercept, b₁, b₂, SS_res, SS_reg, F, s_res², s(b₁), s(b₂), t₁, t₂.

Accepted Answer

The answer of You are given the following data, where...

Question 5

For the regression model, Y_i = b₁X₁_i + b₂X₂_i + a + e_i, consider the following two situations:
Situation 1: r_Y₁ = −0.5, r_Y₂ = 0.8, r₁₂ = 0.1
Situation 2: r_Y₁ = 0.2, r_Y₂ = 0.8, r₁₂ = 0.1
In which of the two situations will R² be larger?

A) Situation 1.
B) Situation 2.
C) R² will be the same in both situations.
D) Uncertain.

Accepted Answer

(R² is higher when there is a high correlation of the predictors with the dependent variable.)

Question 6

The scatterplot of X and Y is shown as follows.
Based on the plot, which model is the most appropriate to use?

A) Y_i = b₁X_i + a + e_i.
B) Y_i = b₁X_i + b₂X_i² + a + e_i.
C) Y_i = b₁X_i² + a + e_i.
D) Y_i = b₁X_i + b₂X_i² + b₃X_i³ + a + e_i.

Accepted Answer

(There is a curvilinear relation between X and Y, so a quadratic model should be applied.)

Question 7

Which of the following situations will result in the best prediction in multiple regression analysis?

A) r_Y₁ = 0.1, r_Y₂ = 0.4, r₁₂ = 0.1
B) r_Y₁ = 0.1, r_Y₂ = 0.4, r₁₂ = 0.8
C) r_Y₁ = 0.6, r_Y₂ = 0.4, r₁₂ = 0.1
D) r_Y₁ = 0.6, r_Y₂ = 0.4, r₁₂ = 0.8

Accepted Answer

(Best prediction will result when there is a high correlation of the predictors with the dependent variable and low correlations among the predictors.)
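The rule can be checked numerically. With two predictors, R² can be written in terms of the three pairwise correlations as R² = (r_Y₁² + r_Y₂² − 2·r_Y₁·r_Y₂·r₁₂) / (1 − r₁₂²). The sketch below applies that formula to the four situations in this question; the function name `r_squared` is illustrative.

```python
def r_squared(r_y1, r_y2, r_12):
    """R^2 for a two-predictor model from the pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# (r_Y1, r_Y2, r_12) for each answer choice in the question
situations = {
    "A": (0.1, 0.4, 0.1),
    "B": (0.1, 0.4, 0.8),
    "C": (0.6, 0.4, 0.1),
    "D": (0.6, 0.4, 0.8),
}

r2 = {label: r_squared(*corrs) for label, corrs in situations.items()}
best = max(r2, key=r2.get)

for label, value in r2.items():
    print(f"{label}: R^2 = {value:.3f}")
print("Largest R^2:", best)
```

The choice with the largest predictor-outcome correlations and the smallest correlation between the predictors yields the largest R².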

Question 8

Which one of the following reflects variables appropriate for a multiple linear regression model?

A) One categorical dependent variable and one continuous independent variable
B) One continuous dependent variable and one continuous or categorical independent variable
C) One continuous dependent variable and two or more continuous independent variables
D) Two or more continuous dependent variables and one continuous or categorical independent variable

Accepted Answer

Multiple linear regression models involve a continuous dependent variable and two or more continuous independent variables. Choice A involves only one independent variable and a categorical dependent variable, while Choice B involves only one independent variable, which can be either continuous or categorical. Choice D involves two or more dependent variables, which is not appropriate for a multiple linear regression model.

Question 9

In a multiple linear regression with three independent variables, X₁, X₂, and X₃, which one of the following reflects an example of a semipartial correlation?

A) The correlation between X₁ and X₂ and X₃ where both X₂ and X₃ are removed from X₁ and X₂
B) The correlation between X₁ and X₂ where X₃ is held constant
C) The correlation between X₂ and X₃ where X₁ is partialed out
D) The correlation between X₁ and X₂ where X₃ is removed from X₂ only

Accepted Answer

A semipartial correlation involves removing the effect of a variable from only one of the variables being correlated. In option D, X₃ is removed from X₂ only, which fits the definition of a semipartial correlation.
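The difference between the two kinds of correlation shows up in the denominator of the first-order formulas. The sketch below assumes the standard formulas for the partial correlation r₁₂.₃ and the semipartial correlation r₁(₂.₃); the three input correlations are hypothetical values chosen only for illustration.

```python
from math import sqrt


def partial_corr(r_12, r_13, r_23):
    """Correlation of X1 and X2 with X3 removed from both variables."""
    return (r_12 - r_13 * r_23) / sqrt((1 - r_13 ** 2) * (1 - r_23 ** 2))


def semipartial_corr(r_12, r_13, r_23):
    """Correlation of X1 and X2 with X3 removed from X2 only."""
    return (r_12 - r_13 * r_23) / sqrt(1 - r_23 ** 2)


# Hypothetical pairwise correlations, not taken from the question
r_12, r_13, r_23 = 0.5, 0.3, 0.4

partial = partial_corr(r_12, r_13, r_23)
semipartial = semipartial_corr(r_12, r_13, r_23)

print(f"partial r12.3      = {partial:.4f}")
print(f"semipartial r1(2.3) = {semipartial:.4f}")
```

Because the partial correlation divides by an extra factor that cannot exceed 1, its absolute value is never smaller than that of the corresponding semipartial correlation.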

Question 10

Partial correlations allow for which one of the following in multiple linear regression?

A) Design control
B) Experiential control
C) Experimental control
D) Statistical control

Accepted Answer

Partial correlations allow for statistical control, which means holding other variables constant while examining the relationship between two variables. This is helpful in multiple linear regression because it allows us to examine the unique effect of each predictor variable on the outcome variable, while controlling for the effects of other predictors.

Deck 18: Multiple Linear Regression