Question 1

Using the table below, find the k-nearest neighbors for record 4, using k = 3 for age.

A) 24, 30, & 36
B) 30, 36, & 44
C) 30, 32, & 36
D) 32, 36, & 44

Accepted Answer

24, 30, & 36

Question 2

Using the table below, find the k-nearest neighbors for record 4, using k = 3 for age.

A) 24, 31, & 34
B) 31, 34, & 44
C) 31, 32, & 34
D) 32, 34, & 44

Accepted Answer

24, 31, & 34

Question 3

A new applicant, age 32, is applying for a loan. Using the table below, what is the estimated probability that the loan will default, using k = 3?

A) 100%
B) 33%
C) 67%
D) 0%

Accepted Answer

67%
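
The table this question refers to is not reproduced here, so the following is only a minimal sketch of the calculation on hypothetical data (the ages and default flags are invented for illustration, and the function name is an assumption). It picks the k records closest in age to the new applicant and reports the share of them that defaulted.

```python
# Hypothetical training records as (age, defaulted); NOT the quiz's table.
records = [(24, True), (29, False), (31, True), (35, True), (44, False), (52, False)]


def knn_default_probability(records, new_age, k):
    """Return the share of the k records closest in age that defaulted."""
    # Sort by absolute age difference; sorted() is stable, so ties keep list order.
    nearest = sorted(records, key=lambda r: abs(r[0] - new_age))[:k]
    return sum(1 for _, defaulted in nearest if defaulted) / k


prob = knn_default_probability(records, new_age=32, k=3)
print(f"Estimated default probability: {prob:.0%}")
```

With these invented records, the three nearest ages are 31, 29, and 35, and two of the three defaulted, so the estimate is 2/3. With k = 3 the only possible estimates are 0, 1/3, 2/3, and 1.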

Question 4

A new applicant, age 32, is applying for a loan. Using the table below, what is the estimated probability that the loan will default, using k = 3?

A) 100%
B) 66%
C) 33%
D) 0%

Accepted Answer

The answer of A new applicant, age 32, is applying...

Question 5

For a new observation of (0, 0, 0), what is the nearest neighbor when k = 1?

A) 2, 0, 1
B) 1, 2, 0
C) 0, 2, 1
D) -1, 1, 0

Accepted Answer

The answer of For a new observation of (0, 0,...
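
With k = 1, the nearest neighbor is simply the observation with the smallest Euclidean distance to the new point. The sketch below assumes that the four answer options of this question are the candidate observations (the underlying table is not shown, so this is an assumption) and computes each distance to (0, 0, 0).

```python
import math

# Candidate observations, taken from the four answer options of this question
# (assumed to be rows of the missing table).
candidates = [(2, 0, 1), (1, 2, 0), (0, 2, 1), (-1, 1, 0)]
new_obs = (0, 0, 0)

# Euclidean distance from each candidate to the new observation.
distances = {point: math.dist(point, new_obs) for point in candidates}

# k = 1: keep only the single closest candidate.
nearest = min(distances, key=distances.get)

for point, d in distances.items():
    print(point, round(d, 3))
print("nearest:", nearest)
```

Because the new observation is the origin, each distance reduces to the square root of the sum of squared coordinates.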

Question 6

For a new observation of (0, 0, 0), what is the nearest neighbor when k = 1?

A) 1, 1, 1
B) -2, -1, 0
C) 0, 1, 3
D) -1, 0, 1

Accepted Answer

The answer of For a new observation of (0, 0,...

Question 7

What is the estimated probability that the cheese sample tested in NW will be Gouda? k = 3

A) 40%
B) 60%
C) 34%
D) 67%

Accepted Answer

The answer of What is the estimated probability that the...

Question 8

What is the estimated probability that the cheese sample tested in NW will be Gouda? k = 3

A) 40%
B) 60%
C) 34%
D) 67%

Accepted Answer

The answer of What is the estimated probability that the...

Question 9

A new applicant, age 45, is applying for a loan. Using the table below, what is the estimated probability that the loan will be approved? Use k = 4.

A) The probable default rate is 50%, the loan will be declined.
B) The probable success rate is 50%, the loan will be approved.
C) The probable success rate is 30%, the loan will be approved.
D) The probable default rate is 25%, the loan will be declined.

Accepted Answer

The answer of A new applicant, age 45, is applying...

Question 10

A new applicant, age 45, is applying for a loan. Using the table below, what is the estimated probability that the loan will be approved? Use k = 4.

A) The probable default rate is 75%, the loan will be declined.
B) The probable success rate is 75%, the loan will be approved.
C) The probable success rate is 30%, the loan will be approved.
D) The probable default rate is 25%, the loan will be declined.

Accepted Answer

The answer of A new applicant, age 45, is applying...

Question 11

Using the following table, which k should be used in the subsequent calculations?

A) 8
B) 5
C) 6
D) None of the percentages should be used.

Accepted Answer

The answer of Using the following table, which k should...

Question 12

Using the following table, which k should be used in the subsequent calculations?

A) 8
B) 5
C) 6
D) None of the percentages should be used.

Accepted Answer

The answer of Using the following table, which k should...

Question 13

If the performance measures from the training data are considerably higher than the values from the validation and test data, what could be the issue?

A) Proportion
B) Sensitivity
C) Duplication
D) Overfitting

Accepted Answer

The answer of If the performance measures from the training...

Question 14

An issue with the naïve Bayes classifier is determining rare outcomes because the estimate is 0. To overcome this problem, the algorithm allows a replacement of zero probability with a nonzero value. This technique is called

A) replacement.
B) smoothing.
C) discrete.
D) combinations.

Accepted Answer

The answer of An issue with the naïve Bayes classifier...
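
To illustrate the zero-probability problem described in this question, here is a minimal sketch of additive (Laplace) smoothing, one common way to replace a zero estimate with a nonzero one. The formula used, (count + k) / (n + k × number of categories), is an assumption about the variant intended; the course text may define it differently. The color data and the function name are hypothetical.

```python
from collections import Counter


def smoothed_probabilities(values, categories, k=1):
    """Additive smoothing: (count + k) / (n + k * number_of_categories)."""
    counts = Counter(values)          # a missing category counts as 0
    n = len(values)
    m = len(categories)
    return {c: (counts[c] + k) / (n + k * m) for c in categories}


# Hypothetical training data: "black" never occurs, so its raw estimate is 0.
colors = ["red"] * 6 + ["white"] * 4
categories = ["black", "red", "white"]

raw = {c: colors.count(c) / len(colors) for c in categories}
probs = smoothed_probabilities(colors, categories, k=1)

print("raw:     ", raw)
print("smoothed:", {c: round(p, 3) for c, p in probs.items()})
```

The smoothed estimates still sum to 1, and the unseen category no longer forces the whole naïve Bayes product to zero.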

Question 15

What is the Euclidean distance between Observation 1 and the origin point of (0, 0, 0)?

A) 3
B) 1
C) 0
D) 2

Accepted Answer

The answer of What is the Euclidean distance between Observation...

Question 16

What is the Euclidean distance between Observation 1 and the origin point of (0, 0, 0)?

A) 2
B) 1
C) 0
D) 3

Accepted Answer

The answer of What is the Euclidean distance between Observation...

Question 17

The chart below is a summary of the main results of a test data set representing the population observed purchasing a virtual digital assistant. What does the accuracy rate indicate?

A) 85.6% of the population has purchased a virtual assistant.
B) 60% of the observations are correctly classified.
C) 85.6% of the observations are correctly classified.
D) 14.4% of the observations are correctly classified.

Accepted Answer

The answer of The chart below is a summary of...

Question 18

The chart below is a summary of the main results of a test data set representing the population observed purchasing a virtual digital assistant. What does the accuracy rate indicate?

A) 88.5% of the population has purchased a virtual assistant.
B) 90% of the observations are correctly classified.
C) 88.5% of the observations are correctly classified.
D) 11.5% of the observations are correctly classified.

Accepted Answer

The answer of The chart below is a summary of...

Question 19

The chart below is a summary of the main results of a test data set representing the population observed purchasing a virtual digital assistant. What is the percent of the results that are incorrectly classified?

A) 11%
B) 8%
C) 12.05%
D) 10.70%

Accepted Answer

The answer of The chart below is a summary of...

Question 20

The chart below is a summary of the main results of a test data set representing the population observed purchasing a virtual digital assistant. What is the percent of the results that are incorrectly classified?

A) 11%
B) 8%
C) 12.05%
D) 10.70%

Accepted Answer

The answer of The chart below is a summary of...

Question 21

Marta is partitioning her data set into 60% for training and 40% for validation. She first specifies 'Member' as her target variable. Which function does she need to call to fix the random seed and ensure consistent results?

A) myIndex
B) trainSet
C) createDataPartition
D) set.seed

Accepted Answer

The answer of Marta is partitioning her data set into...

Question 22

Which chart allows for the categorization of large data sets from high to low values, dividing sets of observations into an easy visual representation of the data?

A) Decile-wise chart
B) Cumulative lift chart
C) Scatterplot
D) ROC Curve

Accepted Answer

The answer of Which chart allows for the categorization of...

Question 23

This chart measures the effectiveness of a predictive model, containing both a baseline and a lift curve.

A) Decile-wise chart
B) Cumulative lift chart
C) Scatterplot
D) ROC Curve

Accepted Answer

The answer of This chart measures the effectiveness of a...

Question 24

This chart determines how well the model performs in terms of sensitivity and specificity.

A) Decile-wise chart
B) Cumulative lift chart
C) Scatterplot
D) ROC Curve

Accepted Answer

The answer of This chart determines how well the model...

Question 25

The following table is the count of observations in each class of the training data set on approvals and declines for a loan at a local bank. Using the naïve Bayes method, calculate the conditional probability of the male and of the female being approved (declined) for the loan, and indicate which one should receive the approved classification.

A) 0.56 > 0.44 for male; 0.301 < 0.699 for female, providing male with the approved classification.
B) 0.56 > 0.44 for male; 0.40 < 0.60 for female, providing male with the approved classification.
C) 0.301 < 0.699 for male; 0.56 > 0.44 for female, providing female with the approved classification.
D) 0.301 < 0.699 for male; 0.50 >= 0.50 for female, providing female with the approved classification.

Accepted Answer

The answer of The following table is the count of...

Question 26

The following table is the count of observations in each class of the training data set on approvals and declines for a loan at a local bank. Using the naïve Bayes method, calculate the conditional probability of the male and of the female being approved (declined) for the loan, and indicate which one should receive the approved classification.

A) 0.56 > 0.44 for male; 0.301 < 0.699 for female, providing male with the approved classification.
B) 0.56 > 0.44 for male; 0.40 < 0.60 for female, providing male with the approved classification.
C) 0.301 < 0.699 for male; 0.56 > 0.44 for female, providing female with the approved classification.
D) 0.301 < 0.699 for male; 0.50 >= 0.50 for female, providing female with the approved classification.

Accepted Answer

The answer of The following table is the count of...

Question 27

A researcher is preparing data for a k-fold cross-validation. The number of groups the sample data is to be split into is 10. What would k equal in a 10-fold cross-validation?

A) k = 1
B) k = 5
C) k = 10
D) k cannot be determined until applied to a machine model.

Accepted Answer

The answer of A researcher is preparing data for a...
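
As a rough illustration of what splitting the sample into groups means in k-fold cross-validation, the sketch below divides n observation indices into k folds and uses each fold once for validation while the rest serve as training data. The function name, the sample size of 25, and the seed value are all hypothetical; fixing the seed only makes the shuffle repeatable.

```python
import random


def k_fold_splits(n, k, seed=1):
    """Return k (training_indices, validation_indices) pairs for n observations."""
    rng = random.Random(seed)         # fixed seed so the shuffle is reproducible
    indices = list(range(n))
    rng.shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        validation = folds[i]
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((training, validation))
    return splits


splits = k_fold_splits(n=25, k=10)
for number, (training, validation) in enumerate(splits, start=1):
    print(f"fold {number}: {len(training)} training, {len(validation)} validation")
```

Every observation lands in exactly one validation fold, so across the k rounds each record is used for validation once and for training k − 1 times.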

Question 28

The following table reflects the observations made on the color and type of vehicle, whether a speeding ticket (1) or a warning (0) was received, and whether there was a prior driving violation (yes or no). Using the naïve Bayes calculation, what is the conditional probability of receiving a ticket with a red vehicle?

A) 0.76
B) 0.41
C) 0.28
D) 0.63

Accepted Answer

The answer of The following table reflects the observations made...

Question 29

The following table reflects the observations made on the color and type of vehicle, whether a speeding ticket (1) or a warning (0) was received, and whether there was a prior driving violation (yes or no). Using the naïve Bayes calculation, what is the conditional probability of receiving a ticket with a red vehicle?

A) 0.76
B) 0.41
C) 0.28
D) 0.63

Accepted Answer

The answer of The following table reflects the observations made...

Question 30

An ROC curve produced in R, with AUC = 0.9453, is presented below from an analysis of potential membership-level increases among current basic members at Costco Wholesale. What does the AUC indicate about the prediction of increased membership enrollment among current basic members?

A) The high AUC indicates an anomaly in that it requires data smoothing to match the baseline.
B) The high AUC indicates the KNN model performs well and better than the baseline model.
C) The high AUC indicates the KNN model is not predicting the level increase based on the baseline model.
D) The high AUC indicates there is a 0.05% probability the KNN model performs as predicted.

Accepted Answer

The answer of An R's ROC curve with AUC =...

Question 31

The marketing group at Rings Are Us is trying to predict whether undergraduate or graduate students are more inclined to purchase (y = 1) or not purchase (y = 0) a class ring at graduation. Using the following counts from the training data set, calculate the conditional probability for both groups to determine which should be classified into the purchase group.

A) Undergraduate 0.7 > 0.272; Graduate 0.428 < 0.272. The undergraduate is assigned to the purchaser group.
B) Undergraduate 0.7 > 0.3; Graduate 0.486 < 0.514. The undergraduate is assigned to the purchaser group.
C) Undergraduate 0.303 > 0.272; Graduate 0.476 < 0.524. The undergraduate is assigned to the purchaser group.
D) Undergraduate 0.30 < 0.20; Graduate 0.524 > 0.476. The graduate is assigned to the purchaser group.

Accepted Answer

The answer of The marketing group and Rings Are Us...

Question 32

The marketing group at Rings Are Us is trying to predict whether undergraduate or graduate students are more inclined to purchase (y = 1) or not purchase (y = 0) a class ring at graduation. Using the following counts from the training data set, calculate the conditional probability for both groups to determine which should be classified into the purchase group.

A) Undergraduate 0.70 > 0.272; Graduate 0.428 < 0.272. The undergraduate is assigned to the purchaser group.
B) Undergraduate 0.70 > 0.30; Graduate 0.476 < 0.524. The undergraduate is assigned to the purchaser group.
C) Undergraduate 0.303 > 0.272; Graduate 0.476 < 0.524. The undergraduate is assigned to the purchaser group.
D) Undergraduate 0.30 < 0.20; Graduate 0.524 > 0.476. The graduate is assigned to the purchaser group.

Accepted Answer

The answer of The marketing group and Rings Are Us...

Question 33

Specificity is

A) TP ÷ (TP + FN).
B) TN ÷ (TN + FP).
C) TP ÷ (TN + FP).
D) 1 − TN ÷ (TN + FP).

Accepted Answer

The answer of Specificity is A) TP ÷ (TP + FN). B)...
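
For reference, the standard definitions are that sensitivity is the share of actual positives classified as positive and specificity is the share of actual negatives classified as negative. The sketch below computes both from confusion-matrix counts; the counts are hypothetical and the function names are chosen only for illustration.

```python
def sensitivity(tp, fn):
    """True positive rate: actual positives that were classified as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: actual negatives that were classified as negative."""
    return tn / (tn + fp)


# Hypothetical confusion-matrix counts, not taken from any table in this deck.
tp, fn, tn, fp = 30, 10, 50, 10

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"specificity = {specificity(tn, fp):.3f}")
```

Each measure divides by the size of one actual class only, which is why a model can score very high on one and poorly on the other.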

Question 34

To examine classification for k-fold cross-validation and naïve Bayes, two packages contain the necessary functions for partitioning the data. These are

A) caret & klaR
B) caret & Crisp
C) klarR & SEMMA
D) predictive & caret

Accepted Answer

The answer of To examine classification for k-fold cross-validation and...

Question 35

Mark is reviewing a partial summary of results from a test data set on a small health clinic. With an accuracy of 84% (100 count), a sensitivity of 55%, and a specificity of 100%, can Mark correctly predict the true positive rate to identify those with the flu?

A) Yes, because the specificity is 100% identifying healthy patients.
B) No, because the test set identified all patients with the flu.
C) No, because the sensitivity rate is only 55% in identifying those with the flu.
D) Yes, because there is an 84% accuracy rate with all healthy identified.

Accepted Answer

The answer of Mark is reviewing a partial summary of...

Question 36

Mark is reviewing a partial summary of results from a test data set on a small health clinic. With an accuracy of 84% (100 count), a sensitivity of 55%, and a specificity of 100%, can Mark correctly predict the true positive rate to identify those with the flu?

A) Yes, because the specificity is 100% identifying healthy patients.
B) No, because the test set identified all patients with the flu.
C) No, because the sensitivity rate is only 55% in identifying those with the flu.
D) Yes, because there is an 84% accuracy rate with all healthy identified.

Accepted Answer

The answer of Mark is reviewing a partial summary of...

Question 37

Of the following options, which does not represent the naïve Bayes method?

A) All predictor variables are categorical.
B) All predictor variables are independent.
C) Does not capture possible interactions between predictor variables.
D) Works best on a small data set.

Accepted Answer

The answer of Of the following options, which does not...

Question 38

Using the following table, what is the estimate of P(Color = Black), and what is the smoothed estimate with k = 1?

A) P(Color = Black) = 0.30 and Smoothed 0.471
B) P(Color = Black) = 0.30 and Smoothed 0.380
C) P(Color = Black) = 0.7 and Smoothed 0.471
D) P(Color = Black) = 0.40 and Smoothed 0.471

Accepted Answer

The answer of Using the following table, what is the...

Question 39

Using the following table, what is the estimate of P(Color = Black), and what is the smoothed estimate with k = 1?

A) P(Color = Black) = 0.70 and Smoothed 0.471
B) P(Color = Black) = 0.30 and Smoothed 0.380
C) P(Color = Black) = 0.30 and Smoothed 0.308
D) P(Color = Black) = 0.30 and Smoothed 0.471

Accepted Answer

The answer of Using the following table, what is the...

Question 40

In a decile-wise lift chart, what does the lift value of the leftmost bar imply?

A) The lift value is determined by the smoothing of the data.
B) The first 10% yields twice as many as a random selection of 10% would.
C) The bar represents 10% of the data cumulative score.
D) The first 10% is twice as prevalent as 20%.

Accepted Answer

The answer of In a decile-wise lift chart, what does...

Question 41

To validate the model on the validation set, Mary calibrates the output of the model by examining all possible outcomes of the prediction (true positive, true negative, false positive, false negative). One way is to use a cutoff value and functions such as the ifelse() function. These statements are called

A) prediction.
B) set.seed.
C) reference.
D) confusionMatrix.

Accepted Answer

The answer of To validate the model on the validation...
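
The question refers to R's ifelse() for applying a cutoff. As a language-neutral illustration, the following Python sketch (an analogue, not the R code the course uses) turns predicted probabilities into class labels with a cutoff and tallies the four possible outcomes. The probabilities, actual labels, and function names are hypothetical.

```python
def classify(probabilities, cutoff=0.5):
    """Label a record 1 when its predicted probability reaches the cutoff, else 0."""
    return [1 if p >= cutoff else 0 for p in probabilities]


def confusion_counts(actual, predicted):
    """Tally true/false positives and negatives, treating 1 as the positive class."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1
        elif a == 0 and p == 0:
            counts["TN"] += 1
        elif a == 0 and p == 1:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts


# Hypothetical validation-set output.
probs = [0.9, 0.2, 0.6, 0.4, 0.8, 0.1]
actual = [1, 0, 0, 1, 1, 0]

predicted = classify(probs, cutoff=0.5)
counts = confusion_counts(actual, predicted)
print(predicted)
print(counts)
```

Raising or lowering the cutoff shifts records between the predicted classes, which trades false positives against false negatives.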

Question 42

What is the Euclidean distance between Observation 2 and the origin point of (0, 0, 0)?

A) 0
B) 2
C) 5
D) 3

Accepted Answer

The answer of What is the Euclidean distance between Observation...

Question 43

Using the following table of the results of a paper towel study and selection, the XYZ company is making a new product with Durability = 3 and Feel = 4. Using the Euclidean distance, which Type is closest to the new observation?

A) Type 2
B) Type 1
C) Type 4
D) Type 3

Accepted Answer

The answer of Using the following table of the results...

Question 44

Using the following table of the results of a paper towel study and selection, the XYZ company is making a new product with Durability = 3 and Feel = 4. Using the Euclidean distance, which Type is closest to the new observation?

A) Type 2
B) Type 1
C) Type 4
D) Type 3

Accepted Answer

The answer of Using the following table of the results...

Question 45

The use of classifying or predicting the value to create an outcome is called scoring a record.

Accepted Answer

The answer of The use of classifying or predicting the...

Question 46

KNN is a simple data mining tool, known for developing personalized recommendations for many online company applications.

Accepted Answer

The answer of KNN is a simple data mining tool,...

Question 47

Naïve Bayes classifiers are relatively simple, efficient, and assume dependency among predictors.

Accepted Answer

The answer of Naïve Bayes classifiers are relatively simple, efficient,...

Question 48

KNN belongs to a category of mining techniques called computer-based-reasoning.

Accepted Answer

The answer of KNN belongs to a category of mining...

Question 49

While k-nearest neighbors is effective as a classifier, it provides no information on predictor importance.

Accepted Answer

The answer of While k-nearest neighbors is effective as a...

Question 50

The naïve Bayes method is an unsupervised data mining technique that uses partitioning to assess model performance.

Accepted Answer

The answer of The naïve Bayes method is an unsupervised...

Question 51

When performing a naïve Bayes analysis, all predictor variables must be categorical.

Accepted Answer

The answer of When performing a naïve Bayes analysis, all...

Question 52

Unlike the KNN method, the naïve Bayes method does not use the validation data set to optimize model complexity.

Accepted Answer

The answer of Unlike the KNN method, the naïve Bayes...

Question 53

To use the naïve Bayes method, numerical variables can be converted into discrete categories, through a process called binning, and then stored in a newly created categorical value.

Accepted Answer

The answer of To use the naïve Bayes method, numerical...
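
To make the idea of binning concrete, here is a minimal sketch that converts a numerical variable into discrete categories and stores the result as a new categorical value. The bin edges, labels, ages, and function name are all hypothetical choices for illustration.

```python
from bisect import bisect_right


def bin_value(value, edges, labels):
    """Map a number to a category label; edges are the lower bounds of bins 2..n."""
    # bisect_right counts how many edges are <= value, which is the bin index.
    return labels[bisect_right(edges, value)]


# Hypothetical bin edges and labels: [<30), [30, 50), [50+).
edges = [30, 50]
labels = ["under 30", "30 to 49", "50 and over"]

ages = [24, 30, 36, 44, 61]
age_group = [bin_value(a, edges, labels) for a in ages]

for age, group in zip(ages, age_group):
    print(age, "->", group)
```

The new age_group values can then be treated like any other categorical predictor when counting class frequencies.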

Question 54

Binning is a process where categorical data is transformed into numerical segments that can be appended back to the original data set to use a naïve Bayes method.

Accepted Answer

The answer of Binning is a process where categorical data...

Deck 9: Supervised Data Mining: K-Nearest Neighbors and Naïve Bayes