Question 1

The Naive Bayes method assumes probabilistic independence across characteristics, which means that the joint probability is the product of the probabilities of the individual characteristics.

Accepted Answer

The Naive Bayes method assumes that the features are conditionally independent given the class label. Under that assumption, the joint probability of the features given a class is the product of the conditional probabilities of the individual features given that class.
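
The product rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, feature names, and probabilities are made up, and `naive_bayes_posterior` is a hypothetical helper, not part of any library.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods, observed):
    """Return P(class | observed features) under the naive independence assumption.

    priors:      {class: P(class)}
    likelihoods: {class: {feature: P(feature | class)}}
    observed:    list of features that were observed
    """
    # Unnormalized score: prior times the product of per-feature conditionals.
    scores = {
        c: priors[c] * prod(likelihoods[c][f] for f in observed)
        for c in priors
    }
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Made-up numbers, for illustration only.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"offer": 0.5, "meeting": 0.1},
    "ham": {"offer": 0.1, "meeting": 0.4},
}
posterior = naive_bayes_posterior(priors, likelihoods, ["offer", "meeting"])
```

Each class score multiplies the prior by one conditional probability per feature, which is exactly where the independence assumption enters.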

Question 2

SQL Server Analysis Services (SSAS) concentrates on which types of data mining?

A) classification and clustering
B) market basket and forecasting
C) clustering and prediction
D) forecasting and classification

Accepted Answer

SSAS mainly focuses on classification and clustering data mining techniques. Classification is the process of categorizing data into predefined classes or groups based on their characteristics. Clustering is the process of grouping data points together based on their similarities. SSAS also supports other data mining techniques such as association rules and time-series analysis, but the primary focus is on classification and clustering. Market basket and forecasting are not the main areas of concentration for SSAS.

Question 3

The K-Means clustering algorithm:

A) becomes an optimization model for choosing the best cluster centers.
B) determines the best value of K.
C) has an objective that maximizes the sum of the dissimilarity measures.
D) cannot be directly implemented in Excel.

Accepted Answer

The K-Means clustering algorithm becomes an optimization model for choosing the best cluster centers. It aims to minimize the sum of squared distances between each point and its assigned centroid.
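
As a rough sketch of that objective, the Python below scores two candidate sets of cluster centers by the sum of squared distances from each point to its nearest center. The points and centers are made-up values, and the helper names are hypothetical; a full K-Means run would also iterate on the centers.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for each point."""
    return [
        min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
        for p in points
    ]


def objective(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


# Made-up data: two obvious groups.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
good_centers = [(1.0, 1.5), (8.5, 8.0)]
poor_centers = [(1.0, 1.0), (1.0, 2.0)]

good_score = objective(points, good_centers)
poor_score = objective(points, poor_centers)
```

Centers placed inside the two groups give a much smaller objective than centers that both sit in one group, which is what the optimization is trying to exploit.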

Question 4

In the classification method, when data are partitioned into the training and testing subsets:

A) the data are divided evenly into the subsets.
B) the values of the dependent variable are known for both the training and the testing subsets.
C) the records in the subsets are chosen based on similar characteristics.
D) all of the above.

Accepted Answer

In classification, data is partitioned into training and testing subsets. The purpose of this is to train the model on the training subset and then test its accuracy on the testing subset. It is important that the values of the dependent variable are known for both the training and testing subsets to evaluate the accuracy of the model. The data may not be divided evenly, and records in the subsets may not be chosen based on similar characteristics. Therefore, option B is the best choice as it is the most important requirement for the partitioning of data in the classification method.
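
A minimal sketch of why the testing subset needs known labels, assuming a random 80/20 split (the split fraction, seed, data, and the simple threshold rule are all illustrative choices, not taken from the deck):

```python
import random


def partition(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split it into (training, testing)."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, labeled_records):
    """Fraction of records whose predicted class matches the known class."""
    hits = sum(1 for x, y in labeled_records if predict(x) == y)
    return hits / len(labeled_records)


# Made-up labeled data: (explanatory value, known class).
records = [(x, int(x >= 50)) for x in range(0, 100, 5)]
train, test = partition(records)


def threshold_rule(x):
    """Stand-in classifier used only to illustrate the evaluation step."""
    return int(x >= 50)


test_accuracy = accuracy(threshold_rule, test)
```

The `accuracy` function can only be computed because every testing record carries its known class, and the subsets here are neither equal in size nor grouped by similarity.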

Question 5

If the probability of a restaurant being successful is 10% then the odds of it failing are 10 to 1.

Accepted Answer

If the probability of success is 10%, then the probability of failure is 90%. The odds of failure would be 90 to 10, or 9 to 1, not 10 to 1. Therefore, the statement is false.
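
The arithmetic can be checked with exact fractions. This is a small sketch; `odds_in_favor` is a hypothetical helper name.

```python
from fractions import Fraction


def odds_in_favor(probability):
    """Convert a probability p into odds p / (1 - p)."""
    p = Fraction(probability)
    return p / (1 - p)


p_success = Fraction(1, 10)
odds_of_failure = odds_in_favor(1 - p_success)  # (9/10) / (1/10)
odds_of_success = odds_in_favor(p_success)      # (1/10) / (9/10)
```

Here `odds_of_failure` works out to 9, that is, 9 to 1, and `odds_of_success` to 1/9, that is, 1 to 9.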

Question 6

In data partitioning, data can be divided into training, testing, and prediction subsets.

Accepted Answer

Data partitioning is a technique in which the available data are divided into training, testing, and prediction subsets. The training subset is used to fit the model, the testing subset is used to evaluate how well the fitted model performs, and the prediction subset, for which the values of the dependent variable are unknown, is where the model is applied to make predictions. Therefore, the statement is true.

Question 7

The function INDEX(A1:E5,4,3) would return the value in cell C4.

Accepted Answer

The INDEX function returns the value at a specified row and column within a range. In this case, A1:E5 is the range, 4 is the row number, and 3 is the column number, so the function would return the value in cell C4.
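
The row and column lookup can be mimicked in Python. This is only an emulation under the assumption that the range is stored as a list of rows; the `index` helper is hypothetical and is not Excel's implementation.

```python
def index(grid, row_num, col_num):
    """Mimic INDEX(range, row_num, col_num) using 1-based positions."""
    return grid[row_num - 1][col_num - 1]


# Build a stand-in for A1:E5 where each cell holds its own address.
columns = "ABCDE"
grid = [[f"{col}{row}" for col in columns] for row in range(1, 6)]

result = index(grid, 4, 3)
```

With each cell holding its own address, `result` is the string "C4": the fourth row and third column of a range that starts at A1.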

Question 8

Today's organizations rely on their quantitative experts, who have access to large amounts of data, to make sense of it in a timely manner.

Question 9

The last step of cluster analysis is to understand the shared characteristics of the observations in each cluster.

Accepted Answer

The last step in cluster analysis is to interpret and understand the shared characteristics of the observations in each cluster. This is essential in order to gain insights and make decisions based on the resulting clusters.

Question 10

Which of the following is false regarding neural networks?

A) They attempt to mimic the complex behavior of the human brain.
B) They are synonymous with data mining.
C) They can be used to predict the value of a numeric dependent variable.
D) They do not provide an understanding of the contributions of individual explanatory variables.

Accepted Answer

Neural networks and data mining are not synonymous. While neural networks can be used as a tool in data mining to analyze and model complex data relationships, neural networks are not the only data mining technique, and data mining involves a variety of other tools and techniques as well.

Question 11

The logistic function 1/(1 + e^(-x))

A) is concave.
B) is convex.
C) is S-shaped.
D) takes on values between -1 and 1.

Question 12

When classifying rare events, it is inadvisable to oversample.

Question 13

Better classification methods should improve lift, which is the increase in purchases gained through marketing to people with the highest probability of purchasing.

Question 14

Which of the following is not a classification method?

A) Naive Bayes
B) Logistic regression
C) ROC curves
D) Neural networks

Question 15

Specificity, a measure of classification accuracy, reflects: (regard class 1 as yes and class 2 as no)

A) the portion of observations that are actually in class 1 that are classified as being in class 1.
B) the portion of observations that are actually in class 2 that are classified as being in class 2.
C) the portion of observations that are actually in class 2 that are classified as being in class 1.
D) the portion of observations that are classified as class 1 that are actually in class 1.

Question 16

Which of the following is false concerning a data warehouse?

A) It is designed to study patterns in data.
B) It combines data from multiple sources.
C) It allows follow up responses to questions.
D) It supports the daily operations of the company.

Question 17

For logistic regression:

A) the odds ratio is the independent variable.
B) the primary objective is to score each member.
C) the cutoff value is set to 0.5.
D) the magnitude of the regression coefficients can be compared for relative importance.

Question 18

Which of the following is true of clustering methods?

A) They are called supervised data mining techniques.
B) They require known values of the dependent variable.
C) They group observations into clusters based on predefined characteristics.
D) They are also known as segmentation methods.

Question 19

Business analytics, while important, is only a part of the area of data mining.

Question 20

The primary advantage of neural networks is that they provide more accurate predictions, especially when the relationships are linear.

Question 21

Exhibit 14.1
The age, weight, gender, and the number of children were tracked for 50 people who either have or have not been diagnosed with a certain illness. The data are provided in the table below.

See Exhibit 14.1. How well can logistic regression classify the people as having the illness or not?

Question 22

Exhibit 14-3
Information for 26 colleges and universities in the state of Indiana is provided in the table below. The following questions contain a sequence of steps for executing the K-Means Clustering Method.

Refer to Exhibit 14-3. Choose any three college indices from 1 to 26 as trial values for the cluster centers. Complete the logic to return the name of the college or university associated with the index as well as the standardized numeric measures.

Question 23

Exhibit 14-3
Information for 26 colleges and universities in the state of Indiana is provided in the table below. The following questions contain a sequence of steps for executing the K-Means Clustering Method.

Refer to Exhibit 14-3. Finally, calculate the minimum total distance to the cluster centers and use Evolutionary Solver to optimize the model. Describe the clusters that are formed.

Question 24

Exhibit 14.1
The age, weight, gender, and the number of children were tracked for 50 people who either have or have not been diagnosed with a certain illness. The data are provided in the table below.

See Exhibit 14.1. How likely would a 40-year-old, 200-pound father of 2 children be to have the illness?

Question 25

Exhibit 14-3
Information for 26 colleges and universities in the state of Indiana is provided in the table below. The following questions contain a sequence of steps for executing the K-Means Clustering Method.

Refer to Exhibit 14-3. As a first step in grouping the Indiana colleges and universities into clusters, standardize the numeric measures.

Question 26

Exhibit 14-3
Information for 26 colleges and universities in the state of Indiana is provided in the table below. The following questions contain a sequence of steps for executing the K-Means Clustering Method.

Refer to Exhibit 14-3. Next, determine to which of the trial cluster centers each college or university would be assigned.

Question 27

Exhibit 14.2
The age, weight, gender, and the number of children were tracked for 25 people. It is not known if these people have or have not been diagnosed with a certain illness. The data are provided in the table below.

See Exhibit 14.2. Using Palisade's NeuralTools and the results from the prior problem, use the prediction data in this table to classify these 25 people as having been diagnosed with an illness or not.

Question 28

Exhibit 14.1
The age, weight, gender, and the number of children were tracked for 50 people who either have or have not been diagnosed with a certain illness. The data are provided in the table below.

See Exhibit 14.1. What kind of person is more likely to be diagnosed with the illness?

Question 29

Exhibit 14-3
Information for 26 colleges and universities in the state of Indiana is provided in the table below. The following questions contain a sequence of steps for executing the K-Means Clustering Method.

Refer to Exhibit 14-3. Determine the distances for each college or university to each cluster center.

Question 30

Exhibit 14.1
The age, weight, gender, and the number of children were tracked for 50 people who either have or have not been diagnosed with a certain illness. The data are provided in the table below.

See Exhibit 14.1. Use neural nets via Palisade's NeuralTools to classify the people as being diagnosed as having an illness or not. (Reserve 20% of the observations for testing.) How well does this method correctly classify people?

Deck 14: Data Mining