Deck 3: Basic Data Mining Techniques

Question
Association rule support is defined as

A) the percentage of instances that contain the antecedent conditional items listed in the association rule.
B) the percentage of instances that contain the consequent conditions listed in the association rule.
C) the percentage of instances that contain all items listed in the association rule.
D) the percentage of instances in the database that contain at least one of the antecedent conditional items listed in the association rule.
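As a study aid, support can be computed directly from a set of instances. The sketch below is only an illustration: the transactions and item names are hypothetical and do not come from the deck, and `support` is a helper name chosen here. It counts the instances that contain every item named in the rule, antecedent and consequent together, and divides by the total number of instances.

```python
# Hypothetical toy instances; the item names are illustrative only.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread"},
    {"milk", "eggs"},
]


def support(antecedent, consequent, transactions):
    """Fraction of instances that contain every item listed in the rule."""
    items = set(antecedent) | set(consequent)
    # An instance counts only if the rule's full item set is a subset of it.
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


# Rule: IF milk THEN bread. Two of the four instances hold both items.
print(support({"milk"}, {"bread"}, transactions))  # 0.5
```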
Question
This approach is best when we are interested in finding all possible interactions among a set of attributes.

A) decision tree
B) association rules
C) K-Means algorithm
D) genetic learning
Question
Use these tables to answer questions 5 and 6.
Based on the two-item set table, which of the following is not a possible two-item set rule?

A) IF Life Ins Promo = Yes THEN Magazine Promo = Yes
B) IF Watch Promo = No THEN Magazine Promo = Yes
C) IF Card Insurance = No THEN Magazine Promo = Yes
D) IF Life Ins Promo = No THEN Card Insurance = No
Question
Which statement is true about the K-Means algorithm?

A) All attribute values must be categorical.
B) The output attribute must be categorical.
C) Attribute values may be either categorical or numeric.
D) All attributes must be numeric.
Question
Which statement is true about the decision tree attribute selection process described in your book?

A) A categorical attribute may appear in a tree node several times but a numeric attribute may appear at most once.
B) A numeric attribute may appear in several tree nodes but a categorical attribute may appear at most once.
C) Both numeric and categorical attributes may appear in several tree nodes.
D) Numeric and categorical attributes may appear in at most one tree node.
Question
An evolutionary approach to data mining.

A) backpropagation learning
B) genetic learning
C) decision tree learning
D) linear regression
Question
Given a rule of the form IF X THEN Y, rule confidence is defined as the conditional probability that

A) Y is true when X is known to be true.
B) X is true when Y is known to be true.
C) Y is false when X is known to be false.
D) X is false when Y is known to be false.
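Rule confidence can be estimated from data as a conditional relative frequency. The following is a minimal sketch under stated assumptions: the instances are hypothetical, and `confidence` is a helper name introduced here, not something taken from the deck. It restricts attention to the instances where X holds and measures how often Y also holds among them.

```python
def confidence(x, y, instances):
    """Estimate P(Y | X): among instances containing X, the share that also contain Y."""
    x, y = set(x), set(y)
    covered = [t for t in instances if x <= t]
    if not covered:
        raise ValueError("the antecedent never occurs, so confidence is undefined")
    return sum(1 for t in covered if y <= t) / len(covered)


# Hypothetical toy instances; the item names are illustrative only.
instances = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread"},
    {"milk", "eggs"},
]

# The measure is not symmetric: swapping X and Y generally changes the value.
print(confidence({"milk"}, {"eggs"}, instances))  # 2 of 3 milk instances contain eggs
print(confidence({"eggs"}, {"milk"}, instances))  # both egg instances contain milk
```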
Question
The K-Means algorithm terminates when

A) a user-defined minimum value for the summation of squared error differences between instances and their corresponding cluster center is seen.
B) the cluster centers for the current iteration are identical to the cluster centers for the previous iteration.
C) the number of instances in each cluster for the current iteration is identical to the number of instances in each cluster of the previous iteration.
D) the number of clusters formed for the current iteration is identical to the number of clusters formed in the previous iteration.
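For reference while studying, here is a small pure-Python sketch of the usual K-Means loop. It is an illustration rather than the textbook's exact procedure: the points and starting centers are hypothetical, and the `max_iter` guard is an added safety assumption. The loop assigns each instance to its nearest center, recomputes each center as the mean of its cluster, and compares the new centers with the previous ones as one common stopping check.

```python
def k_means(points, centers, max_iter=100):
    """Plain K-Means on numeric tuples; stops once the centers stop changing."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):  # max_iter is a safety guard added for this sketch
        clusters = [[] for _ in centers]
        for p in points:
            # Assign the point to the center with the smallest squared distance.
            nearest = min(
                range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Recompute each center as the mean of its cluster (keep it if the cluster is empty).
        new_centers = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


# Hypothetical numeric instances and initial centers.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centers, clusters = k_means(points, [(1.0, 1.0), (9.0, 8.0)])
print(centers)  # [(1.0, 1.5), (8.5, 8.0)]
```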
Question
Use the confusion matrix for Model X and confusion matrix for Model Y to answer questions 4 through 6.
A data mining algorithm is unstable if

A) test set accuracy depends on the ordering of test set instances.
B) the algorithm builds models unable to classify outliers.
C) the algorithm is highly sensitive to small changes in the training data.
D) test set accuracy depends on the choice of input attributes.
Question
Use these tables to answer questions 5 and 6.
One two-item set rule that can be generated from the tables above is: IF Magazine Promo = Yes THEN Life Ins Promo = Yes
The confidence for this rule is:

A) 5 / 7
B) 5 / 12
C) 7 / 12
D) 1
Question
Construct a decision tree with root node Type from the data in the table below. The first row contains attribute names. Each row after the first represents the values for one data instance. The output attribute is Class.
Question
The computational complexity as well as the explanation offered by a genetic algorithm is largely determined by the

A) fitness function
B) techniques used for crossover and mutation
C) training data
D) population of elements
Question
A genetic learning operation that creates new population elements by combining parts of two or more existing elements.

A) selection
B) crossover
C) mutation
D) absorption
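To make the idea of combining parts of existing elements concrete, the sketch below shows single-point crossover on bit strings. This is one common variant chosen for illustration, not necessarily the one the textbook uses; the function name, the parent strings, and the fixed cut point are all hypothetical.

```python
def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length elements at the given cut point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0 < point < len(parent_a):
        raise ValueError("cut point must fall strictly inside the element")
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2


# Hypothetical parents encoded as bit strings, cut after the third position.
print(single_point_crossover("11110000", "00001111", 3))
# ('11101111', '00010000')
```

In practice the cut point is usually drawn at random; it is fixed here so the result is reproducible.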
1
Association rule support is defined as

A) the percentage of instances that contain the antecedent conditional items listed in the association rule.
B) the percentage of instances that contain the consequent conditions listed in the association rule.
C) the percentage of instances that contain all items listed in the association rule.
D) the percentage of instances in the database that contain at least one of the antecedent conditional items listed in the association rule.
Answer: C
2
This approach is best when we are interested in finding all possible interactions among a set of attributes.

A) decision tree
B) association rules
C) K-Means algorithm
D) genetic learning
Answer: B
3
Use these tables to answer questions 5 and 6.
Based on the two-item set table, which of the following is not a possible two-item set rule?

A) IF Life Ins Promo = Yes THEN Magazine Promo = Yes
B) IF Watch Promo = No THEN Magazine Promo = Yes
C) IF Card Insurance = No THEN Magazine Promo = Yes
D) IF Life Ins Promo = No THEN Card Insurance = No
Answer: D