Deck 5: Data Mining and Clustering: Key Concepts and Tasks

Question
Which statement is not TRUE regarding a data mining task?

A)Clustering is a descriptive data mining task
B)Classification is a predictive data mining task
C)Regression is a descriptive data mining task
D)Deviation detection is a predictive data mining task
Question
A synonym for data mining is

A)Data Warehouse
B)Knowledge discovery in databases
C)Business intelligence
D)OLAP
Question
Which of the following is not a data mining task?

A)Feature Subset Detection
B)Association Rule Discovery
C)Regression
D)Sequential Pattern Discovery
Question
Which of the following issues is considered before investing in data mining?

A)Functionality
B)Vendor consideration
C)Compatibility
D)All of the above
Question
Data independence means

A)Data is defined separately and not included in programs
B)Programs are not dependent on the physical attributes of data
C)Programs are not dependent on the logical attributes of data
D)Both (B) and (C)
Question
A definition of a concept is ___________ if it classifies any examples as coming within the concept

A)Complete
B)Consistent
C)Constant
D)None of these
Question
Which of the following activities is a data mining task?

A)Monitoring the heart rate of a patient for abnormalities
B)Extracting the frequencies of a sound wave
C)Predicting the outcomes of tossing a (fair) pair of dice
D)Dividing the customers of a company according to their profitability
Question
Data Visualization in mining cannot be done using

A)Photos
B)Graphs
C)Charts
D)Information Graphics
Question
To detect fraudulent usage of credit cards, the following data mining task should be used

A)Outlier analysis
B)Prediction
C)Association analysis
D)Feature selection
Question
A definition of a concept is ___________ if it classifies any examples as coming within the concept

A)Complete
B)Consistent
C)Constant
Question
Classification accuracy is

A)Subdivision of a set of examples into a number of classes
B)Measure of the accuracy of the classification of a concept that is given by a certain theory
C)The task of assigning a classification to a set of examples
Question
A cluster is

A)Group of similar objects that differ significantly from other objects
B)Operations on a database to transform or simplify data in order to prepare it for a machine-learning algorithm
C)Symbolic representation of facts or ideas from which information can potentially be extracted
Question
Assume you want to perform supervised learning to predict the number of newborns from the size of the stork population (http://www.brixtonhealth.com/storksBabies.pdf). This is an example of

A)Classification
B)Regression
C)Clustering
D)Structural equation modeling
Question
A Bayesian classifier is

A)A class of learning algorithms that tries to find an optimum classification of a set of examples using probability theory
B)Any mechanism employed by a learning system to constrain the search space of a hypothesis
C)An approach to the design of learning algorithms that is inspired by the fact that when people encounter new situations, they often explain them by reference to familiar experiences, adapting the explanations to fit the new situation
D)None of these
Question
Background knowledge refers to

A)Additional knowledge used by a learning algorithm to facilitate the learning process
B)A neural network that makes use of a hidden layer
C)It is a form of automatic learning
D)None of these
Question
Which of the following is finally produced by Hierarchical Clustering?

A)Final estimate of cluster centroids
B)Tree showing how close things are to each other
C)Assignment of each point to clusters
D)All of the mentioned
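For reference, the merging process behind hierarchical (agglomerative) clustering can be sketched in a few lines. This is a minimal single-linkage version on one-dimensional points, not a production implementation; the function name `single_linkage` and the sample points are hypothetical and chosen only for illustration. It records each merge and the distance at which it happened.

```python
def single_linkage(points):
    """Agglomerative clustering on 1-D points with single linkage.

    Returns the merge history as (left_cluster, right_cluster, distance)
    tuples, in the order the merges happen.
    """
    clusters = [[p] for p in points]  # start with one cluster per point
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return merges


# Hypothetical sample data, for illustration only.
SAMPLE_POINTS = [1.0, 2.0, 9.0, 10.0, 25.0]

if __name__ == "__main__":
    for left, right, dist in single_linkage(SAMPLE_POINTS):
        print(left, "+", right, "at distance", dist)
```

Every run over n points performs n - 1 merges, and the recorded distances never decrease under single linkage.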
Question
Which of the following combinations is incorrect?

A)Continuous - euclidean distance
B)Continuous - correlation similarity
C)Binary - manhattan distance
D)None of the mentioned
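The measures named in this card can be written out directly. The sketch below uses only the standard library; the function names are hypothetical, and the vectors are toy values for illustration. It also shows that on 0/1 vectors the Manhattan distance counts the positions where the two vectors differ.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pearson(a, b):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


def mismatches(a, b):
    """Number of positions where two vectors differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    print(euclidean([0, 0], [3, 4]))
    print(manhattan([0, 0], [3, 4]))
    print(pearson([1, 2, 3], [2, 4, 6]))
    print(manhattan([1, 0, 1, 1], [0, 0, 1, 0]))
```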
Question
Which of the following ways can be used to represent a graph?

A)Adjacency List and Adjacency Matrix
B)Incidence Matrix
C)Adjacency List, Adjacency Matrix as well as Incidence Matrix
D)No way to represent
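The representations named in this card can be built for a small undirected graph as follows. This is a sketch with hypothetical node names and edges, using plain lists and dictionaries rather than a graph library.

```python
# Hypothetical example graph, for illustration only.
NODES = ["a", "b", "c", "d"]
EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]


def adjacency_list(nodes, edges):
    """Map each node to the list of its neighbours."""
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def adjacency_matrix(nodes, edges):
    """nodes x nodes matrix; entry is 1 when the two nodes share an edge."""
    index = {n: i for i, n in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return matrix


def incidence_matrix(nodes, edges):
    """nodes x edges matrix; entry is 1 when the node is an endpoint of the edge."""
    index = {n: i for i, n in enumerate(nodes)}
    matrix = [[0] * len(edges) for _ in nodes]
    for j, (u, v) in enumerate(edges):
        matrix[index[u]][j] = 1
        matrix[index[v]][j] = 1
    return matrix


if __name__ == "__main__":
    print(adjacency_list(NODES, EDGES))
    print(adjacency_matrix(NODES, EDGES))
    print(incidence_matrix(NODES, EDGES))
```

The adjacency list is compact for sparse graphs, while the two matrix forms trade memory for constant-time lookups.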
Question
Which of the following functions is used for k-means clustering?

A)k-means
B)k-mean
C)heatmap
D)none of the mentioned
Question
K-means is not deterministic, and it also consists of a number of iterations.
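To make the two properties in this statement concrete, here is a minimal sketch of Lloyd's k-means on one-dimensional points. The function name `kmeans_1d`, its parameters, and the sample points are all hypothetical. The starting centroids are drawn at random, which is where the run-to-run variation comes from, and the loop repeats the assign-and-update step until the centroids stop moving.

```python
import random


def kmeans_1d(points, k, seed, max_iter=100):
    """Lloyd's algorithm on 1-D points.

    Returns (sorted centroids, number of iterations used).
    The seed controls the random choice of starting centroids.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initialisation
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            return sorted(centroids), iteration
        centroids = new_centroids
    return sorted(centroids), max_iter


# Hypothetical sample data, for illustration only.
POINTS = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 20.0, 21.0, 22.0]

if __name__ == "__main__":
    for seed in range(3):
        print(seed, kmeans_1d(POINTS, k=3, seed=seed))
```

Fixing the seed makes a run repeatable; changing it can change both the final centroids and the number of iterations.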
Question
N-grams are defined as the combination of N keywords together. How many bi-grams can be generated from the given sentence?

A)7
B)8
C)9
D)10
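The sentence this card refers to is missing from the deck, so the count cannot be checked here. As a sketch of the counting rule, assuming whitespace tokenization, a sentence with n word tokens yields n - 1 consecutive bi-grams. The function name and the example sentence below are hypothetical.

```python
def bigrams(sentence):
    """Return consecutive word pairs, splitting on whitespace."""
    tokens = sentence.split()
    return list(zip(tokens, tokens[1:]))


# Hypothetical sentence with 5 tokens, not the one from the card.
EXAMPLE = "clustering groups similar objects together"

if __name__ == "__main__":
    pairs = bigrams(EXAMPLE)
    print(len(pairs), pairs)
```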
Question
Social networks are a great distribution channel for ___________

A)Customer feedback
B)Viral Content
C)exclusive coupons
D)marketing messages
Question
Two increasingly important ethical aspects of social media are

A)Ratings and traffic.
B)Transparency and privacy.
C)Identity and honesty.
D)Virtue and virality
1
Which statement is not TRUE regarding a data mining task?

A)Clustering is a descriptive data mining task
B)Classification is a predictive data mining task
C)Regression is a descriptive data mining task
D)Deviation detection is a predictive data mining task
Answer: C)Regression is a descriptive data mining task
2
A synonym for data mining is

A)Data Warehouse
B)Knowledge discovery in databases
C)Business intelligence
D)OLAP
Answer: B)Knowledge discovery in databases
3
Which of the following is not a data mining task?

A)Feature Subset Detection
B)Association Rule Discovery
C)Regression
D)Sequential Pattern Discovery
Answer: A)Feature Subset Detection