Deck 5: Data Mining and Clustering: Key Concepts and Tasks
1
Which statement about data mining tasks is NOT true?
A)Clustering is a descriptive data mining task
B)Classification is a predictive data mining task
C)Regression is a descriptive data mining task
D)Deviation detection is a predictive data mining task
Answer: C)Regression is a descriptive data mining task
2
A synonym for data mining is
A)Data Warehouse
B)Knowledge discovery in databases
C)Business intelligence
D)OLAP
Answer: B)Knowledge discovery in databases
3
Which of the following is not a data mining task?
A)Feature Subset Detection
B)Association Rule Discovery
C)Regression
D)Sequential Pattern Discovery
Answer: A)Feature Subset Detection
4
Which of the following issues should be considered before investing in data mining?
A)Functionality
B)Vendor consideration
C)Compatibility
D)All of the above
5
Data independence means
A)Data is defined separately and not included in programs
B)Programs are not dependent on the physical attributes of data
C)Programs are not dependent on the logical attributes of data
D)Both (B) and (C)
6
A definition of a concept is ___________ if it classifies any examples as coming within the concept
A)Complete
B)Consistent
C)Constant
D)None of these
7
Which of the following activities is a data mining task?
A)Monitoring the heart rate of a patient for abnormalities
B)Extracting the frequencies of a sound wave
C)Predicting the outcomes of tossing a (fair) pair of dice
D)Dividing the customers of a company according to their profitability
8
Data visualization in data mining cannot be done using
A)Photos
B)Graphs
C)Charts
D)Information Graphics
9
Which data mining task should be used to detect fraudulent usage of credit cards?
A)Outlier analysis
B)Prediction
C)Association analysis
D)Feature selection
10
A definition of a concept is ___________ if it classifies any examples as coming within the concept
A)Complete
B)Consistent
C)Constant
11
Classification accuracy is
A)Subdivision of a set of examples into a number of classes
B)Measure of the accuracy of the classification of a concept that is given by a certain theory
C)The task of assigning a classification to a set of examples
12
A cluster is
A)Group of similar objects that differ significantly from other objects
B)Operations on a database to transform or simplify data in order to prepare it for a machine-learning algorithm
C)Symbolic representation of facts or ideas from which information can potentially be extracted
13
Assume you want to perform supervised learning to predict the number of newborns from the size of the stork population (http://www.brixtonhealth.com/storksBabies.pdf). This is an example of
A)Classification
B)Regression
C)Clustering
D)Structural equation modeling
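For reference, predicting a numeric quantity from a numeric input can be sketched as an ordinary least-squares line fit. This is only an illustration under stated assumptions: the numbers are made up and are not taken from the linked paper, and `fit_line` and `predict` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (unnormalised; the factor 1/n cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data: stork population (input) and newborns (numeric target).
storks = [1.0, 2.0, 3.0, 4.0]
births = [3.0, 5.0, 7.0, 9.0]

model = fit_line(storks, births)
estimate = predict(model, 5.0)  # numeric prediction for an unseen input
```

The target here is a continuous number rather than a class label, which is what distinguishes this kind of supervised task from assigning examples to categories.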
14
A Bayesian classifier is
A)A class of learning algorithms that tries to find an optimum classification of a set of examples using probability theory
B)Any mechanism employed by a learning system to constrain the search space of a hypothesis
C)An approach to the design of learning algorithms that is inspired by the fact that when people encounter new situations, they often explain them by reference to familiar experiences, adapting the explanations to fit the new situation
D)None of these
15
Background knowledge refers to
A)Additional acquaintance used by a learning algorithm to facilitate the learning process
B)A neural network that makes use of a hidden layer
C)It is a form of automatic learning
D)None of these
16
Which of the following is the final output of hierarchical clustering?
A)Final estimate of cluster centroids
B)Tree showing how close things are to each other
C)Assignment of each point to clusters
D)All of the mentioned
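For reference, a minimal sketch of agglomerative (bottom-up) hierarchical clustering on one-dimensional points, assuming single linkage. The function name `single_linkage` and the sample points are hypothetical; the returned list of merges is the information a dendrogram would draw.

```python
def single_linkage(points):
    """Agglomerative clustering on 1-D points; returns the merges in order."""
    clusters = [(p,) for p in points]  # start: every point is its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


merges = single_linkage([1.0, 2.0, 6.0, 11.0])
for left, right, distance in merges:
    print(left, "+", right, "at distance", distance)
```

Each entry records which two clusters were joined and at what distance, so the sequence describes the full merge hierarchy rather than a single flat partition.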
17
Which of the following combinations is incorrect?
A)Continuous - Euclidean distance
B)Continuous - correlation similarity
C)Binary - Manhattan distance
D)None of the mentioned
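For reference, the three measures named in the options can be sketched in plain Python as follows. The function names and sample vectors are hypothetical, and `correlation` is taken here to mean the Pearson correlation coefficient.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def correlation(a, b):
    """Pearson correlation coefficient between two numeric vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


d_euclid = euclidean((0.0, 0.0), (3.0, 4.0))       # continuous features
d_binary = manhattan((1, 0, 1, 1), (0, 0, 1, 0))   # binary features: counts mismatched positions
r = correlation((1.0, 2.0, 3.0), (2.0, 4.0, 6.0))  # perfectly linearly related vectors
```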
18
Which of the following ways can be used to represent a graph?
A)Adjacency List and Adjacency Matrix
B)Incidence Matrix
C)Adjacency List, Adjacency Matrix as well as Incidence Matrix
D)No way to represent
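For reference, the representations named in the options can be built for a small undirected graph as sketched below. The helper names and the three-vertex example graph are hypothetical.

```python
def adjacency_list(n, edges):
    """Map each vertex to the list of its neighbours."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def adjacency_matrix(n, edges):
    """n x n matrix with 1 where two vertices share an edge."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


def incidence_matrix(n, edges):
    """n x len(edges) matrix with 1 where a vertex is an endpoint of an edge."""
    matrix = [[0] * len(edges) for _ in range(n)]
    for e, (u, v) in enumerate(edges):
        matrix[u][e] = 1
        matrix[v][e] = 1
    return matrix


# Path graph 0 - 1 - 2
edges = [(0, 1), (1, 2)]
adj = adjacency_list(3, edges)
adj_m = adjacency_matrix(3, edges)
inc_m = incidence_matrix(3, edges)
```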
19
Which of the following functions is used for k-means clustering?
A)k-means
B)k-mean
C)heatmap
D)None of the mentioned
20
K-means is not deterministic, and it also consists of a number of iterations.
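The statement above can be illustrated with a minimal sketch of the usual iterative procedure (assignment step, then centroid update, repeated until nothing changes). To keep the run reproducible, this sketch passes the starting centroids explicitly instead of drawing them at random; with random initialisation, which start is drawn is what can vary between runs. The names `kmeans`, `sq_dist` and `sse` and the four sample points are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, start, max_iter=100):
    """Iterate assignment and update steps from the given starting centroids."""
    centroids = list(start)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))


# Corners of a wide rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

cent_a, labels_a = kmeans(points, [points[0], points[2]])  # start: one centroid per side
cent_b, labels_b = kmeans(points, [points[0], points[1]])  # start: both on the left side

print(labels_a, sse(points, cent_a, labels_a))
print(labels_b, sse(points, cent_b, labels_b))
```

The two starts converge to different partitions of the same data, with different within-cluster error, which is why implementations commonly run several initialisations and keep the best result.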
21
N-grams are defined as the combination of N keywords together. How many bi-grams can be generated from the given sentence?
A)7
B)8
C)9
D)10
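The sentence this card refers to is not preserved here, so the following is only a sketch of how bi-grams are counted in general, assuming whitespace tokenisation and consecutive keywords. The function name `ngrams` and the example sentence are hypothetical and are not the card's sentence.

```python
def ngrams(sentence, n=2):
    """Return all runs of n consecutive whitespace-separated tokens."""
    tokens = sentence.split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical example sentence with five tokens.
example = "clustering groups similar objects together"
bigrams = ngrams(example)
print(len(bigrams), bigrams)
```

Under these assumptions a sentence of k tokens yields k - 1 bi-grams, so the count for any given sentence follows from its token count.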
22
Social networks are a great distribution channel for ___________
A)Customer feedback
B)Viral content
C)Exclusive coupons
D)Marketing messages
23
Two increasingly important ethical aspects of social media are
A)Ratings and traffic
B)Transparency and privacy
C)Identity and honesty
D)Virtue and virality