Deck 5: Data Mining for Business Intelligence
1
Data mining offers organizations a decision-enhancing environment to exploit opportunities by transforming data into a strategic weapon.
True
2
Two types of categorical data are nominal data and ordinal data.
True
3
Mass, length, time, plane angle, energy, and electric charge are examples of physical measures whose data are represented in interval scales.
False
4
Technically speaking, data mining is a process that uses statistical, mathematical, and artificial intelligence techniques to extract and identify useful information and subsequent knowledge from large sets of data.
5
Associations are a type of pattern that discovers time-ordered events, such as predicting that an existing banking customer who already has a checking account will open a savings account followed by an investment account within a year.
6
The variable marital status can be categorized using the codes (1) single, (2) married, and (3) divorced. These codes are examples of ordinal data.
7
Data mining is a prime candidate for better management of companies that are data-rich but knowledge-poor.
8
Data types such as date/time, unstructured text, image, and audio need to be converted into some form of categorical or numeric representation before they can be processed by data mining algorithms.
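The conversion this card describes can be illustrated with a minimal Python sketch. The function names, the chosen timestamp features, and the tiny vocabulary are assumptions made for illustration; real projects typically use richer encodings.

```python
from datetime import datetime
from collections import Counter

def encode_timestamp(ts: str) -> dict:
    """Turn an ISO-format timestamp string into numeric features."""
    dt = datetime.fromisoformat(ts)
    return {
        "year": dt.year,
        "month": dt.month,
        "day_of_week": dt.weekday(),  # Monday = 0 ... Sunday = 6
        "hour": dt.hour,
    }

def bag_of_words(text: str, vocabulary: list[str]) -> list[int]:
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical inputs, chosen only to show the shape of the output.
features = encode_timestamp("2024-03-15T14:30:00")
vector = bag_of_words("late delivery and late refund", ["late", "refund", "damaged"])
print(features)
print(vector)
```

Both outputs are plain numbers, which is the form most mining algorithms expect as input.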
9
Data mining is a way for companies to develop business intelligence from their data to gain a better understanding of their customers and operations and to solve complex organizational problems.
10
The first step in the data mining process is to understand the relevant data from the available databases.
11
Compared to the other steps in CRISP-DM, data preprocessing consumes the most time and effort; most believe that this step accounts for roughly 80 percent of the total time spent on a data mining project.
12
The data mining environment is usually a client-server architecture or a Web-based information systems architecture.
13
Clustering partitions a collection of things, such as objects and events presented in a structured dataset, into segments whose members share similar characteristics.
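As a rough illustration of partitioning records into segments of similar members, here is a minimal k-means sketch. The card does not name an algorithm, so k-means is an assumption, as are the sample points and the starting centroids.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then move each centroid
    to the mean of its assigned points; repeat a fixed number of times."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    # clusters holds the assignment made in the final pass.
    return centroids, clusters

# Hypothetical 2-D records forming two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
print(clusters)
```

Each resulting segment contains points that are closer to one another than to the points in the other segment.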
14
Data mining requires a separate, dedicated database.
15
In order to be applied successfully, a data mining study must be viewed as a set of automated software tools and techniques.
16
Business analytics cannot be conducted in real time because it includes components such as metrics and reengineering tools.
17
A common example of interval scale measurement is temperature on the Celsius scale.
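A short numeric sketch shows why Celsius is an interval scale rather than a ratio scale: differences are meaningful, but ratios are not, because 0 °C is an arbitrary zero. The two sample temperatures are assumptions chosen for illustration.

```python
def celsius_to_kelvin(c: float) -> float:
    """Convert to Kelvin, a scale with a nonarbitrary zero."""
    return c + 273.15

a_c, b_c = 10.0, 20.0

# Differences agree across both scales.
diff_c = b_c - a_c
diff_k = celsius_to_kelvin(b_c) - celsius_to_kelvin(a_c)

# Ratios do not: 20 degrees C is not "twice as hot" as 10 degrees C.
ratio_c = b_c / a_c
ratio_k = celsius_to_kelvin(b_c) / celsius_to_kelvin(a_c)

print(diff_c, diff_k)
print(ratio_c, ratio_k)
```

The Celsius ratio comes out as 2, while the Kelvin ratio is only slightly above 1, so the "twice" reading is an artifact of where the Celsius zero happens to sit.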
18
At the highest level of abstraction, all data can be divided into interval data and ratio data.
19
Cross-Industry Standard Process for Data Mining, or CRISP-DM, is one of the most popular nonproprietary standard methodologies for data mining.
20
Prediction tells the nature of future occurrences of certain events based on what has happened in the past, such as predicting the winner of the Super Bowl or forecasting the absolute temperature of a particular day.
21
A query that can be run by an end user and is not programmed in advance is considered a(n) ________ query.
A) tailored
B) casual
C) informal
D) ad hoc
22
Because the latter steps in the data mining process are built on the outcome of the former ones, one should:
A) start with an understanding of the relevant data.
B) work quickly through the early steps and work in-depth on the latter steps.
C) start by cleaning the relevant data and storing it in a single data warehouse.
D) pay extra attention to the earlier steps in order not to put the whole study on an incorrect path from the onset.
23
The term data mining was originally used to ________.
A) include most forms of data analysis in order to increase sales
B) describe the analysis of huge datasets stored in data warehouses
C) describe the process through which previously unknown patterns in data were discovered
D) All of the above
24
A good question to ask with respect to the patterns/relationships that association rule mining can discover is "Are all association rules interesting and useful?" In order to answer such a question, association rule mining uses two common metrics: ________ and ________.
A) mean; median
B) support; confidence
C) standard deviation; confidence interval
D) regression; distance measure
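For reference, support and confidence (the pair named in option B) can be computed as in the sketch below. The transactions and helper names are hypothetical; support is taken as the share of transactions containing an itemset, and confidence as the share of antecedent transactions that also contain the consequent.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))

def support(transactions, itemset):
    """Fraction of all transactions that contain the itemset."""
    return count(transactions, itemset) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Fraction of antecedent transactions that also hold the consequent."""
    both = set(antecedent) | set(consequent)
    return count(transactions, both) / count(transactions, antecedent)

# Hypothetical market baskets.
transactions = [
    {"laptop", "antivirus", "warranty"},
    {"laptop", "antivirus"},
    {"laptop", "warranty"},
    {"printer", "paper"},
    {"laptop", "antivirus", "paper"},
]

s = support(transactions, {"laptop", "antivirus"})        # 3 of 5 baskets
c = confidence(transactions, {"laptop"}, {"antivirus"})   # 3 of 4 laptop baskets
print(s, c)
```

A rule is usually kept only if both values clear analyst-chosen minimum thresholds.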
25
Data mining seeks to identify patterns in the data. All of the following are major types of patterns except:
A) associations
B) regression
C) predictions
D) clusters
26
Data mining is tightly positioned at the intersection of many disciplines. Those disciplines include all of the following except:
A) management science
B) statistics
C) information systems and databases
D) logistics
27
On the commercial side, the most common use of data mining has been in ________ sectors.
A) manufacturing and health care
B) finance, retail, and health care
C) online retail and government
D) R&D and scientific
28
Because data mining is driven by experience and experimentation, depending on the problem situation and the analyst's knowledge, the whole process can be ________ and ________, i.e., one should expect to go back and forth through the steps quite a few times.
A) simple; iterative
B) expensive; hypothetical
C) time-consuming; iterative
D) time-consuming; hypothetical
29
In an article in Harvard Business Review, Thomas Davenport (2006) argued that the latest strategic weapon for companies is ________.
A) customer relationship management
B) e-commerce
C) online auctions
D) analytical decision making
30
Business analytics and data mining provided 1-800-Flowers with all of the following benefits except:
A) More efficient marketing campaigns.
B) Increased mailings and response rates.
C) Better customer experience and retention.
D) Increased repeat sales.
31
Numeric data represent the numeric values of specific variables, which are ________ variables that can take on an infinite number of fractional values.
A) discrete
B) continuous
C) interval
D) ratio
32
________ data mining begins with a proposition by the user, who then seeks to validate the truthfulness of the proposition. For example, a marketing manager may begin with the following proposition: "Are BluRay player sales related to sales of HDTV sets?"
A) Hypothesis-driven
B) Theory-driven
C) Discovery-driven
D) Data-driven
33
Predictive analysis uses sophisticated algorithms that can be designed to sift through data and identify ________.
A) patterns of behavior
B) outliers
C) average statistics
D) minimum and maximum statistics
34
Data that has a meaningful, or nonarbitrary, zero point is ________ data.
A) categorical
B) nominal
C) interval
D) ratio
35
Why has data mining gained the attention of the business world?
A) More intense competition at the global scale driven by customers' ever-changing needs and wants in an increasingly saturated marketplace.
B) Consolidation and integration of database records, which enables a single view of customers and vendors.
C) Significant reduction in the cost of hardware and software for data storage and processing.
D) All of the above
36
What is a major characteristic of data mining?
A) Data mining tools are readily combined with spreadsheets and other software development tools.
B) Because of the large amounts of data and massive search efforts, it is sometimes necessary to use serial processing for data mining.
C) Data are often buried within numerous small large databases, which sometimes contain data from several years.
D) The miner needs sophisticated programming skill.
37
The simple split methodology splits the data into two mutually exclusive subsets called a ________ set and a ________ set.
A) training; test
B) positive; negative
C) holdout; training
D) matrix; test
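A simple split into two mutually exclusive subsets might look like the following sketch. The 70/30 proportion, the fixed seed, and the function name are assumptions for illustration.

```python
import random

def simple_split(records, test_fraction=0.3, seed=42):
    """Shuffle the records once, then cut them into two disjoint subsets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # First slice is held out; the remainder is used to build the model.
    return shuffled[n_test:], shuffled[:n_test]

train, test = simple_split(range(10))
print(len(train), len(test))
```

Every record lands in exactly one of the two subsets, so the model is never evaluated on data it was built from.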
38
At the highest level of abstraction, data can be classified as ________ and ________.
A) alpha; numeric
B) categorical; numerical
C) nominal; ratio
D) interval; ratio
39
________, or supervised induction, is perhaps the most common of all data mining tasks. Its objective is to analyze the historical data stored in a database and automatically generate a model that can predict future behavior.
A) Association
B) Clustering
C) Prediction
D) Classification
40
________ are essentially a hierarchy of if-then statements. They are most appropriate for categorical and interval data.
A) Neural nets
B) Clusters
C) Decision trees
D) Time-series
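The "hierarchy of if-then statements" idea can be made concrete with a hand-written sketch. The attributes, thresholds, and outcomes below are entirely hypothetical; in practice such a hierarchy is learned from data rather than coded by hand.

```python
def approve_loan(income: str, has_collateral: bool) -> str:
    """A tiny hand-built tree: the root tests income, one branch tests collateral."""
    if income == "high":
        return "approve"
    elif income == "medium":
        # Second-level test, reached only on the "medium" branch.
        if has_collateral:
            return "approve"
        return "review"
    else:
        return "decline"

print(approve_loan("medium", True))
print(approve_loan("low", True))
```

Each path from the first test down to a returned label corresponds to one rule of the form "if ... and ... then ...".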
41
With ________, a fixed number of instances from the original data is sampled (with replacement) for training and the rest of the dataset is used for testing. This process is repeated as many times as desired.
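The sampling scheme described in this card (commonly called bootstrapping) can be sketched as follows. The function name, the sample size, and the seed are assumptions for illustration.

```python
import random

def bootstrap_split(records, sample_size, seed=0):
    """Draw sample_size records with replacement for training;
    records that were never drawn form the test set."""
    rng = random.Random(seed)
    indices = [rng.randrange(len(records)) for _ in range(sample_size)]
    chosen = set(indices)
    train = [records[i] for i in indices]
    test = [records[i] for i in range(len(records)) if i not in chosen]
    return train, test

records = list(range(20))
train, test = bootstrap_split(records, sample_size=20)
print(len(train), len(test))
```

Because sampling is with replacement, some records appear several times in the training set while others are left out entirely; calling the function again with a different seed gives another repetition of the process.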
42
________ partitions a collection of things, such as objects and events stored in a dataset, into segments whose members share similar characteristics.
43
The ________ algorithm is the most commonly used algorithm to discover association rules.
44
________ data contains codes assigned to objects or events as labels that also represent the rank order among them; for example, credit score.
45
________ data, also known as categorical data, contains both nominal and ordinal data.
46
________ data contains measurements of simple codes assigned to objects as labels, such as marital status, but those labels are not measurements.
47
Using existing and relevant data, data mining builds models to identify ________ among the attributes presented in the dataset.
48
________ tells the nature of future occurrences of certain events based on what has happened in the past, such as predicting the winner of the Super Bowl or forecasting the absolute temperature of a particular day.
49
________ measures the extent of uncertainty or randomness in a dataset.
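Assuming the measure meant here is Shannon entropy, the usual choice in decision-tree induction, it can be computed over class labels as in this sketch. The label lists are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of the class distribution in labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

mixed = entropy(["yes", "yes", "no", "no"])    # evenly split classes
pure = entropy(["yes", "yes", "yes", "yes"])   # a single class
print(mixed, pure)
```

An evenly split two-class set gives the maximum of 1 bit, and a set containing a single class gives zero, meaning no uncertainty remains.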
50
________ means that discovered patterns in a dataset hold true on new data with a sufficient degree of certainty.
51
________ data can be readily represented by some sort of probability distribution.
52
________ data mining finds patterns, associations, and other relationships hidden within datasets. It can uncover facts that an organization had not previously known or contemplated.
53
Generally speaking, data mining tasks can be classified into three main categories: ________, association, and clustering.
54
The ________ index has been used in economics to measure the diversity of a population. The same concept can be used to determine the purity of a specific class as a result of a decision to branch along a particular attribute or variable.
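Assuming the index in question is the Gini index, purity before and after a candidate branch can be compared as in this sketch. The class labels and the split are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure group, larger for more mixed groups."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def gini_after_split(groups):
    """Size-weighted average impurity of the groups a branch produces."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * gini(g) for g in groups)

before = gini(["a", "a", "b", "b"])
after = gini_after_split([["a", "a"], ["b", "b"]])
print(before, after)
```

A branch that lowers the weighted impurity, as this one does, separates the classes more cleanly than leaving the records together.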
55
________ data mining begins with a proposition by the user, who then seeks to validate the truthfulness of the proposition.
56
With ________ forecasting, the data are a series of values of the same variable that is captured and stored over time. These data are then used to develop models to extrapolate the future values of the same phenomenon.
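One very simple way to extrapolate a series of past values is a moving-average forecast, sketched below. The technique, the window length, and the sales figures are assumptions chosen for illustration; the card does not prescribe a particular model.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return sum(series[-window:]) / window

# Hypothetical monthly sales of a single product.
monthly_sales = [100, 110, 120, 130, 140, 150]
next_month = moving_average_forecast(monthly_sales)
print(next_month)
```

The forecast uses only earlier values of the same variable, which is what distinguishes this kind of forecasting from models that rely on other attributes.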
57
The most commonly used measure to calculate the closeness between pairs of items in cluster analysis is the ________.
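Assuming the measure intended is Euclidean distance, closeness between two items described by numeric attributes can be computed as in this sketch. The two sample points are hypothetical.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

d = euclidean_distance((1.0, 2.0), (4.0, 6.0))
print(d)
```

Smaller values mean the two items are more alike; attributes on very different scales are usually normalized first so that no single attribute dominates the result.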
58
________ data represent the labels of multiple classes used to divide a variable into specific groups.
59
The model's ability to make reasonably accurate predictions, given noisy data or data with missing and erroneous values, is called ________.
60
Mass, length, time, energy, and electric charge are examples of ________ data because their scales have a nonarbitrary zero value.
61
List five of the data mining mistakes often made in practice.
62
List three of the major characteristics and objectives of data mining.
63
List the four data preprocessing steps.
64
List and briefly describe three of the major types of patterns that data mining attempts to identify.
65
Identify and describe the two types of categorical data. Give an example of each.
66
List and briefly explain four of the factors to be considered when assessing a model to be used for classification.
67
Explain the simple split methodology for classification. Explain the advantages of the k-fold cross-validation methodology over the simple split methodology.
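For comparison with a single split, the partitioning behind k-fold cross-validation can be sketched as below. The function name and the round-robin fold assignment are assumptions; implementations often shuffle the records first.

```python
def k_fold_indices(n_records, k):
    """Yield (train, test) index lists; each fold serves as the test set once."""
    indices = list(range(n_records))
    # Round-robin assignment keeps fold sizes within one record of each other.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_indices(10, 5))
for train, test in splits:
    print(test)
```

Every record is used for testing exactly once and for training in the remaining folds, so the accuracy estimate does not hinge on one particular random partition.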
68
Identify and describe the two types of numerical data. Give an example of each.
69
List the six steps in the CRISP-DM Data Mining Process.