Deck 10: Data Quality and Integration
1
One way to improve the data capture process is to:
A) allow all data to be entered manually.
B) provide little or no training to data entry operators.
C) check entered data immediately for quality against data in the database.
D) not use any automatic data entry routines.
C
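As a minimal sketch of checking entered data immediately against data already in the database, the example below validates one new record against reference sets. The names `KNOWN_CUSTOMER_IDS`, `VALID_STATE_CODES`, and `check_entry` are hypothetical and stand in for real database lookups.

```python
# Hypothetical reference data standing in for values already in the database.
KNOWN_CUSTOMER_IDS = {"C001", "C002", "C003"}
VALID_STATE_CODES = {"CA", "NY", "TX"}

def check_entry(record):
    """Return a list of problems found in one newly entered record."""
    problems = []
    if record.get("customer_id") not in KNOWN_CUSTOMER_IDS:
        problems.append("unknown customer_id")
    if record.get("state") not in VALID_STATE_CODES:
        problems.append("invalid state code")
    return problems

# A record that matches the reference data, and one that does not.
good = check_entry({"customer_id": "C002", "state": "NY"})
bad = check_entry({"customer_id": "C999", "state": "ZZ"})
print(good)
print(bad)
```

Rejecting or flagging the record at entry time keeps the error from reaching downstream systems.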
2
Conformance means that:
A) data have been transformed.
B) data are stored, exchanged or presented in a format that is specified by its metadata.
C) data are stored in a way to expedite retrieval.
D) none of the above.
B
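One way to picture conformance is a check of each field value against the format its metadata specifies. This is only an illustrative sketch: the `METADATA` dictionary, its field names, and the regular-expression formats are assumptions, not part of the deck.

```python
import re

# Hypothetical metadata: each field name maps to the format it must follow.
METADATA = {
    "order_date": r"\d{4}-\d{2}-\d{2}",  # ISO-style date such as 2024-01-31
    "zip_code": r"\d{5}",                # five-digit postal code
}

def nonconforming_fields(record):
    """Return the names of fields whose values do not match the metadata format."""
    return [
        field
        for field, pattern in METADATA.items()
        if re.fullmatch(pattern, str(record.get(field, ""))) is None
    ]

# The date uses a different layout than the metadata specifies; the zip code conforms.
result = nonconforming_fields({"order_date": "01/31/2024", "zip_code": "90210"})
print(result)
```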
3
Quality data can be defined as being:
A) unique.
B) inaccurate.
C) historical.
D) precise.
A
4
All of the following are popular architectures for Master Data Management EXCEPT:
A) Identity Registry.
B) Integration Hub.
C) Persistent Object.
D) Normalization.
5
The best place to improve data entry across all applications is:
A) in the users.
B) in the level of organizational commitment.
C) in the database definitions.
D) in the data entry operators.
6
Which of the following are key steps in a data quality program?
A) Conduct a data quality audit
B) Apply TQM principles and practices
C) Estimate return on investment
D) All of the above
7
In the ________ approach, one consolidated record is maintained and all applications draw on that one actual "golden" record.
A) Persistent
B) Identity Registry
C) Federated
D) Integration Hub
8
Data governance can be defined as:
A) a means to slow down the speed of data.
B) high-level organizational groups and processes that oversee data stewardship.
C) a government task force for defining data quality.
D) none of the above.
9
Data quality is important for all of the following reasons EXCEPT:
A) it minimizes project delay.
B) it aids in making timely business decisions.
C) it provides a stream of profit.
D) it helps to expand the customer base.
10
Data quality problems can cascade when:
A) data are not deleted properly.
B) data are copied from legacy systems.
C) there is redundant data storage and inconsistent metadata.
D) there are data entry problems.
11
TQM stands for:
A) Thomas Quinn Mann, a famous data quality innovator.
B) Total Quality Manipulation.
C) Transforming Quality Management.
D) Total Quality Management.
12
Data quality ROI stands for:
A) return on investment.
B) risk of incarceration.
C) rough outline inclusion.
D) none of the above.
13
All of the following are ways to consolidate data EXCEPT:
A) application integration.
B) data rollup and integration.
C) business process integration.
D) user interaction integration.
14
One characteristic of quality data which pertains to the expectation for the time between when data are expected and when they are available for use is:
A) currency.
B) consistency.
C) referential integrity.
D) timeliness.
15
High quality data are data that are:
A) accurate.
B) consistent.
C) available in a timely fashion.
D) all of the above.
16
External data sources present problems for data quality because:
A) data are not always available.
B) there is a lack of control over data quality.
C) there are poor data capture controls.
D) data are unformatted.
17
The methods to ensure the quality of data across various subject areas are called:
A) Variable Data Management.
B) Master Data Management.
C) Joint Data Management.
D) Managed Data Management.
18
One simple task of a data quality audit is to:
A) interview all users.
B) statistically profile all files.
C) load all data into a data warehouse.
D) establish quality metrics.
19
Data federation is a technique which:
A) creates an integrated database from several separate databases.
B) creates a distributed database.
C) provides a virtual view of integrated data without actually creating one centralized database.
D) provides a real-time update of shared data.
20
________ duplicates data across databases.
A) Data propagation
B) Data duplication
C) Redundant replication
D) A replication server
21
An approach to filling a data warehouse that employs bulk rewriting of the target data periodically is called:
A) dump mode.
B) overwrite mode.
C) refresh mode.
D) update mode.
22
A technique using artificial intelligence to upgrade the quality of raw data is called:
A) dumping.
B) data reconciliation.
C) completion backwards updates.
D) data scrubbing.
23
Event-driven propagation:
A) provides a means to duplicate data for events.
B) pushes data to duplicate sites as an event occurs.
C) pulls duplicate data from redundant sites.
D) none of the above.
24
Dirty data saves work for information systems projects.
25
A method of capturing only the changes that have occurred in the source data since the last capture is called ________ extract.
A) static
B) incremental
C) partial
D) update-driven
26
Which of the following is a basic method for single field transformation?
A) Table lookup
B) Cross-linking entities
C) Cross-linking attributes
D) Field-to-field communication
27
Which type of index is commonly used in data warehousing environments?
A) Join index
B) Bit-mapped index
C) Secondary index
D) Both A and B
28
Quality data are not essential for well-run organizations.
29
ETL is short for Extract, Transform, Load.
30
All of the following are tasks of data cleansing EXCEPT:
A) decoding data to make them understandable for data warehousing applications.
B) adding time stamps to distinguish values for the same attribute over time.
C) generating primary keys for each row of a table.
D) creating foreign keys.
31
Data may be loaded from the staging area into the warehouse by following:
A) SQL Commands (Insert/Update).
B) special load utilities.
C) custom-written routines.
D) all of the above.
32
There are six major steps to ETL.
33
A data steward is a person assigned the responsibility of ensuring the organizational applications properly support the organization's enterprise goals for data quality.
34
The process of combining data from various sources into a single table or view is called:
A) extracting.
B) updating.
C) selecting.
D) joining.
35
A data governance committee is always made up of high-ranking government officials.
36
A characteristic of reconciled data that means the data reflect an enterprise-wide view is:
A) detailed.
B) historical.
C) normalized.
D) comprehensive.
37
The process of transforming data from a detailed to a summary level is called:
A) extracting.
B) updating.
C) joining.
D) aggregating.
38
The major advantage of data propagation is:
A) real-time cascading of data changes throughout the organization.
B) duplication of non-redundant data.
C) the ability to have trickle-feeds.
D) none of the above.
39
Loading data into a data warehouse involves:
A) appending new rows to the tables in the warehouse.
B) updating existing rows with new data.
C) purging data that have become obsolete or were incorrectly loaded.
D) all of the above.
40
Informational and operational data differ in all of the following ways EXCEPT:
A) level of detail.
B) normalization level.
C) scope of data.
D) data quality.
41
Quality data does not have to be unique.
42
The data reconciliation process is responsible for transforming operational data to reconciled data.
43
Static extract is a method of capturing only the changes that have occurred in the source data since the last capture.
44
Data reconciliation occurs in two stages, an initial load and subsequent updates.
45
The major advantage of the data propagation approach to data integration is the near real-time cascading of data changes throughout the organization.
46
After the extract, transform, and load is done on data, the data warehouse is never fully normalized.
47
Completeness means that all data that are needed are present.
48
Data that arrive via XML and B2B channels are always guaranteed to be accurate.
49
Data quality is essential for SOX and Basel II compliance.
50
Total quality management (TQM) focuses on defect correction rather than defect prevention.
51
A data quality audit helps an organization understand the extent and nature of data quality problems.
52
Generally, records in a customer file never become obsolete.
53
The uncontrolled proliferation of spreadsheets, databases, and repositories leads to data quality problems.
54
User interaction integration is achieved by creating fewer user interfaces.
55
One of the biggest challenges of the extraction process is managing changes in the source system.
56
Data are moved to the staging area before extraction takes place.
57
A data stewardship program does not help to involve the organization in data quality.
58
Retention refers to the amount of data that is not purged periodically from tables.
59
Data federation consolidates all data into one database.
60
Master data management comprises the disciplines, technologies, and methods to ensure the currency, meaning, and quality of data within one subject area.
61
Bit-mapped indexing is often used in a data warehouse environment.
62
Data transformation is not an important part of the data reconciliation process.
63
________ is achieved by coordinating the flow of event information between business applications.
64
A ________ is a person assigned the responsibility of ensuring that organizational applications properly support the organization's enterprise goals of data quality.
65
Conformance refers to whether the data are stored, exchanged, or presented in a format that is specified by its ________.
66
________ can cause delays and extra work on information systems projects.
67
Data scrubbing is a technique using pattern recognition and other artificial intelligence techniques to upgrade the quality of raw data before transforming and moving the data to the data warehouse.
68
________ provides a virtual view of integrated data without actually bringing the data into one physical database.
69
A(n) ________ will thoroughly review all process controls on data entry and maintenance.
70
Improving ________ is a fundamental step in data quality improvement.
71
Joining is often complicated by problems such as errors in source data.
72
In the ________ approach, one consolidated record is maintained from which all applications draw data.
73
The process of transforming data from detailed to summary levels is called normalization.
74
User interaction integration is achieved by creating fewer ________ that feed different systems.
75
Update mode is used to create a data warehouse.
76
Loading data into the warehouse typically means appending new rows to tables in the warehouse as well as updating existing rows with new data.
77
Refresh mode is an approach to filling the data warehouse that employs bulk rewriting of the target data at periodic intervals.
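The bulk rewrite described here can be sketched with a dictionary standing in for a warehouse table. The functions `refresh_load` and `update_load` are hypothetical names; the second shows update mode (writing only changed rows) purely for contrast, and neither reflects any particular load utility.

```python
def refresh_load(target, source_snapshot):
    """Refresh mode: discard the target contents and rewrite them in bulk."""
    target.clear()
    target.update(source_snapshot)

def update_load(target, changes):
    """Update mode (for contrast): write only the changed rows into the target."""
    for key, row in changes.items():
        target[key] = row

# A toy warehouse table keyed by row id.
warehouse = {1: "old A", 2: "old B"}

# Periodic refresh: row 2 disappears because it is absent from the new snapshot.
refresh_load(warehouse, {1: "A", 3: "C"})
after_refresh = dict(warehouse)

# Incremental update: row 3 changes, row 4 is appended, row 1 is untouched.
update_load(warehouse, {3: "C2", 4: "D"})
after_update = dict(warehouse)

print(after_refresh)
print(after_update)
```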
78
Completeness means that all data that must have a ________ does have a ________.
79
Sound data ________ is a central ingredient of a data quality program.
80
Data propagation duplicates data across databases, usually with no ________ delay.