Deck 20: Cluster Analysis
1
Use of different distance measures may lead to different clustering results. Hence, it is advisable to use different measures and compare the results.
True
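A minimal sketch of why the choice of distance measure matters, assuming three hypothetical objects measured on two variables (the points and the helper `nearest` are illustrative, not from the text). It computes the Euclidean, city-block, and Chebychev distances and reports which object is closest to `p` under each.

```python
import math

def euclidean(a, b):
    """Square root of the sum of squared differences across variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def city_block(a, b):
    """City-block (Manhattan) distance: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    """Chebychev distance: largest absolute difference on any one variable."""
    return max(abs(x - y) for x, y in zip(a, b))

def nearest(target, candidates, dist):
    """Return the candidate closest to target under the given distance."""
    return min(candidates, key=lambda c: dist(target, c))

# Hypothetical objects (assumed data, for illustration only).
p, q, r = (0, 0), (3, 3), (4, 0)

for name, dist in [("euclidean", euclidean),
                   ("city-block", city_block),
                   ("chebychev", chebychev)]:
    print(name, "nearest to p:", nearest(p, [q, r], dist))
```

With these points, `r` is nearest to `p` under the Euclidean and city-block measures, while `q` is nearest under the Chebychev measure, so a clustering built on one measure need not match a clustering built on another.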
2
In cluster analysis, the set of variables selected should describe the similarity between objects in terms that are relevant to the marketing research problem.
True
3
Clustering should be done on samples of 300 or more.
False
4
Choice of a clustering method and choice of a distance measure are interrelated.
5
Cluster analysis does not classify variables as dependent or independent.
6
The complete linkage method of hierarchical clustering is based on the minimum distance or the nearest neighbor approach.
7
Nonhierarchical clustering is faster than hierarchical methods.
8
Measuring similarity in terms of distance between pairs of objects is the most common approach used in cluster analysis for grouping similar objects together.
9
The centroid method is a variance method of hierarchical clustering in which the distance between two clusters is the distance between their centroids (means for all the variables).
10
Most clustering methods are relatively complex procedures that are supported by an extensive body of statistical reasoning.
11
The dendrogram is read from right to left.
12
The primary objective of cluster analysis is to classify objects into relatively homogeneous groups.
13
The TwoStep cluster analysis procedure can automatically determine the optimal number of clusters by comparing the values of a model-choice criterion across different clustering solutions.
14
In the TwoStep procedure, the Euclidean measure can be used only when all of the variables are ordinal.
15
Cluster analysis is the obverse of factor analysis in that it reduces the number of objects, not the number of variables, by grouping them into a much smaller number of clusters.
16
The parallel threshold method differs from the other two non-hierarchical clustering procedures in that the objects can later be reassigned to clusters to optimize an overall criterion.
17
In cluster analysis, objects with larger distances between them are more similar to each other than are those at smaller distances.
18
If cluster analysis is used as a general data reduction tool, subsequent multivariate analysis can be conducted on the clusters rather than on the individual observations.
19
The average linkage method of hierarchical clustering is preferred to the single and complete linkage methods.
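As a rough illustration of how the three linkage rules differ, the sketch below measures the distance between two hypothetical one-variable clusters. The function name `linkage_distances` and the data are assumptions made for this example; absolute difference stands in for whatever distance measure is actually chosen.

```python
from itertools import product

def linkage_distances(cluster_a, cluster_b):
    """Distance between two clusters under three linkage rules."""
    # Every pairwise distance between a member of A and a member of B.
    pairs = [abs(x - y) for x, y in product(cluster_a, cluster_b)]
    return {
        "single": min(pairs),                # nearest neighbor
        "complete": max(pairs),              # furthest neighbor
        "average": sum(pairs) / len(pairs),  # uses all pairs
    }

# Hypothetical clusters (assumed data, for illustration only).
a = [1.0, 2.0]
b = [5.0, 9.0]
print(linkage_distances(a, b))
```

Here the pairwise distances are 4, 8, 3, and 7, so single linkage gives 3, complete linkage gives 8, and average linkage gives 5.5.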
20
Cluster analysis requires prior knowledge of the cluster or group membership for each object or case included to develop the classification rule.
21
The ________ are the initial starting points in nonhierarchical clustering.
A)factor scores
B)cluster centers
C)cluster centroids
D)factor loadings
22
________ is a class of techniques used to classify objects or cases into relatively homogeneous groups.
A)Principal components analysis
B)Cluster analysis
C)Common factor analysis
D)Conjoint analysis
23
A ________ is a lower-triangle matrix containing pairwise distances between objects or cases.
A)classification matrix
B)correlation matrix
C)similarity/distance coefficient matrix
D)factor matrix
24
Which statement is not true about cluster analysis?
A)Cluster analysis is a technique for analyzing data when the criterion or dependent variable is categorical and the independent variables are interval in nature.
B)Cluster analysis is also called classification analysis or numerical taxonomy.
C)Groups or clusters are suggested by the data, not defined a priori.
D)Objects in each cluster tend to be similar to each other and dissimilar to objects in the other clusters.
25
In hierarchical clustering, the solution may depend on the order of cases in the data set.
26
One method of assessing reliability and validity of clustering is to use different methods of clustering and compare the results.
27
Formal procedures for assessing the reliability and validity of clustering are simple and should be undertaken.
28
To reduce the number of variables, a large set of variables can often be replaced by the set of cluster components.
29
Most ________ methods are heuristics based on algorithms.
A)factor analysis
B)discriminant analysis
C)clustering
D)analysis of variance
30
Principal components are usually easier to interpret than the cluster components.
31
In non-hierarchical clustering, the F test is only descriptive. Because the cases or objects are systematically assigned to clusters to maximize differences on the clustering variables, the resulting probabilities should not be interpreted as testing the null hypothesis of no differences among clusters.
32
It is helpful to profile the clusters in terms of variables that were not used for clustering.
33
Clustering should be done on samples of ________ or more.
A)50
B)100
C)200
D)300
34
A(n) ________ or tree graph is a graphical device for displaying clustering results. Vertical lines represent clusters that are joined together. The position of the line on the scale indicates the distances at which clusters were joined.
A)dendrogram
B)scattergram
C)scree plot
D)icicle diagram
35
Cluster analysis has been used in marketing for all of the purposes below except ________.
A)segmenting the market based on benefits sought from the purchase of a product
B)identifying new product opportunities by clustering brands and products so that competitive sets within the market can be determined
C)selecting test markets
D)determining how strongly sales are related to advertising expenditures
36
When cluster analysis is also used for clustering variables to identify homogeneous groups, the units used for analysis are the variables and the distance measures are computed for all pairs of variables.
37
Which method of analysis does not classify variables as dependent or independent?
A)regression analysis
B)discriminant analysis
C)analysis of variance
D)cluster analysis
38
The centroids represent the mean values of the objects contained in the cluster on each of the variables.
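A small sketch of a centroid computed as the per-variable mean of a cluster's objects. The function name `centroid` and the three two-variable objects are hypothetical and serve only to show the arithmetic.

```python
def centroid(cluster):
    """Mean of the cluster's objects on each variable."""
    n_objects = len(cluster)
    n_variables = len(cluster[0])
    return tuple(
        sum(obj[j] for obj in cluster) / n_objects
        for j in range(n_variables)
    )

# Hypothetical cluster of three objects measured on two variables.
cluster = [(1, 2), (3, 4), (5, 9)]
print(centroid(cluster))
```

For this cluster the means are (1 + 3 + 5) / 3 = 3 on the first variable and (2 + 4 + 9) / 3 = 5 on the second.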
39
It is possible to obtain information on cluster membership of cases via the icicle plot if the number of clusters is specified.
40
The most important part of ________ is selecting the variables on which clustering is based.
A)interpreting and profiling clusters
B)selecting a clustering procedure
C)assessing the validity of clustering
D)formulating the clustering problem
41
The ________ is a nonhierarchical method in which a cluster center is selected and all objects within a pre-specified threshold value from the center are grouped together.
A)optimizing partitioning method
B)sequential threshold method
C)parallel threshold method
D)Ward's procedure
42
Which cluster analysis procedure can automatically determine the optimal number of clusters by comparing the values of a model-choice criterion across different clustering solutions?
A)divisive
B)sequential threshold
C)Ward's method
D)TwoStep
43
________ is a clustering procedure characterized by the development of a tree-like structure.
A)Non-hierarchical clustering
B)Hierarchical clustering
C)TwoStep clustering
D)Optimizing partitioning clustering
44
________ is a clustering procedure where all objects start out in one giant cluster. Clusters are formed by dividing this cluster into smaller and smaller clusters.
A)Non-hierarchical clustering
B)Hierarchical clustering
C)Divisive clustering
D)Agglomerative clustering
45
Which of the following is a variance method of clustering?
A)sequential threshold
B)Ward's method
C)complete linkage
D)optimizing partitioning
46
The most commonly used measure of similarity is the ________ or its square.
A)Euclidean distance
B)city-block distance
C)Chebychev's distance
D)Manhattan distance
47
The ________ method uses information on all pairs of distances, not merely the minimum or maximum distances.
A)single linkage
B)medium linkage
C)complete linkage
D)average linkage
48
Which of the methods below is not a hierarchical method?
A)optimizing partitioning
B)parallel threshold
C)both A and B
D)variance
49
________ is frequently referred to as k-means clustering.
A)Non-hierarchical clustering
B)Ward's method
C)Divisive clustering
D)Agglomerative clustering
50
The ________ method is based on minimum distance or the nearest neighbor rule.
A)single linkage
B)medium linkage
C)complete linkage
D)average linkage
51
________ is a variance method in which the squared Euclidean distance to the cluster means is minimized.
A)Optimizing partitioning method
B)Sequential threshold method
C)Parallel threshold method
D)Ward's procedure
52
Which statement is not true concerning the clustering solution if the variables are measured in vastly different units?
A)The clustering solution will not be influenced by the units of measurement.
B)Standardization can reduce the differences between groups on variables that may best discriminate groups or clusters.
C)It is desirable to eliminate outliers.
D)We must standardize the data by rescaling each variable to have a mean of zero and standard deviation of unity.
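The rescaling described in option D can be sketched as follows. This is only an illustration: the function name `standardize` and the two variables are hypothetical, and the population standard deviation is used as an assumption (a sample standard deviation would give slightly different values).

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale a variable to mean 0 and standard deviation 1 (z-scores)."""
    m = mean(values)
    s = pstdev(values)  # population SD; an assumption of this sketch
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - m) / s for v in values]

# Hypothetical variables measured in vastly different units.
income_dollars = [20000, 40000, 60000]
satisfaction_1_to_7 = [1, 4, 7]

print(standardize(income_dollars))
print(standardize(satisfaction_1_to_7))
```

Before rescaling, differences in income would dominate any distance computed across both variables. After rescaling, both variables are on the same unit-free scale.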
53
The ________ is a nonhierarchical method that specifies several cluster centers at once. All objects within a pre-specified threshold value from the center are grouped together.
A)optimizing partitioning method
B)sequential threshold method
C)parallel threshold method
D)Ward's procedure
54
________ is a procedure that first assigns or determines a cluster center and then groups all objects within a pre-specified threshold value from the center.
A)Non-hierarchical clustering
B)Ward's method
C)Divisive clustering
D)Agglomerative clustering
55
________ methods are commonly used in marketing research.
A)TwoStep clustering
B)Optimizing partitioning
C)Divisive clustering
D)Agglomerative clustering
56
________ is a clustering procedure where each object starts out in a separate cluster.
A)Non-hierarchical clustering
B)Hierarchical clustering
C)Divisive clustering
D)Agglomerative clustering
57
Which of the following is not a disadvantage of nonhierarchical clustering procedures?
A)The number of clusters must be pre-specified.
B)The selection of cluster centers is arbitrary.
C)The procedures do not work well when the clusters are poorly defined.
D)All of the above are disadvantages.
58
The ________ is a nonhierarchical method that allows for later reassignment of objects to clusters to optimize an overall criterion.
A)optimizing partitioning method
B)sequential threshold method
C)parallel threshold method
D)Ward's procedure
59
The ________ method is based on the maximum distance or the furthest neighbor approach.
A)single linkage
B)medium linkage
C)complete linkage
D)average linkage
60
________ are agglomerative methods of hierarchical clustering in which clusters are generated to minimize the within-cluster variance.
A)Variance methods
B)Linkage methods
C)Centroid methods
D)Parallel methods
61
In SPSS, the main program for hierarchical clustering of objects or cases is ________.
A)VARCLUS
B)CLUSTER ANALYSIS
C)FASTCLUS
D)HIERARCHICAL CLUSTER
62
What are the steps in conducting cluster analysis (Figure 20.3 in the text)?
63
Which method allows the researcher to obtain information on cluster membership of cases if the number of clusters is specified?
A)cluster centers
B)scree plot
C)icicle plot
D)both A and C
64
Which is best to use when selecting a clustering procedure: hierarchical or nonhierarchical clustering?
65
Which method allows the researcher to obtain information on cluster membership of cases if the number of clusters is specified?
A)factor loading plot
B)scattergram
C)icicle plot
D)scree plot
66
Why should the clustering of variables be used?
67
Which of the following is not a procedure to check the quality of clustering results?
A)Perform cluster analysis on the same data using different distance measures. Compare the results across measures to determine the stability of the solutions.
B)Delete variables randomly. Perform clustering based on the reduced set of variables. Compare the results with those obtained by clustering based on the entire set of variables.
C)Use the same method of clustering and compare the results.
D)Split the data randomly into halves. Perform clustering separately on each half. Compare cluster centroids across the two subsamples.
68
What suggested guidelines can researchers use when deciding on the number of clusters?
69
________ involves examining the cluster centroids.
A)Interpreting and profiling the clusters
B)Assessing reliability and validity
C)Deciding on the number of clusters
D)Selecting a clustering procedure
70
To use cluster analysis for clustering variables to identify homogeneous groups, the researcher could do all of the following except ________.
A)using the variables as the units of analysis
B)using the correlation coefficient as a measure of similarity between variables
C)inserting communalities in the diagonal of the correlation matrix
D)A and B
71
If you are performing cluster analysis on the same data using different distance measures and then comparing the results across measures to determine stability of the solutions, you are at which stage of the cluster analysis process?
A)interpreting and profiling the clusters
B)assessing reliability and validity
C)deciding on the number of clusters
D)selecting a clustering procedure
72
In non-hierarchical clustering, plotting the ratio of total within-group variance to between-group variance against the number of clusters is useful if you are ________.
A)interpreting and profiling the clusters
B)assessing the validity of clustering
C)deciding on the number of clusters
D)both B and C
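The ratio mentioned in this question can be sketched for a single clustering variable as below. The data, the candidate solutions, and the function names are all hypothetical; in practice the cluster assignments would come from a nonhierarchical procedure rather than being typed in by hand.

```python
def sum_sq(values, center):
    """Sum of squared deviations of values from a center."""
    return sum((v - center) ** 2 for v in values)

def within_between_ratio(clusters):
    """Total within-group variation divided by between-group variation."""
    all_values = [v for c in clusters for v in c]
    grand_mean = sum(all_values) / len(all_values)
    within = sum(sum_sq(c, sum(c) / len(c)) for c in clusters)
    between = sum(len(c) * (sum(c) / len(c) - grand_mean) ** 2
                  for c in clusters)
    return within / between  # undefined for a one-cluster solution

# Hypothetical solutions for the same nine objects (assumed, not fitted).
solutions = {
    2: [[1, 2, 3, 10, 11, 12], [20, 21, 22]],
    3: [[1, 2, 3], [10, 11, 12], [20, 21, 22]],
    4: [[1, 2], [3], [10, 11, 12], [20, 21, 22]],
}

for k, clusters in solutions.items():
    print(k, round(within_between_ratio(clusters), 4))
```

With these assumed solutions the ratio falls sharply from two to three clusters and changes very little from three to four, which is the kind of elbow such a plot is meant to reveal.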
73
In SAS, the ________ program can be used for the hierarchical clustering of objects or cases.
A)VARCLUS
B)CLUSTER ANALYSIS
C)FASTCLUS
D)HIERARCHICAL CLUSTER