Multivariate Statistical Methods in Quality Management

Cluster analysis is a multivariate statistical method that identifies hidden groups in a large number of objects on the basis of their characteristics. As in discriminant analysis, each object has multiple characteristics, which can be expressed as a random vector X = (X_1, X_2, ..., X_p) whose values vary from object to object. The primary objective of cluster analysis is to identify similar objects on the basis of the characteristics they possess. It places similar objects into groups so that objects within a group are very similar, while objects from different groups differ significantly in their characteristics. In discriminant analysis, the number of groups and the group names are known before the analysis. In cluster analysis, by contrast, the number of groups and the features of each group are unknown beforehand and are determined by the analysis itself.
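The grouping idea described above can be illustrated with a minimal sketch. It assumes Euclidean distance as the measure of similarity and uses single-linkage agglomerative merging with a distance cutoff, which is only one of many possible clustering procedures. The function names, the six two-characteristic objects and the cutoff value are all hypothetical and chosen purely for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two objects given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(points, threshold):
    """Repeatedly merge the two closest clusters until the closest pair
    is farther apart than `threshold`. Returns lists of object indices."""
    clusters = [[i] for i in range(len(points))]  # start: one cluster per object
    while len(clusters) > 1:
        best = None  # (distance, index of cluster i, index of cluster j)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the nearest members
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical objects, each with p = 2 characteristics (illustrative values only).
objects = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
           (5.0, 5.0), (5.1, 4.8),
           (9.0, 1.0)]

groups = single_linkage(objects, threshold=1.0)  # hypothetical cutoff
print(groups)
```

Note that the number of groups is not supplied in advance: it emerges from the data and the chosen cutoff, which mirrors the contrast with discriminant analysis drawn above.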
Let us look at the following example.
Example 7.1: Cereal Type. Table 7.1 gives the nutritional content of 25 types of cereal. Some brands are certainly very similar in nutritional content. Cluster analysis can identify hidden groups among the cereal brands, characterize the features of each group, and compare the differences between groups.
Table 7.1: Nutrition Contents of Cereals

Brand        Calories (Cal/oz)  Protein (g)  Fat (g)  Na (mg)  Fiber (g)  Carbs (g)  Sugar (g)  K (mg)
Cheerios     110                6            2        290      2.0        17.0       1          105
Cocoa Puffs  110                1            1        180      0.0        12.0       13