Comparative Archaeology Database, University of Pittsburgh

Wankarani Settlement Dataset 
The data file KMEANS.TXT provides the statistical results of the nonhierarchical K-means cluster analysis of Late Intermediate farmstead sites. Each line represents one cluster analysis solution. There are 9 lines, each with 13 comma-separated variables. The variables appear in the following order:
1  The number of clusters requested for this solution 
2  N: The number of sites (farmsteads) to be clustered 
3  SSE (Summed Squared Error): The sum (over all 17 sites) of the squared Euclidean distances from each farmstead to the centroid of the cluster to which it is assigned 
4  %SSE: The SSE as a percent of the maximum SSE 
5  Log %SSE: Base 10 logarithm of the %SSE 
6  The square root of (SSE/N), where N is the number of sites in the analysis (17) 
7  nbar: The mean number of sites per cluster 
8  nstd: The standard deviation of the number of sites per cluster 
9  RMSbar: The mean RMS. RMS is the square root of the mean of the squared distances from each site in a cluster to the cluster centroid. 
10  RMSstd: The standard deviation of the RMS 
11  n>2: The number of clusters in the solution with more than two members 
12  r2wbar: The mean r² value over all the clusters 
13  r2wstd: The standard deviation of the r² value 
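The thirteen-field layout above can be read into named records. The following is a minimal Python sketch, not part of the dataset: the record type `KMeansSolution`, its field names, and the helpers `parse_line` and `read_solutions` are hypothetical names chosen for illustration, and the sketch assumes the file is plain text with no header row.

```python
from typing import List, NamedTuple


class KMeansSolution(NamedTuple):
    """One line of KMEANS.TXT; field names are illustrative, not official."""
    clusters: int           # 1  number of clusters requested
    n_sites: int            # 2  N
    sse: float              # 3  SSE
    pct_sse: float          # 4  %SSE
    log_pct_sse: float      # 5  Log %SSE
    root_sse_over_n: float  # 6  square root of (SSE/N)
    nbar: float             # 7  nbar
    nstd: float             # 8  nstd
    rms_bar: float          # 9  RMSbar
    rms_std: float          # 10 RMSstd
    n_gt2: int              # 11 n>2
    r2w_bar: float          # 12 r2wbar
    r2w_std: float          # 13 r2wstd


# Fields treated as whole numbers (an assumption based on their definitions).
INT_FIELDS = {"clusters", "n_sites", "n_gt2"}


def parse_line(line: str) -> KMeansSolution:
    """Convert one comma-separated line into a KMeansSolution record."""
    parts = [p.strip() for p in line.strip().split(",")]
    names = KMeansSolution._fields
    if len(parts) != len(names):
        raise ValueError(f"expected {len(names)} values, found {len(parts)}")
    values = [int(p) if name in INT_FIELDS else float(p)
              for name, p in zip(names, parts)]
    return KMeansSolution(*values)


def read_solutions(path: str = "KMEANS.TXT") -> List[KMeansSolution]:
    """Read every non-blank line of the file (assumes no header row)."""
    with open(path) as handle:
        return [parse_line(line) for line in handle if line.strip()]
```

Calling `read_solutions("KMEANS.TXT")` should then return one record per solution, so that a value such as SSE can be read as `solution.sse` rather than by column position.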
The first line of the file, for example, looks like this:
1,17,194.67,100,2,3.3839,17,0,3.38,0,1,0.13,0
This indicates that the solution is for 1 cluster, N is 17, SSE is 194.67, %SSE is 100, Log %SSE is 2, the square root of SSE/N is 3.3839, nbar is 17, nstd is 0, RMSbar is 3.38, RMSstd is 0, n>2 is 1, r2wbar is 0.13, and r2wstd is 0.
The last (9th) line of the file is:
9,17,1.64,0.84,0,0.3102,1,1.9,0.19,0.18,2,0.39,0.44
This indicates that the solution is for 9 clusters, N is 17, SSE is 1.64, %SSE is 0.84, Log %SSE is 0, the square root of SSE/N is 0.3102, nbar is 1, nstd is 1.9, RMSbar is 0.19, RMSstd is 0.18, n>2 is 2, r2wbar is 0.39, and r2wstd is 0.44.
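Variables 4 through 6 are derived from SSE and N, so they can be recomputed as a consistency check. The sketch below is illustrative only: the function name `derived_columns` is hypothetical, and it assumes that the maximum SSE is the SSE of the one-cluster solution (194.67), which the %SSE of 100 on the first line suggests but the description does not state outright.

```python
import math
from typing import Tuple


def derived_columns(sse: float, n: int, max_sse: float) -> Tuple[float, float, float]:
    """Recompute %SSE, Log %SSE, and sqrt(SSE/N) from SSE, N, and the maximum SSE."""
    pct_sse = 100.0 * sse / max_sse      # variable 4
    log_pct_sse = math.log10(pct_sse)    # variable 5
    root_sse_over_n = math.sqrt(sse / n)  # variable 6
    return pct_sse, log_pct_sse, root_sse_over_n


# Assumption: the one-cluster SSE is the maximum SSE.
MAX_SSE = 194.67

first = derived_columns(194.67, 17, MAX_SSE)  # compare with 100, 2, 3.3839
last = derived_columns(1.64, 17, MAX_SSE)     # compare with 0.84, 0, 0.3102
```

The recomputed values should agree with the file only to within rounding, presumably because the SSE values listed in the file are themselves rounded to two decimal places.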
