Perform k-means clustering on coefficient data.
Arguments
- data
A tibble with coefficient columns
- formula
A formula specifying predictors. Can be:
  - Missing: auto-detects single coe column
  - Bare column name: coe
  - Formula: ~ coe, ~ coe + size, ~ coe1 + coe2
Use coe in formula to auto-detect coefficient columns
- k
Integer. Number of clusters. Default is 3.
- nstart
Integer. Number of random starts for k-means. Default is 25.
- ...
Additional arguments passed to
stats::kmeans()
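To illustrate what k, nstart and the extra arguments control, here is a minimal sketch that calls stats::kmeans() directly. The random matrix is a stand-in for real coefficient data, and iter.max is only one example of an argument that could be forwarded through `...`; how stat_kmeans() assembles the call internally is not shown here.

```r
# Sketch using stats::kmeans() directly; the random matrix is a stand-in
# for real coefficient data (an assumption for illustration only).
set.seed(1)
coe_matrix <- matrix(rnorm(200), nrow = 50, ncol = 4)

fit <- stats::kmeans(
  coe_matrix,
  centers = 3,    # plays the role of k
  nstart = 25,    # 25 random initialisations; the best solution is kept
  iter.max = 50   # the kind of extra argument that could travel through `...`
)

fit$size          # number of observations in each cluster
```

Raising nstart costs more computation but lowers the risk of reporting a poor local optimum.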
Value
An object of class c("stat_kmeans", "momstats") containing:
- data: Original tibble (unchanged)
- model: The stats::kmeans() object
- method: "kmeans"
- call: The function call
- formula: Formula used (if any)
- predictor_cols: All predictor column names
- k: Number of clusters
- centers: Cluster centers in predictor space (matrix)
- cluster_sizes: Size of each cluster
- withinss: Within-cluster sum of squares
- tot_withinss: Total within-cluster sum of squares
- betweenss: Between-cluster sum of squares
- variance_explained: Proportion of variance explained by clustering
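The sum-of-squares components can be related to one another with a short sketch on a plain stats::kmeans() fit. It assumes that variance_explained is the ratio of between-cluster to total sum of squares, which is the usual definition but is not confirmed by this page; the built-in iris measurements stand in for coefficient data.

```r
# Sketch on a raw stats::kmeans() object; iris is a stand-in dataset.
set.seed(42)
x <- as.matrix(iris[, 1:4])
fit <- stats::kmeans(x, centers = 3, nstart = 25)

# Total sum of squares splits into within-cluster and between-cluster parts.
all.equal(fit$totss, fit$tot.withinss + fit$betweenss)

# Assumed definition of variance_explained (not confirmed by the source).
variance_explained <- fit$betweenss / fit$totss
cluster_sizes <- fit$size
```

Values of variance_explained close to 1 indicate that most of the spread in predictor space lies between clusters rather than within them.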
Details
stat_kmeans() provides a unified interface for k-means clustering on
morphometric coefficient data.
Formula syntax
The formula specifies which predictors to use:
- ~ coe: Use auto-detected coefficient column(s)
- ~ coe1 + coe2: Use specific coefficient columns
- ~ coe + size: Coefficient column plus a covariate
- Bare name or missing: auto-detect single coe column
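One way to picture how such a formula could be turned into predictor names is sketched below with base R's all.vars(). The helper resolve_predictors() and the column name "coe_A" are hypothetical and exist only for this illustration; the package's actual resolution logic may differ.

```r
# all.vars() lists the variable names used in a formula.
all.vars(~ coe + size)

# Hypothetical helper: swap the `coe` placeholder for the detected
# coefficient column(s) and keep every other term as written.
resolve_predictors <- function(formula, coe_cols) {
  terms <- all.vars(formula)
  unlist(lapply(terms, function(term) if (term == "coe") coe_cols else term))
}

# "coe_A" is an invented column name standing in for an auto-detected column.
predictors <- resolve_predictors(~ coe + size, coe_cols = "coe_A")
```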
Getting results
Use collect() to add cluster assignments to your data:
km <- boteft %>% stat_kmeans(k = 3)
boteft_clustered <- collect(km) # Adds 'cluster' column
Use transduce() to get shapes at cluster centers:
center_shapes <- transduce(km, tibble(cluster = 1:3))
Examples
if (FALSE) { # \dontrun{
# Basic k-means with 3 clusters
km1 <- boteft %>% stat_kmeans(k = 3)
# More clusters
km2 <- boteft %>% stat_kmeans(k = 5)
# With covariate
km3 <- boteft %>% stat_kmeans(~ coe + length, k = 4)
# Add cluster assignments
boteft_clustered <- collect(km1)
# Get cluster center shapes
centers <- transduce(km1, tibble(cluster = 1:3))
# Plot results
plot(km1) # Cluster visualization
plot(km1, color = type) # Color by original grouping
} # }
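For readers without the package or the boteft data at hand, the idea of attaching cluster assignments to the original rows can be approximated in base R. This is only a sketch of the general idea, not the implementation of collect(); iris stands in for a tibble with coefficient columns.

```r
# Base R approximation of "cluster, then attach assignments".
set.seed(42)
predictors <- iris[, 1:4]
fit <- stats::kmeans(predictors, centers = 3, nstart = 25)

# Add a 'cluster' column alongside the untouched original columns.
clustered <- cbind(iris, cluster = factor(fit$cluster))

# Cross-tabulate clusters against the original grouping.
table(clustered$cluster, clustered$Species)
```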
