R - Identifying "clusters" or "groups" in a matrix
I have a matrix that is populated with discrete elements, and I need to cluster them into intact groups. So, for example, take this matrix:

    [A B B C A]
    [A A B A A]
    [A B B C C]
    [A A A A A]

There are 2 separate clusters of A, 2 separate clusters of C, and 1 cluster of B.

The output I'm looking for would ideally assign a unique ID to each cluster, like this:

    [1 2 2 3 4]
    [1 1 2 4 4]
    [1 2 2 5 5]
    [1 1 1 1 1]

Right now I have R code that does this recursively by iteratively checking the nearest neighbor, but it overflows when the matrix gets large (i.e., 100x100).

Is there a built-in function in R that can do this? I looked into raster and image processing, but had no luck. I'm convinced it must be out there.
thanks!
You could approach this by building a lattice graph representing your matrix, where edges are only retained if the vertices have the same type:
    # Build the initial matrix and lattice graph
    library(igraph)
    mat <- matrix(c(1, 1, 1, 1, 2, 1, 2, 1, 2, 2, 2, 1, 3, 1, 3, 1, 1, 1, 3, 1), nrow=4)
    labels <- as.vector(mat)
    g <- graph.lattice(dim(mat))
    lyt <- layout.auto(g)

    # Remove edges between elements of different types
    edgelist <- get.edgelist(g)
    retain <- labels[edgelist[,1]] == labels[edgelist[,2]]
    g <- delete.edges(g, E(g)[!retain])

    # Take a look at what we have
    plot(g, layout=lyt)
Vertices are numbered going down the columns. From the plot it's easy to see that all you need to do is grab the components of the graph:
    matrix(clusters(g)$membership, nrow=nrow(mat))
    #      [,1] [,2] [,3] [,4] [,5]
    # [1,]    1    2    2    3    4
    # [2,]    1    1    2    4    4
    # [3,]    1    2    2    5    5
    # [4,]    1    1    1    1    1
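Since the question mentions that a recursive neighbor check overflows on large matrices, here is a minimal base-R sketch of the same labeling done with an explicit queue instead of recursion. It is a breadth-first flood fill, not the igraph method used in this answer, and `label_components` and its `diagonal` argument are hypothetical names introduced only for illustration. It assumes the matrix contains no `NA` values.

```r
# Hypothetical helper (not part of the original answer): label connected
# groups of equal values with an iterative breadth-first flood fill.
label_components <- function(mat, diagonal = FALSE) {
  nr <- nrow(mat)
  nc <- ncol(mat)
  # Neighbour offsets as (row step, column step); 4-connected by default
  steps <- matrix(c(-1L, 0L,  1L, 0L,  0L, -1L,  0L, 1L),
                  ncol = 2, byrow = TRUE)
  if (diagonal) {
    steps <- rbind(steps,
                   matrix(c(-1L, -1L,  -1L, 1L,  1L, -1L,  1L, 1L),
                          ncol = 2, byrow = TRUE))
  }
  out <- matrix(0L, nr, nc)        # 0 means "not labelled yet"
  queue <- integer(length(mat))    # reusable queue of linear cell indices
  next_id <- 0L
  # Scan cells in column-major order, the order in which igraph numbers
  # the lattice vertices
  for (start in seq_along(mat)) {
    if (out[start] != 0L) next
    next_id <- next_id + 1L
    out[start] <- next_id
    queue[1] <- start
    qhead <- 1L
    qtail <- 1L
    while (qhead <= qtail) {
      cur <- queue[qhead]
      qhead <- qhead + 1L
      r  <- (cur - 1L) %% nr + 1L    # row of the current cell
      cc <- (cur - 1L) %/% nr + 1L   # column of the current cell
      for (k in seq_len(nrow(steps))) {
        r2 <- r + steps[k, 1]
        c2 <- cc + steps[k, 2]
        if (r2 < 1L || r2 > nr || c2 < 1L || c2 > nc) next
        if (out[r2, c2] == 0L && mat[r2, c2] == mat[r, cc]) {
          out[r2, c2] <- next_id
          qtail <- qtail + 1L
          queue[qtail] <- (c2 - 1L) * nr + r2
        }
      }
    }
  }
  out
}

mat <- matrix(c(1, 1, 1, 1, 2, 1, 2, 1, 2, 2, 2, 1, 3, 1, 3, 1, 1, 1, 3, 1), nrow = 4)
label_components(mat)
```

Because cells are scanned in column-major order, the IDs for this example come out the same as the component membership matrix above. Each cell enters the queue at most once, so the work grows linearly with the number of cells and there is no recursion depth to exhaust.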
If you wanted to include diagonals in your lattice, you might start with a lattice of neighborhood size 2 and then limit it to elements that are no more than 1 row and 1 column apart. Consider the following matrix:

    [A B C B]
    [B A A A]

Here's the code that would capture 4 groups, not 6, due to including diagonal links:
    # Build the initial matrix and lattice graph (neighborhood size 2)
    library(igraph)
    mat <- matrix(c(1, 2, 2, 1, 3, 1, 2, 1), nrow=2)
    labels <- as.vector(mat)
    rows <- (seq(length(labels)) - 1) %% nrow(mat)
    cols <- ceiling(seq(length(labels)) / nrow(mat))
    g <- graph.lattice(dim(mat), nei=2)

    # Remove edges between elements of different types, or between
    # elements that are more than 1 row or 1 column apart
    edgelist <- get.edgelist(g)
    retain <- labels[edgelist[,1]] == labels[edgelist[,2]] &
      abs(rows[edgelist[,1]] - rows[edgelist[,2]]) <= 1 &
      abs(cols[edgelist[,1]] - cols[edgelist[,2]]) <= 1
    g <- delete.edges(g, E(g)[!retain])

    # Cluster to obtain the final groups
    matrix(clusters(g)$membership, nrow=nrow(mat))
    #      [,1] [,2] [,3] [,4]
    # [1,]    1    2    3    4
    # [2,]    2    1    1    1
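The row and column filter is what turns the size-2 neighborhood into "adjacent, including diagonally". As a small sketch of that step on its own, base R's `row()` and `col()` give the same per-vertex indices as the modular arithmetic above (1-based here, which makes no difference since only differences are compared). The helper name `touching` is hypothetical and used only for illustration.

```r
mat <- matrix(c(1, 2, 2, 1, 3, 1, 2, 1), nrow = 2)

# Row and column index of every cell, in the same column-major order
# that as.vector(mat) uses
rows <- as.vector(row(mat))
cols <- as.vector(col(mat))

# Hypothetical helper: TRUE when cells i and j (linear indices) are at most
# 1 row and 1 column apart, i.e. adjacent or diagonal neighbours
touching <- function(i, j) {
  abs(rows[i] - rows[j]) <= 1 & abs(cols[i] - cols[j]) <= 1
}

touching(1, 4)  # cells (1,1) and (2,2), diagonal neighbours: TRUE
touching(1, 5)  # cells (1,1) and (1,3), two columns apart: FALSE
```

Pairs like the second one are joined by the size-2 lattice but are not neighbors in the matrix, which is why the edge filter needs the two distance conditions in addition to the label comparison.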