## Capture column pattern frequency

Question

I have a dataset like the one below:

``````
Id        A      B      C
10        1      0       1
11        1      0       1
12        1      1       0
13        1      0       0
14        0      1       1
``````

I am trying to count how often each combination of columns equal to 1 occurs, like this:

``````
Pattern         Count
A, C            2
A, B            1
A               1
B, C            1
``````

I am not sure where to start; any help or advice is much appreciated.

## Answers to capture column pattern frequency (3)

1. If you don't need to group by `Id`, then simply:

``````
table(apply(df[-1], 1, function(i) paste(names(i[i == 1]), collapse = ',')))

#  A A,B A,C B,C
#  1   1   2   1
``````
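For anyone who wants to run this, here is a minimal sketch that rebuilds the sample data from the question and reshapes the `table` result into the two-column Pattern/Count layout that was asked for. It assumes the 0/1 columns are numeric and that the data frame is called `df`; the names `pattern`, `counts` and `res` are only illustrative.

```r
# Sample data from the question (assumed to be numeric 0/1 columns)
df <- data.frame(Id = 10:14,
                 A = c(1, 1, 1, 1, 0),
                 B = c(0, 0, 1, 0, 1),
                 C = c(1, 1, 0, 0, 1))

# One label per row: the names of the columns that equal 1
pattern <- apply(df[-1], 1, function(i) paste(names(i[i == 1]), collapse = ", "))

# Tabulate, convert to a data frame, and sort by descending count
counts <- table(Pattern = pattern)
res <- as.data.frame(counts, responseName = "Count", stringsAsFactors = FALSE)
res <- res[order(-res$Count), ]
res
```

With the sample data, the `A, C` pattern comes first with a count of 2, and the other three patterns follow with a count of 1 each.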
2. We can try with

``````
table(gsub(",*N|N,*", "", chartr('0123', 'NABC',
    do.call(paste, c(df1[-1] * col(df1[-1]), sep=",")))))

#  A A,B A,C B,C
#  1   1   2   1
``````

As @DavidArenburg mentioned, the `old` and `new` arguments of `chartr` can be built automatically from the column names:

``````
cols <- paste(c("N", names(df1[-1])), collapse = "")
indx <- paste(seq(nchar(cols)) - 1, collapse = "")
table(gsub(",*N|N,*", "", chartr(indx, cols,
    do.call(paste, c(df1[-1] * col(df1[-1]), sep=",")))))
``````
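Since the one-liner is fairly dense, here is a sketch of the intermediate values it produces, up to the `gsub` call that strips the `N` placeholders. It assumes the question's sample data stored as numeric columns in `df1`; the names `pos`, `keys` and `lettered` are made up for illustration.

```r
# Sample data from the question, named df1 as in this answer (an assumption)
df1 <- data.frame(Id = 10:14,
                  A = c(1, 1, 1, 1, 0),
                  B = c(0, 0, 1, 0, 1),
                  C = c(1, 1, 0, 0, 1))

# Step 1: multiply each 0/1 value by its column position, so a 1 in
# column k becomes k and a 0 stays 0
pos <- df1[-1] * col(df1[-1])

# Step 2: paste each row into a single comma-separated string
keys <- do.call(paste, c(pos, sep = ","))
keys
# "1,0,3" "1,0,3" "1,2,0" "1,0,0" "0,2,3"

# Step 3: translate digits to letters; 0 becomes the placeholder "N"
lettered <- chartr("0123", "NABC", keys)
lettered
# "A,N,C" "A,N,C" "A,B,N" "A,N,N" "N,B,C"
```

Note that `chartr` translates one character at a time, so this trick only works when every column name is a single character and there are at most nine data columns; with more columns the position numbers would no longer be single digits.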
3. Starting by "reversing" the tabulation of the data into two separate index vectors (the row and the column of every cell equal to 1):

``````
w = which(dat[-1] == 1L, arr.ind = TRUE)
``````

we could use

``````
table(tapply(names(dat)[-1][w[, "col"]], w[, "row"], paste, collapse = ", "))
#
#   A A, B A, C B, C
#   1    1    2    1
``````
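To make the "reversing" step concrete, this sketch prints the index matrix for the question's sample data and the per-row labels that `tapply` builds from it. It assumes the data sits in `dat` with numeric 0/1 columns; `per_row` is just an illustrative name.

```r
# Sample data from the question, named dat as in this answer (an assumption)
dat <- data.frame(Id = 10:14,
                  A = c(1, 1, 1, 1, 0),
                  B = c(0, 0, 1, 0, 1),
                  C = c(1, 1, 0, 0, 1))

# Row/column index of every cell equal to 1, listed column by column
w <- which(dat[-1] == 1L, arr.ind = TRUE)
w[, "row"]  # 1 2 3 4 3 5 1 2 5
w[, "col"]  # 1 1 1 1 2 2 3 3 3

# Look up the column name of each hit, then paste the names row by row
per_row <- tapply(names(dat)[-1][w[, "col"]], w[, "row"], paste, collapse = ", ")
per_row
# rows 1 to 5: "A, C" "A, C" "A, B" "A" "B, C"
```

Calling `table()` on `per_row` then gives the counts shown above.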

If the result is needed for more than display, one alternative among many that avoids an unnecessary `paste`/`strsplit` round trip is to keep each pattern as a character vector:

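``````
pats = split(names(dat)[-1][w[, "col"]], w[, "row"])
upats = pats[!duplicated(pats)]  # unique patterns, keeping their row labels
# I() stores the list of patterns as a single list column
data.frame(pat = I(upats), n = tabulate(match(pats, upats)))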
#   pat n
#1 A, C 2
#3 A, B 1
#4    A 1
#5 B, C 1
``````
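As a rough illustration of why keeping the patterns as character vectors can be handy, the sketch below queries them directly without splitting any strings. It again assumes the question's sample data in `dat`; `size` and `has_A` are hypothetical helper names.

```r
# Sample data from the question, named dat as in this answer (an assumption)
dat <- data.frame(Id = 10:14,
                  A = c(1, 1, 1, 1, 0),
                  B = c(0, 0, 1, 0, 1),
                  C = c(1, 1, 0, 0, 1))

w <- which(dat[-1] == 1L, arr.ind = TRUE)
pats <- split(names(dat)[-1][w[, "col"]], w[, "row"])  # one vector per row
upats <- unique(pats)                                  # distinct patterns
n <- tabulate(match(pats, upats))                      # count per pattern

# Each pattern is still a character vector, so it can be inspected as such
size <- lengths(upats)                                 # columns per pattern
has_A <- vapply(upats, function(p) "A" %in% p, logical(1))

sum(n[has_A])  # number of rows whose pattern includes column A: 4
```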