Numbering rows within groups in a data frame

Question

asked Jul 15, 2019 in R Programming by Ajinkya757 (5.3k points)

Working with a data frame similar to this:

set.seed(100)
df <- data.frame(cat = c(rep("aaa", 5), rep("bbb", 5), rep("ccc", 5)), val = runif(15))
df <- df[order(df$cat, df$val), ]
df
cat val
1 aaa 0.05638315
2 aaa 0.25767250
3 aaa 0.30776611
4 aaa 0.46854928
5 aaa 0.55232243
6 bbb 0.17026205
7 bbb 0.37032054
8 bbb 0.48377074
9 bbb 0.54655860
10 bbb 0.81240262
11 ccc 0.28035384
12 ccc 0.39848790
13 ccc 0.62499648
14 ccc 0.76255108
15 ccc 0.88216552

I am trying to add a column with numbering within each group. Doing it this way obviously isn't using the powers of R:

df$num <- 1
for (i in 2:(length(df[,1]))) {
  if (df[i,"cat"]==df[(i-1),"cat"]) {
    df[i,"num"]<-df[i-1,"num"]+1
  }
}
df
cat val num
1 aaa 0.05638315 1
2 aaa 0.25767250 2
3 aaa 0.30776611 3
4 aaa 0.46854928 4
5 aaa 0.55232243 5
6 bbb 0.17026205 1
7 bbb 0.37032054 2
8 bbb 0.48377074 3
9 bbb 0.54655860 4
10 bbb 0.81240262 5
11 ccc 0.28035384 1
12 ccc 0.39848790 2
13 ccc 0.62499648 3
14 ccc 0.76255108 4
15 ccc 0.88216552 5

What would be a good way to do this?

1 Answer

anonymous · Answer 1 · 2019-07-16T05:58:25+0000

To number the rows within each group, you can use the group_by and mutate functions from the dplyr package as follows:

set.seed(100)
df <- data.frame(cat = c(rep("aaa", 5), rep("bbb", 5), rep("ccc", 5)), val = runif(15))
df <- df[order(df$cat, df$val), ]

library(dplyr)
df %>% group_by(cat) %>% mutate(id = row_number())

Output:

# A tibble: 15 x 3
# Groups: cat [3]
cat val id
<fct> <dbl> <int>
1 aaa 0.0564 1
2 aaa 0.258 2
3 aaa 0.308 3
4 aaa 0.469 4
5 aaa 0.552 5
6 bbb 0.170 1
7 bbb 0.370 2
8 bbb 0.484 3
9 bbb 0.547 4
10 bbb 0.812 5
11 ccc 0.280 1
12 ccc 0.398 2
13 ccc 0.625 3
14 ccc 0.763 4
15 ccc 0.882 5
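The pipeline above only prints the numbered tibble; it does not change df. A minimal sketch of keeping the result, assuming you also want to drop the grouping afterwards (the name df_numbered is made up for illustration and is not part of the original answer):

```r
library(dplyr)

# Rebuild the example data from the question.
set.seed(100)
df <- data.frame(cat = c(rep("aaa", 5), rep("bbb", 5), rep("ccc", 5)), val = runif(15))
df <- df[order(df$cat, df$val), ]

# Assign the result so the new id column is kept.
# ungroup() removes the grouping so later verbs operate on the whole table.
df_numbered <- df %>%
  group_by(cat) %>%
  mutate(id = row_number()) %>%
  ungroup()
```

Without ungroup(), later calls such as mutate() or summarise() would still be evaluated per cat group, which may or may not be what you want.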

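You can also use the data.table package, which adds the column by reference (without copying the table) and is often faster than dplyr on large data. Either of the two assignment lines below produces the same id column, so you only need one of them:

library(data.table)
dt <- data.table(df)
dt[, id := seq_len(.N), by = cat]
dt[, id := rowid(cat)]
dt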

Output:

cat val id
1: aaa 0.05638315 1
2: aaa 0.25767250 2
3: aaa 0.30776611 3
4: aaa 0.46854928 4
5: aaa 0.55232243 5
6: bbb 0.17026205 1
7: bbb 0.37032054 2
8: bbb 0.48377074 3
9: bbb 0.54655860 4
10: bbb 0.81240262 5
11: ccc 0.28035384 1
12: ccc 0.39848790 2
13: ccc 0.62499648 3
14: ccc 0.76255108 4
15: ccc 0.88216552 5
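If you would rather not load a package, the same numbering can be done in base R. This is an alternative that the answer above does not cover: a small sketch using ave(), which applies a function within each level of a grouping variable and returns the results in the original row order.

```r
# Rebuild the example data from the question.
set.seed(100)
df <- data.frame(cat = c(rep("aaa", 5), rep("bbb", 5), rep("ccc", 5)), val = runif(15))
df <- df[order(df$cat, df$val), ]

# Number the rows within each cat group.
# seq_along(df$cat) only supplies a vector of the right length to split by group;
# FUN = seq_along then returns 1, 2, ..., n for each group.
df$num <- ave(seq_along(df$cat), df$cat, FUN = seq_along)

# Alternative that assumes the rows are already sorted by cat:
# rle() gives the length of each run of identical values, and
# sequence() expands those lengths into 1..n counters.
df$num2 <- sequence(rle(as.character(df$cat))$lengths)
```

The ave() version does not depend on the rows being sorted by group, whereas the rle() version counts consecutive runs and so only matches it when each group is contiguous.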
