Select first and last row from grouped data

Question

asked Jul 18, 2019 in R Programming by leealex956 (7.3k points)
edited Jan 10, 2024 by admin

Using dplyr, how do I select the top and bottom observations/rows of grouped data in one statement?

Data & Example

Given a data frame:

df <- data.frame(
  id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  stopId = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
  stopSequence = c(1, 2, 3, 3, 1, 4, 3, 1, 2)
)

I can get the top and bottom observations from each group using slice, but only with two separate statements:

library(dplyr)

# First stop of each id (lowest stopSequence)
firstStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(1) %>%
  ungroup()

# Last stop of each id (highest stopSequence)
lastStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(n()) %>%
  ungroup()

Can I combine these two statements into one that selects both top and bottom observations?

1 Answer

anonymous · Answer 1 · 2019-07-18T14:24:28+0000

To select the first and last row of each group in one statement, sort by stopSequence and keep the rows whose position within the group is either 1 or n():

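library("dplyr")

df <- data.frame(
  id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  stopId = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
  stopSequence = c(1, 2, 3, 3, 1, 4, 3, 1, 2)
)

df %>%
  group_by(id) %>%
  # arrange() ignores the grouping by default and sorts the whole data frame
  arrange(stopSequence) %>%
  # row_number() and n() are evaluated per group
  filter(row_number() %in% c(1, n()))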

Output:

# A tibble: 6 x 3
# Groups:   id [3]
     id stopId stopSequence
  <dbl> <fct>         <dbl>
1     1 a                 1
2     2 b                 1
3     3 b                 1
4     1 c                 3
5     3 a                 3
6     2 c                 4
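
Since the question asks about slice specifically, here is a sketch of a slice-based variant. It is one possible alternative, not the method given in the answer above. The name first_last is made up for illustration, wrapping the indices in unique() is an added precaution so that a group with a single row is not returned twice, and .by_group = TRUE is an optional choice that sorts within each id so both rows of a group come out next to each other.

```r
library(dplyr)

df <- data.frame(
  id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  stopId = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
  stopSequence = c(1, 2, 3, 3, 1, 4, 3, 1, 2)
)

# first_last is a hypothetical name used only in this sketch
first_last <- df %>%
  group_by(id) %>%
  # .by_group = TRUE sorts by id first, then by stopSequence within each id
  arrange(stopSequence, .by_group = TRUE) %>%
  # slice() takes row positions per group; n() is the size of the current group.
  # unique() collapses c(1, 1) to 1 when a group has only one row.
  slice(unique(c(1, n()))) %>%
  ungroup()

print(first_last)
```

This should return the same six rows as the filter version, ordered by id instead of by stopSequence. On R 4.0 or later, stopId will print as <chr> rather than <fct>, because data.frame() no longer converts strings to factors by default.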
