Question
Using dplyr, how do I select the top and bottom observations/rows of grouped data in one statement?
Data & Example
Given a data frame
df <- data.frame(id=c(1,1,1,2,2,2,3,3,3), stopId=c("a","b","c","a","b","c","a","b","c"),
stopSequence=c(1,2,3,3,1,4,3,1,2))
I can get the top and bottom observations from each group using slice, but using two separate statements:
firstStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(1) %>%
  ungroup()
lastStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(n()) %>%
  ungroup()
Can I combine these two statements into one that selects both top and bottom observations?
To select the first and last rows of each group in a single statement, filter on the within-group row number:
library("dplyr")
df <- data.frame(id=c(1,1,1,2,2,2,3,3,3), stopId=c("a","b","c","a","b","c","a","b","c"),
stopSequence=c(1,2,3,3,1,4,3,1,2))
df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  filter(row_number() %in% c(1, n()))
Output:
# A tibble: 6 x 3
# Groups: id [3]
id stopId stopSequence
<dbl> <fct> <dbl>
1 1 a 1
2 2 b 1
3 3 b 1
4 1 c 3
5 3 a 3
6 2 c 4
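Since the question started from slice, the same result can probably be reached by passing both positions to a single slice() call. The following is a minimal sketch rather than part of the original answer; the name first_last is only illustrative, and the unique() wrapper is an added precaution so that a group with a single row is not returned twice.

```r
library(dplyr)

df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 stopId = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
                 stopSequence = c(1, 2, 3, 3, 1, 4, 3, 1, 2))

# Sort by stopSequence, then keep the first and last row within each id.
# unique() collapses c(1, 1) to 1 when a group has only one row.
first_last <- df %>%
  arrange(stopSequence) %>%
  group_by(id) %>%
  slice(unique(c(1, n()))) %>%
  ungroup()

print(first_last)
```

Compared with filter(row_number() %in% c(1, n())), this version stays closer to the two original slice statements; which one reads better is largely a matter of taste.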