# Select first and last row from grouped data


Question

Using dplyr, how do I select the top and bottom observations/rows of grouped data in one statement?

Data & Example

Given a data frame:

    df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                     stopId = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
                     stopSequence = c(1, 2, 3, 3, 1, 4, 3, 1, 2))

I can get the top and bottom observations from each group with slice, but only by writing two separate statements:

    firstStop <- df %>%
      group_by(id) %>%
      arrange(stopSequence) %>%
      slice(1) %>%
      ungroup()

    lastStop <- df %>%
      group_by(id) %>%
      arrange(stopSequence) %>%
      slice(n()) %>%
      ungroup()

Can I combine these two statements into one that selects both top and bottom observations?

Answer

To select the first and last row of each group in one statement, sort by stopSequence and keep the rows whose position within the group is either 1 or n() (the group size):

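    library("dplyr")

    df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                     stopId = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
                     stopSequence = c(1, 2, 3, 3, 1, 4, 3, 1, 2))

    df %>%
      group_by(id) %>%
      arrange(stopSequence) %>%              # sorts the whole table by stopSequence
      filter(row_number() %in% c(1, n()))    # keep the first and last row of each group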

Output:

    # A tibble: 6 x 3
    # Groups:   id
         id stopId stopSequence
      <dbl> <fct>         <dbl>
    1     1 a                 1
    2     2 b                 1
    3     3 b                 1
    4     1 c                 3
    5     3 a                 3
    6     2 c                 4
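Since the question asks about slice specifically, here is a minimal sketch of the same idea using slice() with a vector of positions. The name firstLast is only illustrative. Two choices here are assumptions on my part rather than part of the answer above: sorting by id first, so the result comes out grouped together by id (arrange() ignores grouping unless told otherwise), and wrapping the positions in unique(), which as far as I know keeps a one-row group from being returned twice.

```r
library(dplyr)

df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 stopId = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
                 stopSequence = c(1, 2, 3, 3, 1, 4, 3, 1, 2))

# firstLast is a hypothetical name used for illustration
firstLast <- df %>%
  arrange(id, stopSequence) %>%      # order rows by id, then by stopSequence
  group_by(id) %>%
  slice(unique(c(1, n()))) %>%       # positions 1 and n() within each group
  ungroup()

firstLast
```

With the example data this should give six rows, two per id: the stop with the lowest stopSequence followed by the stop with the highest.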