Back

Explore Courses Blog Tutorials Interview Questions
0 votes
1 view
in R Programming by (5.3k points)

From a data frame like this

test <- data.frame('id'= rep(1:5,2), 'string'= LETTERS[1:10])

test <- test[order(test$id), ]

rownames(test) <- 1:10

> test

    id string

 1   1      A

 2   1      F

 3   2      B

 4   2      G

 5   3      C

 6   3      H

 7   4      D

 8   4      I

 9   5      E

 10  5      J

I want to create a new one with the first row of each id / string pair. If sqldf accepted R code within it, the query could look like this:

res <- sqldf("select id, min(rownames(test)), string 

              from test 

              group by id, string")

> res

    id string

 1   1      A

 3   2      B

 5   3      C

 7   4      D

 9   5      E

Is there a solution short of creating a new column like

test$row <- rownames(test)

and running the same sqldf query with min(row)?

1 Answer

0 votes
by

To select the first row by group, you can use the duplicated function while indexing a data frame as follows:

test <- data.frame('id'= rep(1:5,2), 'string'= LETTERS[1:10])

test <- test[order(test$id), ]

rownames(test) <- 1:10

 test[!duplicated(test$id),]

  id string

1  1      A

3  2      B

5  3      C

7  4      D

9  5      E

Related questions

Browse Categories

...