
I want to filter rows from a data.frame based on a logical condition. Let's suppose that I have a data frame like

   expr_value     cell_type
1    5.345618 bj fibroblast
2    5.195871 bj fibroblast
3    5.247274 bj fibroblast
4    5.929771          hesc
5    5.873096          hesc
6    5.665857          hesc
7    6.791656          hips
8    7.133673          hips
9    7.574058          hips
10   7.208041          hips
11   7.402100          hips
12   7.167792          hips
13   7.156971          hips
14   7.197543          hips
15   7.035404          hips
16   7.269474          hips
17   6.715059          hips
18   7.434339          hips
19   6.997586          hips
20   7.619770          hips
21   7.490749          hips

What I want is a new data frame that looks the same but only has the data for one cell_type, e.g. subset / select the rows which contain the cell type "hesc":

   expr_value     cell_type
1    5.929771          hesc
2    5.873096          hesc
3    5.665857          hesc

Or either cell type "bj fibroblast" or "hesc":

   expr_value     cell_type
1    5.345618 bj fibroblast
2    5.195871 bj fibroblast
3    5.247274 bj fibroblast
4    5.929771          hesc
5    5.873096          hesc
6    5.665857          hesc

Is there any easy way to do this?

I've tried the following, where the original data frame is called "expr":

expr[expr[2] == 'hesc']

# [1] "5.929771" "5.873096" "5.665857" "hesc"     "hesc"     "hesc"

but it returns the results in the wrong format: a character vector rather than a data frame.

1 Answer


To filter data frame rows by a logical condition, you can use the filter function from the dplyr package as follows:

library(dplyr) 

filter(expr, cell_type == "hesc") 

filter(expr, cell_type == "hesc" | cell_type == "bj fibroblast")
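
When several values should match, the chain of == comparisons can be replaced with %in%. The following is a minimal sketch, assuming dplyr is installed; the small data frame is a hypothetical cut-down copy of the question's data, and the name wanted is introduced only for illustration.

```r
library(dplyr)

# Hypothetical cut-down copy of the question's data, for illustration only
expr <- data.frame(
  expr_value = c(5.345618, 5.929771, 5.873096, 6.791656),
  cell_type  = c("bj fibroblast", "hesc", "hesc", "hips"),
  stringsAsFactors = FALSE
)

# Cell types to keep (illustrative name)
wanted <- c("hesc", "bj fibroblast")

# dplyr::filter is written out in full because stats::filter has the same name
res <- dplyr::filter(expr, cell_type %in% wanted)
print(res)
```

Writing dplyr::filter explicitly avoids surprises if another attached package also defines a function called filter.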

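You can also do this in base R with logical indexing. The comma after the condition matters: it tells R to select rows and keep all columns. The attempt in the question omits the comma and indexes with a logical matrix, so R returns a plain character vector instead of rows.

To select according to one cell type: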

expr[expr$cell_type == "hesc", ]

To select according to multiple cell types:

expr[expr$cell_type %in% c("hesc", "bj fibroblast"), ]

Output:

   expr_value     cell_type
1    5.345618 bj fibroblast
2    5.195871 bj fibroblast
3    5.247274 bj fibroblast
4    5.929771          hesc
5    5.873096          hesc
6    5.665857          hesc
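
Base R indexing keeps the original row names, whereas the desired output in the question is numbered from 1. Here is a small sketch of how that could be handled, again using a hypothetical cut-down copy of the data; subset() is shown as an equivalent base R shorthand.

```r
# Hypothetical cut-down copy of the question's data, for illustration only
expr <- data.frame(
  expr_value = c(5.345618, 5.195871, 5.929771, 5.873096, 5.665857, 6.791656),
  cell_type  = c("bj fibroblast", "bj fibroblast", "hesc", "hesc", "hesc", "hips"),
  stringsAsFactors = FALSE
)

# Logical indexing: the rows keep their original row names (3, 4, 5 here)
hesc <- expr[expr$cell_type == "hesc", ]

# subset() selects the same rows and is convenient for interactive use
hesc2 <- subset(expr, cell_type == "hesc")

# Reset the row names so they run from 1 again
rownames(hesc) <- NULL
print(hesc)
```

Resetting the row names is optional and only changes how the rows are labelled, not the data.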
