
I have a data frame containing a few numeric columns, and I need a single median across all of their values (one median for everything, not one per column). Although that seems simple, I could not find an explanation.

The df I have is similar to:

rep_id  sex     activator   P16401      P81605      B7Z958      B4DT29
CF9     Female  Control     808.3071    772.20756   14114.372   5516.857
CF10    Female  Control     1332.5300   739.96297   19373.688   4855.419
CF11    Female  Control     748.3975    1449.46860  17310.500   5324.638
CF12    Female  Control     1271.5207   978.48424   6217.883    6015.900
CF13    Female  Control     554.3564    461.37956   6659.669    5739.060
CF14    Female  Control     1575.7039   1770.07244  7143.650    5936.352

I just need the median of all the numerical values, the equivalent of =MEDIAN(D2:G7) in Excel. For many reasons, though, I prefer to analyze all the data in R.


Select the columns you want to include, either by name or by position, flatten them into a single vector with unlist(), and then take the median of that vector.

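# Columns 4 to 7 hold the numeric values
cols <- 4:7

# Flatten the selected columns into one vector and take a single median
median(unlist(df[cols]), na.rm = TRUE)

# Or convert the selected columns to a matrix first:
# median(as.matrix(df[cols]), na.rm = TRUE)

# [1] 3312.746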
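If the column positions are not known in advance, one possible variation is to pick the numeric columns programmatically. The sketch below rebuilds the sample data from the question so that it runs on its own; the names num_cols and overall_median are only illustrative, and it assumes every numeric column in the data frame should be included in the median.

```r
# Sample data copied from the question
df <- data.frame(
  rep_id    = c("CF9", "CF10", "CF11", "CF12", "CF13", "CF14"),
  sex       = "Female",
  activator = "Control",
  P16401    = c(808.3071, 1332.5300, 748.3975, 1271.5207, 554.3564, 1575.7039),
  P81605    = c(772.20756, 739.96297, 1449.46860, 978.48424, 461.37956, 1770.07244),
  B7Z958    = c(14114.372, 19373.688, 17310.500, 6217.883, 6659.669, 7143.650),
  B4DT29    = c(5516.857, 4855.419, 5324.638, 6015.900, 5739.060, 5936.352)
)

# Logical vector marking which columns are numeric (illustrative name)
num_cols <- sapply(df, is.numeric)

# Flatten the numeric columns into one vector and take a single median
overall_median <- median(unlist(df[num_cols]), na.rm = TRUE)
print(overall_median)
# [1] 3312.746
```

With 24 values, the median is the mean of the 12th and 13th sorted values, (1770.07244 + 4855.419) / 2, which matches the result above.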
