I am working in R and using dplyr to keep only a few columns of a data frame:

df1 <- df %>% dplyr::select(Species, Weight)

This gives the following output:

    Species  Weight
1   Dog      7
2   Cat      2
3   Dog      5
4   Dog      4
.   .        .
.   .        .
245 Cat      3
246 Dog      9
247 Cat      2

This is only an example; the data I am actually using contains 25 species of fish. How do I add the weights of each species together so that I have only one row per species?

After summing the weights, the result looks like this:

    Species  Weight
1   Dog      734
2   Cat      257

Now I want to plot this as a histogram. Can anyone help me plot it?

1 Answer

Group by 'Species' with group_by(), sum the 'Weight' column with summarise(), and then plot the result. The code below plots the log of each total; drop log() to plot the raw sums.

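library(dplyr)
library(ggplot2)  # needed for ggplot(), aes() and geom_col()

df %>%
  group_by(Species) %>%
  # total weight per species, on a log scale
  summarise(Weight = log(sum(Weight))) %>%
  ggplot(aes(x = Species, y = Weight)) +
  geom_col()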

Or you can use base R:

aggregate(Weight ~ Species, df, sum)
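To see what the aggregate() call returns, here is a minimal self-contained sketch. It assumes a small data frame that mirrors the seven sample rows from the question (the real data has 25 fish species), and the name totals is only illustrative.

```r
# Sample data mirroring the rows shown in the question (an assumption;
# the real data set is larger).
df <- data.frame(
  Species = c("Dog", "Cat", "Dog", "Dog", "Cat", "Dog", "Cat"),
  Weight  = c(7L, 2L, 5L, 4L, 3L, 9L, 2L)
)

# One row per species, with the weights summed within each group.
totals <- aggregate(Weight ~ Species, data = df, FUN = sum)
print(totals)
```

With this sample data the result has two rows: Cat with a total of 7 and Dog with a total of 25.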

Then draw a bar plot if needed:

barplot(rowsum(df$Weight, df$Species)[,1])

If you want a log scale, wrap the sums in log():

barplot(log(rowsum(df$Weight, df$Species))[,1])
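Note that the question asks for a histogram, while barplot() draws one bar per species total. As a rough sketch of the difference, assuming the same seven sample rows as the question (the axis labels and the names totals and h are illustrative), a bar chart takes the per-species sums, whereas hist() bins the individual weights by value:

```r
# Sample data mirroring the rows shown in the question (an assumption).
df <- data.frame(
  Species = c("Dog", "Cat", "Dog", "Dog", "Cat", "Dog", "Cat"),
  Weight  = c(7L, 2L, 5L, 4L, 3L, 9L, 2L)
)

# Bar chart input: one total per species, as a named vector.
totals <- rowsum(df$Weight, df$Species)[, 1]
barplot(totals, xlab = "Species", ylab = "Total weight")

# Histogram input: the individual weights, binned by value.
h <- hist(df$Weight, xlab = "Weight",
          main = "Distribution of individual weights")
```

For totals per category, the bar chart is usually the plot that matches the data; the histogram is useful only if the distribution of individual weights is of interest.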

The data I used is:

df <- structure(list(Species = c("Dog", "Cat", "Dog", "Dog", "Cat", "Dog", "Cat"),
                     Weight = c(7L, 2L, 5L, 4L, 3L, 9L, 2L)),
                class = "data.frame",
                row.names = c("1", "2", "3", "4", "245", "246", "247"))
