
I have the following matrix with 3 columns:

[,1]       [,2] [,3]
   1 0.11651699    1
   1 0.03850202    1
   0 0.11651699   NA
   0 0.11651699   NA
   1 0.04110752   39
   1 0.03599296   39
   1 0.05440237   41
   1 0.11651699   42
   1 0.06298718   42
   0 0.11651699   NA
   0 0.11651699   NA
   0 0.11651699   NA

I want to add a fourth column to the matrix that holds the sum of column 2 within each group, where the group is given by column 3. Rows whose group is NA should get 1. The expected output is below:

[,1]       [,2] [,3] [,4]
   1 0.11651699    1 0.155019   = (0.11651699 + 0.03850202)
   1 0.03850202    1 0.155019   = (0.11651699 + 0.03850202)
   0 0.11651699   NA 1
   0 0.11651699   NA 1
   1 0.04110752   39 0.07710048 = (0.04110752 + 0.03599296)
   1 0.03599296   39 0.07710048 = (0.04110752 + 0.03599296)
   1 0.05440237   41 0.09290439 = (0.03850202 + 0.05440237)
   1 0.11651699   42 0.1795042  = (0.11651699 + 0.06298718)
   1 0.06298718   42 0.1795042  = (0.11651699 + 0.06298718)
   0 0.11651699   NA 1
   0 0.11651699   NA 1
   1 0.03850202   41 0.09290439 = (0.03850202 + 0.05440237)

I know that I cannot use dplyr's group_by() here, because it only works with data frames and I am dealing with a matrix object. So I tried aggregate(df1[,2] ~ df1[,3], df, sum), and it worked, but it's not easy to take the result of aggregate() and turn it into the fourth column shown in the expected output.


To do that, you can use ave() inside cbind():

mat1 <- cbind(mat, ave(mat[, 2], mat[, 3], FUN = sum))

# Set the 4th column to 1 where column 3 is NA.
mat1[is.na(mat[, 3]), 4] <- 1

mat1
#      [,1]       [,2] [,3]       [,4]
# [1,]    1 0.11651699    1 0.15501901
# [2,]    1 0.03850202    1 0.15501901
# [3,]    0 0.11651699   NA 1.00000000
# [4,]    0 0.11651699   NA 1.00000000
# [5,]    1 0.04110752   39 0.07710048
# [6,]    1 0.03599296   39 0.07710048
# [7,]    1 0.05440237   41 0.09290439
# [8,]    1 0.11651699   42 0.17950417
# [9,]    1 0.06298718   42 0.17950417
#[10,]    0 0.11651699   NA 1.00000000
#[11,]    0 0.11651699   NA 1.00000000
#[12,]    0 0.03850202   41 0.09290439
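If you would rather keep the aggregate() approach from the question, one possible way to map its result back onto the rows is match(). The following is only a sketch: the matrix is rebuilt from the printed output above (an assumption about the actual data), and the names agg, col4 and mat2 are purely illustrative.

```r
# Assumed data: the 12 rows as they appear in the printed output above.
mat <- cbind(
  c(1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0),
  c(0.11651699, 0.03850202, 0.11651699, 0.11651699, 0.04110752, 0.03599296,
    0.05440237, 0.11651699, 0.06298718, 0.11651699, 0.11651699, 0.03850202),
  c(1, 1, NA, NA, 39, 39, 41, 42, 42, NA, NA, 41)
)

# One row per non-NA group; the formula interface omits rows with NA.
agg <- aggregate(x ~ g, data = data.frame(x = mat[, 2], g = mat[, 3]), FUN = sum)

# Look up each row's group in the aggregated table; NA groups find no match.
col4 <- agg$x[match(mat[, 3], agg$g)]

# Rows without a group get the placeholder value 1.
col4[is.na(mat[, 3])] <- 1

# deparse.level = 0 keeps the new column unnamed, like the others.
mat2 <- cbind(mat, col4, deparse.level = 0)

# Cross-check against the ave() approach shown above.
via_ave <- ave(mat[, 2], mat[, 3], FUN = sum)
via_ave[is.na(mat[, 3])] <- 1
stopifnot(isTRUE(all.equal(via_ave, col4)))
```

The ave() version is shorter because it returns a vector that already has one value per row, whereas aggregate() returns one row per group and needs the extra lookup step.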
