
I am adding a new variable with a default value, then filtering the data and changing that value for the selected rows. In the process I lose the rest of the observations.

This is the code I am using:

library(dplyr)   # %>%, filter(), group_by(), slice(), mutate()
library(tibble)  # add_column()

## add `final_session` column, default value 0
old_sp_long2 <- old_sp_long %>% add_column(final_session = 0)

## select most recent date of sessions 1--15 and mark as final session == 1
df <- old_sp_long2 %>%
    filter(wave <= 15) %>%
    group_by(uci) %>%
    slice(which.max(date)) %>%
    mutate(final_session = replace(final_session, final_session == 0, 1))

A minimal dataset is below:

structure(list(uci = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("10001h",
"10268h", "10431h"), class = "factor"), wave = c(1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
12L, 13L, 14L, 15L, 16L, 17L), date = structure(c(17042, 17053,
17060, 17074, 17086, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 17003, 17010, 17015, 17055, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, 16994, 17000, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA), class = "Date"), session = c(1L,
2L, 3L, 4L, 5L, 6L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
1L, 2L, 3L, 4L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, 1L, 2L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA)), class = "data.frame", row.names = c(NA, -51L))

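I want to keep every observation and only set final_session to 1 on the row with the most recent date within waves 1 to 15 for each uci. I'm sure this is possible, but I cannot figure it out. Does anyone have a solution?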

1 Answer


I think you need something like this in R programming:

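library(dplyr)

old_sp_long2 %>%
  group_by(uci) %>%
  # latest non-missing date among waves 1-15, computed per uci
  mutate(max_date = max(date[wave <= 15], na.rm = TRUE),
         # wave on which that date occurs
         max_wave = wave[which.max(date == max_date)],
         # set the flag to 1 on the row with that date; other rows keep 0
         final_session = replace(final_session, date == max_date, 1))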

#   uci     wave date       session final_session max_date   max_wave
#   <fct>  <int> <date>       <int>         <dbl> <date>        <int>
# 1 10001h     1 2016-08-29       1             0 2016-10-12        5
# 2 10001h     2 2016-09-09       2             0 2016-10-12        5
# 3 10001h     3 2016-09-16       3             0 2016-10-12        5
# 4 10001h     4 2016-09-30       4             0 2016-10-12        5
# 5 10001h     5 2016-10-12       5             1 2016-10-12        5
# 6 10001h     6 NA               6             0 2016-10-12        5
# 7 10001h     7 NA              NA             0 2016-10-12        5
# 8 10001h     8 NA              NA             0 2016-10-12        5
# 9 10001h     9 NA              NA             0 2016-10-12        5
#10 10001h    10 NA              NA             0 2016-10-12        5
# … with 41 more rows

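This keeps the same number of observations as in your original old_sp_long2. mutate() only adds or changes columns, whereas the filter() and slice() calls in your code discard every row they do not select, which is why the other observations disappeared.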
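As a minimal sketch of the same idea on data you can run directly, the example below uses a small made-up data frame (`toy`, with hypothetical ids "a" and "b", not your real dataset) and a helper column `eligible` that is introduced here purely for illustration. It assumes you only want to flag rows that are themselves in waves 1 to 15 and have a non-missing date.

```r
library(dplyr)

# Hypothetical toy data with the same column names as the question
toy <- data.frame(
  uci  = rep(c("a", "b"), each = 4),
  wave = rep(c(1L, 2L, 3L, 16L), times = 2),
  date = as.Date(c("2016-08-29", "2016-09-09", NA, "2016-12-01",
                   "2016-07-21", NA, NA, NA))
)

flagged <- toy %>%
  group_by(uci) %>%
  mutate(
    # rows that may be the final session: wave 1-15 with a known date
    eligible = wave <= 15 & !is.na(date),
    # 1 on the latest eligible date per uci, 0 on every other row
    final_session = as.integer(eligible & date == max(date[eligible]))
  ) %>%
  ungroup() %>%
  select(-eligible)

# No rows were dropped: every observation is still present
nrow(flagged) == nrow(toy)
```

Because the flag is built inside mutate(), rows outside waves 1 to 15 and rows with missing dates simply get 0 instead of being removed. If a uci has no dated row in waves 1 to 15 at all, max() will emit a warning, so that case would need separate handling.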
