I am new to Data Science and am learning it from online resources. I have a dataset that looks like this:

df <- data.frame(id= c(1,1,1,2,2,2,3,3,3), time=c(1,2,3,1,2,3,1,2,3),y = rnorm(9), x1 = LETTERS[seq( from = 1, to = 9 )], x2 = c(0,0,0,0,1,0,1,1,1),c2 = rnorm(9))

df

#   id time          y x1 x2           c2
# 1  1    1  0.6364831  A  0 -0.066480473
# 2  1    2  0.4476390  B  0  0.161372575
# 3  1    3  1.5113458  C  0  0.343956178
# 4  2    1  0.3532957  D  0  0.279987147
# 5  2    2  0.3401402  E  1 -0.462635393
# 6  2    3 -0.3160222  F  0  0.338454940
# 7  3    1 -1.3797158  G  1 -0.621169576
# 8  3    2  1.4026640  H  1 -0.005690801
# 9  3    3  0.2958363  I  1 -0.176488132

I am trying to write a function that takes two parameters: the first is the dataset and the second is the variable of interest.

The function is divided into several steps. However, I run into trouble when I try to filter the dataset using data.table, with a function that looks like this:

testfun<- function(dataset,var){

  intermediatedf<-unique(setDT(dataset)[var==1 & c2>0,.(y)])

return(intermediatedf)

}

When I run the code below, it fails:

df2<-testfun(df,y)

Can anyone show me how to pass both the dataset and a column name to the function, so that the column is used in the filter?

1 Answer


You can use substitute() to capture the column name passed as var without evaluating it, and then eval() to evaluate it inside the data.table, where it resolves to the column. This gives the desired output.

Here is the code:

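library(data.table)

testfun <- function(dataset, var) {

    # Capture the column name passed as `var` without evaluating it

    var <- substitute(var)

    # setDT() converts a plain data.frame to a data.table by reference, as in the question;

    # eval(var) then resolves to the named column inside [ ]

    intermediatedf <- unique(setDT(dataset)[eval(var) == 1 & c2 > 0, .(y)])

    return(intermediatedf)

}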
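
To see the mechanism on its own, here is a minimal base-R sketch of the same substitute()/eval() idea, without data.table. The function name filter_by and the small toy data set are made up for illustration and are not part of the original question; the values are fixed so the result does not depend on rnorm().

```r
# Hypothetical helper for illustration: filters on a column passed by bare name.
filter_by <- function(dataset, var) {
  # substitute() captures the argument as an unevaluated symbol, e.g. x2
  var_expr <- substitute(var)
  # eval() looks that symbol up among the columns of `dataset`
  flag <- eval(var_expr, dataset)
  # Keep rows where the chosen column is 1 and c2 is positive; return unique y
  unique(dataset[flag == 1 & dataset$c2 > 0, "y", drop = FALSE])
}

# Toy data with fixed values (assumption: same column roles as in the question)
toy <- data.frame(
  id = c(1, 1, 2, 2),
  y  = c(10, 20, 30, 40),
  x2 = c(0, 1, 1, 1),
  c2 = c(0.5, 0.2, -0.1, 0.3)
)

result <- filter_by(toy, x2)
print(result)
```

Here only the second and fourth rows have x2 equal to 1 and a positive c2, so the result contains the y values 20 and 40. Note that the call in the question, testfun(df, y), filters on y == 1, which a column drawn from rnorm() will essentially never satisfy; a 0/1 column such as x2 is probably what was intended.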
