Take the Free Practice Test
Instructions:
This test is free and can be attempted multiple times.
60 Minutes
30 Multiple Choice Questions
Welcome to your Data Science Quiz
How would you access the ‘StreamingTV’ column from the ‘customer_churn’ data.frame?
customer_churn@StreamingTV
customer_churn#StreamingTV
customer_churn$StreamingTV
customer_churn&StreamingTV
How would you create a list consisting of these elements: 100, ‘sparta’, TRUE?
List(100,'sparta',TRUE)
list(100,'sparta',TRUE)
list(c(100,'sparta',TRUE))
list(list(100,'sparta',TRUE))
How would you get the last 100 records from the ‘customer_churn’ dataframe?
last(100)
tail(100)
last(customer_churn,100)
tail(customer_churn,100)
How would you give a discount of 33% to the 5th cell of ‘MonthlyCharges’ column?
customer_churn$MonthlyCharges[5]*(.33)
customer_churn$MonthlyCharges[5]*(33)
customer_churn$MonthlyCharges[5]*(.67)
customer_churn$MonthlyCharges[5]*(67)
Which of these is the correct code to count the customers whose ‘MonthlyCharges’ is greater than 100?
count=0 for(i in 1:nrow(customer_churn)){ if(customer_churn$MonthlyCharges[i]>100){ count=count+1 } } count
count=0 i=1 while(i<=nrow(customer_churn)){ if(customer_churn$MonthlyCharges[i]>100){ count=count+1 } } count
count=1 for(i in 1:nrow(customer_churn)){ if(customer_churn$MonthlyCharges[i]>100){ count=count+50 } } count
count=0 if(customer_churn$MonthlyCharges[i]>100){ count=count+1
How would you extract only the female customers from the ‘customer_churn’ data.frame?
sqldf ("select gender='Female' from customer_churn")
sqldf("select from customer_churn where gender=='Female' ")
sqldf("select * from customer_churn where gender=='Female' ")
sqldf("select * from customer_churn where gender in 'Female' ")
How would you extract a random sample of 33 records from the ‘customer_churn’ dataframe?
sample(customer_churn,33)
sample_frac(customer_churn,.33)
sample_frac(customer_churn,33)
sample_n(customer_churn,33)
How would you select the 3rd, 4th & 5th columns from the ‘customer_churn’ dataframe?
select(customer_churn,(3,4,5))
select(customer_churn,3,4,5)
select(customer_churn,list(3,4,5))
select(3:5,customer_churn)
How would you get a summarized result for the mean of ‘MonthlyCharges’ grouped w.r.t ‘PaymentMethod’?
summarise(group_by(customer_churn,PaymentMethod),mean_MC=mean(MonthlyCharges))
group_by(mean_MC=mean(MonthlyCharges),summarise(customer_churn,PaymentMethod))
group_by(summarise(customer_churn,PaymentMethod),mean_MC=mean(MonthlyCharges))
summarise(group_by(PaymentMethod),mean_MC=mean(customer_churn,MonthlyCharges))
How would you extract those customers who have subscribed to both ‘StreamingTV’ & ‘StreamingMovies’?
filter(customer_churn, StreamingTV=="Yes" & StreamingMovies=="Yes")
filter(customer_churn, StreamingTV=="Yes" && StreamingMovies=="Yes")
filter(customer_churn, StreamingTV=="Yes" and StreamingMovies=="Yes")
filter(customer_churn, StreamingTV=="Yes" AND StreamingMovies=="Yes")
To which of these geometries can you add facet_grid()?
geom_bar()
geom_histogram()
geom_point()
All of the above
Which of these is the correct code to make a box-plot between the ‘tenure’ & the ‘DeviceProtection’ columns? ‘tenure’ should be mapped on the y-axis & ‘DeviceProtection’ should be mapped on the x-axis. The fill color should be determined by the ‘DeviceProtection’ column.
ggplot(data = customer_churn,aes(y=tenure,x=DeviceProtection,fill=DeviceProtection))+geom_boxplot()
ggplot(data = customer_churn,aes(y=tenure,x=DeviceProtection,fill="DeviceProtection"))+geom_boxplot()
ggplot(data = customer_churn,aes(y=tenure,x=DeviceProtection))+geom_boxplot(fill=DeviceProtection)
ggplot(data = customer_churn,aes(y=tenure,x=DeviceProtection))+geom_boxplot(col=DeviceProtection)
How would you make a histogram for the ‘tenure’ column with the plotly package? The color of the bins should be determined by the ‘Churn’ column.
plot_ly(data = customer_churn,x=tenure,type="histogram", fill = ~ Churn)
plot_ly(data = customer_churn,x="tenure",type="histogram", color = "Churn")
plot_ly(data = customer_churn,x=~tenure,type="histogram", color = ~ Churn)
None of the above
Which of these is the correct code to make a histogram for the ‘tenure’ column? The fill color of the bins should be ‘azure’ & the number of bins should be 87.
ggplot(data = customer_churn,aes(x=tenure,fill='azure'))+geom_histogram(bins=87)
ggplot(data = customer_churn,aes(x=tenure))+geom_histogram(fill="azure",bins=87)
ggplot(data = customer_churn,aes(x=tenure))+geom_histogram(col="azure",bins=87)
ggplot(data = customer_churn,aes(x=tenure,col='azure'))+geom_histogram(bins=87)
Which of these is the correct code to make a bar-plot for the ‘OnlineBackup’ column? The color of the bars should be determined by the ‘PhoneService’ column.
ggplot(data = customer_churn,aes(x=OnlineBackup,fill=PhoneService))+geom_bar()
ggplot(data = customer_churn,aes(x=OnlineBackup))+geom_bar(fill=PhoneService)
ggplot(data = customer_churn,aes(y=OnlineBackup,fill=PhoneService))+geom_bar()
ggplot(data = customer_churn,aes(fill=OnlineBackup,x=PhoneService))+geom_bar()
Which of these is the correct code to make a scatter-plot between the ‘TotalCharges’ & the ‘tenure’ columns? ‘TotalCharges’ should be mapped on the y-axis & ‘tenure’ should be mapped on the x-axis. The color of the points should be ‘yellow’.
ggplot(data = customer_churn,aes(y=TotalCharges,x=tenure))+geom_point(fill="yellow")
ggplot(data = customer_churn,aes(x=TotalCharges,y=tenure))+geom_point(fill="yellow")
ggplot(data = customer_churn,aes(y=TotalCharges,x=tenure,col="yellow"))+geom_point()
ggplot(data = customer_churn,aes(y=TotalCharges,x=tenure))+geom_point(col="yellow")
Which of these is the correct code to make a bar-plot for the ‘TechSupport’ column? The color of the bars should be ‘blue’ & the title of the plot should be ‘Distribution of Tech Support’.
plot(customer_churn$TechSupport,color="blue",title="Distribution of Tech Support")
plot(customer_churn$TechSupport,col="blue",title="Distribution of Tech Support")
plot(customer_churn$TechSupport,col="blue",main="Distribution of Tech Support")
plot(customer_churn$TechSupport,fill="blue",main="Distribution of Tech Support")
How would you build a linear model where the dependent variable is ‘MonthlyCharges’ & the independent variables are ‘tenure’, ‘PaymentMethod’ & ‘Contract’?
lm(MonthlyCharges~tenure+PaymentMethod+Contract, data=customer_churn)
lm(MonthlyCharges=tenure,PaymentMethod,Contract, data=customer_churn)
lm(MonthlyCharges=tenure+PaymentMethod+Contract, data=customer_churn)
Im(MonthlyCharges~tenure,PaymentMethod,Contract + data=customer_churn)
The sample.split() function is part of which package?
tree
caret
randomForest
caTools
Which function is used to create the ROC curve?
ROC()
Predict()
Performance()
Roc_plot()
How would you create a simple logistic regression model where the dependent variable is ‘gender’ & the independent variable is ‘MonthlyCharges’?
lm(gender=MonthlyCharges, data= customer_churn, family="binomial")
glm(gender~MonthlyCharges, data= customer_churn, family="logistic")
glm(gender~MonthlyCharges, data= customer_churn, family="binomial")
glm(gender~MonthlyCharges, data= customer_churn)
How would you build a decision tree model where the dependent variable is ‘Churn’ & the independent variables are ‘tenure’, ‘InternetService’ & ‘OnlineBackup’?
decision_tree(Churn~tenure+InternetService+OnlineBackup, data=customer_churn)
tree(Churn~tenure+InternetService+OnlineBackup, data=customer_churn)
decision_tree(Churn~tenure+InternetService+OnlineBackup)
tree(Churn~tenure+InternetService+OnlineBackup)
How would you build a random forest model where the dependent variable is ‘Churn’ & the independent variable is ‘MonthlyCharges’? The number of trees in the model should be 100.
randomForest(Churn~MonthlyCharges, data=customer_churn, trees=100)
randomForest(Churn=MonthlyCharges, data=customer_churn, tree=100)
Forest(Churn~MonthlyCharges, data=customer_churn,ntree=100)
randomForest(Churn~MonthlyCharges,data=customer_churn,ntree=100)
What is the minimum number of variables/features required to perform clustering?
0
1
2
3
Which of the following algorithms is most sensitive to outliers?
K-means clustering algorithm
K-medians clustering algorithm
K-modes clustering algorithm
K-medoids clustering algorithm
Which of the following are true?
1. Clustering analysis is negatively affected by multicollinearity of features.
2. Clustering analysis is negatively affected by heteroscedasticity.
1 only
2 only
1 and 2
None of them
Which of the following is a bad characteristic of a dataset for clustering analysis?
Data points with outliers
Data points with different densities
Data points with non-convex shapes
All of the above
Every iteration of the K-Means algorithm contains which of the following steps?
Randomly assigning all data-points to one of K clusters.
Randomly assigning the positions of K centroids in the data-point space.
Check if the average squared distance between all data-points and all centroids is decreasing.
Assigning data-points to the closest centroid using a given similarity (distance) measure.
A data scientist is asked to implement an article recommendation feature for an online magazine. The magazine does not want to use client tracking technologies such as cookies or reading history. Therefore, only the style and subject matter of the current article are available for making recommendations. All of the magazine’s articles are stored in a database in a format suitable for analytics. Which method should the data scientist try first?
K Means Clustering
Logistic Regression
Association Rules
Imagine you are trying to hire a Data Scientist for your team. In addition to technical ability and quantitative background, which additional essential trait would you look for in people applying for this position?
Communication skill
Scientific background
Domain expertise
Well Organized