0 votes
2 views
in Python by (140 points)
vif = calculate_vif(features)
# Drop the feature with the highest VIF until no VIF exceeds 10
while (vif['VIF'] > 10).any():
    remove = vif.sort_values('VIF', ascending=False)['Features'].iloc[0]
    features = features.drop(columns=remove)
    vif = calculate_vif(features)

1 Answer

0 votes
by (106k points)

You can use the code below to calculate the variance inflation factor (VIF). Just replace the CSV file name with your own:

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = pd.read_csv('Your_csv_file.csv')
df = df.dropna()  # dropna() returns a new DataFrame, so assign the result
df = df.select_dtypes(include='number')  # keep only the numeric columns
df.head()

# Add an intercept column; variance_inflation_factor does not add one itself
X = sm.add_constant(df)

# For each column of X, calculate the VIF and save it in a DataFrame
vif = pd.DataFrame()
vif["VIF Factor"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
vif["features"] = X.columns

I hope this helps!
