I have a pandas DataFrame with some categorical predictors (i.e. variables) coded as 0 and 1, and some numeric variables. When I fit it with statsmodels like this:
est = sm.OLS(y, X).fit()
It throws:
Pandas data cast to numpy dtype of object. Check input data with np.asarray(data).
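The check that the error message suggests can be sketched as follows. This is a minimal illustration, not my actual data: the frame `X` and its column names `flag` and `value` are hypothetical, and the `flag` column is deliberately stored with object dtype to mimic the problem.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real design matrix:
# "flag" holds 0/1 values but is stored as Python objects.
X = pd.DataFrame({
    "flag": pd.Series([0, 1, 1], dtype="object"),
    "value": [1.5, 2.0, 3.25],
})

# The conversion the error message points to. A single object
# column is enough to make the whole array object-typed.
arr_dtype = np.asarray(X).dtype

# List the columns responsible for the object dtype.
object_cols = X.select_dtypes(include="object").columns.tolist()

print(arr_dtype)
print(object_cols)
```

If `object_cols` comes back non-empty on the real data, those columns would be the first place to look.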
I converted all the dtypes of the DataFrame using
After this, the dtypes of all DataFrame columns appear as int32 or int64, but the output still ends with dtype: object, like this:
Here 4516 and 4523 are variable labels.
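One thing worth ruling out: as far as I can tell, `df.dtypes` returns a Series whose values are dtype objects, so the trailing `dtype: object` line describes that Series rather than the data. A minimal sketch, assuming a hypothetical two-column frame that reuses the labels 4516 and 4523:

```python
import pandas as pd

# Hypothetical frame with integer column labels, all values int32.
df = pd.DataFrame({4516: [0, 1, 0], 4523: [1, 1, 0]}, dtype="int32")

dtypes = df.dtypes

# The Series of dtypes stores dtype objects, so its own dtype is object
# even though every column is numeric.
series_dtype = dtypes.dtype

# Check the columns themselves instead of the trailing summary line.
all_numeric = all(pd.api.types.is_numeric_dtype(t) for t in dtypes)

print(dtypes)
print(series_dtype, all_numeric)
```

So the trailing line alone may not indicate that any column is still object-typed.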
Any ideas? I need to build a multiple regression model with hundreds of variables. To do that, I concatenated three pandas DataFrames into the final DataFrame used for model building.
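For context, the concatenation step looks roughly like the sketch below, with a coercion pass I could apply afterwards. The frames `a`, `b`, `c` and their columns are hypothetical placeholders, and I am only assuming that one of the source frames might carry numbers stored as strings.

```python
import pandas as pd

# Three hypothetical source frames sharing the same row index.
a = pd.DataFrame({"x1": [0, 1, 0]})
b = pd.DataFrame({"x2": [1.5, 2.5, 3.5]})
# Assumed problem case: numeric values stored as strings (object dtype).
c = pd.DataFrame({"x3": ["1", "2", "3"]}, dtype=object)

# Column-wise concatenation into one design matrix.
X = pd.concat([a, b, c], axis=1)

# Coerce every column to a numeric dtype; values that cannot be
# parsed become NaN rather than raising.
X_numeric = X.apply(pd.to_numeric, errors="coerce")

print(X.dtypes)
print(X_numeric.dtypes)
```

With `errors="coerce"`, it would be worth checking `X_numeric.isna().sum()` afterwards, since unparseable entries are silently turned into NaN.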