I trying to load a bunch of ExcelSheets into Oracle table. The excelsheets are not consistent. So the tables in Oracle need to be dynamically generated based on the column in the Excelsheets. This is the easiest solution I found in the internet, so I went on with this approach. I'm kinda Novice in Python.
Taking the following structure as an example.
I'm trying to import the following excel into MEM_TABLE table. The DATE field is inconsistent, it can have NA, _ND or Date as values.
The code I've used to load the excel is below
xls_file = pd.ExcelFile("Excel File")
df = xls_file.parse('Sheet1',parse_dates=True)
df = df.replace(np.nan, '') #replace NaN with blank string
df = df.replace('NA', '') #replace "NA" with blank string
df = df.replace('_ND', '') #replace "NA" with blank string
df.to_sql('MEM_TABLE', conn_ora, if_exists='append', index = False )
The table that to_sql method created has following structure
MEM_TABLE(
date CLOB NULL,
emp_nbr NUMBER(19,0) NULL
)
In dataframe it reads something like this
DATE EMP_NBR
0 2
1 2018-01-04 00:00:00 1
2 2018-01-01 00:00:00 2
3 2018-01-03 00:00:00 1
4 1
5 1
Looks like Pandas is auto casting 1/4/2018 to date when creating the DataFrame. The error I'm getting below, which I believe is because the DATE column in table MEM_TABLE is CLOB and in dataFrame the Date field has DATE object. Looks likes the date object can't be insert into a CLOB cell.
TypeError: expecting string or bytes object
If the Dates are in string(i.e "1/4/2018"), I'm not getting any errors.
My question here is Is there anyway to disable the auto Parse on the date field, while the dataframe is getting created?