Back

Explore Courses Blog Tutorials Interview Questions
0 votes
2 views
in Python by (19.9k points)

I trying to load a bunch of ExcelSheets into Oracle table. The excelsheets are not consistent. So the tables in Oracle need to be dynamically generated based on the column in the Excelsheets. This is the easiest solution I found in the internet, so I went on with this approach. I'm kinda Novice in Python.

Taking the following structure as an example.

enter image description here

I'm trying to import the following excel into MEM_TABLE table. The DATE field is inconsistent, it can have NA, _ND or Date as values.

The code I've used to load the excel is below

xls_file = pd.ExcelFile("Excel File")

df =  xls_file.parse('Sheet1',parse_dates=True)

df =  df.replace(np.nan, '') #replace NaN with blank string

df = df.replace('NA', '') #replace "NA" with blank string

df = df.replace('_ND', '') #replace "NA" with blank string

df.to_sql('MEM_TABLE', conn_ora, if_exists='append', index = False )

The table that to_sql method created has following structure

MEM_TABLE(

  date    CLOB         NULL,

  emp_nbr NUMBER(19,0) NULL

)

In dataframe it reads something like this

 DATE                       EMP_NBR

0                             2

1  2018-01-04 00:00:00        1

2  2018-01-01 00:00:00        2

3  2018-01-03 00:00:00        1

4                             1

5                             1

Looks like Pandas is auto casting 1/4/2018 to date when creating the DataFrame. The error I'm getting below, which I believe is because the DATE column in table MEM_TABLE is CLOB and in dataFrame the Date field has DATE object. Looks likes the date object can't be insert into a CLOB cell.

TypeError: expecting string or bytes object

If the Dates are in string(i.e "1/4/2018"), I'm not getting any errors.

My question here is Is there anyway to disable the auto Parse on the date field, while the dataframe is getting created?

1 Answer

0 votes
by (25.1k points)

You are facing this issue because on line 2 you are setting parse_dates argument as true, it needs to be set as false. 

df =  xls_file.parse('Sheet1',parse_dates=False)

Browse Categories

...