Read multiple CSV files in Azure ML Python Script

Question

1 Answer

Shubham Rana · Answer 1 · 2019-07-16T10:42:02+0000

Here's some more detail on the approach others have outlined above. Try replacing the code currently in the "Execute Python Script" module with the following:

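import pandas as pd
import os

def azureml_main(dataframe1=None, dataframe2=None):
    # List the working directory so the extracted zip contents show up in the output log
    print(os.listdir('.'))
    # The module expects a dataframe back, so return an empty one for now
    return(pd.DataFrame([]))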

After running the experiment, click on the module. There should be a "View output log" link now in the right-hand bar. I get something like the following:

[Information] Started in [C:\temp]
[Information] Running in [C:\temp]
[Information] Executing 4af67c05ba02417a980f6a16e84e61dc with inputs [] and generating outputs ['.maml.oport1']
[Information] Extracting Script Bundle.zip to .\Script Bundle
[Information] File Name Modified Size
[Information] temp.csv 2016-05-06 13:16:56 52
[Information] [ READING ] 0:00:00
[Information] ['4af67c05ba02417a980f6a16e84e61dc.py', 'Script Bundle', 'Script Bundle.zip']

This tells me that the contents of my zip file have been extracted to the C:\temp\Script Bundle folder. In my case the zip file contained just one CSV file, temp.csv; your output would probably list four files. You may also have zipped a folder containing your four files, in which case the file path would be one level deeper. You can use os.listdir() to explore your directory structure further if necessary.
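If the files turn out to be nested, a recursive listing can save a few round trips through the experiment. The following is a minimal sketch using only the standard library; the helper name list_files is hypothetical and not part of Azure ML.

```python
import os

def list_files(root="."):
    """Return the path of every file under root, relative to root, sorted."""
    found = []
    # os.walk visits root and each subdirectory beneath it
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            found.append(os.path.relpath(full_path, root))
    return sorted(found)
```

Calling print(list_files('.')) inside azureml_main, in place of print(os.listdir('.')), should write the complete listing to the output log in a single run.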

Once you think you know the full filepaths for your CSV files, edit your Execute Python Script module's code to load them, e.g.:

import pandas as pd

def azureml_main(dataframe1=None, dataframe2=None):
    # Path taken from the output log: the zip is extracted to C:\temp\Script Bundle
    df = pd.read_csv('C:/temp/Script Bundle/temp.csv')
    # ...load other files and merge into a single dataframe...
    return(df)
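
For the "load other files and merge" step, one possible approach is sketched below. It assumes that all of the CSV files sit directly in the extracted folder and share the same columns, so that stacking their rows is what you want; if the files need to be joined on a key instead, pd.merge would be the better fit. The helper name load_csvs and the BUNDLE_DIR constant are hypothetical.

```python
import glob
import os

import pandas as pd

# Assumption: the folder reported in the output log; adjust if yours differs
BUNDLE_DIR = 'C:/temp/Script Bundle'

def load_csvs(folder):
    """Read every .csv file in folder and stack the rows into one dataframe."""
    paths = sorted(glob.glob(os.path.join(folder, '*.csv')))
    if not paths:
        raise ValueError('No CSV files found in {}'.format(folder))
    frames = [pd.read_csv(path) for path in paths]
    # ignore_index=True renumbers the rows 0..n-1 across all files
    return pd.concat(frames, ignore_index=True)

def azureml_main(dataframe1=None, dataframe2=None):
    return(load_csvs(BUNDLE_DIR))
```

Sorting the paths keeps the row order reproducible between runs, and the explicit error makes a wrong folder path visible in the output log rather than failing inside pd.concat.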
