
in Data Science by (18.4k points)
  • I need to find CSV files from a folder
  • List all files inside a folder
  • Convert files to JSON and save in the same bucket

There are many CSV files in the bucket, each like the one below:

emp_id,Name,Company
10,Aka,TCS
11,VeI,TCS

My code is below:

import boto3
import pandas as pd

def lambda_handler(event, context):
    s3 = boto3.resource('s3')
    my_bucket = s3.Bucket('testfolder')
    for file in my_bucket.objects.all():
        print(file.key)
    for csv_f in file.key:
        with open(f'{csv_f.replace(".csv", ".json")}', "w") as f:
            pd.read_csv(csv_f).to_json(f, orient='index')

This is not able to save to the bucket; if I remove the bucket name, it saves to a local folder instead. How can I save the JSON files back to the same bucket?

1 Answer

by (36.8k points)

You can check the following code:

from io import StringIO
import boto3
import pandas as pd

def lambda_handler(event, context):
    s3 = boto3.resource('s3')
    input_bucket = 'bucket-with-csv-file-44244'
    my_bucket = s3.Bucket(input_bucket)

    for file in my_bucket.objects.all():
        if file.key.endswith(".csv"):
            # pandas can read s3:// URLs directly via s3fs
            csv_f = f"s3://{input_bucket}/{file.key}"
            print(csv_f)

            # Output key: same name, .json extension
            json_file = file.key.replace(".csv", ".json")
            print(json_file)

            # Convert in memory, then write back to the same bucket
            json_buffer = StringIO()
            df = pd.read_csv(csv_f)
            df.to_json(json_buffer, orient='index')
            s3.Object(input_bucket, json_file).put(Body=json_buffer.getvalue())
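The conversion step itself (read_csv, then to_json with orient='index') can be checked locally without S3, using the sample data from the question held in an in-memory buffer:

```python
from io import StringIO

import pandas as pd

# Sample CSV from the question, kept in memory instead of S3
csv_data = "emp_id,Name,Company\n10,Aka,TCS\n11,VeI,TCS\n"

df = pd.read_csv(StringIO(csv_data))

# Same conversion the Lambda performs: JSON keyed by row index
json_buffer = StringIO()
df.to_json(json_buffer, orient='index')

print(json_buffer.getvalue())
# → {"0":{"emp_id":10,"Name":"Aka","Company":"TCS"},"1":{"emp_id":11,"Name":"VeI","Company":"TCS"}}
```

With orient='index' each row index becomes a top-level key, which matches the output shape of the Lambda above.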

Your Lambda layer will need to include:

fsspec

pandas

s3fs
