
I have a folder of CSV files. Each file name starts with a string identifying the game and ends with a tag identifying which of that game's tables the file holds. Example:

20020905_nyg_scoring.csv

20020905_nyg_team_stats.csv

20020908_buf_scoring.csv

20020908_buf_team_stats.csv

I've written a script that pairs csv files by the first part of the file name into a dictionary and then turns that dictionary into a list. I want to read the file name pairs in and perform dataframe shaping on each pair together. Ultimately, I will concat the data from the paired files into a single dataframe (concat is not my issue here).

import numpy as np
import pandas as pd
import os

game_list = {}
path = r'C:\Users\jobon\Documents\New NFL Stats\Experimental\2002 Game Logs'

for file in os.listdir(path):
    game_pairing = game_list.get(file[:12],[])
    game_pairing.append(file)
    game_list[file[:12]] = game_pairing

game_pairs = []

for game, stats in game_list.items():
    game_pairs.append(stats)

for scoring, team_stats in game_pairs:
    for file in os.listdir(path):
        df1 = pd.read_csv(scoring, header = 0, index_col = 0)
        df1.drop(['Detail', 'Quarter', 'Time', 'Tm'], axis = 1, inplace = True)
        ...more shaping...

I expect to end with a final set of data frames generated from each pair of game files that I can concat.

Instead I get

FileNotFoundError Traceback (most recent call last)
<ipython-input-37-fb1d4aa9f003> in <module>
     18 for scoring, team_stats in game_pairs:
     19     for file in os.listdir(path):
---> 20         df1 = pd.read_csv(scoring, header = 0, index_col = 0)
     21         #df1.drop(['Detail', 'Quarter', 'Time', 'Tm'], axis = 1, inplace = True)
     22         print(df1)

FileNotFoundError: [Errno 2] File b'20020905_nyg_scoring.csv' does not exist: b'20020905_nyg_scoring.csv'

The files are in the folder, and it worked for building the list, but I don't know why it suddenly can't find the files now.

1 Answer


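The issue is that the scoring variable holds only the file name, not the full path to the file. os.listdir returns bare names, so pd.read_csv looks for the file relative to the current working directory rather than inside path. Join the folder path and the file name with the os.path.join function before reading (and do the same for team_stats), like this: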

scoring = os.path.join(path, scoring)
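
To show where the join fits into the pairing step, here is a minimal sketch rather than a drop-in replacement. The helper name pair_game_files and its key_len parameter are hypothetical (they are not in the original script); the 12-character prefix is taken from the question's file[:12]. Sorting the names is an added assumption so that each game's scoring file comes before its team_stats file.

```python
import os
from collections import defaultdict


def pair_game_files(folder, key_len=12):
    """Group the CSV files in `folder` by the first `key_len` characters
    of their names and return one list of full paths per game.

    Hypothetical helper for illustration; not part of the original script.
    """
    games = defaultdict(list)
    # Sorting puts "..._scoring.csv" before "..._team_stats.csv" in each group.
    for name in sorted(os.listdir(folder)):
        if name.endswith(".csv"):
            # os.listdir returns bare names, so join each one onto the folder
            # to get a path that can be opened from any working directory.
            games[name[:key_len]].append(os.path.join(folder, name))
    return list(games.values())


# Possible usage with the question's variables (assumes pandas is imported
# as pd and that every game has exactly two files):
#
# for scoring, team_stats in pair_game_files(path):
#     df1 = pd.read_csv(scoring, header=0, index_col=0)
#     df2 = pd.read_csv(team_stats, header=0, index_col=0)
```

Because the full paths are built once while grouping, the inner `for file in os.listdir(path)` loop from the question is not needed; with it, each pair would be re-read once per file in the folder.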
