
I have a folder of CSV files where each file name starts with a string identifying the game and ends with a tag identifying which table from that game it holds. Example:

20020905_nyg_scoring.csv

20020905_nyg_team_stats.csv

20020908_buf_scoring.csv

20020908_buf_team_stats.csv

I've written a script that pairs csv files by the first part of the file name into a dictionary and then turns that dictionary into a list. I want to read the file name pairs in and perform dataframe shaping on each pair together. Ultimately, I will concat the data from the paired files into a single dataframe (concat is not my issue here).

import numpy as np
import pandas as pd
import os

game_list = {}
path = r'C:\Users\jobon\Documents\New NFL Stats\Experimental\2002 Game Logs'

for file in os.listdir(path):
    game_pairing = game_list.get(file[:12],[])
    game_pairing.append(file)
    game_list[file[:12]] = game_pairing

game_pairs = []

for game, stats in game_list.items():
    game_pairs.append(stats)

for scoring, team_stats in game_pairs:
    for file in os.listdir(path):
        df1 = pd.read_csv(scoring, header = 0, index_col = 0)
        df1.drop(['Detail', 'Quarter', 'Time', 'Tm'], axis = 1, inplace = True)
        ...more shaping...

I expect to end with a final set of data frames generated from each pair of game files that I can concat.

Instead I get

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-37-fb1d4aa9f003> in <module>
     18 for scoring, team_stats in game_pairs:
     19     for file in os.listdir(path):
---> 20         df1 = pd.read_csv(scoring, header = 0, index_col = 0)
     21         #df1.drop(['Detail', 'Quarter', 'Time', 'Tm'], axis = 1, inplace = True)
     22         print(df1)

FileNotFoundError: [Errno 2] File b'20020905_nyg_scoring.csv' does not exist: b'20020905_nyg_scoring.csv'

The files are in the folder, and it worked for building the list, but I don't know why it suddenly can't find the files now.

1 Answer


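I just ran your code. The problem is that os.listdir(path) returns bare file names, not full paths. Your .csv files live in the folder path, so pd.read_csv cannot find them when you pass only the file name scoring: it looks in the current working directory instead. To fix this, you need

scoring = os.path.join(path, scoring)

in your loop, before the call to pd.read_csv (and the same for team_stats when you read it).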
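As a minimal sketch of the same idea, the folder can also be joined on once, while the pairs are being built, so every later loop already works with full paths. The helper name pair_game_files, its key_len parameter, and the temporary demo folder are assumptions made for illustration; they are not part of the original script.

```python
import os
import tempfile
from collections import defaultdict


def pair_game_files(folder, key_len=12):
    """Group CSV files by the first `key_len` characters of their names.

    Hypothetical helper: returns a list of lists of FULL paths.
    """
    groups = defaultdict(list)
    for name in sorted(os.listdir(folder)):
        if name.endswith(".csv"):
            # os.listdir returns bare names, so prepend the folder here
            groups[name[:key_len]].append(os.path.join(folder, name))
    return list(groups.values())


# Demo in a throwaway folder (made-up contents, not the asker's real data)
demo_dir = tempfile.mkdtemp()
for name in ["20020905_nyg_scoring.csv", "20020905_nyg_team_stats.csv",
             "20020908_buf_scoring.csv", "20020908_buf_team_stats.csv"]:
    with open(os.path.join(demo_dir, name), "w") as fh:
        fh.write("id,Tm\n1,nyg\n")

game_pairs = pair_game_files(demo_dir)
for scoring, team_stats in game_pairs:
    # Both names include the folder, so they resolve from any working directory
    print(os.path.isfile(scoring), os.path.isfile(team_stats))
```

With paths built this way, scoring and team_stats can be passed straight to pd.read_csv, and the inner for file in os.listdir(path) loop from the question should no longer be needed, since each pair only has to be read once.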
