I'm trying to download historical data for a list of tickers and save each one to a CSV file. I can make this work with a for loop, but that is slow when the list of stock tickers is in the thousands. I'm trying to multithread the process, but I keep getting various errors. Sometimes it downloads only 1 file, other times 2 or 3, and a few times even 6, but never more than that. I'm guessing that has something to do with having a 6-core, 12-thread processor, but I really don't know.

import csv
import os
import yfinance as yf
import pandas as pd
from threading import Thread

ticker_list = []
with open('tickers.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    name = None
    for row in reader:
        if row[0]:
            ticker_list.append(row[0])

start_date = '2019-03-03'
end_date = '2020-03-04'
data = pd.DataFrame()

def y_hist(i):
    ticker = ticker_list[i]
    data = yf.download(ticker, start=start_date, end=end_date, group_by="ticker")
    data.to_csv('yhist/' + ticker + '.csv', sep=',', encoding='utf-8')

threads = []
for i in range(os.cpu_count()):
    print('registering thread %d' % i)
    threads.append(Thread(target=y_hist,args=(i,)))

for thread in threads:
    thread.start()

for thread in threads:
    thread.join()

print('done')


This is a simplified version with its output; perhaps it will help explain the issue.

import os
import pandas as pd
import yfinance as yf
from threading import Thread

ticker_list = ['IBM','MSFT','QQQ','SPY','FB','XLV','XLF','XLK','XLE','GTHX','IYR','ONE','ROG','OLED','GLD']

def y_hist():
    for ticker in ticker_list:
        print(ticker)

threads = []
for i in range(os.cpu_count()):
    threads.append(Thread(target=y_hist))

for thread in threads:
    thread.start()

for thread in threads:
    thread.join()

Output:

IBM

MSFT

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

OLED

GLD

IBM

MSFT

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

OLED

GLD

IBM

MSFT

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

IBM

MSFT

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

OLED

GLD

OLEDIBM

MSFT

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

GLD

IBM

MSFT

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

OLED

IBM

GLD

MSFT

ROG

OLED

GLD

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

OLED

GLD

IBM

MSFT

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

OLED

GLD

IBM

MSFT

QQQ

SPY

IBM

MSFT

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

OLED

GLD

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

OLED

GLD

IBM

MSFT

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

OLED

IBM

MSFT

QQQ

SPY

FB

XLV

XLF

XLK

XLE

GTHX

IYR

ONE

ROG

OLED

GLD

GLD
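
One reading of the output above: every thread is handed the same function with no index, so each of the os.cpu_count() threads walks the whole list and every ticker is printed once per thread. The first script has the opposite problem: thread i only ever touches ticker_list[i], so at most os.cpu_count() tickers are attempted. The sketch below is a minimal, offline illustration of the difference. The names run_threads and share_work are hypothetical and not part of the original scripts.

```python
import threading


def run_threads(tickers, n_threads, share_work):
    """Start n_threads workers and return every ticker they handled."""
    handled = []
    lock = threading.Lock()

    def worker(chunk):
        for ticker in chunk:
            with lock:  # protect the shared list from concurrent appends
                handled.append(ticker)

    threads = []
    for i in range(n_threads):
        # share_work=False mirrors the simplified script: each thread gets the full list.
        # share_work=True gives thread i every n_threads-th ticker, starting at index i.
        chunk = tickers[i::n_threads] if share_work else tickers
        threads.append(threading.Thread(target=worker, args=(chunk,)))

    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled
```

With share_work=False the returned list has n_threads times as many entries as the input; with share_work=True each ticker appears exactly once, although the order can vary between runs.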

1 Answer


While this doesn't directly fix my broken code, it is a solution that gets the same result. It uses yfinance's built-in multithreading. Unfortunately, I still don't know why the original code doesn't work, and I would still appreciate feedback on that. In the meantime, this will work if anyone is looking for a solution to the same problem.

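import csv
import os
import time
import yfinance as yf

start = time.time()

# Read the ticker symbols from the first column of tickers.csv.
ticker_list = []
with open('tickers.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    for row in reader:
        if row and row[0]:
            ticker_list.append(row[0])

# Download every ticker in a single call; threads=True lets yfinance
# fetch the tickers in parallel.
data = yf.download(
        tickers = ticker_list,
        period = '1y',
        interval = '1d',
        group_by = 'ticker',
        auto_adjust = False,
        prepost = False,
        threads = True
    )

# Create the output directory if it does not exist yet.
os.makedirs('yhist', exist_ok=True)

# Transpose so the (ticker, field) column MultiIndex becomes the row index,
# then select each ticker's rows and transpose back before writing.
data = data.T
for ticker in ticker_list:
    data.loc[(ticker,),].T.to_csv('yhist/' + ticker + '.csv', sep=',', encoding='utf-8')

print('It took', time.time()-start, 'seconds.')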
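
As for running the downloads on explicit threads, a bounded thread pool that is fed every ticker is one option. The following is a sketch, not a tested fix for the original script. run_for_all and fetch_and_save are hypothetical helper names. Using yf.Ticker(...).history(...) instead of yf.download(...) rests on an assumption: that concurrent yf.download calls may interfere with each other through state shared inside the library. That would be consistent with the erratic file counts described in the question.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_and_save(ticker, start_date, end_date, out_dir='yhist'):
    """Download one ticker's history and write it to <out_dir>/<ticker>.csv."""
    import yfinance as yf  # imported lazily so run_for_all can be tried offline

    os.makedirs(out_dir, exist_ok=True)
    # Assumption: a per-ticker Ticker object avoids sharing state between threads.
    frame = yf.Ticker(ticker).history(start=start_date, end=end_date)
    path = os.path.join(out_dir, ticker + '.csv')
    frame.to_csv(path, sep=',', encoding='utf-8')
    return path


def run_for_all(tickers, worker, max_workers=8):
    """Call worker(ticker) for every ticker on a bounded thread pool.

    Returns (results, errors), two dicts keyed by ticker.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, t): t for t in tickers}
        for future in as_completed(futures):
            ticker = futures[future]
            try:
                results[ticker] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[ticker] = exc
    return results, errors
```

It could be called as run_for_all(ticker_list, lambda t: fetch_and_save(t, '2019-03-03', '2020-03-04')). Because the work is network-bound, max_workers does not have to match the number of CPU cores; it only caps how many requests are in flight at once.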
