
I am dealing with several large txt files, each of which has about 8,000,000 lines. A short example of the lines:

usedfor zipper fasten_coat
usedfor zipper fasten_jacket
usedfor zipper fasten_pant
usedfor your_foot walk
atlocation camera cupboard
atlocation camera drawer
atlocation camera house
relatedto more plenty

The code to store them in a dictionary is:

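import collections

dicCSK = collections.defaultdict(list)

for line in finCSK:
    line = line.strip('\n')
    try:
        r, c1, c2 = line.split(" ")
    except ValueError:
        # Report the malformed line and skip it, so stale values
        # from a previous line are not appended again.
        print(line)
        continue
    dicCSK[c1].append(r + " " + c2)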

It runs fine on the first txt file, but when it gets to the second txt file, I get a MemoryError.

I am using Windows 7 64-bit with Python 2.7 32-bit, an Intel i5 CPU, and 8 GB of memory. How can I solve the problem?

Further explanation: I have four large files, each containing different information about many entities. For example, I want to find all the information for cat, its parent node animal, its child node persian cat, and so on. So my program first reads all the text files into dictionaries, then scans all the dictionaries to find the information for cat, its parent, and its children.

1 Answer


To see where the MemoryError comes from, start by estimating how much memory the data needs. Assuming your example text is representative of all the text, one line would consume about 75 bytes on my machine:

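>>> import sys
>>> sys.getsizeof('usedfor zipper fasten_coat')
75

The exact figure depends on the Python version and build, but as a rough estimate, 75 bytes × 8,000,000 lines is about 600 million bytes (roughly 572 MB) of string data per file, before counting the overhead of the dictionary and its lists. A 32-bit Python process on Windows can address only about 2 GB by default, no matter how much RAM is installed, so a second file of this size is already enough to exhaust it.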
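One possible way around the limit, assuming you only need a few entities (for example cat, its parent, and its children) rather than everything in the files, is to stream each file and keep only the lines for those entities. The sketch below is an illustration under that assumption, not a drop-in fix; the names `collect_relations` and `wanted` are hypothetical, and the sample data is a small in-memory stand-in for one of the large files.

```python
import collections
import io


def collect_relations(lines, wanted):
    """Collect 'relation target' strings only for entities in `wanted`.

    `lines` is any iterable of text lines (an open file works), so the
    input is read one line at a time instead of being held in memory.
    """
    found = collections.defaultdict(list)
    for line in lines:
        # split() with no argument splits on any whitespace and
        # ignores leading/trailing spaces and the newline.
        parts = line.split()
        if len(parts) != 3:
            continue  # skip malformed lines
        relation, entity, target = parts
        if entity in wanted:
            found[entity].append(relation + " " + target)
    return found


# Small in-memory stand-in for one of the large files.
sample = io.StringIO(
    u"usedfor zipper fasten_coat\n"
    u"usedfor zipper fasten_jacket\n"
    u"atlocation camera cupboard\n"
    u"relatedto more plenty\n"
)
result = collect_relations(sample, {"zipper", "camera"})
```

With a real file you would pass the open file object in place of `sample`. Memory use then grows with the number of matching lines rather than with the file size. If you really do need every line in memory at once, installing a 64-bit Python removes the 32-bit address-space limit, at the cost of using most of the machine's RAM.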
