I'm taking the Coursera Python course and I'm stuck on Assignment 10.2: my program produces the wrong output. Here is what the task asks:

Write a program to read through the mbox-short.txt and figure out the distribution by hour of the day for each of the messages. You can pull the hour out from the 'From ' line by finding the time and then splitting the string a second time using a colon.

From [email protected] Sat Jan  5 09:14:16 2008

Once you have accumulated the counts for each hour, print out the counts, sorted by the hour as shown below.

Look at my code:

name = raw_input("Enter file:")

if len(name) < 1 : name = "mbox-short.txt"

handle = open(name)

counts = dict()

lst = list()

for line in handle:

    line = line.rstrip()

    if not line.startswith('From '):

        continue

    words = line.split()

    words = words[5]

    words = words.split(":")

    for word in counts:

        counts[word] = counts.get(word, 0) + 1

lst = list()

for key, val in counts.items():

    lst.append((key, val))

lst.sort()

print lst

Can anyone tell me where I went wrong?

4 Answers

0 votes
by (25.7k points)
Best answer
In your code, there are a few issues that need to be addressed. Here's the modified version of your code with the necessary corrections:

name = input("Enter file:")

if len(name) < 1:

    name = "mbox-short.txt"

handle = open(name)

counts = dict()

for line in handle:

    line = line.rstrip()

    if not line.startswith('From '):

        continue

    words = line.split()

    time = words[5]

    hour = time.split(":")[0]

    counts[hour] = counts.get(hour, 0) + 1

lst = sorted(counts.items())

for key, val in lst:

    print(key, val)

Here are the changes made:

Replaced raw_input() with input(), and the print statement with the print() function, to support Python 3.

Removed the unnecessary creation of an empty lst list.

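Replaced the inner loop "for word in counts:". It iterated over the counts dictionary, which was still empty, so its body never ran and nothing was ever counted. The new code takes the hour (the first piece of the time after splitting on ":") and increments its count directly.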

Sorted the counts.items() using the sorted() function to achieve the desired sorted output.

These modifications should address the issues and allow your code to correctly calculate the distribution by hour and print the counts sorted by hour.
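
To see why the code in the question prints an empty list, here is a minimal sketch that runs both versions of the counting step on a few made-up 'From ' lines. The sample addresses and the helper names count_buggy and count_fixed are hypothetical and exist only for illustration:

```python
# Hypothetical sample lines laid out like the 'From ' lines in mbox-short.txt.
SAMPLE = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "From bob@example.com Fri Jan  4 18:10:48 2008",
    "From carol@example.com Fri Jan  4 09:05:31 2008",
]


def count_buggy(lines):
    """Counting step as written in the question."""
    counts = dict()
    for line in lines:
        words = line.split()[5].split(":")  # e.g. ['09', '14', '16']
        # Bug: this loops over the keys of counts, which is still empty,
        # so the body never runs and nothing is ever added.
        for word in counts:
            counts[word] = counts.get(word, 0) + 1
    return counts


def count_fixed(lines):
    """Counting step from the corrected version."""
    counts = dict()
    for line in lines:
        hour = line.split()[5].split(":")[0]  # keep only the hour field
        counts[hour] = counts.get(hour, 0) + 1
    return counts


print(sorted(count_buggy(SAMPLE).items()))  # []
print(sorted(count_fixed(SAMPLE).items()))  # [('09', 2), ('18', 1)]
```

The hours are zero-padded strings such as '09', so sorting them as text gives the same order as sorting them as numbers.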
0 votes
by (26.4k points)

Check the code below (Python 2):

name = raw_input("Enter file:")

if len(name) < 1 : name = "mbox-short.txt"

handle = open(name)

hours = dict()

for line in handle:

    if line.startswith("From "):

        hour = line.split()[5].split(':')[0]

        hours[hour] = hours.get(hour, 0) + 1

for key, value in sorted(hours.items()):

    print key, value

0 votes
by (15.4k points)
Here's a more compact way to solve the assignment:

name = input("Enter file:")

if len(name) < 1:

    name = "mbox-short.txt"

handle = open(name)

counts = {}

for line in handle:

    line = line.rstrip()

    if not line.startswith('From '):

        continue

    hour = line.split()[5].split(":")[0]

    counts[hour] = counts.get(hour, 0) + 1

lst = sorted(counts.items())

for hour, count in lst:

    print(hour, count)

This version works the same way as the accepted answer: a dictionary (counts) tracks the count for each hour as you iterate through the lines, and sorted() orders the dictionary items by hour before printing. The only real difference is that the hour is extracted in a single expression, line.split()[5].split(":")[0], instead of going through intermediate variables.

Compared with the code in the question, it drops the inner loop and the manual list-building, since sorted(counts.items()) already returns a sorted list of (hour, count) pairs.
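
As an alternative to a plain dictionary, the standard library's collections.Counter can do the tallying. This is a sketch of a different technique, not what the answer above does. The function name hours_histogram and the sample lines are made up for illustration, and it assumes every line starting with 'From ' has the time in the sixth field:

```python
from collections import Counter


def hours_histogram(lines):
    """Count messages per hour from an iterable of mbox-style lines."""
    return Counter(
        line.split()[5].split(":")[0]   # hour part of the HH:MM:SS field
        for line in lines
        if line.startswith("From ")     # skips 'From:' header lines too
    )


# Hypothetical sample input, including a header line that must be ignored.
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "From: alice@example.com",
    "From bob@example.com Fri Jan  4 18:10:48 2008",
    "From carol@example.com Fri Jan  4 09:05:31 2008",
]

histogram = hours_histogram(sample)
for hour, count in sorted(histogram.items()):
    print(hour, count)
```

Counter is a dict subclass, so sorted(histogram.items()) works exactly as it does with the plain dictionary.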
0 votes
by (19k points)
Here is the correct code for the above question:

name = input("Enter file:")

if len(name) < 1:

    name = "mbox-short.txt"

handle = open(name)

counts = {}

for line in handle:

    line = line.rstrip()

    if not line.startswith('From '):

        continue

    hour = line.split()[5].split(":")[0]

    counts[hour] = counts.get(hour, 0) + 1

sorted_counts = sorted(counts.items())

for hour, count in sorted_counts:

    print(hour, count)

In this alternative version, the code is structured similarly to the previous solution. The dictionary counts is used to store the counts for each hour, and the sorted() function is employed to sort the dictionary items by hour. The resulting sorted items are then iterated over to print the hour and its corresponding count.

This version keeps the same logic as the previous answers; only the name of the sorted list (sorted_counts) differs.
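
The dictionary-and-sorted() logic described here can also be wrapped in a function that closes the file automatically. The following is a sketch under assumptions: count_hours is a hypothetical name, and the demo writes a small temporary file with made-up lines rather than assuming mbox-short.txt is present:

```python
import os
import tempfile


def count_hours(path):
    """Return a dict mapping hour strings to message counts."""
    counts = {}
    with open(path) as handle:  # the file is closed when this block exits
        for line in handle:
            if not line.startswith("From "):
                continue
            hour = line.split()[5].split(":")[0]
            counts[hour] = counts.get(hour, 0) + 1
    return counts


# Demo on a throwaway file with hypothetical contents.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("From alice@example.com Sat Jan  5 09:14:16 2008\n")
    tmp.write("Subject: hello\n")
    tmp.write("From bob@example.com Fri Jan  4 18:10:48 2008\n")
    demo_path = tmp.name

demo_counts = count_hours(demo_path)
os.remove(demo_path)

for hour, count in sorted(demo_counts.items()):
    print(hour, count)
```

For the assignment itself, the call would be count_hours(name) with the file name read from input().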
