I am working on a project, and in one part of it I need to extract particular votes. The votes in bootstrap fall into categories A-D, and I want to extract only "C". I first checked how many times "C" occurs using the code below:

print(Counter(bootstrap).get('C'))

This printed None.

So I tried the code below, hoping to get the number of votes for "C":

import numpy as np
from datascience import *

votes = Table().with_column('vote', np.array(['C']*470 + ['T']*380 + ['J']*80 + ['S']*30 + ['U']*40))

from collections import Counter

def proportions_in_resamples():
    prop_c = make_array()
    for i in np.arange(5000):
        bootstrap = votes.sample(votes.num_rows, with_replacement=False)
        print(Counter(bootstrap).get('C'))
        single_proportion=Counter(bootstrap).get("C")/ bootstrap.num_rows
        prop_c = np.append(prop_c, single_proportion)
    return prop_c

I then tried np.count_nonzero() to compute the proportion instead:

single_proportion = np.count_nonzero(bootstrap=="C") / bootstrap.num_rows

but it returns 0.0.

Can anyone suggest how to get the correct result?

1 Answer


I am not a data science expert, and I am not entirely sure what you are trying to do, but I gave your code a try. After installing the library and running the code, I noticed that the values are stored in an array inside the table. So you need to pass bootstrap.columns[0] to Counter instead of the table itself. With that change, and with your original sample size, the count of "C" is 470.
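To illustrate why the column has to be passed, here is a small sketch that does not use datascience at all. It assumes the table behaves like a mapping from column label to column values, and the dict named table_like is only a hypothetical stand-in for it. Under that assumption, Counter sees the label 'vote' rather than the individual votes, and comparing the whole object with "C" gives a single False, which would explain both the None and the 0.0.

```python
from collections import Counter

import numpy as np

# Same vote counts as in the question.
vote_column = np.array(['C']*470 + ['T']*380 + ['J']*80 + ['S']*30 + ['U']*40)

# Hypothetical stand-in for the table: column label -> column values.
# This is an assumption for illustration, not the datascience Table class.
table_like = {'vote': vote_column}

# Counter on the container only knows about the key 'vote', so 'C' is missing.
count_from_table = Counter(table_like).get('C')

# Counter on the column counts the individual votes.
count_from_column = Counter(table_like['vote']).get('C')

# Comparing the whole container with 'C' is a single False, so nothing is counted.
matches_on_table = np.count_nonzero(table_like == 'C')

# Comparing the column with 'C' gives an element-wise boolean array.
matches_on_column = np.count_nonzero(table_like['vote'] == 'C')

print(count_from_table, count_from_column, matches_on_table, matches_on_column)
```

The same idea applies to np.count_nonzero in the question: compare the column (for example bootstrap.columns[0] == "C"), not the table.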

I also passed num_rows//3 as the sample size, which gives counts that vary between samples, so Counter works fine. The updated code is below:

import numpy as np
from collections import Counter
from datascience import *

votes = Table().with_column('vote', np.array(['C']*470 + ['T']*380 + ['J']*80 + ['S']*30 + ['U']*40))

def proportions_in_resamples():
    prop_c = make_array()
    for i in np.arange(5000):
        # Sampling without replacement, so draw a third of the rows to get variation
        bootstrap = votes.sample(votes.num_rows//3, with_replacement=False)
        # Count the values in the first (only) column, not the table itself
        count_c = Counter(bootstrap.columns[0]).get("C", 0)
        print(count_c)
        single_proportion = count_c / bootstrap.num_rows
        prop_c = np.append(prop_c, single_proportion)
    return prop_c

print(proportions_in_resamples())

The tail of the output is shown below:

138
158
159
162
155
159
165
151
159
161
[0.46546547 0.43843844 0.48648649 ... 0.45345345 0.47747748 0.48348348]
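One note on the sampling: drawing a third of the rows without replacement is a subsample, whereas a bootstrap usually draws as many rows as the original data with replacement. Below is a minimal sketch of that variant using plain NumPy instead of datascience. The parameter names (values, target, repetitions, seed) are my own choices for illustration and not part of the original code.

```python
import numpy as np

def proportions_in_resamples(values, target, repetitions=5000, seed=0):
    """Return the proportion of `target` in each bootstrap resample of `values`."""
    rng = np.random.default_rng(seed)  # seeded only to make the run repeatable
    n = len(values)
    proportions = np.empty(repetitions)
    for i in range(repetitions):
        # Bootstrap step: same size as the data, drawn with replacement.
        resample = rng.choice(values, size=n, replace=True)
        # Element-wise comparison on the array, then count the matches.
        proportions[i] = np.count_nonzero(resample == target) / n
    return proportions

votes = np.array(['C']*470 + ['T']*380 + ['J']*80 + ['S']*30 + ['U']*40)
prop_c = proportions_in_resamples(votes, 'C', repetitions=1000)
print(prop_c[:5])
```

Preallocating the result with np.empty avoids calling np.append on every iteration, which copies the array each time.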
