# Drop the chunks that are not exactly n characters long [duplicate]


I am supposed to split this sequence into a list of chunks of length n=3.

codons('agucaccgucautc')

# result = ['agu','cac','cgu','cau']

# 'tc' is supposed to be ignored because it is shorter than n=3

I have tried the code below:

def codons(RNA):
    """This function returns a list of codons present in an RNA sequence"""
    # store the length of string
    length = len(RNA)
    # divide the string in n equal parts
    n = 3
    temp = 0
    chars = int(len(RNA)/3)
    # stores the array of string
    change = []
    # check whether a string can be divided into n equal parts
    for i in range(0, length, chars):
        part = [RNA[i:i+3] for i in range(0, length, n)]
        change.append(part)
        return part
        if (length % n != 0):
            continue

But when I run the code above, it still returns the 'tc':

codons('agucaccgucautc')

# result = ['agu', 'cac', 'cgu', 'cau', 'tc']

What should I do to ignore trailing characters that do not make up a full chunk of n=3, such as the final 'tc'?

by (36.8k points)

You could use a list comprehension that keeps only the full-length chunks, as in the code below:

s = 'agucaccgucautc'

n = 3

out = [(s[i:i+n]) for i in range(0, len(s), n) if len(s[i:i+n])%n == 0]
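As an alternative sketch (my own variation, not the only way to do it), you can stop the range before the incomplete tail, so no filter is needed. The function name `codons` and n=3 come from the question; the parameter name `rna` and the variable `usable` are just illustrative choices:

```python
def codons(rna, n=3):
    """Return the complete length-n chunks of rna, dropping any shorter tail."""
    # Length of the longest prefix that splits evenly into chunks of n.
    usable = len(rna) - len(rna) % n
    # Every slice taken below is exactly n characters long.
    return [rna[i:i + n] for i in range(0, usable, n)]


print(codons('agucaccgucautc'))  # ['agu', 'cac', 'cgu', 'cau']
```

This also returns an empty list when the input is shorter than n, which may or may not be what you want for very short sequences.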
