I am supposed to split this sequence into a list of chunks of length n=3.

codons('agucaccgucautc')

# result = ['agu','cac','cgu','cau']

# 'tc' should be ignored because its length is not n=3

I have tried the code below:

def codons(RNA):
    """This function returns a list of codons present in an RNA sequence"""
    # store the length of the string
    length = len(RNA)
    # divide the string into n equal parts
    n = 3
    temp = 0
    chars = int(len(RNA)/3)
    # stores the array of strings
    change = []
    # check whether the string can be divided into n equal parts
    for i in range(0, length, chars):
        part = [RNA[i:i+3] for i in range(0, length, n)];
        change.append(part);
        return part
        if (length % n != 0):
            continue

But when I run this code, it still returns the trailing 'tc':

codons('agucaccgucautc')

# result = ['agu', 'cac', 'cgu', 'cau', 'tc']

What should I do to ignore leftover characters that do not make up a full chunk of n=3, such as the trailing 'tc'?

1 Answer


You could use a list comprehension that keeps only the chunks that are exactly n characters long:

s = 'agucaccgucautc'
n = 3
# keep a slice only if it is a full chunk of n characters
out = [s[i:i+n] for i in range(0, len(s), n) if len(s[i:i+n]) == n]
# out = ['agu', 'cac', 'cgu', 'cau']
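If you want this as a function like the one in the question, a minimal sketch could look like the following. It takes a slightly different route: the range stops at len(rna) - n + 1, so an incomplete final chunk is never produced and no length check is needed. The parameter n and its default value are my own additions, not part of the original signature.

```python
def codons(rna, n=3):
    """Return the complete n-character chunks of rna, dropping any short tail."""
    # Stopping at len(rna) - n + 1 guarantees every start index
    # still has n characters after it, so no filter is required.
    return [rna[i:i + n] for i in range(0, len(rna) - n + 1, n)]


print(codons('agucaccgucautc'))  # ['agu', 'cac', 'cgu', 'cau']
```

In the code from the question, the function returns during the first loop iteration with the unfiltered list, so the length % n check is never reached. That is why 'tc' was still in the result.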
