Python: splitting a string into several substrings

Date: 2021-08-14 05:56:48

I have an RNA string, e.g.:

AUGGCCAUA

I would like to generate all substrings in the following way:

# starting from character 0
AUG, GCC, AUA
# starting from character 1
UGG, CCA
# starting from character 2
GGC, CAU

I wrote code that solves the first sub-problem:

from math import fmod

rna = 'AUGGCCAUA'
for i in range(0, len(rna)):
    if fmod(i, 3) == 0:
        print(rna[i:i+3])

I tried changing the starting position, i.e.:

 for i in range(1,len(rna)):

But it gives incorrect results:

 GCC, UA #instead of UGG, CCA

Could you please give me a hint as to where my mistake is?

4 solutions

#1


5  

The problem with your code is that you always extract substrings starting at indices divisible by 3, no matter where the loop begins. Instead, try this:

a = 'AUGGCCAUA'
def getSubStrings(RNA, position):
    # Step through the start indices in jumps of 3, beginning at position
    return [RNA[i:i+3] for i in range(position, len(RNA) - 2, 3)]

print(getSubStrings(a, 0))
print(getSubStrings(a, 1))
print(getSubStrings(a, 2))

Output

['AUG', 'GCC', 'AUA']
['UGG', 'CCA']
['GGC', 'CAU']

Explanation

range(position, len(RNA) - 2, 3) generates numbers with a common difference of 3, starting at position and stopping before len(RNA) - 2 (the stop value is exclusive). For example,

print(list(range(1, 8, 3)))

1 is the starting number, 8 is the exclusive upper bound, 3 is the common difference, and it gives

[1, 4, 7]

These are our starting indices. We then use a list comprehension to build the new list, like this:

[RNA[i:i+3] for i in range(position, len(RNA) - 2, 3)]
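
To see why changing only the start of the loop did not help, here is a small Python 3 sketch. The helper names codons_by_filter and codons_by_step are made up for illustration and do not come from the question or the answer. The modulo test looks at the absolute index, so it keeps picking indices 3 and 6 whatever the start is, while stepping the range by 3 from the offset picks the intended indices.

```python
rna = 'AUGGCCAUA'

def codons_by_filter(seq, start):
    # Mirrors the question's loop: only indices divisible by 3 pass,
    # regardless of where the loop starts.
    return [seq[i:i+3] for i in range(start, len(seq)) if i % 3 == 0]

def codons_by_step(seq, start):
    # Steps by 3 from the offset. The stop len(seq) - 2 is exclusive,
    # so every selected index still leaves room for a full triplet.
    return [seq[i:i+3] for i in range(start, len(seq) - 2, 3)]

print(codons_by_filter(rna, 1))  # ['GCC', 'AUA']
print(codons_by_step(rna, 1))    # ['UGG', 'CCA']
```

Keeping the filter approach would also work if the test were (i - start) % 3 == 0, but the stepped range avoids visiting indices that are then thrown away.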

#2


2  

Is this what you're looking for?

for i in range(len(rna)):
    if rna[i+3:]:
        print(rna[i:i+3])

outputs:

AUG
UGG
GGC
GCC
CCA
CAU
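
Note that this loop produces overlapping windows rather than the three separate groups asked for, and the check rna[i+3:] is empty at i = 6, so the final triplet AUA is skipped. If every full-length window is wanted, a possible Python 3 sketch is below; sliding_triplets is a hypothetical name used only for this example.

```python
rna = 'AUGGCCAUA'

def sliding_triplets(seq):
    # One window per start index; stopping at len(seq) - 2 (exclusive)
    # keeps every window exactly 3 characters long.
    return [seq[i:i+3] for i in range(len(seq) - 2)]

print(sliding_triplets(rna))
# ['AUG', 'UGG', 'GGC', 'GCC', 'CCA', 'CAU', 'AUA']
```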

#3


1  

I thought of this one-liner:

a = 'AUGGCCAUA'
[a[x:x+3] for x in range(len(a))][:-2]
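
On its own the one-liner returns all overlapping triplets in a single list. One way to get from there to the three groups in the question is to take every third window with an extended slice. The grouping step and the names windows and frames are additions for illustration, not part of the answer above.

```python
a = 'AUGGCCAUA'

# All overlapping windows; [:-2] drops the two short tails 'UA' and 'A'.
windows = [a[x:x+3] for x in range(len(a))][:-2]

# windows[k::3] picks the windows that start at k, k + 3, k + 6, ...
frames = [windows[k::3] for k in range(3)]

for k, frame in enumerate(frames):
    print(k, frame)
# 0 ['AUG', 'GCC', 'AUA']
# 1 ['UGG', 'CCA']
# 2 ['GGC', 'CAU']
```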

#4


1  

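def generate(seq, index):
    # seq replaces the original parameter name str, which shadowed the built-in
    for i in range(index, len(seq), 3):
        # Skip the incomplete chunk at the end, if any
        if len(seq[i:i+3]) == 3:
            print(seq[i:i+3])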

Example:

In [29]: generate(str, 1)
UGG
CCA

In [30]: generate(str, 0)
AUG
GCC
AUA
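
If the triplets are needed as values rather than printed output, the same idea can be written as a generator. This is only a sketch in Python 3, and iter_codons is a hypothetical name.

```python
def iter_codons(seq, index):
    # Yield full-length triplets starting at the given offset.
    for i in range(index, len(seq), 3):
        chunk = seq[i:i+3]
        if len(chunk) == 3:  # drop the short leftover at the end
            yield chunk

print(list(iter_codons('AUGGCCAUA', 1)))  # ['UGG', 'CCA']
```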
