I'm trying to format the string below so that each row contains five words. However, I keep getting this as the output:
I love cookies yes I do Let s see a dog
First, I am not getting five words per line; instead, everything ends up on one line.
Second, why does "Let's" get split? I thought that splitting the string into "words" would only split where there is a space in between.
Suggestions?
import re

string = """I love cookies. yes I do. Let's see a dog."""
# split string
words = re.split(r'\W+', string)
words = [i for i in words if i != '']
counter = 0
output = ''
for i in words:
    if counter == 0:
        output += "{0:>15s}".format(i)
    # if counter is a multiple of 5, start a new row
    elif counter % 5 == 0:
        output += '\n'
        output += "{0:>15s}".format(i)
    else:
        output += "{0:>15s}".format(i)
    # Increase the counter by 1
    counter += 1
print(output)
2 Answers
#1
18
To start, don't name a variable "string", since it shadows the standard library module of the same name.
Second, use split() to do your word splitting:
>>> s = """I love cookies. yes I do. Let's see a dog."""
>>> s.split()
['I', 'love', 'cookies.', 'yes', 'I', 'do.', "Let's", 'see', 'a', 'dog.']
From the re module documentation:
\W Matches any character which is not a Unicode word character. This is the opposite of \w. If the ASCII flag is used this becomes the equivalent of [^a-zA-Z0-9_] (but the flag affects the entire regular expression, so in such cases using an explicit [^a-zA-Z0-9_] may be a better choice).
Since the apostrophe (') is not a word character, \W+ treats it as a separator, so the regexp splits "Let's" into two parts:
>>> words = re.split('\W+', s)
>>> words
['I', 'love', 'cookies', 'yes', 'I', 'do', 'Let', 's', 'see', 'a', 'dog', '']
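If the goal is to drop the punctuation but keep "Let's" intact, one possible alternative (not part of the original answer) is to match words instead of splitting on non-words. The character class below is an assumption chosen for illustration: it treats any run of word characters and apostrophes as a single word.

```python
import re

s = """I love cookies. yes I do. Let's see a dog."""

# Match runs of word characters and apostrophes rather than splitting on \W+.
# The class [\w'] is an illustrative choice, not something from the question.
words = re.findall(r"[\w']+", s)
print(words)
# ['I', 'love', 'cookies', 'yes', 'I', 'do', "Let's", 'see', 'a', 'dog']
```

Unlike re.split, findall does not produce the trailing empty string, so no filtering step is needed.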
This is the output I get using the split() approach above:
$ ./sp3.py
I love cookies. yes I
do. Let's see a dog.
The code can probably be simplified to this, since the counter == 0 branch and the else clause do the same thing. I threw in an enumerate as well to get rid of the counter:
#!/usr/bin/env python3
s = """I love cookies. yes I do. Let's see a dog."""
words = s.split()
output = ''
for n, i in enumerate(words):
    if n % 5 == 0:
        output += '\n'
    output += "{0:>15s}".format(i)
print(output)
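The enumerate version begins its output with a newline, because n % 5 == 0 is already true for the first word. As a rough sketch of one way to avoid that, the rows can be built separately and joined with newlines. The helper name format_rows and its parameters per_row and width are made up for this example.

```python
s = """I love cookies. yes I do. Let's see a dog."""
words = s.split()


def format_rows(words, per_row=5, width=15):
    """Right-align each word in `width` columns, `per_row` words per line."""
    rows = []
    for start in range(0, len(words), per_row):
        row = words[start:start + per_row]
        # Nested replacement field: {1} supplies the column width.
        rows.append("".join("{0:>{1}s}".format(w, width) for w in row))
    # Joining the rows puts newlines only between them, never at the start.
    return "\n".join(rows)


print(format_rows(words))
```

Calling format_rows(words, per_row=3, width=10) would give three words per row in narrower columns.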
#2
1
words = string.split()
# loop until every word has been printed
while words:
    for word in words[:5]:
        print(word, end=" ")
    print()
    words = words[5:]
That's the basic concept: split the string using the split() method.
Then use slice notation to get the first 5 words.
Then slice off the first 5 words and loop again.
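The same slicing idea can also be wrapped in a small generator, so the list is not re-sliced on every pass. This is only a sketch; the names chunks and text are made up for the example and do not come from the answer.

```python
def chunks(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


text = """I love cookies. yes I do. Let's see a dog."""

# Print five words per row, separated by single spaces.
for row in chunks(text.split(), 5):
    print(" ".join(row))
```

Stepping an index through range leaves the original list untouched, which helps if the words are needed again afterwards.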