I'm trying to analyze the contents of a string. If a word has punctuation mixed into it, I want to replace the punctuation with spaces.
For example, if Johnny.Appleseed!is:a*good&farmer is entered as input, the program should report 6 words, but my code only sees 0 words. I'm not sure how to remove an incorrect character.
FYI: I'm using Python 3, and I can't import any libraries.
string = input("type something")
stringss = string.split()
for c in range(len(stringss)):
    for d in stringss[c]:
        if(stringss[c][d].isalnum != True):
            #something that removes stringss[c][d]
    total+=1
print("words: "+ str(total))
6 Answers
#1
14
Simple loop based solution:
strs = "Johnny.Appleseed!is:a*good&farmer"
lis = []
for c in strs:
    if c.isalnum() or c.isspace():
        lis.append(c)
    else:
        lis.append(' ')
new_strs = "".join(lis)
print(new_strs)          # prints: Johnny Appleseed is a good farmer
print(new_strs.split())  # prints: ['Johnny', 'Appleseed', 'is', 'a', 'good', 'farmer']
Better solution:
Using regex:
>>> import re
>>> from string import punctuation
>>> strs = "Johnny.Appleseed!is:a*good&farmer"
>>> r = re.compile('[{}]'.format(re.escape(punctuation)))
>>> new_strs = r.sub(' ',strs)
>>> len(new_strs.split())
6
#using `re.split`:
>>> strs = "Johnny.Appleseed!is:a*good&farmer"
>>> re.split(r'[^0-9A-Za-z]+',strs)
['Johnny', 'Appleseed', 'is', 'a', 'good', 'farmer']
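One caveat with re.split, shown as a rough sketch: when the input starts or ends with punctuation, the result contains empty strings, so it is probably safer to filter them out before counting. The helper name count_words_split and the second sample input are made up for illustration.

```python
import re

def count_words_split(text):
    # Hypothetical helper: split on runs of characters that are not ASCII
    # letters or digits, then drop the empty strings that appear when the
    # text starts or ends with a separator.
    parts = re.split(r'[^0-9A-Za-z]+', text)
    return len([p for p in parts if p])

print(re.split(r'[^0-9A-Za-z]+', "!Johnny.Appleseed!"))  # ['', 'Johnny', 'Appleseed', '']
print(count_words_split("!Johnny.Appleseed!"))           # 2
print(count_words_split("Johnny.Appleseed!is:a*good&farmer"))  # 6
```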
#2
10
Here's a one-line solution that doesn't require importing any libraries.
It replaces non-alphanumeric characters (like punctuation) with spaces, and then splits the string.
Inspired by "Python strings split with multiple separators"
>>> s = 'Johnny.Appleseed!is:a*good&farmer'
>>> words = ''.join(c if c.isalnum() else ' ' for c in s).split()
>>> words
['Johnny', 'Appleseed', 'is', 'a', 'good', 'farmer']
>>> len(words)
6
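For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name count_words and the extra test inputs are assumptions, not part of the original answer. Because split() is called with no argument, leading, trailing, and repeated separators do not produce empty words.

```python
def count_words(text):
    # Hypothetical helper name. Replace every non-alphanumeric character
    # with a space, then let split() collapse runs of whitespace.
    cleaned = ''.join(c if c.isalnum() else ' ' for c in text)
    return len(cleaned.split())

print(count_words("Johnny.Appleseed!is:a*good&farmer"))  # 6
print(count_words("...hello,,  world!!"))                # 2
print(count_words("!?*"))                                # 0
```

Note that an apostrophe also counts as punctuation here, so a word like don't would be counted as two words.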
#3
3
Try this: it builds word_list with re, then creates a dictionary mapping each word to its number of appearances.
import re
word_list = re.findall(r"[\w']+", string)
print({word: word_list.count(word) for word in word_list})
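Since the asker can't import anything, here is a possible import-free sketch of the same word-to-count dictionary. It swaps the regex for the isalnum-based cleanup and counts in a single pass with dict.get, instead of calling list.count once per word. The function name word_counts and the sample sentence are made up for illustration.

```python
def word_counts(text):
    # Hypothetical helper: turn punctuation into spaces, split into words,
    # then tally each word in one pass over the list.
    cleaned = ''.join(c if c.isalnum() else ' ' for c in text)
    counts = {}
    for word in cleaned.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

print(word_counts("the cat, and the hat!"))
# {'the': 2, 'cat': 1, 'and': 1, 'hat': 1}
```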
#4
1
stringss = string
for ltr in ('!', '.', ':', '*', '&'):  # add any other punctuation you expect
    stringss = stringss.replace(ltr, ' ')
print(len(stringss.split()))  # split() with no argument skips empty strings
#5
1
I know that this is an old question, but how about this?
string = "Johnny.Appleseed!is:a*good&farmer"
a = ["*", ":", ".", "!", ",", "&", " "]
new_string = ""
for i in string:
    if i not in a:
        new_string += i
    else:
        new_string += " "
print(len(new_string.split()))  # 6
#6
1
How about using Counter from collections?
import re
from collections import Counter
words = re.findall(r'\w+', string)
print(Counter(words))
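As a short usage sketch (the sample text is made up): the length of the list returned by re.findall gives the total word count the question asks for, and Counter.most_common gives the most frequent words. Keep in mind that \w also matches the underscore, so foo_bar would count as one word.

```python
import re
from collections import Counter

text = "a.b!a:c*a&b"  # made-up sample input
words = re.findall(r'\w+', text)
counts = Counter(words)

print(len(words))             # 6
print(counts.most_common(2))  # [('a', 3), ('b', 2)]
```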