I have a list, and I used a regex to remove the extra spaces in its strings, which works perfectly:
newrooms = re.sub(r'\s+', " ", str(newrooms))
The original list looks like this:
[['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)'], ['8 10-11am', 'MR252 (30)'], ['9 11-12pm', 'MR252 (30)'], ['10 10-11am', 'MR252 (30)'], ['10 11-12pm', 'MR251 (22)'], ['12 10-11am', 'MR107 (63)'], ['12 11-12pm', 'MR252 (30)'], ['17 10-11am', 'MR252 (30)'], ['18 11-12pm', 'MR252 (30)'], ['19 10-11am', 'MR252 (30)'], ['19 11-12pm', 'MR265 (24)'], ['20 10-11am', 'CB203 (26)'], ['20 11-12pm', 'MR252 (30)'], ['27 10-11am', 'MR252 (30)'], ['28 11-12pm', 'MR252 (30)'], ['29 10-11am', 'MR252 (30)'], ['42 11-12pm', 'MR252 (30)'], ['42 2-4pm MA ONLY', 'MR252 (30)'], ['43 10-11am', 'MR252 (30)'], ['44 10-11am', ''], ['44 11-12pm', 'MR252 (30)']]
print newrooms[3] prints ['9 11-12pm', 'MR252 (30)']
After using re.sub to remove the spaces, the list looks like this:
[['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)'], ['8 10-11am', 'MR252 (30)'], ['9 11-12pm', 'MR252 (30)'], ['10 10-11am', 'MR252 (30)'], ['10 11-12pm', 'MR251 (22)'], ['12 10-11am', 'MR107 (63)'], ['12 11-12pm', 'MR252 (30)'], ['17 10-11am', 'MR252 (30)'], ['18 11-12pm', 'MR252 (30)'], ['19 10-11am', 'MR252 (30)'], ['19 11-12pm', 'MR265 (24)'], ['20 10-11am', 'CB203 (26)'], ['20 11-12pm', 'MR252 (30)'], ['27 10-11am', 'MR252 (30)'], ['28 11-12pm', 'MR252 (30)'], ['29 10-11am', 'MR252 (30)'], ['42 11-12pm', 'MR252 (30)'], ['42 2-4pm MA ONLY', 'MR252 (30)'], ['43 10-11am', 'MR252 (30)'], ['44 10-11am', ''], ['44 11-12pm', 'MR252 (30)']]
It is just the same (minus the extra spaces), but now:
print newrooms[3] prints 4
All the code is here:
print newrooms[3]
print newrooms
newrooms = re.sub(r'\s+', " ", str(newrooms))
print newrooms[3]
print newrooms
Why does the list no longer act like a list?
OK guys, I see: I was converting the whole list to a string with str(newrooms). What I should be doing is:
print newrooms[3]
print newrooms
for obj in newrooms:
    obj[0] = re.sub(r'\s+', " ", obj[0])
print newrooms[3]
print newrooms
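The loop above only cleans the first string of each sublist. A minimal sketch of the same in-place idea applied to every string follows, written with Python 3 print(). The sample data and its extra whitespace are made up for illustration, since the post does not show the real spacing.

```python
import re

# Made-up sample with extra whitespace (the real spacing is not shown in the post).
newrooms = [['4   11-12pm', 'MR252  (30)'], ['5 \n 10.30-12pm', 'MR252 (30)']]

for sublist in newrooms:
    for i, text in enumerate(sublist):
        # Replace each run of whitespace with one space and write it back
        # into the same sublist, so newrooms stays a list of lists.
        sublist[i] = re.sub(r'\s+', ' ', text)

print(newrooms[1])
```

Because the strings are replaced by index, no new outer list is built and newrooms[1] is still a sublist.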
5 answers
#1
4
After
newrooms = re.sub(r'\s+', " ", str(newrooms))
newrooms, formerly a list, becomes a string.
print newrooms[3]
prints the 4th character in that string. Python is dynamically typed, so a name is simply rebound to whatever object you assign to it; here that object is the string returned by re.sub.
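A short sketch of that rebinding, assuming Python 3 and a shortened version of the list; the helper variable names (before, after, char) are only for illustration:

```python
import re

newrooms = [['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)']]
before = type(newrooms).__name__   # the name refers to a list here

# str() turns the whole list into its text representation, and re.sub
# returns a str, so the name newrooms now refers to a string.
newrooms = re.sub(r'\s+', ' ', str(newrooms))
after = type(newrooms).__name__

char = newrooms[3]                 # indexing a string yields one character

print(before, after, repr(char))
```

The string starts with the characters [, [ and ', so index 3 is the digit 4 from the first entry.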
#2
1
What you want is to replace each run of whitespace with a single space in every string in a list of lists.
What you actually do is convert the whole list to a string and then run the substitution on that string.
Here's what happens (I will use a shortened version of your original list for readability):
>>> import re
>>> newrooms = [['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)']]
>>> newrooms_str = str(newrooms)
>>> newrooms_str
"[['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)']]"
>>> newrooms_str = re.sub(r'\s+', " ", newrooms_str)
>>> newrooms_str
"[['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)']]"
>>> newrooms_str[3]
'4'
As you can see, you are passing a string to re.sub, which returns a string. The fourth character of that string is '4', which is what you see when you evaluate newrooms_str[3].
To get the desired result, you need to operate on the individual strings in your list of lists:
>>> newrooms
[['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)']]
>>> newrooms = [[re.sub(r'\s+', " ", string) for string in sublist] for sublist in newrooms]
>>> newrooms
[['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)']]
>>> newrooms[1]
['5 10.30-12pm', 'MR252 (30)']
#3
1
You convert the list newrooms to a single string in this line:
newrooms = re.sub(r'\s+', " ", str(newrooms))
So it is now a single string and no longer a list. What you want is to apply the substitution to each element of each sublist:
newrooms = [
    [re.sub(r'\s+', " ", elem) for elem in sublist]
    for sublist in newrooms
]
This results in:
>>> newrooms[3]
['9 11-12pm', 'MR252 (30)']
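If the cleanup is needed in more than one place, the comprehension can be wrapped in a small function. This is only a sketch: normalize_rooms and _WHITESPACE_RUN are hypothetical names, the sample spacing is made up, and the code assumes Python 3.

```python
import re

# Hypothetical name: precompiled pattern for one or more whitespace characters.
_WHITESPACE_RUN = re.compile(r'\s+')

def normalize_rooms(rooms):
    """Return a new list of lists with each whitespace run collapsed to one space."""
    return [[_WHITESPACE_RUN.sub(' ', elem) for elem in sublist] for sublist in rooms]

# Made-up sample with extra spaces, a tab, and an empty string.
raw = [['9    11-12pm', 'MR252\t(30)'], ['44 10-11am', '']]
cleaned = normalize_rooms(raw)

print(cleaned[0])
```

The function returns a new structure and leaves its argument untouched, so the result still supports list indexing.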
#4
1
You can use str.join and str.split on each string in each sublist, rather than converting the list to a string:
l = [['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)'], ['8 10-11am', 'MR252 (30)'], ['9 11-12pm', 'MR252 (30)'], ['10 10-11am', 'MR252 (30)'], ['10 11-12pm', 'MR251 (22)'], ['12 10-11am', 'MR107 (63)'], ['12 11-12pm', 'MR252 (30)'], ['17 10-11am', 'MR252 (30)'], ['18 11-12pm', 'MR252 (30)'], ['19 10-11am', 'MR252 (30)'], ['19 11-12pm', 'MR265 (24)'], ['20 10-11am', 'CB203 (26)'], ['20 11-12pm', 'MR252 (30)'], ['27 10-11am', 'MR252 (30)'], ['28 11-12pm', 'MR252 (30)'], ['29 10-11am', 'MR252 (30)'], ['42 11-12pm', 'MR252 (30)'], ['42 2-4pm MA ONLY', 'MR252 (30)'], ['43 10-11am', 'MR252 (30)'], ['44 10-11am', ''], ['44 11-12pm', 'MR252 (30)']]
l[:] = [[" ".join(s.split()) for s in sub] for sub in l]
from pprint import pprint as pp
pp(l)
The output is still a list:
[['4 11-12pm', 'MR252 (30)'],
 ['5 10.30-12pm', 'MR252 (30)'],
 ['8 10-11am', 'MR252 (30)'],
 ['9 11-12pm', 'MR252 (30)'],
 ['10 10-11am', 'MR252 (30)'],
 ['10 11-12pm', 'MR251 (22)'],
 ['12 10-11am', 'MR107 (63)'],
 ['12 11-12pm', 'MR252 (30)'],
 ['17 10-11am', 'MR252 (30)'],
 ['18 11-12pm', 'MR252 (30)'],
 ['19 10-11am', 'MR252 (30)'],
 ['19 11-12pm', 'MR265 (24)'],
 ['20 10-11am', 'CB203 (26)'],
 ['20 11-12pm', 'MR252 (30)'],
 ['27 10-11am', 'MR252 (30)'],
 ['28 11-12pm', 'MR252 (30)'],
 ['29 10-11am', 'MR252 (30)'],
 ['42 11-12pm', 'MR252 (30)'],
 ['42 2-4pm MA ONLY', 'MR252 (30)'],
 ['43 10-11am', 'MR252 (30)'],
 ['44 10-11am', ''],
 ['44 11-12pm', 'MR252 (30)']]
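The split/join approach and the re.sub approach are not fully equivalent at the ends of a string. A small comparison, assuming Python 3 and a made-up sample string with leading and trailing whitespace:

```python
import re

# Made-up sample with leading and trailing whitespace.
s = '  42   2-4pm  MA ONLY '

via_regex = re.sub(r'\s+', ' ', s)   # collapses runs, keeps one space at each end
via_split = ' '.join(s.split())      # collapses runs and drops the ends entirely

print(repr(via_regex))
print(repr(via_split))
```

With no argument, str.split discards leading and trailing whitespace, so the join result is already stripped; the regex version needs an extra .strip() to match it.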
#5
0
It returns an unexpected result because you convert the list to a string before replacing. Try this instead:
import re
newrooms = [['4 11-12pm', 'MR252 (30)'], ['5 10.30-12pm', 'MR252 (30)'], ['8 10-11am', 'MR252 (30)'], ['9 11-12pm', 'MR252 (30)'], ['10 10-11am', 'MR252 (30)'], ['10 11-12pm', 'MR251 (22)'], ['12 10-11am', 'MR107 (63)'], ['12 11-12pm', 'MR252 (30)'], ['17 10-11am', 'MR252 (30)'], ['18 11-12pm', 'MR252 (30)'], ['19 10-11am', 'MR252 (30)'], ['19 11-12pm', 'MR265 (24)'], ['20 10-11am', 'CB203 (26)'], ['20 11-12pm', 'MR252 (30)'], ['27 10-11am', 'MR252 (30)'], ['28 11-12pm', 'MR252 (30)'], ['29 10-11am', 'MR252 (30)'], ['42 11-12pm', 'MR252 (30)'], ['42 2-4pm MA ONLY', 'MR252 (30)'], ['43 10-11am', 'MR252 (30)'], ['44 10-11am', ''], ['44 11-12pm', 'MR252 (30)']]
newrooms = [[re.sub(r'\s+', " ", room) for room in rooms] for rooms in newrooms]
print newrooms[3]
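The answers use two forms of assignment: rebinding the name (newrooms = [...]) and slice assignment (l[:] = [...]). The difference only matters when another name refers to the same list. A sketch under that assumption, in Python 3; collapse is a hypothetical helper and the spacing in the sample is made up:

```python
def collapse(rows):
    # Hypothetical helper: collapse whitespace runs in every string of a list of lists.
    return [[" ".join(s.split()) for s in sub] for sub in rows]

# Rebinding: the name points at a new list, the alias keeps the old one.
rooms = [['9   11-12pm', 'MR252  (30)']]
alias = rooms
rooms = collapse(rooms)
print(alias is rooms, alias[0][0])

# Slice assignment: the existing list object is updated in place,
# so every name referring to it sees the cleaned data.
rooms2 = [['9   11-12pm', 'MR252  (30)']]
alias2 = rooms2
rooms2[:] = collapse(rooms2)
print(alias2 is rooms2, alias2[0][0])
```

Either form leaves you with a list of lists; which one fits depends on whether other code holds a reference to the original list.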