I have a string like this:
string ='ArcelorMittal invests =E2=82=AC87m in new process that cuts emissions=20'
I want to take out =E2=82=AC and =20
But when I use,
import re
pattern = r'(=\w\w)+'
a = re.split(pattern, string)
it returns
['ArcelorMittal invests ', '=AC', '87m in new process that cuts emissions', '=20', '']
2 Solutions
#1
You may use re.findall
>>> s = 'ArcelorMittal invests =E2=82=AC87m in new process that cuts emissions=20'
>>> re.findall(r'(?:=\w{2})+', s)
['=E2=82=AC', '=20']
Use re.sub if you want to remove those characters.
>>> re.sub(r'(?:=\w{2})+', '', s)
'ArcelorMittal invests 87m in new process that cuts emissions'
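As for why the original re.split call returned '=AC' instead of '=E2=82=AC': re.split puts the text of any capturing group into the result, and a repeated capturing group only keeps its last repetition. A minimal sketch comparing the three group styles (the variable names are just for illustration):

```python
import re

s = 'ArcelorMittal invests =E2=82=AC87m in new process that cuts emissions=20'

# Capturing group: split keeps the group's text, but a repeated group
# only remembers its final repetition ('=AC', not '=E2=82=AC').
with_capture = re.split(r'(=\w\w)+', s)

# Non-capturing group: the matched separators are dropped entirely.
without_capture = re.split(r'(?:=\w\w)+', s)

# Capturing group wrapped around the whole repetition: the full run is kept.
whole_run = re.split(r'((?:=\w\w)+)', s)

print(with_capture)
print(without_capture)
print(whole_run)
```

Note that \w also matches letters outside the hex range and the underscore, so text such as '=in' would match too. If that matters for your input, a stricter pattern like (?:=[0-9A-F]{2})+ may be a safer choice.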
#2
Based on your comment, I would recommend using quopri.decodestring on the original string. There is no need to extract these characters and decode them separately.
>>> import quopri
>>> s = 'ArcelorMittal invests =E2=82=AC87m in new process that cuts emissions=20'
>>> quopri.decodestring(s)
'ArcelorMittal invests \xe2\x82\xac87m in new process that cuts emissions '
>>> print quopri.decodestring(s)
ArcelorMittal invests €87m in new process that cuts emissions
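The session above is Python 2 (note the print statement). In Python 3, quopri.decodestring returns bytes, so an explicit decode step is needed. A rough equivalent, assuming the payload is UTF-8 (which the =E2=82=AC sequence for the euro sign suggests):

```python
import quopri

s = 'ArcelorMittal invests =E2=82=AC87m in new process that cuts emissions=20'

# Pass bytes in and get bytes back; the quoted-printable escapes
# (=E2=82=AC, =20) are turned into the raw bytes they stand for.
raw = quopri.decodestring(s.encode('ascii'))

# Assumption: the original text was UTF-8 encoded.
text = raw.decode('utf-8')

print(text)
```

If the text came from an email, the actual charset is normally given in the Content-Type header, so it may be worth reading it from there rather than hard-coding 'utf-8'.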