Remove unneeded characters from a phone number string

Time: 2021-12-16 06:28:44

I am looking for a regex that grabs a phone number and removes the unneeded characters.

import re
strs = 'dsds +48 124 cat cat cat245 81243!!'
# search() returns only the first match, so it stops at the first run of allowed characters
match = re.search(r'.[ 0-9\+\-\.\_]+', strs)

if match:
    print('found', match.group())
else:
    print('did not find')

It returns only:

+48 124 

How can I return the entire number?

3 solutions

#1


4  

You want to use sub(), not search():

>>> strs = 'dsds +48 124 cat cat cat245 81243!!'
>>> re.sub(r"[^0-9+._ -]+", "", strs)
' +48 124   245 81243'

[^0-9+._ -] is a negated character class. The ^ is significant here: the expression means "match a character that is not a digit, a plus, a dot, an underscore, a space, or a dash".

The + after the closing bracket tells the regex engine to match one or more instances of the preceding token, so each run of unwanted characters is removed in one go.
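
As a minimal sketch of the substitution described above, the snippet below applies the same negated class and then collapses the runs of spaces it leaves behind. The second re.sub() call and the names kept and tidy are additions for illustration, not part of the answer.

```python
import re

strs = 'dsds +48 124 cat cat cat245 81243!!'

# Drop every run of characters outside the allowed set
# (digits, plus, dot, underscore, space, dash).
kept = re.sub(r"[^0-9+._ -]+", "", strs)

# Assumption: single spaces between groups are wanted, so collapse
# repeated spaces and trim the ends.
tidy = re.sub(r" +", " ", kept).strip()

print(repr(kept))  # ' +48 124   245 81243'
print(repr(tidy))  # '+48 124 245 81243'
```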

#2


4  

The problem with re.sub() is that you get extra spaces in your final phone number string. Here is a non-regex way that returns the phone number without any spaces:

>>> strs = 'dsds +48 124 cat cat cat245 81243!!'
>>> ''.join(x for x in strs if x.isdigit() or x == '+')
'+4812424581243'
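
For comparison, roughly the same result can be had with a regex by leaving the space out of the allowed set. This is only a sketch: normalize_phone is a hypothetical helper name, not something from the answer. One difference worth knowing is that str.isdigit() also accepts non-ASCII digit characters, while the class [0-9] keeps ASCII digits only.

```python
import re

def normalize_phone(text):
    """Keep only ASCII digits and '+' (hypothetical helper for illustration)."""
    return re.sub(r"[^0-9+]", "", text)

strs = 'dsds +48 124 cat cat cat245 81243!!'
result = normalize_phone(strs)
print(result)  # +4812424581243

# The generator-based version from the answer gives the same string here.
joined = ''.join(x for x in strs if x.isdigit() or x == '+')
```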

#3


0  

This is what I use to replace each run of non-digits with a single hyphen, and it seems to work for me:

# convert each run of non-digits to a single hyphen
fixed_phone = re.sub(r"[^\d]+", "-", raw_phone)
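
To see what this does with the string from the question, here is a short sketch. Using the question's string as raw_phone and the final strip('-') call are assumptions added for illustration. Note that the leading + is a non-digit too, so it is replaced along with everything else.

```python
import re

# Assumption: reuse the sample string from the question as the input.
raw_phone = 'dsds +48 124 cat cat cat245 81243!!'

# Each run of non-digits becomes one hyphen, including runs at both ends.
fixed_phone = re.sub(r"[^\d]+", "-", raw_phone)
print(fixed_phone)  # -48-124-245-81243-

# Optional cleanup: drop the hyphens left at the start and end.
trimmed = fixed_phone.strip('-')
print(trimmed)  # 48-124-245-81243
```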
