re.sub() in Python does not always work when replacing currency values in a string

Time: 2022-07-19 20:28:45

I have built a "Currency Tagger" in Python which identifies all currency expressions and replaces them with a tagged string.

For example, replace "I have $20 in my pocket"
with "I have <Currency>$20</Currency> in my pocket".

One of the tasks requires me to substitute the string identified as Currency with the tagged string. I am using re.sub() to do this.

It works perfectly for every form of string except those of the form "$4.4B" or "$4.4M".

I tried running a simple example in my Python console and found that re.sub() behaves inconsistently when the pattern contains a dollar sign.

For example,

>>> import re
>>> text = "I have #20 in my pocket"
>>> re.sub("#20", "$20", text)
'I have $20 in my pocket'
>>> text = "I have $20 in my pocket"
>>> re.sub("$20", "#20", text)
'I have $20 in my pocket'

In the second case above, replacing "$20" with "#20" does not work: the string comes back unchanged.

Any help would be greatly appreciated. This seemingly silly bug is stalling major work.

2 solutions

#1


6  

$ is a special character in regular expressions, so if you want to match it literally, use:

 re.sub(r"\$20", "#20", text)

          ^^

You have to escape it. Also use a raw string (the r prefix) so that the backslash reaches the regex engine unchanged.
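
When the currency string is only known at run time, as it would be in a tagger, escaping it by hand is not practical. The following is a minimal sketch, assuming the tagger already has the identified expression as a plain string; the function name tag_currency and the <Currency> tag format are taken from the question or made up for illustration, not from any library. It relies on re.escape() to escape "$" and "." automatically.

```python
import re

def tag_currency(text, expr):
    """Wrap every literal occurrence of expr in <Currency> tags.

    tag_currency is a hypothetical helper for illustration only.
    """
    # re.escape turns "$4.4B" into a pattern that matches it literally.
    pattern = re.escape(expr)
    # A function replacement inserts the matched text verbatim, so no
    # backslash in it is interpreted as a backreference.
    return re.sub(
        pattern,
        lambda m: "<Currency>" + m.group(0) + "</Currency>",
        text,
    )

tagged = tag_currency("Revenue was $4.4B last year", "$4.4B")
print(tagged)  # Revenue was <Currency>$4.4B</Currency> last year
```

Note that this sketch matches the expression wherever it occurs, so tagging "$20" would also tag the first three characters of "$200"; a real tagger would likely need boundary checks.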

An unescaped $ matches the end of the string, so the pattern "$20" asks for "20" after the end of the string and can never match.
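
A small experiment makes the difference visible. This is only an illustrative sketch using the string from the question; the variable names are arbitrary.

```python
import re

text = "I have $20 in my pocket"

# Unescaped: "$" asserts the end of the string, so "20" can never follow it.
no_match = re.search("$20", text)

# Escaped: r"\$" matches a literal dollar sign.
match = re.search(r"\$20", text)

# On its own, "$" matches the empty position at the end of the string,
# so substituting it appends the replacement.
appended = re.sub("$", "!", text)

print(no_match)        # None
print(match.group(0))  # $20
print(appended)        # I have $20 in my pocket!
```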

#2


0  

Unless you are using regular expressions (and you don't seem to be), there is no reason to use the "re" module.

Just use the .replace() method of strings:

text.replace("#20", "$20")
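
The same approach covers the direction that failed in the question, since str.replace() treats "$" as an ordinary character. A minimal sketch, using the question's string:

```python
text = "I have $20 in my pocket"

# str.replace does literal matching, so "$" needs no escaping.
replaced = text.replace("$20", "#20")

print(replaced)  # I have #20 in my pocket
print(text)      # I have $20 in my pocket (strings are immutable)
```

Because str.replace() returns a new string rather than modifying text in place, the result has to be assigned to a name to be kept.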
