Using regex in Python to replace URLs with links

Date: 2022-05-11 10:25:07

How do I convert URLs in some text to links? Back in PHP, I used this piece of code, which worked well for my purpose:

            $text = preg_replace("#(^|[\n ])(([\w]+?://[\w\#$%&~.\-;:=,?@\[\]+]*)(/[\w\#$%&~/.\-;:=,?@\[\]+]*)?)#is", "\\1<a href=\"\\2\" target=\"_blank\">\\3</a>", $text);
            $text = preg_replace("#(^|[\n ])(((www|ftp)\.[\w\#$%&~.\-;:=,?@\[\]+]*)(/[\w\#$%&~/.\-;:=,?@\[\]+]*)?)#is", "\\1<a href=\"http://\\2\" target=\"_blank\">\\3</a>", $text);

I tried this in Python but was unable to get it to work. It would be very nice if someone could translate this to Python :)

1 Solution

#1


The code below is a simple translation to Python. You should confirm that it actually does what you want. For more information, please see the Python Regular Expression HOWTO.

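import re

# URLs with an explicit scheme (http://, ftp://, ...), preceded by start of text, newline, or space.
pat1 = re.compile(r"(^|[\n ])(([\w]+?://[\w\#$%&~.\-;:=,?@\[\]+]*)(/[\w\#$%&~/.\-;:=,?@\[\]+]*)?)", re.IGNORECASE | re.DOTALL)

# Bare hosts starting with "www." or "ftp.". The "#" in the PHP version is only the regex delimiter, so it is dropped here.
pat2 = re.compile(r"(^|[\n ])(((www|ftp)\.[\w\#$%&~.\-;:=,?@\[\]+]*)(/[\w\#$%&~/.\-;:=,?@\[\]+]*)?)", re.IGNORECASE | re.DOTALL)


urlstr = 'http://www.example.com/foo/bar.html'

# \2 is the whole URL (used as href); \3 is the part before the path (used as link text).
urlstr = pat1.sub(r'\1<a href="\2" target="_blank">\3</a>', urlstr)
urlstr = pat2.sub(r'\1<a href="http://\2" target="_blank">\3</a>', urlstr)

print(urlstr)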

Here's what the output looks like at my end:

<a href="http://www.example.com/foo/bar.html" target="_blank">http://www.example.com</a>
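
To make it easier to confirm the behaviour on more than one input, the two substitutions can be wrapped in a function. This is only a minimal sketch: the names `linkify`, `SCHEME_URL` and `BARE_URL` are made up for illustration. It assumes the second pattern is meant to be used without the PHP `#` delimiter and that bare `www.`/`ftp.` hosts should get an `http://` prefix, as in the PHP original.

```python
import re

# Both patterns are taken from the PHP code in the question; the PHP "#"
# delimiters and the "is" modifiers become re.IGNORECASE | re.DOTALL.
SCHEME_URL = re.compile(
    r"(^|[\n ])(([\w]+?://[\w\#$%&~.\-;:=,?@\[\]+]*)(/[\w\#$%&~/.\-;:=,?@\[\]+]*)?)",
    re.IGNORECASE | re.DOTALL,
)
BARE_URL = re.compile(
    r"(^|[\n ])(((www|ftp)\.[\w\#$%&~.\-;:=,?@\[\]+]*)(/[\w\#$%&~/.\-;:=,?@\[\]+]*)?)",
    re.IGNORECASE | re.DOTALL,
)


def linkify(text):
    """Wrap URLs in <a> tags (hypothetical helper, not part of any library)."""
    # Group 2 is the full URL (href); group 3 is the URL without its path (link text).
    text = SCHEME_URL.sub(r'\1<a href="\2" target="_blank">\3</a>', text)
    # Bare hosts have no scheme, so one is assumed for the href.
    text = BARE_URL.sub(r'\1<a href="http://\2" target="_blank">\3</a>', text)
    return text


if __name__ == "__main__":
    for sample in ("http://www.example.com/foo/bar.html",
                   "see www.example.com/foo now",
                   "no links here"):
        print(linkify(sample))
```

Note that, as in the output above, the link text is group 3 and so omits the path. Using `\2` in place of `\3` in the replacement strings would show the full URL instead. The function does not escape HTML in the surrounding text, so that would need to be handled separately if the input is untrusted.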
