Linkify regex function in PHP, the Daring Fireball method

Date: 2022-10-14 21:45:49

So, I know there are a ton of related questions on SO, but none of them are quite what I'm looking for. I'm trying to implement a PHP function that converts text URLs in a user-generated post into links. I'm using the 'improved' regex from Daring Fireball, found towards the bottom of this page (http://daringfireball.net/2010/07/improved_regex_for_matching_urls). The function does not return anything, and I'm not sure why.

<?php
if ( false === function_exists('linkify') ):
  function linkify($str) {
    $pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
    return preg_replace($pattern, "<a href=\"\\0\" rel=\"nofollow\" target=\"_blank\">\\0</a>", $str);
  }
endif;
?>

Can someone please help me get this to work? Thanks!

2 Answers

#1


10  

Try this:

$pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`\!()\[\]{};:\'".,<>?«»“”‘’]))';     
return preg_replace("!$pattern!i", "<a href=\"\\0\" rel=\"nofollow\" target=\"_blank\">\\0</a>", $str); 

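PHP's preg functions require delimiters around the pattern. The i after the closing delimiter makes the match case-insensitive (the inline (?xi) at the start of the pattern already does this, so the trailing i is redundant but harmless).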
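
To see why the original call fails, here is a small sketch using a shortened stand-in pattern (the short pattern and the sample text are made up for illustration, not taken from the question). Because the pattern begins with "(", PHP treats the parentheses as the delimiter pair, ends the regex at the first ")", and tries to read the rest as modifiers:

```php
<?php
// Shortened stand-in for the long URL pattern (an assumption for this sketch);
// only the delimiter handling matters here.
$pattern = '(?i)\bhttps?://\S+';
$text = 'See HTTP://example.com now';

// Without delimiters, PHP takes "(" and ")" as the delimiter pair, so the regex
// ends after "(?i" and "\b..." is parsed as modifiers. That raises an
// "Unknown modifier" warning and preg_replace() returns NULL.
// The @ only hides the warning for this demo.
$broken = @preg_replace($pattern, '[\\0]', $text);
var_dump($broken);

// Wrapped in # delimiters, the same pattern compiles and matches.
$fixed = preg_replace("#$pattern#", '[\\0]', $text);
echo $fixed, "\n";
```

This is why the function in the question appears to return nothing: echoing NULL prints an empty string.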

Update

If you use # as the delimiter, you won't need to escape the ! in the pattern, so you can use the original pattern string unchanged (the pattern does not contain a #): "#$pattern#i"

Update 2

To ensure that the links are correct, do this:

$pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
return preg_replace_callback("#$pattern#i", function($matches) {
    $input = $matches[0];
    $url = preg_match('!^https?://!i', $input) ? $input : "http://$input";
    return '<a href="' . $url . '" rel="nofollow" target="_blank">' . "$input</a>";
}, $str); 

This now prepends http:// to URLs that have no scheme, so that the browser doesn't treat them as relative links.
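
One thing the snippet above does not do is HTML-escape the matched URL or the surrounding text, and the pattern allows a double quote in the middle of a URL. The following is a minimal sketch of one way to handle that, assuming the input is raw plain text in UTF-8 and PHP 7.1 or later. The function name linkify_escaped, the added noopener value, and the u modifier are choices made for this illustration, not part of the original answer:

```php
<?php
// Hypothetical hardened variant: escapes everything it outputs.
function linkify_escaped(string $text): string
{
    $pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';

    // The u modifier makes the non-ASCII quote characters in the pattern match
    // as whole characters rather than single bytes (assumes UTF-8 input).
    // PREG_OFFSET_CAPTURE returns each match with its byte offset.
    if (!preg_match_all("#$pattern#u", $text, $matches, PREG_OFFSET_CAPTURE)) {
        // No match (0) or a regex error (false): return the escaped text only.
        return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }

    $html = '';
    $pos = 0;
    foreach ($matches[0] as [$url, $offset]) {
        // Escape the plain text between the previous match and this one.
        $html .= htmlspecialchars(substr($text, $pos, $offset - $pos), ENT_QUOTES, 'UTF-8');
        // Same scheme check as in the answer above.
        $href = preg_match('#^https?://#i', $url) ? $url : "http://$url";
        $html .= '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8')
            . '" rel="nofollow noopener" target="_blank">'
            . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '</a>';
        $pos = $offset + strlen($url);
    }

    // Escape whatever follows the last match.
    return $html . htmlspecialchars(substr($text, $pos), ENT_QUOTES, 'UTF-8');
}

echo linkify_escaped('a <b> www.example.com'), "\n";
```

Since the function escapes its whole output, it should be given unescaped text; passing in text that is already HTML-escaped would escape it twice.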

#2


2  

I wanted to just get the URLs from a string, using the same regex as in the answer above by d_inevitable. I wasn't looking to turn them into links and didn't care about the rest of the string; I only wanted the URLs within it, so this is what I did. Hope it helps.

/**
 * Returns the URLs found in a string as an array.
 * This does NOT return the string itself, only the URLs within it.
 */
function get_urls($str) {
    // Daring Fireball URL pattern; (?xi) turns on extended and case-insensitive mode.
    $regex = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
    // $matches[0] holds the full-pattern matches, one entry per URL found.
    preg_match_all("#$regex#i", $str, $matches);
    return $matches[0];
}
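
A short usage sketch for the function above, with made-up sample text. The function is repeated so the snippet runs on its own; dropping repeated URLs with array_unique is an addition for this illustration, not part of the answer:

```php
<?php
// Copy of get_urls() from the answer, repeated so this snippet is self-contained.
function get_urls($str) {
    $regex = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
    preg_match_all("#$regex#i", $str, $matches);
    return $matches[0];
}

// Sample text invented for this illustration.
$post = 'Docs at http://example.com/docs (see also www.example.org). Again: http://example.com/docs';

// array_unique() drops repeated URLs; array_values() reindexes the result from 0.
$urls = array_values(array_unique(get_urls($post)));
print_r($urls);
```

Note that the closing parenthesis and the trailing period after www.example.org are left out of the match, which is the behaviour the Daring Fireball pattern is designed for.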
