How to ignore characters surrounding a URL in a regex

Date: 2022-09-13 09:31:51

I have the following regex:

var URL_REGEX = /(^|[\s\n]|<br\/?>)((?:(?:https?|ftp):\/\/)?[\-A-Z0-9\u00A0-\uD7FF\uE000-\uFDCF\uFDF0-\uFFFD+\u0026\u2019@#\/%?=()~_|!:,.;]*[\-A-Z0-9+\u0026@#\/%=~()_|])/gi;

It captures the URL correctly in each of the following strings:

var someString1 = "hello http://*.com";
var someString2 = "hello www.*.com";
var someString3 = "hello *.com";
var someString4 = "hello *.com?foo=bar&foo=baz&foo-bar=baz";

But suppose I have:

var wrappedUrl = "hello (www.*.com)";

Here the URL is captured together with the parentheses, which I don't want. How do I capture only the URL?

The following is not captured at all; I get no match for the URL:

var wrappedUrl = "hello [www.*.com]";
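
The two behaviours described above can be reproduced with a short script. This is only a minimal sketch: `example.com` is a placeholder host and `captures` is a hypothetical helper, neither of which appears in the original question. It suggests why the two wrappers behave differently: `(` and `)` are listed in the URL character classes, so they are consumed as part of the match, while `[` appears neither in the prefix group nor in the character classes, so no match can start after the space.

```javascript
// The regex from the question, unchanged.
var URL_REGEX = /(^|[\s\n]|<br\/?>)((?:(?:https?|ftp):\/\/)?[\-A-Z0-9\u00A0-\uD7FF\uE000-\uFDCF\uFDF0-\uFFFD+\u0026\u2019@#\/%?=()~_|!:,.;]*[\-A-Z0-9+\u0026@#\/%=~()_|])/gi;

// Hypothetical helper: collect capture group 2 (the "URL" part) of every match.
function captures(text) {
    var found = [];
    var m;
    URL_REGEX.lastIndex = 0; // reset state, because the regex carries the g flag
    while ((m = URL_REGEX.exec(text)) !== null) {
        found.push(m[2]);
    }
    return found;
}

// Parentheses are in the character classes, so they end up inside the capture.
// The plain word "hello" is captured too, since the pattern accepts bare letters.
console.log(captures("hello (www.example.com)")); // [ 'hello', '(www.example.com)' ]

// "[" is not whitespace and not in any character class, so the URL is skipped.
console.log(captures("hello [www.example.com]")); // [ 'hello' ]
```

Running this under Node.js or in a browser console should show that the pattern is loose enough to accept any run of letters, which is part of why the wrappers are hard to exclude.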

2 answers

#1


2  

You can use:

/((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,4})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/gi

See the regex demo.

Explanation:

  • ((https?|ftp)\:\/\/)? - Scheme
  • ([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)? - Username and password
  • ([a-z0-9-.]*)\.([a-z]{2,4}) - Host name and top-level domain (the letters-only final part means a numeric IP address is not matched)
  • (\:[0-9]{2,5})? - Port number
  • (\/([a-z0-9+\$_-]\.?)+)*\/? - Path
  • (\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)? - Query string
  • (#[a-z_.-][a-z0-9+\$_.-]*)? - Anchor (fragment)
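
To make the breakdown above concrete, the numbered capture groups can be mapped to named URL parts. The following is a minimal sketch under stated assumptions: `urlParts` is a hypothetical helper, and the sample URL (host `example.com`, user `user`, password `pw`) is invented for illustration. The `g` flag is dropped here so that `exec()` always searches from the start of the string.

```javascript
// The regex from the answer, without the g flag.
var re = /((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,4})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/i;

// Hypothetical helper: give names to the numbered capture groups.
function urlParts(text) {
    var m = re.exec(text);
    if (m === null) {
        return null;
    }
    return {
        scheme: m[2],               // "http", "https" or "ftp"
        userinfo: m[3],             // includes the trailing "@"
        host: m[5] + "." + m[6],    // host name plus top-level domain
        port: m[7],                 // includes the leading ":"
        lastPathSegment: m[8],      // a repeated group keeps only its final iteration
        query: m[10],               // includes the leading "?"
        anchor: m[11]               // includes the leading "#"
    };
}

var parts = urlParts("http://user:pw@example.com:8080/docs/index.html?x=1#top");
console.log(parts.scheme);          // http
console.log(parts.userinfo);        // user:pw@
console.log(parts.host);            // example.com
console.log(parts.port);            // :8080
console.log(parts.lastPathSegment); // /index.html
console.log(parts.query);           // ?x=1
console.log(parts.anchor);          // #top
```

Note that the path group sits inside a `*` repetition, so only the last segment is available as a capture; the full path would have to be sliced out of `m[0]` if it were needed.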

See the JS demo:

// The g flag lets exec() step through every match in the input.
var re = /((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,4})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/gi;
var str = `hello http://*.com
hello www.*.com
hello *.com
hello *.com?foo=bar&foo=baz&foo-bar=baz
hello [www.*.com]
hello (www.*.com)`;

var m; // declared explicitly so the loop does not create an implicit global
while ((m = re.exec(str)) !== null) {
    // m[0] is the full text of the current match
    document.body.innerHTML += m[0] + "<br/>";
}
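
The demo above writes to `document.body`, so it only runs in a browser. As a rough sketch of the same check outside the DOM, the version below collects the matches into arrays instead. The inputs use `example.com` as a placeholder host, which is an assumption and not part of the original answer.

```javascript
// The regex from the answer, unchanged.
var re = /((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,4})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/gi;

// Placeholder inputs: example.com stands in for a real host (an assumption).
var samples = [
    "hello http://example.com",
    "hello www.example.com",
    "hello example.com",
    "hello example.com?foo=bar&foo=baz&foo-bar=baz",
    "hello [www.example.com]",
    "hello (www.example.com)"
];

// With a g-flag regex, String.prototype.match returns every full match, or null.
var results = samples.map(function (s) {
    return s.match(re) || [];
});

results.forEach(function (found, i) {
    console.log(samples[i] + " -> " + found.join(", "));
});
// The last two lines print "www.example.com" without the brackets or parentheses.
```

The wrappers drop out here because the host character class `[a-z0-9-.]` contains neither brackets nor parentheses, and the pattern is not anchored to a preceding whitespace character.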

#2


0  

I tried this regular expression:

/((http|https|ftp):?\/\/)?[a-z-A-Z]*(\.[a-z-A-Z]*)+(\?([a-z-A-Z0-9_]+=[a-z-A-Z0-9_]+(&)?)*)?/

It works in all the cases you have shown. That said, it is worth reading a RegExp reference and trying to build the expression from scratch yourself.
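
A quick way to probe this second pattern is to run it against the wrapped cases and a couple of edge cases. The sketch below is illustrative only: `firstMatch` is a hypothetical helper and all inputs (including the `example.com` and `example2.com` hosts) are made up. It suggests the pattern handles the wrappers, but also that it accepts a word followed by a full stop and stops at digits in a host name.

```javascript
// The regex from the second answer, unchanged (no g flag, so exec() is stateless).
var re2 = /((http|https|ftp):?\/\/)?[a-z-A-Z]*(\.[a-z-A-Z]*)+(\?([a-z-A-Z0-9_]+=[a-z-A-Z0-9_]+(&)?)*)?/;

// Hypothetical helper: return the first full match, or null when nothing matches.
function firstMatch(text) {
    var m = re2.exec(text);
    return m === null ? null : m[0];
}

console.log(firstMatch("hello (www.example.com)")); // www.example.com
console.log(firstMatch("hello [www.example.com]")); // www.example.com

// Edge cases: a sentence-ending period counts as a "dot part" ...
console.log(firstMatch("See you. Bye"));            // you.
// ... and digits are not in the host character class, so the match is cut short.
console.log(firstMatch("hello www.example2.com"));  // www.example
```

Whether these edge cases matter depends on the input text; for prose that contains ordinary sentences, a stricter pattern is probably needed.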
