How to extract the second level of a URL in Notepad++?

Date: 2022-01-15 21:19:52

Here are some URLs:

http://www.mywebsite.com/name1-name2-name3-name4-342/46547657/ca
http://www.mywebsite.com/name5-487659826/da
http://www.mywebsite.com/name6-name7-567/5677/ca
http://www.mywebsite.com/name8-name9-name10-48765766/da
http://www.mywebsite.com/name11-name12-name13-name14-name15/11117657/ca
http://www.mywebsite.com/name16-4866626/da

The output should be:

name1-name2-name3-name4-342
name5
name6-name7-567
name8-name9-name10
name11-name12-name13-name14-name15
name16

Could you give me a regex that does this, please?

2 Solutions

#1


2  

For the URLs you have provided, you could use the following to extract the wanted substrings. The \K discards everything matched so far, so only the part after the host is reported as the match.

http://[^/]+/\K\w+(?:-(?!\d{4,})\w+)*
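
To check the pattern outside Notepad++, here is a minimal Python sketch. It assumes Python's standard `re` module, which does not support `\K`, so a capturing group stands in for it. The function name `extract_segment` is made up for illustration.

```python
import re

# Capture-group equivalent of the \K pattern above.
# -(?!\d{4,}) allows a dash only when it is NOT followed by four or more digits,
# so "-342" is kept while "-487659826" ends the match.
PATTERN = re.compile(r"http://[^/]+/(\w+(?:-(?!\d{4,})\w+)*)")

def extract_segment(url):
    """Return the wanted part after the host, or None if the URL does not match."""
    m = PATTERN.search(url)
    return m.group(1) if m else None

# The six URLs from the question.
urls = [
    "http://www.mywebsite.com/name1-name2-name3-name4-342/46547657/ca",
    "http://www.mywebsite.com/name5-487659826/da",
    "http://www.mywebsite.com/name6-name7-567/5677/ca",
    "http://www.mywebsite.com/name8-name9-name10-48765766/da",
    "http://www.mywebsite.com/name11-name12-name13-name14-name15/11117657/ca",
    "http://www.mywebsite.com/name16-4866626/da",
]

results = [extract_segment(u) for u in urls]
for r in results:
    print(r)
```

Note that the cutoff of four digits is what separates "-342" and "-567" (kept) from the long numeric IDs (dropped); URLs with short trailing IDs would need a different threshold.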


#2


0  

http://.*mywebsite\.com/(\w+(?:-(?!\d{4,})\w+)*)

Options: ^ and $ match at line breaks

Match the characters “http://” literally «http://»
Match any single character that is not a line break character «.*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the characters “mywebsite” literally «mywebsite»
Match the character “.” literally «\.»
Match the characters “com/” literally «com/»
Match the regular expression below and capture its match into backreference number 1 «(\w+(?:-(?!\d{4,})\w+)*)»
   Match a single character that is a “word character” (letters, digits, etc.) «\w+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
   Match the regular expression below «(?:-(?!\d{4,})\w+)*»
      Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
      Match the character “-” literally «-»
      Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!\d{4,})»
         Match a single digit 0..9 «\d{4,}»
            Between 4 and unlimited times, as many times as possible, giving back as needed (greedy) «{4,}»
      Match a single character that is a “word character” (letters, digits, etc.) «\w+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»


Created with RegexBuddy

Matches all:

http://mywebsite.com/name6-name7-567    name6-name7-567
http://www.mywebsite.com/name1-name2-name3-name4-342    name1-name2-name3-name4-342
http://www.mywebsite.com/name5    name5
http://www.mywebsite.com/name6-name7-567    name6-name7-567
http://www.mywebsite.com/name8-name9-name10    name8-name9-name10
http://www.mywebsite.com/name11-name12-name13-name14-name15    name11-name12-name13-name14-name15
http://www.mywebsite.com/name16    name16
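
The pattern from this answer can also be run over a whole buffer at once, roughly as a search across a file would behave. This is a minimal Python sketch assuming the standard `re` module; the `example.org` line is a hypothetical URL added for illustration and is not from the question.

```python
import re

# Pattern from this answer; group 1 holds the wanted substring.
PATTERN = re.compile(r"http://.*mywebsite\.com/(\w+(?:-(?!\d{4,})\w+)*)")

# Sample buffer: URLs in the question's format, with and without "www.",
# plus one hypothetical URL on another host that should be skipped.
text = "\n".join([
    "http://www.mywebsite.com/name1-name2-name3-name4-342/46547657/ca",
    "http://mywebsite.com/name6-name7-567/5677/ca",
    "http://www.example.org/other-123456/da",
    "http://www.mywebsite.com/name16-4866626/da",
])

# With a single capturing group, findall returns the group-1 strings.
# "." does not match newlines by default, so each match stays on one line.
slugs = PATTERN.findall(text)
print(slugs)
```

Unlike the first answer's `[^/]+`, this pattern is tied to the `mywebsite.com` host, so lines for other hosts are left out of the result.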
