Regex to match C-style multiline comments

Time: 2022-03-05 00:51:40

I have a string, for example:

String src = "How are things today /* this is comment **/ and is your code  /** this is another comment */ working?";

I want to remove the /* this is comment **/ and /** this is another comment */ substrings from the src string.

I tried to use a regex but failed because of my lack of experience.

4 Solutions

#1


11  

Try using this regex (Single line comments only):

String src ="How are things today /* this is comment */ and is your code /* this is another comment */ working?";
String result=src.replaceAll("/\\*.*?\\*/","");//single line comments
System.out.println(result);

REGEX explained:

  • / - matches the character "/" literally
  • \* - matches the character "*" literally
  • . - matches any single character (except line terminators by default)
  • *? - repeats the preceding token between zero and unlimited times, as few times as possible, expanding as needed (lazy)
  • \* - matches the character "*" literally
  • / - matches the character "/" literally
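To see why the lazy quantifier matters, here is a minimal sketch that compares a greedy and a lazy version of the same pattern. The class and method names (LazyVsGreedy, stripGreedy, stripLazy) and the sample string are made up for illustration and are not part of the answer.

```java
public class LazyVsGreedy {
    // Greedy: ".*" runs to the LAST "*/" on the line, so the text
    // between two comments is swallowed as well.
    static String stripGreedy(String s) {
        return s.replaceAll("/\\*.*\\*/", "");
    }

    // Lazy: ".*?" stops at the FIRST "*/" after each "/*".
    static String stripLazy(String s) {
        return s.replaceAll("/\\*.*?\\*/", "");
    }

    public static void main(String[] args) {
        String src = "a /* one */ b /* two */ c";
        System.out.println(stripGreedy(src)); // prints "a  c"
        System.out.println(stripLazy(src));   // prints "a  b  c"
    }
}
```

The greedy version loses the "b" that sits between the two comments, which is why the answer uses .*? instead of .*.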

Alternatively, adding (?s) gives a regex that handles both single-line and multi-line comments:

// Note the added \n, which won't work with the previous regex
String src ="How are things today /* this\n is comment */ and is your code /* this is another comment */ working?";
String result=src.replaceAll("(?s)/\\*.*?\\*/","");
System.out.println(result);
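The effect of (?s) can be checked with a small sketch like the one below. It uses Pattern.DOTALL, which is the flag form of the inline (?s) modifier; the class name DotAllDemo, the helper strip, and the sample string are hypothetical.

```java
import java.util.regex.Pattern;

public class DotAllDemo {
    // Without DOTALL, "." does not match line terminators,
    // so a comment that spans two lines is never matched.
    static final Pattern SINGLE_LINE = Pattern.compile("/\\*.*?\\*/");

    // Pattern.DOTALL is equivalent to prefixing the regex with (?s).
    static final Pattern ANY_LINES = Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    static String strip(Pattern p, String s) {
        return p.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String src = "x /* this\n is comment */ y /* another */ z";
        // Only the one-line comment is removed here.
        System.out.println(strip(SINGLE_LINE, src));
        // Both comments are removed here.
        System.out.println(strip(ANY_LINES, src));
    }
}
```

Compiling the pattern once and reusing it is a design choice for the sketch; String.replaceAll with the (?s) prefix gives the same result.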

#2


16  

The best multiline comment regex is an unrolled version of (?s)/\*.*?\*/ that looks like

String pat = "/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/";

See the /\*[^*]*\*+(?:[^/*][^*]*\*+)*/ regex demo and the explanation at regex101.com.

In short,

  • /\* - match the comment start /*
  • [^*]*\*+ - match 0+ characters other than * followed by 1+ literal *
  • (?:[^/*][^*]*\*+)* - 0+ sequences of:
    • [^/*][^*]*\*+ - a character that is not / or * (matched with [^/*]) followed by 0+ non-asterisk characters ([^*]*) followed by 1+ asterisks (\*+)
  • / - the closing /
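The breakdown above can be tried out with a short sketch such as the following. It applies the unrolled pattern to a string containing a multi-line comment and a comment with doubled asterisks; the class name UnrolledCommentStripper and the helper strip are assumptions made for this example.

```java
import java.util.regex.Pattern;

public class UnrolledCommentStripper {
    // The unrolled pattern from the answer. No (?s) is needed because
    // negated character classes such as [^*] also match newlines.
    static final Pattern COMMENT =
            Pattern.compile("/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/");

    static String strip(String s) {
        return COMMENT.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String src = "How are things today /* this\n is comment **/"
                + " and is your code /** this is another comment */ working?";
        // prints "How are things today  and is your code  working?"
        System.out.println(strip(src));
    }
}
```

Note that a comment ending in several asterisks (**/) is consumed by \*+ before the closing /, and a lone * or / inside the comment body is handled by the repeated group.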

David's regex needs 26 steps to find the match in my example string, while mine needs just 12. With huge inputs, David's regex is likely to fail with a stack overflow or a similar problem, because the lazy dot pattern .*? is inefficient: the regex engine expands it one character at a time and retries the rest of the pattern at each position, whereas my pattern matches linear chunks of text in one go.

#3


0  

System.out.println(src.replaceAll("\\/\\*.*?\\*\\/ ?", ""));

You have to use the non-greedy quantifier ? to get the regex working. I also added a ' ?' at the end of the regex to remove one space.
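As a rough illustration of what the trailing ' ?' buys, the sketch below removes each comment together with one following space, so the surrounding words end up separated by a single space. The class name TrailingSpaceDemo and the helper strip are hypothetical, and the slashes are left unescaped because / has no special meaning in a Java regex.

```java
public class TrailingSpaceDemo {
    // " ?" optionally consumes one space right after the closing "*/".
    static String strip(String s) {
        return s.replaceAll("/\\*.*?\\*/ ?", "");
    }

    public static void main(String[] args) {
        String src = "How are things today /* this is comment */"
                + " and is your code /* this is another comment */ working?";
        // prints "How are things today and is your code working?"
        System.out.println(strip(src));
    }
}
```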

#4


0  

Try this, which worked for me:

System.out.println(src.replaceAll("(/\\*.*?\\*/)+", ""));
