I have a Java string such as this:
String string = "I <strong>really</strong> want to get rid of the strong-tags!";
And I want to remove the tags. I have some other strings where the tags are way longer, so I'd like to find a way to remove everything between "<>" characters, including those characters.
One way would be to use the built-in String method that matches the string against a regex, but I have no idea how to write those.
1 solution
#1
16
Caution is advised when using regex to parse HTML (due to its allowable complexity). However, for "simple" HTML and simple text (text without a literal < or > in it), this will work:
String stripped = html.replaceAll("<.*?>", "");
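To see the one-liner applied to the string from the question, here is a minimal sketch. The class name `TagStripper` and both method names are hypothetical, chosen only for illustration. The second method uses a negated character class, `<[^>]*>`, as an alternative to the pattern in the answer: by default `.` does not match line terminators, so `<.*?>` leaves a tag that spans several lines untouched, while `[^>]` matches newlines too. The same caveat applies to both: neither handles a literal `<` or `>` in the text.

```java
public class TagStripper {

    // The pattern from the answer: '<', then as few characters as possible
    // (reluctant quantifier), then the nearest '>'. Does not cross line breaks.
    static String stripTags(String html) {
        return html.replaceAll("<.*?>", "");
    }

    // Alternative (an assumption, not from the answer): match '<', then any
    // run of characters that are not '>', then '>'. Also matches across lines.
    static String stripTagsAcrossLines(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        String string = "I <strong>really</strong> want to get rid of the strong-tags!";
        System.out.println(stripTags(string));
        // prints: I really want to get rid of the strong-tags!

        String multiLine = "before<a\nhref=\"x\">after";
        System.out.println(stripTagsAcrossLines(multiLine));
        // prints: beforeafter
    }
}
```

For anything beyond simple markup, an HTML parser is likely the safer choice than either pattern.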