Regex to remove everything from a string except a specific sequence of characters

I need help using a JavaScript regex to remove everything from an HTML document string except the <body></body> tags and everything inside them.

I tried this, but it doesn't work:

var str = "<html><head><title></title></head><body>my content</body></html>"
str.replace(/[^\<body\>(.+)\<\\body\>]+/g,'');

I only need the body content. Another option would be to use DOMParser:

var oParser = new DOMParser(str);
var oDOM = oParser.parseFromString(str, "text/xml");

But this throws an error when parsing my document string, which is loaded via Ajax.
Thanks in advance for your suggestions!

3 answers

#1

var str = "<html><head><title></title></head><body>my content</body></html>"

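// Note: with the g flag, match() returns an array of all matches (or null if there are none)
str=str.match(/<(body)>[\s\S]*?<\/\1>/gi);

// Alternatively, where the s (dotAll) flag is supported:
//str=str.match(/<(body)>.*?<\/\1>/gis);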
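
If a plain string is wanted rather than the array that match() returns with the g flag, something like the following might work. This is only a sketch: the name extractBody is made up for illustration, and the [^>]* part (which tolerates attributes on the opening tag) is an assumption that is not in the answer above. It will still break on unusual markup, for example an attribute value that contains ">".

```javascript
// Hypothetical helper: returns the <body>...</body> substring, or null if absent.
function extractBody(html) {
  // No g flag, so match() returns a single match result (or null).
  // [^>]* allows attributes on the opening tag; [\s\S]*? lazily spans line breaks.
  var m = html.match(/<body\b[^>]*>[\s\S]*?<\/body>/i);
  return m ? m[0] : null;
}

var str = "<html><head><title></title></head><body>my content</body></html>";
console.log(extractBody(str)); //=> "<body>my content</body>"
console.log(extractBody("<p>no body here</p>")); //=> null
```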

#2

You could try this code:

> var str = "<html><head><title></title></head><body>my content</body></html>"
undefined
> str.replace(/.*?(<body>.*?<\/body>).*/g, '$1');
'<body>my content</body>'

DEMO
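
Since the question asks for the body content only, a variation on the replace approach could capture just the text between the tags. This is a rough sketch, not part of the original answer: the names bodyPattern, inner and unchanged are made up for illustration, and [\s\S] is swapped in for . so that line breaks are handled too.

```javascript
var str = "<html><head><title></title></head><body>my content</body></html>";

// The capture group holds only the text between the tags;
// [\s\S] is used instead of . so line breaks are matched as well.
var bodyPattern = /^[\s\S]*?<body\b[^>]*>([\s\S]*?)<\/body>[\s\S]*$/i;
var inner = str.replace(bodyPattern, "$1");
console.log(inner); //=> "my content"

// Caveat: if the pattern does not match, replace() returns the input unchanged.
var unchanged = "<p>no body</p>".replace(bodyPattern, "$1");
console.log(unchanged); //=> "<p>no body</p>"
```

Because replace() silently returns the original string when nothing matches, match() is probably the safer choice if the input might not contain a body tag.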

#3

You can't (or at least shouldn't) do this with replace; try match instead:

var str = "<html><head><title></title></head><body>my content</body></html>"
var m = str.match(/<body>.*<\/body>/);
console.log(m[0]); //=> "<body>my content</body>"

If the string spans multiple lines, replace the . (which does not match \n) with [\S\s] (non-whitespace OR whitespace, i.e. any character) or something similar.
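
To illustrate the difference, here is a small sketch with a made-up multiline input (the variable names multiline, dotMatch and anyMatch are just for this example):

```javascript
var multiline = "<html>\n<body>\nline one\nline two\n</body>\n</html>";

// . does not match \n, so this pattern finds nothing in a multiline string.
var dotMatch = multiline.match(/<body>.*<\/body>/);
console.log(dotMatch); //=> null

// [\S\s] matches any character, including line breaks.
var anyMatch = multiline.match(/<body>[\S\s]*<\/body>/);
console.log(JSON.stringify(anyMatch[0])); //=> "<body>\nline one\nline two\n</body>"
```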
