I need help removing everything from an HTML document string, using a JavaScript regex, except the <body></body> tags and the entire string inside the body tag.
I tried this, but it doesn't work:
var str = "<html><head><title></title></head><body>my content</body></html>"
str.replace(/[^\<body\>(.+)\<\\body\>]+/g,'');
I only need the body content. The other option would be to use DOMParser:
var oParser = new DOMParser(str);
var oDOM = oParser.parseFromString(str, "text/xml");
But this throws an error parsing my string document loaded via Ajax.
Thanks in advance for your suggestions!
3 Solutions
#1
1
var str = "<html><head><title></title></head><body>my content</body></html>"
str=str.match(/<(body)>[\s\S]*?<\/\1>/gi);
//also you can try this:
//str=str.match(/<(body)>.*?<\/\1>/gis);
Debuggex Demo
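Note that with the g flag, match returns an array of all matches (or null if nothing matches) rather than a string. If only the text inside the body is wanted, a capture group without the g flag is one option. A minimal sketch, assuming the body tag may carry attributes; the bodyInner helper name and the sample input are hypothetical, not from the question:

```javascript
// Hypothetical helper: returns the text between <body ...> and </body>,
// or null when the string contains no body element.
function bodyInner(html) {
  // \b[^>]* tolerates attributes on the opening tag;
  // ([\s\S]*?) lazily captures everything, newlines included.
  var m = html.match(/<body\b[^>]*>([\s\S]*?)<\/body>/i);
  return m ? m[1] : null;
}

// Hypothetical sample input with an attribute on the body tag.
var sample = '<html><head><title></title></head><body class="x">my content</body></html>';
console.log(bodyInner(sample));
```

Without the g flag, the match result exposes capture groups, so m[1] holds only the inner content.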
#2
1
You could try this code:
> var str = "<html><head><title></title></head><body>my content</body></html>"
undefined
> str.replace(/.*?(<body>.*?<\/body>).*/g, '$1');
'<body>my content</body>'
DEMO
#3
0
You can't (or at least shouldn't) do this with replace; try match instead:
var str = "<html><head><title></title></head><body>my content</body></html>"
var m = str.match(/<body>.*<\/body>/);
console.log(m[0]); //=> "<body>my content</body>"
If you have a multiline string, change the . (which does not match \n) to [\S\s] (non-whitespace OR whitespace) or something similar.
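A minimal sketch of that change, assuming a multiline input; the extractBody helper and the html string are made up for illustration:

```javascript
// Hypothetical helper: returns the <body>...</body> substring, or null if absent.
function extractBody(html) {
  // [\s\S] matches any character, newlines included;
  // the lazy *? stops at the first closing </body>.
  var m = html.match(/<body>[\s\S]*?<\/body>/i);
  return m ? m[0] : null;
}

// Hypothetical multiline document.
var html = "<html>\n<head><title></title></head>\n<body>\nmy content\n</body>\n</html>";
console.log(extractBody(html));
```

Checking the result for null before indexing avoids a TypeError when the string has no body element.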