When I read text from an HTML textarea, I get something like this:
wase&
;#101;m
The correct decoded value is waseem.
Notice the newline. When I decode it, I get
wase&;#101;m
The newline causes the error here. Can I fix it? I use JavaScript for the decoding.
I use this function to decode:
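function html_entity_decode(str) {
    // A detached textarea lets the browser decode the entities.
    var ta = document.createElement("textarea");
    // Escape angle brackets first so the input is not parsed as markup.
    ta.innerHTML = str.replace(/</g, "&lt;").replace(/>/g, "&gt;");
    return ta.value;
}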
2 Solutions
#1
You could run it through the following regex. Replace
&[\s\r\n]+;(?=#\d+;)
with
&
globally. Your HTML entity format is simply broken: HTML entities cannot contain whitespace or newlines, and they cannot contain semicolons in the middle either.
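As a minimal sketch of that replacement in JavaScript: the helper names `repairBrokenEntities` and `decodeNumericEntities` are made up for illustration, and the second one is only a DOM-free stand-in that handles decimal numeric entities. In a browser you would pass the repaired string to the question's `html_entity_decode` instead.

```javascript
// Hypothetical helper: turns "&<whitespace>;#NNN;" back into "&#NNN;".
function repairBrokenEntities(str) {
  // The lookahead makes sure only a numeric entity tail such as "#101;" is repaired.
  return str.replace(/&[\s\r\n]+;(?=#\d+;)/g, "&");
}

// Hypothetical stand-in decoder (assumption: decimal numeric entities only).
function decodeNumericEntities(str) {
  return str.replace(/&#(\d+);/g, function (match, code) {
    return String.fromCodePoint(Number(code));
  });
}

var broken = "wase&\n;#101;m";
var repaired = repairBrokenEntities(broken); // "wase&#101;m"
var decoded = decodeNumericEntities(repaired); // "waseem"
```

Note that the pattern requires at least one whitespace character between `&` and `;`, so a string in which the newline has already been dropped (`wase&;#101;m`) is left unchanged.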
#2
Your input text may simply be wrong, in which case the decoder is working as intended: garbage in, garbage out.
I suspect the &\n; should be something else. But if not:
str.replace(/&\s*;/g, "");