HTML entity decoding fails when a textarea value contains a newline

When I get text from a textarea in HTML like this:

&#119;&#97;&#115;&#101;&
;#101;&#109;

The correct decoding is waseem.

Notice the newline. When I decode it, I get:

wase&;#101;m

The newline causes the error here. Can I fix it? I use JavaScript for the decoding.

I use this function for decoding:

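function html_entity_decode(str) {
  // A detached textarea lets the browser's HTML parser decode the entities.
  var ta = document.createElement("textarea");
  // Escape angle brackets so the input is treated as text, not markup.
  ta.innerHTML = str.replace(/</g, "&lt;").replace(/>/g, "&gt;");
  return ta.value;
}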

2 solutions

#1

You could pass it through the following regex. Replace

&[\s\r\n]+;(?=#\d+;)

with

&

globally. Your HTML entity format is simply broken: besides the fact that HTML entities cannot contain whitespace or newlines, they also cannot contain a semicolon in the middle.
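As a minimal sketch of that replacement in JavaScript, assuming the broken piece should collapse back into a single `&` (the helper name `repairEntities` is hypothetical, not part of the original post):

```javascript
// Hypothetical helper; the pattern is the one quoted in this answer.
function repairEntities(str) {
  // Match "&", one or more whitespace characters, then ";",
  // but only when "#digits;" follows, and restore the bare "&".
  return str.replace(/&[\s\r\n]+;(?=#\d+;)/g, "&");
}

// The textarea value from the question, with the stray newline.
const broken = "&#119;&#97;&#115;&#101;&\n;#101;&#109;";
const repaired = repairEntities(broken);
console.log(repaired); // &#119;&#97;&#115;&#101;&#101;&#109;
```

The lookahead keeps the change narrow: an `&`, whitespace and `;` that are not followed by `#digits;` are left alone. The repaired string can then be passed to the decoding function from the question.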

#2

Your input text may simply be wrong, in which case the decoder is working as intended: garbage in, garbage out.

I suspect the &\n; should be something else. But if not:


str = str.replace(/&\s*;/g, "&");
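To check the cleanup end to end outside a browser, here is a rough sketch. It assumes the goal is to recover waseem, so the broken `&\n;` is collapsed into a single `&`, and it decodes decimal numeric entities only. Both function names are hypothetical, and the small decoder is a stand-in for the textarea-based function in the question, not a full HTML entity decoder.

```javascript
// Hypothetical helper: collapse "&", optional whitespace, ";" into "&".
function cleanBrokenEntities(str) {
  return str.replace(/&\s*;/g, "&");
}

// Hypothetical stand-in decoder: handles decimal entities such as "&#101;" only.
function decodeNumericEntities(str) {
  return str.replace(/&#(\d+);/g, (match, code) =>
    String.fromCodePoint(Number(code))
  );
}

const input = "&#119;&#97;&#115;&#101;&\n;#101;&#109;";
const decoded = decodeNumericEntities(cleanBrokenEntities(input));
console.log(decoded); // waseem
```

Because `\s*` also matches zero whitespace characters, the same cleanup repairs the already-mangled output from the question, `wase&;#101;m`.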
