How to decode HTML entities using jQuery?

Time: 2022-12-09 21:59:10

How do I use jQuery to decode HTML entities in a string?

17 solutions

#1


415  

Security note: using this answer (preserved in its original form below) may introduce an XSS vulnerability into your application. You should not use this answer. Read lucascaro's answer for an explanation of the vulnerabilities in this answer, and use the approach from either that answer or Mark Amery's answer instead.

Actually, try

var decoded = $("<div/>").html(encodedStr).text();

#2


167  

Without any jQuery:

function decodeEntities(encodedString) {
    var textArea = document.createElement('textarea');
    textArea.innerHTML = encodedString;
    return textArea.value;
}

console.log(decodeEntities('1 &amp; 2')); // '1 & 2'

This works similarly to the accepted answer, but is safe to use with untrusted user input.

Security issues in similar approaches

As noted by Mike Samuel, doing this with a <div> instead of a <textarea> with untrusted user input is an XSS vulnerability, even if the <div> is never added to the DOM:

function decodeEntities(encodedString) {
    var div = document.createElement('div');
    div.innerHTML = encodedString;
    return div.textContent;
}

// Shows an alert
decodeEntities('<img src="nonexistent_image" onerror="alert(1337)">')

However, this attack is not possible against a <textarea> because there are no HTML elements that are permitted content of a <textarea>. Consequently, any HTML tags still present in the 'encoded' string will be automatically entity-encoded by the browser.

function decodeEntities(encodedString) {
    var textArea = document.createElement('textarea');
    textArea.innerHTML = encodedString;
    return textArea.value;
}

// Safe, and returns the correct answer
console.log(decodeEntities('<img src="nonexistent_image" onerror="alert(1337)">'))

Doing this using jQuery's .html() and .val() methods instead of using .innerHTML and .value is also insecure* for some versions of jQuery, even when using a textarea. This is because older versions of jQuery would deliberately and explicitly evaluate scripts contained in the string passed to .html(). Hence code like this shows an alert in jQuery 1.8:

// Shows alert
$('<textarea>').html('<script>alert(1337)</script>').text()

* Thanks to Eru Penkman for catching this vulnerability.

#3


77  

As Mike Samuel said, don't use jQuery's .html().text() to decode HTML entities, as it's unsafe.

Instead, use a template renderer like Mustache.js or decodeEntities from @VyvIT's comment.

The Underscore.js utility-belt library comes with escape and unescape methods, but they are not safe for user input:

_.escape(string)

_.unescape(string)

#4


28  

I think you're confusing the text and HTML methods. Look at this example: if you use an element's inner HTML as text, you'll get decoded HTML tags (second button). But if you use it as HTML, you'll get the HTML-formatted view (first button).

<div id="myDiv">
    here is a <b>HTML</b> content.
</div>
<br />
<input value="Write as HTML" type="button" onclick="javascript:$('#resultDiv').html($('#myDiv').html());" />
&nbsp;&nbsp;
<input value="Write as Text" type="button" onclick="javascript:$('#resultDiv').text($('#myDiv').html());" />
<br /><br />
<div id="resultDiv">
    Results here !
</div>

The first button writes: here is a HTML content.

The second button writes: here is a <B>HTML</B> content.

By the way, there is a plug-in I found, "jQuery plugin - HTML decode and encode", that encodes and decodes HTML strings.

#5


26  

The question is limited to 'with jQuery', but it might help some to know that the jQuery code given in the best answer here does the following underneath. This works with or without jQuery:

function decodeEntities(input) {
  var y = document.createElement('textarea');
  y.innerHTML = input;
  return y.value;
}

#6


15  

You can use the he library, available from https://github.com/mathiasbynens/he

Example:

console.log(he.decode("J&#246;rg &amp J&#xFC;rgen rocked to &amp; fro "));
// Logs "Jörg & Jürgen rocked to & fro"
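
The numeric character references in the string above (`&#246;` and `&#xFC;`) encode Unicode code points in decimal and hexadecimal. As a rough illustration of that part only, here is a minimal sketch. `decodeNumericReferences` is a hypothetical helper, not part of he; it deliberately leaves named references such as `&amp;` untouched and does no validation beyond skipping values above the Unicode range.

```javascript
// Hypothetical helper for illustration; not part of the he library.
// Decodes only numeric character references: &#NNN; (decimal) and &#xHHH; (hex).
function decodeNumericReferences(input) {
    return String(input).replace(/&#(x[0-9a-f]+|[0-9]+);/gi, function (match, body) {
        var isHex = body.charAt(0) === 'x' || body.charAt(0) === 'X';
        var codePoint = isHex ? parseInt(body.slice(1), 16) : parseInt(body, 10);
        // Leave the reference as-is if it is above the Unicode range.
        if (codePoint > 0x10FFFF) {
            return match;
        }
        return String.fromCodePoint(codePoint);
    });
}

console.log(decodeNumericReferences('J&#246;rg and J&#xFC;rgen')); // 'Jörg and Jürgen'
```

A full decoder such as he also has to handle the named references and the edge cases this sketch ignores.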

I challenged the library's author on the question of whether there was any reason to use this library in clientside code in favour of the <textarea> hack provided in other answers here and elsewhere. He provided a few possible justifications:

  • If you're using node.js serverside, using a library for HTML encoding/decoding gives you a single solution that works both clientside and serverside.

  • Some browsers' entity decoding algorithms have bugs or are missing support for some named character references. For example, Internet Explorer will both decode and render non-breaking spaces (&nbsp;) correctly but report them as ordinary spaces instead of non-breaking ones via a DOM element's innerText property, breaking the <textarea> hack (albeit only in a minor way). Additionally, IE 8 and 9 simply don't support any of the new named character references added in HTML 5. The author of he also hosts a test of named character reference support at http://mathias.html5.org/tests/html/named-character-references/. In IE 8, it reports over one thousand errors.

    If you want to be insulated from browser bugs related to entity decoding and/or be able to handle the full range of named character references, you can't get away with the <textarea> hack; you'll need a library like he.

  • He just darn well feels like doing things this way is less hacky.

#7


12  

encode:

$("<textarea/>").html('<a>').html();      // returns '&lt;a&gt;'

decode:

$("<textarea/>").html('&lt;a&gt').val()   // return '<a>'

#8


4  

Use

myString = myString.replace( /\&amp;/g, '&' );

It is easiest to do it on the server side because apparently JavaScript has no native library for handling entities, nor did I find any near the top of search results for the various frameworks that extend JavaScript.

Search for "JavaScript HTML entities", and you might find a few libraries for just that purpose, but they'll probably all be built around the above logic - replace, entity by entity.
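
The entity-by-entity replacement described above can be sketched in one pass with a lookup table. This is a minimal illustration, not a complete decoder: `decodeBasicEntities` and `BASIC_ENTITIES` are hypothetical names, and only five common named entities are covered. Doing all replacements in a single `replace` call avoids decoding the same text twice, which a chain of separate `.replace()` calls can do (for example turning `&amp;lt;` into `<` rather than `&lt;`).

```javascript
// Hypothetical lookup table covering only a handful of named entities.
var BASIC_ENTITIES = {
    amp: '&',
    lt: '<',
    gt: '>',
    quot: '"',
    apos: "'"
};

// Replaces each known entity in a single pass; unknown entities are left as-is.
function decodeBasicEntities(input) {
    return String(input).replace(/&([a-z]+);/g, function (match, name) {
        return Object.prototype.hasOwnProperty.call(BASIC_ENTITIES, name)
            ? BASIC_ENTITIES[name]
            : match;
    });
}

console.log(decodeBasicEntities('1 &amp; 2 &lt; 4'));  // '1 & 2 < 4'
console.log(decodeBasicEntities('&amp;lt;'));          // '&lt;' (decoded once only)
console.log(decodeBasicEntities('&nbsp;'));            // '&nbsp;' (not in the table)
```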

#9


1  

You have to write a custom function for HTML entities:

function htmlEntities(str) {
    return String(str).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g,'&gt;').replace(/"/g, '&quot;');
}
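
Note that this function encodes rather than decodes. A short usage sketch, repeating the function so the snippet runs on its own (the sample input is made up for illustration):

```javascript
// Same logic as the function in this answer, repeated so this snippet is self-contained.
function htmlEntities(str) {
    return String(str)
        .replace(/&/g, '&amp;')   // must come first, so the entities added below are not re-escaped
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

console.log(htmlEntities('<a href="x">1 & 2</a>'));
// '&lt;a href=&quot;x&quot;&gt;1 &amp; 2&lt;/a&gt;'
```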

#10


0  

I just had to have an HTML entity character (⇓) as the value for an HTML button. The HTML code looks good from the beginning in the browser:

<input type="button" value="Embed & Share  &dArr;" id="share_button" />

Now I was adding a toggle that should also display the character. This is my solution:

$("#share_button").toggle(
    function(){
        $("#share").slideDown();
        $(this).attr("value", "Embed & Share " + $("<div>").html("&uArr;").text());
    }
    // The second toggle handler was omitted in the original answer.
);

This displays ⇓ again in the button. I hope this might help someone.

#11


0  

Suppose you have the string below.

Our Deluxe cabins are warm, cozy &amp; comfortable

var str = $("p").text(); // get the text from the <p> tag
$('p').html(str).text();  // now decode the HTML entities in your variable, i.e. str, and assign it back to the <p> tag

That's it.

#12


0  

For ExtJS users, if you already have the encoded string, for example when the returned value of a library function is the innerHTML content, consider this ExtJS function:

Ext.util.Format.htmlDecode(innerHtmlContent)

#13


0  

Extend the String class (CoffeeScript):

String::decode = ->
  $('<textarea />').html(this).text()

and use it as a method:

"&lt;img src='myimage.jpg'&gt;".decode()

#14


0  

There is still one problem: an escaped string does not look readable when assigned to an input value:

var string = _.escape("<img src=fake onerror=alert('boo!')>");
$('input').val(string);

Example: https://jsfiddle.net/kjpdwmqa/3/

#15


-2  

The easiest way is to set a class selector on your elements and then use the following code:

$(function(){
    $('.classSelector').each(function(a, b){
        $(b).html($(b).text());
    });
});

Nothing more is needed!

I had this problem and found this clear solution, and it works fine.

#16


-3  

To decode HTML Entities with jQuery, just use this function:

function html_entity_decode(txt){
    var randomID = Math.floor((Math.random()*100000)+1);
    $('body').append('<div id="random'+randomID+'"></div>');
    $('#random'+randomID).html(txt);
    var entity_decoded = $('#random'+randomID).html();
    $('#random'+randomID).remove();
    return entity_decoded;
}

How to use:

JavaScript:

var txtEncoded = "&aacute; &eacute; &iacute; &oacute; &uacute;";
$('#some-id').val(html_entity_decode(txtEncoded));

HTML:

<input id="some-id" type="text" />

#17


-3  

I think that is the exact opposite of the solution chosen.

var decoded = $("<div/>").text(encodedStr).html();
