Limiting HTML input in a textbox

Time: 2022-01-22 06:47:39

How do I limit the types of HTML that a user can input into a textbox? I'm running a small forum using some custom software that I'm beta testing, but I need to know how to limit the HTML input. Any suggestions?

6 answers

#1


2  

I'd suggest a slightly different approach:

  • Don't filter incoming user data (beyond preventing SQL injection). User data should be stored as close to its original form as possible.

  • Filter all outgoing data from the database. This is where things like tag stripping should happen.

Keeping the stored user data unmodified gives you more flexibility in how it's displayed. Filtering all outgoing data is also a good habit to get into (in line with the "never trust data" rule).
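
A minimal sketch of the "filter on output" idea in PHP, assuming the post body comes back from the database exactly as the user typed it. The helper name e() and the sample $stored value are hypothetical; htmlspecialchars() is the built-in that does the work.

```php
<?php
// Hypothetical helper: escape a stored value right before it is printed as HTML.
function e(string $raw): string
{
    // ENT_QUOTES also escapes single quotes, which matters inside attributes.
    return htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Stand-in for a post body loaded from the database, stored unmodified.
$stored = '<script>alert("x")</script> & more';

// The markup is shown as text instead of being interpreted by the browser.
echo '<div class="post">' . e($stored) . "</div>\n";
```

Because the stored copy is untouched, the same row could later be rendered differently (plain text, a restricted HTML subset, a feed) by swapping the output filter.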

#2


2  

You didn't state what the forum was built with, but if it's PHP, check out HTML Purifier:

http://htmlpurifier.org/

Library features: whitelist, removal, well-formed, nesting, attributes, XSS safe, standards safe.

#3


0  

Once the text is submitted, you could strip any tags that don't match your predefined set using a regex in PHP.

It would look something like the following:

find an opening tag (<)
if the tag name is not in the allowed set, remove the whole tag (from < to >)
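
The two steps above could be sketched in PHP roughly as follows. The function name strip_disallowed_tags() and the allowed list are assumptions made for illustration, not something from the original answer.

```php
<?php
// Hypothetical helper: drop every tag whose name is not in $allowed.
function strip_disallowed_tags(string $html, array $allowed): string
{
    $result = preg_replace_callback(
        // Matches an opening or closing tag and captures the tag name.
        '/<\/?([a-z][a-z0-9]*)\b[^>]*>/i',
        function (array $m) use ($allowed): string {
            // Keep the tag as written if its name is allowed, otherwise remove it.
            return in_array(strtolower($m[1]), $allowed, true) ? $m[0] : '';
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to an empty string.
    return $result ?? '';
}

echo strip_disallowed_tags('<b>hi</b><script>alert(1)</script>', ['b', 'i']), "\n";
// prints: <b>hi</b>alert(1)
```

This only removes the tags themselves. Attributes on allowed tags and the text between removed tags are left in place, and malformed markup can slip past a regex, so treat it as an illustration of the idea rather than a complete filter.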

#4


0  

  1. Parse the provided input and strip out all HTML tags that don't exactly match the list you are allowing. This can be either a complex regex, or a stateful iteration through the char[] of the input string that builds the allowed output string and strips unwanted attributes on tags like img.

  2. Use a different markup system (BBCode, Markdown).

  3. Find some code online that already does this and use it as a basis for your implementation. For example, Slashcode must perform this, so look for its implementation in Perl and use the regexes (which I assume are there).
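
To illustrate option 2, here is a small sketch of a BBCode-style renderer in PHP. The function name render_bbcode() and the two supported tags are assumptions chosen for the example; a real forum would likely use an existing BBCode or Markdown library.

```php
<?php
// Hypothetical helper: escape all HTML, then translate a fixed set of BBCode tags.
function render_bbcode(string $input): string
{
    // Escape first, so any raw HTML the user typed is displayed as plain text.
    $safe = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

    // Only these patterns ever produce HTML, so the output tag set is fixed.
    $rules = [
        '/\[b\](.*?)\[\/b\]/is' => '<strong>$1</strong>',
        '/\[i\](.*?)\[\/i\]/is' => '<em>$1</em>',
    ];

    // preg_replace() returns null on a regex error; fall back to the escaped text.
    return preg_replace(array_keys($rules), array_values($rules), $safe) ?? $safe;
}

echo render_bbcode('[b]hi[/b] <script>'), "\n";
// prints: <strong>hi</strong> &lt;script&gt;
```

The appeal of this design is that users never write HTML directly, so there is no user-supplied tag or attribute to vet.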

#5


0  

Regardless of what you use, be sure you know what kinds of HTML content can be dangerous.

E.g. a <script> tag is pretty obvious, but a <style> tag is just as bad in IE, because it can invoke JScript commands.

In fact, any style="..." attribute can invoke script in IE.

<object> would be one more tag to be wary of.

#6


0  

PHP comes with a simple function, strip_tags(), to strip HTML tags. It lets you specify certain tags that should not be stripped.

Example #1 strip_tags() example

<?php
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
echo "\n";

// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>

The above example will output:

Test paragraph. Other text
<p>Test paragraph.</p> <a href="#fragment">Other text</a>

Personally, for a forum I would use BBCode or Markdown because of the amount of support and features available, such as live preview.
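
One caveat about the allowed-tags argument of strip_tags() is worth a quick sketch: it filters by tag name only and does not touch the attributes of the tags it keeps. The sample $input below is made up for illustration.

```php
<?php
// Made-up input: an allowed tag carrying an event-handler attribute, plus a script tag.
$input = '<a href="#" onclick="alert(1)">click</a><script>alert(2)</script>';

// Only <a> is allowed, so the <script> tags are removed...
$result = strip_tags($input, '<a>');

// ...but the onclick attribute on the allowed <a> tag is still present.
echo $result, "\n";
```

So if any tags are allowed through, attributes such as onclick or style still need separate handling, for example with a dedicated sanitizer library.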
