How do I limit the types of HTML that a user can input into a textbox? I'm running a small forum using some custom software that I'm beta testing, but I need to know how to limit the HTML input. Any suggestions?
6 Answers
#1
2
I'd suggest a slightly different approach:
- Don't filter incoming user data (beyond preventing SQL injection). User data should be kept as pure as possible.
- Filter all outgoing data from the database. This is where things like tag stripping should happen.
Keeping user data unmodified gives you more flexibility in how it's displayed. Filtering all outgoing data is a good habit to get into (in line with the "never trust data" rule).
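As a rough illustration of the "filter on output" idea, here is a minimal sketch in PHP. It assumes the post body is stored exactly as typed and should be shown as plain text. The function name render_post_plain is made up for this example and is not part of any forum software.

```php
<?php
// Hypothetical helper (name invented for this sketch): escape a stored post
// at display time instead of altering it when it is saved.
function render_post_plain(string $storedBody): string
{
    // ENT_QUOTES escapes both quote styles; ENT_SUBSTITUTE replaces invalid
    // UTF-8 sequences instead of returning an empty string.
    $escaped = htmlspecialchars($storedBody, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // nl2br() adds <br /> before each newline so the author's line breaks show up.
    return nl2br($escaped);
}

// The database row keeps the raw text; only the rendered copy is escaped.
$stored = "Hello <script>alert(1)</script>\nsecond line";
echo render_post_plain($stored), "\n";
```

Because the stored text is untouched, you can later switch to a different renderer (an allowlist of tags, Markdown, and so on) without migrating old posts.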
#2
2
You didn't state what the forum was built with, but if it's PHP, check out:
Library Features: Whitelist, Removal, Well-formed, Nesting, Attributes, XSS safe, Standards safe
#3
0
Once the text is submitted, you could use a regex in PHP to strip any tags that aren't in your predefined set.
It would look something like the following:
find open tag (<)
if contents != allowed tag, remove tag (from <..>)
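The two steps above could be sketched roughly like this. The function name and the default allowlist are assumptions made for illustration. Regex-based filtering is easy to get subtly wrong, so treat this as a starting point rather than a vetted sanitizer. The second pass, which escapes leftover "<" characters, is an addition that the pseudocode does not mention.

```php
<?php
// Hypothetical helper; the default allowlist is only an example.
// Tag names are expected in lowercase.
function strip_disallowed_tags(string $html, array $allowed = ['p', 'b', 'i', 'em', 'strong']): string
{
    // Step 1: find anything shaped like a tag: "<", optional "/", a name, the rest up to ">".
    $tagPattern = '~<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>~';

    $cleaned = preg_replace_callback($tagPattern, function (array $m) use ($allowed): string {
        $name = strtolower($m[2]);
        // Step 2: allowed tags are rebuilt from the name alone (dropping all
        // attributes); everything else is removed, leaving the inner text.
        return in_array($name, $allowed, true) ? '<' . $m[1] . $name . '>' : '';
    }, $html);

    // Extra pass: escape any "<" that does not start one of the rebuilt tags,
    // e.g. an unclosed "<img ..." or a tag reassembled by the removal above.
    $names = implode('|', array_map('preg_quote', $allowed));
    return preg_replace('~<(?!/?(?:' . $names . ')>)~', '&lt;', $cleaned);
}

echo strip_disallowed_tags('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>'), "\n";
// prints: <p>Hi <b>there</b>alert(1)</p>
```

Note that the text inside a removed tag (alert(1) here) is kept as plain text; only the markup is dropped.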
#4
0
- Parse the input provided and strip out all HTML tags that don't exactly match the list you are allowing. This can either be a complex regex, or you can do a stateful iteration through the char[] of the input string, building the allowed output string and stripping unwanted attributes on tags like img.
- Use a different markup system (BBCode, Markdown).
- Find some code online that already does this, to use as a basis for your implementation. For example, Slashcode must do this, so look for its implementation in the Perl source and use the regexes (which I assume are there).
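To give an idea of the second option, here is a small sketch of a BBCode-style converter in PHP. It is not a full BBCode implementation: the function name is invented, and the three supported codes ([b], [i], [url=...]) are chosen only for illustration. All user input is escaped first, so the only HTML in the output is what the converter itself writes.

```php
<?php
// Hypothetical helper: supports only [b], [i] and [url=...] for illustration.
function bbcode_to_html(string $text): string
{
    // Escape first, so any raw HTML the user typed is displayed as text.
    $html = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    $rules = [
        // Only http(s) links are converted, which leaves javascript: URLs as plain text.
        '~\[url=(https?://[^\]\s]+)\](.*?)\[/url\]~is' => '<a href="$1" rel="nofollow">$2</a>',
        '~\[b\](.*?)\[/b\]~is' => '<strong>$1</strong>',
        '~\[i\](.*?)\[/i\]~is' => '<em>$1</em>',
    ];

    // preg_replace() applies each pattern in order to the result of the previous one.
    return preg_replace(array_keys($rules), array_values($rules), $html);
}

echo bbcode_to_html('[b]Hi[/b] <script>x</script> [url=https://example.com]site[/url]'), "\n";
// prints: <strong>Hi</strong> &lt;script&gt;x&lt;/script&gt; <a href="https://example.com" rel="nofollow">site</a>
```

The appeal of this design is that users never write HTML at all, so there is no list of dangerous tags or attributes to keep up to date.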
#5
0
Regardless of what you use, be sure you know what kinds of HTML content can be dangerous.
E.g. a <script> tag is pretty obvious, but a <style> tag is just as bad in IE, because it can invoke JScript commands.
In fact, any style="..." attribute can invoke script in IE.
<object> would be one more tag to be wary of.
#6
0
PHP comes with a simple function, strip_tags(), to strip HTML tags. It lets you list tags that should not be stripped.
Example #1 strip_tags() example
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
echo "\n";
// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>
The above example will output:
Test paragraph. Other text
<p>Test paragraph.</p> <a href="#fragment">Other text</a>
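One caveat with strip_tags() that is worth checking for yourself: it leaves the attributes of any allowed tag untouched, including style and event handlers such as onclick. The snippet below is a small sketch showing this, followed by one blunt fallback (no HTML at all). The variable names are made up for the example.

```php
<?php
$post = '<p style="color:red" onclick="alert(1)">Hello</p>';

// <p> is on the allowed list, so the whole tag survives, attributes included.
$kept = strip_tags($post, '<p>');
var_dump($kept === $post); // bool(true)

// Blunt fallback: remove every tag, then escape whatever text is left.
$plain = htmlspecialchars(strip_tags($post), ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
echo $plain, "\n"; // Hello
```

So if you allow any tags through strip_tags(), you still need a separate step to deal with their attributes.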
Personally, for a forum, I would use BBCode or Markdown because of the amount of support and features available, such as live preview.