I have an entertainment website, and I am thinking of using a new method to prevent XSS attacks. I have created the following word list:
alert(, javascript, <script>,<script,vbscript,<layer>,
<layer,scriptalert,HTTP-EQUIV,mocha:,<object>,<object,
AllowScriptAccess,text/javascript,<link>, <link,<?php, <?import,
Because my site is about entertainment, I do not expect a normal (non-malicious) user to use words like these in a comment. So I have decided to remove all of the comma-separated words above from any user-submitted string. I need your advice: after doing this, do I still need a tool like HTMLPurifier?
Note: I am not using htmlspecialchars() because it would also convert the tags generated by my rich text editor (CKEditor), so the user's formatting would be lost.
5 Answers
#1
4
Using a blacklist is a bad idea because it is simple to circumvent. For example, you are checking for, and presumably removing, <script>. To circumvent this, a malicious user can enter:
<scri<script>pt>
Your code will strip out the middle <script>, leaving the outer <script> intact and saved to the page.
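The bypass can be reproduced in a few lines. This is only a minimal sketch: the function name naive_blacklist_filter and the one-entry list are assumptions standing in for whatever filter the question has in mind, not the asker's actual code.

```php
<?php
// Hypothetical single-pass blacklist filter, similar in spirit to the one
// described in the question (the function name and the list are assumptions).
function naive_blacklist_filter(string $input): string
{
    $blacklist = ['<script>'];
    // str_ireplace() removes every match it finds in one pass; it does not
    // re-scan the result for matches created by the removal itself.
    return str_ireplace($blacklist, '', $input);
}

// Removing the inner "<script>" joins "<scri" and "pt>" into a new tag.
echo naive_blacklist_filter('<scri<script>pt>alert(1)</script>'), "\n";
// prints: <script>alert(1)</script>
```

Looping until nothing changes would close this particular hole, but markup such as an img tag with an onerror handler contains none of the listed words and would still pass through untouched.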
If you need to enter HTML and your users do not, then prevent them from entering HTML. You need a separate method, accessible only to you, for entering articles that contain HTML.
#2
3
This approach misunderstands what the HTML-injection problem is, and is utterly ineffective.
There are many, many more ways to put scripting in HTML than the list above covers, and many ways to evade the filter by using escaped forms. You will never catch all potentially "harmful" constructs with this kind of naive sequence blacklisting, and if you try, you will inconvenience users who post genuine comments (e.g. by banning words that begin with on...).
The correct way to prevent HTML-injection XSS is:
- use htmlspecialchars() when outputting content that is supposed to be normal text (which is the vast majority of content);
- if you need to allow user-supplied HTML markup, whitelist the harmless tags and attributes you wish to allow, and enforce that using HTMLPurifier or a similar library.
This is a standard and well-understood part of writing a web application, and is not difficult to implement.
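For the plain-text case, escaping at output time might look like the sketch below. The helper name e() and the sample comment are assumptions made for illustration; htmlspecialchars() and its flags are standard PHP.

```php
<?php
// Hypothetical helper name; escapes text for use in HTML element content
// or inside quoted attribute values.
function e(string $text): string
{
    // ENT_QUOTES escapes both quote styles; ENT_SUBSTITUTE replaces invalid
    // byte sequences instead of returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$comment = '<script>alert("hi")</script>';
echo '<p>', e($comment), '</p>', "\n";
// prints: <p>&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;</p>
```

The stored comment stays untouched; the escaping happens only when the value is written into the page, so the same data can still be used safely in other contexts.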
#3
2
Why not just write a function that reverts the changes htmlspecialchars() made for the specific tags you want to allow, such as <b>, <i>, <a>, etc.?
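One way this idea could be sketched is shown below. The function name escape_then_restore and the tag list are assumptions. The sketch only restores tags that carry no attributes, so <a> is left out on purpose: its href would need separate validation.

```php
<?php
// Hypothetical function: escape everything, then turn back only the exact,
// attribute-free forms of a few whitelisted tags.
function escape_then_restore(string $input, array $allowedTags = ['b', 'i', 'em', 'strong']): string
{
    $escaped = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    foreach ($allowedTags as $tag) {
        // Only "<tag>" and "</tag>" are restored; a tag with attributes
        // (for example <b onclick="...">) stays escaped.
        $escaped = str_replace(
            ["&lt;{$tag}&gt;", "&lt;/{$tag}&gt;"],
            ["<{$tag}>", "</{$tag}>"],
            $escaped
        );
    }
    return $escaped;
}

echo escape_then_restore('<b>bold</b><script>alert(1)</script>'), "\n";
// prints: <b>bold</b>&lt;script&gt;alert(1)&lt;/script&gt;
```

This does not balance unclosed tags, and it will not preserve the attribute-bearing markup a rich text editor usually produces, which is where a purifier library is the better fit.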
#4
1
Hacks to circumvent your list aside, it's always better to use a whitelist than a blacklist.
In this case, you would already have a clear list of tags that you want to support, so just whitelist tags like <em>, <b>, etc., using an HTML purifier.
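With HTML Purifier, a whitelist along these lines might be configured as in the sketch below. It assumes the library was installed through Composer (package ezyang/htmlpurifier) and that vendor/autoload.php sits next to the script. The list of allowed tags is also an assumption and should mirror whatever formatting buttons are enabled in CKEditor.

```php
<?php
// Assumption: HTML Purifier installed via Composer, autoloader in ./vendor.
require_once __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();
// Whitelist: only these elements and attributes survive purification.
// The exact list is illustrative; adjust it to the editor's toolbar.
$config->set('HTML.Allowed', 'p,br,b,i,em,strong,u,ul,ol,li,a[href]');
// Restrict link targets to ordinary schemes, which rules out javascript: URLs.
$config->set('URI.AllowedSchemes', ['http' => true, 'https' => true, 'mailto' => true]);

$purifier = new HTMLPurifier($config);

// Sample input: the onclick attribute and the script element are not on the
// whitelist, so neither should appear in the purified markup.
$dirty = '<p onclick="steal()">Hi <b>there</b><script>alert(1)</script></p>';
echo $purifier->purify($dirty), "\n";
```

Running the purifier on the submitted markup before saving it, or before output, keeps the user's formatting while leaving the decision about what is harmless to a parser rather than to a word list.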
#5
0
You can try:
htmlentities()
echo htmlentities("<b>test word</b>");
output: &lt;b&gt;test word&lt;/b&gt;
strip_tags()
echo strip_tags("<b>test word</b>");
output: test word
mysql_real_escape_string() (removed in PHP 7; it escapes values for SQL queries and does not protect against XSS)
Or try a simple function:
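function clean_string($str) {
    // get_magic_quotes_gpc() was removed in PHP 8 and addslashes() does not
    // prevent XSS, so neither is used here.
    // Strip tags first, then escape the remainder; calling strip_tags() after
    // htmlspecialchars() would find no tags left to strip.
    $str = htmlspecialchars(strip_tags($str), ENT_QUOTES, 'UTF-8');
    return $str;
}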