
Time: 2021-10-22 06:55:57

I am looking at starting a hosted CMS-like service for customers.


By its nature, it would require the customer to enter text that is served to anyone who visits their site. I am planning on using Markdown, possibly in combination with WMD (the live Markdown preview that SO uses), for the big blocks of text.


Now, should I be sanitizing their input for HTML? Given that there would only be a handful of people editing their 'CMS', all paying customers, should I be stripping out the bad HTML, or should I just let them run wild? After all, it is their 'site'.


Edit: The main reason as to why I would do it is to let them use their own javascript, and have their own css and divs and what not for the output


5 solutions



Why wouldn't you sanitize the input?


If you don't, you're inviting calamity - to either your customer or yourself or both.




Your question asks:


"Edit: The main reason as to why I would do it is to let them use their own javascript, and have their own css and divs and what not for the output".


If you allow users to supply arbitrary JavaScript, then sanitizing input is not worth the effort. The definition of Cross-Site Scripting (XSS) is basically "users can supply JavaScript and some users are bad".


Now, some websites do allow users to supply JavaScript and they mitigate the risk in one of two ways:


  1. Host each user's CMS under a different domain. Blogger and Tumblr do this (e.g. myblog.blogspot.com vs. blogger.com) to prevent one user's templates from stealing other users' cookies. You have to know what you are doing and never host any user content under the root domain.

  2. If user content is never shared between users, then it does not matter what script malicious users supply. However, CMSs are about sharing, so this probably doesn't apply here.
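To make the first point concrete, here is a rough sketch of the cookie domain-matching rule that browsers apply (a simplified reading of RFC 6265, not a full implementation). The function name `cookie_domain_matches` is made up for illustration. It shows why a session cookie scoped to the application's own domain is not sent to user content served from a separate domain, but is sent to any subdomain of the root domain.

```python
import ipaddress


def cookie_domain_matches(request_host: str, cookie_domain: str) -> bool:
    """Simplified RFC 6265 domain-match: would a cookie scoped to
    cookie_domain be sent with a request to request_host?"""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    if host == domain:
        return True
    try:
        ipaddress.ip_address(host)
        return False  # IP addresses only ever match exactly
    except ValueError:
        pass  # not an IP address, so suffix matching applies
    return host.endswith("." + domain)


# User content on a separate domain never sees the application's cookies.
print(cookie_domain_matches("myblog.blogspot.com", "blogger.com"))  # False
# User content on a subdomain of the root domain would receive them.
print(cookie_domain_matches("myblog.blogger.com", "blogger.com"))   # True
```

The second call is the situation the "never host user content under the root domain" warning is about.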

There are some blacklist filters out there that may work, but they only work today. The HTML spec and browsers change regularly, which makes such filters almost impossible to maintain. Blacklisting is a surefire way to have both security and functional problems.
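As a small illustration of why blacklisting tends to fail, the sketch below uses a deliberately naive filter (the `naive_blacklist` name and the regex are invented for this example) that strips `<script>` elements and nothing else. Two well-known script-bearing payloads pass through it unchanged.

```python
import re

# Hypothetical blacklist: remove <script>...</script> blocks and nothing else.
SCRIPT_TAG = re.compile(r"<script\b.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def naive_blacklist(html_text: str) -> str:
    return SCRIPT_TAG.sub("", html_text)


payloads = [
    "<script>alert(1)</script>",                # removed by the filter
    '<img src=x onerror="alert(1)">',           # event-handler attribute survives
    '<a href="javascript:alert(1)">click</a>',  # javascript: URL survives
]

for payload in payloads:
    print(repr(naive_blacklist(payload)))
```

Every new tag, attribute, or URL scheme that can run script would need another rule, which is the maintenance problem described above.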


When dealing with user data, always treat it as untrusted. If you don't address this early in the product and your scenarios change, it is almost impossible to go back and find all of the XSS points or modify the product to prevent XSS without upsetting your users.
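One way to apply "treat it as untrusted" from the start is to escape by default wherever user text is written into a page. The following is only a sketch using Python's standard `html.escape`; the `render_comment` helper is a hypothetical name, not part of any CMS.

```python
import html


def render_comment(untrusted_text: str) -> str:
    # Escape at output time so the text is displayed, never interpreted as markup.
    return "<p>" + html.escape(untrusted_text, quote=True) + "</p>"


print(render_comment('<script>alert("x")</script>'))
# <p>&lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;</p>
```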




You would also be protecting against disgruntled employees, cross-customer attacks, or any other sort of idiotic behavior.


You should always sanitize, no matter the users or viewers.




At least parse their entry and only allow a certain "safe" subset of HTML tags.
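A minimal sketch of that allow-list idea, using only Python's standard `html.parser`. The tag set, the `AllowListSanitizer` class, and the `sanitize` helper are assumptions chosen for illustration. The sketch drops every attribute and does not rebalance unclosed tags, so treat it as a demonstration of the approach rather than a finished sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Assumed "safe" subset; a real list would depend on what the CMS needs.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li",
                "blockquote", "code", "pre"}
VOID_TAGS = {"br"}                 # tags that take no closing tag
DROP_CONTENT = {"script", "style"}  # drop these tags and their contents


class AllowListSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # all attributes are discarded

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self.skip_depth:
                self.skip_depth -= 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(escape(data))  # text is re-escaped on output


def sanitize(untrusted_html: str) -> str:
    parser = AllowListSanitizer()
    parser.feed(untrusted_html)
    parser.close()
    return "".join(parser.out)


print(sanitize('<p onclick="x()">Hi <b>there</b></p><script>alert(1)</script>'))
# <p>Hi <b>there</b></p>
```

Anything not on the list is removed by default, so new tags and attributes introduced by browsers stay blocked until someone deliberately allows them.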




I think you should always sanitize the input. Most people use a CMS because they don't want to create their own website from scratch and they want easy access to edit their pages. These users most likely will not be trying to put in text that would get sanitized, but by protecting against it you are protecting their users.




