I have a PHP script that will generate <input>s dynamically, so I was wondering if I needed to filter any characters in the name attribute.
I know that the name has to start with a letter, but I don't know any other rules. I figure square brackets must be allowed, since PHP uses these to create arrays from form data. How about parentheses? Spaces?
5 Solutions
#1
28
The only real restriction on what characters can appear in form control names applies when a form is submitted with GET:
"The "get" method restricts form data set values to ASCII characters." (reference)
There's a good thread on it here.
#2
47
Note that not all characters are submitted for name attributes of form fields (even when using POST)!
Leading and trailing white-space characters are trimmed, and inner white-space characters as well as the character . are replaced by _. (Tested in Chrome 23, Firefox 13 and Internet Explorer 9, all Win7.)
#3
37
Any character you can include in an [X]HTML file is fine to put in an <input name>. As Allain's comment says, <input name> is defined as containing CDATA, so the only things you can't put in there are the control codes and invalid codepoints that the underlying standard (SGML or XML) disallows.
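Since any character is allowed in the name, the part that actually needs care when generating the markup is escaping the attribute value. Here is a minimal sketch in Python, just to illustrate the idea; render_input is a made-up helper name, not something from the thread. In PHP the analogous call would presumably be htmlspecialchars() with ENT_QUOTES.

```python
from html import escape


def render_input(name: str, value: str = "") -> str:
    """Return an <input> tag with name and value escaped for attribute context.

    render_input is a hypothetical helper used only for illustration.
    """
    # escape(..., quote=True) replaces &, <, > and both quote characters,
    # so the name cannot break out of the surrounding double quotes.
    return '<input type="text" name="{}" value="{}">'.format(
        escape(name, quote=True), escape(value, quote=True)
    )


# Square brackets and spaces pass through untouched.
print(render_input("user[first name]"))
# Markup-significant characters are turned into entities.
print(render_input('a"b<c>&d'))
```

The name itself is not filtered at all here; only the characters that are significant to the HTML parser are encoded.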
Allain quoted W3 from the HTML4 spec:
Note. The "get" method restricts form data set values to ASCII characters. Only the "post" method (with enctype="multipart/form-data") is specified to cover the entire ISO10646 character set.
However, this isn't really true in practice.
The theory is that application/x-www-form-urlencoded data doesn't have a mechanism to specify an encoding for the form's names or values, so using non-ASCII characters in either is “not specified” as working and you should use POSTed multipart/form-data instead.
Unfortunately, in the real world, no browser specifies an encoding for fields even when it theoretically could, in the subpart headers of a multipart/form-data POST request body. (I believe Mozilla tried to implement it once, but backed out as it broke servers.)
And no browser implements the astonishingly complex and ugly RFC2231 standard that would be necessary to insert encoded non-ASCII field names into the multipart's subpart headers. In any case, the HTML spec that defines multipart/form-data doesn't directly say that RFC2231 should be used, and, again, it would break servers if you tried.
So the reality of the situation is that there is no way to know what encoding is being used for the names and values in a form submission, no matter what type of form it is. What browsers do with field names and values that contain non-ASCII characters is the same for GET and both types of POST form: they encode them using the encoding of the page that contains the form. Non-ASCII GET form names are no more broken than everything else.
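To make the encoding problem concrete, here is a small sketch using Python's urllib.parse (chosen only for illustration; the field name and value are made up). The same form field produces different bytes on the wire depending on the page encoding, and nothing in the submission says which one was used, so the receiving side has to know or guess.

```python
from urllib.parse import urlencode, parse_qsl

# Hypothetical form data with non-ASCII characters in both name and value.
fields = {"café": "crème"}

# What a browser would send from a UTF-8 page vs. an ISO-8859-1 page.
utf8_body = urlencode(fields, encoding="utf-8")
latin1_body = urlencode(fields, encoding="iso-8859-1")

print(utf8_body)    # caf%C3%A9=cr%C3%A8me
print(latin1_body)  # caf%E9=cr%E8me

# Decoding with the right encoding recovers the original name and value.
print(parse_qsl(latin1_body, encoding="iso-8859-1"))

# Decoding with the wrong guess yields replacement characters instead.
print(parse_qsl(latin1_body, encoding="utf-8"))
```

In practice this means the script receiving the form should decode with the same encoding the page was served in.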
DLH:
So name has a different data type for <input> than it does for other elements?
Actually the only element whose name attribute is not CDATA is <meta>. See the HTML4 spec's attribute list for all the different uses of name; it's an overloaded attribute name, having many different meanings on the different elements. This is generally considered a bad thing.
However, these days you would typically avoid name except on form fields (where it's a control name) and param (where it's a plugin-specific parameter identifier). That's only two meanings to grapple with. The old-school use of name for identifying elements like <form> or <a> on the page should be avoided (use id instead).
#4
4
While Allain's comment did answer OP's direct question and bobince provided some brilliant in-depth information, I believe many people come here seeking an answer to a more specific question: "Can I use a dot character in a form's input name attribute?"
As this thread came up as the first result when I searched for this, I figured I may as well share what I found.
Firstly, Matthias claimed that:
character . are replaced by _
This is untrue. I don't know if browsers actually did this kind of operation back in 2013, though I doubt it. Browsers send dot characters as they are (talking about POST data)! You can check it in the developer tools of any decent browser.
Please notice this tiny comment by abluejelly, which is probably missed by many:
I'd like to note that this is a server-specific thing, not a browser thing. Tested on Win7 FF3/3.5/31, IE5/7/8/9/10/Edge, Chrome39, and Safari Windows 5, and all of them sent " test this.stuff" (four leading spaces) as the name in POST to the ASP.NET dev server bundled with VS2012.
I checked it with Apache HTTP server (v2.4.25) and indeed an input name like "foo.bar" is changed to "foo_bar". But in a name like "foo[foo.bar]" that dot is not replaced by _!
My conclusion: you can use dots, but I wouldn't, as this may lead to unexpected behaviour depending on the HTTP server used.
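For what it's worth, the PHP manual documents that dots and spaces in incoming variable names are converted to underscores, which would explain why the rewrite shows up on a PHP setup and not in the browser. The sketch below is only an approximation in Python that mirrors the observations reported in this thread; it is not PHP's actual algorithm, and php_style_key is a made-up name.

```python
def php_style_key(name: str) -> str:
    """Approximate the server-side rewrite of a submitted field name.

    Simplified on purpose (an assumption, not PHP's real implementation):
    leading spaces are dropped, and dots and spaces are replaced with
    underscores only in the part before the first "[".
    """
    name = name.lstrip(" ")
    # Split off the array-style suffix so its contents are left untouched.
    head, bracket, rest = name.partition("[")
    head = head.replace(".", "_").replace(" ", "_")
    return head + bracket + rest


print(php_style_key("foo.bar"))           # foo_bar
print(php_style_key("foo[foo.bar]"))      # foo[foo.bar]
print(php_style_key(" test this.stuff"))  # test_this_stuff
```

If a name has to survive unchanged on every server, sticking to letters, digits, underscores and square brackets sidesteps the issue.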
#5
0
Do you mean the id and name attributes of the HTML input tag?
If so, I'd be very tempted to restrict (or convert) allowed "input" name characters to only a-z (A-Z), 0-9 and a limited range of punctuation (".", ",", etc.), if only to limit the potential for XSS exploits, etc.
Additionally, why let the user control any aspect of the input tag? (Might it not ultimately be easier from a validation perspective to keep the input tag names as 'custom_1', 'custom_2', etc. and then map these as required?)
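A rough sketch of that mapping idea, in Python for illustration only (the helper names and sample labels are made up): the user-supplied labels never reach the name attribute, and unknown names in the submission are simply ignored.

```python
def assign_field_names(labels):
    """Map generated names (custom_1, custom_2, ...) to user-supplied labels."""
    return {f"custom_{i}": label for i, label in enumerate(labels, start=1)}


def resolve_submission(name_to_label, submitted):
    """Translate submitted data back to labels, dropping names we never issued."""
    return {
        name_to_label[name]: value
        for name, value in submitted.items()
        if name in name_to_label
    }


# Hypothetical labels; they may contain any characters because they are
# never used as the name attribute.
names = assign_field_names(["First name", "e-mail (work)"])
print(names)

# "evil" was never issued by the server, so it is discarded.
print(resolve_submission(names, {"custom_2": "a@example.com", "evil": "x"}))
```

The labels still need HTML escaping wherever they are displayed, but the names sent back by the form are now fully under the script's control.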