Why is this %2B string being urldecoded?

Time: 2022-02-09 01:59:17

[This may not be precisely a programming question, but it's a puzzle that may best be answered by programmers. I tried it first on the Pro Webmasters site, to overwhelming silence]

We have an email address verification process on our website. The site first generates an appropriate key as a string

mykey

It then encodes that key as a bunch of bytes

&$dac~ʌ����!

It then base64 encodes that bunch of bytes

JiRkYWN+yoyIhIQ==

Since this key will be passed as a querystring value in a URL that is placed in an HTML email, we need to URLEncode it first and then HTMLEncode the result, giving us the following (HTMLEncoding has no effect in this example, but I can't be bothered to rework the example)

JiRkYWN%2ByoyIhIQ%3D%3D
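
The encoding steps above can be sketched as follows. This is a minimal Python sketch and not the site's actual code (the question does not say which stack it uses). The function name `build_verify_link` and the sample key bytes `b"AB~!"` are assumptions, chosen because their base64 form contains a plus sign.

```python
import base64
import html
from urllib.parse import quote


def build_verify_link(key_bytes: bytes, base_url: str = "http://myapp/verify") -> str:
    """Hypothetical helper: base64, then URL-encode, then HTML-encode."""
    # Standard base64 may contain '+', '/' and '=' characters.
    b64 = base64.b64encode(key_bytes).decode("ascii")
    # safe="" makes quote() percent-encode '+', '/' and '=' as well.
    url_encoded = quote(b64, safe="")
    url = f"{base_url}?key={url_encoded}"
    # Escape for embedding in an HTML attribute (a no-op for this sample).
    return html.escape(url)


link = build_verify_link(b"AB~!")
print(link)
```

For the sample bytes the plus sign in the base64 text ends up as `%2B` and each `=` as `%3D`, matching the shape of the link in the question.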

This is then embedded in HTML that is sent as part of an email, something like:

click <a href="http://myapp/verify?key=JiRkYWN%2ByoyIhIQ%3D%3D">here</a>. 
Or paste <b>http://myapp/verify?key=JiRkYWN%2ByoyIhIQ%3D%3D</b> into your browser.

When the receiving user clicks on the link, the site receives the request, extracts the value of the querystring 'key' parameter, base64 decodes it, decrypts it, and does the appropriate thing in terms of the site logic.
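
The receiving side described above might look roughly like this in Python. It is a sketch under assumptions: `extract_key_bytes` is a hypothetical name, the sample URL uses the bytes `b"AB~!"` rather than the question's key, and the decryption step is left out because the question does not describe it.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def extract_key_bytes(request_url: str) -> bytes:
    """Hypothetical handler step: read 'key' from the query and base64-decode it."""
    query = urlsplit(request_url).query
    # parse_qs percent-decodes each value, so %2B becomes '+' and %3D becomes '='.
    key_text = parse_qs(query)["key"][0]
    return base64.b64decode(key_text)


key_bytes = extract_key_bytes("http://myapp/verify?key=QUJ%2BIQ%3D%3D")
print(key_bytes)
```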

However on occasion we have users who report that their clicking is ineffective. One such user forwarded us the email he had been sent, and on inspection the HTML had been transformed into (to put it in terms of the example above)

click <a href="http://myapp/verify?key=JiRkYWN+yoyIhIQ%3D%3D">here</a>
Or paste <b>http://myapp/verify?key=JiRkYWN+yoyIhIQ%3D%3D</b> into your browser.

That is, the %2B string - but none of the other percent-encoded strings - had been converted into a plus sign. (The email is definitely leaving us with the right values - I've looked at the appropriate SMTP logs.)

key=JiRkYWN%2ByoyIhIQ%3D%3D
key=JiRkYWN+yoyIhIQ%3D%3D

So I think that there are a couple of possibilities:

  1. There's something I'm doing that's stupid, that I can't see, or

  2. Some mail clients convert %2b strings to plus signs, perhaps to try to cope with the problem of people mistakenly URLEncoding plus signs

In case of 1 - what is it? In case of 2 - is there a standard, known way of dealing with this kind of scenario?

Many thanks for any help

2 Answers

#1


The problem lies at this step

on inspection the HTML had been transformed into (to put it in terms of the example above)

click <a href="http://myapp/verify?key=JiRkYWN+yoyIhIQ%3D%3D">here</a>
Or paste <b>http://myapp/verify?key=JiRkYWN+yoyIhIQ%3D%3D</b> into
your browser.

That is, the %2B string - but none of the other percent-encoded strings - had been converted into a plus sign

Your application at "the other end" must be missing an unescaping step. Regardless of whether the key contains a %2B or a +, a function like Perl's uri_unescape returns the same answer:

DB<9> use URI::Escape;
DB<10> x uri_unescape("JiRkYWN+yoyIhIQ%3D%3D")
0  'JiRkYWN+yoyIhIQ=='
DB<11> x uri_unescape("JiRkYWN%2ByoyIhIQ%3D%3D")
0  'JiRkYWN+yoyIhIQ=='
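
For comparison, a rough Python equivalent of the check above, using sample text derived from the bytes `b"AB~!"` (an assumption, not the question's key). `urllib.parse.unquote` behaves like `uri_unescape` here, while `unquote_plus`, which applies form-style decoding, does not:

```python
from urllib.parse import unquote, unquote_plus

with_plus = "QUJ+IQ%3D%3D"       # the form seen in the forwarded email
with_escape = "QUJ%2BIQ%3D%3D"   # the form originally sent

# Plain percent-decoding leaves a literal '+' alone, so both forms agree.
plain_a = unquote(with_plus)
plain_b = unquote(with_escape)

# Form-style decoding treats a literal '+' as an encoded space.
form_a = unquote_plus(with_plus)
form_b = unquote_plus(with_escape)

print(plain_a, plain_b)
print(repr(form_a), repr(form_b))
```

If the receiving framework decodes query strings form-style, the two links would yield different values even though plain unescaping treats them identically.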

Here is what should be happening; I'm only showing the steps, using Perl in a debugger. Step 54 encodes the string as base64. Step 55 shows how the base64-encoded string could be made into a URI-escaped parameter. Steps 56 and 57 are what the client end should be doing to decode.

One possible workaround is to ensure that your base64 "key" does not contain any plus signs!

  DB<53> $key="AB~"
  DB<54> x encode_base64($key)
0  'QUJ+
'
  DB<55> x uri_escape('QUJ+') 
0  'QUJ%2B'
  DB<56> x uri_unescape('QUJ%2B')
0  'QUJ+'
  DB<57> $result=decode_base64('QUJ+')
  DB<58> x $result
0  'AB~'
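
One way to get a key with no plus signs is the URL-safe base64 alphabet, which uses `-` and `_` in place of `+` and `/`. This is a Python sketch of that idea rather than something the answer spells out; the helper names `encode_key_urlsafe` and `decode_key_urlsafe` are hypothetical.

```python
import base64


def encode_key_urlsafe(key_bytes: bytes) -> str:
    """Hypothetical helper: base64 with '-' and '_' instead of '+' and '/'."""
    return base64.urlsafe_b64encode(key_bytes).decode("ascii")


def decode_key_urlsafe(token: str) -> bytes:
    """Inverse of encode_key_urlsafe."""
    return base64.urlsafe_b64decode(token)


token = encode_key_urlsafe(b"AB~!")
print(token)
print(decode_key_urlsafe(token))
```

The `=` padding still needs percent-encoding in a query string, but there is no longer a plus sign for a mail client or decoder to reinterpret.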

#2


What may be happening here is that URLDecode turns the %2b into a +, which is then interpreted as a space character in the URL. I was able to overcome a similar problem by first URL-decoding the string, then using a replace function to replace the spaces in the decoded string with + characters, and then decrypting the "fixed" string.
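
A minimal Python sketch of that repair, assuming the key is standard base64 (which never legitimately contains a space) and using the hypothetical name `recover_key_bytes`; the decryption step is omitted:

```python
import base64
from urllib.parse import parse_qs


def recover_key_bytes(query: str) -> bytes:
    """Hypothetical repair: undo the '+' -> ' ' substitution before base64 decoding."""
    # parse_qs decodes form-style, so a literal '+' in the query arrives as ' '.
    decoded = parse_qs(query)["key"][0]
    # Base64 text never contains spaces, so any space must have been a '+'.
    repaired = decoded.replace(" ", "+")
    return base64.b64decode(repaired)


from_mangled_link = recover_key_bytes("key=QUJ+IQ%3D%3D")
from_original_link = recover_key_bytes("key=QUJ%2BIQ%3D%3D")
print(from_mangled_link, from_original_link)
```

With the replacement in place, the mangled link and the original link decode to the same bytes.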
