Base64 string throwing invalid character error

I keep getting a Base64 invalid character error even though I shouldn't.

The program takes an XML file and exports it to a document. If the user wants, it will compress the file as well. The compression works fine and returns a Base64 String which is encoded into UTF-8 and written to a file.

When it's time to reload the document into the program, I have to check whether it's compressed or not. The code is simply:

byte[] gzBuffer = System.Convert.FromBase64String(text);
return "1F-8B-08" == BitConverter.ToString(new List<Byte>(gzBuffer).GetRange(4, 3).ToArray());

It checks the beginning of the decoded data to see whether it contains the GZip header bytes.

Now the thing is, all my tests work: I take a string, compress it, decompress it, and compare it to the original. The problem is when I get the string returned from an ADO Recordset. The string is exactly what was written to the file (with the addition of a "\0" at the end, but I don't think that even does anything; even with it trimmed off, it still throws). I even copied and pasted the entire string into a test method and compressed/decompressed that. Works fine.

The tests will pass but the code will fail using the exact same string? The only difference is instead of just declaring a regular string and passing it in I'm getting one returned from a recordset.

Any ideas on what I am doing wrong?

5 Solutions

#1

You say

The string is exactly what was written to the file (with the addition of a "\0" at the end, but I don't think that even does anything).

In fact, it does do something: it causes your code to throw a FormatException ("Invalid character in a Base-64 string"), because Convert.FromBase64String does not consider "\0" to be a valid Base64 character.

  byte[] data1 = Convert.FromBase64String("AAAA\0"); // Throws exception
  byte[] data2 = Convert.FromBase64String("AAAA");   // Works

Solution: Get rid of the zero termination (for example, call .TrimEnd('\0') on the string before decoding it).
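
As a rough illustration of that fix applied to the check from the question, here is a minimal sketch. The class and method names are made up for this example, and treating undecodable text as "not compressed" is an assumption about how the caller wants failures handled, not something the question states.

```csharp
using System;

public static class CompressionCheck
{
    // Hypothetical helper: returns true when the Base64 text decodes to data
    // carrying the GZip bytes 1F 8B 08 at offset 4, as in the question's check.
    public static bool IsCompressed(string text)
    {
        // Drop any trailing NUL terminators before decoding.
        string cleaned = text.TrimEnd('\0');

        byte[] gzBuffer;
        try
        {
            gzBuffer = Convert.FromBase64String(cleaned);
        }
        catch (FormatException)
        {
            // Assumption: text that is not valid Base64 is treated as uncompressed.
            return false;
        }

        // Guard the length so short inputs do not cause an out-of-range access.
        return gzBuffer.Length >= 7
            && gzBuffer[4] == 0x1F
            && gzBuffer[5] == 0x8B
            && gzBuffer[6] == 0x08;
    }
}
```

Calling CompressionCheck.IsCompressed(text) on the value read from the recordset would then tolerate the trailing "\0" instead of throwing.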

Notes:

The MSDN docs for Convert.FromBase64String say it will throw a FormatException when

The length of s, ignoring white space characters, is not zero or a multiple of 4.

-or-

The format of s is invalid. s contains a non-base 64 character, more than two padding characters, or a non-white space character among the padding characters.

and that

The base 64 digits in ascending order from zero are the uppercase characters 'A' to 'Z', lowercase characters 'a' to 'z', numerals '0' to '9', and the symbols '+' and '/'.
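
When a string "looks identical" but still throws, it can help to locate the offending character. The sketch below scans for the first character outside the alphabet quoted above; the class and method names are invented for illustration, and it only checks characters (it does not validate length or padding position).

```csharp
using System;

public static class Base64Diagnostics
{
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Returns the index of the first character that is not a Base64 digit,
    // '=' padding, or one of space/tab/CR/LF; returns -1 when none is found.
    public static int FirstInvalidIndex(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];
            bool isWhiteSpace = c == ' ' || c == '\t' || c == '\r' || c == '\n';
            if (Alphabet.IndexOf(c) >= 0 || c == '=' || isWhiteSpace)
                continue;
            return i;
        }
        return -1;
    }
}
```

For the question's case, Base64Diagnostics.FirstInvalidIndex("AAAA\0") points at index 4, the trailing "\0".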

#2

Whether a null character is allowed really depends on the Base64 codec in question. Given the vagueness of the Base64 standard (there is no single authoritative, exact specification), many implementations would just ignore it as white space, while others may flag it as a problem. And the buggiest ones wouldn't notice and would happily try decoding it... :-/

But it sounds like the C# implementation does not like it (which is one valid approach), so if removing it helps, that should be done.

One minor additional comment: UTF-8 is not a requirement; ISO-8859-x (aka Latin-x) and 7-bit ASCII would work as well. This is because Base64 was specifically designed to use only a 7-bit subset that works with all 7-bit-ASCII-compatible encodings.
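
A quick way to convince yourself of this is to encode the same Base64 text with several encodings and compare the bytes. This is only a sketch; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text;

public static class EncodingCheck
{
    // True when the text produces identical bytes in ASCII, UTF-8 and Latin-1.
    public static bool SameBytesInAllEncodings(string text)
    {
        byte[] ascii = Encoding.ASCII.GetBytes(text);
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        byte[] latin1 = Encoding.GetEncoding("iso-8859-1").GetBytes(text);
        return ascii.SequenceEqual(utf8) && ascii.SequenceEqual(latin1);
    }
}
```

Any output of Convert.ToBase64String should pass this check, whereas text containing a non-ASCII character such as "é" should not.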

#3

If removing \0 from the end of the string is impossible, you can append your own marker character to each string you encode, and remove it (along with anything after it) on decode.
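
One possible reading of this suggestion, sketched under assumptions: the '.' marker and the class name are choices made for this example (any character outside the Base64 alphabet would do), and the decoder simply discards everything from the marker onward.

```csharp
using System;

public static class MarkedBase64
{
    // Hypothetical end marker; '.' is not part of the Base64 alphabet.
    private const char EndMarker = '.';

    public static string Encode(byte[] data) =>
        Convert.ToBase64String(data) + EndMarker;

    public static byte[] Decode(string stored)
    {
        int end = stored.IndexOf(EndMarker);
        if (end < 0)
            throw new FormatException("End marker not found.");

        // Everything from the marker onward (such as a trailing "\0") is ignored.
        return Convert.FromBase64String(stored.Substring(0, end));
    }
}
```

With this scheme, MarkedBase64.Decode(MarkedBase64.Encode(data) + "\0") returns the original bytes even though a terminator was appended in storage.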

#4

One gotcha when converting Base64 from a string is that some conversion functions expect the leading "data:image/jpg;base64," prefix, while others accept only the actual data.
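
Convert.FromBase64String is in the second group, so a data URI prefix has to be removed first. A minimal sketch, with a made-up helper name, that passes plain Base64 through unchanged:

```csharp
using System;

public static class DataUri
{
    // Returns the Base64 payload of a "data:...;base64,..." URI, or the
    // input unchanged when no such prefix is present.
    public static string StripPrefix(string input)
    {
        const string marker = ";base64,";
        if (input.StartsWith("data:", StringComparison.OrdinalIgnoreCase))
        {
            int index = input.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
            if (index >= 0)
                return input.Substring(index + marker.Length);
        }
        return input;
    }
}
```

Usage would look like Convert.FromBase64String(DataUri.StripPrefix(value)).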

#5

string stringToDecrypt = HttpContext.Current.Request.QueryString.ToString();

// Change to:
// string stringToDecrypt = HttpUtility.UrlDecode(HttpContext.Current.Request.QueryString.ToString());
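
The reason this matters is that '+', '/' and '=' are percent-encoded in a query string, and '%' is not a Base64 character. The sketch below shows the round trip using Uri.EscapeDataString and Uri.UnescapeDataString rather than HttpUtility, purely so it runs without a web context; the class and method names are invented for this example.

```csharp
using System;

public static class QueryStringBase64
{
    // Percent-encodes '+', '/' and '=' so the value survives a query string.
    public static string ToQueryValue(byte[] data) =>
        Uri.EscapeDataString(Convert.ToBase64String(data));

    // Reverses ToQueryValue: undo the percent-encoding, then decode.
    public static byte[] FromQueryValue(string value) =>
        Convert.FromBase64String(Uri.UnescapeDataString(value));
}
```

Passing the still-escaped value straight to Convert.FromBase64String would throw a FormatException, which matches the symptom in the question's title.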
