My input string is a mixture of Unicode escape sequences and regular characters. Example:
\u0000\u0003\u0000\u0013timestamp\u0011clientId\u0015timeToLive\u0017destination\u000fheaders\tbody\u0013messageId\u0001\u0006
How can I convert this into a byte array or a Stream?
EDIT: UTF-8 encoding. To clarify, the input string contains these characters:
Char 01: U+0000
Char 02: U+0003
Char 03: U+0000
Char 04: U+0013
Char 05: t
Char 06: i
Char 07: m
Char 08: e
Char 09: s
Char 10: t
Char 11: a
Char 12: m
Char 13: p
Char 14: U+0011
...
...
2 solutions
#1
4
Okay, so you've got an arbitrary string (the fact that it contains non-printable characters is irrelevant) and you want to convert it into a byte array using UTF-8. That's easy :)
byte[] bytes = Encoding.UTF8.GetBytes(text);
Or to write to a stream, you'd normally wrap it in a StreamWriter:
// Note that due to the using statement, this will close the stream at the end
// of the block
using (var writer = new StreamWriter(stream))
{
    writer.Write(text);
}
(UTF-8 is the default encoding for StreamWriter, but you can specify it explicitly of course.)
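If you do want to spell the encoding out, and keep the stream usable after the writer is gone, something along these lines should work. This is only a sketch: the class name StreamWriterSketch and the helper WriteUtf8 are made up for illustration, and the buffer size of 1024 is an arbitrary choice. It passes a UTF8Encoding that does not emit a byte order mark, and uses the StreamWriter constructor overload with leaveOpen so that disposing the writer flushes it without closing the stream.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamWriterSketch
{
    // Hypothetical helper: writes text as UTF-8 without a byte order mark
    // and leaves the underlying stream open for the caller.
    public static void WriteUtf8(Stream stream, string text)
    {
        var utf8NoBom = new UTF8Encoding(false); // false: do not emit a BOM
        using (var writer = new StreamWriter(stream, utf8NoBom, 1024, leaveOpen: true))
        {
            writer.Write(text);
        } // Disposing the writer flushes it; the stream itself stays open.
    }

    public static void Main()
    {
        using (var stream = new MemoryStream())
        {
            WriteUtf8(stream, "\u0000\u0003\u0000\u0013timestamp");
            // 4 control characters + 9 letters, one byte each in UTF-8.
            Console.WriteLine(stream.Length); // prints 13
        }
    }
}
```

Be aware that passing Encoding.UTF8 itself to StreamWriter can put a byte order mark at the start of the stream, which is why the sketch builds its own UTF8Encoding instance.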
I'm assuming you really have a good reason to have "text" in this form though. I can't say I've ever found a use for U+0003 (END OF TEXT). If, as I4V has suggested, this data was originally in a binary stream, you should avoid handling it as text in the first place. Separate out your binary data from your text data - when you mix them, it will cause issues. (For example, if the fourth character in your string were U+00FF, it would end up as two bytes when encoded to UTF-8, which probably wouldn't be what you wanted.)
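To make the U+00FF point concrete, here is a small sketch (the class and method names are invented for the example). It encodes a four-character string ending in U+00FF with UTF-8 and, for comparison, with ISO-8859-1. The ISO-8859-1 comparison is my own addition, not part of the advice above: it only makes sense if every character really stands for one byte, and anything above U+00FF would be replaced with '?'. Keeping binary data out of strings altogether is still the better fix.

```csharp
using System;
using System.Text;

public static class EncodingSketch
{
    // UTF-8: code points U+0080 and above take more than one byte.
    public static byte[] ToUtf8(string text) => Encoding.UTF8.GetBytes(text);

    // ISO-8859-1: U+0000..U+00FF map to the single byte of the same value.
    public static byte[] ToLatin1(string text) =>
        Encoding.GetEncoding("ISO-8859-1").GetBytes(text);

    public static void Main()
    {
        // Same shape as the question's string, but the fourth char is U+00FF.
        string text = "\u0000\u0003\u0000\u00FF";

        Console.WriteLine(BitConverter.ToString(ToUtf8(text)));   // 00-03-00-C3-BF
        Console.WriteLine(BitConverter.ToString(ToLatin1(text))); // 00-03-00-FF
    }
}
```

Four characters become five bytes under UTF-8, which is exactly the kind of surprise you get when binary data is pushed through a text encoding.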
#2
0
To simplify the conversion, just do this:
var stream = new MemoryStream(Encoding.UTF8.GetBytes(str));
Or, if you want something reusable, create an extension method on string like this:
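using System.IO;
using System.Text;

public static class StringExtension
{
    // Encodes the string as UTF-8 and wraps the bytes in a stream.
    public static Stream ToStream(this string str)
        => new MemoryStream(Encoding.UTF8.GetBytes(str));

    // Or, much better, let the caller choose the encoding.
    public static Stream ToStreamWithEncoding(this string str, Encoding encoding)
        => new MemoryStream(encoding.GetBytes(str));
}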
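For what it's worth, calling the extension method might look like the sketch below. It repeats a compilable copy of ToStream so the snippet stands on its own; the Demo class and the shortened sample string are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Text;

public static class StringExtension
{
    // Same idea as the extension method in this answer: UTF-8 bytes in a MemoryStream.
    public static Stream ToStream(this string str)
        => new MemoryStream(Encoding.UTF8.GetBytes(str));
}

public static class Demo
{
    public static void Main()
    {
        // Shortened version of the question's input: 2 control chars + "timestamp".
        using (Stream stream = "\u0000\u0003timestamp".ToStream())
        {
            Console.WriteLine(stream.Length);     // prints 11
            Console.WriteLine(stream.ReadByte()); // prints 0 (the U+0000 char)
            Console.WriteLine(stream.ReadByte()); // prints 3 (the U+0003 char)
        }
    }
}
```

The returned MemoryStream is positioned at the start, so it can be read straight away or handed to whatever API expects a Stream.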