Converting a char array to a byte array and back again

Time: 2022-04-22 08:58:52

I'm looking to convert a Java char array to a byte array without creating an intermediate String, as the char array contains a password. I've looked up a couple of methods, but they all seem to fail:

char[] password = "password".toCharArray();

byte[] passwordBytes1 = new byte[password.length*2];
ByteBuffer.wrap(passwordBytes1).asCharBuffer().put(password);

byte[] passwordBytes2 = new byte[password.length*2];
for(int i=0; i<password.length; i++) {
    passwordBytes2[2*i] = (byte) ((password[i]&0xFF00)>>8); 
    passwordBytes2[2*i+1] = (byte) (password[i]&0x00FF); 
}

String passwordAsString = new String(password);
String passwordBytes1AsString = new String(passwordBytes1);
String passwordBytes2AsString = new String(passwordBytes2);

System.out.println(passwordAsString);
System.out.println(passwordBytes1AsString);
System.out.println(passwordBytes2AsString);
assertTrue(passwordAsString.equals(passwordBytes1) || passwordAsString.equals(passwordBytes2));

The assertion always fails (and, critically, when the code is used in production, the password is rejected), yet the print statements print out password three times. Why are passwordBytes1AsString and passwordBytes2AsString different from passwordAsString, yet appear identical? Am I missing out a null terminator or something? What can I do to make the conversion and unconversion work?

8 answers

#1


12  

The problem is your use of the String(byte[]) constructor, which uses the platform default encoding. That's almost never what you should be doing. If you pass in "UTF-16" as the character encoding, your tests will probably pass. Currently I suspect that passwordBytes1AsString and passwordBytes2AsString are each 16 characters long, with every other character being U+0000.
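To make the suspicion above concrete, here is a minimal sketch that encodes "password" the same way the question does and then decodes the bytes twice: once with a single-byte charset (standing in for a typical platform default) and once with the matching UTF-16BE charset. The class name DecodeDemo and the helper toUtf16Be are illustrative names, not part of the original code. Note also that the assertion in the question compares a String with a byte[], which String.equals always reports as false.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class DecodeDemo {
    // Writes each char as two bytes, high byte first, which is what
    // ByteBuffer.asCharBuffer() does with the default (big-endian) byte order.
    static byte[] toUtf16Be(char[] chars) {
        byte[] bytes = new byte[chars.length * 2];
        ByteBuffer.wrap(bytes).asCharBuffer().put(chars);
        return bytes;
    }

    public static void main(String[] args) {
        byte[] bytes = toUtf16Be("password".toCharArray());

        // A single-byte charset turns every byte into one char, so the
        // result has 16 chars and every other one is U+0000.
        String wrong = new String(bytes, StandardCharsets.ISO_8859_1);

        // Decoding with the charset that matches the encoding restores the text.
        String right = new String(bytes, StandardCharsets.UTF_16BE);

        System.out.println(wrong.length());           // prints 16
        System.out.println(right.equals("password")); // prints true
    }
}
```

Many consoles do not render U+0000, which would explain why the three printed strings look identical even though two of them are twice as long.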

#2


13  

Conversion between char and byte is character set encoding and decoding. I prefer to make that as explicit as possible in code. It doesn't really mean extra code volume:

 Charset latin1Charset = Charset.forName("ISO-8859-1");
 CharBuffer charBuffer = latin1Charset.decode(ByteBuffer.wrap(byteArray)); // can also be turned into a String
 ByteBuffer byteBuffer = latin1Charset.encode(charBuffer);                 // can also encode from a String
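The snippet above can be fleshed out into a runnable round trip. This is only a sketch: the class Latin1RoundTrip and its encode/decode helpers are hypothetical names. Keep in mind that ISO-8859-1 can only represent U+0000 through U+00FF, so a password containing other characters would not survive this particular charset.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Latin1RoundTrip {
    private static final Charset LATIN1 = StandardCharsets.ISO_8859_1;

    // char[] -> byte[] without creating a String.
    static byte[] encode(char[] chars) {
        ByteBuffer bb = LATIN1.encode(CharBuffer.wrap(chars));
        byte[] out = new byte[bb.remaining()];
        bb.get(out);
        return out;
    }

    // byte[] -> char[] without creating a String.
    static char[] decode(byte[] bytes) {
        CharBuffer cb = LATIN1.decode(ByteBuffer.wrap(bytes));
        char[] out = new char[cb.remaining()];
        cb.get(out);
        return out;
    }

    public static void main(String[] args) {
        char[] original = {'p', 'a', 's', 's', '\u00e9'};
        char[] restored = decode(encode(original));
        System.out.println(Arrays.equals(original, restored)); // prints true
    }
}
```

Copying out of the buffer with remaining() and get() avoids relying on the size or offset of the buffer's backing array.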

Aside:

The java.nio classes and the java.io Reader/Writer classes use ByteBuffer and CharBuffer (which use byte[] and char[] as backing arrays), so it is often preferable to use these classes directly. However, you can always do:

 byteArray = byteBuffer.array();  byteBuffer = ByteBuffer.wrap(byteArray);
 byteBuffer.get(byteArray);       byteBuffer.put(byteArray);
 charArray = charBuffer.array();  charBuffer = CharBuffer.wrap(charArray);
 charBuffer.get(charArray);       charBuffer.put(charArray);

#3


5  

    public byte[] charsToBytes(char[] chars){
        Charset charset = Charset.forName("UTF-8");
        ByteBuffer byteBuffer = charset.encode(CharBuffer.wrap(chars));
        return Arrays.copyOf(byteBuffer.array(), byteBuffer.limit());
    }

    public char[] bytesToChars(byte[] bytes){
        Charset charset = Charset.forName("UTF-8");
        CharBuffer charBuffer = charset.decode(ByteBuffer.wrap(bytes));
        return Arrays.copyOf(charBuffer.array(), charBuffer.limit());    
    }
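Since the question is about passwords, one possible refinement of the two methods above is to wipe the temporary buffers that Charset.encode and Charset.decode allocate. The following is a sketch under that assumption, not a vetted security recipe; the class name PasswordBytes is hypothetical, and the JVM may still hold other copies that this code cannot reach.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PasswordBytes {
    static byte[] charsToBytes(char[] chars) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        byte[] result = new byte[encoded.remaining()];
        encoded.get(result);
        // Overwrite the encoder's temporary array so it holds no copy.
        if (encoded.hasArray()) {
            Arrays.fill(encoded.array(), (byte) 0);
        }
        return result;
    }

    static char[] bytesToChars(byte[] bytes) {
        CharBuffer decoded = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
        char[] result = new char[decoded.remaining()];
        decoded.get(result);
        // Overwrite the decoder's temporary array as well.
        if (decoded.hasArray()) {
            Arrays.fill(decoded.array(), '\0');
        }
        return result;
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', '\u00e9', 't'};
        byte[] bytes = charsToBytes(password);
        char[] back = bytesToChars(bytes);
        System.out.println(Arrays.equals(password, back)); // prints true

        // Callers should clear their own arrays once they are done with them.
        Arrays.fill(password, '\0');
        Arrays.fill(bytes, (byte) 0);
        Arrays.fill(back, '\0');
    }
}
```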

#4


4  

If you want to use a ByteBuffer and CharBuffer, don't use the simple .asCharBuffer(), which just performs a UTF-16 conversion (big-endian by default; you can change the byte order with the order method), since Java Strings, and thus your char[], use this encoding internally.

Use Charset.forName(charsetName), and then its encode or decode method, or its newEncoder/newDecoder.

When converting your byte[] to a String, you should also specify the encoding (and it should be the same one).
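The role of the byte order can be illustrated with a short sketch. It writes the same chars through asCharBuffer() with both byte orders and shows that each result only decodes correctly with the matching UTF-16 variant. ByteOrderDemo and toBytes are names made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class ByteOrderDemo {
    // Writes each char as two bytes in the requested byte order.
    static byte[] toBytes(char[] chars, ByteOrder order) {
        byte[] bytes = new byte[chars.length * 2];
        ByteBuffer.wrap(bytes).order(order).asCharBuffer().put(chars);
        return bytes;
    }

    public static void main(String[] args) {
        char[] chars = "ab".toCharArray();

        byte[] be = toBytes(chars, ByteOrder.BIG_ENDIAN);    // 00 61 00 62
        byte[] le = toBytes(chars, ByteOrder.LITTLE_ENDIAN); // 61 00 62 00

        // The decoding charset has to match the byte order used for writing.
        System.out.println(new String(be, StandardCharsets.UTF_16BE)); // prints ab
        System.out.println(new String(le, StandardCharsets.UTF_16LE)); // prints ab
    }
}
```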

#5


3  

What I would do is use one loop to convert to bytes and another to convert back to chars.

char[] chars = "password".toCharArray();
byte[] bytes = new byte[chars.length*2];
for(int i=0;i<chars.length;i++) {
   bytes[i*2] = (byte) (chars[i] >> 8);
   bytes[i*2+1] = (byte) chars[i];
}
char[] chars2 = new char[bytes.length/2];
for(int i=0;i<chars2.length;i++) 
   chars2[i] = (char) ((bytes[i*2] << 8) + (bytes[i*2+1] & 0xFF));
String password = new String(chars2);

#6


2  

You should make use of getBytes() instead of toCharArray().

Replace the line

char[] password = "password".toCharArray();

with

byte[] password = "password".getBytes();

#7


2  

This is an extension to Peter Lawrey's answer. For the backward (bytes-to-chars) conversion to work correctly for the whole range of chars, the code should be as follows:

char[] chars = new char[bytes.length/2];
for (int i = 0; i < chars.length; i++) {
   chars[i] = (char) (((bytes[i*2] & 0xff) << 8) + (bytes[i*2+1] & 0xff));
}

We need to "unsign" the bytes (with & 0xff) before using them. Otherwise half of all possible char values will not be restored correctly, because a negative byte is sign-extended when it is promoted to int. For instance, chars within the [0x80..0xff] range will be affected.
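The effect of the missing mask can be seen in a small sketch. SignExtensionDemo and its two helper methods are illustrative names; the unmasked variant deliberately omits & 0xff on both bytes to show what goes wrong for a char such as U+00E9.

```java
final class SignExtensionDemo {
    // Combines two bytes without masking: a negative low byte is
    // sign-extended to a negative int and corrupts the high byte.
    static char combineUnmasked(byte hi, byte lo) {
        return (char) ((hi << 8) + lo);
    }

    // Combines two bytes after masking each one to its unsigned value.
    static char combineMasked(byte hi, byte lo) {
        return (char) (((hi & 0xff) << 8) + (lo & 0xff));
    }

    public static void main(String[] args) {
        char original = '\u00e9';
        byte hi = (byte) (original >> 8); // 0x00
        byte lo = (byte) original;        // 0xE9, which is -23 as a signed byte

        System.out.println(Integer.toHexString(combineUnmasked(hi, lo))); // prints ffe9
        System.out.println(Integer.toHexString(combineMasked(hi, lo)));   // prints e9
    }
}
```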

#8


1  

When you call getBytes() on a String in Java without arguments, the result depends on the default encoding of your computer's settings (e.g. StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1, etc.).

So whenever you want to get the bytes of a String object, make sure to specify an encoding, like:

String sample = "abc";
byte[] a_byte = sample.getBytes(StandardCharsets.UTF_8);

Let's check what happens in this code. In Java, the String named sample is stored as Unicode (UTF-16 code units): every char in the String takes 2 bytes.

sample :  value: "abc"   in Memory(Hex):  00 61 00 62 00 63
        a -> 00 61
        b -> 00 62
        c -> 00 63

But when we call getBytes on the String, we get:

byte[] a_byte = sample.getBytes(StandardCharsets.UTF_8);
//result is : 61 62 63
//length: 3 bytes

byte[] b_byte = sample.getBytes(StandardCharsets.UTF_16BE);
//result is : 00 61 00 62 00 63
//length: 6 bytes
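The byte sequences listed above can be reproduced with a small sketch. GetBytesDemo and its hex helper are hypothetical names. The example also includes plain UTF_16, which in Java prepends a byte order mark when encoding and therefore yields two extra bytes.

```java
import java.nio.charset.StandardCharsets;

final class GetBytesDemo {
    // Formats a byte array as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = "abc";
        System.out.println(hex(sample.getBytes(StandardCharsets.UTF_8)));    // 61 62 63
        System.out.println(hex(sample.getBytes(StandardCharsets.UTF_16BE))); // 00 61 00 62 00 63
        System.out.println(hex(sample.getBytes(StandardCharsets.UTF_16)));   // fe ff 00 61 00 62 00 63
    }
}
```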

To get the original bytes of the String, we can just read the in-memory chars of the String and extract each byte. Below is the sample code:

public static byte[] charArray2ByteArray(char[] chars){
    int length = chars.length;
    // Two bytes per char; extra trailing bytes would decode as a spurious U+0000.
    byte[] result = new byte[length*2];
    int i = 0;
    for(int j = 0; j<length; j++){
        result[i++] = (byte)((chars[j] & 0xFF00) >> 8); // high byte first
        result[i++] = (byte)(chars[j] & 0x00FF);        // then low byte
    }
    return result;
}

Usage:

String sample = "abc";
//First get the chars of the String; each char has two bytes (Java).
char[] sample_chars = sample.toCharArray();
//Get the bytes
byte[] result = charArray2ByteArray(sample_chars);

//Back to String.
//Make sure we use UTF_16BE, because we wrote the bytes of each
//char from left to right (high byte first). That is the same
//byte order as UTF-16BE.
String sample_back = new String(result, StandardCharsets.UTF_16BE);
