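1. The key point first:
Different encodings use a different number of bytes per character. Even within UTF-8 the size of a Chinese character is not fixed: characters in the Basic Multilingual Plane (the common ones, such as 名) take 3 bytes, while characters in the supplementary planes (such as U+20001) take 4 bytes. No Chinese character fits in 2 bytes of UTF-8, because the 2-byte range ends at U+07FF.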
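To make the rule concrete, here is a minimal sketch that computes the UTF-8 byte count straight from the code point and compares it with what `getBytes` reports. The class name `Utf8ByteCount` and the helper `utf8Length` are made up for this illustration; the thresholds are the standard UTF-8 ranges, and the helper assumes a valid, non-surrogate code point.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Utf8ByteCount {

    // Number of bytes UTF-8 uses for a single code point.
    // Assumes a valid Unicode scalar value (not a lone surrogate).
    static int utf8Length(int codePoint) {
        if (codePoint < 0x80) return 1;     // ASCII
        if (codePoint < 0x800) return 2;    // up to U+07FF, no Chinese characters here
        if (codePoint < 0x10000) return 3;  // rest of the BMP, e.g. U+540D
        return 4;                           // supplementary planes, e.g. U+20001
    }

    public static void main(String[] args) {
        int[] samples = {'a', 0x00E9, 0x540D, 0x20001};
        for (int cp : samples) {
            // Build the string from the code point; U+20001 needs two chars.
            String s = new String(Character.toChars(cp));
            int actual = s.getBytes(StandardCharsets.UTF_8).length;
            System.out.printf("U+%04X: computed=%d, getBytes=%d, chars=%d%n",
                    cp, utf8Length(cp), actual, s.length());
        }
    }
}
```

For U+540D both numbers should be 3, and for U+20001 both should be 4, with `s.length()` equal to 2 because Java stores that character as a surrogate pair.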
2. Here is the source code:
    import java.io.UnsupportedEncodingException;
    import org.junit.Test; // assuming JUnit 4

    @Test
    public void test1() throws UnsupportedEncodingException {
        // A common Chinese character in the BMP (U+540D).
        // "编码长度" in the labels below means "encoded length" (in bytes).
        String a = "名";
        System.out.println("UTF-8编码长度:" + a.getBytes("UTF-8").length);
        System.out.println("GBK编码长度:" + a.getBytes("GBK").length);
        System.out.println("GB2312编码长度:" + a.getBytes("GB2312").length);
        System.out.println("==========================================");

        // U+20001, a supplementary-plane character, written as a surrogate pair.
        // (The literal text "0x20001" would just be seven ASCII characters.)
        String c = "\uD840\uDC01";
        System.out.println("UTF-8编码长度:" + c.getBytes("UTF-8").length);
        // GBK and GB2312 cannot represent U+20001; the encoder typically
        // substitutes a single '?' byte, which is why the length comes out as 1.
        System.out.println("GBK编码长度:" + c.getBytes("GBK").length);
        System.out.println("GB2312编码长度:" + c.getBytes("GB2312").length);
        System.out.println("==========================================");

        // The same character, this time built from its code point.
        char[] arr = Character.toChars(0x20001);
        String s = new String(arr);
        System.out.println("char array length:" + arr.length);
        System.out.println("content:| " + s + " |");
        System.out.println("String length:" + s.length());
        System.out.println("UTF-8编码长度:" + s.getBytes("UTF-8").length);
        System.out.println("GBK编码长度:" + s.getBytes("GBK").length);
        System.out.println("GB2312编码长度:" + s.getBytes("GB2312").length);
        System.out.println("==========================================");
    }
3. Output:
    UTF-8编码长度:3
    GBK编码长度:2
    GB2312编码长度:2
    ==========================================
    UTF-8编码长度:4
    GBK编码长度:1
    GB2312编码长度:1
    ==========================================
    char array length:2
    content:|