How many bytes does a Chinese character occupy in Java? (Repost)

Time: 2024-04-15 18:35:04

1. The key point first:

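The number of bytes depends on the encoding. In GBK and GB2312 a Chinese character takes 2 bytes. In UTF-8 the count is not fixed either: characters in the Basic Multilingual Plane, which covers the commonly used Chinese characters, take 3 bytes, while characters in the supplementary planes (such as U+20001) take 4 bytes. The 2-byte UTF-8 range (U+0080 to U+07FF) contains no Chinese characters.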
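To make the UTF-8 rule concrete, here is a minimal sketch that predicts the byte count from the code point range and compares it with what `String.getBytes` reports. The class name `Utf8ByteCount` and the helper `utf8Length` are illustrative and not part of the JDK, and the sketch assumes valid, non-surrogate code points.

```java
import java.nio.charset.StandardCharsets;

public class Utf8ByteCount {

    // Number of bytes UTF-8 uses for one Unicode code point.
    // Assumes a valid code point that is not a lone surrogate.
    public static int utf8Length(int codePoint) {
        if (codePoint < 0x80) return 1;     // ASCII
        if (codePoint < 0x800) return 2;    // e.g. Latin-1 Supplement, Greek, Cyrillic
        if (codePoint < 0x10000) return 3;  // rest of the BMP, including U+4E00..U+9FFF
        return 4;                           // supplementary planes
    }

    public static void main(String[] args) {
        // 'A', the character U+540D, and the supplementary character U+20001
        int[] samples = {'A', 0x540D, 0x20001};
        for (int cp : samples) {
            String s = new String(Character.toChars(cp));
            int actual = s.getBytes(StandardCharsets.UTF_8).length;
            System.out.printf("U+%04X: predicted %d, actual %d%n",
                    cp, utf8Length(cp), actual);
        }
    }
}
```

Using `StandardCharsets.UTF_8` instead of the charset name `"UTF-8"` also avoids the checked `UnsupportedEncodingException`.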

2. Source code:

import java.io.UnsupportedEncodingException;
import org.junit.Test; // assumes JUnit 4

public class EncodingLengthTest { // class name is illustrative

    @Test
    public void test1() throws UnsupportedEncodingException {
        // A common character inside the Basic Multilingual Plane.
        String a = "名";
        System.out.println("UTF-8 encoded length:" + a.getBytes("UTF-8").length);
        System.out.println("GBK encoded length:" + a.getBytes("GBK").length);
        System.out.println("GB2312 encoded length:" + a.getBytes("GB2312").length);
        System.out.println("==========================================");

        // The supplementary character U+20001, written as a UTF-16 surrogate pair.
        // GBK and GB2312 have no mapping for it.
        String c = "\uD840\uDC01";
        System.out.println("UTF-8 encoded length:" + c.getBytes("UTF-8").length);
        System.out.println("GBK encoded length:" + c.getBytes("GBK").length);
        System.out.println("GB2312 encoded length:" + c.getBytes("GB2312").length);
        System.out.println("==========================================");

        // The same character built from its code point; it needs two chars.
        char[] arr = Character.toChars(0x20001);
        String s = new String(arr);
        System.out.println("char array length:" + arr.length);
        System.out.println("content:|  " + s + " |");
        System.out.println("String length:" + s.length());
        System.out.println("UTF-8 encoded length:" + s.getBytes("UTF-8").length);
        System.out.println("GBK encoded length:" + s.getBytes("GBK").length);
        System.out.println("GB2312 encoded length:" + s.getBytes("GB2312").length);
        System.out.println("==========================================");
    }
}
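The test above touches three different lengths: the number of `char` values (UTF-16 code units), the number of characters a reader would count (code points), and the number of encoded bytes. The following is a minimal sketch of how these can be told apart, and of how to check in advance whether a charset can represent a string at all. The class name `CodePointDemo` and its helper methods are illustrative; the GBK part only runs if the JVM ships that charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CodePointDemo {

    // Counts Unicode code points instead of UTF-16 code units.
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // True if the charset can encode every character of s without substitution.
    public static boolean encodable(String s, Charset cs) {
        return cs.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(0x20001));
        System.out.println("UTF-16 code units: " + s.length());
        System.out.println("code points: " + codePoints(s));
        System.out.println("UTF-8 bytes: " + s.getBytes(StandardCharsets.UTF_8).length);

        // GBK is an extended charset, so guard against JVMs that lack it.
        if (Charset.isSupported("GBK")) {
            Charset gbk = Charset.forName("GBK");
            System.out.println("encodable in GBK: " + encodable(s, gbk));
            System.out.println("GBK bytes: " + s.getBytes(gbk).length);
        }
    }
}
```

When `encodable` returns false, `String.getBytes(Charset)` substitutes the charset's replacement bytes for the unmappable character, so the resulting byte count says nothing about the character's real size in that encoding.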

3. Output:

UTF-8 encoded length:3
GBK encoded length:2
GB2312 encoded length:2
==========================================
UTF-8 encoded length:4
GBK encoded length:1
GB2312 encoded length:1
==========================================
char array length:2
content:|