How do I find out how many bytes a character occupies?
2 Answers
#1
12
To find out how many bytes the first character of a UTF-8 encoded PHP string occupies:
print strlen(mb_substr($string, 0, 1, "utf-8"));
strlen() returns the raw byte length, while mb_substr() returns a "character" according to the charset/encoding, in this example the one at position 0.
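The same idea can be extended to every character in a string. The sketch below is only an illustration: the function name utf8_char_byte_lengths is hypothetical, it assumes the mbstring extension is loaded and that the input is valid UTF-8, and it uses the "\u{...}" escape syntax, which needs PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of PHP): returns the byte length of each
// character in a string that is assumed to be valid UTF-8.
function utf8_char_byte_lengths(string $string): array
{
    $lengths = [];
    // mb_strlen() counts characters, not bytes.
    $count = mb_strlen($string, "UTF-8");
    for ($i = 0; $i < $count; $i++) {
        // Take one character at position $i, then measure its raw bytes.
        $char = mb_substr($string, $i, 1, "UTF-8");
        $lengths[] = strlen($char);
    }
    return $lengths;
}

// "a", "é" (U+00E9), "€" (U+20AC) and an emoji (U+1F600).
print_r(utf8_char_byte_lengths("a\u{00E9}\u{20AC}\u{1F600}"));
// Byte lengths: 1, 2, 3, 4
```

Calling mb_substr() in a loop is simple but not fast on long strings; for a one-off check of a single character the one-liner above is enough.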
#2
6
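- ASCII is 7 bits per character, normally stored in one 8-bit byte.
- Single-byte encodings for most Western languages (for example ISO-8859-1) use 8 bits (1 byte) per character.
- Legacy encodings for East Asian languages such as Chinese and Japanese (for example GBK, Shift_JIS) use 1 or 2 bytes per character.
- Unicode itself does not fix a size; it depends on the encoding: UTF-8 uses 1 to 4 bytes, UTF-16 uses 2 or 4 bytes, and UTF-32 always uses 4 bytes.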
How a character is stored and represented depends on the programming language and the platform you are using.
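To see how the size of one character changes with the encoding, it can be converted and measured. This is a rough sketch, assuming the mbstring extension is available and PHP 7 or later; encoded_byte_length is a hypothetical name chosen for illustration, and the little-endian variants are used so that no byte order mark is counted.

```php
<?php
// Hypothetical helper: byte length of a UTF-8 input string after it has
// been converted to the given target encoding.
function encoded_byte_length(string $char, string $encoding): int
{
    return strlen(mb_convert_encoding($char, $encoding, "UTF-8"));
}

$euro = "\u{20AC}"; // the euro sign

foreach (["UTF-8", "UTF-16LE", "UTF-32LE"] as $encoding) {
    echo $encoding, ": ", encoded_byte_length($euro, $encoding), " bytes\n";
}
// UTF-8: 3 bytes
// UTF-16LE: 2 bytes
// UTF-32LE: 4 bytes
```

Characters outside the Basic Multilingual Plane, such as U+1F600, take 4 bytes in UTF-16 because they are stored as a surrogate pair.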