In Python I have this string:
string = "Ľubomír Mezovský"
I need only its first character. But when I tried string[0], it returned �, while string[:2] worked well. My question is: why? I need to run this on several strings, and when a string does not start with a character that has a diacritic, string[:2] returns a substring of two characters.
I am also using # encoding=utf8 and Python 2.7.
2 solutions
#1
3
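You're dealing with a byte string (assuming you're using Python 2.x). In UTF-8, Ľ is encoded as two bytes, so string[0] returns only the first of them, which is not a valid character on its own; string[:2] happens to return both bytes.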
Convert the byte string to a unicode string using str.decode, get the first character, then convert it back to a byte string using unicode.encode (optional, unless you need a byte string).
>>> string = "Ľubomír Mezovský"
>>> print(string.decode('utf-8')[0].encode('utf-8'))
Ľ
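To make the byte-versus-character difference concrete, here is a minimal sketch. It assumes the string uses precomposed characters (written with escapes so the result does not depend on the source file's encoding) and builds the bytes explicitly, so it behaves the same on Python 2.7 and Python 3. The variable names are only illustrative.

```python
# -*- coding: utf-8 -*-
# "Ľubomír Mezovský", written with escapes; assumes precomposed characters.
text = u"\u013dubom\u00edr Mezovsk\u00fd"

# These are the bytes a plain Python 2 str literal holds in a UTF-8 source file.
raw = text.encode("utf-8")

first_byte = raw[:1]   # b'\xc4': only the lead byte of the two-byte sequence for Ľ
first_two = raw[:2]    # b'\xc4\xbd': the complete UTF-8 encoding of Ľ

# A lone lead byte is not valid UTF-8, so a lenient decoder substitutes U+FFFD,
# which is the � symbol (a terminal typically does the same when printing it).
replaced = first_byte.decode("utf-8", errors="replace")

# Indexing the decoded text works per character instead of per byte.
first_char = raw.decode("utf-8")[0]

print(len(text), len(raw))  # number of characters vs number of bytes
```

Slicing the raw bytes at [:2] only looks correct because the first character happens to occupy two bytes; for a string starting with an ASCII letter it would return two characters.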
#2
0
Try making the string a Unicode string (note the u prefix) and encoding the result to "utf-8" when printing.
Ex:
string = u"Ľubomír Mezovský"
print string[0].encode('utf-8')
Output:
Ľ
Tested in Python 2.7.
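Since the question mentions running this over several strings, the decode-then-index approach could be wrapped in a small helper. This is only a sketch: first_char is a hypothetical name, and the UTF-8 default is an assumption based on the # encoding=utf8 declaration in the question. It returns the first code point, so a letter written as a base character plus a separate combining mark would yield just the base character.

```python
def first_char(value, encoding="utf-8"):
    """Return the first code point of value as a text (unicode) string.

    Hypothetical helper: accepts either a byte string or a text string.
    On Python 2, bytes is an alias of str, so plain literals are decoded.
    """
    if isinstance(value, bytes):
        value = value.decode(encoding)
    # Slicing instead of indexing returns an empty string for empty input.
    return value[:1]


names = [
    u"\u013dubom\u00edr Mezovsk\u00fd".encode("utf-8"),  # UTF-8 bytes starting with Ľ
    u"Mezovsk\u00fd",                                     # already a text string
]
initials = [first_char(name) for name in names]
```

Call .encode('utf-8') on each result only if a byte string is needed afterwards, for example when printing on Python 2.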