Saving unicode in Redis but getting an error on fetch

Time: 2020-12-09 22:52:06

I'm using MongoDB and Redis; Redis is my cache.

I'm caching MongoDB objects with redis-py:

obj in mongodb: {u'name': u'match', u'section_title': u'\u6d3b\u52a8', u'title': 
u'\u6bd4\u8d5b', u'section_id': 1, u'_id': ObjectId('4fb1ed859b10ed2041000001'), u'id': 1}

The obj fetched from Redis with hgetall(key, obj) is:

{'name': 'match', 'title': '\xe6\xaf\x94\xe8\xb5\x9b', 'section_title': 
'\xe6\xb4\xbb\xe5\x8a\xa8', 'section_id': '1', '_id': '4fb1ed859b10ed2041000001', 'id': '1'}

As you can see, the obj fetched from the cache contains str values instead of unicode, so my app raises errors like: 'ascii' codec can't decode byte 0xe6 in position 12: ordinal not in range(128)

Can anyone give some suggestions? Thank you.

4 solutions

#1


7  

Update: for a global setting, check jmoz's answer.

If you're using a third-party library such as django-redis, you may need to specify a customized ConnectionFactory:

class DecodeConnectionFactory(redis_cache.pool.ConnectionFactory):
    def get_connection(self, params):
        # Ask redis-py to decode every reply into unicode.
        params['decode_responses'] = True
        # super() already binds self, so pass only params.
        return super(DecodeConnectionFactory, self).get_connection(params)

Assuming you're using redis-py, you'd better pass str instead of unicode to Redis; otherwise redis-py will encode the value automatically for the *set commands, normally in UTF-8. For the *get commands, Redis has no idea about the original type of a value and simply returns it as str.

Thus, as Denis said, the way you store the object in Redis is critical. You need to convert the values to str yourself so that the Redis layer is transparent to you.

Also, set the default encoding to UTF-8 instead of ASCII.

#2


25  

I think I've discovered the problem. After reading this, I had to explicitly decode the values coming back from Redis, which is a pain, but it works.

I stumbled across a blog post where the author's output was all unicode strings, which was obviously different from mine.

Looking into StrictRedis.__init__, there is a parameter decode_responses, which is False by default. https://github.com/andymccurdy/redis-py/blob/273a47e299a499ed0053b8b90966dc2124504983/redis/client.py#L446

Pass in decode_responses=True when constructing the client; for me this FIXES THE OP'S ISSUE.
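A minimal sketch of that fix, assuming redis-py is installed and a Redis server is reachable on localhost:6379. The helper names client_kwargs and make_client and the host/port/db defaults are made up for illustration; only the decode_responses parameter comes from the answer above.

```python
def client_kwargs(host="localhost", port=6379, db=0):
    """Build keyword arguments for a redis-py client that returns text.

    The host/port/db defaults are assumptions for this sketch.
    """
    return {
        "host": host,
        "port": port,
        "db": db,
        # The key setting: decode every reply instead of returning raw bytes.
        "decode_responses": True,
    }


def make_client(**overrides):
    """Create the client; redis is imported lazily so this file loads without it."""
    import redis  # third-party package: pip install redis

    return redis.StrictRedis(**client_kwargs(**overrides))
```

With such a client, hgetall(key) should hand back unicode keys and values, so no per-field decode is needed in application code.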

#3


6  

For each string you can use the decode function to decode it from UTF-8 into unicode, e.g. for the value of the title field in your code:

In [7]: a='\xe6\xaf\x94\xe8\xb5\x9b'

In [8]: a.decode('utf8')
Out[8]: u'\u6bd4\u8d5b'
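To apply the same decode to a whole hash at once, a small helper along these lines might do. It is only a sketch: decode_hash is a hypothetical name, and the cached dict below is hand-written from the values in the question rather than fetched from a real server.

```python
def decode_hash(raw, encoding="utf-8"):
    """Decode the byte keys and values of an hgetall-style dict into text."""
    def to_text(value):
        # Leave anything that is not a byte string untouched.
        return value.decode(encoding) if isinstance(value, bytes) else value

    return {to_text(key): to_text(value) for key, value in raw.items()}


# Stand-in for the dict that hgetall returned in the question.
cached = {
    b"name": b"match",
    b"title": b"\xe6\xaf\x94\xe8\xb5\x9b",
    b"section_id": b"1",
}
obj = decode_hash(cached)
```

Note that numeric fields such as section_id still come back as strings; decoding fixes the text encoding only, not the original types.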

#4


3  

I suggest you always encode to UTF-8 before writing to MongoDB or Redis (or any external system), and that you decode('utf-8') when you fetch results, so that you always work with Unicode in Python.
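One way to picture that rule is a pair of helpers sitting at the boundary of the application. This is a sketch under assumptions: to_wire, from_wire and TEXT are hypothetical names, and the code is written to behave the same on Python 2 and Python 3.

```python
TEXT = type(u"")  # unicode on Python 2, str on Python 3


def to_wire(value, encoding="utf-8"):
    """Encode text to bytes just before it leaves the application."""
    return value.encode(encoding) if isinstance(value, TEXT) else value


def from_wire(value, encoding="utf-8"):
    """Decode bytes to text as soon as they come back in."""
    return value.decode(encoding) if isinstance(value, bytes) else value


title = u"\u6bd4\u8d5b"
stored = to_wire(title)        # what would be written to Redis
restored = from_wire(stored)   # what the rest of the app works with
```

Everything between the two calls is bytes; everything outside them is Unicode, which keeps implicit ASCII decoding out of the picture.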
