I'm just trying to write a really basic script that takes some input text and compresses it with LZW, using this package: http://packages.python.org/lzw/
I've never tried any encoding with Python before and am thoroughly confused =( I also can't find any documentation about it online, other than the package info.
Here's what I have:
import lzw
file = lzw.readbytes("collectemailinfo.txt", buffersize=1024)
enc = lzw.compress(file)
print enc
Any help or pointers of any kind would be much appreciated!
Thanks =)
1 solution
#1
5
Here is the package API: http://packages.python.org/lzw/lzw-module.html
You can read pseudocode for compression and decompression here
Is there anything else you are confused about?
Here is an example:
Python
In this version the dicts contain mixed-type data (single characters for the first 256 entries, integers for the codes added later):
def compress(uncompressed):
    """Compress a string to a list of output symbols."""

    # Build the dictionary.
    dict_size = 256
    dictionary = dict((chr(i), chr(i)) for i in xrange(dict_size))
    # in Python 3: dictionary = {chr(i): chr(i) for i in range(dict_size)}

    w = ""
    result = []
    for c in uncompressed:
        wc = w + c
        if wc in dictionary:
            w = wc
        else:
            result.append(dictionary[w])
            # Add wc to the dictionary.
            dictionary[wc] = dict_size
            dict_size += 1
            w = c

    # Output the code for w.
    if w:
        result.append(dictionary[w])
    return result


def decompress(compressed):
    """Decompress a list of output ks to a string."""

    # Build the dictionary.
    dict_size = 256
    dictionary = dict((chr(i), chr(i)) for i in xrange(dict_size))
    # in Python 3: dictionary = {chr(i): chr(i) for i in range(dict_size)}

    # Note: pop(0) removes the first symbol from the caller's list.
    w = result = compressed.pop(0)
    for k in compressed:
        if k in dictionary:
            entry = dictionary[k]
        elif k == dict_size:
            entry = w + w[0]
        else:
            raise ValueError('Bad compressed k: %s' % k)
        result += entry

        # Add w+entry[0] to the dictionary.
        dictionary[dict_size] = w + entry[0]
        dict_size += 1

        w = entry
    return result
How to use:
compressed = compress('TOBEORNOTTOBEORTOBEORNOT')
print (compressed)
decompressed = decompress(compressed)
print (decompressed)
Output:
['T', 'O', 'B', 'E', 'O', 'R', 'N', 'O', 'T', 256, 258, 260, 265, 259, 261, 263]
TOBEORNOTTOBEORTOBEORNOT
NOTE: this example is taken from here
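Since the original question was about compressing text from a file, here is a rough sketch of one way to turn the output into bytes you could actually write to disk. It is not the lzw package's API and not part of the example quoted above: it is a Python 3 variant of the same algorithm in which every output code is an integer (so nothing is mixed-type), followed by a simple fixed-width packing step. The names lzw_compress, lzw_decompress, pack_codes and unpack_codes are made up for illustration, and the sketch assumes the input only contains characters with code points below 256 and that fewer than 65536 dictionary entries are created.

```python
import struct


def lzw_compress(text):
    """Compress a str (code points < 256 assumed) into a list of integer codes."""
    dict_size = 256
    dictionary = {chr(i): i for i in range(dict_size)}
    w = ""
    codes = []
    for c in text:
        wc = w + c
        if wc in dictionary:
            w = wc
        else:
            codes.append(dictionary[w])
            # Register the new sequence under the next free code.
            dictionary[wc] = dict_size
            dict_size += 1
            w = c
    # Emit the code for whatever is left over.
    if w:
        codes.append(dictionary[w])
    return codes


def lzw_decompress(codes):
    """Rebuild the original str from a list of integer codes (input is not modified)."""
    if not codes:
        return ""
    dict_size = 256
    dictionary = {i: chr(i) for i in range(dict_size)}
    w = dictionary[codes[0]]
    parts = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == dict_size:
            # Code refers to the entry that is about to be created.
            entry = w + w[0]
        else:
            raise ValueError("Bad compressed code: %s" % k)
        parts.append(entry)
        dictionary[dict_size] = w + entry[0]
        dict_size += 1
        w = entry
    return "".join(parts)


def pack_codes(codes):
    """Serialize codes as big-endian unsigned 16-bit integers (assumes codes < 65536)."""
    return struct.pack(">%dH" % len(codes), *codes)


def unpack_codes(data):
    """Inverse of pack_codes: read back a list of 16-bit codes."""
    return list(struct.unpack(">%dH" % (len(data) // 2), data))


codes = lzw_compress("TOBEORNOTTOBEORTOBEORNOT")
packed = pack_codes(codes)
restored = lzw_decompress(unpack_codes(packed))
print(codes)
print(len(packed), "bytes")
print(restored)
```

The packed bytes can be written with open(path, "wb"). Note that fixed 16-bit codes are wasteful for short inputs: the 24-character string above becomes 16 codes, i.e. 32 bytes. Real LZW implementations typically use variable-width codes, so treat this only as a starting point for experimenting.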