I'm extracting JPEG data (album art, in practice) from MP3 files. I considered using the mutagen library, but I'd like to work with the raw bits for practice.
import os
import sys
import re
f = open(sys.argv[1], "rb")
#sys.argv[1] gets mp3 file name ex) test1.mp3
saver = ""
for value in f:
    for i in value:
        hexval = hex(ord(i))[2:]
        if (ord(i) == 0):
            saver += "00" #to match with hex form
        else:
            saver += hexval
header = "ffd8"
tail = "ffd9"
This part of the code reads the MP3 as raw bytes, converts them to a hex string, and looks for the JPEG markers: the image data starts with "ffd8" and ends with "ffd9".
frontmatch = re.search(header,saver)
endmatch = re.search(tail, saver)
startIndex = frontmatch.start()
endIndex = endmatch.end()
jpgcontents = saver[startIndex:endIndex]
scale = 16 # equals to hexadecimal
numbits = len(jpgcontents) * 4 #log2(scale)
bitcontents = bin(int(jpgcontents, scale))[2:].zfill(numbits)
Here I take the data between the header and the tail and convert it into binary form. This is supposed to be the JPEG part of the MP3 file.
txtfile = open(sys.argv[1] + "_tr.jpg", "w")
txtfile.write(bitcontents)
Then I wrote the binary string to a new file with a .jpg extension. Sorry for the misleading name txtfile.
But this code gives the following error:
Error interpreting JPEG image file
(Not a JPEG file: starts with 0x31 0x31)
I'm not sure whether the bits I extracted are wrong or the step that writes the file is wrong. There might also be some other problem in the code.
I'm working on Linux with Python 2.6. Is there anything wrong with just writing a str of binary digits as a JPG?
4 answers
#1
3
You are creating a string of ASCII zeroes and ones, i.e. \x30 and \x31, but the JPEG file needs to be proper binary data. So where your file should have a single byte of (for example) \xd8, you instead have these eight bytes: 11011000, or \x31\x31\x30\x31\x31\x30\x30\x30.
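To make the difference concrete, here is a small sketch comparing the text that bin() produces with the single raw byte a JPEG decoder expects. It is written for Python 3, which is an assumption on my part (the question uses Python 2.6); the variable names are only illustrative.

```python
# Python 3 sketch: the text produced by bin() versus the raw byte it describes.
value = 0xd8

as_text = bin(value)[2:].zfill(8)      # the str '11011000'
text_bytes = as_text.encode("ascii")   # the bytes that land in the file if that str is written
raw_byte = bytes([value])              # the one byte the JPEG actually needs

print(as_text)            # 11011000
print(text_bytes.hex())   # 3131303131303030
print(raw_byte.hex())     # d8
```

Eight bytes of text end up where one byte of data belongs, which is why the image viewer reports a file starting with 0x31 0x31.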
You don't need to do all that messy conversion stuff. You can just search directly for the desired byte patterns, writing them using \x hex escape sequences. And you don't even need regex: the simple string .index or .find methods can do this easily and quickly.
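import sys

fname = sys.argv[1]  # MP3 file name, as in the question

# Read the whole file as a byte string
with open(fname, 'rb') as f:
    data = f.read()

header = "\xff\xd8"  # JPEG start-of-image marker
tail = "\xff\xd9"    # JPEG end-of-image marker

try:
    start = data.index(header)
    end = data.index(tail, start) + 2  # include the 2-byte tail
except ValueError:
    print "Can't find JPEG data!"
    exit()

print 'Start: %d End: %d Size: %d' % (start, end, end - start)

# Write the raw bytes in binary mode
with open(fname + "_tr.jpg", 'wb') as f:
    f.write(data[start:end])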
(Tested on Python 2.6.6)
However, extracting embedded JPEG data like this isn't foolproof, since it's possible that those header and tail byte sequences exist in the MP3 sound data.
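One cheap way to reduce (not eliminate) such false matches is to require that the start marker be followed by another 0xff byte, since a JPEG stream continues with a further marker right after ffd8. The sketch below is for Python 3 and runs on synthetic data; find_jpeg is a hypothetical helper name, not part of any library, and the result is still only a heuristic.

```python
# Python 3 sketch: a stricter start-of-image test, demonstrated on synthetic bytes.
SOI = b"\xff\xd8"  # start-of-image marker
EOI = b"\xff\xd9"  # end-of-image marker

def find_jpeg(data):
    """Return (start, end) of the first candidate JPEG, or None (hypothetical helper)."""
    start = data.find(SOI + b"\xff")  # ffd8 must be followed by another marker byte
    if start == -1:
        return None
    end = data.find(EOI, start + 2)
    if end == -1:
        return None
    return start, end + 2  # end index includes the 2-byte EOI

# Fake "audio" bytes that happen to contain ff d8, followed by a fake image.
blob = b"\x00\xff\xd8\x12\x34" + b"\xff\xd8\xff\xe0IMAGE\xff\xd9" + b"\x00\x00"
print(find_jpeg(blob))  # (5, 16)
```

The stray ff d8 at offset 1 is skipped because it is not followed by 0xff. The end marker can still be matched too early, for example when the image carries an embedded thumbnail with its own ffd9.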
FWIW, a simpler way to translate binary data to hex strings and back is to use hexlify and unhexlify from the binascii module.
Here are some examples of doing these transformations, both with and without the binascii functions.
from binascii import hexlify, unhexlify
#Create a string of all possible byte values
allbytes = ''.join([chr(i) for i in xrange(256)])
print 'allbytes'
print repr(allbytes)
print '\nhex list'
print [hex(ord(v))[2:].zfill(2) for v in allbytes]
hexstr = hexlify(allbytes)
print '\nhex string'
print hexstr
newbytes = ''.join([chr(int(hexstr[i:i+2], 16)) for i in xrange(0, len(hexstr), 2)])
print '\nNew bytes'
print repr(newbytes)
print '\nUsing unhexlify'
print repr(unhexlify(hexstr))
output
allbytes
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
hex list
['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '0a', '0b', '0c', '0d', '0e', '0f', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '1a', '1b', '1c', '1d', '1e', '1f', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '2a', '2b', '2c', '2d', '2e', '2f', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '3a', '3b', '3c', '3d', '3e', '3f', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '4a', '4b', '4c', '4d', '4e', '4f', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '5a', '5b', '5c', '5d', '5e', '5f', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '6a', '6b', '6c', '6d', '6e', '6f', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '7a', '7b', '7c', '7d', '7e', '7f', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '8a', '8b', '8c', '8d', '8e', '8f', '90', '91', '92', '93', '94', '95', '96', '97', '98', '99', '9a', '9b', '9c', '9d', '9e', '9f', 'a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8', 'a9', 'aa', 'ab', 'ac', 'ad', 'ae', 'af', 'b0', 'b1', 'b2', 'b3', 'b4', 'b5', 'b6', 'b7', 'b8', 'b9', 'ba', 'bb', 'bc', 'bd', 'be', 'bf', 'c0', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7', 'c8', 'c9', 'ca', 'cb', 'cc', 'cd', 'ce', 'cf', 'd0', 'd1', 'd2', 'd3', 'd4', 'd5', 'd6', 'd7', 'd8', 'd9', 'da', 'db', 'dc', 'dd', 'de', 'df', 'e0', 'e1', 'e2', 'e3', 'e4', 'e5', 'e6', 'e7', 'e8', 'e9', 'ea', 'eb', 'ec', 'ed', 'ee', 'ef', 'f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9', 'fa', 'fb', 'fc', 'fd', 'fe', 'ff']
hex string
000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff
New bytes
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
Using unhexlify
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
Note that this code needs some modifications to run on Python 3 (apart from converting the print statements to print function calls) because plain Python 3 strings are Unicode strings, not byte strings.
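For reference, here is roughly what the same round trip could look like on Python 3. This is a sketch rather than a tested port of the code above: it uses a bytes object instead of a str, and decodes the hexlify result to text so it can be sliced like the Python 2 string.

```python
# Python 3 sketch of the hex round trip, using bytes instead of str.
from binascii import hexlify, unhexlify

# All possible byte values, as a bytes object
allbytes = bytes(range(256))

# Without binascii: iterating over bytes yields ints, so no ord() is needed
hexlist = ["%02x" % b for b in allbytes]

# With binascii: hexlify returns bytes, so decode to get a text string
hexstr = hexlify(allbytes).decode("ascii")

# Back to bytes, without and with binascii
newbytes = bytes(int(hexstr[i:i + 2], 16) for i in range(0, len(hexstr), 2))
unhexed = unhexlify(hexstr)

print(hexstr[:16])           # 0001020304050607
print(newbytes == allbytes)  # True
print(unhexed == allbytes)   # True
```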
#2
0
You need to write the file out in binary mode. Try:
txtfile = open(sys.argv[1] + "_tr.jpg", "wb")
#3
0
Oops, you are not doing what you expect: bin generates a string containing the value in binary form. Let's look at what you have:
- saver is a string of hexadecimal characters in textual form, something like "313233414243" for an initial string of "123ABC"
- jpgcontents has the same format; it starts with "ffd8" and ends with "ffd9"
- you then apply the magic formula bin(int(jpgcontents, scale))[2:].zfill(numbits), which:
  - converts the hex string to a long integer
  - converts the long integer to a binary representation string (this step would turn hex "ff" into the integer 255 and then into the string "0b11111111")
  - removes the leading "0b" and pads the front with zeros if needed
bitcontents is then a string starting with "11111111...". Just rename your file with a .txt extension and open it in a text editor: you will see that it is a large file containing only the ASCII characters 0 and 1.
As the header is "ffd8", the file starts with ten "1" characters. That explains the error saying the file starts with 0x31 0x31: 0x31 is the ASCII code of "1".
What you need is to convert the hex string jpgcontents into a binary byte string.
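# each pair of hex digits becomes one byte (assumes two hex digits per byte in jpgcontents)
fileimage = ''.join([chr(int(jpgcontents[i:i+2], 16)) for i in range(0, len(jpgcontents), 2)])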
You can then safely copy the fileimage buffer to a binary file:
file = open(sys.argv[1] + "_tr.jpg", "wb")
file.write(fileimage)
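One caveat for this kind of conversion: it only works if every byte contributed exactly two hex digits to the string. The loop in the question special-cases only the value 0, so bytes 0x01 to 0x0f contribute a single digit each and shift everything after them. A Python 3 sketch on made-up sample bytes (the names are illustrative):

```python
# Python 3 sketch: unpadded versus zero-padded hex for the same bytes.
data = b"\xff\xd8\x05\x00\xff\xd9"

# Mirrors the question's loop: only 0 is padded to two digits
unpadded = "".join("00" if b == 0 else hex(b)[2:] for b in data)

# Always two digits per byte
padded = "".join("%02x" % b for b in data)

print(unpadded)                       # ffd8500ffd9
print(padded)                         # ffd80500ffd9
print(bytes.fromhex(padded) == data)  # True
```

The unpadded string has an odd length, so it can no longer be split back into byte pairs reliably.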
#4
0
The easiest method is using the binascii module: https://docs.python.org/2/library/binascii.html.
import binascii
# hex byte values in ASCII form, contained in a list
code = ['00', '01', '02', '03', '04', '05', '06', '07', '08', '09']
bfile = open('bfile.bin', 'wb')  # binary mode, so the bytes are written unchanged
for c in code:
    # convert the ASCII hex pair to a raw byte and write it to the file
    bfile.write(binascii.unhexlify(c))
bfile.close()
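A quick way to check that the bytes really reach the disk unchanged is to read the file back in binary mode. The following is a Python 3 sketch; writing to a temporary directory is my own choice so that the example does not clobber an existing bfile.bin.

```python
# Python 3 sketch: write unhexlified bytes in binary mode, then read them back.
import binascii
import os
import tempfile

code = ['00', '01', '02', '03', '04', '05', '06', '07', '08', '09']

# Temporary location chosen for this example only
path = os.path.join(tempfile.mkdtemp(), 'bfile.bin')

with open(path, 'wb') as bfile:
    for c in code:
        bfile.write(binascii.unhexlify(c))  # one raw byte per hex pair

with open(path, 'rb') as bfile:
    contents = bfile.read()

print(contents == bytes(range(10)))  # True
```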