I'm generating binary data files that are simply a series of records concatenated together. Each record consists of a (binary) header followed by binary data. Within the binary header is an ascii string 80 characters long. Somewhere along the way, my process of writing the files got a little messed up and I'm trying to debug this problem by inspecting how long each record actually is.
This seems extremely related, but I don't understand Perl, so I haven't been able to get the accepted answer there to work. The other answer points to bgrep, which I've compiled, but it wants me to feed it a hex string. I'd rather have a tool where I can give it the ASCII string and it will find it in the binary data, then print the string and the byte offset where it was found.
In other words, I'm looking for some tool which acts like this:
tool foobar filename
or
tool foobar < filename
and its output is something like this:
foobar:10
foobar:410
foobar:810
foobar:1210
...
i.e. the string which matched and the byte offset in the file where the match started. In this example case, I can infer that each record is 400 bytes long.
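The inference above is just the difference between consecutive offsets. A minimal sketch of that arithmetic, assuming the offsets have already been collected into a list; the helper names record_lengths and irregular_records are hypothetical and only for illustration:

```python
from collections import Counter


def record_lengths(offsets):
    """Return the gap in bytes between each pair of consecutive match offsets."""
    return [later - earlier for earlier, later in zip(offsets, offsets[1:])]


def irregular_records(offsets):
    """Return (start_offset, length) for records whose length differs from the most common one."""
    lengths = record_lengths(offsets)
    if not lengths:
        return []
    # Treat the most frequent gap as the "expected" record length (an assumption).
    expected = Counter(lengths).most_common(1)[0][0]
    return [(start, length)
            for start, length in zip(offsets, lengths)
            if length != expected]


offsets = [10, 410, 810, 1210]
print(record_lengths(offsets))     # [400, 400, 400]
print(irregular_records(offsets))  # []
```

Any record that irregular_records reports is a candidate for where the writing process went wrong.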
Other constraints:
- The ability to search by regex is cool, but I don't need it for this problem.
- My binary files are big (3.5 GB), so I'd like to avoid reading the whole file into memory if possible.
3 solutions
#1
23
You could use strings for this:
strings -a -t x filename | grep foobar
Tested with GNU binutils.
For example, to see where --help occurs in /bin/ls:
strings -a -t x /bin/ls | grep -- --help
Output:
14938 Try `%s --help' for more information.
162f0 --help display this help and exit
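Note that with -t x the offsets are hexadecimal (strings also accepts -t d for decimal), and each one marks the start of the printable string rather than the start of the match. A rough sketch of turning that output into decimal match offsets, assuming each line has the form "offset, one space, string" and that the text is plain ASCII; the function name match_offsets is hypothetical:

```python
def match_offsets(strings_output, needle):
    """Convert `strings -a -t x` output into decimal byte offsets of each needle match."""
    offsets = []
    for line in strings_output.splitlines():
        # The offset is right-justified, so drop the padding, then split once
        # on the single space that separates the offset from the string itself.
        fields = line.lstrip().split(" ", 1)
        if len(fields) != 2:
            continue
        hex_offset, text = fields
        try:
            start = int(hex_offset, 16)
        except ValueError:
            continue  # not an offset line; skip it
        # A string can contain the needle more than once.
        pos = text.find(needle)
        while pos != -1:
            offsets.append(start + pos)
            pos = text.find(needle, pos + 1)
    return offsets
```

It could be fed with the captured output of the command above, for example via subprocess.run(..., capture_output=True, text=True).stdout.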
#2
23
grep --byte-offset --only-matching --text foobar filename
The --byte-offset option prints the offset of each matching line.
The --only-matching option makes it print the offset of each matching instance instead of each matching line.
The --text option makes grep treat the binary file as a text file.
You can shorten it to:
grep -oba foobar filename
It works in the GNU version of grep, which comes with Linux by default. It won't work in BSD grep (which comes with Mac by default).
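Where GNU grep isn't available, the same string:offset output can be approximated with a few lines of Python. This is only a sketch under stated assumptions: the function name find_offsets is made up for illustration, the pattern is a literal ASCII string rather than a regex, and the file is non-empty (mmap refuses to map an empty file). It uses mmap so the operating system pages the file in on demand instead of reading all of it into memory.

```python
import mmap
import sys


def find_offsets(path, needle):
    """Yield the byte offset of every occurrence of needle (bytes) in the file at path."""
    with open(path, "rb") as f:
        # Map the file read-only; pages are loaded lazily by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(needle)
            while pos != -1:
                yield pos
                # Resume one byte later so overlapping matches are also reported.
                pos = mm.find(needle, pos + 1)


if __name__ == "__main__" and len(sys.argv) == 3:
    pattern, filename = sys.argv[1], sys.argv[2]
    for offset in find_offsets(filename, pattern.encode("ascii")):
        print(f"{pattern}:{offset}")
```

Saved under a name of your choice, it would be run as: python3 script.py foobar filename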
#3
0
I wanted to do the same task. Though strings | grep worked, I found gsar was the very tool I needed.
http://tjaberg.com/
The output looks like:
>gsar.exe -bic -sfoobar filename.bin
filename.bin: 0x34b5: AAA foobar BBB
filename.bin: 0x56a0: foobar DDD
filename.bin: 2 matches found