Write a string of 1's and 0's to a binary file?

I want to take a string of 1's and 0's and convert it into an actual binary file (simply writing the string of 1's and 0's to a file would just produce an ASCII file containing "00110001"s and "00110000"s). I would prefer to do this in Python or directly from a bash shell, but Java or C is fine too. This is probably a one-time use.

Thanks.

5 answers

#1

In Python, use the int built-in function to convert the string of 0s and 1s to a number:

>>> int("00100101", 2)
37

Then use the chr built-in to convert an 8-bit integer (that is, one in the inclusive range 0-255) to a character.

>>> chr(_)
'%'

The result of chr can then be written to a file (opened in binary mode) with the file.write method. Note that this applies to Python 2; in Python 3, chr returns a text string, so build the byte with bytes([n]) instead.
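
As a minimal sketch for Python 3 (an assumption; the steps above behave as shown in Python 2, where chr returns a byte string), the single-byte case might look like this. The helper name bits_to_byte and the file name one_byte.bin are hypothetical and only here for illustration.

```python
def bits_to_byte(bits):
    """Convert a string of up to 8 '0'/'1' characters into a single byte."""
    value = int(bits, 2)              # e.g. "00100101" -> 37
    if not 0 <= value <= 255:
        raise ValueError("expected a value that fits in 8 bits")
    return bytes([value])             # bytes([37]) == b'%'


# Write the byte to a file opened in binary mode (hypothetical file name).
with open("one_byte.bin", "wb") as f:
    f.write(bits_to_byte("00100101"))
```

Opening the file with "wb" matters: in text mode Python 3 would refuse to write a bytes object.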

#2

If you've got more than 8 characters to convert (and I'm presuming you do) then you'll need something like this (using Python):

>>> b = '0010101001010101010111101010100101011011'
>>> bytearray(int(b[x:x+8], 2) for x in range(0, len(b), 8))
bytearray(b'*U^\xa9[')

This splits the bit string into 8-character chunks (if the length of your string isn't a multiple of 8, you should pad it first), converts each chunk into an integer, and then collects the integers into a bytearray, which can be written directly to your binary file (there's no need to convert it to a string):

>>> the_bytearray = bytearray(int(b[x:x+8], 2) for x in range(0, len(b), 8))
>>> with open('somefile', 'wb') as f:
...     f.write(the_bytearray)
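
Putting the steps above together, a hedged Python 3 sketch might look like the following. The function name bits_to_bytes is hypothetical, and padding with trailing zeros is an assumption; pad on the left instead if your bits are meant to be right-aligned.

```python
def bits_to_bytes(bits):
    """Convert a string of '0'/'1' characters (MSB first) into bytes.

    Assumption: an incomplete final chunk is padded with trailing zeros.
    """
    if set(bits) - {"0", "1"}:
        raise ValueError("input may only contain the characters 0 and 1")
    # -len(bits) % 8 is the number of zeros needed to reach a multiple of 8.
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def write_bits(path, bits):
    """Write the converted bytes to path, opened in binary mode."""
    with open(path, "wb") as f:
        f.write(bits_to_bytes(bits))
```

For example, write_bits('somefile', b) would produce a 5-byte file for the 40-character string b used above.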

If you have more tasks like this then there are libraries that can help, for example here's the same conversion using my bitstring module:

>>> from bitstring import BitArray
>>> with open('somefile', 'wb') as f:
...     BitArray(bin=b).tofile(f)

#3

Any language that can do shifting can combine numbers of any radix. While I'm a fan of the various ways different languages make this kind of manipulation easy, never forget that behind all of this is some very basic maths.

In this case, binary is just a simple power of 2 so:

    1 << 0 = 1
    1 << 1 = 2
    1 << 2 = 4
    1 << 3 = 8

and so on...

If you're taking the binary string 10100101, you can easily convert it to a byte as follows:

    (1 << 7) + (0 << 6) + (1 << 5) + (0 << 4) + (0 << 3) + (1 << 2) + (0 << 1) + 1

This assumes that you've first gone through and converted each "0" or "1" character to its numeric value.

This will start getting a bit tedious if you're dealing with more bits than the 8 above, but since you're doing a byte at a time, a simple byte array in your chosen language will suffice, allowing you to push each byte in turn.
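
The shift-and-add approach described above can be sketched in Python as follows. This is only an illustration: the function name bits_to_bytearray is hypothetical, and rejecting input whose length is not a multiple of 8 is an assumption made to keep the sketch short.

```python
def bits_to_bytearray(bits):
    """Build a bytearray from a bit string using explicit shifts, MSB first."""
    if len(bits) % 8 != 0:
        raise ValueError("length must be a multiple of 8")
    out = bytearray()
    for start in range(0, len(bits), 8):
        value = 0
        for position, ch in enumerate(bits[start:start + 8]):
            if ch not in "01":
                raise ValueError("input may only contain 0 and 1")
            digit = 1 if ch == "1" else 0      # character -> number
            value += digit << (7 - position)   # first character is bit 7
        out.append(value)                      # push each byte in turn
    return out
```

Replacing digit << (7 - position) with digit * 2 ** (7 - position) gives the same result for languages without a shift operator.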

It's worth mentioning also that the same process can be used for other bases, and if you don't have a shift facility, a simple multiplication will generally work just as well.

If you label the columns with their place values, you'll easily see what I'm on about. Taking the above example (remember, it's all powers of 2):

    1    0    1    0    0    1    0    1
    128  64   32   16   8    4    2    1   = 128 + 32 + 4 + 1 = 165

Not part of the question, but related... and taking it one step further:

Hexadecimal uses the digits 0 to F (16 values), each of which fits into 4 bits, so:

    1010 0101 (8+2) (4+1) - Binary using powers of 2 only on 4 bits (8 4 2 1)
    10   5    (Decimal) - (10 << 4) + 5 = 165
    A    5    (Hexadecimal)
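
The nibble arithmetic above can be checked with a few lines of Python. This is just a sketch of the same calculation; the variable names are chosen here for illustration.

```python
bits = "10100101"

high = int(bits[:4], 2)          # 1010 -> 10 (hex digit A)
low = int(bits[4:], 2)           # 0101 -> 5
value = (high << 4) + low        # (10 << 4) + 5 = 165
hex_text = format(value, "02X")  # two uppercase hex digits

print(high, low, value, hex_text)  # prints: 10 5 165 A5
```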

#4

In Java you can use the built-in Integer.parseInt(String strBinaryNumber, int radix) method, which works like this:

    String strBinaryNumber = "00100101";
    System.out.println(Integer.parseInt(strBinaryNumber, 2));

The output will be: 37

However, an exception of type NumberFormatException is thrown if any of the following situations occurs:

1. The first argument is null or is a string of length zero.
2. The radix is either smaller than Character.MIN_RADIX or larger than Character.MAX_RADIX.
3. Any character of the string is not a digit of the specified radix, except that the first character may be a minus sign '-' ('\u002D') provided that the string is longer than length 1.
4. The value represented by the string is not a value of type int.

#5

This isn't all that practical, but here's one way it could be done in a shell script. Note: it uses bc.

#!/bin/bash

# Name of your output file
OFILE="output.txt"

# A goofy wrapper to convert a sequence of 8 1s and 0s into an 8-bit number, expressed in hex
function bstr_to_byte()
{
    echo "obase=16;ibase=2;$1" | bc
}


# Build input string from stdin
#   This can be done using pipes ( echo "1010101..." | ./binstr.sh )
#   Or "interactively", so long as you enter q on its own line when you are done entering your
#       binary string.
ISTR=""
while read data; do
    if [[ ${data} != "q" ]] ; then
        ISTR="${ISTR}${data}"
    else
        break
    fi
done

# Byte-by-byte conversion
# ${#ISTR} is the string length; unlike "expr length" it is built into bash
# and does not fail when ISTR is empty.
while [[ ${#ISTR} -ge 8 ]] ; do
    # Copy the first 8 characters
    BSTR=${ISTR:0:8}
    # Drop them from the input string
    ISTR=${ISTR:8}
    # Convert the byte-string into a byte
    BYTE=$(bstr_to_byte "$BSTR")

    # Debug print
    ##echo "$BSTR => [ ${BYTE} ]"

    # Write character to file
    echo -en "\x${BYTE}" >> "${OFILE}"

    # Stop early once the input has been fully consumed
    if [[ -z ${ISTR} ]] ; then
        ##echo "String parsed evenly"
        break
    fi
done

##echo "Remaining, unparsed characters: ${ISTR}"

If you name the script binstr.sh, you can run it by piping to stdin, e.g.:

echo "11001100" | ./binstr.sh

You can check this with something like hexdump, e.g. hexdump output.txt

I should point out that this assumes that your string is being entered with the MSB first. It will also simply discard any number of "bits" that don't form a complete byte. You could change this, or just make sure you pad your input sufficiently.

Lastly, there are some debugging lines I left in there but commented out with double # signs.
