What is the difference between parsed bytecode instructions and machine language?

Date: 2021-06-08 20:45:46

"A bytecode program is normally executed by parsing the instructions one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators, or "just-in-time" (JIT) compilers, translate bytecode into machine language as necessary at runtime: this makes the virtual machine unportable."

My question about this paragraph: after the bytecode has been processed, what is the difference between a parsed instruction and machine language (machine code)?

4 answers

#1


A JIT compiler is different from a bytecode interpreter.

Consider the following C function:

int sum() {
   return 5 + 6;
}

This will be compiled directly to machine code. The exact instructions will differ between, say, x86 and ARM processors.

If we wrote a basic bytecode interpreter, it might look something like this:

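for (;;) {
   switch (*currentInstruction++) {   // fetch the opcode and advance
   case OP_PUSHINT:
      // nextInt is assumed to read the operand and advance past it
      *stack++ = nextInt(currentInstruction);
      break;
   case OP_ADD:
      --stack;                        // pop the right-hand operand
      stack[-1].add(*stack);          // add it into the new top of stack
      break;
   case OP_RETURN:
      return stack[-1];               // the result is the top of the stack
   }
}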

This can then interpret the following sequence of instructions:

OP_PUSHINT (5)
OP_PUSHINT (6)
OP_ADD
OP_RETURN
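
For anyone who wants to try this, here is a minimal self-contained sketch of the same loop in C++. It is not the answer's original code: the flat array of 32-bit words, the `run` function, the `kSumProgram` constant and the `int32_t` stack are all assumptions made only so that the example compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes, named after the ones used in the answer.
enum Opcode : std::int32_t { OP_PUSHINT, OP_ADD, OP_RETURN };

// Interprets a program stored as a flat array of 32-bit words.
// Assumption of this sketch: OP_PUSHINT is followed by one operand word.
std::int32_t run(const std::vector<std::int32_t>& code) {
    std::vector<std::int32_t> stack;
    std::size_t pc = 0;  // index of the next word to decode
    for (;;) {
        assert(pc < code.size() && "ran off the end of the program");
        switch (code[pc++]) {
        case OP_PUSHINT:
            assert(pc < code.size() && "missing operand");
            stack.push_back(code[pc++]);  // the operand follows the opcode
            break;
        case OP_ADD: {
            assert(stack.size() >= 2 && "stack underflow");
            const std::int32_t rhs = stack.back();
            stack.pop_back();
            stack.back() += rhs;  // replace both operands with their sum
            break;
        }
        case OP_RETURN:
            assert(!stack.empty() && "nothing to return");
            return stack.back();
        default:
            assert(false && "unknown opcode");
            return 0;
        }
    }
}

// The four-instruction program from the listing above.
const std::vector<std::int32_t> kSumProgram = {
    OP_PUSHINT, 5, OP_PUSHINT, 6, OP_ADD, OP_RETURN};
```

With these definitions, `run(kSumProgram)` evaluates to 11. Nothing in the sketch mentions a processor: the only machine code involved is whatever the C++ compiler generates for `run` itself.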

If you compiled the bytecode interpreter for either x86 or ARM, you would be able to run the same bytecode on both without rewriting any part of the interpreter.

If you wrote a JIT compiler, you would need to emit processor-specific instructions (machine code) for each supported processor, whereas the bytecode interpreter relies on the C++ compiler to emit the processor-specific instructions.
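
To make the per-processor cost concrete, here is a rough sketch of a translator with two back ends. It only emits assembly text, whereas a real JIT would encode bytes into executable memory, and everything in it (the `Target` enum, the `translate` and `append` functions, the choice of x86-64 and AArch64, the use of the hardware stack as the operand stack) is an assumption for illustration. It also assumes small non-negative constants that fit in a single `mov` immediate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical opcodes, named after the ones used in the answer.
enum Opcode : std::int32_t { OP_PUSHINT, OP_ADD, OP_RETURN };
enum class Target { X86_64, AArch64 };

// Appends several assembly lines to the output.
static void append(std::vector<std::string>& out,
                   std::initializer_list<const char*> lines) {
    for (const char* line : lines) out.emplace_back(line);
}

// Translates the whole program once into assembly text for one target.
// Each opcode needs a separate, hand-written sequence per processor.
std::vector<std::string> translate(const std::vector<std::int32_t>& code,
                                   Target target) {
    std::vector<std::string> out;
    const bool x86 = (target == Target::X86_64);
    for (std::size_t pc = 0; pc < code.size();) {
        switch (code[pc++]) {
        case OP_PUSHINT: {
            assert(pc < code.size() && "missing operand");
            const std::string n = std::to_string(code[pc++]);
            if (x86) {
                out.push_back("push " + n);
            } else {
                out.push_back("mov x0, #" + n);
                out.push_back("str x0, [sp, #-16]!");  // push, sp stays 16-byte aligned
            }
            break;
        }
        case OP_ADD:
            if (x86) {
                append(out, {"pop rcx", "pop rax", "add rax, rcx", "push rax"});
            } else {
                append(out, {"ldr x1, [sp], #16", "ldr x0, [sp], #16",
                             "add x0, x0, x1", "str x0, [sp, #-16]!"});
            }
            break;
        case OP_RETURN:
            if (x86) {
                append(out, {"pop rax", "ret"});
            } else {
                append(out, {"ldr x0, [sp], #16", "ret"});
            }
            break;
        default:
            assert(false && "unknown opcode");
        }
    }
    return out;
}
```

Supporting a third processor means writing a third branch for every opcode. The interpreter never has to do this, because the C++ compiler takes care of targeting each processor.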

#2


In a bytecode interpreter, the instruction format is usually designed for very fast "parsing" using shift and mask operators. After "parsing" (I prefer "decoding") an instruction, the interpreter immediately updates the state of the virtual machine and then begins decoding the next instruction. So after bytecode has been processed by an interpreter, nothing is left over: no translated form of the instruction remains.
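
As an illustration of decoding with shift and mask, here is a small sketch. The instruction layout is made up for the example (a 32-bit word with the opcode in the top 8 bits and an unsigned operand in the low 24 bits), and the names `Decoded`, `encode` and `decode` are hypothetical. Real virtual machines each define their own formats.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: [ opcode: 8 bits | operand: 24 bits ]
const std::uint32_t kOperandBits = 24;
const std::uint32_t kOperandMask = (1u << kOperandBits) - 1;  // 0x00FFFFFF

struct Decoded {
    std::uint8_t opcode;
    std::uint32_t operand;
};

// Packs an opcode and an operand into one instruction word.
inline std::uint32_t encode(std::uint8_t opcode, std::uint32_t operand) {
    assert(operand <= kOperandMask && "operand does not fit in 24 bits");
    return (static_cast<std::uint32_t>(opcode) << kOperandBits) | operand;
}

// Decoding is one shift and one mask: no tokenizing, no lookups.
inline Decoded decode(std::uint32_t word) {
    Decoded d;
    d.opcode = static_cast<std::uint8_t>(word >> kOperandBits);
    d.operand = word & kOperandMask;
    return d;
}
```

For example, `encode(0x01, 5)` gives `0x01000005`, and `decode` of that word recovers opcode `0x01` and operand `5`. The `Decoded` value only lives long enough for the interpreter to act on it, which is the "no remnant" point made above.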

In a JIT compiler, bytecode is processed in units larger than a single instruction. The minimum unit is the basic block, but modern JITs convert larger paths to machine code. This is a translation step, and its output is machine code. The original bytecode may remain in memory, but it is no longer used for execution, so at that point there is no real difference. (It is still typical, though, for the machine code produced by a JIT-based virtual machine to do different things from the machine code emitted by a native-code compiler.)
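
To show what "the basic block as a unit" means, here is a sketch that finds where basic blocks start in a bytecode program, using the textbook leader rule. It does not describe any particular JIT. The two jump opcodes and the names `findLeaders` and `instructionLength` are hypothetical, and jump targets are assumed to be word indices into the program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical instruction set; the jump opcodes are added for this sketch.
enum Opcode : std::int32_t {
    OP_PUSHINT, OP_ADD, OP_JUMP, OP_JUMP_IF_ZERO, OP_RETURN
};

inline bool isJump(std::int32_t op) {
    return op == OP_JUMP || op == OP_JUMP_IF_ZERO;
}

// Number of words an instruction occupies (opcode plus operand, if any).
inline std::size_t instructionLength(std::int32_t op) {
    return (op == OP_PUSHINT || isJump(op)) ? 2 : 1;
}

// Returns the word indices at which basic blocks begin ("leaders"):
// the first instruction, every jump target, and every instruction
// that follows a jump or a return.
std::set<std::size_t> findLeaders(const std::vector<std::int32_t>& code) {
    std::set<std::size_t> leaders;
    if (code.empty()) return leaders;
    leaders.insert(0);
    for (std::size_t pc = 0; pc < code.size();) {
        const std::int32_t op = code[pc];
        const std::size_t next = pc + instructionLength(op);
        if (isJump(op)) {
            assert(pc + 1 < code.size() && "missing jump target");
            leaders.insert(static_cast<std::size_t>(code[pc + 1]));
        }
        if ((isJump(op) || op == OP_RETURN) && next < code.size()) {
            leaders.insert(next);
        }
        pc = next;
    }
    return leaders;
}
```

A translator built on this would compile each run of instructions between two leaders as one unit, since control can only enter such a run at its first instruction.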

#3


There is no difference. That is exactly what a JIT compiler is for: it produces machine code that is executed on the hardware.

#4


Ultimately, it all boils down to machine instructions.

  1. Native App - contains machine instructions that are executed directly.

  2. JIT App - bytecode is compiled into machine instructions, which are then executed.

  3. Translated (interpreted) App - bytecode is translated by a virtual machine that is itself a Native App.

As you can tell, #1 has the least overhead and #3 has the most. Performance should therefore be fastest with #1, and about as fast with #2 once the initial compilation overhead has been paid.
