Using Lua local variables to optimize performance, with a comparison of local and global variables

Date: 2021-07-20 05:03:04

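In the fiercely competitive game industry, and in browser games especially, designers' requirements are complex and change often, so using a scripting language reduces both difficulty and cost. Lua code in such projects frequently reads global variables that serve as configuration data.

Access to a global variable can be optimized by referencing the global through a local variable. That said, outside hot code paths this optimization rarely makes a noticeable difference.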
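As a minimal sketch of the idea, assume a hypothetical global configuration table named GameConfig. The table, its field, and both function names are illustrative and do not come from the original project. The second function reads the global once and then works from a local:

```lua
-- Hypothetical global configuration table (illustrative only).
GameConfig = { expPerLevel = 100 }

-- Looks up the global GameConfig and its field on every iteration.
local function totalExpViaGlobal(levels)
  local sum = 0
  for _ = 1, levels do
    sum = sum + GameConfig.expPerLevel
  end
  return sum
end

-- Reads the global once, then uses the local copy inside the loop.
local function totalExpViaLocal(levels)
  local expPerLevel = GameConfig.expPerLevel
  local sum = 0
  for _ = 1, levels do
    sum = sum + expPerLevel
  end
  return sum
end

print(totalExpViaGlobal(10), totalExpViaLocal(10))
```

Both functions return the same value. The local copy is a snapshot, so it will not see a later change to GameConfig.expPerLevel; that trade-off is worth checking before caching configuration values this way.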

Locals Vs Globals  from  http://lua-users.org/wiki/LocalsVsGlobals









Comparison between local and global variables:





Implementation: Locals index an array of registers on the stack, with hard-coded integer index (at least in the standard implementation -- LuaImplementations).

Globals index from a table (or userdata), typically with variable name string, stored as constant or variable, as the key.

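In other words, the standard Lua implementation addresses a local through a hard-coded integer index into an array of registers on the stack.

A global is looked up in a table (or userdata), keyed by the variable name held as a string constant or a string variable.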

Performance: The above point implies that locals can be a bit faster.

Still, table indexing is a fairly efficient operation in Lua (e.g. strings are interned with precomputed hash), so this point can often be ignored unless profiling indicates optimization is needed.

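Put differently, the difference in implementation makes locals slightly faster than globals.

Indexing a Lua table by string is still quite fast, because strings are interned and their hash values are computed in advance.

So unless profiling shows that optimization is needed, the difference can be ignored.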

Syntax: If an environment table is set, globals can be accessed in Lua code with or without explicitly naming the environment table they come from: foo, _G.foo, or getfenv().foo (or in 5.2.0work3, _ENV.foo). 

This allows different styles of environment usage, such as in ModuleDefinition.

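That is, once an environment table is set, a global can be reached as foo, _G.foo, or getfenv().foo, with or without naming the environment table explicitly.

This permits different styles of environment use, for example when defining modules.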
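The equivalent spellings can be checked with a short sketch. It assumes the code runs as an ordinary chunk with the default environment; getfenv exists only in Lua 5.1 and LuaJIT, so the sketch falls back to _ENV on Lua 5.2 and later:

```lua
foo = 42  -- a global: stored in the environment table

-- getfenv is available in Lua 5.1 and LuaJIT; later versions expose _ENV instead.
local env = getfenv and getfenv() or _ENV

-- Three ways of naming the same variable.
print(foo, _G.foo, env.foo)
```

All three expressions read the same table entry, which is why a global can be addressed with or without naming its environment table.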

Scope: Locals are scoped within a lexical block. Globals are scoped within any number of functions assigned a given environment table at run-time.

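In short, a local is visible only inside its lexical block, whereas a global is visible at run time to every function that has been assigned the given environment table.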

Declaration: Locals must be explicitly declared, with local statement, to use them.

Globals can be accessed on-the-fly, even with variable name set at runtime.

The list of locals is defined statically. The list of globals may be determined at run-time and can even change during program execution.

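That is, a local must be declared explicitly with the local statement before it can be used.

The set of locals is fixed statically.

Globals, by contrast, can be accessed on the fly while the code runs, even under a name that is computed at run time.

The set of globals is determined at run time and can change while the program executes.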
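A small sketch of that dynamic behaviour follows. The name level_3 is made up for illustration, and the code assumes _G refers to the default globals table:

```lua
-- Build a global name at run time and define it through the globals table.
local name = "level_" .. 3
_G[name] = "boss stage"

-- The new global is reachable by its plain name.
print(level_3)

-- The set of globals can be enumerated and changed while the program runs.
local count = 0
for _ in pairs(_G) do
  count = count + 1
end
print(count > 0)

-- Removing the entry makes the global undefined again.
_G[name] = nil
print(level_3)
```

Nothing comparable exists for locals: their names are fixed when the chunk is compiled.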

Standard library access: The standard library is normally exposed to a chunk via globals.

In other words, a chunk normally reaches the Lua standard library through global variables.

Precedence: Locals override globals.

That is, when a local and a global share a name, the local takes precedence.
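The scope and precedence rules can be seen together in a short sketch. The variable name setting and the function read are illustrative only:

```lua
setting = "global"            -- global variable

local function read()
  local setting = "local"     -- shadows the global inside this function
  do
    local setting = "inner"   -- visible only until the matching end
  end
  -- The plain name resolves to the function's local; _G.setting still
  -- reaches the global explicitly.
  return setting, _G.setting
end

print(read())
print(setting)
```

Inside read the plain name refers to the local, while outside the function it refers to the global again, because the local's scope ended with its block.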

Bytecode introspection: Globals are more visible in the bytecode.

Global get/sets become GETGLOBAL/SETGLOBAL opcodes with the given variable name.

It is possible to list all the globals read or written by a compiled chunk.

Limit on number: There is a limit to the number of locals per function (MAXLOCALS, typically 200).

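Put differently, globals can be seen in the bytecode: every read or write of a global becomes a GETGLOBAL or SETGLOBAL instruction carrying the variable name, so all the globals that a compiled chunk reads or writes can be listed.

Each function is also limited in how many locals it can declare, typically 200.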
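The limit can be probed with a sketch that generates a chunk declaring n locals and checks whether it compiles. The helper name compiles is hypothetical, and the exact threshold depends on how the interpreter was built, so the printed values may differ between implementations:

```lua
-- loadstring in Lua 5.1 and LuaJIT, load in Lua 5.2 and later.
local compile = loadstring or load

-- Returns true if a chunk declaring n local variables compiles.
local function compiles(n)
  local names = {}
  for i = 1, n do
    names[i] = "v" .. i
  end
  local source = "local " .. table.concat(names, ", ")
  return compile(source) ~= nil
end

print(compiles(200), compiles(201))
```

When the limit is exceeded, the compile function returns nil plus an error message instead of a function, so the failure shows up at compile time rather than at run time.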

 

 

Basic facts  from  Lua Performance Tips Roberto Ierusalimschy

Before running any code, Lua translates (precompiles) the source into an internal format.

This format is a sequence of instructions for a virtual machine, similar to machine code for a real CPU.

This internal format is then interpreted by C code that is essentially a while loop with a large switch inside, one case for each instruction.

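That is, before running any code Lua precompiles the source into an internal bytecode format.

The bytecode is a sequence of virtual machine instructions, similar to the machine instructions of a real CPU.

An interpreter written in C then executes it with what is essentially a while loop around a large switch, with one case per instruction.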
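One visible consequence of the precompilation step is that a chunk containing a syntax error never starts running, even if its first statements are valid. The sketch below illustrates this; the global name ran is made up for the example:

```lua
-- loadstring in Lua 5.1 and LuaJIT, load in Lua 5.2 and later.
local compile = loadstring or load

ran = false

-- The first statement is valid, but the chunk as a whole has a syntax error,
-- so compilation fails and no part of it executes.
local fn, err = compile("ran = true  return 1 +")
print(fn, ran)
print(err)

-- A valid chunk compiles to a function and executes only when called.
local chunk = compile("ran = true  return 1 + 1")
print(ran)
local result = chunk()
print(result, ran)
```

The variable ran stays false until the compiled function is actually called, which separates the translation phase from the interpretation phase described above.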

Perhaps you have already read somewhere that, since version 5.0, Lua uses a register-based virtual machine. 

The “registers” of this virtual machine do not correspond to real registers in the CPU, because this correspondence would be not portable and quite limited in the number of registers available.

Instead, Lua uses a stack (implemented as an array plus some indices) to accommodate its registers. 

Each active function has an activation record, which is a stack slice wherein the function stores its registers.

So, each function has its own registers.

Each function may use up to 250 registers, because each instruction has only 8 bits to refer to a register.

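In other words, Lua has used a register-based virtual machine since version 5.0. Its "registers" do not correspond to real CPU registers, because such a mapping would not be portable and would be limited by the number of registers the CPU provides.

Instead, Lua uses a stack to hold its registers. Each function may use up to 250 registers, because each instruction has only 8 bits with which to refer to a register.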

Given that large number of registers, the Lua precompiler is able to store all local variables in registers.

The result is that access to local variables is very fast in Lua. 

So, it is easy to justify one of the most important rules to improve the performance of Lua programs: use locals!

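Put briefly, with that many registers available the Lua precompiler can keep all local variables in registers, so access to locals is very fast.

Hence one of the most important rules for improving the performance of Lua programs: use locals.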

If you need to squeeze performance out of your program, there are several places where you can use locals besides the obvious ones.

For instance, if you call a function within a long loop, you can assign the function to a local variable. For instance, the code

for i = 1, 1000000 do
  local x = math.sin(i)
end

runs 30% slower than this one:

local sin = math.sin
for i = 1, 1000000 do
  local x = sin(i)
end

Test code:

module.lua:

A = 1
B = 1

main.lua:

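print("=================================")
require("module")

-- Benchmark 1: assign to a local that shadows the global A.
local A = A
local t1 = os.clock()
for i = 1, 100000000 do
    A = 2
end
local t2 = os.clock()
print("Global access optimized through a local: " .. (t2 - t1))

-- Benchmark 1, baseline: assign to the global B directly.
t1 = os.clock()
for i = 1, 100000000 do
    B = 2
end
t2 = os.clock()
print("Direct global access: " .. (t2 - t1))
print("=================================")

-- Benchmark 2: call math.sin through a local reference.
t1 = os.clock()
local sin = math.sin
for i = 1, 100000000 do
    local x = sin(i)
end
t2 = os.clock()
print("Global access optimized through a local: " .. (t2 - t1))

-- Benchmark 2, baseline: look up math.sin on every iteration.
-- Note: x is a global here, so this loop also pays for a global store
-- on each iteration, unlike the loop above.
t1 = os.clock()
for i = 1, 100000000 do
    x = math.sin(i)
end
t2 = os.clock()
print("Direct global access: " .. (t2 - t1))

Test results:

=================================
Global access optimized through a local: 0.825
Direct global access: 1.961
=================================
Global access optimized through a local: 7.017
Direct global access: 10.264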