Using Lua local variables to improve performance, with a comparison of local and global variables

Date: 2021-03-10 16:04:24


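In the highly competitive game industry, and in browser games especially, designers' requirements are complex and change often, and scripting lowers the difficulty and cost of keeping up with them. Lua code in such projects frequently reads global variables that serve as configuration data.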

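When a global variable is accessed repeatedly, the access can be optimized by referring to the global through a local variable. Admittedly, in most code the gain from this optimization is negligible.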

Locals Vs Globals  from  http://lua-users.org/wiki/LocalsVsGlobals




Comparison between local and global variables:


Implementation: Locals index an array of registers on the stack, with hard-coded integer index (at least in the standard implementation -- LuaImplementations).
Globals index from a table (or userdata), typically with variable name string, stored as constant or variable, as the key.
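Put simply, in the standard Lua implementation each local is reached through a hard-coded integer index into the registers on the stack, whereas a global is looked up in a table (or userdata) with a string constant or string variable as the key.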
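
The difference between the two storage locations can be observed from Lua itself. The following is a minimal sketch, not part of the quoted wiki page; the names config_level, cached and env are made up for illustration, and the first line assumes either Lua 5.1 (getfenv) or Lua 5.2+ (_ENV).

```lua
-- Hypothetical names: config_level, cached and env exist only in this sketch.
-- Lua 5.2+ exposes the environment as _ENV; Lua 5.1 exposes it via getfenv().
local env = _ENV or getfenv()

config_level = 10            -- global: stored in the environment table under the key "config_level"
local cached = config_level  -- local: held in a register of the enclosing function

print(env["config_level"])    --> 10   (the same value, fetched by string key)
print(rawget(env, "cached"))  --> nil  (the local never enters the table)
```

The local has no entry in the environment table at all, which is why reaching it needs no table lookup.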
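Performance: The above point implies that locals can be a bit faster.
Still, table indexing is a fairly efficient operation in Lua (e.g. strings are interned with precomputed hash), so this point can often be ignored unless profiling indicates optimization is needed.
Because of this difference in implementation, locals are slightly faster than globals. Indexing a Lua table by string is still quick, since strings are interned and their hash values are computed in advance.
The performance gap can therefore be ignored unless profiling shows that a piece of code actually needs optimizing.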
Syntax: If an environment table is set, globals can be accessed in Lua code with or without explicitly naming the environment table they come from: foo, _G.foo, or getfenv().foo (or in 5.2.0work3, _ENV.foo). 
This allows different styles of environment usage, such as in ModuleDefinition.
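That is, once an environment table is set, a global can be read either by its bare name or through the table it lives in: foo, _G.foo, or getfenv().foo.
This flexibility is what makes the different module-definition styles possible.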
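
As a small illustration of these spellings, here is a sketch that reads one global in several ways. The global name greeting is hypothetical. getfenv exists only in Lua 5.1 and _ENV only in Lua 5.2 and later, so the last lines pick whichever is available.

```lua
-- "greeting" is a hypothetical global used only for this sketch.
greeting = "hello"

print(greeting)        -- bare name
print(_G.greeting)     -- explicit field of the global table
print(_G["greeting"])  -- the same lookup written with a string key

-- Lua 5.1 spells the environment getfenv(); Lua 5.2+ spells it _ENV.
local env = _ENV or getfenv()
print(env.greeting)
```

All four lines read the same table entry as long as the chunk runs with the default environment.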
Scope: Locals are scoped within a lexical block. Globals are scoped within any number of functions assigned a given environment table at run-time.
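A local is visible only inside the lexical block that declares it, while a global is visible to every function that has been assigned the same environment table at run time.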
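
Lexical scoping of locals is easy to check with nested blocks. This is a sketch with hypothetical names (scope_demo, loop_only); it assumes no global called loop_only has been defined elsewhere.

```lua
-- scope_demo and loop_only are hypothetical names for this sketch.
local function scope_demo()
    local results = {}
    local x = 1
    do
        local x = 2                 -- a new local, visible only inside this do-block
        results[1] = x
    end
    results[2] = x                  -- the outer x is untouched
    for i = 1, 1 do
        local loop_only = i         -- local to the loop body
    end
    results[3] = loop_only          -- out of scope here, so this is a global lookup
    return results
end

local r = scope_demo()
print(r[1], r[2], r[3])             --> 2   1   nil
```

Once a block ends, a bare name that matched one of its locals silently falls back to the global table, which is a common source of accidental globals.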
Declaration: Locals must be explicitly declared, with local statement, to use them.
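Globals can be accessed on-the-fly, even with variable name set at runtime.
The list of locals is defined statically. The list of globals may be determined at run-time and can even change during program execution.
In short, a local must be declared explicitly before it can be used, and the set of locals is fixed statically. A global can be read or written on the fly while the code runs, even under a name that is computed at run time.
The set of globals is therefore only known at run time and can change during program execution.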
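
The run-time nature of globals can be sketched as follows. The level_N names are hypothetical, and the first line again assumes Lua 5.1 (getfenv) or Lua 5.2+ (_ENV).

```lua
local env = _ENV or getfenv()

-- Create globals whose names are computed at run time (hypothetical "level_N" names).
for n = 1, 3 do
    env["level_" .. n] = n * 100
end
print(level_2)  --> 200

-- The set of globals can be enumerated while the program runs.
local count = 0
for name in pairs(env) do
    if type(name) == "string" and name:match("^level_%d+$") then
        count = count + 1
    end
end
print(count)    --> 3

-- It can also shrink: assigning nil removes the entry.
level_3 = nil
```

Nothing comparable exists for locals, whose names and slots are fixed when the chunk is compiled.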
Standard library access: The standard library is normally exposed to a chunk via globals.
In other words, the whole Lua standard library reaches user code as global variables.
Precedence: Locals override globals.
A local variable shadows a global of the same name.
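
Because the standard library arrives as ordinary globals, a local can shadow one of its functions. The sketch below uses a hypothetical function name, shadow_demo, and deliberately shadows tostring inside that function only.

```lua
-- shadow_demo is a hypothetical name for this sketch.
local function shadow_demo()
    -- This local hides the global tostring for the rest of the function.
    local tostring = function(v)
        return "<" .. _G.tostring(v) .. ">"
    end
    return tostring(42)
end

print(shadow_demo())  --> <42>
print(tostring(42))   --> 42   (the global is unchanged outside the function)
```

The global remains reachable through _G even while the local of the same name is in scope.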
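Bytecode introspection: Globals are more visible in the bytecode.
Global get/sets become GETGLOBAL/SETGLOBAL opcodes with the given variable name.
It is possible to list all the globals read or written by a compiled chunk.
Every read or write of a global compiles to a GETGLOBAL or SETGLOBAL instruction that carries the variable name, so inspecting a compiled chunk is enough to list every global it touches. (These opcodes belong to Lua 5.1; from Lua 5.2 on, globals compile to table accesses on _ENV instead.)
Limit on number: There is a limit to the number of locals per function (MAXLOCALS, typically 200).
A single function can hold at most 200 local variables in a typical build.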
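
The limit on locals can be probed by compiling generated source. This is a sketch under assumptions: the helper name chunk_with_locals is hypothetical, the limit of 200 is the typical value quoted above and may differ in a custom build, and loadstring (Lua 5.1) or load (Lua 5.2+) is assumed to be available.

```lua
local compile = loadstring or load  -- loadstring in Lua 5.1, load in 5.2+

-- Build the source of a chunk that declares n locals in a single statement.
-- chunk_with_locals is a hypothetical helper for this sketch.
local function chunk_with_locals(n)
    local names = {}
    for i = 1, n do
        names[i] = "v" .. i
    end
    return "local " .. table.concat(names, ", ")
end

local ok_fn = compile(chunk_with_locals(200))        -- expected to compile on a stock build
local bad_fn, err = compile(chunk_with_locals(201))  -- expected to be rejected by the compiler
print(ok_fn ~= nil, bad_fn == nil, err)
```

The error comes from the compiler rather than from running the chunk, which matches the point that the list of locals is fixed statically.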
 
 
Basic facts  from  Lua Performance Tips Roberto Ierusalimschy
Before running any code, Lua translates (precompiles) the source into an internal format.
This format is a sequence of instructions for a virtual machine, similar to machine code for a real CPU.
This internal format is then interpreted by C code that is essentially a while loop with a large switch inside, one case for each instruction.
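Before running any code, Lua precompiles the source into an internal bytecode format.
The bytecode is a sequence of virtual-machine instructions, much like the machine instructions of a real CPU. An interpreter written in C executes it in what is essentially a while loop whose switch has one case per instruction.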
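
The split between compiling and running is visible from Lua code. In this sketch, loadstring (Lua 5.1) or load (Lua 5.2+) compiles a string without executing it, and string.dump returns the compiled function in serialized form; the variable names are made up for illustration.

```lua
local compile = loadstring or load  -- loadstring in Lua 5.1, load in 5.2+

-- Compile only: the source is translated to the internal format, nothing runs yet.
local chunk, err = compile("return 1 + 1")
assert(chunk, err)

-- string.dump serializes the compiled function as a binary string.
local bytecode = string.dump(chunk)
print(#bytecode > 0)  --> true

-- Running the chunk is a separate step, performed by the virtual machine.
print(chunk())        --> 2
```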
Perhaps you have already read somewhere that, since version 5.0, Lua uses a register-based virtual machine. 
The “registers” of this virtual machine do not correspond to real registers in the CPU, because this correspondence would be not portable and quite limited in the number of registers available.
Instead, Lua uses a stack (implemented as an array plus some indices) to accommodate its registers. 
Each active function has an activation record, which is a stack slice wherein the function stores its registers.
So, each function has its own registers.
Each function may use up to 250 registers, because each instruction has only 8 bits to refer to a register.
Since version 5.0, Lua has used a register-based virtual machine. Its "registers" are not the CPU's real registers, because relying on those would not be portable and would leave too few registers available.
Lua keeps its registers on a stack instead. Each function can use up to 250 of them, because an instruction has only 8 bits with which to address a register.
Given that large number of registers, the Lua precompiler is able to store all local variables in registers.
The result is that access to local variables is very fast in Lua. 
So, it is easy to justify one of the most important rules to improve the performance of Lua programs: use locals!
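With that many registers available, the Lua precompiler can keep every local variable in a register, so access to locals is very fast. Hence one of the most important rules for tuning Lua programs: use local variables.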
If you need to squeeze performance out of your program, there are several
places where you can use locals besides the obvious ones. 
For instance, if you call a function within a long loop, you can assign the function to a local variable. For instance, the code
for i = 1, 1000000 do
    local x = math.sin(i)
end
runs 30% slower than this one:
local sin = math.sin
for i = 1, 1000000 do
    local x = sin(i)
end

Test code:

module.lua:

A = 1
B = 1

main.lua:

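print("=================================")
require("module")

-- Cache the global A in a local. The loop below writes the local (a register),
-- not the global defined in module.lua.
local A = A
local t1 = os.clock()
for i = 1, 100000000 do
    A = 2
end
local t2 = os.clock()
print("Global access optimized through a local variable: " .. (t2 - t1))

-- Write the global B directly: a table store on every iteration.
t1 = os.clock()
for i = 1, 100000000 do
    B = 2
end
t2 = os.clock()
print("Direct global access: " .. (t2 - t1))
print("=================================")

-- Call math.sin through a cached local.
t1 = os.clock()
local sin = math.sin
for i = 1, 100000000 do
    local x = sin(i)
end
t2 = os.clock()
print("Global access optimized through a local variable: " .. (t2 - t1))

-- Look up math.sin on every iteration. Note that x is a global here,
-- so each iteration also pays for a global store.
t1 = os.clock()
for i = 1, 100000000 do
    x = math.sin(i)
end
t2 = os.clock()
print("Direct global access: " .. (t2 - t1))

Test results:

=================================
Global access optimized through a local variable: 0.825
Direct global access: 1.961
=================================
Global access optimized through a local variable: 7.017
Direct global access: 10.264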
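
The repeated os.clock bookkeeping in main.lua could be folded into a small helper. The following is only a sketch: time_it and N are hypothetical names, the iteration count is much smaller than the 100000000 used above so that it finishes quickly, and the timings it prints will differ from the results listed here.

```lua
-- time_it is a hypothetical helper: it runs fn once and reports the CPU time used.
local function time_it(label, fn)
    local start = os.clock()
    fn()
    local elapsed = os.clock() - start
    print(label .. ": " .. elapsed)
    return elapsed
end

local N = 1000000  -- assumption: reduced from the article's 100000000

local cached = time_it("sin cached in a local", function()
    local sin = math.sin
    for i = 1, N do
        local x = sin(i)
    end
end)

local direct = time_it("math.sin looked up each time", function()
    for i = 1, N do
        local x = math.sin(i)
    end
end)
```

Both loops store the result in a local x, so the difference between the two timings reflects only the lookup of math.sin. The loop stays inside the timed function on purpose, since calling a function once per iteration would add overhead larger than the effect being measured.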