我花了一下午的时间完成了一个简单语言的解释器,我会在最后帖出所有代码,但是今天不打算详细解释程序的每一个步骤,最近考虑找实习、做论文,要花一些时间。有时间了我会解释每一部分,在这里只提醒一下读者,程序的写作过程和它呈现出来的不一样,总体来说我的写作过程是先写一个只包含一条指令的解释器,然后逐渐加入其他指令。ps:我是多么的想看看高手们写程序的过程,而不只是结果,但是就像graham说的“写的过程往往显得冗长”,所以这方面的书籍不多。我觉得比较接近的书包括《clean code》、《paip》、《software tools》。只可惜现在只看完了一本半。
想法的来源
昨天圣诞节,坐在不能上网的实验室,觉得好无聊。然后想起了学习《可计算性与计算复杂性》的时候,里面用到了一种含有5条指令的原语言,老师也要求我们用这个语言写程序,所以我就想写一个解释器,来执行用该语言,并把这个解释器作为自己的圣诞礼物。
原语言描述
书中说它是一个fortran式的简单语言,所以我就给它起了个简单的名字Metafor,表示meta fortran。下面来看一下它的具体描述,然后贴上我在代码里更详细的描述,我为每一种指令都起了个名字(注意其中把赋值语句命名为setf是由于我最近在写common lisp多一些)。
+-------------------------------------------------------------------------------------------+
指令 描述
x = x + 1 变量x的值加1
x = x - 1 变元x的值减1。若x的值为0,则结果仍为0
TO A IF x≠0 若x≠0,则转标号为A的指令;否则执行下一条指令
TO A 无条件转到标号为A的指令
y=x 把x的值赋给变元y,x值保持不变
+-------------------------------------------------------------------------------------------+
1. Metafor has five instructions: name instruction description inc: x = x + 1, Increase by 1 the value of the variable x. dec: x = x - 1, If the value of x is 0, leave it unchanged; otherwise decrease by 1 the value of x. con_goto: TO A IF x != 0, If the value of x is nonzero, perform the instruction with the label A next; otherwise proceed to the next instruction in the list. goto: TO A, Perform the instruction with the label A next. setf: y = x, Change the value variable y to the value of variable x. 2. Some specification: (1) Input variables are represented as: x1, x2, x3, ... (2) Local variables are represented as: z1, z2, z3, ... (3) Output variable is represented as: y note: the num 1 is often omitted(i.e., x stands for x1 and z stands for z1). 3. Labeled instructions: Instructions may or may not have labels. When an instruction is labeled, the label is written to its left in square brackets. For example, [B] Z = Z - l 4. A smaple program: "Program to compute y = x1+ x2" y = x1 [B] TO A IF x2 != 0 TO E [A] x2 = x2 - 1 y = y + 1 TO B 4. For more information, please refer to 《可计算性与计算复杂性》, 周长林、李占山著.整体思路
由于该语言中含有goto语句,所以我不能一句一句的解释执行,我需要把整个程序读进来,转换成一种方面的内部格式后,作为整体执行。例如,这是计算y = x + x2的语言程序:
y = x [B] TO A IF x2 != 0 TO E [A] x2 = x2 - 1 y = y + 1 TO B
它转换为内部格式后,为:
[['setf', 'y', 'x'], ['labeled_exp', 'B', ['con_goto', 'A', 'x2']], ['goto', 'E'], ['labeled_exp', 'A', ['dec', 'x2']], ['inc', 'y'], ['goto', 'B']]运行演示——从文件获得程序
还是前面那个y = x + x2的程序,首先我需要为两个输入参数(x, x2)给定值,如果没有为其赋值,那么它们的值默认为0。另外,我们还可以查看程序的内部表示结构,以及简单调试(就查看其他环境变量的值)。其演示结果如下:
====================================================== *Metafor 1.0, Welecome to Metafor shell environment. * *Author: Zhu zhaolong(zzljlu@gmail.com) * ====================================================== Metafor> 2 3 5 Metafor> 5 6 11 Metafor> code [['setf', 'y', 'x'], ['labeled_exp', 'B', ['con_goto', 'A', 'x2']], ['goto', 'E'], ['labeled_exp', 'A', ['dec', 'x2']], ['inc', 'y'], ['goto', 'B']] >>> ========================================== RESTART ========================================== >>> ====================================================== *Metafor 1.0, Welecome to Metafor shell environment. * *Author: Zhu zhaolong(zzljlu@gmail.com) * ====================================================== Metafor> 234 5 239 Metafor> debug debug> x 234 debug> x2 0 debug> y 239 debug> exit >>>运行演示——从终端用户直接输入原语言程序
我写了一个简单的repl交互环境,这样我们可以方便的测试简单的原语言程序,例如下面给出了一个y = x + 1的程序,示例如下:
>>> repl() ====================================================== *Metafor 1.0, Welecome to Metafor shell environment. * *Author: Zhu zhaolong(zzljlu@gmail.com) * ====================================================== Input your program: x = x + 1 y = x inputs> 5 =>6 Input your program:全部代码
在这里我贴出所有python代码:
""" Metafor: meta fortran, is a small language used in "Computabilit and Complexity" course in Jilin University. @Author: Zhu Zhaolong(zzljlu@gmail.com) @Date: 2011.12.25 1. Metafor has five instructions: name instruction description inc: x = x + 1, Increase by 1 the value of the variable x. dec: x = x - 1, If the value of x is 0, leave it unchanged; otherwise decrease by 1 the value of x. con_goto: TO A IF x != 0, If the value of x is nonzero, perform the instruction with the label A next; otherwise proceed to the next instruction in the list. goto: TO A, Perform the instruction with the label A next. setf: y = x, Change the value variable y to the value of variable x. 2. Some specification: (1) Input variables are represented as: x1, x2, x3, ... (2) Local variables are represented as: z1, z2, z3, ... (3) Output variable is represented as: y note: the num 1 is often omitted(i.e., x stands for x1 and z stands for z1). 3. Labeled instructions: Instructions may or may not have labels. When an instruction is labeled, the label is written to its left in square brackets. For example, [B] Z = Z - l 4. A smaple program: "Program to compute y = x1+ x2" y = x1 [B] TO A IF x2 != 0 TO E [A] x2 = x2 - 1 y = y + 1 TO B 4. For more information, please refer to 《可计算性与计算复杂性》, 周长林、李占山著. """ ###======================================================== import re from pprint import pprint ###===========================parse======================== def parse(program): return [read(e) for e in program] def read(s): """Read a metafor insctruction from a string. and change it to internal represtation.""" return read_from(tokenize(s)) def tokenize(s): "Convert a string into alist of tokens." return s.split() def read_from(tokens): "Read an exp from a sequence of tokens." if len(tokens) == 0: raise SyntaxError("uncexpected EOF while reading.") if is_setf(tokens): return make_setf(tokens[0], tokens[2]) elif is_inc(tokens): return make_inc(tokens[0]) elif is_dec(tokens): return make_dec(tokens[0]) elif is_goto(tokens): return make_goto(tokens[1]) elif is_con_goto(tokens): return make_con_goto(tokens[1], tokens[3]) elif is_labeled_exp(tokens): return make_labeled_exp(get_label(tokens[0]), read_from(tokens[1:])) else: raise SyntaxError("unexpected instructions: %s" % " ".join(tokens)) #======================================================= def is_variable(token): if token.startswith("x") or \ token.startswith("z") or \ token == "y": return True else: return False def get_label(token): "Token is like [A1]. we want get A1." return token.replace('[', '').replace(']', '') ## setf def is_setf(tokens): if tokens[1] == "=" and len(tokens) == 3: if is_variable(tokens[0]) and is_variable(tokens[2]): return True else: raise SyntaxError("unexpected setf instruction: %s" % " ".join(tokens)) else: return False def make_setf(des_var, src_var): return ["setf", des_var, src_var] ## inc def is_inc(tokens): if len(tokens) == 5 and tokens[1] == "=" \ and tokens[3] == "+" and tokens[4] == "1": if tokens[0] == tokens[2] and is_variable(tokens[0]): return True else: raise SyntaxError("unexpected inc instruction: %s" % " ".join(tokens)) else: return False def make_inc(var): return ["inc", var] ## dec def is_dec(tokens): if len(tokens) == 5 and tokens[1] == "=" \ and tokens[3] == "-" and tokens[4] == "1": if tokens[0] == tokens[2] and is_variable(tokens[0]): return True else: raise SyntaxError("unexpected dec instruction: %s" % " ".join(tokens)) else: return False def make_dec(var): return ["dec", var] ## goto def is_goto(tokens): if len(tokens) == 2 and tokens[0] == "TO": return True else: return False def make_goto(lable): return ["goto", lable] ## con_goto def is_con_goto(tokens): if len(tokens) == 6 and tokens[0] == "TO" \ and tokens[2] == "IF" and is_variable(tokens[3]) \ and tokens[4] == "!=" and tokens[5] == "0": return True else: return False def make_con_goto(lable, var): return ["con_goto", lable, var] ## labeled exp def is_labeled_exp(tokens): return tokens[0].startswith('[') def make_labeled_exp(label, exp): return ["labeled_exp", label, exp] ###==========================CodeSeg====================== def is_label(s): return s == "labeled_exp" class CodeSeg: """ To store internal instructions (exps). N : the number of instructions. exps: the list of instructions. label_2_exps: a dict of {'label': num} pairs.""" def __init__(self, exps): self.N = len(exps) self.exps = [] self.label_2_exp={} for (i, e) in enumerate(exps): if is_label(e[0]): (_, label, exp) = e self.label_2_exp[label] = i self.exps.append(exp) else: self.exps.append(e) def get_instruction_num(self, label): "Return instruction num with the label." if label in self.label_2_exp.keys(): return self.label_2_exp[label] # IF there is no instruction with that label, # return a instruction num that doesn't exit. else: return self.N def get_instruction(self, pc): return self.exps[pc] ###=============================Env====================== class Env(): "An enviromnet: a dict of {'var': val} paris, var's default value is 0." def __init__(self, args): self.pairs = {} for (i, v) in enumerate(args): if i == 0: self.pairs['x'] = int(v) else: self.pairs['x'+str(i+1)] = int(v) def get_v(self, var): if var in self.pairs.keys(): return self.pairs[var] else: return 0 def set_v(self, var, value): self.pairs[var] = value ###=========================mainloop======================= def mainloop(codeseg, env): "pc refers to the current instruction." pc = 0 while pc < codeseg.N: e = codeseg.get_instruction(pc) if e[0] == 'inc': (_, v) = e env.set_v(v, env.get_v(v) + 1) pc += 1 elif e[0] == 'dec': (_, v) = e env.set_v(v, max(env.get_v(v) - 1, 0)) pc += 1 elif e[0] == 'setf': (_, x, y) = e env.set_v(x, env.get_v(y)) pc += 1 elif e[0] == 'goto': (_, label) = e pc = codeseg.get_instruction_num(label) elif e[0] == "con_goto": (_, label, var) = e val = env.get_v(var) if val != 0: pc = codeseg.get_instruction_num(label) else: pc += 1 else: raise SyntaxError("unexpected instructions: %s" % " ".join(e)) return env.get_v('y') ###========================repl================================ def repl(input_prompt='inputs> '): "A prompt-read-eval-print loop." print_info() while True: program = get_prog_from_shell() if program == []: return "Return successfully!" codes = CodeSeg(parse(program)) inputs = raw_input(input_prompt) env = Env(inputs.split()) print("=>" + str(mainloop(codes, env))) def get_prog_from_shell(): program = [] exp = raw_input("\n\nInput your program:\n") while exp: program.append(exp) exp = raw_input() return program def print_info(): print("======================================================\n" "*Metafor 1.0, Welecome to Metafor shell environment. *\n" "*Author: Zhu zhaolong(zzljlu@gmail.com) *\n" "======================================================\n") ###========================run============================= def run(filepath, prompt='Metafor> '): "This function can run Metafor program from file." print_info() program = get_prog_from_file(filepath) parsed_prog = parse(program) codes = CodeSeg(parsed_prog) inputs = raw_input(prompt) while True: env = Env(inputs.split()) print(mainloop(codes, env)) inputs = raw_input(prompt) if inputs == "exit": return "Return successfully!" if inputs == "debug": while True: var = raw_input('debug> ') if var == "exit": return None print(env.get_v(var)) if inputs == "code": pprint(parsed_prog) return None def get_prog_from_file(filepath): f = open(filepath, "r") program = f.readlines() f.close() return program ###===========================test================================== # y = x + x2 test_file_1 = "metafor_test.mf" # y = 2 * x test_file_2 = "metafor_test2.mf" if __name__ == "__main__": run(test_file_1)