I'm trying to make a function in Python that does the equivalent of compile(), but also lets me get the original string back. Let's call those two functions comp() and decomp(), for disambiguation purposes. That is,
a = comp("2 * (3 + x)", "", "eval")
eval(a, dict(x=3)) # => 12
decomp(a) # => "2 * (3 + x)"
The returned string does not have to be identical ("2*(3+x)" would be acceptable), but it needs to be basically the same ("2 * x + 6" would not be).
Here's what I've tried that doesn't work:
- Setting an attribute on the code object returned by compile. You can't set custom attributes on code objects.
- Subclassing code so I can add the attribute. code cannot be subclassed.
- Setting up a WeakKeyDictionary mapping code objects to the original strings. code objects cannot be weakly referenced.
Here's what does work, with issues:
- Passing in the original code string for the filename to compile(). However, I lose the ability to actually keep a filename there, which I'd like to also do.
- Keeping a real dictionary mapping code objects to strings. This leaks memory, although since compiling is rare, it's acceptable for my current use case. I could probably run the keys through gc.get_referrers periodically and kill off dead ones, if I had to.
2 Solutions
#1
This is kind of a weird problem, and my initial reaction is that you might be better off doing something else entirely to accomplish whatever it is you're trying to do. But it's still an interesting question, so here's my crack at it: I make the original code source an unused constant of the code object.
import types

def comp(source, *args, **kwargs):
    """Compile the source string; takes the same arguments as builtin compile().
    Modifies the resulting code object so that the original source can be
    recovered with decomp()."""
    c = compile(source, *args, **kwargs)
    # Rebuild the code object with the source appended to co_consts.
    # These positional arguments follow the Python 2 CodeType constructor.
    return types.CodeType(c.co_argcount, c.co_nlocals, c.co_stacksize,
                          c.co_flags, c.co_code, c.co_consts + (source,), c.co_names,
                          c.co_varnames, c.co_filename, c.co_name, c.co_firstlineno,
                          c.co_lnotab, c.co_freevars, c.co_cellvars)

def decomp(code_object):
    # The source is the last constant, appended by comp().
    return code_object.co_consts[-1]
>>> a = comp('2 * (3 + x)', '', 'eval')
>>> eval(a, dict(x=3))
12
>>> decomp(a)
'2 * (3 + x)'
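The positional types.CodeType(...) call above matches the Python 2 constructor; on Python 3 the constructor takes additional parameters, so that call would raise a TypeError there. A minimal sketch of the same unused-constant idea, assuming Python 3.8 or later (where code objects have a replace() method), might look like this. The "<expr>" filename is only a placeholder for illustration.

```python
def comp(source, *args, **kwargs):
    """Compile source and append it to co_consts as an unused constant."""
    c = compile(source, *args, **kwargs)
    # replace() returns a copy of the code object with the named fields changed
    # (assumption: Python 3.8+).
    return c.replace(co_consts=c.co_consts + (source,))


def decomp(code_object):
    """Return the source string that comp() stored as the last constant."""
    return code_object.co_consts[-1]


a = comp("2 * (3 + x)", "<expr>", "eval")
print(eval(a, dict(x=3)))  # 12
print(decomp(a))           # 2 * (3 + x)
```

Because only co_consts changes, the filename passed to compile() is still available as a.co_filename, which addresses the drawback of the filename trick mentioned in the question.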
#2
My approach would be to wrap the code object in another object. Something like this:
class CodeObjectEnhanced(object):
    def __init__(self, *args):
        self.compiled = compile(*args)
        self.original = args[0]

def comp(*args):
    return CodeObjectEnhanced(*args)
Then whenever you need the code object itself, you use a.compiled, and whenever you need the original, you use a.original. There may be a way to get eval to treat the new class as though it were an ordinary code object, redirecting the function to call eval(self.compiled) instead.
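As a rough sketch of the "redirect to eval(self.compiled)" idea, one option is to give the wrapper its own evaluation method rather than trying to make the builtin eval accept it. The eval() method and the explicit source/filename/mode parameters below are hypothetical additions for illustration, not part of the class shown above.

```python
class CodeObjectEnhanced:
    """Pairs a compiled code object with the source string it came from."""

    def __init__(self, source, filename, mode):
        self.compiled = compile(source, filename, mode)
        self.original = source

    def eval(self, globals=None, locals=None):
        # Hypothetical convenience method: forwards to the builtin eval()
        # using the wrapped code object.
        return eval(self.compiled, globals, locals)


def comp(source, filename, mode):
    return CodeObjectEnhanced(source, filename, mode)


def decomp(wrapped):
    return wrapped.original


a = comp("2 * (3 + x)", "<expr>", "eval")
print(a.eval(dict(x=3)))  # 12
print(decomp(a))          # 2 * (3 + x)
```

Callers then write a.eval(namespace) instead of eval(a, namespace); the trade-off is that a is no longer a drop-in replacement wherever a real code object is expected.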
One advantage of this is that the original string is deleted at the same time as the code object. However you do this, I think storing the original string is probably the best approach, as you end up with the exact string you used, not just an approximation.