How can I run untrusted code server-side?

I'm trying to run untrusted JavaScript code on Linux with Node.js using the sandbox module, but it's broken. All I need is to let users write JavaScript programs that print out some text. No other I/O is allowed, and only plain JavaScript is to be used, with no other Node modules. If that isn't really possible, what other language would you suggest for this kind of task? The minimal feature set I need is some math, regexes, string manipulation, and basic JSON functions. Scripts will run for, let's say, 5 seconds at most, after which the process would be killed. How can I achieve that?

8 Answers

#1

I recently created a library for sandboxing untrusted code that seems to fit these requirements. It executes the code in a restricted process in the case of Node.js, and in a Worker inside a sandboxed iframe in the case of a web browser:

https://github.com/asvd/jailed

A given set of methods can be exported from the main application into the sandbox, which lets you provide any custom API and set of privileges (that feature was actually the reason I decided to write a library from scratch). The math, regexp, and string-related features you mention are provided by JavaScript itself; anything additional can be explicitly exported from outside (such as a function for communicating with the main application).

#2

The basic idea of a sandbox is that a script needs predefined global variables to do anything, so if you deny it those globals by unsetting them, or by replacing them with controlled versions, it cannot escape. As long as you don't forget anything.

First, deny require() or replace it with something controlled. Don't forget about process and "global", a.k.a. "root". The difficult part is not forgetting anything, which is why it's good to rely on a sandbox that someone else has built ;-)
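A minimal sketch of this idea, assuming Node's built-in vm module as the way to start from an empty set of globals (the helper name runWithBareGlobals and the print() function are made up for illustration). The Node.js documentation states that vm is not a security mechanism, so this only illustrates the principle and should not be treated as a safe sandbox:

```javascript
const vm = require('node:vm');

// Hypothetical helper: run `source` in a fresh context whose only extra
// global is print(), and return everything the script printed.
function runWithBareGlobals(source, timeoutMs = 1000) {
  const lines = [];
  const sandbox = Object.create(null);
  sandbox.print = (text) => { lines.push(String(text)); };
  // The timeout option aborts scripts that run for too long.
  vm.runInNewContext(source, sandbox, { timeout: timeoutMs });
  return lines;
}

const printed = runWithBareGlobals(`
  print(typeof require);   // not defined inside the new context
  print(typeof process);   // not defined inside the new context
  print(JSON.stringify({ sum: Math.max(1, 2) + 1 }));
`);
console.log(printed);
```

Even this tiny example forgets something: print is a function object from the host context, so a script could call print.constructor('return process')() and get the real process object back. That is exactly the "as long as you don't forget anything" problem.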

#3

Ask yourself these questions:

Are you one of the smartest people on the planet?
Do you routinely turn down job offers from Google, Mozilla and Kaspersky Lab because the work would bore you?
Does the "untrusted code" come from people working at the same company as you, or from criminals and bored computer kids all over the globe?
Are you sure that Node.js has no security holes that could leak through your sandbox?
Can you write perfect, 100% error-free code?
Do you know everything about JavaScript?

As you already know from your experiments with the sandbox module, writing your own sandbox isn't trivial. The main problem with sandboxes is that you must get everything right. One mistake will ruin your security completely, which is why browser developers fight a constant battle with crackers all over the globe.

That said, simple sandboxes are pretty easy to do. First, you'll need to write your own JavaScript interpreter, because you can't use the one from Node.js: eval() and require() would both allow crackers to escape your sandbox.

The interpreter must make sure that the interpreted code cannot access anything besides the few global symbols that you provide. This means there can't be an eval() function, for example (or you must make sure that this function is only evaluated in the context of your own JavaScript interpreter).

Drawback of this approach: it's a lot of work, and if you make a mistake in your interpreter, crackers can leave the sandbox.

Another approach is to clean the code and run it with Node.js's eval(). You can clean existing code by running a bunch of regexps over it, for example replacing every match of /eval\s*[(]/g with an empty string, to remove malicious code parts.

Drawback of this approach: it's easy to make a mistake that leaves you vulnerable to an attack. For example, there might be a mismatch between what the regexp engine and what Node.js consider "whitespace". Some obscure Unicode whitespace might be accepted by the interpreter but not by the regexp, which would allow an attacker to run eval().
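To make the drawback concrete, here is a small sketch of how such a regexp filter can be bypassed. It uses two tricks other than the whitespace mismatch described above, and stripEval is a made-up name for the cleaning step:

```javascript
// Naive filter: delete every "eval(" occurrence in a single pass.
function stripEval(source) {
  return source.replace(/eval\s*[(]/g, '');
}

// 1. Nested spelling: removing the inner "eval(" splices a new one together.
const nested = "eveval(al('1 + 1')";

// 2. Alias: the name eval is never directly followed by "(".
const aliased = "var e = eval; e('1 + 1')";

console.log(stripEval(nested));              // prints: eval('1 + 1')
console.log(stripEval(aliased) === aliased); // prints: true (nothing was removed)
```

Both inputs still reach eval() after "cleaning", which suggests that text-level filtering is too fragile to be the only line of defense.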

My suggestion: write a small demo test case that shows how the sandbox module is broken and get it fixed. It will save you a lot of time and effort, and if there is a bug in the sandbox, it won't be your fault (well, not entirely at least).

#4

If you can afford the performance hit, you could run the JS in a throwaway virtual machine with the appropriate CPU and memory limits.

Of course, you are then trusting the security of the VM solution. By using it together with an ordinary JS sandbox, you'd have two layers of security.

For an additional layer, put the sandbox on a different physical machine than your main app.
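The layering idea can be sketched without a full virtual machine by using a throwaway child process as a much weaker stand-in for the VM. The sketch below is only an assumption about how one might wire it up: runInChild, RUNNER, and print() are made-up names, the memory cap of 64 MB is arbitrary, and the inner layer uses Node's vm module, which is not a security boundary by itself. The 5-second default mirrors the limit mentioned in the question:

```javascript
const { spawnSync } = require('node:child_process');

// Code executed inside the child: read the untrusted source from stdin and
// evaluate it in a fresh vm context whose only extra global is print().
const RUNNER = `
  const vm = require('node:vm');
  let src = '';
  process.stdin.setEncoding('utf8');
  process.stdin.on('data', (chunk) => { src += chunk; });
  process.stdin.on('end', () => {
    const sandbox = Object.create(null);
    sandbox.print = (text) => { process.stdout.write(String(text) + '\\n'); };
    vm.runInNewContext(src, sandbox);
  });
`;

// Outer layer: a separate process with a heap cap that is killed with
// SIGKILL once the time limit is exceeded.
function runInChild(source, timeoutMs = 5000) {
  const child = spawnSync(
    process.execPath,
    ['--max-old-space-size=64', '-e', RUNNER],
    { input: source, encoding: 'utf8', timeout: timeoutMs, killSignal: 'SIGKILL' }
  );
  // status is 0 only when the child exited normally without an uncaught error.
  return { ok: child.status === 0, stdout: child.stdout };
}

console.log(runInChild('print(6 * 7)'));
```

A script that loops forever, such as runInChild('while (true) {}', 500), comes back with ok set to false because the child is killed when the timeout expires.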

#5

Docker.io is an awesome new kid on the block that uses LXC and cgroups to create sandboxes.

Here is one implementation of an online gist (similar to codepad.org) using Docker and Go.

This demonstrates that untrusted code written in many programming languages, including Node.js, can be run safely inside Docker containers.
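As a rough sketch of what launching such a container from Node.js could look like, the helper below builds a docker run command line with network access disabled and CPU, memory, and process limits set. The function names, the node:20-alpine image tag, the mount path, and the specific limits are all assumptions chosen for illustration, not something taken from the implementation mentioned above:

```javascript
const { spawnSync } = require('node:child_process');
const path = require('node:path');

// Build the argument list for `docker run`. Image and limits are
// illustrative assumptions only.
function buildDockerArgs(scriptPath, image = 'node:20-alpine') {
  const hostPath = path.resolve(scriptPath);
  return [
    'run', '--rm',
    '--network', 'none',       // no network access
    '--memory', '64m',         // memory limit
    '--cpus', '0.5',           // CPU limit
    '--pids-limit', '64',      // cap the number of processes
    '--read-only',             // read-only root filesystem
    '--cap-drop', 'ALL',       // drop Linux capabilities
    '-v', `${hostPath}:/sandbox/script.js:ro`,
    image,
    'node', '/sandbox/script.js',
  ];
}

// Defined but not invoked here, since it needs a working Docker installation.
function runInDocker(scriptPath, timeoutMs = 5000) {
  return spawnSync('docker', buildDockerArgs(scriptPath), {
    encoding: 'utf8', timeout: timeoutMs, killSignal: 'SIGKILL',
  });
}

console.log(['docker', ...buildDockerArgs('user-script.js')].join(' '));
```

One caveat with this sketch: killing the docker client process on timeout does not necessarily stop the container, so a real setup would probably also need to stop or remove the container explicitly.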

#6

I am facing a similar problem right now, and I'm reading only bad things about the sandbox module.

If you don't need anything specific to the Node environment, I think the best approach is to use a headless browser such as PhantomJS or Chimera as the sandbox environment.

#7

I know it's pretty late to answer this question, but the tool below might add some value, since it isn't mentioned in the answers or comments above.

I'm trying to implement a similar use case. After going through the resources on the web, https://www.npmjs.com/package/vm2 seems to handle the sandbox environment (Node.js) pretty well.

It pretty much satisfies the sandboxing requirements, such as restricting access to built-in or external modules, exchanging data with the sandbox, etc.

#8

A late answer, but maybe an interesting idea.

Static code analysis => AST manipulation => Code generation

Static analysis parses the source code into an AST. The AST provides a common data structure that lets us traverse and modify the source code.
Via AST manipulation, we can find all the identifier references to any sensitive variables in the outer scopes. If needed, we can re-declare and initialize them at the beginning of the function body so as to overwrite them. All references from the inside to the outside are then under control.
Generating code from the AST is easy as well.

For instance, take the function shown below:

function () {
    a = 1;
    window.b = 1;
    eval('window.c()');
}

Static analysis based on a JS code parser enables us to insert variable declaration statements at the start of the original function body:

function () {
    var a, window = {}, eval = function () {}; // variable overwriting
    a = 1;
    window.b = 1;
    eval('window.c()');
}

That's it.

More overwrites should be considered, such as eval(), new Function(), and other global objects or APIs. Warnings raised during parsing should also be well organized and reported.
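The overwriting step can be sketched without a parser. The snippet below is a simplified stand-in rather than the AST-based pipeline described above: it skips the analysis entirely and blindly prepends a var declaration for a fixed list of names (SHADOWED and wrapWithShadowing are made-up names, and the list is an assumption that is certainly incomplete):

```javascript
// Names to shadow inside the generated function. Illustrative, not exhaustive.
const SHADOWED = ['window', 'global', 'globalThis', 'process', 'require', 'eval', 'Function'];

// Generate a function whose body starts with `var window, global, ...;`
// so those identifiers resolve to local, undefined variables.
function wrapWithShadowing(body) {
  const declaration = `var ${SHADOWED.join(', ')};`;
  return new Function(`${declaration}\n${body}`);
}

const shadowed = wrapWithShadowing('return typeof process + "," + typeof eval;');
console.log(shadowed()); // prints: undefined,undefined

// The shadowed Function name can still be reached through any function's
// constructor property, which shows why more overwrites are needed.
const escaped = wrapWithShadowing(
  'return (function () {}).constructor("return typeof process")();'
);
console.log(escaped()); // prints: object (under Node.js)
```

The second call illustrates the point about considering more overwrites: shadowing identifiers by name does not cover values that are reachable through property access.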

Some related work in order:

#1