I have an application where I am sequentially downloading mp3 files from a server, storing them temporarily on my server, and then streaming them directly to clients, like so:
var http = require('http');
var fs = require('fs');

function downloadNextTrack() {
    var request = http.get('http://mp3server.com', function(response) {
        response.on('data', function(data) {
            fs.appendFile('sometrack.mp3', data, function(err) {});
        });
        response.on('end', function() {
            streamTrack('sometrack.mp3');
        });
    });
}
var clients = []; // client response objects are pushed to this array when they request the stream through a route like /stream.mp3
var stream;
function streamTrack(track) {
    stream = fs.createReadStream(track);
    stream.on('data', function(data) {
        clients.forEach(function(client) {
            client.write(data);
        });
    });
    stream.on('end', function() {
        downloadNextTrack(); // does the same thing with the next track
    });
}
Apparently this code is creating a lot of buffers which aren't being freed by the OS. When I run the 'free -m' command, this is what I get (after about 4 hours of running the app):
                    total       used       free     shared    buffers     cached
Mem:                  750        675         75          0         12        180
-/+ buffers/cache:                481        269
Swap:                 255        112        143
The number under 'buffers' constantly rises (as does the cached memory), and the OS apparently never reclaims that 180MB, until eventually my app runs out of memory and crashes when I try to spawn a small process to verify a track's bitrate, sampling rate, ID3 info, etc.
I have investigated with a lot of different tools (such as memwatch and nodetime) to find out whether it was an internal memory leak, and it isn't: the V8 memory heap as well as the Node RSS vary by +/- 10MB but stay constant for the most part, while the OS free memory gets lower and lower (when the Node process starts I have about 350MB of free memory).
I read somewhere that Buffer instances allocated by Node have direct access to raw memory, so V8 has no control over them (which checks out with the fact that I am not getting memory leaks from the V8 heap). The thing is, I need a way to get rid of these old buffers. Is this possible? Or will I have to restart my app every 5 hours or so (or worse, buy more RAM!)?
PS. I am running Node v0.8.16 on Ubuntu 10.04.
2 Answers
#1 (score: 2)
I agree with Tiago; I think this is caused by the recursive nature of your code. I don't think the streams are what's gobbling up your heap because, as you said, the stream variable is reassigned with a new ReadStream on every iteration. However, the http.get request and response (and whatever Buffers they use) in line 2 are never released before the next iteration is called; they are scoped within the downloadNextTrack function. You end up with a recursive stack trace that holds a set of request and response objects (and some underlying buffers) per file.
In general, if this code needs to run many, many times, why not opt out of the recursion and do it all iteratively? A never-ending recursion will always gobble more and more memory until the program crashes, even if there are no memory leaks on your part.
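One way to sketch that iterative suggestion: keep a queue of work and schedule each step on a fresh tick with process.nextTick, so each callback returns (and its request/response objects become collectable) before the next step starts. The counter below is a stand-in for the per-track download work:

```javascript
// Drain a queue one item per tick instead of calling the next step
// directly from inside the previous step's callback.
var remaining = 1000; // stand-in for "tracks left to download"
var processed = 0;

function step() {
    if (remaining === 0) {
        console.log('processed ' + processed + ' items');
        return;
    }
    remaining--;
    processed++;
    // Returning to the event loop between items means nothing from this
    // step stays referenced when the next one runs.
    process.nextTick(step);
}
step();
```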
#2 (score: 0)
Read this: http://www.linuxatemyram.com
Buffer cache is a cache for inodes and dentries (filesystem structures). That memory is still available to processes. You should not care about this.
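To make that concrete with the numbers from the question's free -m output: the 'buffers' and 'cached' columns are reclaimable, so the memory actually available to processes is the '-/+ buffers/cache' free figure, not the 'Mem:' free figure. A quick check of the arithmetic:

```javascript
// Figures from the question's `free -m` output (in MB).
var mem = { total: 750, used: 675, free: 75, buffers: 12, cached: 180 };

// What applications can actually use = free + buffers + cached.
var availableToApps = mem.free + mem.buffers + mem.cached;

// 267 MB, which matches the 269 shown on the -/+ buffers/cache line
// up to per-column rounding in free's output.
console.log(availableToApps + ' MB available');
```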