I have a PHP script that uses Doctrine 2 and Zend to calculate some things from a database and send emails to 30,000 users.
My script is leaking memory, and I want to know which objects are consuming that memory and, if possible, what is holding references to them (and so preventing them from being released).
I'm using PHP 5.3.x, so plain circular references shouldn't be the problem.
I've tried using Xdebug's trace capabilities to get mem_delta, with no success (too much data).
I've tried manually adding memory_get_usage() calls before and after the important functions, but the only conclusion I reached is that I lose around 400 KB per user, and 3,000 users times that uses up the roughly 1 GB I have available.
Are there any other ways to find out where and why memory is leaking? Thanks.
3 Answers
#1
2
You could try sending, say, 10 emails and then inserting this:
get_defined_vars();
http://nz.php.net/manual/en/function.get-defined-vars.php
Put it at the end of the script or right after the email is sent (depending on how your code is set up).
This should tell you what is still loaded, and what you can unset or turn into a reference.
Also, if there are too many things loaded, you can capture this near the start and the end of your code and work out the difference.
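A minimal sketch of that start/end comparison, assuming a hypothetical helper named `summarizeVars()` and stand-in variables (`$user`, `$emails`) that are not from the question. Note that get_defined_vars() only sees the current scope, so objects held internally by a library will not show up here.

```php
<?php
// Hypothetical helper: reduce a get_defined_vars() snapshot to
// "variable name => class name or type" so two snapshots are easy to compare.
function summarizeVars(array $vars)
{
    $summary = array();
    foreach ($vars as $name => $value) {
        $summary[$name] = is_object($value) ? get_class($value) : gettype($value);
    }
    return $summary;
}

function demo()
{
    // Snapshot at the start of the scope: nothing is defined yet.
    $before = summarizeVars(get_defined_vars());

    // Stand-ins for whatever the real script loads while sending emails.
    $user = new stdClass();
    $emails = array('a@example.com');

    // Snapshot at the end of the scope.
    $after = summarizeVars(get_defined_vars());
    unset($after['before']); // the snapshot variable itself is not interesting

    // Keep only the variables that appeared between the two snapshots.
    return array_diff_key($after, $before);
}

$new = demo();
print_r($new);
```

Anything listed in the difference is still alive in that scope and is a candidate for unset().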
#2
2
Hydrating 30,000 objects is quite a lot. Doctrine 2 is stable, but it has some bugs, so I am not too surprised by your memory leak problems.
That said, with smaller data sets I had good success using Doctrine's batch processing capabilities and creating an iterable result.
You can use the code from the examples and add a gc_collect_cycles() call after each batch. You have to test it, but for me batch sizes of around 100 worked quite well; that number gave a good balance between performance and memory usage.
It's quite important that the script records which entities were processed, so that it can be restarted without problems and resume normal operation without sending emails twice.
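$batchSize = 20;
$i = 0;
$q = $em->createQuery('select u from MyProject\Model\User u');
$iterableResult = $q->iterate();
while (($row = $iterableResult->next()) !== false) {
    $entity = $row[0];
    // do stuff with $entity here
    // mark entity as processed
    ++$i;
    if (($i % $batchSize) === 0) {
        $em->flush();        // write the "processed" marks for this batch
        $em->clear();        // detach all entities so they can be freed
        gc_collect_cycles(); // collect any leftover reference cycles
    }
}
$em->flush(); // persist the final, possibly partial, batch
$em->clear();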
Anyhow, maybe you should rethink the architecture of that script a bit, as an ORM is not well suited to processing large chunks of data. Maybe you can get away with working on the raw SQL rows?
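A rough sketch of the raw-rows idea, under stated assumptions: `processUserRows()` and the `id`/`email` fields are hypothetical names, and an ArrayIterator stands in for the database result so the snippet runs on its own. The point is only that each row is a plain array, so nothing is hydrated or tracked between iterations.

```php
<?php
// Hypothetical helper: $rows is anything foreach can walk (an array or a
// Traversable). A PDOStatement is Traversable, so it could be passed directly.
function processUserRows($rows, $sendEmail)
{
    $sent = 0;
    foreach ($rows as $row) {
        // $row is a plain array: no entity, no identity map, no unit of work.
        call_user_func($sendEmail, $row['email'], $row);
        ++$sent;
    }
    return $sent;
}

// Stand-in data so the sketch runs without a database.
$rows = new ArrayIterator(array(
    array('id' => 1, 'email' => 'a@example.com'),
    array('id' => 2, 'email' => 'b@example.com'),
));

$outbox = array();
$count = processUserRows($rows, function ($email, $row) use (&$outbox) {
    $outbox[] = $email; // a real script would send the mail here
});

echo $count, " rows processed\n";
```

With PDO, the first argument could be the statement returned by `$pdo->query(...)` for whatever table holds the users. On MySQL it may also be worth disabling `PDO::MYSQL_ATTR_USE_BUFFERED_QUERY`, so the driver does not load the whole result set into memory first.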
#3
0
This isn't a tool that will give you exactly what you need, but it may help. If you haven't already, you can implement the identity map pattern: every time you create an object, it gets registered with the identity map, so at any time you can ask the map which objects are loaded or tell it to unload them.
http://martinfowler.com/eaaCatalog/identityMap.html
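A minimal sketch of such a map, assuming hypothetical class and method names (`IdentityMap`, `register()`, `counts()`, `clear()`); it is not Doctrine's implementation. Doctrine 2 keeps its own identity map inside the EntityManager's unit of work, and `$em->clear()` is the usual way to empty it.

```php
<?php
// Hypothetical minimal identity map; all names here are illustrative only.
class IdentityMap
{
    // class name => (id => object)
    private $objects = array();

    public function register($class, $id, $object)
    {
        $this->objects[$class][$id] = $object;
    }

    public function get($class, $id)
    {
        return isset($this->objects[$class][$id]) ? $this->objects[$class][$id] : null;
    }

    // Returns class name => number of instances currently held.
    public function counts()
    {
        return array_map('count', $this->objects);
    }

    // Drops every reference the map holds so the objects can be freed.
    public function clear()
    {
        $this->objects = array();
    }
}

$map = new IdentityMap();
$map->register('User', 1, new stdClass());
$map->register('User', 2, new stdClass());

$loaded = $map->counts(); // what is loaded right now, per class
print_r($loaded);

$map->clear(); // unload everything, e.g. after each batch of users
```

Printing the counts every few hundred users shows which class keeps growing. That is usually a quicker signal than raw memory_get_usage() numbers.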