I am running a script that extracts information from Debian packages and saves it to a database.
After extracting information from about 100 packages, an error occurs: "error: can't start new thread". Why am I getting this error, and what are the possible solutions?
This is the code used to save the data:
for i in list_pack:
    if not i in oblist:
        # Create the package object
        slu = slugify(i)
        ob = Gbobject()
        ob.title = i
        ob.slug = slu
        ob.content = ''
        ob.tags = tagname
        #with reversion.create_revision():
        ob.save()

        # Attach the object type and current site, then re-save
        gb = Gbobject.objects.get(title=i)
        gb.objecttypes.add(Objecttype.objects.get(title='Packages'))
        gb.sites.add(Site.objects.get_current())
        #with reversion.create_revision():
        gb.save()

        gd = Gbobject.objects.get(title=i)
        print 'The Object created was', gd

        # Description
        try:
            atv = package_description_dict[i]
            atvalue = unicode(atv, 'utf-8')
        except UnicodeDecodeError:
            pass
        try:
            lo = Gbobject.objects.get(title=i)
            loid = NID.objects.get(id=lo.id)
        except Gbobject.DoesNotExist:
            pass

        # Save the description as an Attribute of the package
        if atvalue != '':
            slu = slugify(atvalue)
            at = Attribute()
            at.title = atvalue
            at.slug = slu
            at.svalue = atvalue
            at.subject = loid
            att = Attributetype.objects.get(title='Description')
            at.attributetype = att
            #with reversion.create_revision():
            at.save()
            print 'yeloow13'
Just like Description, there are around 12 more properties of the package which are saved in a similar way.
This is the full traceback that I get when the error occurs:
error Traceback (most recent call last)
/home/radhika/Desktop/dev_75/gnowsys-studio/demo/<ipython console> in <module>()
/usr/local/lib/python2.6/dist-packages/django_gstudio-0.3.dev-py2.6.egg/gstudio/Harvest/debdata.py in <module>()
1086 # create_attribute_type()
1087 # create_relation_type()
-> 1088 create_objects()
1089 #create_sec_objects()
1090 #create_relations()
/usr/local/lib/python2.6/dist-packages/django_gstudio-0.3.dev-py2.6.egg/gstudio/Harvest/debdata.py in create_objects()
403 ob.sites.add(Site.objects.get_current())
404 #with reversion.create_revision():
--> 405 ob.save()
406 #time.sleep(10)
407 #gd=Gbobject.objects.get(title=i)
/usr/local/lib/python2.6/dist-packages/django_reversion-1.6.0-py2.6.egg/reversion/revisions.pyc in do_revision_context(*args, **kwargs)
298 try:
299 try:
--> 300 return func(*args, **kwargs)
301 except:
302 exception = True
/usr/local/lib/python2.6/dist-packages/django_gstudio-0.3.dev-py2.6.egg/objectapp/models.pyc in save(self, *args, **kwargs)
658 @reversion.create_revision()
659 def save(self, *args, **kwargs):
--> 660 super(Gbobject, self).save(*args, **kwargs) # Call the "real" save() method.
661
662
/usr/local/lib/python2.6/dist-packages/django_reversion-1.6.0-py2.6.egg/reversion/revisions.pyc in do_revision_context(*args, **kwargs)
298 try:
299 try:
--> 300 return func(*args, **kwargs)
301 except:
302 exception = True
/usr/local/lib/python2.6/dist-packages/django_gstudio-0.3.dev-py2.6.egg/gstudio/models.pyc in save(self, *args, **kwargs)
327 @reversion.create_revision()
328 def save(self, *args, **kwargs):
--> 329 super(Node, self).save(*args, **kwargs) # Call the "real" save() method.
330
331
/usr/local/lib/python2.6/dist-packages/django/db/models/base.pyc in save(self, force_insert, force_update, using)
458 if force_insert and force_update:
459 raise ValueError("Cannot force both insert and updating in model saving.")
--> 460 self.save_base(using=using, force_insert=force_insert, force_update=force_update)
461
462 save.alters_data = True
/usr/local/lib/python2.6/dist-packages/django/db/models/base.pyc in save_base(self, raw, cls, origin, force_insert, force_update, using)
568 if origin and not meta.auto_created:
569 signals.post_save.send(sender=origin, instance=self,
--> 570 created=(not record_exists), raw=raw, using=using)
571
572
/usr/local/lib/python2.6/dist-packages/django/dispatch/dispatcher.pyc in send(self, sender, **named)
170
171 for receiver in self._live_receivers(_make_id(sender)):
--> 172 response = receiver(signal=self, sender=sender, **named)
173 responses.append((receiver, response))
174 return responses
/usr/local/lib/python2.6/dist-packages/django_gstudio-0.3.dev-py2.6.egg/objectapp/signals.pyc in wrapper(*args, **kwargs)
65 if inspect.getmodulename(fr[1]) == 'loaddata':
66 return
---> 67 signal_handler(*args, **kwargs)
68
69 return wrapper
/usr/local/lib/python2.6/dist-packages/django_gstudio-0.3.dev-py2.6.egg/objectapp/signals.pyc in ping_external_urls_handler(sender, **kwargs)
90 from objectapp.ping import ExternalUrlsPinger
91
---> 92 ExternalUrlsPinger(gbobject)
93
94
/usr/local/lib/python2.6/dist-packages/django_gstudio-0.3.dev-py2.6.egg/objectapp/ping.pyc in __init__(self, gbobject, timeout, start_now)
153 threading.Thread.__init__(self)
154 if start_now:
--> 155 self.start()
156
157 def run(self):
/usr/lib/python2.6/threading.pyc in start(self)
472 _active_limbo_lock.release()
473 try:
--> 474 _start_new_thread(self.__bootstrap, ())
475 except Exception:
476 with _active_limbo_lock:
error: can't start new thread
I am not writing any code to handle the threads.
2 Answers
#1
13
Sorry to have an incomplete solution here; I don't have the reputation to post in the comment section.
One thing to check is the total number of threads you have running. I have some code that checks the core count (with sys), then launches threads and monitors core loading to test how the OS handles thread distribution, and I've found that Windows 7 (for example) seems to throw an error beyond 32 threads on an 8 (logical) core CPU. [That's Python 2.7, 32-bit, inside 64-bit Windows 7, etc.; YMMV.] On other machines I can get past 1,000 threads.
So I guess the short version is: how many threads do you already have going when you get that error? You can check with
threading.active_count()
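For example, a minimal sketch of such a check, dropped into the question's loop (assuming the rest of the save code stays unchanged), would print the live thread count every few packages so you can see whether it grows until the error appears:

import threading

# Hypothetical instrumentation: report the live thread count every 10 packages.
# `list_pack` is the package list from the question's code.
for n, i in enumerate(list_pack):
    if n % 10 == 0:
        print 'packages processed:', n, '| active threads:', threading.active_count()
    # ... existing code that creates and saves the Gbobject for package i ...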
Beyond that, you don't really give the threading part of the code here, so I'd direct you to this excellent Python Central page.
You may also benefit from this previous Stack Overflow discussion on large thread task counts and how to approach them.
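For reference, one common way to keep a large number of thread tasks under control (a generic sketch, not taken from those links; `tasks` here is a hypothetical list of callables) is to feed them to a small, fixed pool of worker threads instead of starting one thread per task:

import threading
import Queue  # `queue` in Python 3

def worker(q):
    # Each worker pulls tasks until the queue is drained.
    while True:
        try:
            task = q.get_nowait()
        except Queue.Empty:
            return
        task()          # run one unit of work
        q.task_done()

q = Queue.Queue()
for task in tasks:      # `tasks` is a hypothetical list of callables
    q.put(task)

# A fixed, small number of threads instead of one thread per task.
threads = [threading.Thread(target=worker, args=(q,)) for _ in range(8)]
for t in threads:
    t.start()
q.join()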
Again, my apologies that this is more of a direction to look in than a solution; I think more information is likely needed to help us understand what you're running into.
#2
0
You're using a 32-bit system and running out of virtual memory. One of your libraries is likely spawning threads and not reclaiming them correctly. As a workaround, try reducing the default thread stack size with threading.stack_size.
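As a rough sketch of that workaround (the 512 KiB value below is only an illustrative choice, not something prescribed by the answer), the stack size has to be set before the thread-spawning code runs, e.g. near the top of the harvesting script:

import threading

# Shrink the stack reserved for each new thread (the platform default can be
# several MB, which exhausts a 32-bit address space quickly). Must be called
# before the threads are created; some platforms enforce a 32 KiB minimum.
threading.stack_size(512 * 1024)

Every thread started after this call uses the smaller stack, so many more threads fit into the process's address space.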