I have the following select statement (using sqlite3 and the pysqlite module):
self.cursor.execute("SELECT precursor_id FROM MSMS_precursor "+
"JOIN spectrum ON spectrum_id = spectrum_spectrum_id "+
"WHERE spectrum_id = spectrum_spectrum_id "+
"AND ROUND(ion_mz,9) = ? AND ROUND(scan_start_time,4) = ? "+
"AND msrun_msrun_id = ?", select_inputValues)
This takes 55 seconds when run from Python, but only 15 ms when run directly on the SQLite command line. I noticed that during this step the Python program goes into uninterruptible sleep (the D in the top output: 31283 ndeklein 18 0 126m 24m 3192 D 1.0 0.0 2:02.50 python) and drops from 100% CPU to around 1% CPU. Having noticed it during this query, I also looked at the top output while running the query I asked about here. During that query top also shows the process going into uninterruptible sleep, although it switches back and forth between R and D and only slows down to around 50% CPU (it fluctuates depending on whether it is in D or R status).
So now I think that this is what is slowing my queries down (please correct me if uninterruptible sleep has nothing to do with a program's speed). If this is true, how can I make sure a program does not go into this state?
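To put comparable numbers on the Python side, the query can be timed in isolation from the rest of the program. The following is only a minimal sketch using Python 3's standard sqlite3 module; the helper name timed_query and the tiny in-memory table are assumptions made so that the example runs on its own, not part of the original program.

```python
import sqlite3
import time

def timed_query(conn, sql, params=()):
    """Run a query, fetch every row, and return (rows, seconds elapsed)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    return rows, elapsed

# Hypothetical stand-in table, only so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spectrum (spectrum_id INTEGER PRIMARY KEY, "
             "msrun_msrun_id INTEGER)")
conn.executemany("INSERT INTO spectrum VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])

rows, elapsed = timed_query(
    conn, "SELECT spectrum_id FROM spectrum WHERE msrun_msrun_id = ?", (1,))
print(rows, "fetched in %.6f s" % elapsed)
```

Timing the execute and fetch together matters because SQLite does most of its work while rows are being stepped through, not when the statement is prepared.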
Update 1:
The EXPLAIN QUERY PLAN using Python returned:
(0, 0, 1, u'SCAN TABLE spectrum (~50000 rows)')
The EXPLAIN QUERY PLAN using sqlite's command line returned:
0|0|1|SCAN TABLE spectrum (~50000 rows)
0|1|0|SEARCH TABLE MSMS_precursor USING INDEX fk_MSMS_precursor_spectrum_spectrum_id_1 (spectrum_spectrum_id=?) (~2 rows)
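The Python output above contains a single tuple while the command line shows two rows. One possible explanation, which is an assumption and not something the outputs confirm, is that only the first row was fetched on the Python side. A minimal sketch of collecting every plan row with Python 3's sqlite3 module follows; the helper name query_plan and the reduced schema are hypothetical and exist only to make the example self-contained.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return every row of EXPLAIN QUERY PLAN for the given statement."""
    cur = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return cur.fetchall()  # fetchone() would return only the first plan step

# Hypothetical reduced schema; the real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spectrum (
        spectrum_id INTEGER PRIMARY KEY,
        scan_start_time REAL,
        msrun_msrun_id INTEGER
    );
    CREATE TABLE MSMS_precursor (
        precursor_id INTEGER PRIMARY KEY,
        ion_mz REAL,
        spectrum_spectrum_id INTEGER
    );
""")

sql = ("SELECT precursor_id FROM MSMS_precursor "
       "JOIN spectrum ON spectrum_id = spectrum_spectrum_id "
       "WHERE ROUND(ion_mz, 9) = ? AND ROUND(scan_start_time, 4) = ? "
       "AND msrun_msrun_id = ?")

plan = query_plan(conn, sql, (438.718658447, 692.6345, 1))
for row in plan:
    print(row)  # the human-readable step description is the last field
```

The exact wording of each step differs between SQLite versions, so the printed text will not necessarily match the output quoted above.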
The EXPLAIN using Python returned:
(0, u'Trace', 0, 0, 0, u'', u'00', None)
The EXPLAIN using sqlite returned:
0|Trace|0|0|0||00|
1|Real|0|1|0|438.718658447|00|
2|Real|0|2|0|692.6345000000001|00|
3|Integer|1|3|0||00|
4|Goto|0|39|0||00|
5|OpenRead|1|33|0|13|00|
6|OpenRead|0|39|0|5|00|
7|OpenRead|2|41|0|keyinfo(1,BINARY)|00|
8|Rewind|1|35|0||00|
9|Column|1|8|5||00|
10|RealAffinity|5|0|0||00|
11|Integer|4|6|0||00|
12|Function|2|5|4|round(2)|02|
13|Ne|2|34|4||6a|
14|Column|1|12|4||00|
15|Ne|3|34|4|collseq(BINARY)|6c|
16|Column|1|0|8||00|
17|IsNull|8|34|0||00|
18|Affinity|8|1|0|d|00|
19|SeekGe|2|34|8|1|00|
20|IdxGE|2|34|8|1|01|
21|IdxRowid|2|7|0||00|
22|Seek|0|7|0||00|
23|Column|1|0|9||00|
24|Column|2|0|10||00|
25|Ne|10|33|9|collseq(BINARY)|6b|
26|Column|0|1|5||00|
27|RealAffinity|5|0|0||00|
28|Integer|9|6|0||00|
29|Function|2|5|11|round(2)|02|
30|Ne|1|33|11||6a|
31|Column|0|0|13||00|
32|ResultRow|13|1|0||00|
33|Next|2|20|0||00|
34|Next|1|9|0||01|
35|Close|1|0|0||00|
36|Close|0|0|0||00|
37|Close|2|0|0||00|
38|Halt|0|0|0||00|
39|Transaction|0|0|0||00|
40|VerifyCookie|0|31|0||00|
41|TableLock|0|33|0|spectrum|00|
42|TableLock|0|39|0|MSMS_precursor|00|
43|Goto|0|5|0||00|
And iostat returned:
io-bash-3.2$ iostat
Linux 2.6.18-194.26.1.el5 (ningal.cluster.lifesci.ac.uk) 06/04/2012
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          14.35    0.00    0.30    0.01    0.00   85.34

Device:     tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda        1.16         4.55        17.22    1520566    5752802
sda1       0.00         0.02         0.00       5074         34
sda2       1.16         4.53        17.22    1515184    5752768
sdb        0.00         0.02         0.00       5108          0
dm-0       2.29         3.88        16.70    1297226    5579336
dm-1       0.00         0.00         0.00        928          0
dm-2       0.11         0.65         0.52     216106     173432
Update 2:
I migrated the database to MySQL, and there the query takes only about 0.001 seconds, even though all the other queries I run are actually slower than in SQLite (I optimized for SQLite, so this may or may not be surprising).
2 solutions
#1
2
As I mentioned in an answer to a prior question you asked: have you given the SQLite module APSW a try? From the website:
APSW is a Python wrapper for the SQLite embedded relational database engine. In contrast to other wrappers such as pysqlite it focuses on being a minimal layer over SQLite attempting just to translate the complete SQLite API into Python. The documentation has a section on the differences between APSW and pysqlite.
I tried it myself, and it does seem to better reflect how SQL statements are executed by the "real" SQLite (i.e. the client or the C library).
#2
0
There is a performance issue with SQLite and Python. Read this thread for more information. It contains a few suggestions that might work, such as adding an index to your join fields or using pysqlite.
http://www.mail-archive.com/python-list@python.org/msg253067.html
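As a rough illustration of the index suggestion, the sketch below shows how an index on the join column changes the plan for the per-row lookup a join performs. It uses Python 3's sqlite3 module; the reduced table definition, the helper plan_details, and the index name idx_precursor_spectrum are all hypothetical. Note that the command-line plan in the question already reports an index on spectrum_spectrum_id, so this may not be what is missing there.

```python
import sqlite3

def plan_details(conn, sql, params=()):
    """Return the description text of each EXPLAIN QUERY PLAN step."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]

# Hypothetical reduced table; the real one has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MSMS_precursor ("
             "precursor_id INTEGER PRIMARY KEY, "
             "ion_mz REAL, "
             "spectrum_spectrum_id INTEGER)")

# The lookup a join performs once for every row of the other table.
lookup = "SELECT ion_mz FROM MSMS_precursor WHERE spectrum_spectrum_id = ?"

before = plan_details(conn, lookup, (1,))  # no index yet: full table scan

conn.execute("CREATE INDEX idx_precursor_spectrum "
             "ON MSMS_precursor (spectrum_spectrum_id)")

after = plan_details(conn, lookup, (1,))   # planner can now use the index

print("before:", before)
print("after: ", after)
```

Comparing the two plans before and after creating an index is a cheap way to check whether the planner actually picks it up for a given query.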