I have a query that is returning in vastly different amounts of time between 2 datasets. For one set (database A) it returns in a few seconds, for the other (database B)....well I haven't waited long enough yet, but over 10 minutes. I have dumped both of these databases to my local machine where I can reproduce the issue running MySQL 5.1.37.
我有一个查询在2个数据集之间返回的时间差异很大。对于一组(数据库A),它会在几秒钟后返回,而另一组(数据库B)......我还没有等待足够长的时间,但是超过10分钟。我已将这两个数据库转储到本地计算机,我可以重现运行MySQL 5.1.37的问题。
Curiously, database B is smaller than database A.
奇怪的是,数据库B小于数据库A.
A stripped down version of the query that reproduces the problem is:
重现问题的查询的精简版本是:
SELECT * FROM po_shipment ps
JOIN po_shipment_item psi USING (ship_id)
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0'
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1'
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2'
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3'
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4'
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5'
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;
The EXPLAIN query plan for the first database (A) that returns in ~2 seconds is:
在~2秒内返回的第一个数据库(A)的EXPLAIN查询计划是:
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | ps | range | PRIMARY,IX_ETA_DATE | IX_ETA_DATE | 4 | NULL | 174 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | ev0 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev1 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev2 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev3 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev4 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev5 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | psi | ref | PRIMARY,IX_po_shipment_item_po_shipment1,FK_po_shipment_item_po_shipment1 | IX_po_shipment_item_po_shipment1 | 4 | UNIVIS_PROD.ps.ship_id | 1 | |
| 1 | SIMPLE | pa | ref | IX_po_alloc_po_shipment_item2,IX_po_alloc_po_details_old,FK_po_alloc_po_shipment1,FK_po_alloc_po_shipment_item1,FK_po_alloc_po_header1 | FK_po_alloc_po_shipment1 | 4 | UNIVIS_PROD.psi.ship_id | 5 | Using where |
| 1 | SIMPLE | ph | eq_ref | PRIMARY,IX_HDR_ID | PRIMARY | 4 | UNIVIS_PROD.pa.hdr_id | 1 | |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+----------------------------------------------+
The EXPLAIN query plan for the second database (B) that returns in >600 seconds is:
在> 600秒内返回的第二个数据库(B)的EXPLAIN查询计划是:
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+--------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+--------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | ps | range | PRIMARY,IX_ETA_DATE | IX_ETA_DATE | 4 | NULL | 38 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | psi | ref | PRIMARY,IX_po_shipment_item_po_shipment1,FK_po_shipment_item_po_shipment1 | IX_po_shipment_item_po_shipment1 | 4 | UNIVIS_DEV01.ps.ship_id | 1 | |
| 1 | SIMPLE | ev0 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.psi.ship_id,const | 1 | |
| 1 | SIMPLE | ev1 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.psi.ship_id,const | 1 | |
| 1 | SIMPLE | ev2 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev3 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.psi.ship_id,const | 1 | |
| 1 | SIMPLE | ev4 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.psi.ship_id,const | 1 | |
| 1 | SIMPLE | ev5 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.ps.ship_id,const | 1 | |
| 1 | SIMPLE | pa | ref | IX_po_alloc_po_shipment_item2,IX_po_alloc_po_details_old,FK_po_alloc_po_shipment1,FK_po_alloc_po_shipment_item1,FK_po_alloc_po_header1 | IX_po_alloc_po_shipment_item2 | 4 | UNIVIS_DEV01.ps.ship_id | 4 | Using where |
| 1 | SIMPLE | ph | eq_ref | PRIMARY,IX_HDR_ID | PRIMARY | 4 | UNIVIS_DEV01.pa.hdr_id | 1 | |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+--------------------------------+------+----------------------------------------------+
When database B is running I can look at the MySQL Administrator and the state remains at "Copying to tmp table" indefinitely. Database A also has this state but for only a second or so.
当数据库B运行时,我可以查看MySQL管理员,状态将无限期地保留在“正在复制到tmp表”。数据库A也有这种状态但只有一秒左右。
There are no differences in the table structure, indexes, keys etc between these databases (I have done show create tables and diff'd them).
这些数据库之间的表结构,索引,键等没有区别(我已经完成了show create tables和diff'd)。
The sizes of the tables are:
表的大小是:
database A:
po_shipment 1776
po_shipment_item 1945
po_alloc 36298
po_header 71642
EVENT_TABLE 1608
database B:
po_shipment 463
po_shipment_item 470
po_alloc 3291
po_header 56149
EVENT_TABLE 1089
Some points to note:
有些要点需要注意:
- Removing the WHERE clause makes the query return < 1 sec.
- Removing the GROUP BY makes the query return < 1 sec.
- Removing ev5, ev4, ev3 etc makes the query get faster for each one removed.
删除WHERE子句使查询返回<1秒。
删除GROUP BY使查询返回<1秒。
删除ev5,ev4,ev3等会使每个删除的查询变得更快。
UPDATE after AJ's answer: - The size of the ship_id is significantly larger on database B (max value = 800002752) than database A (max value = 3489). Given that these are InnoDB tables would changing any buffer help with handling keys of this size? Further Update to this: I reduced the size of the keys and re-ANALYZEd but still no change in performance.
AJ回答后的更新: - 数据库B(最大值= 800002752)上ship_id的大小明显大于数据库A(最大值= 3489)。鉴于这些是InnoDB表会更改任何缓冲区帮助处理这个大小的键?进一步更新:我缩小了键的大小并重新分析,但性能仍然没有变化。
UPDATE desc of EVENT_TABLE:
更新EVENT_TABLE的desc:
Note that it is identical in both databases
请注意,它在两个数据库中都是相同的
+--------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+---------+----------------+
| EVENT_TABLE_ID | bigint(20) | NO | PRI | NULL | auto_increment |
| EVENT_TYPE | varchar(10) | NO | | NULL | |
| TABLE_ID1 | int(11) | NO | MUL | NULL | |
| TABLE_ID2 | int(11) | YES | | NULL | |
| TABLE_ID3 | int(11) | YES | | NULL | |
| TABLE_ID4 | int(11) | YES | | NULL | |
| EVENT_CREATED_DATE | datetime | NO | | NULL | |
| MESSAGE_REF | varchar(100) | YES | | NULL | |
+--------------------+--------------+------+-----+---------+----------------+
And for good measure the SHOW CREATE TABLE EVENT_TABLE:
为了更好地衡量SHOW CREATE TABLE EVENT_TABLE:
The only thing to differ in this between databases is the auto increment value
数据库之间唯一不同的是自动增量值
| EVENT_TABLE | CREATE TABLE `EVENT_TABLE` (
`EVENT_TABLE_ID` bigint(20) NOT NULL AUTO_INCREMENT,
`EVENT_TYPE` varchar(10) NOT NULL,
`TABLE_ID1` int(11) NOT NULL,
`TABLE_ID2` int(11) DEFAULT NULL,
`TABLE_ID3` int(11) DEFAULT NULL,
`TABLE_ID4` int(11) DEFAULT NULL,
`EVENT_CREATED_DATE` datetime NOT NULL,
`MESSAGE_REF` varchar(100) DEFAULT NULL,
PRIMARY KEY (`EVENT_TABLE_ID`),
KEY `IX_EVENT_ID_EVENT_TYPE` (`TABLE_ID1`,`EVENT_TYPE`)
) ENGINE=InnoDB AUTO_INCREMENT=1925 DEFAULT CHARSET=utf8 |
Can anyone suggest how to resolve this issue? What have I missed?
任何人都可以建议如何解决这个问题?我错过了什么?
UPDATE after question from Michael Holzmann Here is the new Query plans based on his updated STRAIGHT_JOIN query. Note that database B has "Using temporary; Using filesort" whereas now database A doesn't. Could this be due to the long keys or something similar?
Michael Holzmann提问后更新以下是基于他更新的STRAIGHT_JOIN查询的新查询计划。请注意,数据库B具有“使用临时;使用filesort”,而现在数据库A没有。这可能是由于长键或类似的东西?
database A
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+-------------+
| 1 | SIMPLE | ps | index | PRIMARY,IX_ETA_DATE | PRIMARY | 4 | NULL | 168 | Using where |
| 1 | SIMPLE | ev0 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev1 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev2 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev3 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev4 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev5 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | psi | ref | PRIMARY,IX_po_shipment_item_po_shipment1,FK_po_shipment_item_po_shipment1 | IX_po_shipment_item_po_shipment1 | 4 | UNIVIS_PROD.ps.ship_id | 1 | |
| 1 | SIMPLE | pa | ref | IX_po_alloc_po_shipment_item2,IX_po_alloc_po_details_old,FK_po_alloc_po_shipment1,FK_po_alloc_po_shipment_item1,FK_po_alloc_po_header1 | FK_po_alloc_po_shipment_item1 | 8 | UNIVIS_PROD.psi.UID_items | 6 | Using where |
| 1 | SIMPLE | ph | eq_ref | PRIMARY,IX_HDR_ID | PRIMARY | 4 | UNIVIS_PROD.pa.hdr_id | 1 | |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+-------------+
database B
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+-------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+-------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | ps | range | PRIMARY,IX_ETA_DATE | IX_ETA_DATE | 4 | NULL | 38 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | ev0 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev1 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev2 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev3 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev4 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev5 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.ps.ship_id,const | 1 | |
| 1 | SIMPLE | psi | ref | PRIMARY,IX_po_shipment_item_po_shipment1,FK_po_shipment_item_po_shipment1 | IX_po_shipment_item_po_shipment1 | 4 | UNIVIS_DEV01.ps.ship_id | 1 | |
| 1 | SIMPLE | pa | ref | IX_po_alloc_po_shipment_item2,IX_po_alloc_po_details_old,FK_po_alloc_po_shipment1,FK_po_alloc_po_shipment_item1,FK_po_alloc_po_header1 | IX_po_alloc_po_shipment_item2 | 4 | UNIVIS_DEV01.ps.ship_id | 3 | Using where |
| 1 | SIMPLE | ph | eq_ref | PRIMARY,IX_HDR_ID | PRIMARY | 4 | UNIVIS_DEV01.pa.hdr_id | 1 | |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+-------------------------------+------+----------------------------------------------+
UPDATE It's definitely data related. I dumped the data from database A and loaded it into database B using:
更新这绝对是数据相关的。我从数据库A转储数据并使用以下方法将其加载到数据库B中:
SELECT * from <table> into outfile <file>
and
LOAD DATA INFILE <file> into table <table>
Then the database B query runs quickly - ie. as fast as database A. Any ideas on how to diagnose what could be wrong with the data??
然后数据库B查询快速运行 - 即。和数据库A一样快。关于如何诊断数据可能出错的任何想法?
UPDATE @newtover: From database A:
更新@newtover:来自数据库A:
+-----------------+---------------------+
| eta_selectivity | ship_id_selectivity |
+-----------------+---------------------+
| 0.0693 | 1.0000 |
+-----------------+---------------------+
1 row in set (0.02 sec)
From database B (the bad one)
来自数据库B(坏的)
+-----------------+---------------------+
| eta_selectivity | ship_id_selectivity |
+-----------------+---------------------+
| 0.1814 | 1.0000 |
+-----------------+---------------------+
1 row in set (0.02 sec)
And the show create for po_shipment:
这个节目为po_shipment创造:
| po_shipment | CREATE TABLE `po_shipment` (
`ship_id` int(11) NOT NULL DEFAULT '0',
`ship_type` varchar(16) DEFAULT NULL,
`foreign_agent` varchar(16) DEFAULT NULL,
`agent_ref` varchar(16) DEFAULT NULL,
`exporter_code` varchar(30) DEFAULT NULL,
`importer_code` varchar(30) DEFAULT NULL,
`carrier_code` varchar(30) DEFAULT NULL,
`exporter_name` varchar(50) DEFAULT NULL,
`importer_name` varchar(50) DEFAULT NULL,
`carrier_name` varchar(50) DEFAULT NULL,
`receipt` varchar(30) DEFAULT NULL,
`pol_aol` varchar(50) DEFAULT NULL,
`pod_aod` varchar(30) DEFAULT NULL,
`final_dest` varchar(50) DEFAULT NULL,
`vessel_flno` varchar(30) DEFAULT NULL,
`ets` date DEFAULT NULL,
`eta` date DEFAULT NULL,
`pieces` int(11) DEFAULT '0',
`weight` decimal(17,2) DEFAULT '0.00',
`volume` decimal(17,2) DEFAULT '0.00',
`marks` varchar(500) DEFAULT NULL,
`goods_desc` varchar(500) DEFAULT NULL,
`ship_terms` varchar(16) DEFAULT NULL,
`ship_terms_desc` varchar(50) DEFAULT NULL,
`house_hawb` varchar(30) DEFAULT NULL,
`ocean_mawb` varchar(30) DEFAULT NULL,
`booking_date` date DEFAULT NULL,
`expected_cargo` date DEFAULT NULL,
`mfrt_jobdisp` varchar(30) DEFAULT NULL,
`ship_complete` date DEFAULT NULL,
`user_id` varchar(30) DEFAULT NULL,
`receipt_desc` varchar(60) DEFAULT NULL,
`fin_dest_desc` varchar(60) DEFAULT NULL,
`pol_aol_desc` varchar(60) DEFAULT NULL,
`pod_aod_desc` varchar(60) DEFAULT NULL,
`exporter_ref` varchar(26) DEFAULT NULL,
`carrier_ref` varchar(26) DEFAULT NULL,
`terms_conds` date DEFAULT NULL,
`last_amended` date DEFAULT NULL,
`user_amended` varchar(30) DEFAULT NULL,
`package_type` varchar(24) DEFAULT NULL,
`ext_cancelled` tinyint(1) NOT NULL DEFAULT '0',
`ext_goh` tinyint(1) NOT NULL DEFAULT '0',
`ext_arrival_date` date DEFAULT NULL,
`ext_booking_ref` varchar(255) DEFAULT NULL,
`ext_dc_booked_delivery_date` date DEFAULT NULL,
`ext_dc_booked_delivery_time` varchar(10) DEFAULT NULL,
`ext_comments` text,
`deleted` tinyint(1) NOT NULL DEFAULT '0',
`last_amended_time` int(10) DEFAULT NULL,
`last_amended_uni` varchar(30) DEFAULT NULL,
PRIMARY KEY (`ship_id`),
KEY `IX_ETA_DATE` (`eta`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
UPDATE @chris_I If I strip the query down by removing all other joins aside from EVENT_TABLE I get the same performance (ie. crappy)
更新@chris_I如果我通过删除除EVENT_TABLE之外的所有其他连接来删除查询我得到相同的性能(即蹩脚)
SELECT * FROM po_shipment ps
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0'
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1'
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2'
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3'
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4'
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5'
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;
UPDATE @Marcus Adams: Query for plans you have asked for with inner joins removed:
更新@Marcus Adams:查询已删除内部联接所要求的计划:
SELECT * FROM po_shipment ps
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0'
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1'
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2'
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3'
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4'
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5'
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;
Query Plan from database A (responds in 0.35s)
数据库A的查询计划(在0.35秒内响应)
+----+-------------+-------+-------+------------------------+------------------------+---------+------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+------------------------+------------------------+---------+------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | ps | range | IX_ETA_DATE | IX_ETA_DATE | 4 | NULL | 174 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | ev0 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev1 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev2 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev3 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev4 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
| 1 | SIMPLE | ev5 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | |
+----+-------------+-------+-------+------------------------+------------------------+---------+------------------------------+------+----------------------------------------------
Query Plan from database B (doesn't respond in time it takes to make a cup of tea)
来自数据库B的查询计划(没有及时回复制作一杯茶)
+----+-------------+-------+-------+------------------------+------------------------+---------+-------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+------------------------+------------------------+---------+-------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | ps | range | IX_ETA_DATE | IX_ETA_DATE | 4 | NULL | 38 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | ev0 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36
14 个解决方案
#1
2
If it's a data problem, I can't tell you what the exact problem is, but here's my favorite strategy to solve this kind of problem:
如果这是一个数据问题,我不能告诉你确切的问题是什么,但这是我最喜欢的解决这类问题的策略:
Try to remove half of your joins. Repeat recursively, until the query runs fast. Then add half of the joins you removed in the last step... (This strategy will require much fewer steps than removing and adding join by join.)
尝试删除一半的连接。递归重复,直到查询运行得很快。然后添加您在上一步中删除的一半连接...(此策略比通过连接删除和添加连接所需的步骤少得多。)
Once you've found out the "bad" join, you can try restricting its values with an additional "where" clause, until the query runs fast again... In each step, always try to reduce the problem in half.
一旦找到“坏”连接,您可以尝试使用额外的“where”子句限制其值,直到查询再次快速运行...在每个步骤中,始终尝试将问题减少一半。
Note: It can very well be the case, that you get a lot more records for the intermediate results of your joins, even if the total amount of data is smaller in database B.
注意:很可能就是这样,即使数据库B中的数据总量较小,您也可以获得更多联接中间结果的记录。
#2
3
Try adding STRAIGHT_JOIN to the query to see if the execution plan is the issue. The optimizer is choosing a different execution plan for each database and this might be causing the issue.
尝试将STRAIGHT_JOIN添加到查询中以查看执行计划是否存在问题。优化器正在为每个数据库选择不同的执行计划,这可能会导致问题。
SELECT STRAIGHT_JOIN * FROM po_shipment ps
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0'
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1'
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2'
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3'
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4'
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5'
JOIN po_shipment_item psi USING (ship_id)
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;
UPDATE
1. The new execution plan for database A doesn't need a filesort or temporary table because it's using the primary key. I would start adding USE INDEX into the query to see if you can speed up the query on database B. Is ship_id the primary key of po_shipment? if so, you need to figure out what costs more the group by or filtering of the dates.
更新1.数据库A的新执行计划不需要文件或临时表,因为它使用主键。我会开始在查询中添加USE INDEX以查看是否可以加快数据库B上的查询。ship_id是po_shipment的主键吗?如果是这样,您需要计算出更多的成本或过滤日期。
SELECT STRAIGHT_JOIN * FROM po_shipment ps USE INDEX( PRIMARY )
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0'
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1'
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2'
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3'
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4'
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5'
JOIN po_shipment_item psi USING (ship_id)
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;
If that doesn't help try suggesting more of the indexes used in the execution plan of database A for database B.
如果这没有帮助尝试建议数据库A的数据库A的执行计划中使用的更多索引。
#3
2
Haven't touched MySQL for a while but I am guessing the problem has to do with either
有一段时间没有触及MySQL,但我猜这个问题与两者有关
- Have you checked you key/join field lengths (actual data) maybe causing the sort_buffer to page to disk joining on large (size) keys? (this smells like a data issue...)
- The server settings, basically it is writting to an in memory temp table. I had something similar a few years back. Have you enlarged the key_buffer_size, table_cache, read_rnd_buffer_size, sort_buffer, read_buffer_size to see if that helps?
您是否检查过键/连接字段长度(实际数据)是否可能导致sort_buffer在大(大小)键上寻呼到磁盘? (这闻起来像数据问题......)
服务器设置,基本上是写入内存临时表。几年前我有类似的东西。你有没有放大key_buffer_size,table_cache,read_rnd_buffer_size,sort_buffer,read_buffer_size以查看是否有帮助?
#4
1
Update your statistics. one time I have a similar issue, and this works for me.
更新统计信息。有一次我有类似的问题,这对我有用。
#5
1
Since it's InnoDB it looks like it's uite possibly a locking problem. What else is going on at the same time?
因为它是InnoDB,看起来它可能是一个锁定问题。还有什么事情在同一时间发生?
#6
1
Since it seems like some issue with the data, it would probably be helpful to figure out what data is causing the problem. Make a third database C and insert half of the data from database B twice(so you have the same number of rows). If database C is slower, than the bad data is there, otherwise it is in the other half. Repeat with smaller and smaller chunk sizes to help find the problem data.
由于数据似乎有些问题,因此找出导致问题的数据可能会有所帮助。创建第三个数据库C并从数据库B插入一半数据(因此您具有相同的行数)。如果数据库C较慢,那么坏数据就在那里,否则它在另一半。重复使用越来越小的块大小来帮助查找问题数据。
Even though database B is smaller that database A, the tables 'po_header' and 'EVENT_TABLE' are not proportionally smaller. That may have something to do with the speed difference.
即使数据库B小于数据库A,表'po_header'和'EVENT_TABLE'也不是按比例缩小的。这可能与速度差异有关。
#7
1
Have to tried doing "CHECK TABLE" and/or "OPTIMIZE TABLE"?
必须尝试做“CHECK TABLE”和/或“OPTIMIZE TABLE”吗?
I had a similar situation where the problem was that the key indexes where not updated properly and doing CHECK TABLE fixed them. A long shot, but worth the try.
我有一个类似的情况,问题是关键索引没有正确更新和做CHECK TABLE修复它们。远射,但值得一试。
#8
1
To begin with make sure that you have created indexes on the appropriate fields. I am sure you've done that.
首先,确保已在相应字段上创建索引。我相信你已经做到了。
Next, try using index hints (USE INDEX) to force the database to use the indexes properly.
接下来,尝试使用索引提示(USE INDEX)强制数据库正确使用索引。
I had a similar problem in which I assumed that the indexes were setup and used by mysql properly, but they were not. I was able to work around it using index hints.
我有一个类似的问题,我认为索引是由mysql正确设置和使用的,但它们不是。我能够使用索引提示解决它。
#9
1
What is the selectivity of data in po_shipment.eta and po_shipment.ship_id. Could you post the results of the following query from both databases:
po_shipment.eta和po_shipment.ship_id中数据的选择性是什么?您可以从两个数据库发布以下查询的结果:
SELECT
count(distinct eta)/count(*) as eta_selectivity,
count(distinct ship_id)/count(*) as ship_id_selectivity
FROM po_shipment;
Usually the more selective are the field data (closer to 1), the better index is working. The cause might also be very uneven distribution of data in po_shipment.eta (when you use '2099-01-01' or the like for as a special value) if the optimizer lacks necessary statistics.
通常,现场数据的选择性越高(接近1),指数就越好。如果优化器缺少必要的统计信息,那么原因也可能是po_shipment.eta中的数据分布非常不均匀(当您使用'2099-01-01'或类似值作为特殊值时)。
By the way, could you provide SHOW CREATE TABLE po_shipment? Indices on the table could shed some light.
顺便问一下,你能提供SHOW CREATE TABLE po_shipment吗?桌上的指数可能会有所启发。
UPD: When the selectivity on the field is so low as for eta
field, the index is basically useless. Even worse, it might confuse the optimizer on selects and slow down inserts and updates of the data.
UPD:当场上的选择性如此低于eta场时,索引基本上是无用的。更糟糕的是,它可能会使优化器混淆选择并减慢数据的插入和更新。
The first suggestion is to try removing the index on the eta
field and measure the results. It might be that for the A database optimizer does not even try to use the index because of the the very low selectivity and for the B database it gives the index a try.
第一个建议是尝试删除eta字段上的索引并测量结果。可能是因为A数据库优化器甚至没有尝试使用索引,因为选择性非常低,而对于B数据库,它会尝试使用索引。
The second thing that bothers me is why do you group by ship_id in the first place? When there is a need in an implicit temporary table and among the fields there is a TEXT (as in your case) or BLOB one, MySQL will always use on disk temporary tables for sorting (which is implicitly required in GROUP BY). In your case, ship_id is the clustered primary key and the results will anyway be sorted by ship_id. What you need is to extract a base query (already grouped by ship_id if there can be several correscpondences) and JOIN po_shipment with the base query applying your range condition and not using the group by at the top.
困扰我的第二件事是你为什么首先按ship_id分组?当需要一个隐式临时表,并且在字段中有一个TEXT(如你的情况)或BLOB一个,MySQL将始终使用磁盘临时表进行排序(这在GROUP BY中是隐式需要的)。在您的情况下,ship_id是聚簇主键,结果无论如何都将按ship_id排序。你需要的是提取一个基本查询(如果可以有多个相关性,已经按ship_id分组)和JOIN po_shipment,基本查询应用你的范围条件而不是在顶部使用group by。
And the third. Do you realy need all the fields when you use * at the top? Having joined 10 tables, you receive a lot of fieds. I hardly believe you need al of them. Even exclusion of the TEXT field from the results might boost the query performance.
第三个。当您在顶部使用*时,您真的需要所有字段吗?加入10张牌桌后,你会收到很多赌注。我几乎不相信你需要他们。即使从结果中排除TEXT字段也可能提高查询性能。
#10
1
I've experienced the same when using range selectors less than
< and greater than
> too.
当使用小于 <且大于> 的范围选择器时,我也遇到了同样的问题。
Experiment: If the range is not too big, have you tried tried to expand the range into and an IN (...,...,...) statement instad ?
实验:如果范围不是太大,你试过将范围扩展到IN(...,...,...)语句中吗?
E.g
SELECT * FROM po_shipment ps USE INDEX (IX_ETA_DATE)
JOIN po_shipment_item psi USING (ship_id)
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0'
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1'
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2'
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3'
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4'
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5'
WHERE ps.eta IN ('2010-03-22','2010-03-21','2010-03-20',...)
GROUP BY ps.ship_id
LIMIT 100;
Edit: Added USE INDEX() hint as Salman A suggested. Seems like mysql see's the possible index, but choses not to use it... worth testing.
编辑:添加USE INDEX()提示为Salman A建议。看起来像mysql看到可能的索引,但选择不使用它...值得测试。
#11
1
I think it may be the manual sort (filesort) for the GROUP BY clause that is causing the noticeable deficit.
我认为可能是GROUP BY子句的手动排序(filesort)导致了明显的不足。
Try using the SQL_BIG_RESULT hint to see if MySQL will improve it's method for handling the GROUP BY.
尝试使用SQL_BIG_RESULT提示来查看MySQL是否会改进它处理GROUP BY的方法。
SELECT SQL_BIG_RESULT * FROM ...
#12
0
I think you're going at this the wrong way. When you do a LEFT JOIN, you're getting back all of the records in EVENT_TABLE regardless of a match to po_shipment.
我认为你的方式是错误的。当你进行LEFT JOIN时,无论与po_shipment匹配,你都会收到EVENT_TABLE中的所有记录。
And you've run a LEFT JOIN to EVENT_TABLE six times. You're getting back (6 * (SELECT count(*) FROM EVENT_TABLE) ) records with every query. Since db A has less records than db B, of course the query runs faster on A.
而且你已经连续六次对EVENT_TABLE进行LEFT JOIN了。每次查询都会返回(6 *(SELECT count(*)FROM EVENT_TABLE))记录。由于db A的记录少于db B,因此查询在A上运行得更快。
I think something like this would perform worlds better:
我觉得这样的事情可以让世界变得更好:
DECLARE @TEMP_EVENT_TYPES table
(
EVENT_TYPE varchar(10) PRIMARY KEY
)
;
/*
INSERT VALUES 'MAS0', 'MAS1', 'MAS2', 'MAS3', 'MAS4', 'MAS5'
*/
;
SELECT * FROM po_shipment ps
JOIN po_shipment_item psi USING (ship_id)
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
INNER JOIN
EVENT_TABLE et ON et.TABLE_ID1 = ps.ship_id
INNER JOIN
@TEMP_EVENT_TYPES tet ON tet.event_type = et.EVENT_TYPE
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
#13
0
It sure looks like you're modeling a state machine for po_shipment w/ event records in event_table for each state transition.
对于每个状态转换,看起来你正在为event_table中的事件记录建立一个状态机,用于po_shipment w / event记录。
What is the business logic you're trying to express with this query?
您尝试使用此查询表达的业务逻辑是什么?
You should be able to add a state field to your po_shipment to accomplish this, which summarizes and denormalizes the series of events in the event_table.
您应该能够在po_shipment中添加一个状态字段来完成此操作,它会对event_table中的一系列事件进行汇总和非规范化。
Building the logic to verify all 6 events have taken place for a given record shouldn't be in the database, it should be in your model and persisted in the database (classic state machine pattern for a model).
构建用于验证给定记录的所有6个事件的逻辑不应该在数据库中,它应该在您的模型中并且保存在数据库中(模型的经典状态机模式)。
Then it's just select from po_shipment where state = 'MAS5' and eta > date;
然后它只是从po_shipment中选择state ='MAS5'和eta> date;
#14
0
Out of curiosity - does the original data in B have many NULLs?
出于好奇 - B中的原始数据是否有很多NULL?
#1
2
If it's a data problem, I can't tell you what the exact problem is, but here's my favorite strategy to solve this kind of problem:
如果这是一个数据问题,我不能告诉你确切的问题是什么,但这是我最喜欢的解决这类问题的策略:
Try to remove half of your joins. Repeat recursively, until the query runs fast. Then add half of the joins you removed in the last step... (This strategy will require much fewer steps than removing and adding join by join.)
尝试删除一半的连接。递归重复,直到查询运行得很快。然后添加您在上一步中删除的一半连接...(此策略比通过连接删除和添加连接所需的步骤少得多。)
Once you've found out the "bad" join, you can try restricting its values with an additional "where" clause, until the query runs fast again... In each step, always try to reduce the problem in half.
一旦找到“坏”连接,您可以尝试使用额外的“where”子句限制其值,直到查询再次快速运行...在每个步骤中,始终尝试将问题减少一半。
Note: It can very well be the case, that you get a lot more records for the intermediate results of your joins, even if the total amount of data is smaller in database B.
注意:很可能就是这样,即使数据库B中的数据总量较小,您也可以获得更多联接中间结果的记录。
#2
3
Try adding STRAIGHT_JOIN to the query to see if the execution plan is the issue. The optimizer is choosing a different execution plan for each database and this might be causing the issue.
尝试将STRAIGHT_JOIN添加到查询中以查看执行计划是否存在问题。优化器正在为每个数据库选择不同的执行计划,这可能会导致问题。
SELECT STRAIGHT_JOIN * FROM po_shipment ps
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0'
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1'
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2'
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3'
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4'
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5'
JOIN po_shipment_item psi USING (ship_id)
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;
UPDATE
1. The new execution plan for database A doesn't need a filesort or temporary table because it's using the primary key. I would start adding USE INDEX into the query to see if you can speed up the query on database B. Is ship_id the primary key of po_shipment? if so, you need to figure out what costs more the group by or filtering of the dates.
更新1.数据库A的新执行计划不需要文件或临时表,因为它使用主键。我会开始在查询中添加USE INDEX以查看是否可以加快数据库B上的查询。ship_id是po_shipment的主键吗?如果是这样,您需要计算出更多的成本或过滤日期。
SELECT STRAIGHT_JOIN * FROM po_shipment ps USE INDEX( PRIMARY )
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0'
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1'
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2'
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3'
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4'
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5'
JOIN po_shipment_item psi USING (ship_id)
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;
If that doesn't help try suggesting more of the indexes used in the execution plan of database A for database B.
如果这没有帮助尝试建议数据库A的数据库A的执行计划中使用的更多索引。
#3
2
Haven't touched MySQL for a while but I am guessing the problem has to do with either
有一段时间没有触及MySQL,但我猜这个问题与两者有关
- Have you checked you key/join field lengths (actual data) maybe causing the sort_buffer to page to disk joining on large (size) keys? (this smells like a data issue...)
- The server settings, basically it is writting to an in memory temp table. I had something similar a few years back. Have you enlarged the key_buffer_size, table_cache, read_rnd_buffer_size, sort_buffer, read_buffer_size to see if that helps?
您是否检查过键/连接字段长度(实际数据)是否可能导致sort_buffer在大(大小)键上寻呼到磁盘? (这闻起来像数据问题......)
服务器设置,基本上是写入内存临时表。几年前我有类似的东西。你有没有放大key_buffer_size,table_cache,read_rnd_buffer_size,sort_buffer,read_buffer_size以查看是否有帮助?
#4
1
Update your statistics. one time I have a similar issue, and this works for me.
更新统计信息。有一次我有类似的问题,这对我有用。
#5
1
Since it's InnoDB it looks like it's uite possibly a locking problem. What else is going on at the same time?
因为它是InnoDB,看起来它可能是一个锁定问题。还有什么事情在同一时间发生?
#6
1
Since it seems like some issue with the data, it would probably be helpful to figure out what data is causing the problem. Make a third database C and insert half of the data from database B twice(so you have the same number of rows). If database C is slower, than the bad data is there, otherwise it is in the other half. Repeat with smaller and smaller chunk sizes to help find the problem data.
由于数据似乎有些问题,因此找出导致问题的数据可能会有所帮助。创建第三个数据库C并从数据库B插入一半数据(因此您具有相同的行数)。如果数据库C较慢,那么坏数据就在那里,否则它在另一半。重复使用越来越小的块大小来帮助查找问题数据。
Even though database B is smaller that database A, the tables 'po_header' and 'EVENT_TABLE' are not proportionally smaller. That may have something to do with the speed difference.
即使数据库B小于数据库A,表'po_header'和'EVENT_TABLE'也不是按比例缩小的。这可能与速度差异有关。
#7
1
Have to tried doing "CHECK TABLE" and/or "OPTIMIZE TABLE"?
必须尝试做“CHECK TABLE”和/或“OPTIMIZE TABLE”吗?
I had a similar situation where the problem was that the key indexes where not updated properly and doing CHECK TABLE fixed them. A long shot, but worth the try.
我有一个类似的情况,问题是关键索引没有正确更新和做CHECK TABLE修复它们。远射,但值得一试。
#8
1
To begin with make sure that you have created indexes on the appropriate fields. I am sure you've done that.
首先,确保已在相应字段上创建索引。我相信你已经做到了。
Next, try using index hints (USE INDEX) to force the database to use the indexes properly.
接下来,尝试使用索引提示(USE INDEX)强制数据库正确使用索引。
I had a similar problem in which I assumed that the indexes were setup and used by mysql properly, but they were not. I was able to work around it using index hints.
我有一个类似的问题,我认为索引是由mysql正确设置和使用的,但它们不是。我能够使用索引提示解决它。
#9
1
What is the selectivity of data in po_shipment.eta and po_shipment.ship_id. Could you post the results of the following query from both databases:
po_shipment.eta和po_shipment.ship_id中数据的选择性是什么?您可以从两个数据库发布以下查询的结果:
SELECT
count(distinct eta)/count(*) as eta_selectivity,
count(distinct ship_id)/count(*) as ship_id_selectivity
FROM po_shipment;
Usually the more selective are the field data (closer to 1), the better index is working. The cause might also be very uneven distribution of data in po_shipment.eta (when you use '2099-01-01' or the like for as a special value) if the optimizer lacks necessary statistics.
通常,现场数据的选择性越高(接近1),指数就越好。如果优化器缺少必要的统计信息,那么原因也可能是po_shipment.eta中的数据分布非常不均匀(当您使用'2099-01-01'或类似值作为特殊值时)。
By the way, could you provide SHOW CREATE TABLE po_shipment? Indices on the table could shed some light.
顺便问一下,你能提供SHOW CREATE TABLE po_shipment吗?桌上的指数可能会有所启发。
UPD: When the selectivity on the field is so low as for eta
field, the index is basically useless. Even worse, it might confuse the optimizer on selects and slow down inserts and updates of the data.
UPD:当场上的选择性如此低于eta场时,索引基本上是无用的。更糟糕的是,它可能会使优化器混淆选择并减慢数据的插入和更新。
The first suggestion is to try removing the index on the eta
field and measure the results. It might be that for the A database optimizer does not even try to use the index because of the the very low selectivity and for the B database it gives the index a try.
第一个建议是尝试删除eta字段上的索引并测量结果。可能是因为A数据库优化器甚至没有尝试使用索引,因为选择性非常低,而对于B数据库,它会尝试使用索引。
The second thing that bothers me is why do you group by ship_id in the first place? When there is a need in an implicit temporary table and among the fields there is a TEXT (as in your case) or BLOB one, MySQL will always use on disk temporary tables for sorting (which is implicitly required in GROUP BY). In your case, ship_id is the clustered primary key and the results will anyway be sorted by ship_id. What you need is to extract a base query (already grouped by ship_id if there can be several correscpondences) and JOIN po_shipment with the base query applying your range condition and not using the group by at the top.
困扰我的第二件事是你为什么首先按ship_id分组?当需要一个隐式临时表,并且在字段中有一个TEXT(如你的情况)或BLOB一个,MySQL将始终使用磁盘临时表进行排序(这在GROUP BY中是隐式需要的)。在您的情况下,ship_id是聚簇主键,结果无论如何都将按ship_id排序。你需要的是提取一个基本查询(如果可以有多个相关性,已经按ship_id分组)和JOIN po_shipment,基本查询应用你的范围条件而不是在顶部使用group by。
And the third. Do you realy need all the fields when you use * at the top? Having joined 10 tables, you receive a lot of fieds. I hardly believe you need al of them. Even exclusion of the TEXT field from the results might boost the query performance.
第三个。当您在顶部使用*时,您真的需要所有字段吗?加入10张牌桌后,你会收到很多赌注。我几乎不相信你需要他们。即使从结果中排除TEXT字段也可能提高查询性能。
#10
1
I've experienced the same when using range selectors less than
< and greater than
> too.
当使用小于 <且大于> 的范围选择器时,我也遇到了同样的问题。
Experiment: If the range is not too big, have you tried tried to expand the range into and an IN (...,...,...) statement instad ?
实验:如果范围不是太大,你试过将范围扩展到IN(...,...,...)语句中吗?
E.g
SELECT * FROM po_shipment ps USE INDEX (IX_ETA_DATE)
JOIN po_shipment_item psi USING (ship_id)
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0'
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1'
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2'
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3'
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4'
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5'
WHERE ps.eta IN ('2010-03-22','2010-03-21','2010-03-20',...)
GROUP BY ps.ship_id
LIMIT 100;
Edit: Added USE INDEX() hint as Salman A suggested. Seems like mysql see's the possible index, but choses not to use it... worth testing.
编辑:添加USE INDEX()提示为Salman A建议。看起来像mysql看到可能的索引,但选择不使用它...值得测试。
#11
1
I think it may be the manual sort (filesort) for the GROUP BY clause that is causing the noticeable deficit.
我认为可能是GROUP BY子句的手动排序(filesort)导致了明显的不足。
Try using the SQL_BIG_RESULT hint to see if MySQL will improve it's method for handling the GROUP BY.
尝试使用SQL_BIG_RESULT提示来查看MySQL是否会改进它处理GROUP BY的方法。
SELECT SQL_BIG_RESULT * FROM ...
#12
0
I think you're going at this the wrong way. When you do a LEFT JOIN, you're getting back all of the records in EVENT_TABLE regardless of a match to po_shipment.
我认为你的方式是错误的。当你进行LEFT JOIN时,无论与po_shipment匹配,你都会收到EVENT_TABLE中的所有记录。
And you've run a LEFT JOIN to EVENT_TABLE six times. You're getting back (6 * (SELECT count(*) FROM EVENT_TABLE) ) records with every query. Since db A has less records than db B, of course the query runs faster on A.
而且你已经连续六次对EVENT_TABLE进行LEFT JOIN了。每次查询都会返回(6 *(SELECT count(*)FROM EVENT_TABLE))记录。由于db A的记录少于db B,因此查询在A上运行得更快。
I think something like this would perform worlds better:
我觉得这样的事情可以让世界变得更好:
DECLARE @TEMP_EVENT_TYPES table
(
EVENT_TYPE varchar(10) PRIMARY KEY
)
;
/*
INSERT VALUES 'MAS0', 'MAS1', 'MAS2', 'MAS3', 'MAS4', 'MAS5'
*/
;
SELECT * FROM po_shipment ps
JOIN po_shipment_item psi USING (ship_id)
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
INNER JOIN
EVENT_TABLE et ON et.TABLE_ID1 = ps.ship_id
INNER JOIN
@TEMP_EVENT_TYPES tet ON tet.event_type = et.EVENT_TYPE
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
#13
0
It sure looks like you're modeling a state machine for po_shipment w/ event records in event_table for each state transition.
对于每个状态转换,看起来你正在为event_table中的事件记录建立一个状态机,用于po_shipment w / event记录。
What is the business logic you're trying to express with this query?
您尝试使用此查询表达的业务逻辑是什么?
You should be able to add a state field to your po_shipment to accomplish this, which summarizes and denormalizes the series of events in the event_table.
您应该能够在po_shipment中添加一个状态字段来完成此操作,它会对event_table中的一系列事件进行汇总和非规范化。
Building the logic to verify all 6 events have taken place for a given record shouldn't be in the database, it should be in your model and persisted in the database (classic state machine pattern for a model).
构建用于验证给定记录的所有6个事件的逻辑不应该在数据库中,它应该在您的模型中并且保存在数据库中(模型的经典状态机模式)。
Then it's just select from po_shipment where state = 'MAS5' and eta > date;
然后它只是从po_shipment中选择state ='MAS5'和eta> date;
#14
0
Out of curiosity - does the original data in B have many NULLs?
出于好奇 - B中的原始数据是否有很多NULL?