I'm having real issues with a few queries, this one in particular. Details below.
tgmp_games, about 20k rows
CREATE TABLE IF NOT EXISTS `tgmp_games` (
`g_id` int(8) NOT NULL AUTO_INCREMENT,
`site_id` int(6) NOT NULL,
`g_name` varchar(255) NOT NULL,
`g_link` varchar(255) NOT NULL,
`g_url` varchar(255) NOT NULL,
`g_platforms` varchar(128) NOT NULL,
`g_added` datetime NOT NULL,
`g_cover` varchar(255) NOT NULL,
`g_impressions` int(8) NOT NULL,
PRIMARY KEY (`g_id`),
KEY `g_platforms` (`g_platforms`),
KEY `site_id` (`site_id`),
KEY `g_link` (`g_link`),
KEY `g_release` (`g_release`),
KEY `g_genre` (`g_genre`),
KEY `g_name` (`g_name`),
KEY `g_impressions` (`g_impressions`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
tgmp_reviews - about 200k rows
CREATE TABLE IF NOT EXISTS `tgmp_reviews` (
`r_id` int(8) NOT NULL AUTO_INCREMENT,
`site_id` int(6) NOT NULL,
`r_source` varchar(128) NOT NULL,
`r_date` date NOT NULL,
`r_score` int(3) NOT NULL,
`r_copy` text NOT NULL,
`r_link` text NOT NULL,
`r_int_link` text NOT NULL,
`r_parent` int(8) NOT NULL,
`r_platform` varchar(12) NOT NULL,
`r_impressions` int(8) NOT NULL,
PRIMARY KEY (`r_id`),
KEY `site_id` (`site_id`),
KEY `r_parent` (`r_parent`),
KEY `r_platform` (`r_platform`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ;
Here is the query; it takes about 3 seconds:
SELECT * FROM tgmp_games g
RIGHT JOIN tgmp_reviews r ON g_id = r.r_parent
WHERE g.site_id = '34'
GROUP BY g_name
ORDER BY g_impressions DESC LIMIT 15
EXPLAIN
id  select_type  table  type    possible_keys    key      key_len  ref                     rows    Extra
1   SIMPLE       r      ALL     r_parent         NULL     NULL     NULL                    201133  Using temporary; Using filesort
1   SIMPLE       g      eq_ref  PRIMARY,site_id  PRIMARY  4        engine_comp.r.r_parent  1       Using where
I am just trying to grab the 15 most viewed games, then grab a single review for each game (it doesn't really matter which; I guess the highest rated, by r_score, would be ideal).
Can someone help me figure out why this is so horribly inefficient?
2 solutions
#1
2
- I don't see the purpose of GROUP BY g_name in your query, but it makes MySQL aggregate over the selected columns, which here means all columns from both tables. Try removing it and check whether that helps.
- Also, RIGHT JOIN makes the database query tgmp_reviews first, which is not what you want. I suppose LEFT JOIN is a better choice here. Try changing the join type.
- If neither of the first two options helps, you need to redesign your query. Since you need the 15 most viewed games for the site, the first query will be:
SELECT g_id FROM tgmp_games g WHERE site_id = 34 ORDER BY g_impressions DESC LIMIT 15;
This is the very first part that should be executed by the database, as it provides the best selectivity. Then you can get the desired reviews for the games:
SELECT r_parent, max(r_score) FROM tgmp_reviews r WHERE r_parent IN (/*1st query*/) GROUP BY r_parent;
This construct forces the database to execute the first query first (sorry for the tautology) and gives you the maximum score for each of the wanted games. I hope you can use these results for your purpose.
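One caveat when combining the two queries: MySQL rejects LIMIT inside an IN (...) subquery, so pasting the first query in literally is likely to fail. A minimal sketch of one workaround, assuming the schema posted in the question, is to wrap the top-15 query in a derived table and join the reviews to it. The alias top_games and the column alias best_score are names made up for this example.

```sql
-- Sketch only: top_games and best_score are illustrative names.
-- Step 1 (inner query): pick the 15 most viewed games for the site.
-- Step 2 (outer query): attach the highest review score for each of them.
SELECT top_games.g_id,
       top_games.g_name,
       top_games.g_impressions,
       MAX(r.r_score) AS best_score
FROM (
    SELECT g_id, g_name, g_impressions
    FROM tgmp_games
    WHERE site_id = 34
    ORDER BY g_impressions DESC
    LIMIT 15
) AS top_games
-- LEFT JOIN keeps games that have no reviews yet (best_score is NULL for them).
LEFT JOIN tgmp_reviews r ON r.r_parent = top_games.g_id
GROUP BY top_games.g_id, top_games.g_name, top_games.g_impressions
ORDER BY top_games.g_impressions DESC;
```

Because the derived table returns at most 15 rows, the join only needs to look up reviews for those 15 games through the existing r_parent index, rather than scanning all of tgmp_reviews.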
#2
1
Your MyISAM table is small, so you can try converting it to InnoDB to see whether that resolves the issue. Do you have a reason for using MyISAM instead of InnoDB for that table?
You can also try running ANALYZE TABLE on each table to update the statistics and see whether the optimizer chooses a different plan.
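The two suggestions above could be tried roughly as follows. This is a sketch using the table names from the question; the conversion rebuilds the table, so it is worth testing on a copy first.

```sql
-- Convert the MyISAM table to InnoDB (this rebuilds the whole table).
ALTER TABLE tgmp_games ENGINE = InnoDB;

-- Refresh the key distribution statistics the optimizer relies on.
ANALYZE TABLE tgmp_games, tgmp_reviews;
```

After that, re-run EXPLAIN on the slow query and compare the plan with the one posted in the question.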