Is it possible to optimize the following into a single query?
I am assuming here that a single query would be more efficient than multiple queries using a temporary table, so please let me know if my assumption is incorrect.
$id is the current memberid. $list is a list of itemids to be removed from the final results (e.g. items already downloaded).
What this query is supposed to do is find the top 500 members who have downloaded similar items to the $id member. Then find all the items that these members have downloaded, ranked by a score based on the number of similar downloads from each member and the total number of downloads of each item. The final result is therefore a list of recommendations for the $id member.
The queries are:
mysql_query('CREATE TEMPORARY TABLE temp1 ENGINE=MEMORY AS
(SELECT a.memberid, COUNT(*) `score` FROM table_downloads a INNER JOIN
(SELECT itemid FROM table_downloads WHERE memberid='.$id.') b ON a.itemid = b.itemid
WHERE a.memberid!='.$id.' GROUP BY a.memberid HAVING score>0 ORDER BY score DESC LIMIT 500)');
$res=mysql_query('SELECT table_downloads.itemid,COUNT(table_downloads.itemid*temp1.score) AS score2
FROM table_downloads,temp1
WHERE table_downloads.memberid=temp1.memberid AND table_downloads.itemid NOT IN ('.$list.')
GROUP BY table_downloads.itemid
ORDER BY score2 DESC LIMIT 30');
mysql_query('DROP TABLE temp1');
It's possible that this query might take so long as to be unusable if there were several million rows. Any advice on ensuring it executes quickly would also be greatly appreciated.
*I am using mysql_query deliberately. Please do not tell me to use mysqli.*
2 Answers
#1
1
- It's not possible to do with mysql_query()
- It's a mistake to think that joining 3 calls into one would save something noticeable in this case
And to be clear it's a very common delusion - to think that a single messy query would run faster than multiple. It wouldn't.
#2
0
Out of interest I had a play.
I think it is possible, although not pretty and I suspect slower than your current script.
SELECT table_downloads.itemid, COUNT(table_downloads.itemid * temp1.score) AS score2
FROM table_downloads
INNER JOIN (SELECT a.memberid, COUNT(*) `score`
FROM table_downloads a
INNER JOIN table_downloads b
ON a.itemid = b.itemid AND b.memberid='.$id.' AND a.memberid!=b.memberid
GROUP BY a.memberid
ORDER BY score DESC
LIMIT 500) temp1
ON table_downloads.memberid = temp1.memberid
WHERE table_downloads.itemid NOT IN ('.$list.')
GROUP BY table_downloads.itemid
ORDER BY score2 DESC
LIMIT 30
Not really tested though (don't know your table layouts)
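Since the combined query was posted untested, here is a rough way to sanity-check the idea on toy data. This is only a sketch under assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL (so no ENGINE=MEMORY), the rows are made up, the helper names are hypothetical, and it assumes the inner join is meant to exclude the $id member, as the temporary-table version in the question does. A tie-breaking `ORDER BY ... itemid` is added only so the two result lists can be compared deterministically.

```python
import sqlite3

# Made-up rows: (memberid, itemid). Member 1 plays the role of $id.
DEMO_ROWS = [
    (1, 10), (1, 11), (1, 12),
    (2, 10), (2, 11), (2, 20), (2, 21),  # shares two items with member 1
    (3, 10), (3, 20), (3, 30),           # shares one item with member 1
    (4, 40),                             # shares nothing with member 1
]


def make_demo_db():
    """Build an in-memory table_downloads table holding the toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_downloads (memberid INTEGER, itemid INTEGER)")
    conn.executemany("INSERT INTO table_downloads VALUES (?, ?)", DEMO_ROWS)
    return conn


def _placeholders(values):
    """Return one '?' per value, for use inside NOT IN (...)."""
    return ", ".join("?" for _ in values)


def two_step(conn, member_id, exclude):
    """Temporary-table approach, mirroring the queries in the question."""
    conn.execute("CREATE TEMP TABLE temp1 (memberid INTEGER, score INTEGER)")
    conn.execute(
        """
        INSERT INTO temp1
        SELECT a.memberid, COUNT(*) AS score
        FROM table_downloads a
        INNER JOIN (SELECT itemid FROM table_downloads WHERE memberid = ?) b
            ON a.itemid = b.itemid
        WHERE a.memberid != ?
        GROUP BY a.memberid
        ORDER BY score DESC
        LIMIT 500
        """,
        (member_id, member_id),
    )
    rows = conn.execute(
        f"""
        SELECT d.itemid, COUNT(d.itemid * temp1.score) AS score2
        FROM table_downloads d
        INNER JOIN temp1 ON d.memberid = temp1.memberid
        WHERE d.itemid NOT IN ({_placeholders(exclude)})
        GROUP BY d.itemid
        ORDER BY score2 DESC, d.itemid
        LIMIT 30
        """,
        list(exclude),
    ).fetchall()
    conn.execute("DROP TABLE temp1")
    return rows


def single_query(conn, member_id, exclude):
    """Same logic with the temporary table folded into a derived table."""
    return conn.execute(
        f"""
        SELECT d.itemid, COUNT(d.itemid * temp1.score) AS score2
        FROM table_downloads d
        INNER JOIN (
            SELECT a.memberid, COUNT(*) AS score
            FROM table_downloads a
            INNER JOIN table_downloads b
                ON a.itemid = b.itemid
               AND b.memberid = ?
               AND a.memberid != b.memberid
            GROUP BY a.memberid
            ORDER BY score DESC
            LIMIT 500
        ) temp1 ON d.memberid = temp1.memberid
        WHERE d.itemid NOT IN ({_placeholders(exclude)})
        GROUP BY d.itemid
        ORDER BY score2 DESC, d.itemid
        LIMIT 30
        """,
        [member_id, *exclude],
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    print(two_step(conn, 1, [10, 11, 12]))
    print(single_query(conn, 1, [10, 11, 12]))
```

On this toy data both forms return the same ranking, which says nothing about speed on millions of rows in MySQL. One thing the exercise does show: COUNT(expr) only counts non-NULL rows, so score2 ends up being the number of similar members who downloaded each item. If the intent was to weight by similarity, something like SUM(temp1.score) may be closer to what was meant.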