Let's say I have two tables, news and comments.
news (
id,
subject,
body,
posted
)
comments (
id,
parent, // points to news.id
message,
name,
posted
)
I would like to write one query that fetches the latest x news items, along with the name and posted date of the latest comment on each news post.
Speed matters, so selecting ALL the comments in a subquery is not an option.
7 Answers
#1
I just realized the query returns no row for a news item that has no comments attached. Here is the fix, plus an added column for the total number of comments:
SELECT news.*, comments.name, comments.posted, (SELECT count(id) FROM comments WHERE comments.parent = news.id) AS numComments
FROM news
LEFT JOIN comments
ON news.id = comments.parent
AND comments.id = (SELECT max(id) FROM comments WHERE parent = news.id)
#2
If speed is that important, why not create a recent_comment table that contains the id and parent id of just the most recent comments? Every time a comment is posted on a news post, replace that news id's most recent comment id. Create an index on the news id column of the new table and your joins will be fast.
You'd be trading write speed for read speed, but not by a whole lot.
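A rough sketch of this idea, assuming MySQL-style syntax. The table name recent_comment comes from the answer; the column names news_id and comment_id, the literal ids 42 and 1001, and the limit of 10 are placeholders for illustration, and the news and comments tables from the question are assumed to exist already.

```sql
-- One row per news item: the id of its most recent comment.
-- news_id and comment_id are assumed column names.
CREATE TABLE recent_comment (
    news_id    INT NOT NULL PRIMARY KEY,  -- the primary key doubles as the index on the news id
    comment_id INT NOT NULL
);

-- After inserting a comment (here: comment 1001 on news item 42),
-- overwrite that news item's pointer. REPLACE deletes any existing
-- row with the same primary key and inserts the new one.
REPLACE INTO recent_comment (news_id, comment_id) VALUES (42, 1001);

-- Latest 10 news items with the name and date of their latest comment.
-- The LEFT JOINs keep news items that have no comments yet.
SELECT n.id, n.subject, n.body, n.posted,
       c.name   AS commenter_name,
       c.posted AS comment_posted
FROM (SELECT id, subject, body, posted
      FROM news
      ORDER BY posted DESC
      LIMIT 10) AS n
LEFT JOIN recent_comment AS rc ON rc.news_id = n.id
LEFT JOIN comments AS c ON c.id = rc.comment_id
ORDER BY n.posted DESC;
```

Inserting the comment and updating the pointer in the same transaction keeps the two tables from drifting apart. If comments can be deleted, the pointer would need to be maintained on delete as well.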
#3
This assumes posted is a unique timestamp; otherwise, use a unique auto-increment column (such as id) instead:
select c.id, c.parent,
c.message, c.name,
c.posted -- same as comment_latest.recent
from comments c
join
(
select parent, max(posted) as recent
from comments
group by parent
) as comment_latest
on c.parent = comment_latest.parent
and c.posted = comment_latest.recent
Complete version (also displays the news information):
select
n.id as news_id, n.subject, n.body, n.posted as news_posted_date,
c.id as comment_id,
c.message, c.name as commenter_name, c.posted as comment_posted_date
from comments c
join
(
select r.parent, max(r.posted) as recent
from comments r
join
(
select id from news order by id desc limit $last_x_news
) as l
on r.parent = l.id
group by r.parent
) as comment_latest
on c.parent = comment_latest.parent
and c.posted = comment_latest.recent
join news n on c.parent = n.id
NOTE:
The code above is not a correlated subquery; it joins against a derived table, which is typically faster. This is the subquery version (slow):
select
id,
subject,
body,
posted as news_posted_date,
(select id from comments where parent = news.id order by posted desc limit 1) as comment_id,
(select message from comments where parent = news.id order by posted desc limit 1) as message,
(select name from comments where parent = news.id order by posted desc limit 1) as name,
(select posted from comments where parent = news.id order by posted desc limit 1) as comment_posted_date
from news
#4
SELECT news.subject, news.body, comments.name, comments.posted
FROM news
INNER JOIN comments ON
(comments.parent = news.id)
WHERE comments.id = (SELECT MAX(id)
FROM comments
WHERE parent = news.id)
ORDER BY news.id
This gets all the news items that have at least one comment, along with the related comment with the highest id value, which in theory should be the latest.
#5
My solution is similar to J's, but I think he added one line that is unnecessary:
SELECT news.*, comments.name, comments.posted FROM news INNER JOIN comments ON news.id = comments.parent WHERE comments.id = (SELECT max(id) FROM comments WHERE parent = news.id )
I'm not sure how fast this is on an extremely large table, though.
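How this behaves on a large table mostly comes down to indexing. A sketch, assuming MySQL-style syntax; the index name idx_comments_parent_id and the limit of 10 are made up for illustration. With a composite index on (parent, id), the correlated MAX(id) lookup can typically be answered from the index rather than by scanning every comment of a post, and cutting news down to the latest x rows first keeps the number of lookups at x.

```sql
-- Composite index: the per-post MAX(id) lookup can be resolved from it.
CREATE INDEX idx_comments_parent_id ON comments (parent, id);

-- Limit the news rows first, then fetch one comment per remaining row.
-- LEFT JOIN keeps news items that have no comments.
SELECT n.*, c.name, c.posted AS comment_posted
FROM (SELECT * FROM news ORDER BY posted DESC LIMIT 10) AS n
LEFT JOIN comments AS c
       ON c.parent = n.id
      AND c.id = (SELECT MAX(id) FROM comments WHERE parent = n.id)
ORDER BY n.posted DESC;
```

Running the SELECT through EXPLAIN is the way to confirm that the index is actually used on a given server and data set.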
#6
Given the constraints brought to light in the comments on my other answer, I have a new idea that may or may not make any sense in practice.
Create a view (or function if it's more appropriate) with the following definition, called recent_comments:
SELECT MAX(id) AS id, parent
FROM comments
GROUP BY parent
If you have a clustered index on the parent column, this is probably a reasonably fast query, but even then it will still be a bottleneck.
Using this view, the query you need is something like:
SELECT news.*, comments.*
FROM news
INNER JOIN recent_comments
ON news.id = recent_comments.parent
INNER JOIN comments
ON comments.id = recent_comments.id
News posts that don't have any comments yet need extra handling, since the inner joins drop them.
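One way to cover posts without comments, as a sketch: switch both joins to LEFT JOIN, so the comment columns simply come back NULL for those posts. The view is repeated here with MAX(id) aliased as id so that the join can reference recent_comments.id; the alias comment_posted is an assumption added to avoid a clash with news.posted.

```sql
-- Latest comment id per news item.
CREATE VIEW recent_comments AS
SELECT MAX(id) AS id, parent
FROM comments
GROUP BY parent;

-- LEFT JOINs keep every news row; name and comment_posted are NULL
-- for news items that have no comments yet.
SELECT news.*, comments.name, comments.posted AS comment_posted
FROM news
LEFT JOIN recent_comments
       ON news.id = recent_comments.parent
LEFT JOIN comments
       ON comments.id = recent_comments.id;
```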
#7
I think the solution provided by @Jan is the best, i.e. create the view and inner join it in the SQL statement.
It'll definitely reduce the time to pull the data. I tested it and it works 100%.