I have three tables with the following columns:
Topics: [TopicID] [TopicName]
Messages: [MessageID] [MessageText]
MessageTopicRelations: [EntryID] [MessageID] [TopicID]
Messages can be about more than one topic. The question is: given a couple of topics, I need to get the messages that are about ALL of these topics and no fewer, although they can be about other topics too. A message that is about only SOME of the given topics should not be included. I hope I explained my request well; otherwise I can provide sample data. Thanks.
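For illustration, here is a minimal sketch of what such sample data and the expected result could look like, using Python's built-in sqlite3 module. Only the table and column names come from the question; the topic names, message texts, and IDs are made up, and the plain-Python subset check is just a way to state the requirement, not a proposed solution.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Topics (TopicID INTEGER PRIMARY KEY, TopicName TEXT);
CREATE TABLE Messages (MessageID INTEGER PRIMARY KEY, MessageText TEXT);
CREATE TABLE MessageTopicRelations (
    EntryID   INTEGER PRIMARY KEY,
    MessageID INTEGER REFERENCES Messages(MessageID),
    TopicID   INTEGER REFERENCES Topics(TopicID)
);
INSERT INTO Topics VALUES (1, 'sql'), (2, 'joins'), (3, 'indexes');
INSERT INTO Messages VALUES (10, 'about sql only'),
                            (11, 'about sql and joins'),
                            (12, 'about all three');
INSERT INTO MessageTopicRelations (MessageID, TopicID) VALUES
    (10, 1), (11, 1), (11, 2), (12, 1), (12, 2), (12, 3);
""")

wanted = {1, 2}  # the given topics

# Collect the set of topics attached to each message.
topics_by_message = {}
for message_id, topic_id in conn.execute(
        "SELECT MessageID, TopicID FROM MessageTopicRelations"):
    topics_by_message.setdefault(message_id, set()).add(topic_id)

# A message qualifies when the wanted topics are a subset of its topics.
matching = sorted(m for m, t in topics_by_message.items() if wanted <= t)
print(matching)  # [11, 12]
```

Message 10 is excluded because it covers only some of the given topics, while message 12 is included even though it is also about a third topic.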
4 answers
#1
5
The following uses x, y, and z to stand in for topic IDs, since none were provided as examples.
Using JOINs:
-- one join to the relation table per required topic
SELECT m.*
FROM MESSAGES m
JOIN MESSAGETOPICRELATIONS mx ON mx.messageid = m.messageid
                             AND mx.topicid = x
JOIN MESSAGETOPICRELATIONS my ON my.messageid = m.messageid
                             AND my.topicid = y
JOIN MESSAGETOPICRELATIONS mz ON mz.messageid = m.messageid
                             AND mz.topicid = z
Using GROUP BY/HAVING COUNT(*):
SELECT m.*
FROM MESSAGES m
JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
JOIN TOPICS t ON t.topicid = mtr.topicid
WHERE t.topicid IN (x, y, z)
GROUP BY m.messageid, m.messagetext
HAVING COUNT(*) = 3
Of the two, the JOIN approach is safer.
The GROUP BY/HAVING approach relies on MESSAGETOPICRELATIONS.TOPICID being either part of the primary key or covered by a unique key constraint, so that there are no duplicates. Otherwise, the same topic could be associated with a message two or more times, and the count could reach 3 without all three topics being present: a false positive. Using HAVING COUNT(DISTINCT ...) would clear up any false positives, but support depends on the database: MySQL supports it in 5.1+, but not in 4.1. Oracle might; I'll have to wait until Monday to test on SQL Server...
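To make the duplicate problem concrete, here is a small sketch using Python's sqlite3 module. The data is made up (message 10 has topic 1 recorded twice, plus topic 2), and SQLite stands in for whatever database you actually use, so treat it as an illustration under those assumptions rather than a statement about any particular server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Messages (MessageID INTEGER PRIMARY KEY, MessageText TEXT);
CREATE TABLE MessageTopicRelations (
    EntryID INTEGER PRIMARY KEY, MessageID INTEGER, TopicID INTEGER);
INSERT INTO Messages VALUES (10, 'duplicated topic'), (11, 'all topics');
-- message 10: topic 1 twice plus topic 2; message 11: topics 1, 2 and 3
INSERT INTO MessageTopicRelations (MessageID, TopicID) VALUES
    (10, 1), (10, 1), (10, 2), (11, 1), (11, 2), (11, 3);
""")

# Same query twice; only the expression in HAVING differs.
query = """
SELECT m.MessageID
FROM Messages m
JOIN MessageTopicRelations mtr ON mtr.MessageID = m.MessageID
WHERE mtr.TopicID IN (1, 2, 3)
GROUP BY m.MessageID
HAVING {count_expr} = 3
ORDER BY m.MessageID
"""

count_star = [r[0] for r in conn.execute(query.format(count_expr="COUNT(*)"))]
count_distinct = [r[0] for r in conn.execute(
    query.format(count_expr="COUNT(DISTINCT mtr.TopicID)"))]

print(count_star)      # [10, 11] - message 10 slips in as a false positive
print(count_distinct)  # [11]     - only message 11 has all three topics
```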
I looked into Bill's comment about not needing the join to the TOPICS table:
SELECT m.*
FROM MESSAGES m
JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
AND mtr.topicid IN (x, y, z)
...will return false positives - rows that match at least one of the values defined in the IN clause. And:
SELECT m.*
FROM MESSAGES m
JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
AND mtr.topicid = x
AND mtr.topicid = y
AND mtr.topicid = z
...won't return anything at all, because the topicid of a single row can never equal all of the values at once.
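Both behaviours are easy to reproduce. The sketch below runs the two queries against made-up data in SQLite via Python's sqlite3 module, with 1, 2, and 3 standing in for x, y, and z; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Messages (MessageID INTEGER PRIMARY KEY, MessageText TEXT);
CREATE TABLE MessageTopicRelations (
    EntryID INTEGER PRIMARY KEY, MessageID INTEGER, TopicID INTEGER);
INSERT INTO Messages VALUES (10, 'one topic'), (11, 'all topics');
INSERT INTO MessageTopicRelations (MessageID, TopicID) VALUES
    (10, 1), (11, 1), (11, 2), (11, 3);
""")

# Join filtered with IN: every row matching ANY of the topics survives.
in_join = """
SELECT m.MessageID
FROM Messages m
JOIN MessageTopicRelations mtr ON mtr.MessageID = m.MessageID
                              AND mtr.TopicID IN (1, 2, 3)
ORDER BY m.MessageID
"""

# Join filtered with AND: one row would have to hold three values at once.
and_join = """
SELECT m.MessageID
FROM Messages m
JOIN MessageTopicRelations mtr ON mtr.MessageID = m.MessageID
                              AND mtr.TopicID = 1
                              AND mtr.TopicID = 2
                              AND mtr.TopicID = 3
"""

in_ids = [r[0] for r in conn.execute(in_join)]
and_ids = [r[0] for r in conn.execute(and_join)]

print(sorted(set(in_ids)))  # [10, 11] - message 10 has only one of the topics
print(and_ids)              # []       - no single row satisfies all three
```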
#2
1
Here's a profoundly inelegant solution:
SELECT
m.MessageID
,m.MessageText
FROM
Messages m
WHERE
m.MessageID IN (
SELECT
mt.MessageID
FROM
MessageTopicRelations mt
WHERE
TopicID IN (1,4,5) -- list of topic IDs
GROUP BY
mt.MessageID
HAVING
count(*) = 3 -- number of topics
)
#3
1
Edit: thanks to @Paul Creasey and @OMG Ponies for finding the flaws in my approach.
The correct way to do this is with a self-join for each topic, as shown in the leading answer.
Another profoundly inelegant entry:
select m.MessageText
, t.TopicName
from Messages m
inner join MessageTopicRelations mtr
on mtr.MessageID = m.MessageID
inner join Topics t
on t.TopicID = mtr.TopicID
and
t.TopicName = 'topic1'
UNION
select m.MessageText
, t.TopicName
from Messages m
inner join MessageTopicRelations mtr
on mtr.MessageID = m.MessageID
inner join Topics t
on t.TopicID = mtr.TopicID
and
t.TopicName = 'topic2'
...
#4
1
Re: the answer by OMG Ponies, you don't need to join to the TOPICS table. And the HAVING COUNT(DISTINCT) clause works fine in MySQL 5.1. I just tested it.
This is what I mean:
Using GROUP BY/HAVING COUNT(DISTINCT):
SELECT m.*
FROM MESSAGES m
JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
WHERE mtr.topicid IN (x, y, z)
GROUP BY m.messageid
HAVING COUNT(DISTINCT mtr.topicid) = 3
The reason I suggest COUNT(DISTINCT) is that if the columns (messageid, topicid) don't have a unique constraint, you could get duplicates, which would result in a count of 3 in the group even with fewer than three distinct values.
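If the list of topics varies at run time, the same COUNT(DISTINCT) query can be built for any number of topics. Below is a rough sketch in Python with sqlite3; the helper name messages_with_all_topics and the sample data are my own inventions for illustration, and returning an empty list for an empty topic list is an arbitrary choice.

```python
import sqlite3

def messages_with_all_topics(conn, topic_ids):
    """Return IDs of messages tagged with every topic in topic_ids (hypothetical helper)."""
    wanted = sorted(set(topic_ids))  # de-duplicate so the target count is meaningful
    if not wanted:
        return []  # arbitrary choice: no topics given means no matches
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT mtr.MessageID
        FROM MessageTopicRelations mtr
        WHERE mtr.TopicID IN ({placeholders})
        GROUP BY mtr.MessageID
        HAVING COUNT(DISTINCT mtr.TopicID) = ?
        ORDER BY mtr.MessageID
    """
    # Topic IDs are bound as parameters; only the "?" markers are interpolated.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MessageTopicRelations (
    EntryID INTEGER PRIMARY KEY, MessageID INTEGER, TopicID INTEGER);
-- made-up data; (11, 1) is deliberately duplicated
INSERT INTO MessageTopicRelations (MessageID, TopicID) VALUES
    (10, 1), (11, 1), (11, 1), (11, 2), (12, 1), (12, 2), (12, 3);
""")

both = messages_with_all_topics(conn, [1, 2])
all_three = messages_with_all_topics(conn, [1, 2, 3])
print(both)       # [11, 12]
print(all_three)  # [12]
```

Only the message IDs are selected here; join back to Messages if you need the text as well.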