I currently have 2 MySQL tables in my db: Film and Film_Ratings_Report.
The primary key for Film is filmid, which is used to identify the film ratings in the Film_Ratings_Report table.
I would like to know if it's possible, using only a MySQL query, to search the ratings table and collect all film IDs which fit certain criteria, then use the selected IDs to get the film titles from the Film table. Below is the MySQL query I'm using, which isn't working:
SELECT *
FROM film
UNION SELECT filmid
FROM film_rating_report
WHERE rating = 'GE'
LIMIT 0,0
I am relatively new to MySQL and would appreciate any help on this.
Thanks in advance
5 Solutions
#1
16
SELECT * FROM film WHERE filmid IN
(SELECT filmid FROM film_rating_report WHERE rating = 'GE');
should work
#2
13
It seems you want a semi-join, i.e. a join where only data from one of the 2 joined tables is needed. In this case, all rows from film for which there is a matching row in film_rating_report that satisfies the wanted condition (rating = 'GE').
This is not exactly equivalent to a usual join because even if there are 2 (or more) matching rows in the second table (2 ratings of a film, both with 'GE'), we still want the film to be shown once, not twice (or more times) as it would be with a usual join.
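To make the difference concrete, here is a minimal sketch that runs a usual join and a semi-join over the same data. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the title column and sample rows are assumptions made up for illustration; only the table names and filmid come from the question.

```python
import sqlite3

# Hypothetical sample data: film 1 has two 'GE' ratings, film 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_rating_report (filmid INTEGER, rating TEXT);
    INSERT INTO film VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO film_rating_report VALUES (1, 'GE'), (1, 'GE'), (2, 'PG'), (3, 'GE');
""")

# Usual join: one output row per matching rating row, so film 1 appears twice.
join_rows = conn.execute("""
    SELECT f.title
    FROM film AS f
    JOIN film_rating_report AS frr ON f.filmid = frr.filmid
    WHERE frr.rating = 'GE'
    ORDER BY f.filmid
""").fetchall()

# Semi-join via EXISTS: each film appears at most once.
semi_rows = conn.execute("""
    SELECT f.title
    FROM film AS f
    WHERE EXISTS (SELECT 1
                  FROM film_rating_report AS frr
                  WHERE frr.filmid = f.filmid AND frr.rating = 'GE')
    ORDER BY f.filmid
""").fetchall()

print(join_rows)  # [('Alpha',), ('Alpha',), ('Gamma',)]
print(semi_rows)  # [('Alpha',), ('Gamma',)]
```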
There are various ways to write a semi-join; the most popular are:
- using an EXISTS correlated subquery (@Justin's answer):
SELECT t1.* FROM film t1 WHERE EXISTS (SELECT filmid FROM film_rating_report t2 WHERE t2.rating = 'GE' AND t2.filmid = t1.filmid);
- using an IN (uncorrelated) subquery (@SG 86's answer). This should be used with extreme care, as it may return unexpected results (or none at all) if the joining columns (the two filmid columns in this case) are nullable:
SELECT * FROM film WHERE filmid IN ( SELECT filmid FROM film_rating_report WHERE rating = 'GE' );
- using a usual JOIN with a GROUP BY to avoid duplicate rows in the results (@Tomas' answer). Note that this specific use of GROUP BY works only in MySQL and in recent versions of Postgres; if you ever want to write a similar query in another DBMS, you'll have to include all columns: GROUP BY f.filmid, f.title, f.director, ...
SELECT f.* FROM film AS f JOIN film_rating_report AS frr ON f.filmid = frr.filmid WHERE frr.rating = 'GE' GROUP BY f.filmid ;
- a variation on @Tomas' answer, where the GROUP BY is done on a derived table and then the JOIN:
SELECT f.* FROM film AS f JOIN ( SELECT filmid FROM film_rating_report WHERE rating = 'GE' GROUP BY filmid ) AS frr ON f.filmid = frr.filmid ;
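As a quick sanity check, the four forms listed above can be run side by side on a small data set to confirm they return the same films. This is only a sketch: it uses Python's sqlite3 module rather than MySQL (SQLite also accepts the GROUP BY f.filmid shortcut), it assumes the film key column is named filmid as stated in the question, and the title column and rows are invented for the example.

```python
import sqlite3

# Hypothetical sample data: film 1 is rated 'GE' twice, film 2 is not 'GE'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_rating_report (filmid INTEGER, rating TEXT);
    INSERT INTO film VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO film_rating_report VALUES (1, 'GE'), (1, 'GE'), (2, 'PG'), (3, 'GE');
""")

queries = {
    "exists": """
        SELECT t1.* FROM film t1
        WHERE EXISTS (SELECT filmid FROM film_rating_report t2
                      WHERE t2.rating = 'GE' AND t2.filmid = t1.filmid)""",
    "in": """
        SELECT * FROM film
        WHERE filmid IN (SELECT filmid FROM film_rating_report
                         WHERE rating = 'GE')""",
    "join_group_by": """
        SELECT f.* FROM film AS f
        JOIN film_rating_report AS frr ON f.filmid = frr.filmid
        WHERE frr.rating = 'GE'
        GROUP BY f.filmid""",
    "derived_table": """
        SELECT f.* FROM film AS f
        JOIN (SELECT filmid FROM film_rating_report
              WHERE rating = 'GE' GROUP BY filmid) AS frr
          ON f.filmid = frr.filmid""",
}

# Sort each result so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall())
           for name, sql in queries.items()}

for name, rows in results.items():
    print(name, rows)  # every variant prints [(1, 'Alpha'), (3, 'Gamma')]
```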
Which one to use depends on the RDBMS and the specific version you are using (for example, IN subqueries should be avoided in most versions of MySQL as they may produce inefficient execution plans), your specific table sizes, distribution, indexes, etc.
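One way to see what the engine actually does with each form is to ask it for the execution plan. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, purely as an illustration; in MySQL the corresponding statement is EXPLAIN SELECT ..., and its output format and optimizer choices differ. The index name and the title column are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_rating_report (filmid INTEGER, rating TEXT);
    -- Hypothetical index covering the lookup used by the subqueries.
    CREATE INDEX idx_frr_filmid_rating ON film_rating_report (filmid, rating);
""")

queries = {
    "exists": """
        SELECT f.* FROM film AS f
        WHERE EXISTS (SELECT 1 FROM film_rating_report AS frr
                      WHERE frr.rating = 'GE' AND frr.filmid = f.filmid)""",
    "in": """
        SELECT * FROM film
        WHERE filmid IN (SELECT filmid FROM film_rating_report
                         WHERE rating = 'GE')""",
}

# The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
plans = {
    name: [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
    for name, sql in queries.items()
}

for name, steps in plans.items():
    print(name)
    for step in steps:
        print("   ", step)
```

The exact plan text depends on the engine and version, so treat it as something to inspect on your own data rather than a fixed result.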
I usually prefer the EXISTS solution, but it never hurts to first test the various queries with the table sizes you have or expect to have in the future and try to find the best query-index combination for your case.
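On the nullable-column caution given for the IN form: as far as I understand it, the surprise shows up mainly in the negated form, where a single NULL in the subquery makes NOT IN return no rows at all, while NOT EXISTS behaves as expected. The answer does not spell this out, so take it as my reading. A minimal sketch using Python's sqlite3 module, with invented sample rows and a nullable film_rating_report.filmid:

```python
import sqlite3

# Hypothetical data: one rating row has a NULL filmid.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_rating_report (filmid INTEGER, rating TEXT);
    INSERT INTO film VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO film_rating_report VALUES (1, 'GE'), (NULL, 'GE');
""")

def film_ids(sql):
    """Run a query that selects filmid and return the ids in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))

# Positive forms agree: only film 1 has a 'GE' rating.
in_ids = film_ids("""
    SELECT filmid FROM film
    WHERE filmid IN (SELECT filmid FROM film_rating_report WHERE rating = 'GE')""")
exists_ids = film_ids("""
    SELECT f.filmid FROM film AS f
    WHERE EXISTS (SELECT 1 FROM film_rating_report AS frr
                  WHERE frr.rating = 'GE' AND frr.filmid = f.filmid)""")

# Negated forms differ: "x NOT IN (1, NULL)" is never true, so no rows match.
not_in_ids = film_ids("""
    SELECT filmid FROM film
    WHERE filmid NOT IN (SELECT filmid FROM film_rating_report WHERE rating = 'GE')""")
not_exists_ids = film_ids("""
    SELECT f.filmid FROM film AS f
    WHERE NOT EXISTS (SELECT 1 FROM film_rating_report AS frr
                      WHERE frr.rating = 'GE' AND frr.filmid = f.filmid)""")

print(in_ids, exists_ids)          # [1] [1]
print(not_in_ids, not_exists_ids)  # [] [2, 3]
```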
Addition: if there is a unique constraint on the film_rating_report (filmid, rating) combination, which means that no film will ever get the same rating twice, or if there is an even stricter (but more plausible) unique constraint on film_rating_report (filmid), which means that every film has at most one rating, you can simplify the JOIN solutions to the following (and get rid of all the other queries):
SELECT f.*
FROM film AS f
JOIN film_rating_report AS frr
ON f.filmid = frr.filmid
WHERE frr.rating = 'GE' ;
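A small sketch of why the constraint makes the plain join safe: once duplicates of (filmid, rating) are rejected by the database, each film can match the 'GE' condition at most once. This uses Python's sqlite3 module as a stand-in for MySQL; the title column and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_rating_report (
        filmid INTEGER NOT NULL,
        rating TEXT NOT NULL,
        UNIQUE (filmid, rating)  -- no film can get the same rating twice
    );
    INSERT INTO film VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO film_rating_report VALUES (1, 'GE'), (2, 'PG'), (3, 'GE');
""")

# A second 'GE' row for film 1 violates the unique constraint.
try:
    conn.execute("INSERT INTO film_rating_report VALUES (1, 'GE')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# With duplicates impossible, the plain join needs no GROUP BY.
rows = conn.execute("""
    SELECT f.*
    FROM film AS f
    JOIN film_rating_report AS frr ON f.filmid = frr.filmid
    WHERE frr.rating = 'GE'
    ORDER BY f.filmid
""").fetchall()

print(duplicate_rejected)  # True
print(rows)                # [(1, 'Alpha'), (3, 'Gamma')]
```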
#3
3
The preferred solution for this is to use join, and don't forget group by so that you don't have duplicate lines:
select film.*
from film
join film_rating_report on film.filmid = film_rating_report.filmid
and rating = 'GE'
group by film.filmid
EDIT: as correctly noted by @ypercube, I was wrong in claiming that the performance of join & group by is better than using subqueries with exists or in; quite the opposite.
#4
1
Query:
SELECT t1.*
FROM film t1
WHERE EXISTS (SELECT filmid
FROM film_rating_report t2
WHERE t2.rating = 'GE'
AND t2.filmid = t1.filmid);
#5
-1
I believe this will work, though without knowing your DB structure (consider posting the SHOW CREATE TABLE output for your tables), I have no way to know for sure:
SELECT film.*
FROM (film)
LEFT JOIN film_rating_report ON film.filmid = film_rating_report.filmid AND film_rating_report.rating = 'GE'
WHERE film_rating_report.filmid IS NOT NULL
GROUP BY film.filmid
(The WHERE film_rating_report.filmid IS NOT NULL condition prevents rows that don't have the rating you are seeking from sneaking in. I added GROUP BY at the end because film_rating_report might match more than once; I'm not sure, as I have no visibility into the data stored in it.)
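For what it's worth, filtering a LEFT JOIN with IS NOT NULL on the joined key gives the same rows as an ordinary inner join, which the sketch below checks on a small data set. It runs on Python's sqlite3 module rather than MySQL, and the title column and sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample data: film 1 has two 'GE' ratings, film 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_rating_report (filmid INTEGER, rating TEXT);
    INSERT INTO film VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO film_rating_report VALUES (1, 'GE'), (1, 'GE'), (2, 'PG'), (3, 'GE');
""")

# Without the WHERE filter, the LEFT JOIN keeps film 2 with a NULL match.
unfiltered = conn.execute("""
    SELECT DISTINCT film.filmid, film_rating_report.filmid
    FROM film
    LEFT JOIN film_rating_report
      ON film.filmid = film_rating_report.filmid
     AND film_rating_report.rating = 'GE'
    ORDER BY film.filmid
""").fetchall()

# The answer's query: the IS NOT NULL filter drops the unmatched films.
left_join_rows = conn.execute("""
    SELECT film.*
    FROM film
    LEFT JOIN film_rating_report
      ON film.filmid = film_rating_report.filmid
     AND film_rating_report.rating = 'GE'
    WHERE film_rating_report.filmid IS NOT NULL
    GROUP BY film.filmid
    ORDER BY film.filmid
""").fetchall()

# Equivalent inner join for comparison.
inner_join_rows = conn.execute("""
    SELECT film.*
    FROM film
    JOIN film_rating_report
      ON film.filmid = film_rating_report.filmid
     AND film_rating_report.rating = 'GE'
    GROUP BY film.filmid
    ORDER BY film.filmid
""").fetchall()

print(unfiltered)       # [(1, 1), (2, None), (3, 3)]
print(left_join_rows)   # [(1, 'Alpha'), (3, 'Gamma')]
print(inner_join_rows)  # [(1, 'Alpha'), (3, 'Gamma')]
```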