How does a MySQL self-join work?

Date: 2022-03-29 15:06:07

I recently asked a question about Self-Joins and I got a great answer.

The query is meant to find the ID, start date, and price of event2, the event that follows event1 by one day.

The code WORKS fine. But I don't understand HOW.

Could someone explain, as thoroughly as possible, what the different parts of the query are and what they do?

SELECT event2.id, event2.startdate, event2.price
FROM mm_eventlist_dates event1
JOIN mm_eventlist_dates event2 
ON event2.startdate = date_add(event1.enddate, INTERVAL 1 DAY)
WHERE event1.id=$id

I really appreciate your help. For whatever reason, I'm having a really hard time wrapping my head around this.

4 Answers

#1


17  

The way I'd try to understand this is to write out two lists on a piece of paper, one labelled event1 and one labelled event2. Then list a few records in each list (the lists will be identical). Now start at the WHERE in the description below.

We're taking data from two tables (OK, the same table used twice, but try to ignore that for the moment).

FROM mm_eventlist_dates event1
JOIN mm_eventlist_dates event2 

It probably helps to read the rest from the bottom up.

  WHERE event1.id=$id

So we want the record from event1 that has the specified record id. Presumably that's exactly one record. Now we figure out the day after that event ended.

 date_add(event1.enddate, INTERVAL 1 DAY)

That date tells us which records we want from event2: the ones that start on that date.

ON event2.startdate = date_add(event1.enddate, INTERVAL 1 DAY)

We now have two records identified. Which fields do we want?

SELECT event2.id, event2.startdate, event2.price

Oh, just the fields from the one whose start date we figured out.
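
The bottom-up reading above can be replayed as three separate steps and compared with the single self-join. This is only a sketch under stated assumptions: it uses Python's sqlite3 module with an in-memory table, the sample rows and column types are invented, and SQLite's date(x, '+1 day') stands in for MySQL's date_add(x, INTERVAL 1 DAY).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mm_eventlist_dates "
    "(id INTEGER PRIMARY KEY, startdate TEXT, enddate TEXT, price REAL)"
)
# Hypothetical sample rows: event 2 starts the day after event 1 ends.
conn.executemany(
    "INSERT INTO mm_eventlist_dates VALUES (?, ?, ?, ?)",
    [
        (1, "2022-03-01", "2022-03-03", 10.0),
        (2, "2022-03-04", "2022-03-05", 20.0),
        (3, "2022-03-10", "2022-03-12", 30.0),
    ],
)

event_id = 1  # plays the role of $id in the original query

# Step 1 (WHERE): pick the single event1 row.
event1_row = conn.execute(
    "SELECT id, enddate FROM mm_eventlist_dates WHERE id = ?", (event_id,)
).fetchone()

# Step 2 (date arithmetic): the day after that event ended.
next_day = conn.execute(
    "SELECT date(?, '+1 day')", (event1_row[1],)
).fetchone()[0]

# Step 3 (ON + SELECT): rows starting on that day, "event2" columns only.
step_by_step = conn.execute(
    "SELECT id, startdate, price FROM mm_eventlist_dates WHERE startdate = ?",
    (next_day,),
).fetchall()

# The self-join performs all three steps in one statement.
joined = conn.execute(
    """
    SELECT event2.id, event2.startdate, event2.price
    FROM mm_eventlist_dates event1
    JOIN mm_eventlist_dates event2
      ON event2.startdate = date(event1.enddate, '+1 day')
    WHERE event1.id = ?
    """,
    (event_id,),
).fetchall()

print(next_day)
print(step_by_step)
print(joined)
```

With this sample data both approaches return the same single row for event 2, which is the point of the walk-through: the join is a compact way of writing the three manual lookups.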

#2


8  

When you create a join in a query you are literally doing that: joining 2 tables together. These can be 2 different tables or the same table twice; it doesn't matter. When specifying a join, creating an alias for the table (a name that refers to it in the rest of the query) is useful if the tables are different, and essential if they are the same. Your query is taking table 1 (event1), which has, among others, the columns:

event1.id, event1.startdate, event1.price

and joining table 2 (event2):

event2.id, event2.startdate, event2.price

which leaves you with the result set:

event1.id, event1.startdate, event1.price, event2.id, event2.startdate, event2.price

The criteria for the join is specified as:

ON event2.startdate = date_add(event1.enddate, INTERVAL 1 DAY)

which is saying 'For each row in event1, join the row(s) in event2 that have a startdate 1 day after the enddate in event1'.

Then, as you are only interested in the data for one event, the WHERE clause limits the result set.

WHERE event1.id=$id

Finally, as you don't need the information from event1 about the original event, your SELECT statement simply picks the event2 columns from the result set:

SELECT event2.id, event2.startdate, event2.price
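
To make the intermediate result set visible before the WHERE clause and the column list trim it, one can select columns from both aliases and compare the output with and without the filter. The following is a rough sketch, not the asker's real setup: it runs on Python's sqlite3 with invented rows, and SQLite's date(x, '+1 day') replaces MySQL's date_add(x, INTERVAL 1 DAY).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mm_eventlist_dates "
    "(id INTEGER PRIMARY KEY, startdate TEXT, enddate TEXT, price REAL)"
)
# Hypothetical rows: 1 is followed by 2, 2 is followed by 3, 3 by nothing.
conn.executemany(
    "INSERT INTO mm_eventlist_dates VALUES (?, ?, ?, ?)",
    [
        (1, "2022-03-01", "2022-03-03", 10.0),
        (2, "2022-03-04", "2022-03-05", 20.0),
        (3, "2022-03-06", "2022-03-08", 30.0),
    ],
)

select_sql = "SELECT event1.id, event1.enddate, event2.id, event2.startdate "
join_sql = """
    FROM mm_eventlist_dates event1
    JOIN mm_eventlist_dates event2
      ON event2.startdate = date(event1.enddate, '+1 day')
"""

# No WHERE yet: every (event1, event2) pair that satisfies the ON condition.
all_pairs = conn.execute(
    select_sql + join_sql + " ORDER BY event1.id"
).fetchall()

# WHERE keeps only the pairs whose event1 side is the requested event.
filtered = conn.execute(
    select_sql + join_sql + " WHERE event1.id = ?", (1,)
).fetchall()

print(all_pairs)
print(filtered)
```

Event 3 does not appear on the event1 side of any pair because nothing starts the day after it ends; an inner join silently drops such rows.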

#3


6  

A self join works by referencing the same table twice using different selection criteria. Think of each reference to the table as a different "virtual table" created by filtering the original table. Generally, one of the tables is "filtered" using the WHERE clause and the second is "filtered" in the join clause. This is the conventional way to write it; for an inner join it is also possible to "filter" both in the join clause.

So we have two virtual tables based on data in the same underlying table and they are joined together as though they were two totally separate tables.

The crux of it is that you store data in one table that takes on slightly different meaning based on context.

Consider a table of people, each with a unique id and a column holding the id of their father:

    id   name    fatherID
    1    Joseph  [null]
    2    Greg     1

    SELECT child.name as childName, father.name as fatherName
        FROM people as child
        INNER JOIN people as father on (child.fatherID = father.id)  

This would yield 1 row, because Joseph's fatherID is null and so matches no row on the father side:

    childName   fatherName
    Greg       Joseph
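
The people example can be run as written. The sketch below loads the two rows into an in-memory SQLite database through Python's sqlite3 (the column types are an assumption), runs the same INNER JOIN, and adds a LEFT JOIN variant that is not part of the answer, purely to show where Joseph's row went.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, fatherID INTEGER)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Joseph", None), (2, "Greg", 1)],
)

# Same table, two roles: "child" rows are matched to "father" rows.
inner_rows = conn.execute(
    """
    SELECT child.name AS childName, father.name AS fatherName
    FROM people AS child
    INNER JOIN people AS father ON (child.fatherID = father.id)
    """
).fetchall()

# Illustrative variant (not in the answer): LEFT JOIN keeps children
# whose fatherID is NULL and fills the father columns with NULL.
left_rows = conn.execute(
    """
    SELECT child.name AS childName, father.name AS fatherName
    FROM people AS child
    LEFT JOIN people AS father ON (child.fatherID = father.id)
    ORDER BY child.id
    """
).fetchall()

print(inner_rows)
print(left_rows)
```

The inner join returns only the Greg/Joseph pair, while the left join also returns Joseph with no father, since a NULL fatherID never compares equal to any id.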

#4


5  

A join combines two tables based on certain criteria. The criteria are what follows the ON keyword.

If you join a table with itself, it is effectively the same as creating a copy of the table, renaming it, and performing the join with that copy.

For example

Table foo         Table bar
+---+---+---+     +---+---+---+
| a | b | c |     | a | d | e |
+---+---+---+     +---+---+---+
| 1 | 2 | 3 |     | 1 | 0 | 0 |
+---+---+---+     +---+---+---+
| 1 | 3 | 4 |     | 2 | 9 | 3 |
+---+---+---+     +---+---+---+
| 1 | 3 | 5 |
+---+---+---+
| 2 | 4 | 6 |
+---+---+---+

If we do

select * from foo join bar on (foo.a = bar.a and foo.c > 4)

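we end up with the following (select * actually returns column a twice, once from each table; it is shown only once here):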

foo join bar on (foo.a = bar.a and foo.c > 4)
+---+---+---+---+---+
| a | b | c | d | e |
+---+---+---+---+---+
| 1 | 3 | 5 | 0 | 0 |
+---+---+---+---+---+
| 2 | 4 | 6 | 9 | 3 |
+---+---+---+---+---+
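
The foo/bar example can be checked directly. The sketch below recreates both tables in an in-memory SQLite database via Python's sqlite3 (integer column types are assumed, and the ORDER BY is added only to make the output order predictable) and runs the same join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (a INTEGER, b INTEGER, c INTEGER)")
conn.execute("CREATE TABLE bar (a INTEGER, d INTEGER, e INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [(1, 2, 3), (1, 3, 4), (1, 3, 5), (2, 4, 6)],
)
conn.executemany("INSERT INTO bar VALUES (?, ?, ?)", [(1, 0, 0), (2, 9, 3)])

cursor = conn.execute(
    "SELECT * FROM foo JOIN bar ON (foo.a = bar.a AND foo.c > 4) "
    "ORDER BY foo.c"
)
# cursor.description holds one entry per result column.
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(columns)
print(rows)
```

Only the foo rows with c > 4 survive, each paired with the bar row sharing its a value. The raw result has six columns because both tables contribute their own a.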

Now,

SELECT event2.id, event2.startdate, event2.price 
FROM mm_eventlist_dates event1                     
JOIN mm_eventlist_dates event2 
ON event2.startdate = date_add(event1.enddate, INTERVAL 1 DAY)
WHERE event1.id=$id

that query follows the same principle, but with two instances of the table mm_eventlist_dates, one aliased as event1 and the other as event2. We can think of this as having two tables and performing the join just as in a real two-table scenario.

The join criterion in this case is that the startdate of event2 matches the enddate of event1 plus one day.

The WHERE clause restricts which rows take part in the join: in this case, the join is performed only for the rows of event1 that have the supplied id.
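
The "copy of the table" idea can be tested literally. In this sketch, which assumes Python's sqlite3, invented sample rows, a hypothetical copy named events_copy, and SQLite's date(x, '+1 day') in place of MySQL's date_add(x, INTERVAL 1 DAY), the self-join and a join against a physical copy give the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mm_eventlist_dates "
    "(id INTEGER PRIMARY KEY, startdate TEXT, enddate TEXT, price REAL)"
)
# Hypothetical rows: two events (2 and 4) start the day after event 1 ends.
conn.executemany(
    "INSERT INTO mm_eventlist_dates VALUES (?, ?, ?, ?)",
    [
        (1, "2022-03-01", "2022-03-03", 10.0),
        (2, "2022-03-04", "2022-03-05", 20.0),
        (3, "2022-03-10", "2022-03-12", 30.0),
        (4, "2022-03-04", "2022-03-04", 5.0),
    ],
)
# A real, separate copy of the table (the name is made up for this sketch).
conn.execute("CREATE TABLE events_copy AS SELECT * FROM mm_eventlist_dates")

query = """
    SELECT event2.id, event2.startdate, event2.price
    FROM mm_eventlist_dates event1
    JOIN {second_table} event2
      ON event2.startdate = date(event1.enddate, '+1 day')
    WHERE event1.id = ?
    ORDER BY event2.id
"""

# Self-join: the same table under two aliases.
self_join = conn.execute(
    query.format(second_table="mm_eventlist_dates"), (1,)
).fetchall()

# Two-table join: event2 now refers to the physical copy.
copy_join = conn.execute(
    query.format(second_table="events_copy"), (1,)
).fetchall()

print(self_join)
print(copy_join)
```

The aliases spare you the copy: the database reads one table but treats each alias as an independent set of rows. The sample data also shows that the query can return more than one row when several events start on the same following day.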
