SQL - SubQuery和外表之间的关系

时间:2021-10-26 04:17:00

Problem

问题

I need to better understand the rules about when I can reference an outer table in a subquery and when (and why) that is an inappropriate request. I've discovered a duplication in an Oracle SQL query I'm trying to refactor but I'm running into issues when I try and turn my referenced table into a grouped subQuery.

我需要更好地理解有关何时可以在子查询中引用外部表以及何时(以及为什么)这是一个不适当的请求的规则。我在Oracle SQL查询中发现了一个重复,我正在尝试重构,但是当我尝试将我的引用表转换为分组子查询时,我遇到了问题。

The following statement works appropriately:

以下声明适用:

SELECT  t1.*  
FROM    table1 t1, 
INNER JOIN table2 t2 
        on t1.id = t2.id        
        and t2.date = (SELECT max(date) 
                       FROM   table2  
                       WHERE  id = t1.id) --This subquery has access to t1

Unfortunately table2 sometimes has duplicate records so I need to aggregate t2 first before I join it to t1. However when I try and wrap it in a subquery to accomplish this operation, suddenly the SQL engine can't recognize the outer table any longer.

不幸的是,table2有时会有重复的记录,所以在加入t1之前我需要首先聚合t2。但是,当我尝试将其包装在子查询中以完成此操作时,SQL引擎突然无法再识别外部表。

SELECT  t1.* 
FROM    table1 t1, 
INNER JOIN (SELECT * 
            FROM  table2 t2
            WHERE t1.id = t2.id              --This loses access to t1
              and t2.date = (SELECT max(date) 
                             FROM   table2 
                             WHERE  id = t1.id)) sub on t1.id = sub.id 
                             --Subquery loses access to t1

I know these are fundamentally different queries I'm asking the compiler to put together but I'm not seeing why the one would work but not the other.

我知道这些是根本不同的查询我要求编译器放在一起,但我不明白为什么一个会工作而不是另一个。

I know I can duplicate the table references in my subquery and effectively detach my subquery from the outer table but that seems like a really ugly way of accomplishing this task (what with all the duplication of code and processing).

我知道我可以复制子查询中的表引用并有效地将子查询从外表中分离出来,但这似乎是完成此任务的一种非常难看的方式(所有代码和处理的重复都是如此)。

Helpful References

有用的参考资料

  • I found this fantastic description of the order in which clauses are executed in SQL Server: (INNER JOIN ON vs WHERE clause). I'm using Oracle but I would think that this would be standard across the board. There is a clear order to clause evaluation (with FROM being first) so I would think that any clause occuring further down the list would have access to all information previously processed. I can only assume my 2nd query somehow changes that ordering so that my subquery is being evaluated too early?

    我发现了这个在SQL Server中执行子句的顺序的精彩描述:(INNER JOIN ON vs WHERE子句)。我正在使用Oracle,但我认为这将是全面的标准。有一个明确的命令条款评估(FROM是第一个)所以我认为在列表中进一步发生的任何条款都可以访问以前处理过的所有信息。我只能假设我的第二个查询以某种方式更改了这个排序,以便我的子查询过早评估?

  • In addition, I found a similar question asked (Referencing outer query's tables in a subquery ) but while the input was good they never really explained why he couldn't do what he is doing and just gave alternative solutions to his problem. I've tried their alternate solutions but it's causing me other issues. Namely, that subquery with the date reference is fundamental to the entire operation so I can't get rid of it.

    另外,我发现了一个类似的问题(在子查询中引用外部查询的表)但是虽然输入很好,但他们从未真正解释过为什么他不能做他正在做的事情而只是为他的问题提供了替代解决方案。我已经尝试了他们的替代解决方案,但它引起了我的其他问题。也就是说,带有日期引用的子查询是整个操作的基础,所以我无法摆脱它。

Questions

问题

  • I want to understand what I've done here... Why can my initial subquery see the outer table but not after I wrap the entire statement in a subquery?

    我想了解我在这里做了什么...为什么我的初始子查询可以看到外表但是在我将整个语句包装在子查询中之后?

  • That said, if what I'm trying to do can't be done, what is the best way of refactoring the first query to eliminate the duplication? Should I reference table1 twice (with all the duplication that requires)? Or is there (probably) a better way of tackling this problem?

    也就是说,如果我要做的事情无法完成,重构第一个查询以消除重复的最佳方法是什么?我应该引用table1两次(所有重复需要)吗?或者是否(可能)有更好的方法来解决这个问题?

Thanks in advance!

提前致谢!

------EDIT------

- - - 编辑 - - -

As some have surmised these queries above are not the actually query I'm refactoring but an example of the problem I'm running into. The query I'm working with is a lot more complicated so I'm hesitant to post it here as I'm afraid it will get people off track.

正如一些人推测上面的这些查询不是我正在重构的实际查询,而是我遇到的问题的一个例子。我正在使用的查询要复杂得多,所以我不愿在这里发帖,因为我担心它会让人偏离轨道。

------UPDATE------

------ UPDATE ------

So I ran this by a fellow developer and he had one possible explanation for why my subquery is losing access to t1. Because I'm wrapping this subquery in a parenthesis, he thinks that this subquery is being evaluated before my table t1 is being evaluated. This would definitely explain the 'ORA-00904: "t1"."id": invalid identifier' error I've been receiving. It would also suggest that like arithmetic order of operations, that adding parens to a statement gives it priority within certain clause evaluations. I would still love for an expert to weigh in if they agree/disagree that is a logical explanation for what I'm seeing here.

所以我由一位开发人员运行这个,他有一个可能的解释为什么我的子查询失去了对t1的访问权限。因为我将这个子查询包装在括号中,所以他认为在我的表t1被评估之前正在评估这个子查询。这肯定会解释'ORA-00904:“t1”。“id”:我收到的无效标识符'错误。它还表明,与操作的算术顺序一样,将语句添加到语句中会使其在某些子句评估中具有优先级。如果他们同意/不同意这是我在这里看到的合乎逻辑的解释,我仍然希望专家能够权衡。

3 个解决方案

#1


6  

So I figured this out based on the comment that Martin Smith made above (THANKS MARTIN!) and I wanted to make sure I shared my discovery for anyone else who trips across this issue.

所以我根据马丁史密斯在上面做出的评论(感谢马丁!)来解决这个问题,我想确保我和其他任何人一起分享我的发现。

Technical Considerations

技术考虑因素

Firstly, it would certainly help if I used the proper terminology to describe my problem: My first statement above uses a correlated subquery:

首先,如果我使用适当的术语来描述我的问题,它肯定会有所帮助:我上面的第一个语句使用了相关的子查询:

This is actually a fairly inefficient way of pulling back data as it reruns the subquery for every line in the outer table. For this reason I'm going to look for ways of eliminating these type of subqueries in my code:

这实际上是一种相当低效的数据撤销方式,因为它重新运行外部表中每一行的子查询。出于这个原因,我将在我的代码中寻找消除这些类型的子查询的方法:

My second statement on the other hand was using what is called an inline view in Oracle also known as a derived table in SQL Server:

另一方面,我的第二个声明是在Oracle中使用所谓的内联视图,也称为SQL Server中的派生表:

An inline view / derived table creates a temporary unnamed view at the beginning of your query and then treats it like another table until the operation is complete. Because the compiler needs to create a temporary view when it sees on of these subqueries on the FROM line, those subqueries must be entirely self-contained with no references outside the subquery.

内联视图/派生表在查询开头创建临时未命名视图,然后将其视为另一个表,直到操作完成。因为编译器需要在FROM行上查看这些子查询时创建临时视图,所以这些子查询必须完全自包含且子查询外没有引用。

Why what I was doing was stupid

为什么我在做什么是愚蠢的

What I was trying to do in that second table was essentially create a view based on an ambiguous reference to another table that was outside the knowledge of my statement. It would be like trying to reference a field in a table that you hadn't explicitly stated in your query.

我在第二个表中尝试做的事实上是基于对我的陈述知识之外的另一个表的模糊引用创建一个视图。这就像尝试引用您未在查询中明确声明的表中的字段。

Workaround

解决方法

Lastly, it's worth noting that Martin suggested a fairly clever but ultimately inefficient way to accomplish what I was trying to do. The Apply statement is a proprietary SQL Server function but it allows you to talk to objects outside of your derived table:

最后,值得注意的是,马丁提出了一种相当聪明但最终效率低下的方法来完成我想要做的事情。 Apply语句是一个专有的SQL Server函数,但它允许您与派生表之外的对象进行通信:

Likewise this functionality is available in Oracle through different syntax:

同样,Oracle可以通过不同的语法提供此功能:

Ultimately I'm going to re-evaluate my entire approach to this query which means I'll have to rebuild it from scratch (believe it or not I didn't create this monstrocity originally - I swear!). A big thanks to everyone who commented - this was definitely stumping me but all of the input helped put me on the right track!

最终,我将重新评估我对这个查询的整个方法,这意味着我将不得不从头开始重建它(不管你信不信,我最初都没有创建这个怪物 - 我发誓!)。非常感谢所有评论的人 - 这绝对让我很难过,但所有的投入都帮助我走上正轨!

#2


0  

How about the following query:

以下查询如何:

SELECT t1.* FROM 
(
  SELECT * 
  FROM 
  (
    SELECT t2.id,
    RANK() OVER (PARTITION BY t2.id, t2.date ORDER BY t2.date DESC) AS R  
    FROM table2 t2
  )
  WHERE R = 1
) sub 
INNER JOIN table1 t1 
ON t1.id = sub.id

#3


-1  

In your second example you are trying to pass the t1 reference down 2 levels.. you can't do that, you can only pass it down 1 level (which is why the 1st works). If you give a better example of what you are trying to do, we can help you rewrite your query as well.

在你的第二个例子中,你试图将t1参考传递到2级...你不能这样做,你只能将它传递到1级(这就是为什么第1级工作)。如果您提供更好的示例,我们也可以帮助您重写查询。

#1


6  

So I figured this out based on the comment that Martin Smith made above (THANKS MARTIN!) and I wanted to make sure I shared my discovery for anyone else who trips across this issue.

所以我根据马丁史密斯在上面做出的评论(感谢马丁!)来解决这个问题,我想确保我和其他任何人一起分享我的发现。

Technical Considerations

技术考虑因素

Firstly, it would certainly help if I used the proper terminology to describe my problem: My first statement above uses a correlated subquery:

首先,如果我使用适当的术语来描述我的问题,它肯定会有所帮助:我上面的第一个语句使用了相关的子查询:

This is actually a fairly inefficient way of pulling back data as it reruns the subquery for every line in the outer table. For this reason I'm going to look for ways of eliminating these type of subqueries in my code:

这实际上是一种相当低效的数据撤销方式,因为它重新运行外部表中每一行的子查询。出于这个原因,我将在我的代码中寻找消除这些类型的子查询的方法:

My second statement on the other hand was using what is called an inline view in Oracle also known as a derived table in SQL Server:

另一方面,我的第二个声明是在Oracle中使用所谓的内联视图,也称为SQL Server中的派生表:

An inline view / derived table creates a temporary unnamed view at the beginning of your query and then treats it like another table until the operation is complete. Because the compiler needs to create a temporary view when it sees on of these subqueries on the FROM line, those subqueries must be entirely self-contained with no references outside the subquery.

内联视图/派生表在查询开头创建临时未命名视图,然后将其视为另一个表,直到操作完成。因为编译器需要在FROM行上查看这些子查询时创建临时视图,所以这些子查询必须完全自包含且子查询外没有引用。

Why what I was doing was stupid

为什么我在做什么是愚蠢的

What I was trying to do in that second table was essentially create a view based on an ambiguous reference to another table that was outside the knowledge of my statement. It would be like trying to reference a field in a table that you hadn't explicitly stated in your query.

我在第二个表中尝试做的事实上是基于对我的陈述知识之外的另一个表的模糊引用创建一个视图。这就像尝试引用您未在查询中明确声明的表中的字段。

Workaround

解决方法

Lastly, it's worth noting that Martin suggested a fairly clever but ultimately inefficient way to accomplish what I was trying to do. The Apply statement is a proprietary SQL Server function but it allows you to talk to objects outside of your derived table:

最后,值得注意的是,马丁提出了一种相当聪明但最终效率低下的方法来完成我想要做的事情。 Apply语句是一个专有的SQL Server函数,但它允许您与派生表之外的对象进行通信:

Likewise this functionality is available in Oracle through different syntax:

同样,Oracle可以通过不同的语法提供此功能:

Ultimately I'm going to re-evaluate my entire approach to this query which means I'll have to rebuild it from scratch (believe it or not I didn't create this monstrocity originally - I swear!). A big thanks to everyone who commented - this was definitely stumping me but all of the input helped put me on the right track!

最终,我将重新评估我对这个查询的整个方法,这意味着我将不得不从头开始重建它(不管你信不信,我最初都没有创建这个怪物 - 我发誓!)。非常感谢所有评论的人 - 这绝对让我很难过,但所有的投入都帮助我走上正轨!

#2


0  

How about the following query:

以下查询如何:

SELECT t1.* FROM 
(
  SELECT * 
  FROM 
  (
    SELECT t2.id,
    RANK() OVER (PARTITION BY t2.id, t2.date ORDER BY t2.date DESC) AS R  
    FROM table2 t2
  )
  WHERE R = 1
) sub 
INNER JOIN table1 t1 
ON t1.id = sub.id

#3


-1  

In your second example you are trying to pass the t1 reference down 2 levels.. you can't do that, you can only pass it down 1 level (which is why the 1st works). If you give a better example of what you are trying to do, we can help you rewrite your query as well.

在你的第二个例子中,你试图将t1参考传递到2级...你不能这样做,你只能将它传递到1级(这就是为什么第1级工作)。如果您提供更好的示例,我们也可以帮助您重写查询。