Which is better/faster: joining many tables or selecting from one big table

Time: 2021-10-25 15:31:37

We are using Oracle 11, and our application is written in Java. Once a day, usually in the afternoon, our database freezes because of many big SQL queries. I want to optimize these queries somehow. The queries consist of many joins of different tables. My question is: is it better for performance to use left joins, or is it better to put all the information in one table and use a single select? Assume I will build good indexes for this table.

For information: on average, one query fetches 100 megabytes. These queries also sometimes lock each other.

Update

The query joins 8 tables. They are ordinary tables with 3-5 columns, one of which is some ID. The SQL query looks like:

SELECT t1.c1,t2.c5, t6.c2, ... FROM t1
LEFT JOIN t2 ON t1.c1 = t2.c1
LEFT JOIN t3 ON t3.c1 = t2.c2
LEFT JOIN t4 ON t4.c1 = t2.c1
LEFT JOIN t5 ON t5.c5 = t1.c1
LEFT JOIN t6 ON t6.c1 = t2.c1
LEFT JOIN t7 ON t7.c3 = t3.c1
LEFT JOIN t8 ON t8.c1 = t2.c1
WHERE something

My question is: is it better to create one new table that consists of all the joined tables, and use a query like this:

SELECT c1,c5, c2, ... FROM SOME_NEW_TABLE

Update 2

Here is the report; it would be great if someone could explain it in general.

1 answer

#1


5  

I think the question can be answered in general. When performance tuning this type of query, you have a number of things to consider:

Parse time

How long does it take to establish an execution plan for the statement? If the query runs slowly the first time and fast every time after that, parse time is an issue. I assume that there are no changing constants in the query. If there are, please use bind variables, or as a last resort use dynamic bind variables, but it can be a bad idea to introduce bind variables automatically; see "alter session set cursor_sharing=similar".
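
As a minimal sketch of the bind variable advice, assuming the elided WHERE clause filters on t1.c1 (the filter column, the value 42, and the bind name :id are hypothetical and not from the question):

```sql
-- Literal constant: every distinct value produces a different statement text,
-- so Oracle has to build an execution plan (hard parse) for each one.
SELECT t1.c1, t2.c5
FROM t1
LEFT JOIN t2 ON t1.c1 = t2.c1
WHERE t1.c1 = 42;

-- Bind variable: the statement text is identical for every value,
-- so the parsed cursor and its plan can be reused.
SELECT t1.c1, t2.c5
FROM t1
LEFT JOIN t2 ON t1.c1 = t2.c1
WHERE t1.c1 = :id;
```

From Java, the second form corresponds to a PreparedStatement with a ? placeholder instead of a value concatenated into the SQL string.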

Especially with older versions and many joins (Oracle 8 was really bad at parsing statements with more than 6 similar identity joins), parse time can be expensive. Oracle 11 typically cuts the parse time by stopping after a number of execution plans have been considered. On Oracle 11 parse time can still be an issue, especially with union/union all.

Also, in this query you use ANSI style joins. Note that Oracle 11 has some performance drawbacks when using the more elegant ANSI style joins with complex statements. For automatically generated statements I therefore recommend Oracle style (c (+) = d). For statements that need to be maintained, you need to study whether it really is a problem.
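
To illustrate the two notations, here is a sketch that rewrites only the first three tables of the query from the question; the remaining joins would follow the same pattern. The column list is shortened for illustration.

```sql
-- ANSI style, as in the question
SELECT t1.c1, t2.c5
FROM t1
LEFT JOIN t2 ON t1.c1 = t2.c1
LEFT JOIN t3 ON t3.c1 = t2.c2;

-- Oracle style: (+) marks the optional side, i.e. the table whose
-- columns come back as NULL when no matching row exists.
SELECT t1.c1, t2.c5
FROM t1, t2, t3
WHERE t2.c1 (+) = t1.c1
  AND t3.c1 (+) = t2.c2;
```

Keep in mind that in Oracle style any additional filter on an optional table also needs the (+) marker, otherwise the outer join silently behaves like an inner join.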

When parse time is an issue, I would recommend using a hint like /*+ ordered */ as a starting point. With this, make sure your join order is such that the amount of data produced temporarily is as small as possible and that the correct indexes are present.
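
A sketch of what such a hint could look like, assuming t1 is the table that the filter narrows down the most and should therefore drive the join (that assumption, the filter on t1.c1, and the bind name :id are hypothetical):

```sql
-- ORDERED asks the optimizer to join the tables in the order in which
-- they appear in the FROM clause instead of searching for a join order.
SELECT /*+ ORDERED */ t1.c1, t2.c5
FROM t1, t2, t3
WHERE t2.c1 (+) = t1.c1
  AND t3.c1 (+) = t2.c2
  AND t1.c1 = :id;
```

Because the hint fixes the join order, the order of the tables in FROM now matters: list the table that yields the fewest rows first.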

Execution time

During execution, Oracle executes the execution plan. Oracle does this really efficiently compared to other database platforms. But if the execution plan stinks, the execution takes time. In your question you ask whether or not to prejoin everything.

In general it is best to always start off with a fully normalized model. In a fully normalized model, data is stored only once. So when the query is efficiently planned, the least amount of data is processed. This assumes that the Oracle server has sufficient memory to cache all or large parts of it, since join strategies sometimes need a lot of work space in memory in addition to the data already fetched from disk.

When performance is insufficient, I would start by introducing hints while staying with the normalized model. Always try to keep the amount of data eligible for output during interim steps as small as possible. When it really doesn't work, you might go for a derived table, but I find this generally a sign of weak development skills.
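
If prejoining does turn out to be necessary, one option in Oracle is a materialized view instead of a hand-maintained table. The answer does not mention this, so treat it as an assumption rather than the author's method. The name some_new_table comes from the question; the refresh settings and the shortened join are illustrative only.

```sql
-- Store the join result physically; the base tables stay normalized.
CREATE MATERIALIZED VIEW some_new_table
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT t1.c1, t2.c5, t6.c2
FROM t1
LEFT JOIN t2 ON t1.c1 = t2.c1
LEFT JOIN t6 ON t6.c1 = t2.c1;

-- Rebuild the stored result when the base data has changed
-- ('C' requests a complete refresh).
BEGIN
  DBMS_MVIEW.REFRESH('SOME_NEW_TABLE', 'C');
END;
/
```

The trade-off is the one described above: the data is stored twice, and it is only as fresh as the last refresh.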

In all this, I am assuming that one of the tables that start the execution plan has a large data volume and the others are smaller, maybe a little smaller or a lot smaller. If not, you are running a "Wiebertje" query (I don't have another name for it; it is the shape of a Dutch candy). In that case, please read page 9 and onward of the 2006 conference presentation.

Fetch time

At the end of the cycle, Oracle starts sending back the data at some moment. The volume in particular can greatly affect the time needed to transfer it all. It is not uncommon for applications to fetch absolutely everything but display only the first 50 rows. Please introduce windowing or "fetch to displayed watermark + constant" to reduce the fetch time. You may need to introduce a hint such as /*+ first_rows */ in the statement or session for interactive use.
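
A minimal sketch of windowing on Oracle 11, assuming the application shows 50 rows at a time ordered by t1.c1 (the page size, the sort column, and the shortened join are hypothetical):

```sql
-- The inner query defines the order; the outer ROWNUM filter stops
-- fetching after the first 50 rows of that order.
SELECT *
FROM (
  SELECT /*+ FIRST_ROWS(50) */ t1.c1, t2.c5
  FROM t1
  LEFT JOIN t2 ON t1.c1 = t2.c1
  ORDER BY t1.c1
)
WHERE ROWNUM <= 50;
```

The ROWNUM filter has to sit outside the ordered subquery, because ROWNUM is assigned before ORDER BY is applied within a single query block. The FETCH FIRST syntax is not an alternative here, since it only arrived in Oracle 12c.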
