How to speed up row_number in Oracle?

Time: 2022-03-30 09:20:08

I have a SQL query that looks something like this:

SELECT * FROM(
    SELECT
        ...,
        row_number() OVER(ORDER BY ID) rn
    FROM
        ...
) WHERE rn between :start and :end

Essentially, it's the ORDER BY part that's slowing things down. If I remove it, the EXPLAIN cost drops by several orders of magnitude (over 1000x). I've tried this:

SELECT 
    ...
FROM
    ...
WHERE
    rownum between :start and :end

But this doesn't give correct results. Is there an easy way to speed this up, or will I have to spend more time with the EXPLAIN tool?

5 solutions

#1


12  

ROW_NUMBER is quite inefficient in Oracle.

See the article in my blog for performance details:

For your specific query, I'd recommend replacing it with ROWNUM and making sure that the index is used:

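SELECT  *
FROM    (
        -- The hint forces an ascending index scan, so rows arrive already ordered
        -- and ROWNUM (assigned before ORDER BY is applied) follows the sort order.
        SELECT  /*+ INDEX_ASC(t index_on_column) NOPARALLEL_INDEX(t index_on_column) */
                t.*, ROWNUM AS rn
        FROM    table t          -- table, column and index_on_column are placeholders
        ORDER BY
                column
        )
WHERE rn >= :start
      -- The outer ROWNUM caps the page size and lets Oracle stop early.
      AND rownum <= :end - :start + 1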

This query will use a COUNT STOPKEY operation, so Oracle stops fetching rows once the ROWNUM limit is reached.

Also, either make sure your column is not nullable or add a WHERE column IS NOT NULL condition.

Otherwise the index cannot be used to retrieve all values.
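
A sketch of the two options, assuming a hypothetical table orders with a column created_at and an index orders_created_at_ix on it (none of these names come from the question). Oracle B-tree indexes do not store entries whose key columns are all NULL, which is why the optimizer needs one of these guarantees before it can read every row through the index:

```sql
-- Option 1: declare the column NOT NULL (fails if the column already holds NULLs).
ALTER TABLE orders MODIFY (created_at NOT NULL);

-- Option 2: exclude NULLs in the query itself.
SELECT  /*+ INDEX_ASC(o orders_created_at_ix) */
        o.*, ROWNUM AS rn
FROM    orders o
WHERE   o.created_at IS NOT NULL
ORDER BY
        o.created_at;
```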

Note that you cannot use ROWNUM BETWEEN :start AND :end without a subquery.

ROWNUM is always assigned last and checked last; that's why ROWNUM values always come in order without gaps.

If you use ROWNUM BETWEEN 10 AND 20, the first row that satisfies all other conditions becomes a candidate for returning, is temporarily assigned ROWNUM = 1, and fails the test ROWNUM BETWEEN 10 AND 20.

Then the next row becomes a candidate, is again assigned ROWNUM = 1, and fails, and so on, so in the end no rows are returned at all.

The workaround is to put the ROWNUM into a subquery.
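
For reference, here is the widely used nested form of that workaround, written without hints. It is only a sketch: the table orders, its column id, and the bind names :first_row and :last_row are made-up placeholders, not names from the question.

```sql
SELECT *
FROM   (
        SELECT o.*, ROWNUM AS rn       -- numbers the rows after they have been sorted
        FROM   (
                SELECT *
                FROM   orders
                ORDER BY id
               ) o
        WHERE  ROWNUM <= :last_row     -- an upper bound works because numbering starts at 1
       )
WHERE  rn >= :first_row;               -- rn is an ordinary column here, so a lower bound is safe
```

The lower bound is applied one level up, where rn is already a materialized value rather than the live ROWNUM pseudocolumn.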

#2


5  

Looks like a pagination query to me.

From this ASKTOM article (about 90% down the page):

You need to order by something unique for these pagination queries, so that ROW_NUMBER is assigned deterministically to the rows each and every time.
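
As an illustration of that advice, assume a hypothetical table orders where created_at can repeat and id is unique (both names are invented for this sketch). Appending the unique column as a tie-breaker makes the numbering repeatable from one execution to the next:

```sql
SELECT *
FROM   (
        SELECT o.*,
               ROW_NUMBER() OVER (ORDER BY o.created_at, o.id) AS rn  -- id breaks ties
        FROM   orders o
       )
WHERE  rn BETWEEN :first_row AND :last_row;
```

With ORDER BY o.created_at alone, rows sharing the same timestamp could swap positions between executions and show up on two pages or on none.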

Also, your queries are nowhere near the same, so I'm not sure what the benefit of comparing the cost of one to the other is.

#3


1  

Is your ORDER BY column indexed? If not, that's a good place to start.
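
A minimal example, assuming a hypothetical table orders ordered by a column id (the table, column, and index names are placeholders):

```sql
-- Create a B-tree index on the column used in ORDER BY.
CREATE INDEX orders_id_ix ON orders (id);
```

If the column is already the primary key, Oracle maintains an index for it and this step is unnecessary.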

#4


1  

Part of the problem is how big the 'start' to 'end' span is and where it 'lives'. Say you have a million rows in the table and you want rows 567,890 to 567,900; then you have to live with the fact that the database needs to go through the entire table, sort pretty much all of it by id, and work out which rows fall into that range.

In short, that's a lot of work, which is why the optimizer gives it a high cost.

It is also not something an index can help with much. An index would give the order, but at best, that gives you somewhere to start and then you keep reading on until you get to the 567,900th entry.

If you are showing your end user 10 items at a time, it may be worth actually grabbing the top 100 from the DB, then having the app break that 100 into ten chunks.
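
The database side of that idea might look like the sketch below, assuming a hypothetical table orders sorted by id. Splitting the 100 rows into pages of 10 is left to the application.

```sql
SELECT *
FROM   (
        SELECT o.*
        FROM   orders o
        ORDER BY o.id
       )
WHERE  ROWNUM <= 100;   -- the application then slices these 100 rows into pages of 10
```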

#5


0  

Spend more time with the EXPLAIN PLAN tool. If you see a TABLE SCAN, you need to change your query.
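
One way to look at the plan, sketched against a hypothetical table orders with made-up bind names. In Oracle's plan output, a full table scan is shown as TABLE ACCESS FULL.

```sql
EXPLAIN PLAN FOR
SELECT *
FROM   (
        SELECT o.*, ROW_NUMBER() OVER (ORDER BY o.id) AS rn
        FROM   orders o
       )
WHERE  rn BETWEEN :first_row AND :last_row;

-- Show the plan that was just stored in PLAN_TABLE.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```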

Your query makes little sense to me. Querying over a ROWID seems like asking for trouble. There's no relational info in that query. Is it the real query that you're having trouble with or an example that you made up to illustrate your problem?
