如何加快此查询?

时间:2022-10-24 03:53:58

I have a table with 60 attributes in it, attribute1..attribute60. The database engine in MySQL and the table engine is MyISAM. The query is as follows:

我有一个包含60个属性的表,attribute1..attribute60。 MySQL中的数据库引擎和表引擎是MyISAM。查询如下:

SELECT DISTINCT attribute1
  FROM `product_applications`
 WHERE `product_applications`.`brand_id` NOT IN (642, 630, 513, 637, 632,
                                                 556, 548, 628, 651, 660,
                                                 648, 557, 650, 624, 652,
                                                 636, 546, 662, 634, 629,
                                                 657, 638, 658, 659, 661, 625)

I use a NOT IN, because that list is significantly smaller than the IN list.

我使用NOT IN,因为该列表明显小于IN列表。

I have created the following index:

我创建了以下索引:

brand_id, attribute1, attribute2, attribute3, attribute4

A DESC reveals that this index is being selected, but it looks like it is still looking at the whole table because I see the whole row count in the "rows" column:

DESC显示正在选择此索引,但它看起来仍然在查看整个表,因为我在“行”列中看到整行计数:

6732948

In the "extra" column I have:

在“额外”栏中,我有:

Using where; Using index; Using temporary

This query is taking over 7 seconds. I am looking at all different options here, including breaking the table up.

此查询需要7秒钟。我在这里看到所有不同的选项,包括打破桌面。

UPDATE:

I was able to cut the query time in half with the clever use of the UNION ALL noted by my friend below. Also, this is a dynamically generated query, so none of the temporary table options that some of you have offered, while an excellent idea, were available to me.

通过巧妙地使用我朋友在下面注明的UNION ALL,我能够将查询时间缩短一半。此外,这是一个动态生成的查询,因此我可以使用您所提供的一些临时表选项,这是一个很好的主意。

3 个解决方案

#1


2  

Previously, the following used a LEFT JOIN -- but the OP reversed the logic to use an INNER JOIN:

以前,以下使用LEFT JOIN - 但是OP颠倒了逻辑以使用INNER JOIN:

   SELECT DISTINCT 
          t.attribute1
     FROM PRODUCT_APPLICATIONS t
     JOIN (SELECT 642 AS brand_id
           UNION ALL 
           SELECT 630
           UNION ALL 
           SELECT 513
           UNION ALL 
           SELECT 637
           UNION ALL 
           SELECT 632           
           UNION ALL 
           SELECT 556
           UNION ALL 
           SELECT 548
           UNION ALL 
           SELECT 628
           UNION ALL 
           SELECT 651
           UNION ALL 
           SELECT 660
           UNION ALL 
           SELECT 648
           UNION ALL 
           SELECT 557
           UNION ALL 
           SELECT 650
           UNION ALL 
           SELECT 624
           UNION ALL 
           SELECT 652
           UNION ALL 
           SELECT 636
           UNION ALL 
           SELECT 546
           UNION ALL 
           SELECT 662
           UNION ALL 
           SELECT 634
           UNION ALL 
           SELECT 629
           UNION ALL 
           SELECT 657
           UNION ALL 
           SELECT 638
           UNION ALL 
           SELECT 658
           UNION ALL 
           SELECT 659
           UNION ALL 
           SELECT 661
           UNION ALL 
           SELECT 625) x ON x.brand_id = t.brand_id

You could consider populating a temp table, to use in place of the derived one you see in my answer.

您可以考虑填充临时表,以代替您在我的答案中看到的派生表。

#2


0  

[1] 642, 630, 513, 637, 632, 556, 548, 628, 651, 660, 648, 557, 650, 624, 652, 636, 546, 662, 634, 629, 657, 638, 658, 659, 661, 625

[1] 642,630,513,637,632,556,548,628,651,660,648,557,650,624,652,636,546,662,634,629,657,638,658,659 ,661,625

Since you have hardcoded it, i assume these are the numbers you want to exclude all the time.
Then, why not create a table that only contains these ids, and another table that doesn't contain these ids. And, your insert determines which table to be inserted into based on the 'brand_id'.

由于您已对其进行了硬编码,因此我假设这些是您要一直排除的数字。那么,为什么不创建一个只包含这些id的表,另一个不包含这些id的表。并且,您的插入基于'brand_id'确定要插入哪个表。

[2] 662, 661, 660, 659, 658, 657, 652, 651, 650, 648, 642, 638, 637, 636, 634, 632, 630, 629, 628, 625, 624, 557, 556, 548, 546, 513

[2] 662,661,660,659,658,657,652,651,650,648,642,638,637,636,634,632,630,629,628,625,624,557,556,548 ,546,513

A sorted list of your brand_ids is shown above. It looks like you can lower the equality calls a few more by providing a range condition instead. (ie., >=657 && <=662, >=650 &&<=652, etc..)

您的brand_ids的排序列表如上所示。看起来你可以通过提供范围条件来降低等式调用。 (即,> = 657 && <= 662,> = 650 && <= 652,等等。)

#3


0  

The long pole in this query is the "DISTINCT" clause.

此查询中的长极点是“DISTINCT”子句。

First, I'm not sure why you said "this is a dynamically generated query, so none of the temporary table options that some of you have offered, while an excellent idea, were available to me." It's possible to use temporary tables with dynamically generated queries ... ? Perhaps you meant something else.

首先,我不确定你为什么说“这是一个动态生成的查询,所以你们中有些人提供的临时表选项都没有提供给我,这是一个很好的主意。”可以使用动态生成查询的临时表...?也许你的意思是别的。

Can you at least build a supporting table for this? Something like:

你能为此至少建立一个支持表吗?就像是:

CREATE TABLE product_applications_brand_id_attribute1 (
    PRIMARY KEY (attribute1)
) IGNORE AS SELECT attribute1
FROM product_applications
WHERE brand_id NOT IN (
    642, 630, 513, 637, 632, 556, 548, 628, 651, 660,
    648, 557, 650, 624, 652, 636, 546, 662, 634, 629,
    657, 638, 658, 659, 661, 625)

Then, you would query:

然后,您将查询:

SELECT attribute1
    FROM product_applications_brand_id_attribute1

This isn't an ideal solution because you have to update the new separate table every time there's a change in the base table.

这不是一个理想的解决方案,因为每次基表发生更改时都必须更新新的单独表。

#1


2  

Previously, the following used a LEFT JOIN -- but the OP reversed the logic to use an INNER JOIN:

以前,以下使用LEFT JOIN - 但是OP颠倒了逻辑以使用INNER JOIN:

   SELECT DISTINCT 
          t.attribute1
     FROM PRODUCT_APPLICATIONS t
     JOIN (SELECT 642 AS brand_id
           UNION ALL 
           SELECT 630
           UNION ALL 
           SELECT 513
           UNION ALL 
           SELECT 637
           UNION ALL 
           SELECT 632           
           UNION ALL 
           SELECT 556
           UNION ALL 
           SELECT 548
           UNION ALL 
           SELECT 628
           UNION ALL 
           SELECT 651
           UNION ALL 
           SELECT 660
           UNION ALL 
           SELECT 648
           UNION ALL 
           SELECT 557
           UNION ALL 
           SELECT 650
           UNION ALL 
           SELECT 624
           UNION ALL 
           SELECT 652
           UNION ALL 
           SELECT 636
           UNION ALL 
           SELECT 546
           UNION ALL 
           SELECT 662
           UNION ALL 
           SELECT 634
           UNION ALL 
           SELECT 629
           UNION ALL 
           SELECT 657
           UNION ALL 
           SELECT 638
           UNION ALL 
           SELECT 658
           UNION ALL 
           SELECT 659
           UNION ALL 
           SELECT 661
           UNION ALL 
           SELECT 625) x ON x.brand_id = t.brand_id

You could consider populating a temp table, to use in place of the derived one you see in my answer.

您可以考虑填充临时表,以代替您在我的答案中看到的派生表。

#2


0  

[1] 642, 630, 513, 637, 632, 556, 548, 628, 651, 660, 648, 557, 650, 624, 652, 636, 546, 662, 634, 629, 657, 638, 658, 659, 661, 625

[1] 642,630,513,637,632,556,548,628,651,660,648,557,650,624,652,636,546,662,634,629,657,638,658,659 ,661,625

Since you have hardcoded it, i assume these are the numbers you want to exclude all the time.
Then, why not create a table that only contains these ids, and another table that doesn't contain these ids. And, your insert determines which table to be inserted into based on the 'brand_id'.

由于您已对其进行了硬编码,因此我假设这些是您要一直排除的数字。那么,为什么不创建一个只包含这些id的表,另一个不包含这些id的表。并且,您的插入基于'brand_id'确定要插入哪个表。

[2] 662, 661, 660, 659, 658, 657, 652, 651, 650, 648, 642, 638, 637, 636, 634, 632, 630, 629, 628, 625, 624, 557, 556, 548, 546, 513

[2] 662,661,660,659,658,657,652,651,650,648,642,638,637,636,634,632,630,629,628,625,624,557,556,548 ,546,513

A sorted list of your brand_ids is shown above. It looks like you can lower the equality calls a few more by providing a range condition instead. (ie., >=657 && <=662, >=650 &&<=652, etc..)

您的brand_ids的排序列表如上所示。看起来你可以通过提供范围条件来降低等式调用。 (即,> = 657 && <= 662,> = 650 && <= 652,等等。)

#3


0  

The long pole in this query is the "DISTINCT" clause.

此查询中的长极点是“DISTINCT”子句。

First, I'm not sure why you said "this is a dynamically generated query, so none of the temporary table options that some of you have offered, while an excellent idea, were available to me." It's possible to use temporary tables with dynamically generated queries ... ? Perhaps you meant something else.

首先,我不确定你为什么说“这是一个动态生成的查询,所以你们中有些人提供的临时表选项都没有提供给我,这是一个很好的主意。”可以使用动态生成查询的临时表...?也许你的意思是别的。

Can you at least build a supporting table for this? Something like:

你能为此至少建立一个支持表吗?就像是:

CREATE TABLE product_applications_brand_id_attribute1 (
    PRIMARY KEY (attribute1)
) IGNORE AS SELECT attribute1
FROM product_applications
WHERE brand_id NOT IN (
    642, 630, 513, 637, 632, 556, 548, 628, 651, 660,
    648, 557, 650, 624, 652, 636, 546, 662, 634, 629,
    657, 638, 658, 659, 661, 625)

Then, you would query:

然后,您将查询:

SELECT attribute1
    FROM product_applications_brand_id_attribute1

This isn't an ideal solution because you have to update the new separate table every time there's a change in the base table.

这不是一个理想的解决方案,因为每次基表发生更改时都必须更新新的单独表。