使用COUNT子查询减慢MYSQL查询

时间:2021-09-07 00:12:19

Right I have no idea why but this query takes well over 6 seconds to execute, index's are all setup correctly and if I run each query separately it works great with less than 0.5 seconds to execute.

我不知道为什么这个查询要花6秒才能执行,索引都是正确设置的,如果我单独运行每个查询,执行时间不到0.5秒就很好了。

Here is the query

这是查询

SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name,
(SELECT COUNT(*)
    FROM supplier_questions q1
    WHERE c.supplier_id = q1.supplier_id AND q1.incomplete = '0') AS questions, 
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated, 
(SELECT COUNT(*)
    FROM supplier_questions q2
    WHERE c.supplier_id = q2.supplier_id AND q2.reviewed = '1') AS reviewed, 
questapproved, 
ss.supplier_no AS supplier_no
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC
LIMIT 0, 20

Results of the Explain query is as follows

Explain查询的结果如下

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   PRIMARY ss  ref site_id,supplier_id site_id 4   const   1287    Using where; Using temporary; Using filesort
1   PRIMARY c   eq_ref  PRIMARY PRIMARY 4   ss.supplier_id  1   
3   DEPENDENT SUBQUERY  q2  ref supplier_id,reviewed    reviewed    4   const   263 Using where
2   DEPENDENT SUBQUERY  q1  ref supplier_id,incomplete  incomplete  4   const   254 Using where

The reason the count queries are in there is because I need to know the number of rows from those tables, this can't be done in another query as the results also need to be sorted by those values :(

count查询之所以出现,是因为我需要知道来自这些表的行数,这不能在另一个查询中完成,因为结果也需要按照这些值进行排序:(

4 个解决方案

#1


2  

As a stab in the dark, does this run faster? (I havent got a mysql to verify the syntax on, so forgive any slight mistakes, but you might get the idea)

作为暗处的一个暗处,它跑得快吗?(我没有mysql来验证语法,所以请原谅有任何小错误,但您可能理解了)

SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name, questions, reviewed 
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated,  
questapproved,  ss.supplier_no AS supplier_no 
FROM suppliers c 
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id 
inner join 
(SELECT supplier_id, sum(if(incomplete='0',1,0)) as questions,  sum(if(incomplete='1',1,0)) as reviewed FROM supplier_questions q1 group by supplier_id) as tmp
on c.supplier_id = tmp.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2' 
GROUP BY c.supplier_id 
ORDER BY c.supplier_name ASC LIMIT 0, 20 

#2


2  

FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id 
ORDER BY c.supplier_name ASC

Since autogenerated primary keys are never equal to 0 (unless big db design mistake) you can drop the c.supplier_id != '0' clause.

由于自动生成的主键永远不等于0(除非大db设计错误),您可以删除c。supplier_id ! = ' 0 '条款。

ss.site_id = '2' should be in the JOIN condition for readability.

ss.site_id = '2'应该处于可读性的连接条件中。

It looks like this should match only one row in table supplier_site per supplier (if this is your usual 1-N thing-addresses relation, ie you're selecting the second address of each supplier, maybe '2' corresponds to 'billing address' or something) so the GROUP BY c.supplier_id is useless. If the GROUP BY actually does something, then the query is wrong, since the "address" columns, which presumably come from supplier_site table, would come from a random row.

看起来这应该只匹配每个供应商的表supplier_site中的一行(如果这是你通常的1-N - to -address关系,也就是你选择每个供应商的第二个地址,也许“2”对应“账单地址”或其他什么),所以组是c。supplier_id是没有用的。如果这个组确实做了一些事情,那么查询是错误的,因为“address”列(可能来自supplier_site表)将来自一个随机的行。

So here's the simplified FROM (the WHERE is gone) :

这是简化后的(WHERE is gone)

FROM suppliers c
INNER JOIN supplier_site ss ON 
    (c.supplier_id = ss.supplier_id AND ss.site_id = '2')
ORDER BY c.supplier_name ASC

I suppose you got an index on c.supplier_name so this part of the query should be very fast.

我假设你有一个关于c的索引。supplier_name因此查询的这一部分应该非常快。

Now try this query :

现在尝试这个查询:

SELECT a.*,
    questapproved, 
    ss.supplier_no AS supplier_no,
    IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated, 
    sum( q.incomplete = '0') AS questions,
    sum( q.reviewed = '1' ) AS reviewed
FROM
(
    SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name
    FROM suppliers c
    INNER JOIN supplier_site ss ON 
        (c.supplier_id = ss.supplier_id AND ss.site_id = '2')
    ORDER BY c.supplier_name ASC
    LIMIT 0, 20
) a
LEFT JOIN supplier_questions q ON (q.supplier_id = c.supplier_id)
GROUP BY c.supplier_id
ORDER BY c.supplier_name;

#3


1  

If you remove the sub-selects you end up with something like this:

如果你删除子选择,你会得到这样的结果:

SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name,
COUNT(IF (q1.incomplete = '0', '0', null)) AS questions, 
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated, 
COUNT(IF (q1.reviewed = '1', '1', null)) AS reviewed,
questapproved, 
ss.supplier_no AS supplier_no
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
LEFT OUTER JOIN supplier_questions q1 ON c.supplier_id = q1.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC
LIMIT 0, 20

I don't have a MySQL database available so there may be errors in my SQL. The idea is to remove the subqueries and replace them with an outer join and use IF to only count relevant rows.

我没有可用的MySQL数据库,所以SQL中可能有错误。其思想是删除子查询,并用外部连接替换它们,并使用IF只计算相关行。

#4


0  

I would first attempt restructure by pre-querying the aggregates by supplier the count of questions and reviewed ONCE. Then, join to the rest of the details. By using the STRAIGHT_JOIN keyword, it should process in the order displayed. This will pre-aggregate first and use THAT as the basis to join back to suppliers and then supplier sites. No outer group by needed since its based on a supplier ID anyhow. However, the join to supplier_sites (your ss.supplier_no) would imply a supplier has more than one location. Does that mean the address and active status columns are originating from that table?

我将首先尝试重组,通过预先查询供应商汇总的问题数量和审查一次。然后,加入其他细节。通过使用STRAIGHT_JOIN关键字,它应该按照显示的顺序进行处理。这将首先预聚合,并将其作为连接到供应商和供应商站点的基础。没有外部集团需要,因为它是基于供应商ID。但是,连接到supplier_sites(您的ss.supplier_no)将意味着供应商有多个位置。这是否意味着地址和活动状态列源自该表?

Should the join of questions be associated with a specific supplier and it's corresponding site location or not?

问题的连接是否应该与特定的供应商相关联,并且是否与相应的站点位置相关联?

Additionally, since the prequery has the WHERE clause on supplier_id != '0', it's not needed down stream since that will be the basis of a normal join to the other tables, thus eliminating them out of the result set.

此外,由于prequery在supplier_id != '0'上有WHERE子句,所以在下游不需要这个子句,因为这将是其他表的常规连接的基础,从而将它们从结果集中排除出去。

SELECT STRAIGHT_JOIN
      PreAggregate.supplier_id, 
      PreAggregate.supplier_name, 
      address1, 
      address2, 
      address3, 
      address4, 
      suppliertype, 
      postcode, 
      contact_name,
      PreAggregate.Questions,
      IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated, 
      PreAggregate.Reviewed,
      questapproved, 
      ss.supplier_no AS supplier_no
   FROM 
      (select 
             s1.Supplier_ID,
             s1.Supplier_Name,
             SUM( IF( q1.Incomplete = '0', 1, 0 )) Questions,
             SUM( IF( q1.Reviewed = '1', 1, 0 )) Reviewed
          from 
             suppliers s1
                join supplier_questions q1
                   ON s1.supplier_id = q1.supplier_id
          where
             s1.supplier_id != '0'
          group by
             s1.Supplier_ID 
          ORDER BY 
             s1.supplier_name ASC ) PreAggregate

       JOIN suppliers c
          ON PreAggregate.Supplier_ID = c.Supplier_ID

       JOIN supplier_site ss 
          ON PreAggregate.Supplier_ID = ss.supplier_id
          AND ss.Site_ID = '2'
  LIMIT 0, 20

#1


2  

As a stab in the dark, does this run faster? (I havent got a mysql to verify the syntax on, so forgive any slight mistakes, but you might get the idea)

作为暗处的一个暗处,它跑得快吗?(我没有mysql来验证语法,所以请原谅有任何小错误,但您可能理解了)

SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name, questions, reviewed 
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated,  
questapproved,  ss.supplier_no AS supplier_no 
FROM suppliers c 
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id 
inner join 
(SELECT supplier_id, sum(if(incomplete='0',1,0)) as questions,  sum(if(incomplete='1',1,0)) as reviewed FROM supplier_questions q1 group by supplier_id) as tmp
on c.supplier_id = tmp.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2' 
GROUP BY c.supplier_id 
ORDER BY c.supplier_name ASC LIMIT 0, 20 

#2


2  

FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id 
ORDER BY c.supplier_name ASC

Since autogenerated primary keys are never equal to 0 (unless big db design mistake) you can drop the c.supplier_id != '0' clause.

由于自动生成的主键永远不等于0(除非大db设计错误),您可以删除c。supplier_id ! = ' 0 '条款。

ss.site_id = '2' should be in the JOIN condition for readability.

ss.site_id = '2'应该处于可读性的连接条件中。

It looks like this should match only one row in table supplier_site per supplier (if this is your usual 1-N thing-addresses relation, ie you're selecting the second address of each supplier, maybe '2' corresponds to 'billing address' or something) so the GROUP BY c.supplier_id is useless. If the GROUP BY actually does something, then the query is wrong, since the "address" columns, which presumably come from supplier_site table, would come from a random row.

看起来这应该只匹配每个供应商的表supplier_site中的一行(如果这是你通常的1-N - to -address关系,也就是你选择每个供应商的第二个地址,也许“2”对应“账单地址”或其他什么),所以组是c。supplier_id是没有用的。如果这个组确实做了一些事情,那么查询是错误的,因为“address”列(可能来自supplier_site表)将来自一个随机的行。

So here's the simplified FROM (the WHERE is gone) :

这是简化后的(WHERE is gone)

FROM suppliers c
INNER JOIN supplier_site ss ON 
    (c.supplier_id = ss.supplier_id AND ss.site_id = '2')
ORDER BY c.supplier_name ASC

I suppose you got an index on c.supplier_name so this part of the query should be very fast.

我假设你有一个关于c的索引。supplier_name因此查询的这一部分应该非常快。

Now try this query :

现在尝试这个查询:

SELECT a.*,
    questapproved, 
    ss.supplier_no AS supplier_no,
    IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated, 
    sum( q.incomplete = '0') AS questions,
    sum( q.reviewed = '1' ) AS reviewed
FROM
(
    SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name
    FROM suppliers c
    INNER JOIN supplier_site ss ON 
        (c.supplier_id = ss.supplier_id AND ss.site_id = '2')
    ORDER BY c.supplier_name ASC
    LIMIT 0, 20
) a
LEFT JOIN supplier_questions q ON (q.supplier_id = c.supplier_id)
GROUP BY c.supplier_id
ORDER BY c.supplier_name;

#3


1  

If you remove the sub-selects you end up with something like this:

如果你删除子选择,你会得到这样的结果:

SELECT c.supplier_id, supplier_name, address1, address2, address3, address4, suppliertype, postcode, contact_name,
COUNT(IF (q1.incomplete = '0', '0', null)) AS questions, 
IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated, 
COUNT(IF (q1.reviewed = '1', '1', null)) AS reviewed,
questapproved, 
ss.supplier_no AS supplier_no
FROM suppliers c
INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
LEFT OUTER JOIN supplier_questions q1 ON c.supplier_id = q1.supplier_id
WHERE c.supplier_id != '0' AND ss.site_id = '2'
GROUP BY c.supplier_id
ORDER BY c.supplier_name ASC
LIMIT 0, 20

I don't have a MySQL database available so there may be errors in my SQL. The idea is to remove the subqueries and replace them with an outer join and use IF to only count relevant rows.

我没有可用的MySQL数据库,所以SQL中可能有错误。其思想是删除子查询,并用外部连接替换它们,并使用IF只计算相关行。

#4


0  

I would first attempt restructure by pre-querying the aggregates by supplier the count of questions and reviewed ONCE. Then, join to the rest of the details. By using the STRAIGHT_JOIN keyword, it should process in the order displayed. This will pre-aggregate first and use THAT as the basis to join back to suppliers and then supplier sites. No outer group by needed since its based on a supplier ID anyhow. However, the join to supplier_sites (your ss.supplier_no) would imply a supplier has more than one location. Does that mean the address and active status columns are originating from that table?

我将首先尝试重组,通过预先查询供应商汇总的问题数量和审查一次。然后,加入其他细节。通过使用STRAIGHT_JOIN关键字,它应该按照显示的顺序进行处理。这将首先预聚合,并将其作为连接到供应商和供应商站点的基础。没有外部集团需要,因为它是基于供应商ID。但是,连接到supplier_sites(您的ss.supplier_no)将意味着供应商有多个位置。这是否意味着地址和活动状态列源自该表?

Should the join of questions be associated with a specific supplier and it's corresponding site location or not?

问题的连接是否应该与特定的供应商相关联,并且是否与相应的站点位置相关联?

Additionally, since the prequery has the WHERE clause on supplier_id != '0', it's not needed down stream since that will be the basis of a normal join to the other tables, thus eliminating them out of the result set.

此外,由于prequery在supplier_id != '0'上有WHERE子句,所以在下游不需要这个子句,因为这将是其他表的常规连接的基础,从而将它们从结果集中排除出去。

SELECT STRAIGHT_JOIN
      PreAggregate.supplier_id, 
      PreAggregate.supplier_name, 
      address1, 
      address2, 
      address3, 
      address4, 
      suppliertype, 
      postcode, 
      contact_name,
      PreAggregate.Questions,
      IF (active=1,'Yes', IF (active=2, 'NCR Only','Inactive')) AS rated, 
      PreAggregate.Reviewed,
      questapproved, 
      ss.supplier_no AS supplier_no
   FROM 
      (select 
             s1.Supplier_ID,
             s1.Supplier_Name,
             SUM( IF( q1.Incomplete = '0', 1, 0 )) Questions,
             SUM( IF( q1.Reviewed = '1', 1, 0 )) Reviewed
          from 
             suppliers s1
                join supplier_questions q1
                   ON s1.supplier_id = q1.supplier_id
          where
             s1.supplier_id != '0'
          group by
             s1.Supplier_ID 
          ORDER BY 
             s1.supplier_name ASC ) PreAggregate

       JOIN suppliers c
          ON PreAggregate.Supplier_ID = c.Supplier_ID

       JOIN supplier_site ss 
          ON PreAggregate.Supplier_ID = ss.supplier_id
          AND ss.Site_ID = '2'
  LIMIT 0, 20