Select counts from a table by different criteria

Time: 2022-03-24 15:41:20

I have a table named 'jobs'. For a particular user, a job can be active, archived, overdue, pending, or closed. Right now every page request generates 5 COUNT queries, and in an attempt at optimization I'm trying to reduce this to a single query. This is what I have so far, but it is barely faster than the 5 individual queries. Note that I've simplified the conditions for each subquery to make them easier to understand; the full query behaves the same way, however.

Is there a way to get these 5 counts in the same query without using the inefficient subqueries?

SELECT
  (SELECT count(*)
    FROM "jobs"
    WHERE
      jobs.creator_id = 5 AND
      jobs.status_id NOT IN (8,3,11) /* 8,3,11 being 'inactive' related statuses */
  ) AS active_count, 
  (SELECT count(*)
    FROM "jobs"
    WHERE
      jobs.creator_id = 5 AND
      jobs.due_date < '2011-06-14' AND
      jobs.status_id NOT IN(8,11,5,3) /* Grabs the overdue active jobs
                                      ('5' means completed successfully) */
  ) AS overdue_count,
  (SELECT count(*)
    FROM "jobs"
    WHERE
      jobs.creator_id = 5 AND
      jobs.due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000'
  ) AS due_today_count

This goes on for 2 more subqueries but I think you get the idea.

Is there an easier way to collect this data, since it's basically 5 different COUNTs off of the same subset of data from the jobs table?

The subset of data is 'creator_id = 5'; after that, each count is basically just 1-2 additional conditions. Note that right now we're using Postgres but may be moving to MySQL in the near future, so if you can provide an ANSI-compatible solution I'd be grateful :)

3 solutions

#1


24  

This is the typical solution. Use a CASE expression to break out the different conditions: a record that meets a condition gets a 1, otherwise a 0. Then do a SUM on the values.

  SELECT
    SUM(active_count) active_count,
    SUM(overdue_count) overdue_count,
    SUM(due_today_count) due_today_count
  FROM
  (
    /* One row per job of this creator, with a 1/0 flag per condition */
    SELECT
      CASE WHEN jobs.status_id NOT IN (8,3,11) THEN 1 ELSE 0 END active_count,
      CASE WHEN jobs.due_date < '2011-06-14' AND jobs.status_id NOT IN (8,11,5,3) THEN 1 ELSE 0 END overdue_count,
      CASE WHEN jobs.due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000' THEN 1 ELSE 0 END due_today_count
    FROM "jobs"
    WHERE
      jobs.creator_id = 5
  ) t
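To try the pattern without a Postgres or MySQL server, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. The column types and the five sample rows are assumptions made up for illustration; only the conditions come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema; the real jobs table certainly has more columns.
conn.execute("CREATE TABLE jobs (creator_id INTEGER, status_id INTEGER, due_date TEXT)")
# Hypothetical sample rows: (creator_id, status_id, due_date)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        (5, 1, "2011-06-10"),           # active and overdue
        (5, 1, "2011-06-14 09:00:00"),  # active and due today
        (5, 5, "2011-06-01"),           # completed: counted as active, not overdue
        (5, 8, "2011-06-01"),           # inactive
        (7, 1, "2011-06-10"),           # different creator, filtered out
    ],
)

# The inner query flags each row with 1/0 per condition; the outer query sums the flags.
row = conn.execute(
    """
    SELECT
      SUM(active_count)    AS active_count,
      SUM(overdue_count)   AS overdue_count,
      SUM(due_today_count) AS due_today_count
    FROM (
      SELECT
        CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END AS active_count,
        CASE WHEN due_date < '2011-06-14'
                  AND status_id NOT IN (8, 11, 5, 3) THEN 1 ELSE 0 END AS overdue_count,
        CASE WHEN due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000'
             THEN 1 ELSE 0 END AS due_today_count
      FROM jobs
      WHERE creator_id = 5
    ) AS t
    """
).fetchone()

print(row)  # (3, 1, 1)
```

The table is read once and every count is derived from that single pass, which is the point of the rewrite.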

UPDATE: As noted, when the derived table t returns 0 records, the outer query returns a single row with NULL in every column. You have three options:

1) Add a HAVING clause so that no row is returned instead of a row of all NULLs

   HAVING SUM(active_count) is not null

2) If you want all zeros returned, then wrap each SUM in COALESCE with a default of 0

For example

 SELECT
      COALESCE(SUM(active_count), 0) active_count,
      COALESCE(SUM(overdue_count), 0) overdue_count,
      COALESCE(SUM(due_today_count), 0) due_today_count

3) Take advantage of the fact that COUNT(NULL) = 0, as rsbarro demonstrated. Note that the non-NULL value can be anything; it doesn't have to be a 1

For example

 SELECT
      COUNT(CASE WHEN
            jobs.status_id NOT IN (8,3,11) THEN 'Manticores Rock' ELSE NULL
       END) AS active_count
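The empty-set behaviour behind these options can be checked with a short sketch. It uses an in-memory SQLite database as a stand-in (an assumption; the table layout is simplified) and leaves the table empty on purpose. It covers the plain SUM, option 2, and option 3; the HAVING variant is left out because not every SQLite version accepts HAVING without GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema. The table stays empty so the WHERE clause matches nothing.
conn.execute("CREATE TABLE jobs (creator_id INTEGER, status_id INTEGER, due_date TEXT)")

flag = "CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END"

# Baseline: SUM over zero rows is NULL, which Python sees as None.
plain_sum = conn.execute(
    f"SELECT SUM({flag}) FROM jobs WHERE creator_id = 5"
).fetchone()[0]

# Option 2: COALESCE turns that NULL into 0.
coalesced = conn.execute(
    f"SELECT COALESCE(SUM({flag}), 0) FROM jobs WHERE creator_id = 5"
).fetchone()[0]

# Option 3: COUNT never returns NULL, and any non-NULL marker value works.
counted = conn.execute(
    """
    SELECT COUNT(CASE WHEN status_id NOT IN (8, 3, 11) THEN 'any marker' END)
    FROM jobs
    WHERE creator_id = 5
    """
).fetchone()[0]

print(plain_sum, coalesced, counted)  # None 0 0
```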

#2


12  

I would use this approach: COUNT in combination with CASE WHEN.

SELECT
    COUNT(CASE WHEN
        jobs.status_id NOT IN (8,3,11) THEN 1
    END) AS count1,
    COUNT(CASE WHEN
        jobs.due_date < '2011-06-14'
        AND jobs.status_id NOT IN (8,11,5,3) THEN 1
    END) AS count2,
    COUNT(CASE WHEN
        jobs.due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000' THEN 1
    END) AS count3
FROM
    "jobs"
WHERE
     jobs.creator_id = 5
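One way to gain confidence in the rewrite is to check that the single-pass query returns the same numbers as the separate COUNT(*) queries from the question. The sketch below does that on an in-memory SQLite database; the helper names job_counts and job_counts_separate, the simplified schema, and the sample rows are all hypothetical.

```python
import sqlite3


def job_counts(conn, creator_id, day_start, day_end):
    """Return (active, overdue, due_today) using one pass over the table."""
    sql = """
        SELECT
          COUNT(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 END),
          COUNT(CASE WHEN due_date < ? AND status_id NOT IN (8, 11, 5, 3) THEN 1 END),
          COUNT(CASE WHEN due_date BETWEEN ? AND ? THEN 1 END)
        FROM jobs
        WHERE creator_id = ?
    """
    # Placeholders are bound in the order they appear in the SQL text.
    return conn.execute(sql, (day_start, day_start, day_end, creator_id)).fetchone()


def job_counts_separate(conn, creator_id, day_start, day_end):
    """Same counts, computed with one COUNT(*) query per condition."""
    base = "SELECT COUNT(*) FROM jobs WHERE creator_id = ? AND "
    active = conn.execute(
        base + "status_id NOT IN (8, 3, 11)", (creator_id,)
    ).fetchone()[0]
    overdue = conn.execute(
        base + "due_date < ? AND status_id NOT IN (8, 11, 5, 3)",
        (creator_id, day_start),
    ).fetchone()[0]
    due_today = conn.execute(
        base + "due_date BETWEEN ? AND ?", (creator_id, day_start, day_end)
    ).fetchone()[0]
    return (active, overdue, due_today)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (creator_id INTEGER, status_id INTEGER, due_date TEXT)")
# Hypothetical sample rows: (creator_id, status_id, due_date)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        (5, 1, "2011-06-10"),
        (5, 1, "2011-06-14 09:00:00"),
        (5, 5, "2011-06-01"),
        (5, 8, "2011-06-01"),
        (7, 1, "2011-06-10"),
    ],
)

window = ("2011-06-14", "2011-06-15 06:00:00.000000")
single = job_counts(conn, 5, *window)
separate = job_counts_separate(conn, 5, *window)
print(single, separate)  # (3, 1, 1) (3, 1, 1)
```

Passing the creator id and dates as bound parameters instead of literals keeps the query text constant from one page request to the next.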

#3


0  

Brief

SQL Server 2012 introduced the IIF logical function. Using SQL Server 2012 or greater you can now use this new function instead of a CASE expression. The IIF function also works with Azure SQL Database (but at the moment it does not work with Azure SQL Data Warehouse or Parallel Data Warehouse). It's shorthand for the CASE expression.

I find myself using the IIF function rather than the CASE expression when there is only one case. This alleviates the pain of having to write CASE WHEN condition THEN x ELSE y END; you write IIF(condition, x, y) instead. If multiple conditions may be met (multiple WHENs), you should instead consider using the regular CASE expression rather than nested IIF functions.

Returns one of two values, depending on whether the Boolean expression evaluates to true or false in SQL Server.

Syntax

IIF ( boolean_expression, true_value, false_value )

Arguments

boolean_expression
A valid Boolean expression. If this argument is not a Boolean expression, then a syntax error is raised.

true_value
Value to return if boolean_expression evaluates to true.

false_value
Value to return if boolean_expression evaluates to false.

Remarks

IIF is a shorthand way for writing a CASE expression. It evaluates the Boolean expression passed as the first argument, and then returns either of the other two arguments based on the result of the evaluation. That is, the true_value is returned if the Boolean expression is true, and the false_value is returned if the Boolean expression is false or unknown. true_value and false_value can be of any type. The same rules that apply to the CASE expression for Boolean expressions, null handling, and return types also apply to IIF. For more information, see CASE (Transact-SQL).

The fact that IIF is translated into CASE also has an impact on other aspects of the behavior of this function. Since CASE expressions can be nested only up to the level of 10, IIF statements can also be nested only up to the maximum level of 10. Also, IIF is remoted to other servers as a semantically equivalent CASE expression, with all the behaviors of a remoted CASE expression.

Code

With IIF, the query would resemble the following (using the same logic presented by @rsbarro in his answer). The false branch is NULL rather than 0 because COUNT counts every non-NULL value, including 0:

SELECT
    COUNT(
        IIF(jobs.status_id NOT IN (8,3,11), 1, NULL)
    ) AS active_count,
    COUNT(
        IIF(jobs.due_date < '2011-06-14' AND jobs.status_id NOT IN (8,11,5,3), 1, NULL)
    ) AS overdue_count,
    COUNT(
        IIF(jobs.due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000', 1, NULL)
    ) AS due_today_count
FROM
    "jobs"
WHERE
     jobs.creator_id = 5
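The value returned for non-matching rows matters inside COUNT, and it is easy to get wrong. The sketch below illustrates this on an in-memory SQLite database. Since IIF is shorthand for CASE, the equivalent CASE form is used here so the example does not depend on IIF being available in the local SQLite build; the schema and the four sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema with hypothetical rows: two active, two inactive.
conn.execute("CREATE TABLE jobs (creator_id INTEGER, status_id INTEGER)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [(5, 1), (5, 2), (5, 8), (5, 3)],
)

count_with_zero, count_with_null, sum_with_zero = conn.execute(
    """
    SELECT
      -- 0 is a non-NULL value, so COUNT counts every row here
      COUNT(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END),
      -- NULL is skipped by COUNT, so only matching rows are counted
      COUNT(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE NULL END),
      -- with SUM, the 1/0 form gives the intended result
      SUM(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END)
    FROM jobs
    WHERE creator_id = 5
    """
).fetchone()

print(count_with_zero, count_with_null, sum_with_zero)  # 4 2 2
```

So a 1/0 expression pairs with SUM, and a 1/NULL expression pairs with COUNT.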
