I have the following dataset:
Machine Type Value
1 A 11
1 B 32
2 A 23
3 A 1
4 B 23
4 B 31
5 B 56
6 A 12
I want to perform the following calculation:
SELECT COUNT(WHERE TYPE = A) / COUNT(TOTAL)
FROM....
What is the best way to do it? Is it something like this:
DECLARE @CNT INT
SET @CNT = (SELECT COUNT(*) FROM dataset)
-- multiply by 1.0 so the division is not done in integer arithmetic
SELECT COUNT(*) * 1.0 / @CNT
FROM dataset
WHERE TYPE = 'A'
But if I have a big query, repeating the same query for this calculation makes the SQL slow. Can anyone suggest a better solution?
4 solutions
#1
1
Use conditional aggregation. Summing 1.0 instead of 1 gives you a fractional result that isn't truncated to an int of 0 or 1.
select sum(case when type='a' then 1.0 else 0 end)/count(*)
from t
test setup: http://rextester.com/GXN95560
create table t (Machine int, Type char(1), Value int)
insert into t values
(1,'A',11)
,(1,'B',32)
,(2,'A',23)
,(3,'A',1 )
,(4,'B',23)
,(4,'B',31)
,(5,'B',56)
,(6,'A',12)
select sum(case when type='a' then 1.0 else 0 end)/count(*)
from t
returns: 0.500000
SELECT COUNT(case when TYPE = 'a' then 1 end) / COUNT(*)
FROM t
returns: 0
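To see why the second query returns 0, it can help to look at the division on its own. This is only a minimal sketch, assuming SQL Server semantics; the literals 4 and 8 stand in for the two counts from the sample data, and the column aliases are made up for illustration:

```sql
-- Assumes SQL Server: int / int is integer division and truncates toward zero.
select 4 / 8                as int_division,     -- both operands are int, so the result is 0
       4 * 1.0 / 8          as decimal_division, -- the 1.0 literal makes the expression decimal, so the result is 0.5
       cast(4 as float) / 8 as float_division;   -- explicit cast to float, so the result is 0.5
```

Making either operand non-integer is enough, which is why summing 1.0 in the first query avoids the truncation.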
#2
2
Use a case expression to do conditional counting.
(When type <> 'A', the case expression returns null, and count() doesn't count nulls.)
SELECT COUNT(case when TYPE = 'A' then 1 end) * 1.0 / COUNT(*)
FROM dataset
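The null-skipping behaviour can be checked in isolation. Here is a small sketch, assuming SQL Server's table value constructor; the inline rows, the alias v, and the output column names are hypothetical and only there for illustration:

```sql
-- Hypothetical inline data: two 'A' rows and two 'B' rows.
select count(*)                               as total_rows,  -- counts every row: 4
       count(case when type = 'A' then 1 end) as type_a_rows  -- the case yields null for 'B', which count() skips: 2
from (values ('A'), ('B'), ('A'), ('B')) as v(type);
```

Because the case expression has no else branch, non-matching rows become null and simply drop out of the count.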
EDIT:
Inspired by the other answers, I decided to run some performance tests. The table tx used here has tens of millions of rows. The column c2 is indexed and has a few hundred distinct values scattered randomly across the table.
Query 1:
select count(case when c2 = 'A' then 1 end) * 1.0 / count(*) from tx;
Direct I/O count : 83067
Buffered I/O count : 0
Page faults : 3
CPU time (seconds): 77.18
Elapsed time (seconds): 77.17
Query 2:
select avg(case when c2 = 'A' then 1.0 else 0.0 end) from tx;
Direct I/O count : 83067
Buffered I/O count : 0
Page faults : 0
CPU time (seconds): 84.90
Elapsed time (seconds): 84.90
Query 3:
select (select count(*) from tx where c2 = 'A') * 1.0 /
(select count(*) from tx)
from onerow_table;
Direct I/O count : 86204
Buffered I/O count : 0
Page faults : 2
CPU time (seconds): 3.45
Elapsed time (seconds): 3.45
PS. These tests were not run on MS SQL Server.
#3
1
Here is a little trick that Gordon demonstrated a couple of weeks ago (I can't seem to find the link):
Declare @YourTable table (Machine int, Type char(1), Value int)
insert into @YourTable values
(1,'A',11)
,(1,'B',32)
,(2,'A',23)
,(3,'A',1 )
,(4,'B',23)
,(4,'B',31)
,(5,'B',56)
,(6,'A',12)
select avg(case when type='a' then 1.0 else 0 end)
from @YourTable
Returns
0.500000
#4
0
select
cast(sum(case when Type='A' then 1 else 0 end) as float)/cast(count(Type) as float)
from dataset