How do I find the unique groups that are present in my table and display how often each type of group is used?
For example (SQL Server 2008 R2):
So, I would like to find out how many times the combination of
PMI 100
RT 100
VT 100
is present in my table, that is, for how many itemids it is used.
These three form a group because together they are assigned to a single itemid. The same combination is assigned to itemids 2527 and 2529, so this group is used at least twice (usagecount = 2).
(I want to know this for every type of group that appears.)
- The entire dataset is quite large, about 5,000,000 records, so I'd like to avoid using a cursor.
- The number of code/pct combinations per itemid varies between 1 and 6.
- The values in the "code" field are not known up front; there are more than a dozen of them on average.
I tried using PIVOT but eventually got stuck, and I also tried various combinations of GROUP BY and counts.
Any bright ideas?
Example output:
code  pct  groupid  usagecount
PMI   100  1        234
RT    100  1        234
VT    100  1        234
CD    5    2        567
PMI   100  2        567
VT    100  2        567
PMI   100  3        123
PT    100  3        123
VT    100  3        123
RT    100  4        39
VT    100  4        39
etc
3 Solutions
#1
2
Just using a simple group:
SELECT
    code
    , pct
    , COUNT(*)
FROM myTable
GROUP BY
    code
    , pct
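To make it easier to see what this query returns, here is a rough Python sketch of the same aggregation. The sample rows are made up for illustration (only itemids 2527 and 2529 and their three pairs come from the question; itemid 2528 is hypothetical). Note that it counts individual code/pct rows and never looks at itemid.

```python
from collections import Counter

# Hypothetical sample rows (itemid, code, pct). Only 2527 and 2529 are taken
# from the question; 2528 is invented for illustration.
rows = [
    (2527, "PMI", 100), (2527, "RT", 100), (2527, "VT", 100),
    (2528, "RT", 100), (2528, "VT", 100),
    (2529, "PMI", 100), (2529, "RT", 100), (2529, "VT", 100),
]

# Equivalent of GROUP BY code, pct with COUNT(*): itemid is ignored entirely.
pair_counts = Counter((code, pct) for _itemid, code, pct in rows)

for (code, pct), n in sorted(pair_counts.items()):
    print(code, pct, n)
```

With this sample, each pair is counted on its own, so the result says nothing about which pairs occur together on the same itemid.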
Not too sure if that's more like what you're looking for:
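select
    uniqueGrp
    , count(*)
from (
    -- one row per itemid
    select distinct
        itemid
    from myTable
) as I
cross apply (
    -- concatenate this itemid's code/pct pairs, in a fixed order, into one string
    select
        cast(code as varchar(max)) + cast(pct as varchar(max)) + '_'
    from myTable
    where myTable.itemid = I.itemid
    order by code, pct
    for xml path('')
) as x(uniqueGrp)
-- itemids that produce the same string share the same combination
group by uniqueGrp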
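The idea behind the second query (build one ordered "signature" per itemid, then count identical signatures) can be sketched in Python as follows. This is only an illustration of the logic, not a replacement for the T-SQL: the sample rows are hypothetical apart from itemids 2527 and 2529, and the groupid numbering is an arbitrary choice of mine.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows (itemid, code, pct). Only 2527 and 2529 are taken
# from the question; 2528 is invented for illustration.
rows = [
    (2527, "PMI", 100), (2527, "RT", 100), (2527, "VT", 100),
    (2528, "RT", 100), (2528, "VT", 100),
    (2529, "PMI", 100), (2529, "RT", 100), (2529, "VT", 100),
]

# Collect the (code, pct) pairs of each itemid.
pairs_by_item = defaultdict(list)
for itemid, code, pct in rows:
    pairs_by_item[itemid].append((code, pct))

# Sort each item's pairs so the same combination always yields the same
# signature, mirroring ORDER BY code, pct inside the FOR XML PATH subquery.
signatures = {item: tuple(sorted(pairs)) for item, pairs in pairs_by_item.items()}

# Count how many itemids share each signature (the outer GROUP BY uniqueGrp).
usage = Counter(signatures.values())

# Number the distinct groups (arbitrary order) and print one row per
# code, pct, groupid, usagecount, as in the question's example output.
for groupid, (signature, usagecount) in enumerate(sorted(usage.items()), start=1):
    for code, pct in signature:
        print(code, pct, groupid, usagecount)
```

The sketch keeps each signature as a tuple of pairs. In the T-SQL version, code and pct are concatenated without a separator between them, so if codes can end in digits it may be worth adding one to avoid two different pairs producing the same string.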
#2
2
Either of these should return each combination of code and percentage, together with a group id for the code and the total number of rows for that code. You can also extend them with the number of instances of each specific code/pct combination, for example to work out its percentage contribution.
select
    distinct
    t.code, t.pct, v.groupcol, v.vol
from
    [tablename] t
    inner join (select code, rank() over(order by count(*)) as groupcol,
                       count(*) as vol from [tablename] s
                group by code) v on v.code=t.code
or
select
    t.code, t.pct, v.groupcol, v.vol
from
    (select code, pct from [tablename] group by code, pct) t
    inner join (select code, rank() over(order by count(*)) as groupcol,
                       count(*) as vol from [tablename] s
                group by code) v on v.code=t.code
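For readers less familiar with rank(), here is a small Python sketch of what both queries compute, again on made-up rows (only itemids 2527 and 2529 come from the question). It assumes the usual RANK semantics: ascending by count, with ties sharing a rank.

```python
from collections import Counter

# Hypothetical sample rows (itemid, code, pct). Only 2527 and 2529 are taken
# from the question; 2528 is invented for illustration.
rows = [
    (2527, "PMI", 100), (2527, "RT", 100), (2527, "VT", 100),
    (2528, "RT", 100), (2528, "VT", 100),
    (2529, "PMI", 100), (2529, "RT", 100), (2529, "VT", 100),
]

# count(*) ... group by code
vol = Counter(code for _itemid, code, _pct in rows)

# rank() over (order by count(*)): rank = 1 + number of codes with a smaller
# count, so codes with equal counts share the same rank.
ordered_counts = sorted(vol.values())
groupcol = {code: ordered_counts.index(n) + 1 for code, n in vol.items()}

# select distinct t.code, t.pct, v.groupcol, v.vol
result = sorted({(code, pct, groupcol[code], vol[code]) for _itemid, code, pct in rows})
for row in result:
    print(*row)
```

Here groupcol is derived from how often each code occurs overall, so two codes with the same count get the same value regardless of which itemids they belong to.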
#3
1
Grouping by code and pct should be enough, I think. See the following:
select code, pct, count(*) from [table] as p group by code, pct