Here is a sample table that mimics my scenario:
COL_1  COL_2  COL_3  COL_4  LAST_COL
A      P      X      NY     10
A      P      X      NY     11
A      P      Y      NY     12
A      P      Y      NY     13
A      P      X      NY     14
B      Q      X      NY     15
B      Q      Y      NY     16
B      Q      Y      CA     17
B      Q      Y      CA     18
LAST_COL is the primary key, so its value is different in every row.
I want to ignore LAST_COL and gather some statistics on the other 4 columns.
Basically, I have millions of rows in my table, and I want to know which combinations of COL_1, COL_2, COL_3 and COL_4 have the most rows.
So I want a query that outputs every unique combination of those columns along with its number of occurrences.
COL_1  COL_2  COL_3  COL_4  TOTAL
A      P      X      NY     3
A      P      Y      NY     2
B      Q      X      NY     1
B      Q      Y      NY     1
B      Q      Y      CA     2
Thanks to anyone who helps me with this.
*I am using MS SQL, if that makes any difference.
4 Solutions
#1
9
SELECT COL_1, COL_2, COL_3, COL_4, COUNT(*)
FROM MyTable
GROUP BY COL_1, COL_2, COL_3, COL_4
If you ever want to weed out rows that don't have a duplicate:
SELECT COL_1, COL_2, COL_3, COL_4, COUNT(*)
FROM MyTable
GROUP BY COL_1, COL_2, COL_3, COL_4
HAVING COUNT(*) > 1
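To check these two queries against the sample data from the question, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite is only a stand-in for MS SQL (an assumption made so the example runs locally), the TOTAL alias is borrowed from the question's expected output, and the variable names are made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MS SQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable ("
    "COL_1 TEXT, COL_2 TEXT, COL_3 TEXT, COL_4 TEXT, "
    "LAST_COL INTEGER PRIMARY KEY)"
)

# The sample rows from the question.
sample_rows = [
    ("A", "P", "X", "NY", 10),
    ("A", "P", "X", "NY", 11),
    ("A", "P", "Y", "NY", 12),
    ("A", "P", "Y", "NY", 13),
    ("A", "P", "X", "NY", 14),
    ("B", "Q", "X", "NY", 15),
    ("B", "Q", "Y", "NY", 16),
    ("B", "Q", "Y", "CA", 17),
    ("B", "Q", "Y", "CA", 18),
]
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)", sample_rows)

# Count rows per unique combination of the four columns, ignoring LAST_COL.
GROUP_QUERY = """
SELECT COL_1, COL_2, COL_3, COL_4, COUNT(*) AS TOTAL
FROM MyTable
GROUP BY COL_1, COL_2, COL_3, COL_4
"""

# Same grouping, but keep only combinations that occur more than once.
DUPLICATES_QUERY = GROUP_QUERY + "HAVING COUNT(*) > 1"

# Row order is not guaranteed without ORDER BY, so collect results into dicts
# keyed by the (COL_1, COL_2, COL_3, COL_4) tuple.
counts = {row[:4]: row[4] for row in conn.execute(GROUP_QUERY)}
duplicates = {row[:4]: row[4] for row in conn.execute(DUPLICATES_QUERY)}

for combo, total in sorted(counts.items()):
    print(*combo, total)
```

The counts match the TOTAL column the question asks for, and the HAVING variant drops the two combinations that occur only once.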
#2
1
The GROUP BY is what you want here. For example:
SELECT COL_1, COL_2, COL_3, COL_4, COUNT(*)
FROM my_table
GROUP BY COL_1, COL_2, COL_3, COL_4
#3
1
If I am understanding correctly, all you need is something like:
SELECT COL_1,COL_2,COL_3,COL_4, COUNT(*) AS TOTAL
FROM MyTable
GROUP BY COL_1,COL_2,COL_3,COL_4
#4
1
You asked which combinations have the most rows, so count them and then ORDER BY the count DESC:
SELECT COL_1, COL_2, COL_3, COL_4, COUNT(*) COUNT_ROWS
FROM TBL
GROUP BY COL_1, COL_2, COL_3, COL_4
ORDER BY COUNT_ROWS DESC
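As a rough check of the ordering, the sketch below runs this query on the question's sample rows through Python's standard-library sqlite3 module. SQLite again stands in for MS SQL as an assumption. The extra sort keys after COUNT_ROWS are not part of the answer; they are added here only so that tied counts come back in a predictable order.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MS SQL (assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TBL ("
    "COL_1 TEXT, COL_2 TEXT, COL_3 TEXT, COL_4 TEXT, "
    "LAST_COL INTEGER PRIMARY KEY)"
)
conn.executemany(
    "INSERT INTO TBL VALUES (?, ?, ?, ?, ?)",
    [
        ("A", "P", "X", "NY", 10),
        ("A", "P", "X", "NY", 11),
        ("A", "P", "Y", "NY", 12),
        ("A", "P", "Y", "NY", 13),
        ("A", "P", "X", "NY", 14),
        ("B", "Q", "X", "NY", 15),
        ("B", "Q", "Y", "NY", 16),
        ("B", "Q", "Y", "CA", 17),
        ("B", "Q", "Y", "CA", 18),
    ],
)

# Largest groups first; the trailing columns only break ties deterministically.
ORDERED_QUERY = """
SELECT COL_1, COL_2, COL_3, COL_4, COUNT(*) AS COUNT_ROWS
FROM TBL
GROUP BY COL_1, COL_2, COL_3, COL_4
ORDER BY COUNT_ROWS DESC, COL_1, COL_2, COL_3, COL_4
"""

ordered = conn.execute(ORDERED_QUERY).fetchall()
most_common = ordered[0]  # the combination with the most rows

for row in ordered:
    print(*row)
```

On a table with millions of rows, only the first few rows of this result are usually of interest, since they are the most frequent combinations.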