I have a table that looks like this:
+------+------+------------------+
| item | val  | timestamp        |
+------+------+------------------+
|    1 | 3.66 | 16-05-2011 09:17 |
|    1 | 2.56 | 16-05-2011 09:47 |
|    2 | 4.23 | 16-05-2011 09:37 |
|    3 | 6.89 | 16-05-2011 11:26 |
|    3 | 1.12 | 16-05-2011 12:11 |
|    3 | 4.56 | 16-05-2011 13:23 |
|    4 | 1.10 | 16-05-2011 14:11 |
|    4 | 9.79 | 16-05-2011 14:23 |
|    5 | 1.58 | 16-05-2011 15:27 |
|    5 | 0.80 | 16-05-2011 15:29 |
|    6 | 3.80 | 16-05-2011 15:29 |
+------+------+------------------+
So the grand total of all items for the day 16 May 2011 is 40.09.
Now I want to retrieve which items of this list make up 80% of the grand total. Let me give an example:
- Grand total: 40.09
- 80% of the grand total: 32.07
Starting from the item with the largest share of the total, I want to retrieve the list of items, grouped by item, that together make up 80% of the grand total:
+------+------+
| item | val |
+------+------+
| 3 | 12.57|
| 4 | 10.89|
| 1 | 6.22|
+------+------+
As you can see, the result set contains the rows grouped by item code and ordered by each item's share of the grand total, descending, until the 80% threshold is reached.
From item 2 onward the items are discarded from the result set because including them would exceed the 80% threshold:
12.57 + 10.89 + 6.22 + 4.23 > 32.07 (80% of the grand total)
This is not homework; this is a real situation where I am stuck, and I need to achieve the result with a single query.
The query should run unmodified, or with few changes, on MySQL, SQL Server and PostgreSQL.
2 solutions
#1
4
You can do this with a single query:
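-- Written and tested on DB2. PostgreSQL requires WITH RECURSIVE instead of WITH.
WITH Total_Sum(overallTotal) AS (
    -- grand total of every row
    SELECT SUM(val)
    FROM dataTable
),
Summed_Items(id, total) AS (
    -- one total per item (the question's column is named item)
    SELECT item, SUM(val)
    FROM dataTable
    GROUP BY item
),
Ordered_Sums(id, total, ord) AS (
    -- rank items from largest total to smallest
    SELECT id, total,
           ROW_NUMBER() OVER (ORDER BY total DESC)
    FROM Summed_Items
),
Percent_List(id, itemTotal, ord, overallTotal) AS (
    -- anchor: the largest item starts the running total
    SELECT id, total, ord, total
    FROM Ordered_Sums
    WHERE ord = 1
    UNION ALL
    -- recursive step: take the next item while the running total stays under 80%
    SELECT b.id, b.total, b.ord, b.total + a.overallTotal
    FROM Percent_List AS a
    JOIN Ordered_Sums AS b
      ON b.ord = a.ord + 1
    JOIN Total_Sum AS c
      ON (c.overallTotal * .8) > (a.overallTotal + b.total)
)
SELECT id, itemTotal
FROM Percent_List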
This yields the following:
id itemTotal
3 12.57
4 10.89
1 6.22
Please note that this will not work in MySQL before version 8.0 (no CTEs), and requires PostgreSQL 8.4 or later (earlier versions support neither CTEs nor OLAP/window functions). SQL Server should be able to run the statement as-is (I think; this was written and tested on DB2). Otherwise, you could try to translate it into correlated table joins, but it will not be pretty, if it is even possible; a stored procedure or reassembly in a higher-level language may then be your only option.
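On engines that support window functions, the running total can also be computed without recursion. The following is a minimal sketch, not the method tested above: it swaps the recursive CTE for SUM(...) OVER (ORDER BY ...), and runs the SQL through Python's sqlite3 module against an in-memory table purely so the example is self-contained. The table name dataTable is taken from the query above; item_totals, running, running_total, grand_total, SAMPLE_ROWS and top_items are names made up for illustration. It assumes SQLite 3.25 or later (window function support) and positive values, so the running total only ever grows.

```python
import sqlite3

# Sample data from the question: (item, val). The timestamp is omitted here.
SAMPLE_ROWS = [
    (1, 3.66), (1, 2.56), (2, 4.23), (3, 6.89), (3, 1.12), (3, 4.56),
    (4, 1.10), (4, 9.79), (5, 1.58), (5, 0.80), (6, 3.80),
]

# Hypothetical window-function variant of the query (assumption, not the
# original answer's SQL).
QUERY = """
WITH item_totals AS (
    SELECT item, SUM(val) AS total
    FROM dataTable
    GROUP BY item
),
running AS (
    SELECT item,
           total,
           -- cumulative sum from the largest item downwards
           SUM(total) OVER (ORDER BY total DESC, item
                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
               AS running_total,
           -- grand total repeated on every row
           SUM(total) OVER () AS grand_total
    FROM item_totals
)
SELECT item, total
FROM running
WHERE running_total < :share * grand_total
ORDER BY total DESC, item
"""


def top_items(rows, share=0.8):
    """Return (item, total) pairs whose running total stays under the share."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE dataTable (item INTEGER, val REAL)")
        conn.executemany("INSERT INTO dataTable VALUES (?, ?)", rows)
        result = conn.execute(QUERY, {"share": share}).fetchall()
    finally:
        conn.close()
    # Round only for display; the comparison above uses the raw sums.
    return [(item, round(total, 2)) for item, total in result]


if __name__ == "__main__":
    print(top_items(SAMPLE_ROWS))
```

With the question's data, top_items(SAMPLE_ROWS) returns [(3, 12.57), (4, 10.89), (1, 6.22)]. The SELECT itself should need few or no changes on PostgreSQL, SQL Server 2012 or later, and MySQL 8.0 or later, but it has only been sketched against SQLite here.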
#2
0
I don't know of any way this can be done with a single query; you'll probably have to create a stored procedure. The steps of the proc would be something like this:
- Calculate the grand total for that day using SUM
- Get the per-item totals for that day (GROUP BY item), ordered by total DESC
- Keep a running total as you loop through the items; as long as the running total, including the current item, is < 0.8 * grandtotal, add the current item to your list
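The steps above can be sketched procedurally. This is only an illustration in Python rather than an actual stored procedure, and it assumes the rows have already been fetched for the day in question; it groups by item before looping, since the question asks for per-item totals. The names DAY_ROWS and items_within_share are hypothetical.

```python
from collections import defaultdict

# Sample data from the question: (item, val) for 16 May 2011.
DAY_ROWS = [
    (1, 3.66), (1, 2.56), (2, 4.23), (3, 6.89), (3, 1.12), (3, 4.56),
    (4, 1.10), (4, 9.79), (5, 1.58), (5, 0.80), (6, 3.80),
]


def items_within_share(rows, share=0.8):
    """Pick the largest items while the running total stays under the share."""
    # Per-item totals (the GROUP BY step).
    totals = defaultdict(float)
    for item, val in rows:
        totals[item] += val

    # Grand total and the threshold derived from it.
    limit = share * sum(totals.values())

    # Walk the items from largest total to smallest, keeping a running total.
    selected = []
    running = 0.0
    for item, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        if running + total >= limit:
            break  # adding this item would reach or exceed the threshold
        running += total
        selected.append((item, round(total, 2)))
    return selected


if __name__ == "__main__":
    print(items_within_share(DAY_ROWS))
```

For the question's data this returns [(3, 12.57), (4, 10.89), (1, 6.22)]; item 2 is the first one rejected because 29.68 + 4.23 exceeds 32.07.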