有人能给我解释一下本教程中运行的total和SQL self-join吗?

I was reading over the tutorial here: http://www.1keydata.com/sql/sql-running-totals.html and it all made sense until it suddenly got extremely ridiculously unbelievably complicated when it got to rank, median, running totals, etc. Can somebody explain in plain English how that query results in a running total? Thanks!

我在这里阅读了教程:http://www.1keydata.com/sql/sql-running-totals.html，所有这些都是有意义的，直到它突然变得极其荒谬地难以置信地复杂，当它需要排名、中位数、运行总数等等。谢谢!

4 个解决方案

#1

Before I get started, I've not seen this before and it doesn't look like a terribly comprehensible way to accomplish a running total.

在我开始之前，我还没有见过这个，它看起来不像是一种非常容易理解的方式来完成一个跑步总数。

Okay, here's the query from the tutorial:

好的，这是教程中的查询:

SELECT a1.Name, a1.Sales, SUM(a2.Sales) Running_Total
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales <= a2.sales or (a1.Sales=a2.Sales and a1.Name = a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC, a1.Name DESC;

And the sample output

和样例输出

Name    Sales   Running_Total
Greg     50     50
Sophia    40    90
Stella    20    110
Jeff      20    130
Jennifer  15    145
John      10    155

The simple part of this query is displaying the sales data for each employee. All we're doing is selecting name and sales from each employee and ordering them by the sale amount (descending). This gives us our base list.

这个查询的简单部分是显示每个员工的销售数据。我们所做的就是从每个员工中选择名字和销售额，并按销售额(降序)来排序。这给了我们基本的列表。

Now for the running total, we want every row that has already been displayed. So, we join the table against itself, on each row that would already have been displayed:

现在对于运行总数，我们希望显示已经显示的每一行。因此，我们将表与它本身连接起来，在每一行上都已经显示了:

WHERE a1.Sales <= a2.sales or (a1.Sales=a2.Sales and a1.Name = a2.Name)

Then we use the SUM aggregate function and group accordingly. A good way to understand this is if you look at what would happen if you didn't use the group function. The 'Sophia' row would look like this:

然后我们使用求和集合函数并相应地进行分组。理解这一点的一个好方法是如果不使用group函数会发生什么。“索菲娅”一排看起来是这样的:

Name    A1.Sales    A2.Sales
Sophia  40          50
Sophia    40         40

Notice how we got Greg's sales row? The group will sum that up, and voila!

注意我们是怎么搞到格雷格的销售的?这组人会总结出来，瞧!

Hope that helps. Joe

希望有帮助。乔

#2

The first table joins to itself, the join resulting in x number of rows, where x is the number of rows that have total sales lower than itself, or the name in the row is the same (i.e. all those sales previous to the row we are looking at, when ordered by sales amount).

第一个表连接,连接导致x的行数,其中x是的行数,总销售额低于本身,或行中的名称是相同的(即那些销售前一行我们看,当下令销售额)。

It then groups on the fields in the left side of the join and sums the rows we join to, thus a running total. To see how it works, you might want to run it without the sum and grouping, to see the raw results returned.

然后，它在join左边的字段上分组，并对我们加入的行进行求和，从而得到一个运行的总数。要查看它是如何工作的，您可能希望在不使用sum和分组的情况下运行它，以查看返回的原始结果。

#3

The SQL above gives a different result on Sybase (ASE 15). I think the reason is that the 'order by' is not applied until display time. Here is the SQL and the result:

上面的SQL在Sybase上给出了不同的结果(ASE 15)。我认为原因是“order by”直到显示时间才被应用。下面是SQL和结果:

drop table Total_Sales
go
create table Total_Sales
(
    Name char(15),
    Sales  int
)

INSERT INTO Total_Sales VALUES( 'John', 10 )
INSERT INTO Total_Sales VALUES( 'Jennifer', 15)
INSERT INTO Total_Sales VALUES('Stella', 20 )
INSERT INTO Total_Sales VALUES('Sophia', 40 )
INSERT INTO Total_Sales VALUES('Greg', 50 )
INSERT INTO Total_Sales VALUES('Jeff', 20 )

SELECT a1.Name, a1.Sales, SUM(a2.Sales) Running_Total 
FROM Total_Sales a1, Total_Sales a2 
WHERE a1.Sales <= a2.Sales or (a1.Sales=a2.Sales and a1.Name = a2.Name) 
GROUP BY a1.Name, a1.Sales 
ORDER BY a1.Sales DESC, a1.Name DESC

Result:

结果:

Name           Sales Running_Total   
Greg            50  50   
Sophia          40  90   
Stella          20  130 --note that two running totals are the same! 
Jeff            20  130  
Jennifer        15  145  
John            10  155

Bob

鲍勃

#4

I also get the same incorrect output as Bob above where the running total breaks down at Stella & Jeff, who have the same sales number. I'm using SQL Server 2014 Management Studio Express. I don't think the website's solution is actually correct. I did the join based on name instead of on sales and came up with these, which produce a correct running total:

我还得到了与上面的Bob相同的错误输出，其中Stella & Jeff的运行总数出现故障，他们的销售数字相同。我使用的是SQL Server 2014 Management Studio Express。我不认为这个网站的解决方案是正确的。我是根据名字而不是销售来加入的，然后想到了这些，这就产生了一个正确的运行总数:

select a1.name
, a1.sales
, sum(a2.sales) 'running_total'
from #total_sales a1
inner join #total_sales a2 on a1.name <= a2.name 
group by a1.name, a1.sales
order by sum(a2.sales);

Yields:

收益率:

name      sales  running_total
Stella    20     20
Sophia    40     60
John      10     70
Jennifer  15     85
Jeff      20     105
Greg      50     155

You could also do the variant below if you're uncomfortable sorting on an aggregate. It changes the order, but the running total is still correct:

如果你对聚合体感到不舒服，你也可以做下面的变换。它改变了顺序，但是运行总数仍然是正确的:

select a1.name
, a1.sales
, sum(a2.sales) 'running_total'
from #total_sales a1
inner join #total_sales a2 on a1.name >= a2.name 
group by a1.name, a1.sales
order by a1.name;

Yields:

收益率:

name     sales  running_total
Greg     50     50
Jeff     20     70
Jennifer 15     85
John     10     95
Sophia   40     135
Stella   20     155

#1

Before I get started, I've not seen this before and it doesn't look like a terribly comprehensible way to accomplish a running total.

在我开始之前，我还没有见过这个，它看起来不像是一种非常容易理解的方式来完成一个跑步总数。

Okay, here's the query from the tutorial:

好的，这是教程中的查询:

SELECT a1.Name, a1.Sales, SUM(a2.Sales) Running_Total
FROM Total_Sales a1, Total_Sales a2
WHERE a1.Sales <= a2.sales or (a1.Sales=a2.Sales and a1.Name = a2.Name)
GROUP BY a1.Name, a1.Sales
ORDER BY a1.Sales DESC, a1.Name DESC;

And the sample output

和样例输出

Name    Sales   Running_Total
Greg     50     50
Sophia    40    90
Stella    20    110
Jeff      20    130
Jennifer  15    145
John      10    155

这个查询的简单部分是显示每个员工的销售数据。我们所做的就是从每个员工中选择名字和销售额，并按销售额(降序)来排序。这给了我们基本的列表。

Now for the running total, we want every row that has already been displayed. So, we join the table against itself, on each row that would already have been displayed:

现在对于运行总数，我们希望显示已经显示的每一行。因此，我们将表与它本身连接起来，在每一行上都已经显示了:

WHERE a1.Sales <= a2.sales or (a1.Sales=a2.Sales and a1.Name = a2.Name)

然后我们使用求和集合函数并相应地进行分组。理解这一点的一个好方法是如果不使用group函数会发生什么。“索菲娅”一排看起来是这样的:

Name    A1.Sales    A2.Sales
Sophia  40          50
Sophia    40         40

Notice how we got Greg's sales row? The group will sum that up, and voila!

注意我们是怎么搞到格雷格的销售的?这组人会总结出来，瞧!

Hope that helps. Joe

希望有帮助。乔

#2

第一个表连接,连接导致x的行数,其中x是的行数,总销售额低于本身,或行中的名称是相同的(即那些销售前一行我们看,当下令销售额)。

#3

The SQL above gives a different result on Sybase (ASE 15). I think the reason is that the 'order by' is not applied until display time. Here is the SQL and the result:

上面的SQL在Sybase上给出了不同的结果(ASE 15)。我认为原因是“order by”直到显示时间才被应用。下面是SQL和结果:

drop table Total_Sales
go
create table Total_Sales
(
    Name char(15),
    Sales  int
)

INSERT INTO Total_Sales VALUES( 'John', 10 )
INSERT INTO Total_Sales VALUES( 'Jennifer', 15)
INSERT INTO Total_Sales VALUES('Stella', 20 )
INSERT INTO Total_Sales VALUES('Sophia', 40 )
INSERT INTO Total_Sales VALUES('Greg', 50 )
INSERT INTO Total_Sales VALUES('Jeff', 20 )

SELECT a1.Name, a1.Sales, SUM(a2.Sales) Running_Total 
FROM Total_Sales a1, Total_Sales a2 
WHERE a1.Sales <= a2.Sales or (a1.Sales=a2.Sales and a1.Name = a2.Name) 
GROUP BY a1.Name, a1.Sales 
ORDER BY a1.Sales DESC, a1.Name DESC

Result:

结果:

Name           Sales Running_Total   
Greg            50  50   
Sophia          40  90   
Stella          20  130 --note that two running totals are the same! 
Jeff            20  130  
Jennifer        15  145  
John            10  155

Bob

鲍勃

#4

select a1.name
, a1.sales
, sum(a2.sales) 'running_total'
from #total_sales a1
inner join #total_sales a2 on a1.name <= a2.name 
group by a1.name, a1.sales
order by sum(a2.sales);

Yields:

收益率:

name      sales  running_total
Stella    20     20
Sophia    40     60
John      10     70
Jennifer  15     85
Jeff      20     105
Greg      50     155

You could also do the variant below if you're uncomfortable sorting on an aggregate. It changes the order, but the running total is still correct:

如果你对聚合体感到不舒服，你也可以做下面的变换。它改变了顺序，但是运行总数仍然是正确的:

select a1.name
, a1.sales
, sum(a2.sales) 'running_total'
from #total_sales a1
inner join #total_sales a2 on a1.name >= a2.name 
group by a1.name, a1.sales
order by a1.name;

Yields:

收益率:

name     sales  running_total
Greg     50     50
Jeff     20     70
Jennifer 15     85
John     10     95
Sophia   40     135
Stella   20     155

秒客网

有人能给我解释一下本教程中运行的total和SQL self-join吗?

4 个解决方案

#1

#2

#3

#4

#1

#2

#3

#4

相关文章