Using percentile_cont with a "group by" statement in T-SQL

Time: 2021-07-26 08:46:58

I'd like to use the percentile_cont function to get median values in T-SQL, but I also need mean values. I'd like to do something like the following:

SELECT  CustomerID,
    AVG(Expenditure) AS MeanSpend,
    percentile_cont(.5) WITHIN GROUP(ORDER BY Expenditure) OVER() AS MedianSpend
FROM    Customers
GROUP BY CustomerID

Can this be accomplished? I know I can use the OVER clause to group the percentile_cont results, but then I'm stuck using two queries, am I not?

2 Solutions

#1


5  

Just figured it out: drop the GROUP BY and give both aggregate functions an OVER clause.

SELECT CustomerID,
    AVG(Expenditure) OVER(PARTITION BY CustomerID) AS MeanSpend,
    percentile_cont(.5) WITHIN GROUP(ORDER BY Expenditure) OVER(PARTITION BY CustomerID) AS MedianSpend
FROM Customers
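
Note that the query above returns one row for every row in Customers, with the same mean and median repeated for each customer. If one row per customer is wanted, a minimal sketch (assuming the Customers table and columns from the question; the alias c is just for illustration) is to compute the median in a derived table and then group the result:

```sql
-- Sketch only: assumes Customers(CustomerID, Expenditure) as in the question.
SELECT  c.CustomerID,
        AVG(c.Expenditure)  AS MeanSpend,
        -- MedianSpend is constant within each customer, so MAX just picks that value
        MAX(c.MedianSpend)  AS MedianSpend
FROM (
    SELECT  CustomerID,
            Expenditure,
            percentile_cont(.5) WITHIN GROUP(ORDER BY Expenditure)
                OVER(PARTITION BY CustomerID) AS MedianSpend
    FROM    Customers
) AS c
GROUP BY c.CustomerID;
```

This keeps everything in a single statement: the window function runs in the inner query, and the outer query is free to use GROUP BY.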

#2


1  

You can't use percentile_cont with "group by": in T-SQL it is only available as a window function, and window functions return the aggregated value on every row instead of collapsing the rows. One way to get one row per group is "select distinct", which removes the duplicate rows. Just make sure you partition each window function by the non-aggregated columns (groupId in this example).

--Generate test data
SELECT  TOP(10) 
    value.number%3  AS  groupId
,   value.number    AS  number
INTO    #data
FROM  master.dbo.spt_values  AS  value
WHERE value."type" = 'P' 
ORDER BY NEWID()
;

--View test data
SELECT  * FROM #data ORDER BY groupId,number;

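--Calculate mean and median per group
--number is an int, and AVG over an int truncates, so cast to float first
SELECT DISTINCT 
    groupId
,   AVG(CAST(number AS float))                          OVER(PARTITION BY groupId)  AS mean
,   percentile_cont(.5) WITHIN GROUP(ORDER BY number)   OVER(PARTITION BY groupId)  AS median
FROM    #data
;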

--Clean up
DROP TABLE #data;
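
Because the test data above is random, the results change on every run. For checking the approach against known values, here is a small sketch with fixed data; the table variable @spend and its contents are made up for illustration and are not from the question.

```sql
-- Hypothetical sample data, chosen so the results are easy to verify by hand
DECLARE @spend TABLE (CustomerID int, Expenditure decimal(10, 2));

INSERT INTO @spend (CustomerID, Expenditure)
VALUES  (1, 10), (1, 20), (1, 90),
        (2, 5),  (2, 15);

-- Without DISTINCT: five rows, the per-customer values repeated on every source row
SELECT  CustomerID,
        AVG(Expenditure) OVER(PARTITION BY CustomerID) AS MeanSpend,
        percentile_cont(.5) WITHIN GROUP(ORDER BY Expenditure)
            OVER(PARTITION BY CustomerID) AS MedianSpend
FROM    @spend;

-- With DISTINCT: the duplicates collapse to one row per customer
SELECT DISTINCT
        CustomerID,
        AVG(Expenditure) OVER(PARTITION BY CustomerID) AS MeanSpend,
        percentile_cont(.5) WITHIN GROUP(ORDER BY Expenditure)
            OVER(PARTITION BY CustomerID) AS MedianSpend
FROM    @spend;
```

Customer 1 should come out with a mean of 40 and a median of 20, which shows the two measures diverging on skewed data. Customer 2 has an even number of rows, so percentile_cont interpolates between 5 and 15 and both the mean and the median are 10.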
