Why is the MAX function used when pivoting text columns in SQL Server?

I have just learnt how to use PIVOT in SQL Server. Why is the MAX function used when we want to pivot text columns? What's the logic behind this? I understand it for COUNT, SUM etc. (because you're summing the values at that row and column), but I don't understand the logic of using MAX when we have text columns.

For example, my code is:

SELECT *
  FROM ( SELECT DATE
               ,SITA
               ,EVENT
          FROM  [UKRMC].[dbo].[strategy]
          WHERE datename(year, DATE) = 2018 OR datename(year, DATE) = 2019
        ) strategy
  PIVOT ( max(EVENT)
          FOR SITA IN ([ABZPD]
                      ,[BFSPD]
                      ,[BFSZH]
                      ,[BHXPD]
                      ,[BHXZH]
                      ,[BRSZH]
                      ,[BRUPQ])
        ) piv

2 Answers

#1

In your example you've chosen EVENT as the value to show at the PIVOT intersections (i.e. you've specified EVENT in the PIVOT clause). That value must be wrapped in one of the permissible aggregate functions, because there can be multiple rows for each of the column values you've chosen to pivot on once the data is grouped by the remaining columns (DATE in your case).

In SQL Server [1], MAX() or MIN() is commonly used when pivoting non-numeric columns, as either one returns one of the column's original values unchanged.

Any columns that are neither aggregated nor pivoted are left as-is and are used to form the groups on which the pivot is based (in your case, DATE appears neither in the aggregate nor in the FOR column, so it forms the row group).

Consider the case where your pivoted table contains multiple rows matching your predicate, such as this:

INSERT INTO strategy (DATE, SITA, EVENT) VALUES
('1 Jan 2018', 'ABZPD', 'Event1'),
('1 Jan 2018', 'BFSPD', 'Event2'),
('1 Jan 2018', 'BFSPD', 'Event3');

After Pivot:

DATE                    ABZPD   BFSPD
2018-01-01T00:00:00Z    Event1  Event3

In other words, during the pivot the two BFSPD rows (Event2 and Event3) had to be projected into a single cell, hence the need for an aggregate. The aggregate is still required even when there is known to be just one value (as with the Event1 value for SITA ABZPD in the example above).

Since BFSPD has two events, you need some rule for projecting them into a single cell value. Using MAX on the VARCHAR column picks the 'largest' value in string sort order (Event3) whenever multiple rows project into the same resulting pivot 'cell'.
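
One way to see what the aggregate is doing is to think of PIVOT as a GROUP BY on the remaining column (DATE) with one conditional aggregate per pivoted value. The sketch below is an illustration, not SQL Server code: it assumes Python's built-in sqlite3 module as a stand-in engine (SQLite has no PIVOT operator), and the MAX(CASE WHEN ...) form is a hand-written equivalent of the pivot. The rows are the three sample rows from the INSERT above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption:
# SQLite has no PIVOT, so the pivot is spelled out as GROUP BY + CASE).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE strategy (DATE TEXT, SITA TEXT, EVENT TEXT)")
conn.executemany(
    "INSERT INTO strategy (DATE, SITA, EVENT) VALUES (?, ?, ?)",
    [
        ("2018-01-01", "ABZPD", "Event1"),
        ("2018-01-01", "BFSPD", "Event2"),
        ("2018-01-01", "BFSPD", "Event3"),
    ],
)

# One output row per DATE; each pivoted column keeps the MAX of the
# EVENT values whose SITA matches that column (non-matching rows give NULL,
# which MAX ignores).
pivot_rows = conn.execute(
    """
    SELECT DATE,
           MAX(CASE WHEN SITA = 'ABZPD' THEN EVENT END) AS ABZPD,
           MAX(CASE WHEN SITA = 'BFSPD' THEN EVENT END) AS BFSPD
    FROM strategy
    GROUP BY DATE
    """
).fetchall()

print(pivot_rows)
```

Three source rows collapse into one row for the date, and the BFSPD cell keeps Event3 because it sorts after Event2 as a string. Replacing MAX with MIN would keep Event2 instead.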

You could instead use COUNT(EVENT) to show the number of events at each row / pivot intersection.
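
As a rough illustration of the COUNT variant, the sketch below again assumes Python's built-in sqlite3 module as a stand-in for SQL Server and writes the pivot by hand as GROUP BY plus COUNT(CASE WHEN ...). The table and rows mirror the sample data from the example above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE strategy (DATE TEXT, SITA TEXT, EVENT TEXT)")
conn.executemany(
    "INSERT INTO strategy (DATE, SITA, EVENT) VALUES (?, ?, ?)",
    [
        ("2018-01-01", "ABZPD", "Event1"),
        ("2018-01-01", "BFSPD", "Event2"),
        ("2018-01-01", "BFSPD", "Event3"),
    ],
)

# COUNT(expr) counts only non-NULL values, so each column counts the
# events whose SITA matches that column.
count_rows = conn.execute(
    """
    SELECT DATE,
           COUNT(CASE WHEN SITA = 'ABZPD' THEN EVENT END) AS ABZPD,
           COUNT(CASE WHEN SITA = 'BFSPD' THEN EVENT END) AS BFSPD
    FROM strategy
    GROUP BY DATE
    """
).fetchall()

print(count_rows)
```

Here the ABZPD cell holds 1 and the BFSPD cell holds 2, which makes the "several rows per cell" situation visible instead of hiding it behind a single surviving value.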

You could also swap the roles of the two columns and aggregate DATE instead of EVENT. EVENT is then the column left over, so it is used for the row grouping.
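
A minimal sketch of that swap, under the same assumptions as before (Python's built-in sqlite3 as a stand-in for SQL Server, the pivot written by hand as GROUP BY plus conditional MAX, and the sample rows from the example above):

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE strategy (DATE TEXT, SITA TEXT, EVENT TEXT)")
conn.executemany(
    "INSERT INTO strategy (DATE, SITA, EVENT) VALUES (?, ?, ?)",
    [
        ("2018-01-01", "ABZPD", "Event1"),
        ("2018-01-01", "BFSPD", "Event2"),
        ("2018-01-01", "BFSPD", "Event3"),
    ],
)

# DATE is now the aggregated value and EVENT is the leftover column,
# so the result has one row per EVENT.
swapped_rows = conn.execute(
    """
    SELECT EVENT,
           MAX(CASE WHEN SITA = 'ABZPD' THEN DATE END) AS ABZPD,
           MAX(CASE WHEN SITA = 'BFSPD' THEN DATE END) AS BFSPD
    FROM strategy
    GROUP BY EVENT
    ORDER BY EVENT
    """
).fetchall()

for row in swapped_rows:
    print(row)
```

Each event now gets its own row, with the date under the matching SITA column and NULL (None in Python) elsewhere.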

[1] Aggregates like AVG or STDEV are obviously not available for strings. Other RDBMSs have additional aggregates such as FIRST, which arbitrarily takes the first value, or GROUP_CONCAT / LISTAGG, which fold string values together with a delimiter. PostgreSQL even lets you define your own aggregate functions. Sadly, none of this is on offer here in SQL Server, hence MIN() / MAX() for now.
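
To illustrate what a concatenating aggregate does with the same data, the sketch below uses SQLite's group_concat through Python's built-in sqlite3 module. This is an assumption made for the sake of a runnable example and shows another engine's behaviour, not SQL Server's. SQLite does not guarantee the order of the concatenated values, so the order inside the BFSPD cell may vary.

```python
import sqlite3

# In-memory SQLite database, used here as an example of an engine with a
# GROUP_CONCAT-style aggregate (assumption: not SQL Server behaviour).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE strategy (DATE TEXT, SITA TEXT, EVENT TEXT)")
conn.executemany(
    "INSERT INTO strategy (DATE, SITA, EVENT) VALUES (?, ?, ?)",
    [
        ("2018-01-01", "ABZPD", "Event1"),
        ("2018-01-01", "BFSPD", "Event2"),
        ("2018-01-01", "BFSPD", "Event3"),
    ],
)

# group_concat skips NULLs, so each cell holds every matching event
# joined by the delimiter instead of a single surviving value.
concat_row = conn.execute(
    """
    SELECT DATE,
           group_concat(CASE WHEN SITA = 'ABZPD' THEN EVENT END, ', ') AS ABZPD,
           group_concat(CASE WHEN SITA = 'BFSPD' THEN EVENT END, ', ') AS BFSPD
    FROM strategy
    GROUP BY DATE
    """
).fetchone()

print(concat_row)
```

Unlike MAX, this keeps both Event2 and Event3 in the BFSPD cell, which is the behaviour the footnote is wishing for.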

#2

An aggregation function must be specified when using the PIVOT operator because the first step of the pivoting operation is a grouping operation (on the columns that are neither aggregated nor named in the FOR clause), which reduces the number of rows in the resulting table.

The aggregation function is used to manage values for the other columns that are required in the output table.

From the TechNet documentation:

PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where they are required on any remaining column values that are wanted in the final output.

Here is the PIVOT command syntax taken from the same Technet article:

SELECT <non-pivoted column>,  
    [first pivoted column] AS <column name>,  
    [second pivoted column] AS <column name>,  
    ...  
    [last pivoted column] AS <column name>  
FROM  
    (<SELECT query that produces the data>)   
    AS <alias for the source query>  
PIVOT  
(  
    <aggregation function>(<column being aggregated>)  
FOR   
[<column that contains the values that will become column headers>]   
    IN ( [first pivoted column], [second pivoted column],  
    ... [last pivoted column])  
) AS <alias for the pivot table>  
<optional ORDER BY clause>;

Please note that the first thing inside the PIVOT clause must be an aggregation function:

...
<aggregation function>(<column being aggregated>)
...

For additional insight on this topic, see also this Microsoft Press article.
