Combine two columns and add them to a new column

Time: 2022-07-03 22:54:47

In PostgreSQL, I want to use an SQL statement to combine two columns and create a new column from them.

I'm thinking about using concat(...), but is there a better way?
What's the best way to do this?

3 solutions

#1


58  

Generally, I agree with @kgrittn's advice. Go for it.

But to address your basic question about concat(): The new function concat() is useful if you need to deal with null values - and null has neither been ruled out in your question nor in the one you refer to.

If you can rule out null values, the good old (SQL standard) concatenation operator || is still the best choice, and @luis' answer is just fine:

SELECT col_a || col_b;

If either of your columns can be null, the result would be null in that case. You could defend with COALESCE:

SELECT COALESCE(col_a, '') || COALESCE(col_b, '');

But that gets tedious quickly with more arguments. That's where concat() comes in: it never returns null, not even if all arguments are null. Per documentation:

NULL arguments are ignored.

SELECT concat(col_a, col_b);
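
To see the three variants side by side, here is a minimal sketch. The sample rows and the output column aliases are made up for illustration; col_a and col_b are assumed to be text, and concat() requires PostgreSQL 9.1 or later.

```sql
-- Hypothetical sample data: one row without nulls, one with a single null,
-- one where both columns are null.
SELECT col_a, col_b,
       col_a || col_b                             AS op_result,
       COALESCE(col_a, '') || COALESCE(col_b, '') AS coalesce_result,
       concat(col_a, col_b)                       AS concat_result
FROM  (VALUES ('foo', 'bar')
            , ('foo', NULL)
            , (NULL , NULL)) AS t(col_a, col_b);

-- Row ('foo', 'bar'): all three expressions return 'foobar'.
-- Row ('foo', NULL):  op_result is NULL, the other two return 'foo'.
-- Row (NULL, NULL):   op_result is NULL, the other two return '' (empty string).
```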

The remaining corner case for both alternatives (COALESCE and concat()) is where all input columns are null. In that case we still get an empty string '', but one might want null instead (at least I would). One possible way:

SELECT CASE
          WHEN col_a IS NULL THEN col_b
          WHEN col_b IS NULL THEN col_a
          ELSE col_a || col_b
       END;

This quickly gets more complex with more columns. Again, use concat() but add a check for the special condition:

SELECT CASE WHEN (col_a, col_b) IS NULL THEN NULL
            ELSE concat(col_a, col_b) END;

How does this work?
(col_a, col_b) is shorthand notation for a row type expression ROW (col_a, col_b). And the test "row IS NULL" is only true if all columns are null. Detailed explanation:
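
A quick way to convince yourself of the row-wise null test is the sketch below. The sample rows and aliases are invented for illustration. Note that for rows, IS NOT NULL is not simply the negation of IS NULL: it is only true if every column is non-null.

```sql
-- Hypothetical sample data to exercise the row-wise null test.
SELECT col_a, col_b,
       (col_a, col_b) IS NULL     AS all_null,
       (col_a, col_b) IS NOT NULL AS none_null,
       CASE WHEN (col_a, col_b) IS NULL THEN NULL
            ELSE concat(col_a, col_b) END AS combined
FROM  (VALUES ('foo', 'bar')
            , ('foo', NULL)
            , (NULL , NULL)) AS t(col_a, col_b);

-- Row ('foo', 'bar'): all_null = false, none_null = true,  combined = 'foobar'
-- Row ('foo', NULL):  all_null = false, none_null = false, combined = 'foo'
-- Row (NULL, NULL):   all_null = true,  none_null = false, combined is NULL
```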

Also, use concat_ws() to add separators between elements (_ws .. "with separator").
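
A small sketch of how concat_ws() treats nulls, using literal values picked for illustration. The first argument is the separator; null values among the remaining arguments are skipped, so no dangling separator is left behind.

```sql
SELECT concat_ws(', ', 'Nanuet', 'NY') AS both_present   -- 'Nanuet, NY'
     , concat_ws(', ', 'Nanuet', NULL) AS state_missing  -- 'Nanuet'
     , concat_ws(', ', NULL, NULL)     AS all_missing;   -- '' (empty string)
```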

An expression like the one in Kevin's answer:

SELECT $1.zipcode || ' - ' || $1.city || ', ' || $1.state;

is tedious to prepare for null values in PostgreSQL 8.3 (without concat()). One way (of many):

SELECT COALESCE(
         CASE
          WHEN $1.zipcode IS NULL THEN $1.city
          WHEN $1.city IS NULL THEN $1.zipcode
          ELSE $1.zipcode || ' - ' || $1.city
         END, '')
       || COALESCE(', ' || $1.state, '');
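
For comparison, on PostgreSQL 9.1 or later the same logic can probably be written with nested concat_ws() calls. This is only a sketch: the sample rows are invented, and plain column names stand in for the $1.zipcode style parameters used above.

```sql
-- Inner call joins zipcode and city with ' - ', outer call appends the state.
SELECT concat_ws(', ', concat_ws(' - ', zipcode, city), state) AS combined
FROM  (VALUES ('10954', 'Nanuet', 'NY')
            , (NULL   , 'Nanuet', 'NY')
            , ('10954', 'Nanuet', NULL)) AS t(zipcode, city, state);

-- Row 1: '10954 - Nanuet, NY'
-- Row 2: 'Nanuet, NY'
-- Row 3: '10954 - Nanuet'
```

Like the COALESCE version, this still yields ', NY' when zipcode and city are both null, because the inner call returns an empty string rather than null.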

Function volatility is only STABLE

Note, however, that concat() and concat_ws() are STABLE functions, not IMMUTABLE, because they can invoke datatype output functions (like timestamptz_out) that depend on locale settings. Explanation by Tom Lane.

This prohibits their direct use in index expressions. If you know that the result is actually immutable in your case, you can work around this with an IMMUTABLE function wrapper. Example here:
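
The linked example is not reproduced here. A minimal sketch of such a wrapper might look like the following; the function name f_concat_immutable, the table tbl_demo and the index name are all hypothetical. Restricting the arguments to text is what makes the IMMUTABLE label defensible, since no locale-dependent output function gets called.

```sql
-- Hypothetical wrapper: PostgreSQL does not verify the IMMUTABLE label,
-- so it is on you to make sure it is actually true (here: text input only).
CREATE FUNCTION f_concat_immutable(text, text)
  RETURNS text
  LANGUAGE sql IMMUTABLE
AS $$
  SELECT concat($1, $2);
$$;

-- Hypothetical table for illustration.
CREATE TABLE tbl_demo (col_a text, col_b text);

-- Using concat(col_a, col_b) directly here would be rejected,
-- because functions in index expressions must be marked IMMUTABLE.
CREATE INDEX tbl_demo_concat_idx
  ON tbl_demo (f_concat_immutable(col_a, col_b));
```

Queries have to use the same expression, f_concat_immutable(col_a, col_b), for the index to be considered.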

#2


13  

You don't need to store the column to reference it that way. Try this:

To set up:

CREATE TABLE tbl
  (zipcode text NOT NULL, city text NOT NULL, state text NOT NULL);
INSERT INTO tbl VALUES ('10954', 'Nanuet', 'NY');

We can see we have "the right stuff":

\pset border 2
SELECT * FROM tbl;
+---------+--------+-------+
| zipcode |  city  | state |
+---------+--------+-------+
| 10954   | Nanuet | NY    |
+---------+--------+-------+

Now add a function with the desired "column name" which takes the record type of the table as its only parameter:

CREATE FUNCTION combined(rec tbl)
  RETURNS text
  LANGUAGE SQL
AS $$
  SELECT $1.zipcode || ' - ' || $1.city || ', ' || $1.state;
$$;

This creates a function which can be used as if it were a column of the table, as long as the table name or alias is specified, like this:

SELECT *, tbl.combined FROM tbl;

Which displays like this:

+---------+--------+-------+--------------------+
| zipcode |  city  | state |      combined      |
+---------+--------+-------+--------------------+
| 10954   | Nanuet | NY    | 10954 - Nanuet, NY |
+---------+--------+-------+--------------------+

This works because PostgreSQL checks first for an actual column. If none is found and the identifier is qualified with a relation name or alias, it looks for a function like the one above and runs it with the row as its argument, returning the result as if it were a column. You can even index on such a "generated column" if you want to do so, provided the function is declared IMMUTABLE (as written above, it defaults to VOLATILE).
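
The same mechanism, sketched with a separate made-up table so it can be run on its own. The names person, full_name, first_name and last_name are hypothetical. The point is that attribute notation and functional notation are interchangeable.

```sql
-- Hypothetical table and data.
CREATE TABLE person (first_name text NOT NULL, last_name text NOT NULL);
INSERT INTO person VALUES ('Ada', 'Lovelace');

-- Function taking the table's row type as its only parameter.
CREATE FUNCTION full_name(rec person)
  RETURNS text
  LANGUAGE sql
AS $$
  SELECT $1.first_name || ' ' || $1.last_name;
$$;

-- Both expressions call the same function and return 'Ada Lovelace'.
SELECT p.full_name  AS attribute_notation,
       full_name(p) AS functional_notation
FROM   person AS p;

-- An unqualified reference (SELECT full_name FROM person) fails, because
-- there is no such column, and SELECT * does not include the function result.
```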

Because you're not using extra space in each row for the duplicated data, or firing triggers on all inserts and updates, this can often be faster than the alternatives.

#3


12  

Did you check the string concatenation operator? Something like:

UPDATE table_c SET column_a = column_b || column_c;

should work.
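
Since the question asks for a new column, the target column has to exist before the UPDATE can fill it. A minimal sketch of the full sequence, assuming text columns and made-up sample data:

```sql
-- Hypothetical starting point: table_c with the two source columns only.
CREATE TABLE table_c (column_b text, column_c text);
INSERT INTO table_c VALUES ('foo', 'bar');

-- Add the new column, then fill it from the existing ones.
ALTER TABLE table_c ADD COLUMN column_a text;
UPDATE table_c SET column_a = column_b || column_c;

SELECT column_a FROM table_c;  -- 'foobar'
```

Keep in mind that column_a stays null for any row where column_b or column_c is null, and that it is not kept in sync when the source columns change later.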
