Some SQL servers have a feature where an INSERT is skipped if it would violate a primary/unique key constraint. For instance, MySQL has INSERT IGNORE.
What's the best way to emulate INSERT IGNORE and ON DUPLICATE KEY UPDATE with PostgreSQL?
11 Answers
#1
28
Try an UPDATE first. If it modifies no rows, the row did not exist, so do an INSERT. Obviously, you do both inside a single transaction.
You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a retry loop for the very rare race condition in this approach, where another session inserts the same key between your UPDATE and your INSERT.
There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.
That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.
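For readers who want to see the wrap-in-a-function approach spelled out, here is a minimal sketch modeled on that documentation example. The table db and the function merge_db are illustrative names, not part of anyone's schema, and the sketch assumes a single unique key on column a.

```sql
-- Illustrative table: "a" is the key, "b" is the payload.
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE db SET b = data WHERE a = key;
        IF FOUND THEN
            RETURN;
        END IF;
        -- No row was updated, so try to insert one.
        -- If another session inserts the same key concurrently,
        -- the INSERT raises unique_violation and we retry the UPDATE.
        BEGIN
            INSERT INTO db (a, b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- Do nothing; loop around and try the UPDATE again.
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

SELECT merge_db(1, 'david');   -- inserts the row
SELECT merge_db(1, 'dennis');  -- updates it; db now holds (1, 'dennis')
```

The inner BEGIN ... EXCEPTION block is what makes the retry safe: only the failed INSERT is rolled back, not the whole function call.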
This works for a single row or a few rows. If you're dealing with a large number of rows, for example from a subquery, you're best off splitting it into two queries, one for INSERT and one for UPDATE (as an appropriate join/subselect of course, so there is no need to write your main filter twice).
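A rough sketch of the two-query variant for bulk data follows. The tables target_table and incoming are hypothetical; incoming stands in for whatever subquery or staging table produces the new rows. It assumes nobody else writes the same keys while the transaction runs.

```sql
-- Hypothetical tables for illustration only.
CREATE TEMP TABLE target_table (id INT PRIMARY KEY, val TEXT);
CREATE TEMP TABLE incoming     (id INT PRIMARY KEY, val TEXT);

INSERT INTO target_table VALUES (1, 'old');
INSERT INTO incoming     VALUES (1, 'new'), (2, 'fresh');

BEGIN;

-- Step 1: update the rows that already exist in the target.
UPDATE target_table t
SET    val = i.val
FROM   incoming i
WHERE  t.id = i.id;

-- Step 2: insert the rows that are still missing.
INSERT INTO target_table (id, val)
SELECT i.id, i.val
FROM   incoming i
WHERE  NOT EXISTS (SELECT 1 FROM target_table t WHERE t.id = i.id);

COMMIT;

-- target_table now holds (1, 'new') and (2, 'fresh').
SELECT id, val FROM target_table ORDER BY id;
```

Putting the filtered rows into a staging table (or a view) first is one way to avoid repeating the main filter in both statements.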
#2
94
With PostgreSQL 9.5, this is now native functionality (like MySQL has had for several years):
INSERT ... ON CONFLICT DO NOTHING/UPDATE ("UPSERT")
9.5 brings support for "UPSERT" operations. INSERT is extended to accept an ON CONFLICT DO UPDATE/IGNORE clause (the released syntax spells the second form DO NOTHING). This clause specifies an alternative action to take in the event of a would-be duplicate violation.
...
Further example of new syntax:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;
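To try both forms side by side, here is a small self-contained sketch. The table definition is an assumption (the answer does not show one); ON CONFLICT (username) needs a unique index or constraint on that column, which the primary key provides here.

```sql
-- Assumed table definition for the user_logins example.
CREATE TEMP TABLE user_logins (
    username TEXT PRIMARY KEY,
    logins   INTEGER NOT NULL
);

INSERT INTO user_logins (username, logins) VALUES ('Naomi', 1);

-- Rough equivalent of MySQL's INSERT IGNORE:
-- 'Naomi' already exists and is skipped, 'James' is inserted.
INSERT INTO user_logins (username, logins)
VALUES ('Naomi', 5), ('James', 1)
ON CONFLICT (username) DO NOTHING;

-- Rough equivalent of ON DUPLICATE KEY UPDATE:
-- EXCLUDED refers to the row that was proposed for insertion.
INSERT INTO user_logins (username, logins)
VALUES ('Naomi', 1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;

-- Result: James has 1 login, Naomi has 2.
SELECT username, logins FROM user_logins ORDER BY username;
```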
#3
91
Edit: in case you missed warren's answer, PG9.5 now has this natively; time to upgrade!
Building on Bill Karwin's answer, to spell out what a rule-based approach would look like (transferring from another schema in the same DB, and with a multi-column primary key):
CREATE RULE "my_table_on_duplicate_ignore" AS ON INSERT TO "my_table"
WHERE EXISTS(SELECT 1 FROM my_table
WHERE (pk_col_1, pk_col_2)=(NEW.pk_col_1, NEW.pk_col_2))
DO INSTEAD NOTHING;
INSERT INTO my_table SELECT * FROM another_schema.my_table WHERE some_cond;
DROP RULE "my_table_on_duplicate_ignore" ON "my_table";
Note: The rule applies to all INSERT operations until the rule is dropped, so not quite ad hoc.
#4
22
To get the INSERT IGNORE behavior you can do something like the statement below. I found that simply inserting from a SELECT over literal values worked best; you can then filter out the duplicate keys with a NOT EXISTS clause. To get the ON DUPLICATE KEY UPDATE behavior, I suspect a PL/pgSQL loop would be necessary.
INSERT INTO manager.vin_manufacturer
(SELECT * FROM (VALUES
    ('935', ' Citroën Brazil', 'Citroën'),
    ('ABC', 'Toyota', 'Toyota'),
    ('ZOM', ' OM', 'OM')
  ) AS tmp (vin_manufacturer_id, manufacturer_desc, make_desc)
  WHERE NOT EXISTS (
    -- ignore anything that has already been inserted
    SELECT 1 FROM manager.vin_manufacturer m
    WHERE m.vin_manufacturer_id = tmp.vin_manufacturer_id)
);
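For the update-on-duplicate half, a loop is not the only option on PostgreSQL 9.1 or later: a data-modifying CTE can do the UPDATE and the INSERT in one statement. This is only a sketch, not the answer author's method. The table below is a simplified stand-in for manager.vin_manufacturer (schema omitted, column types assumed to be text), and the statement is not safe against a concurrent session inserting the same key.

```sql
-- Simplified stand-in for manager.vin_manufacturer; types are assumed.
CREATE TEMP TABLE vin_manufacturer (
    vin_manufacturer_id TEXT PRIMARY KEY,
    manufacturer_desc   TEXT,
    make_desc           TEXT
);

INSERT INTO vin_manufacturer VALUES ('ABC', 'Toyota (old)', 'Toyota');

WITH new_values (vin_manufacturer_id, manufacturer_desc, make_desc) AS (
    VALUES ('935', 'Citroën Brazil', 'Citroën'),
           ('ABC', 'Toyota', 'Toyota')
),
updated AS (
    -- Update the rows whose key already exists and remember their keys.
    UPDATE vin_manufacturer m
    SET    manufacturer_desc = nv.manufacturer_desc,
           make_desc         = nv.make_desc
    FROM   new_values nv
    WHERE  m.vin_manufacturer_id = nv.vin_manufacturer_id
    RETURNING m.vin_manufacturer_id
)
-- Insert only the rows that the UPDATE did not touch.
INSERT INTO vin_manufacturer (vin_manufacturer_id, manufacturer_desc, make_desc)
SELECT nv.vin_manufacturer_id, nv.manufacturer_desc, nv.make_desc
FROM   new_values nv
WHERE  NOT EXISTS (
    SELECT 1 FROM updated u
    WHERE  u.vin_manufacturer_id = nv.vin_manufacturer_id
);

-- 'ABC' is updated in place, '935' is inserted.
SELECT * FROM vin_manufacturer ORDER BY vin_manufacturer_id;
```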
#5
18
INSERT INTO mytable(col1,col2)
SELECT 'val1','val2'
WHERE NOT EXISTS (SELECT 1 FROM mytable WHERE col1='val1')
#6
12
Looks like PostgreSQL supports a schema object called a rule.
http://www.postgresql.org/docs/current/static/rules-update.html
You could create a rule ON INSERT for a given table, making it do NOTHING if a row exists with the given primary key value, or else making it do an UPDATE instead of the INSERT if a row exists with the given primary key value.
I haven't tried this myself, so I can't speak from experience or offer an example.
#7
12
For those of you that have Postgres 9.5 or higher, the new ON CONFLICT DO NOTHING syntax should work:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT field_one, field_two, field_three
FROM source_table
ON CONFLICT (field_one) DO NOTHING;
For those of us who have an earlier version, this left join (an anti-join that keeps only source rows with no match in the target) will work instead:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT source_table.field_one, source_table.field_two, source_table.field_three
FROM source_table
LEFT JOIN target_table ON source_table.field_one = target_table.field_one
WHERE target_table.field_one IS NULL;
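One caveat with the anti-join: it only filters out keys that are already in target_table. If source_table itself contains the same field_one twice, the INSERT still fails with a unique violation. A possible variation, assuming field_one is the unique key and that any one of the duplicate source rows is acceptable, uses DISTINCT ON. The table definitions and sample rows below are made up for illustration.

```sql
-- Made-up definitions and data for illustration.
CREATE TEMP TABLE target_table (field_one INT PRIMARY KEY, field_two TEXT, field_three TEXT);
CREATE TEMP TABLE source_table (field_one INT, field_two TEXT, field_three TEXT);

INSERT INTO target_table VALUES (1, 'a', 'x');
INSERT INTO source_table VALUES
    (1, 'a2', 'x2'),   -- already in the target: skipped by the anti-join
    (2, 'b',  'y'),    -- duplicated key within the source
    (2, 'b2', 'y2'),
    (3, 'c',  'z');

INSERT INTO target_table (field_one, field_two, field_three)
SELECT DISTINCT ON (s.field_one)
       s.field_one, s.field_two, s.field_three
FROM   source_table s
LEFT JOIN target_table t ON s.field_one = t.field_one
WHERE  t.field_one IS NULL
ORDER BY s.field_one;  -- add more ORDER BY columns to control which duplicate wins

-- target_table now has keys 1, 2 and 3.
SELECT field_one FROM target_table ORDER BY field_one;
```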
#8
2
This solution avoids using rules:
DO $$
BEGIN
    INSERT INTO tableA (unique_column, c2, c3) VALUES (1, 2, 3);
EXCEPTION
    WHEN unique_violation THEN
        -- the key already exists, so update the existing row instead
        UPDATE tableA SET c2 = 2, c3 = 3 WHERE unique_column = 1;
END;
$$;
but it has a performance drawback (see PostgreSQL.org):
A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one. Therefore, don't use EXCEPTION without need.
#9
1
For bulk loads, you can always delete the row before the insert. Deleting a row that doesn't exist doesn't cause an error, so it's safely skipped.
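A minimal sketch of delete-then-insert, using hypothetical target_table and source_table names and assuming field_one is the key. Keep in mind that, unlike a true INSERT IGNORE, this replaces existing rows, and the DELETE fires any ON DELETE triggers or cascading foreign keys defined on the target.

```sql
-- Hypothetical tables for illustration only.
CREATE TEMP TABLE target_table (field_one INT PRIMARY KEY, field_two TEXT);
CREATE TEMP TABLE source_table (field_one INT PRIMARY KEY, field_two TEXT);

INSERT INTO target_table VALUES (1, 'old');
INSERT INTO source_table VALUES (1, 'new'), (2, 'fresh');

BEGIN;

-- Remove target rows that are about to be re-inserted.
-- Keys with no match in the target are simply not affected.
DELETE FROM target_table t
USING  source_table s
WHERE  t.field_one = s.field_one;

INSERT INTO target_table (field_one, field_two)
SELECT field_one, field_two
FROM   source_table;

COMMIT;

-- target_table now holds (1, 'new') and (2, 'fresh').
SELECT field_one, field_two FROM target_table ORDER BY field_one;
```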
#10
1
As @hanmari mentioned in his comment, when inserting into a Postgres table, ON CONFLICT (..) DO NOTHING is the best code to use for not inserting duplicate data:
query = ("INSERT INTO db_table_name(column_name) "
         "VALUES (%s) ON CONFLICT (column_name) DO NOTHING;")
The ON CONFLICT clause lets the INSERT statement keep inserting the rows that do not conflict. The query above is an example of inserting data from an Excel file into a Postgres table. I add constraints to the Postgres table to make sure the ID field is unique. Instead of running a DELETE on rows of data that are the same, I add a line of SQL that restarts the ID column's numbering at 1. Example:
# assumes the default sequence name for a serial column: <table>_<column>_seq
q = 'ALTER SEQUENCE db_table_name_id_column_seq RESTART WITH 1'
If my data has its own ID field, I do not use it as the primary/serial ID; I create a separate ID column and set it to serial. I hope this information is helpful to everyone. *I have no college degree in software development/coding. Everything I know about coding, I studied on my own.
#11
-1
For data import scripts, to replace "IF NOT EXISTS", in a way, there's a slightly awkward formulation that nevertheless works:
DO
$do$
BEGIN
    PERFORM id
    FROM whatever_table;
    IF NOT FOUND THEN
        -- INSERT stuff
    END IF;
END
$do$;
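A filled-in version of the same shape might look like the following. The table definition, the WHERE clause, and the inserted values are all made up for illustration. Like any check-then-insert, it is fine for a single import script but can still hit a unique violation if another session inserts the same row between the PERFORM and the INSERT.

```sql
-- Made-up table for illustration.
CREATE TEMP TABLE whatever_table (id INT PRIMARY KEY, label TEXT);

DO
$do$
BEGIN
    -- PERFORM runs the query and discards the result; FOUND records
    -- whether at least one row matched.
    PERFORM id
    FROM whatever_table
    WHERE id = 1;
    IF NOT FOUND THEN
        INSERT INTO whatever_table (id, label) VALUES (1, 'seed row');
    END IF;
END
$do$;

-- Running the DO block a second time leaves the table unchanged.
SELECT id, label FROM whatever_table;
```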