We have a database that is updated every day at midnight by a cron job, which pulls new data from an external XML source.
What we do is insert all the new content and, when there is a duplicate key, update the fields of the existing row.
INSERT INTO table (id, col1, col2, col3)
values (id_value, val1, val2, val3),
(id_value, val1, val2, val3),
(id_value, val1, val2, val3),
(id_value, val1, val2, val3)
ON DUPLICATE KEY UPDATE
col1 = VALUES (col1),
col2 = VALUES (col2),
col3 = VALUES (col3);
What we want to know is which rows were actually inserted; that is, we want a list of the new items. Is there any query that returns the new inserts? Basically, we need all the new IDs, not the number of new insertions.
Thanks
4 answers
#1
7
Add an update_count INT NOT NULL DEFAULT 1 column and change your query:
INSERT
INTO table (id, col1, col2, col3)
VALUES
(id_value, val1, val2, val3),
(id_value, val1, val2, val3),
(id_value, val1, val2, val3),
(id_value, val1, val2, val3)
ON DUPLICATE KEY
UPDATE
col1 = VALUES (col1),
col2 = VALUES (col2),
col3 = VALUES (col3),
update_count = update_count + 1;
You can also increment it in a BEFORE UPDATE trigger, which lets you keep the query as is.
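To make the idea concrete, here is a minimal sketch of the update_count approach. It is only an illustration: SQLite (via Python's sqlite3 module) stands in for MySQL so it can run without a server, which means the upsert is written as ON CONFLICT ... DO UPDATE rather than ON DUPLICATE KEY UPDATE. The table name items and the helper upsert_and_list_new are made up for the example.

```python
import sqlite3

# SQLite stands in for MySQL here (an assumption for the sketch); its upsert
# is spelled ON CONFLICT ... DO UPDATE instead of ON DUPLICATE KEY UPDATE.
UPSERT = """
INSERT INTO items (id, col1) VALUES (?, ?)
ON CONFLICT(id) DO UPDATE SET
    col1 = excluded.col1,
    update_count = update_count + 1
"""


def upsert_and_list_new(conn, rows):
    """Upsert the rows, then return the ids whose update_count is still 1."""
    conn.executemany(UPSERT, rows)
    conn.commit()
    cur = conn.execute(
        "SELECT id FROM items WHERE update_count = 1 ORDER BY id"
    )
    return [r[0] for r in cur]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    " id INTEGER PRIMARY KEY,"
    " col1 TEXT,"
    " update_count INTEGER NOT NULL DEFAULT 1)"
)

# First nightly run: both rows are new.
first_run = upsert_and_list_new(conn, [(1, "a"), (2, "b")])
# Second run: ids 1 and 2 are updated (count becomes 2), id 3 is new.
second_run = upsert_and_list_new(conn, [(1, "a2"), (2, "b2"), (3, "c")])
print(first_run, second_run)
```

Note that this only identifies new rows if every existing row shows up again in each import; a row inserted on an earlier night and never seen again would keep update_count = 1.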
#2
26
You can get this information at the time of the insert/update by examining the number of affected rows reported for the statement.
The MySQL documentation states:
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row and 2 if an existing row is updated.
To get your answer, you'll need to insert one row at a time and combine ROW_COUNT() with LAST_INSERT_ID().
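A rough sketch of the one-row-at-a-time idea, written in Python against a DB-API style cursor. It assumes the driver's cursor.rowcount reports MySQL's affected-rows value for the last statement, and it reads the id from the row being sent rather than from LAST_INSERT_ID(), since the ids here come from the XML. StubCursor, classify_upserts and your_table are hypothetical names; the stub only exists so the example runs without a server.

```python
UPSERT_SQL = (
    "INSERT INTO your_table (id, col1, col2, col3) VALUES (%s, %s, %s, %s) "
    "ON DUPLICATE KEY UPDATE "
    "col1 = VALUES(col1), col2 = VALUES(col2), col3 = VALUES(col3)"
)


def classify_upserts(cursor, rows):
    """Run the upsert once per row and sort ids by the affected-rows value."""
    result = {"inserted": [], "updated": [], "unchanged": []}
    for row in rows:
        cursor.execute(UPSERT_SQL, row)
        affected = cursor.rowcount  # assumed to mirror MySQL's affected rows
        if affected == 1:
            result["inserted"].append(row[0])
        elif affected == 2:
            result["updated"].append(row[0])
        else:
            # Affected rows is 0 when the existing row already had these values.
            result["unchanged"].append(row[0])
    return result


class StubCursor:
    """Hypothetical stand-in for a MySQL driver cursor (not a real driver)."""

    def __init__(self, existing):
        self.store = dict(existing)  # id -> (col1, col2, col3)
        self.rowcount = -1

    def execute(self, sql, params):
        row_id, values = params[0], tuple(params[1:])
        if row_id not in self.store:
            self.rowcount = 1  # inserted as a new row
        elif self.store[row_id] != values:
            self.rowcount = 2  # existing row updated
        else:
            self.rowcount = 0  # nothing changed
        self.store[row_id] = values


cursor = StubCursor({1: ("a", "b", "c"), 2: ("d", "e", "f")})
feed = [(1, "a", "b", "c"), (2, "x", "e", "f"), (3, "g", "h", "i")]
result = classify_upserts(cursor, feed)
print(result["inserted"])
```

The trade-off is one round trip per row instead of a single multi-row statement, which may matter for a large nightly import.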
#3
0
Add a timestamp field to the table and set its default value to CURRENT_TIMESTAMP, but don't set ON UPDATE CURRENT_TIMESTAMP. That way, if you know when your cron job started, you can query all the rows that were added at or after that time and before the next cron job starts (or before a time by which you know the job will certainly have finished).
ALTER TABLE your_table ADD COLUMN create_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP;
This sets the create_time field automatically only on inserts, not on updates.
From the MySQL documentation:
With a DEFAULT clause but no ON UPDATE CURRENT_TIMESTAMP clause, the column has the given default value and is not automatically updated to the current timestamp.
The default depends on whether the DEFAULT clause specifies CURRENT_TIMESTAMP or a constant value. With CURRENT_TIMESTAMP, the default is the current timestamp.
So to sum up, a query like
SELECT id FROM your_table WHERE create_time >= $cron_1_time AND create_time < $cron_2_time;
should give you what you are looking for. Of course, this all depends on you knowing approximately when the cron jobs run and for how long.
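As a small runnable illustration of the create_time idea, the sketch below uses SQLite (through Python's sqlite3 module) in place of MySQL, so the upsert uses SQLite's ON CONFLICT ... DO UPDATE syntax and create_time is stored as text. The run_job helper and the sample rows are invented for the example. Reading the start time from the database itself is one possible way to avoid guessing when the cron job began.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table ("
    " id INTEGER PRIMARY KEY,"
    " col1 TEXT,"
    " create_time TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
)
# A row left over from an earlier run, with an explicitly old creation time.
conn.execute(
    "INSERT INTO your_table (id, col1, create_time) "
    "VALUES (1, 'old', '2000-01-01 00:00:00')"
)


def run_job(conn, rows):
    """Upsert rows and return the ids created since the job started."""
    # Take the start time from the database clock, not the cron host.
    job_start = conn.execute("SELECT CURRENT_TIMESTAMP").fetchone()[0]
    conn.executemany(
        "INSERT INTO your_table (id, col1) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET col1 = excluded.col1",
        rows,
    )
    conn.commit()
    cur = conn.execute(
        "SELECT id FROM your_table WHERE create_time >= ? ORDER BY id",
        (job_start,),
    )
    return [r[0] for r in cur]


# id 1 already exists and is only updated; ids 2 and 3 are new.
new_ids = run_job(conn, [(1, "changed"), (2, "new"), (3, "new")])
print(new_ids)
```

The updated row keeps its original create_time because the upsert never touches that column, which is the same effect the DEFAULT-only TIMESTAMP column gives you in MySQL.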
#4
0
Here is how I did it in PHP:
1) Before the INSERT ... ON DUPLICATE KEY UPDATE, run a simple SELECT MAX(id) query and store the result in $max_id.
2) During the update process, collect the ID of each affected row (whether new or existing): $ids[] = mysql_insert_id();
3) Then $inserted_rows = max($ids) - $max_id;
4) And $updated_rows = count($ids) - $inserted_rows;
// Uses the legacy mysql extension (removed in PHP 7).
$max_id = mysql_query("SELECT MAX(id) FROM table");
$max_id = mysql_result($max_id, 0);
// $queries is a hypothetical array holding one 'insert on duplicate' query per row
$ids = array();
foreach ($queries as $query) {
    $result = mysql_query($query);
    $ids[] = mysql_insert_id(); // collect the affected ID
}
// Only counts correctly if AUTO_INCREMENT ids have no gaps
$inserted_rows = max($ids) - $max_id;
$updated_rows = count($ids) - $inserted_rows;