I am dealing with a huge database with millions of rows. I would like to run an SQL statement through C#, which selects 1.2 million rows from one database, and inserts them into another after parsing and modifying some data.
I originally wanted to run the select statement first and then parse the data by iterating through the MySqlDataReader object that contains it. I am concerned this would use too much memory, so I have decided to select one row, parse it, insert it into the other database, and then move on to the next row.
How can this be done? I have tried the SELECT ... INTO syntax in a MySQL query, but it still seems to select all the data first and insert it afterwards.
3 solutions
#1
0
Use the SqlBulkCopy class to move data from one source to another:
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy%28v=vs.110%29.aspx
#2
0
I am not sure whether you are able to add a new column to the existing table. If you can, you can use the new column as a flag. It could be "TRANSFERED(boolean)".
You would select one row at a time with the condition TRANSFERED=FALSE and process it. Once that row is processed, update it to TRANSFERED=TRUE.
Alternatively, your existing table presumably has a unique id column. Create a temp table that stores the ids of processed rows, so you can tell which rows have been processed and which have not.
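To make the id-based idea concrete, here is a rough sketch of a batch loop. Note that it swaps in a simpler variant: instead of a temp table of processed ids, it just remembers the last id it has handled and asks for the next batch after it. Row, BatchTransfer, and the three delegates are hypothetical names of mine, not part of any library, and the sketch assumes ids are positive and unique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape; a real table would have its own columns.
public sealed class Row
{
    public long Id { get; }
    public string Payload { get; }
    public Row(long id, string payload) { Id = id; Payload = payload; }
}

public static class BatchTransfer
{
    // fetchBatch(lastId, size) must return up to `size` rows with Id > lastId,
    // ordered by Id ascending. insertBatch writes one converted batch to the target.
    // Returns the total number of rows transferred.
    public static long TransferAll(
        Func<long, int, IReadOnlyList<Row>> fetchBatch,
        Func<Row, Row> transform,
        Action<IReadOnlyList<Row>> insertBatch,
        int batchSize = 1000)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        long lastId = 0;   // assumption: all ids are greater than 0
        long total = 0;

        while (true)
        {
            IReadOnlyList<Row> batch = fetchBatch(lastId, batchSize);
            if (batch.Count == 0) break;          // nothing left to process

            List<Row> converted = batch.Select(transform).ToList();
            insertBatch(converted);

            lastId = batch[batch.Count - 1].Id;   // highest id seen so far
            total += batch.Count;
        }
        return total;
    }
}
```

In a real program, fetchBatch would presumably run something like SELECT id, payload FROM source_table WHERE id > @lastId ORDER BY id LIMIT @size (table and column names are made up here), so only one batch is held in memory at a time and no extra column is needed on the source table.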
#3
0
I am not quite sure what your error is. In your case, I suggest using 'select top 1000' (in MySQL the equivalent is 'LIMIT 1000') to fetch the data in batches, because inserting rows one by one is really slow. After that, you can use an 'insert into' query. Note that SqlBulkCopy works only with SQL Server. I also suggest building the SQL query with a StringBuilder, because concatenating plain strings has a large overhead.
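As an illustration of the StringBuilder suggestion, here is a minimal sketch that builds one multi-row INSERT statement for a whole batch. InsertBuilder and the @pROW_COL placeholder naming are my own assumptions, not an existing API. It emits parameter placeholders rather than the values themselves, so the values can be bound as command parameters; the table and column names are concatenated as-is and must come from trusted code, not from user input.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class InsertBuilder
{
    // Builds a statement of the form:
    //   INSERT INTO t (a, b) VALUES (@p0_0, @p0_1), (@p1_0, @p1_1)
    // with one parenthesised group per row and one placeholder per column.
    public static string BuildMultiRowInsert(
        string table, IReadOnlyList<string> columns, int rowCount)
    {
        if (columns == null || columns.Count == 0)
            throw new ArgumentException("At least one column is required.", nameof(columns));
        if (rowCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(rowCount));

        var sb = new StringBuilder();
        sb.Append("INSERT INTO ").Append(table).Append(" (");
        sb.Append(string.Join(", ", columns));
        sb.Append(") VALUES ");

        for (int r = 0; r < rowCount; r++)
        {
            if (r > 0) sb.Append(", ");
            sb.Append('(');
            for (int c = 0; c < columns.Count; c++)
            {
                if (c > 0) sb.Append(", ");
                // Placeholder for row r, column c.
                sb.Append("@p").Append(r).Append('_').Append(c);
            }
            sb.Append(')');
        }
        return sb.ToString();
    }
}
```

For example, BuildMultiRowInsert("target", new[] { "id", "payload" }, 2) returns INSERT INTO target (id, payload) VALUES (@p0_0, @p0_1), (@p1_0, @p1_1). You would then add one parameter per placeholder to the command and execute it once per batch instead of once per row.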