
Time: 2022-10-29 20:47:49

So I'm upgrading an old parser right now. It's written in C# and uses SQL to insert records into a database.


Currently it reads and parses a few thousand lines of data from a file, then inserts the new data into a database containing over a million records.


Sometimes it can take over 10 minutes just to add a few thousand lines.


I've concluded that the performance bottleneck is a SQL command that uses an IF NOT EXISTS statement to check whether the row being inserted already exists, and inserts the record only if it doesn't.


I believe the problem is that it just takes way too long to call the IF NOT EXISTS on every single row in the new data.


Is there a faster way to determine whether data exists already or not?


I was thinking of inserting all of the records first anyway using the SqlBulkCopy class, then running a stored procedure to remove the duplicates.
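To make that idea concrete, here is a rough sketch of what it might look like. Everything about the schema is an assumption, not something from my actual database: a hypothetical `dbo.LargeTable` with an identity column `Id`, a `RecordKey` column that decides whether two rows are duplicates, and a `Payload` column. The dedupe statement is inlined here for readability; in the plan above it would live in the stored procedure. It also assumes `System.Data.SqlClient` is available (.NET Framework, or the NuGet package of the same name).

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient; // assumption: .NET Framework, or the System.Data.SqlClient package

public static class BulkThenDedupe
{
    // Hypothetical schema: dbo.LargeTable(Id IDENTITY, RecordKey, Payload).
    // Rows sharing a RecordKey are treated as duplicates; the row with the
    // lowest Id (the one that was there first) is kept.
    private const string RemoveDuplicatesSql = @"
WITH ranked AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY RecordKey ORDER BY Id) AS rn
    FROM dbo.LargeTable
)
DELETE FROM ranked WHERE rn > 1;";

    // Copies the parsed rows into an in-memory DataTable for SqlBulkCopy.
    public static DataTable BuildTable(IEnumerable<(int RecordKey, string Payload)> rows)
    {
        var table = new DataTable("Parsed");
        table.Columns.Add("RecordKey", typeof(int));
        table.Columns.Add("Payload", typeof(string));
        foreach (var row in rows)
        {
            table.Rows.Add(row.RecordKey, row.Payload);
        }
        return table;
    }

    // Bulk-inserts every parsed row, then removes duplicates in one statement.
    public static void Load(string connectionString, DataTable parsed)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            using (var bulk = new SqlBulkCopy(connection))
            {
                bulk.DestinationTableName = "dbo.LargeTable";
                // Map by name so the identity column is left to the server.
                bulk.ColumnMappings.Add("RecordKey", "RecordKey");
                bulk.ColumnMappings.Add("Payload", "Payload");
                bulk.WriteToServer(parsed);
            }

            using (var command = new SqlCommand(RemoveDuplicatesSql, connection))
            {
                command.ExecuteNonQuery();
            }
        }
    }
}
```

One thing I'm unsure about with this version: the dedupe pass ranks the whole large table on every load, not just the rows that were just added.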


Does anyone else have any suggestions or methods to do this as efficiently and quickly as possible? Anything would be appreciated.


EDIT: To clarify, I'd run the stored procedure on the large table after copying the new data into it.


large table = 1,000,000+ rows


1 solution



1.  Create an IDataReader to loop over your source data.
2.  Place the values into a strongly typed DataSet.
3.  Every N rows (say 1000, for the heck of it), send the DataSet's XML (.GetXml) to a stored procedure.
4.  Have the stored procedure shred the xml.
5.  Do your INSERT/UPDATE based on this shredded xml.
6.  Return from the procedure, keep looping until you're done.

Here is an older example: http://granadacoder.wordpress.com/2009/01/27/bulk-insert-example-using-an-idatareader-to-strong-dataset-to-sql-server-xml/


The key is that you are doing "bulk" operations instead of going row by row, and you can pick a sweet-spot batch size (1000, for example) that gives you the best performance.
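The steps above can be sketched roughly as follows. This is a minimal illustration and not the code from the linked post: it uses a plain untyped DataSet built from the reader's columns instead of a strongly typed one, and the procedure name `dbo.uspImportRecords`, the table `dbo.LargeTable`, and the columns `RecordKey` and `Payload` are all hypothetical. The T-SQL in `ProcedureSketch` assumes SQL Server's `xml` type with its `nodes()` and `value()` methods, and it only filters out rows already in the table, not duplicates within a single batch.

```csharp
using System;
using System.Data;

public static class XmlBatchLoader
{
    // Hypothetical procedure (steps 4 and 5): shred the xml, then insert
    // only the rows whose RecordKey is not already in the table.
    public const string ProcedureSketch = @"
CREATE PROCEDURE dbo.uspImportRecords @xml xml AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.LargeTable (RecordKey, Payload)
    SELECT s.RecordKey, s.Payload
    FROM (
        SELECT r.value('(RecordKey)[1]', 'int') AS RecordKey,
               r.value('(Payload)[1]', 'nvarchar(200)') AS Payload
        FROM @xml.nodes('/Records/Record') AS t(r)
    ) AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.LargeTable AS l
                      WHERE l.RecordKey = s.RecordKey);
END";

    // Steps 1, 2, 3 and 6: read rows, collect them in a DataSet, and hand the
    // DataSet's XML to sendBatch every batchSize rows. Returns the batch count.
    public static int Load(IDataReader source, int batchSize, Action<string> sendBatch)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        // Untyped DataSet mirroring the reader's columns (a simplification).
        var table = new DataTable("Record");
        for (int i = 0; i < source.FieldCount; i++)
        {
            table.Columns.Add(source.GetName(i), source.GetFieldType(i));
        }
        var batch = new DataSet("Records");
        batch.Tables.Add(table);

        int batchesSent = 0;
        while (source.Read())
        {
            var values = new object[source.FieldCount];
            source.GetValues(values);
            table.Rows.Add(values);

            if (table.Rows.Count == batchSize)
            {
                sendBatch(batch.GetXml());
                batchesSent++;
                table.Clear(); // start the next batch empty
            }
        }
        if (table.Rows.Count > 0) // final partial batch
        {
            sendBatch(batch.GetXml());
            batchesSent++;
        }
        return batchesSent;
    }

    // One way to send a batch: call the hypothetical procedure with an xml parameter.
    public static void SendToProcedure(IDbConnection connection, string xml)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText = "dbo.uspImportRecords";
            command.CommandType = CommandType.StoredProcedure;
            IDbDataParameter parameter = command.CreateParameter();
            parameter.ParameterName = "@xml";
            parameter.DbType = DbType.Xml;
            parameter.Value = xml;
            command.Parameters.Add(parameter);
            command.ExecuteNonQuery();
        }
    }
}
```

With an open connection you would call it along the lines of `XmlBatchLoader.Load(reader, 1000, xml => XmlBatchLoader.SendToProcedure(connection, xml));`. Passing the sender in as a delegate also makes it easy to try different batch sizes without touching the database code.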




