I want to import a huge .csv file (about 1 GB) into a database.
My application is written in C# in Visual Studio 2010. It runs locally and does not need to be used over a network.
My attempt to import only 25 MB using SQL Compact Toolbox scripts crashed Visual Studio.
My attempt to use a StringBuilder led to an out-of-memory exception (about 4 GB of memory in use!) and then failed.
My attempts to import the file into Excel or Access and then convert it to a database failed as well.
Which of these databases would handle my problem best?
- SQL Express
- SQL Compact
- local SQL Server database
Also, which method should I use to import it as fast as possible and to load it quickly into a DataGridView?
Thanks for any help.
4 Solutions
#1
6
If the CSV file does not have any strings containing commas, you can do a direct BULK INSERT from SQL (if it does, you will first have to change the delimiter to something like a bar (|) character). This is the most direct means of getting data from a flat file into the database, and it doesn't require any intermediate programs like SSIS or Excel.
I use it often, and in my experience it is the fastest and most efficient way of getting data into SQL from outside. Your command will look something like this:
BULK INSERT MyDatabase.dbo.MyTable
FROM 'MyFileName'
WITH (
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = ',',
    BATCHSIZE = 10000
);
The most common strategy is to load the data into a working table, do any clean-up / conversion necessary, and then insert it to the actual target table.
#2
4
If you really want to achieve this using C#, what you'll need to do is read the CSV line by line and insert each line before you move to the next one.
I have a similar situation where I have to read a 2 GB "CSV" (tab-separated) and load it into MSSQL. Here's how I have it set up.
// Requires: using System; using System.IO; using System.Text;
using (FileStream fs = new FileStream(@"C:\file.csv", FileMode.Open, FileAccess.Read, FileShare.None))
using (StreamReader sr = new StreamReader(fs, Encoding.GetEncoding(1252)))
{
    // Skip the header row; take this out if you don't have a header
    if (sr.ReadLine() == null)
    {
        throw new Exception("Empty file?!");
    }

    // Read one line at a time so the whole file is never held in memory
    string s;
    while ((s = sr.ReadLine()) != null)
    {
        //SPLIT
        //INSERT SQL
    }
}
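For the //SPLIT step, a plain s.Split(',') breaks as soon as a quoted field contains a comma, which is the case answer #1 warns about. Below is a minimal sketch of a quote-aware splitter. The CsvLine class and its Split method are hypothetical names, not part of the code above or of .NET. It assumes fields are quoted with double quotes, that a literal quote inside a field is written as two quotes, and that no field contains a line break.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvLine
{
    // Splits one line into fields. Delimiters inside double quotes are kept,
    // and a doubled quote ("") inside a quoted field becomes a literal quote.
    public static List<string> Split(string line, char delimiter)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"')
                {
                    if (i + 1 < line.Length && line[i + 1] == '"')
                    {
                        current.Append('"'); // escaped quote
                        i++;
                    }
                    else
                    {
                        inQuotes = false;    // closing quote
                    }
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;             // opening quote
            }
            else if (c == delimiter)
            {
                fields.Add(current.ToString());
                current.Length = 0;          // start the next field
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString());      // last field on the line
        return fields;
    }
}
```

Inside the read loop you would call CsvLine.Split(s, ',') (or '\t' for a tab-separated file) and pass the resulting fields to your insert.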
#3
1
Both SQL Express and a standard SQL Server are good candidates for your storage. As for what to use to import the data, use SSIS. Once you've created the database on the SQL Express or standard SQL Server instance, right-click on that database, and under the Tasks menu item you'll see an option for Import Data. It will walk you through selecting a data source, in your case a flat file source for the CSV, and then getting it imported into the database.
At the end of the process, the resulting script can be saved.
#4
1
You can use the SqlBulkCopy class in C#. Works like a charm.
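The class that does bulk loading in System.Data.SqlClient is SqlBulkCopy. Here is a rough sketch of how it could be used: the file is streamed in batches into a DataTable and each batch is sent to the server, so only one batch is in memory at a time. CsvBulkLoader, ReadBatches and Load are hypothetical names. The sketch assumes a header row, no quoted commas, a target table whose columns are in the same order as the file, and values that SQL Server can convert from text. The 1252 code page is taken from answer #2 and may not match your file.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Text;

public static class CsvBulkLoader
{
    // Yields the file contents as batches of at most batchSize rows.
    // The same DataTable instance is reused, so consume each batch before asking for the next.
    public static IEnumerable<DataTable> ReadBatches(TextReader reader, int batchSize)
    {
        string header = reader.ReadLine();
        if (header == null) yield break; // empty file

        // One string column per header field (assumes unique header names).
        var table = new DataTable();
        foreach (string name in header.Split(','))
            table.Columns.Add(name.Trim(), typeof(string));

        string line;
        while ((line = reader.ReadLine()) != null)
        {
            table.Rows.Add(line.Split(',')); // assumes no quoted commas
            if (table.Rows.Count >= batchSize)
            {
                yield return table;
                table.Clear(); // drop the rows that were just sent
            }
        }

        if (table.Rows.Count > 0) yield return table; // final partial batch
    }

    // Streams csvPath into tableName and returns the number of rows sent.
    public static long Load(string csvPath, string connectionString,
                            string tableName, int batchSize)
    {
        long total = 0;
        using (var reader = new StreamReader(csvPath, Encoding.GetEncoding(1252)))
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = tableName;
            bulk.BulkCopyTimeout = 0; // no time limit for a large load

            // With no column mappings, columns are matched by ordinal position.
            foreach (DataTable batch in ReadBatches(reader, batchSize))
            {
                bulk.WriteToServer(batch);
                total += batch.Rows.Count;
            }
        }
        return total;
    }
}
```

A call might look like CsvBulkLoader.Load(@"C:\file.csv", connectionString, "dbo.MyTable", 10000), where the connection string and table name are placeholders for your own.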