How do I read an Excel file in C# without losing any columns?

Time: 2021-07-10 07:20:47

I've been using an OleDb connection to read Excel files successfully for quite a while now, but I've run into a problem. Someone is trying to upload an Excel spreadsheet with nothing in the first column, and when I try to read the file, that column is not recognized.

I'm currently using the following OleDb connection string:

Provider=Microsoft.Jet.OLEDB.4.0;
Data Source=c:\test.xls;
Extended Properties="Excel 8.0;IMEX=1;"
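
For readers who assemble this string in code, here is a minimal sketch of a helper that produces the same shape of connection string. The class and method names (`ExcelConnectionString`, `Build`) are hypothetical and not part of the original post. The `HDR` setting is an addition for illustration: it tells the provider whether the first row holds column names.

```csharp
using System;

// Hypothetical helper (not from the original post): assembles a Jet OLE DB
// connection string for an .xls workbook.
public static class ExcelConnectionString
{
    public static string Build(string path, bool firstRowIsHeader)
    {
        // HDR says whether row 1 contains column names.
        // IMEX=1 is kept from the question's connection string.
        string hdr = firstRowIsHeader ? "Yes" : "No";
        return "Provider=Microsoft.Jet.OLEDB.4.0;"
             + "Data Source=" + path + ";"
             + "Extended Properties=\"Excel 8.0;HDR=" + hdr + ";IMEX=1;\"";
    }
}
```

Calling `ExcelConnectionString.Build(@"c:\test.xls", false)` yields the string from the question with `HDR=No` added, so the first row comes back as data rather than as column names.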

So, if there are 13 columns in the Excel file, the OleDbDataReader I get back only has 12 columns/fields.

Any insight would be appreciated.
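
One workaround sometimes suggested for a missing leading column is to query an explicit cell range instead of the whole sheet, so that the provider is not left to infer the range from the cells that contain data. The sketch below only builds such a query. `ExcelRangeQuery`, `ColumnLetter`, and `Build` are hypothetical names, and whether the Jet provider then returns the empty column is an assumption that should be verified against the actual file.

```csharp
using System;

// Hypothetical helper (not from the original post): builds a SELECT over an
// explicit range such as [Sheet1$A1:M65536].
public static class ExcelRangeQuery
{
    // Converts a 1-based column number to its Excel letter (1 -> A, 13 -> M, 27 -> AA).
    public static string ColumnLetter(int column)
    {
        if (column < 1) throw new ArgumentOutOfRangeException("column");
        string letters = "";
        while (column > 0)
        {
            int remainder = (column - 1) % 26;
            letters = (char)('A' + remainder) + letters;
            column = (column - 1) / 26;
        }
        return letters;
    }

    // sheetName, columnCount, and lastRow are supplied by the caller.
    public static string Build(string sheetName, int columnCount, int lastRow)
    {
        return "SELECT * FROM [" + sheetName + "$A1:"
             + ColumnLetter(columnCount) + lastRow + "]";
    }
}
```

For the 13-column sheet in the question, `ExcelRangeQuery.Build("Sheet1", 13, 65536)` produces `SELECT * FROM [Sheet1$A1:M65536]`, where 65536 is the row limit of the .xls format.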

6 solutions

#1


3  

SpreadsheetGear for .NET gives you an API for working with xls and xlsx workbooks from .NET. It is easier to use and faster than OleDb or the Excel COM object model. You can see the live samples or try it for yourself with the free trial.

Disclaimer: I own SpreadsheetGear LLC

EDIT:

StingyJack commented "Faster than OleDb? Better back that claim up".

This is a reasonable request. I see claims all the time which I know for a fact to be false, so I cannot blame anyone for being skeptical.

Below is the code to create a 50,000-row by 10-column workbook with SpreadsheetGear, save it to disk, and then sum the numbers using OleDb and SpreadsheetGear. SpreadsheetGear reads the 500K cells in 0.31 seconds compared to 0.63 seconds with OleDb, just over twice as fast. SpreadsheetGear actually creates and reads the workbook in less time than it takes to read the workbook with OleDb.

You can try it yourself with the SpreadsheetGear free trial.

using System;
using System.Data; 
using System.Data.OleDb; 
using SpreadsheetGear;
using SpreadsheetGear.Advanced.Cells;
using System.Diagnostics;

namespace SpreadsheetGearAndOleDBBenchmark
{
    class Program
    {
        static void Main(string[] args)
        {
            // Warm up (get the code JITed).
            BM(10, 10);

            // Do it for real.
            BM(50000, 10);
        }

        static void BM(int rows, int cols)
        {
            // Compare the performance of OleDB to SpreadsheetGear for reading
            // workbooks. We sum numbers just to have something to do.
            //
            // Run on Windows Vista 32 bit, Visual Studio 2008, Release Build,
            // Run Without Debugger:
            //  Create time: 0.25 seconds
            //  OleDb Time: 0.63 seconds
            //  SpreadsheetGear Time: 0.31 seconds
            //
            // SpreadsheetGear is more than twice as fast at reading. Furthermore,
            // SpreadsheetGear can create the file and read it faster than OleDB
            // can just read it.
            string filename = @"C:\tmp\SpreadsheetGearOleDbBenchmark.xls";
            Console.WriteLine("\nCreating {0} rows x {1} columns", rows, cols);
            Stopwatch timer = Stopwatch.StartNew();
            double createSum = CreateWorkbook(filename, rows, cols);
            double createTime = timer.Elapsed.TotalSeconds;
            Console.WriteLine("Create sum of {0} took {1} seconds.", createSum, createTime);
            timer = Stopwatch.StartNew();
            double oleDbSum = ReadWithOleDB(filename);
            double oleDbTime = timer.Elapsed.TotalSeconds;
            Console.WriteLine("OleDb sum of {0} took {1} seconds.", oleDbSum, oleDbTime);
            timer = Stopwatch.StartNew();
            double spreadsheetGearSum = ReadWithSpreadsheetGear(filename);
            double spreadsheetGearTime = timer.Elapsed.TotalSeconds;
            Console.WriteLine("SpreadsheetGear sum of {0} took {1} seconds.", spreadsheetGearSum, spreadsheetGearTime);
        }

        static double CreateWorkbook(string filename, int rows, int cols)
        {
            IWorkbook workbook = Factory.GetWorkbook();
            IWorksheet worksheet = workbook.Worksheets[0];
            IValues values = (IValues)worksheet;
            double sum = 0.0;
            Random rand = new Random();
            // Put labels in the first row.
            foreach (IRange cell in worksheet.Cells[0, 0, 0, cols - 1])
                cell.Value = "Cell-" + cell.Address;
            // Using IRange and foreach would be less code,
            // but we'll do it the fast way.
            for (int row = 1; row <= rows; row++)
            {
                for (int col = 0; col < cols; col++)
                {
                    double number = rand.NextDouble();
                    sum += number;
                    values.SetNumber(row, col, number);
                }
            }
            workbook.SaveAs(filename, FileFormat.Excel8);
            return sum;
        }

        static double ReadWithSpreadsheetGear(string filename)
        {
            IWorkbook workbook = Factory.GetWorkbook(filename);
            IWorksheet worksheet = workbook.Worksheets[0];
            IValues values = (IValues)worksheet;
            IRange usedRange = worksheet.UsedRange;
            int rowCount = usedRange.RowCount;
            int colCount = usedRange.ColumnCount;
            double sum = 0.0;
            // We could use foreach (IRange cell in usedRange) for cleaner
            // code, but this is faster. Row 0 holds the labels, so start at 1.
            for (int row = 1; row < rowCount; row++)
            {
                for (int col = 0; col < colCount; col++)
                {
                    IValue value = values[row, col];
                    if (value != null && value.Type == SpreadsheetGear.Advanced.Cells.ValueType.Number)
                        sum += value.Number;
                }
            }
            return sum;
        }

        static double ReadWithOleDB(string filename)
        {
            string connectionString =
                "Provider=Microsoft.Jet.OLEDB.4.0;" +
                "Data Source=" + filename + ";" +
                "Extended Properties=Excel 8.0;";
            DataSet dataSet = new DataSet();
            // Dispose the connection even if Fill throws.
            using (OleDbConnection connection = new OleDbConnection(connectionString))
            {
                connection.Open();
                OleDbCommand selectCommand = new OleDbCommand("SELECT * FROM [Sheet1$]", connection);
                OleDbDataAdapter dataAdapter = new OleDbDataAdapter();
                dataAdapter.SelectCommand = selectCommand;
                dataAdapter.Fill(dataSet);
            }
            double sum = 0.0;
            // We'll make some assumptions for brevity of the code.
            DataTable dataTable = dataSet.Tables[0];
            int cols = dataTable.Columns.Count;
            foreach (DataRow row in dataTable.Rows)
            {
                for (int i = 0; i < cols; i++)
                {
                    object val = row[i];
                    if (val is double)
                        sum += (double)val;
                }
            }
            return sum;
        }
    }
}

#2


1  

We always use Excel Interop to open the spreadsheet and parse directly (e.g. similar to how you would scan through cells in VBA), or we create locked down templates that enforce certain columns to be filled in before the user can save the data.

#3


1  

You could look at ExcelMapper, a tool that reads Excel files as strongly typed objects. It hides the details of reading an Excel file from your code, and it copes with a workbook that is missing a column or has no data in a column. You read only the data you are interested in. You can get the code/executable for ExcelMapper from http://code.google.com/p/excelmapper/.

#4


0  

If you could require the Excel sheet to have column headers, then you would always have the 13 columns. You would just need to skip the header row when processing.

This would also correct situations where the user puts the columns in an order that you are not expecting. (detect column indexes in the header row and read appropriately)
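
The header-detection idea can be sketched as follows, assuming the sheet was loaded into a `DataTable` with the header text still in the first data row (for example, when read with `HDR=No`). `HeaderMap` and `FromFirstRow` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper (not from the original post): maps header text to
// column index so data can be read by name regardless of column order.
public static class HeaderMap
{
    public static Dictionary<string, int> FromFirstRow(DataTable table)
    {
        // Case-insensitive so "Price" and "price" resolve to the same column.
        var map = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        if (table.Rows.Count == 0) return map;

        DataRow header = table.Rows[0];
        for (int i = 0; i < table.Columns.Count; i++)
        {
            // Empty header cells (DBNull or blank) are skipped;
            // the first occurrence of a duplicate name wins.
            string name = Convert.ToString(header[i]).Trim();
            if (name.Length > 0 && !map.ContainsKey(name))
                map[name] = i;
        }
        return map;
    }
}
```

With the map in hand, the data rows start at index 1 and a value can be read as `table.Rows[r][map["Price"]]`, where `"Price"` stands in for whatever header the sheet actually uses.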

I see that others are recommending the Excel interop, but jeez that's a slow option compared to the OleDb way. Plus it requires Excel or OWC to be installed on the server (licensing).

#5


0  

You might try using Excel and COM. That way, you'll be getting your info straight from the horse's mouth, as it were.

From D. Anand over on the MSDN forums:

Create a reference in your project to the Excel Object Library. It can be added from the COM tab of the Add Reference dialog.

Here's some info on the Excel object model in C# http://msdn.microsoft.com/en-us/library/aa168292(office.11).aspx

#6


0  

I recommend you try Visual Studio Tools for Office and Excel Interop. They are very easy to use.
