I am trying to build a simulation tool in Excel using VSTO, as a Visual Studio 2010 Office workbook project. One of the worksheets in this workbook will contain approximately half a million records. Ideally, I would like to read all of the records, use them in the simulation, and then write back some statistics. So far I have been getting OutOfMemory exceptions when I try to get the whole range and then read all of its cells in one go. Does anyone have ideas on how I can read all of the data, or other suggestions for doing this?
This is my code:
Excel.Range range = Globals.shData.Range["A2:AX500000"];
Array values = (Array)range.Cells.Value;
2 solutions
#1
8
How about fetching the data in batches and assembling a less memory-heavy model in memory?
// Requires: using System; using System.Linq;
var firstRow = 2;
var lastRow = 500000;
var batchSize = 5000;
// The row count is inclusive of both ends, hence the + 1.
var batchCount = (int)Math.Ceiling((lastRow - firstRow + 1) / (double)batchSize);
var batches = Enumerable
    .Range(0, batchCount)
    .Select(x =>
        string.Format(
            "A{0}:AX{1}",
            x * batchSize + firstRow,
            Math.Min((x + 1) * batchSize + firstRow - 1, lastRow)))
    // Cast the returned value (not the Range itself) to Array.
    .Select(address => (Array)Globals.shData.Range[address].Cells.Value);
foreach (var batch in batches)
{
    foreach (var item in batch)
    {
        // Re-encode item into your own object collection.
    }
}
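The range addresses for each batch can also be computed by a small standalone helper, which makes the boundary arithmetic easy to check without Excel. This is only a sketch: `BatchAddresses` and its `For` method are hypothetical names, not part of VSTO or the Excel object model, and the column letters are passed in as plain strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of VSTO): splits an inclusive row span
// into A1-style range addresses of at most batchSize rows each.
static class BatchAddresses
{
    public static IEnumerable<string> For(
        string firstCol, string lastCol, int firstRow, int lastRow, int batchSize)
    {
        // Because this is an iterator, the check runs on first enumeration.
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException("batchSize");

        for (int start = firstRow; start <= lastRow; start += batchSize)
        {
            // Clamp the final batch so it never runs past lastRow.
            int end = Math.Min(start + batchSize - 1, lastRow);
            yield return string.Format("{0}{1}:{2}{3}", firstCol, start, lastCol, end);
        }
    }
}
```

Each address can then be passed to `Globals.shData.Range[address]`. For rows 2 to 500000 with a batch size of 5000, this yields 100 addresses, from A2:AX5001 through A495002:AX500000.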
#2
2
This is not an Excel problem, rather a general C# issue. Instead of gathering all the rows in memory, yield the rows and calculate the stats iteratively.
For example
using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        var totalOfAllAges = 0D;
        var rows = new ExcelRows();
        // Calculate various statistics, one row at a time.
        foreach (var item in rows.GetRow())
        {
            totalOfAllAges += item.Age;
        }
        Console.WriteLine("The total of all ages is {0}", totalOfAllAges);
    }
}

internal class ExcelRows
{
    private readonly int rowCount = 1500000;

    // Stand-in for reading from the worksheet: rows are produced lazily,
    // so only one ExcelRow is alive at a time.
    public IEnumerable<ExcelRow> GetRow()
    {
        // The index is local, so the sequence can be enumerated more than once.
        for (int rowIndex = 1; rowIndex <= rowCount; rowIndex++)
        {
            yield return new ExcelRow() { Age = rowIndex };
        }
    }
}

/// <summary>
/// Represents the next row read via VSTO.
/// </summary>
internal class ExcelRow
{
    public double Age { get; set; }
}
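If the simulation needs more than a running total, the same single-pass idea extends to the mean and variance. The following is a rough sketch using Welford's online algorithm; `RunningStats` is a hypothetical helper, not something VSTO provides, and it assumes each row contributes one numeric value such as `Age`.

```csharp
using System;

// Hypothetical helper: accumulates count, mean and sample variance
// in one pass, without storing the individual values.
sealed class RunningStats
{
    private long count;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    public void Add(double value)
    {
        count++;
        double delta = value - mean;
        mean += delta / count;
        // Uses the updated mean, as in Welford's algorithm.
        m2 += delta * (value - mean);
    }

    public long Count { get { return count; } }

    public double Mean { get { return mean; } }

    // Sample variance; reported as 0 when fewer than two values were added.
    public double Variance
    {
        get { return count > 1 ? m2 / (count - 1) : 0.0; }
    }
}
```

Inside the `foreach` over `rows.GetRow()`, calling `stats.Add(item.Age)` on a `RunningStats` instance keeps only three numbers in memory, regardless of how many rows are read.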