Excel CSV file with more than 1,048,576 rows of data

Time: 2022-03-25 02:49:51

I have been given a CSV file with more rows than Excel can handle, and I really need to be able to see all the data. I understand and have tried the method of "splitting" it, but it doesn't work.

Some background: The CSV file is an Excel CSV file, and the person who gave me the file has said there are about 2 million rows of data.

When I import it into Excel, I get data up to row 1,048,576. When I then re-import it in a new tab starting at row 1,048,577 of the data, it only gives me one row, and I know for a fact that there should be more (not only because "the person" said there are more than 2 million, but because of the information in the last few sets of rows).

I thought that maybe the reason for this is that I have been provided the CSV file as an Excel CSV file, and so all the information past row 1,048,576 is lost (?).
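One way to check whether the rows past 1,048,576 are really missing is to count the records without going through Excel at all. The following is only a minimal sketch: the function name `count_csv_records`, the UTF-8 encoding, and the comma delimiter are assumptions, not something stated in the question.

```python
import csv

EXCEL_MAX_ROWS = 1_048_576  # 2**20, the per-sheet row limit in Excel 2007 and later


def count_csv_records(path, encoding="utf-8"):
    """Count non-empty CSV records by streaming the file, so memory use stays small."""
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with open(path, newline="", encoding=encoding) as handle:
        return sum(1 for row in csv.reader(handle) if row)
```

If `count_csv_records("data.csv")` (a hypothetical file name) returns roughly 2 million, the data is intact and only Excel's sheet limit is in the way; if it returns a number close to `EXCEL_MAX_ROWS`, the file was probably truncated when it was saved.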

Do I need to ask for a file in an SQL database format?

13 answers

#1


17  

You should try Delimit. It can open up to 2 billion rows and 2 million columns very quickly, and it has a free 15-day trial too. Does the job for me!

#2


8  

I would suggest loading the .CSV file into MS-Access.

With MS-Excel you can then create a data connection to this source (without actually loading the records into a worksheet) and create a connected pivot table. You can then have a virtually unlimited number of lines in your table (depending on processor and memory: I now have 15 million lines with 3 GB of memory).

An additional advantage is that you can now create an aggregate view in MS-Access. In this way you can create overviews from hundreds of millions of lines and then view them in MS-Excel (beware of the 2 GB limitation on NTFS files in a 32-bit OS).
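The idea of an aggregate view, reducing millions of lines to a small overview that Excel can display, can also be illustrated outside Access. The sketch below is not the Access approach itself; it swaps in plain Python with the standard library, and the function name `count_by_column` and the presence of a header row are assumptions.

```python
import csv
from collections import Counter


def count_by_column(path, column, encoding="utf-8"):
    """Return a Counter of how many records carry each value of `column`."""
    counts = Counter()
    with open(path, newline="", encoding=encoding) as handle:
        # DictReader takes field names from the first row (assumed to be a header).
        for record in csv.DictReader(handle):
            counts[record[column]] += 1
    return counts
```

Because the file is read one record at a time, only the summary is held in memory, and the summary is small enough to paste into a worksheet.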

#3


7  

First you want to change the file format from csv to txt. That is simple to do: just edit the file name and change csv to txt. (Windows will give you a warning about possibly corrupting the data, but it is fine, just click OK.) Then make a copy of the txt file so that you now have two files, both with 2 million rows of data. Then open the first txt file, delete the second million rows, and save the file. Then open the second txt file, delete the first million rows, and save the file. Now change the two files back to csv the same way you changed them to txt originally.

#4


4  

Excel 2007+ is limited to somewhat over 1 million rows (2^20 to be precise), so it will never load your 2M-line file. I think the technique you refer to as splitting is the built-in thing Excel has, but afaik that only works for width problems, not for length problems.

The easiest way I see right away is to use some file-splitting tool (there are tons of them) and then load the resulting partial CSV files into multiple worksheets.

PS: "Excel CSV files" don't exist; there are only files produced by Excel that use one of the formats commonly referred to as CSV files...
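The file-splitting step suggested in this answer does not need a dedicated tool. As a rough sketch, assuming the first row is a header and the file is comma-delimited UTF-8, a short Python script can write parts that each fit on one worksheet. The function name `split_csv` and the `_partN` naming scheme are made up for illustration.

```python
import csv
from pathlib import Path

EXCEL_MAX_ROWS = 1_048_576  # rows per worksheet in Excel 2007 and later


def split_csv(path, rows_per_part=EXCEL_MAX_ROWS - 1, encoding="utf-8"):
    """Split `path` into numbered parts, repeating the header row in each one.

    The default leaves room for the header, so header plus data fits one sheet.
    Returns the list of part paths that were written.
    """
    source = Path(path)
    parts = []
    out = None
    try:
        with open(source, newline="", encoding=encoding) as handle:
            reader = csv.reader(handle)
            header = next(reader, None)
            if header is None:  # empty input: nothing to split
                return parts
            written = 0
            for row in reader:
                # Start a new part for the first row and whenever one is full.
                if out is None or written == rows_per_part:
                    if out is not None:
                        out.close()
                    part = source.with_name(f"{source.stem}_part{len(parts) + 1}.csv")
                    out = open(part, "w", newline="", encoding=encoding)
                    writer = csv.writer(out)
                    writer.writerow(header)
                    parts.append(part)
                    written = 0
                writer.writerow(row)
                written += 1
    finally:
        if out is not None:
            out.close()
    return parts
```

Using the `csv` module rather than splitting on raw line breaks keeps records with quoted multi-line fields in one piece, which a plain text splitter would not guarantee.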

#5


4  

Try using OpenRefine. It has been able to handle datasets that otherwise crashed Excel for me.

#6


3  

You can use PowerPivot to work with files of up to 2 GB, which will be enough for your needs.

#7


2  

If you have Matlab, you can open large CSV (or TXT) files via its import facility. The tool gives you various import format options, including tables, column vectors, numeric matrices, etc. However, with Matlab being an interpreted environment, it does take its time to import such a large file; I was able to import one with more than 2 million rows in about 10 minutes.

The tool is accessible via Matlab's Home tab by clicking the "Import Data" button.

[Screenshot: example of a large file import]

Once imported, the data appears in the Workspace on the right-hand side, where it can be double-clicked to open in an Excel-like view and even plotted in different formats.

#8


1  

Use MS Access. I have a file of 2,673,404 records. It will not open in Notepad++, and Excel will not load more than 1,048,576 records. It is tab-delimited since I exported the data from a MySQL database, and I need it in CSV format. So I imported it into Access. Change the file extension to .txt so MS Access will take you through the import wizard.

MS Access will link to your file, so keep the CSV file for the database to stay intact.

#9


1  

I'm surprised no one mentioned Microsoft Query. You can simply request data from the large CSV file as you need it, querying only what you need. (Querying is set up much like filtering a table in Excel.)

Better yet, if you are open to installing the Power Query add-in, it's super simple and quick.
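The principle behind this answer, pulling only the rows you need instead of loading everything, can be sketched independently of Microsoft Query or Power Query. The example below is not either tool's API; it is plain Python with a hypothetical helper `select_rows`, and it assumes a comma-delimited file with a header row.

```python
import csv


def select_rows(path, predicate, limit=None, encoding="utf-8"):
    """Yield records (as dicts) for which `predicate(record)` is true.

    Stops once `limit` matches have been produced, if a limit is given.
    """
    matched = 0
    with open(path, newline="", encoding=encoding) as handle:
        for record in csv.DictReader(handle):
            if limit is not None and matched >= limit:
                return
            if predicate(record):
                matched += 1
                yield record
```

For instance, `list(select_rows("data.csv", lambda r: r["country"] == "DE", limit=1000))` would collect at most 1,000 matching records, where the file name and the `country` column are placeholders. The result is normally far below Excel's row limit and can be written out as a small CSV.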

#10


0  

"Do I need to ask for a file in an SQL database format?" YES!!!

Using a database is the best option for this problem.

Excel 2010 specifications.
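Asking the sender for a database export is not strictly required, because the CSV itself can be loaded into a database. A minimal sketch using SQLite through Python's standard `sqlite3` module follows; SQLite is chosen here only as an example of "a database", and the function name `load_csv_into_sqlite`, the table name `records`, and the header-row assumption are all illustrative.

```python
import csv
import sqlite3


def load_csv_into_sqlite(csv_path, db_path, table="records", encoding="utf-8"):
    """Copy a CSV file with a header row into a SQLite table; return rows inserted."""
    with open(csv_path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        header = next(reader)  # assumes the file is not empty
        # Quote identifiers so header names with spaces still work.
        columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
        placeholders = ", ".join("?" for _ in header)
        connection = sqlite3.connect(db_path)
        try:
            with connection:  # commits on success, rolls back on error
                connection.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
                cursor = connection.executemany(
                    f'INSERT INTO "{table}" VALUES ({placeholders})',
                    (row for row in reader if row),  # skip blank lines
                )
                return cursor.rowcount
        finally:
            connection.close()
```

Every value is stored as text in this sketch, so numeric comparisons in later queries need a `CAST`. Once loaded, queries such as `SELECT COUNT(*) FROM records` run over all rows with no worksheet limit.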

#11


0  

Split the CSV into two files in Notepad. It's a pain, but after that you can just edit each of them individually in Excel.

#12


0  

I would strongly recommend you import the data into Access so you can then query it from inside Access. You could try to use R to query your file as well, which I'd be more than happy to help with. Otherwise, you could look at a free solution such as this product, which allows you to run SQL statements from within an Excel file. http://www.querystorm.com/Home/Guide

#13


0  

I was able to edit a large 17 GB CSV file in Sublime Text without issue (line numbering makes it a lot easier to keep track of manual splitting), and then dump it into Excel in chunks smaller than 1,048,576 lines. Simple and quite quick, and less faffy than researching, installing, and learning bespoke solutions. Quick and dirty, but it works.
