I have a large CSV file (7.3GB; 16,300,000 lines), how can I split this file into two files?
2 solutions
#1
16
Have you taken a look at the split command? See this man page for more information.
This page contains an example use of this command.
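For the case in the question (two output files), a minimal sketch using the line-count option of split might look like the following. The function name split_in_two and the default prefix "part." are illustrative choices, not something from the linked page.

```shell
#!/bin/sh
# Sketch: split a file into two pieces of (roughly) equal line count.
# split_in_two and the "part." prefix are hypothetical names for illustration.
split_in_two() {
    input=$1
    prefix=${2:-part.}

    # Count lines; strip the padding some wc implementations add.
    total=$(wc -l < "$input" | tr -d ' ')

    # Round up so an odd line count still produces exactly two files.
    half=$(( (total + 1) / 2 ))

    # -l N puts at most N lines in each output file (prefixaa, prefixab, ...).
    split -l "$half" "$input" "$prefix"
}
```

Calling `split_in_two bigfile.csv` would write part.aa and part.ab. The header row ends up only in the first file, and the input is assumed to have at least two lines.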
Aside: the man -k command is rather useful for finding Unix/Linux commands if you aren't quite sure what the specific command is. Specify a keyword with man -k and the system will list the related commands. E.g.,
% man -k split
will yield:
csplit (1) - split a file into sections determined by context lines
dirsplit (1) - splits directory into multiple with equal size
dpkg-split (1) - Debian package archive split/join tool
gpgsplit (1) - Split an OpenPGP message into packets
pnmsplit (1) - split a multi-image portable anymap into multiple single-image files
ppmtoyuvsplit (1) - convert a portable pixmap into 3 subsampled raw YUV files
split (1) - split a file into pieces
splitdiff (1) - separate out incremental patches
splitfont (1) - extract characters from an ISO-type font.
URI::Split (3pm) - Parse and compose URI strings
wcstok (3) - split wide-character string into tokens
yuvsplittoppm (1) - convert a Y- and a U- and a V-file into a portable pixmap
zipsplit (1) - split a zipfile into smaller zipfiles
#2
1
split -d -n l/N filename.csv tempfile.part.
splits the file into N files without splitting lines. As mentioned in the comments above, the header is not repeated in each file.
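If each piece needs the header, one possible approach is to split first and then prepend the first line of the original to every piece except the first. This is only a sketch: it assumes GNU split (for -d and -n l/N), assumes no other files already match the output prefix, and the function name split_with_header is made up for illustration.

```shell
#!/bin/sh
# Sketch: split a CSV into N pieces on line boundaries and repeat the header.
# Assumes GNU coreutils split; split_with_header is a hypothetical helper.
split_with_header() {
    input=$1
    n=$2
    prefix=${3:-tempfile.part.}

    # -n l/N: N pieces without breaking lines; -d: numeric suffixes (00, 01, ...).
    split -d -n l/"$n" "$input" "$prefix"

    first=1
    for part in "$prefix"*; do
        # The first piece already starts with the header line.
        if [ "$first" -eq 1 ]; then
            first=0
            continue
        fi
        # Prepend the header of the original file to the remaining pieces.
        { head -n 1 "$input"; cat "$part"; } > "$part.tmp" && mv "$part.tmp" "$part"
    done
}
```

For the two-file case in the question this would be called as `split_with_header filename.csv 2`. Note that l/N divides by bytes rather than by line count, so the pieces may hold different numbers of rows, and rewriting each piece costs an extra pass over the data.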