I have a database textfile. It is large text file about 387,480 KB. This file contains table name, headers of the table and values. I need to split this file into multiple files each containing table creation and insertion with a file name as table name. Please can anyone help me??
我有一个数据库文本文件。它是大型文本文件,大约387,480 KB。此文件包含表名,表的标题和值。我需要将此文件拆分为多个文件,每个文件包含表创建和插入,文件名为表名。请任何人可以帮助我?
1 个解决方案
#1
0
I don't see how Excel will open a 347MB file. You can try to load it into Access, and do the split, using VBA. However, the process of importing a file that large may fragment enough to blow Access up to @GB, and then it's all over. SQL Server would handle this kind of job. Alternatively, you could use Python or R to do the work for you.
我不知道Excel将如何打开347MB文件。您可以尝试将其加载到Access中,并使用VBA进行拆分。但是,导入大文件的过程可能足以将访问权限吹到@GB,然后全部结束。 SQL Server将处理这种工作。或者,您可以使用Python或R为您完成工作。
### Python:
import pandas as pd
for i,chunk in enumerate(pd.read_csv('C:/your_path/main.csv', chunksize=3)):
chunk.to_csv('chunk{}.csv'.format(i))
### R
setwd("C:/your_path/")
mydata = read.csv("annualsinglefile.csv")
# If you want 5 different chunks with same number of lines, lets say 30.
# Chunks = split(mydata,sample(rep(1:5,30))) ## 5 Chunks of 30 lines each
# If you want 100000 samples, put any range of 20 values within the range of number of rows
First_chunk <- sample(mydata[1:100000,]) ## this would contain first 100000 rows
# Or you can print any number of rows within the range
# Second_chunk <- sample(mydata[100:70,] ## this would contain last 30 rows in reverse order if your data had 100 rows.
# If you want to write these chunks out in a csv file:
write.csv(First_chunk,file="First_chunk.csv",quote=F,row.names=F,col.names=T)
# write.csv(Second_chunk,file="Second_chunk.csv",quote=F,row.names=F,col.names=T)
#1
0
I don't see how Excel will open a 347MB file. You can try to load it into Access, and do the split, using VBA. However, the process of importing a file that large may fragment enough to blow Access up to @GB, and then it's all over. SQL Server would handle this kind of job. Alternatively, you could use Python or R to do the work for you.
我不知道Excel将如何打开347MB文件。您可以尝试将其加载到Access中,并使用VBA进行拆分。但是,导入大文件的过程可能足以将访问权限吹到@GB,然后全部结束。 SQL Server将处理这种工作。或者,您可以使用Python或R为您完成工作。
### Python:
import pandas as pd
for i,chunk in enumerate(pd.read_csv('C:/your_path/main.csv', chunksize=3)):
chunk.to_csv('chunk{}.csv'.format(i))
### R
setwd("C:/your_path/")
mydata = read.csv("annualsinglefile.csv")
# If you want 5 different chunks with same number of lines, lets say 30.
# Chunks = split(mydata,sample(rep(1:5,30))) ## 5 Chunks of 30 lines each
# If you want 100000 samples, put any range of 20 values within the range of number of rows
First_chunk <- sample(mydata[1:100000,]) ## this would contain first 100000 rows
# Or you can print any number of rows within the range
# Second_chunk <- sample(mydata[100:70,] ## this would contain last 30 rows in reverse order if your data had 100 rows.
# If you want to write these chunks out in a csv file:
write.csv(First_chunk,file="First_chunk.csv",quote=F,row.names=F,col.names=T)
# write.csv(Second_chunk,file="Second_chunk.csv",quote=F,row.names=F,col.names=T)