How do I split a large text file into smaller text files using VBA?

Time: 2021-06-05 23:54:42

I have a database text file. It is large, about 387,480 KB. The file contains table names, table headers, and values. I need to split it into multiple files, each containing the creation and insert statements for one table, with the table name as the file name. Can anyone help?


1 solution

#1



I don't see how Excel will open a file of roughly 380 MB. You can try to load it into Access and do the split using VBA. However, importing a file that large may bloat the database enough to push Access up to its 2 GB limit, and then it's all over. SQL Server would handle this kind of job. Alternatively, you could use Python or R to do the work for you.


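### Python:
import pandas as pd

# Read the CSV in chunks of 100,000 rows so the whole file never sits in memory;
# adjust chunksize to control how many rows go into each output file.
for i, chunk in enumerate(pd.read_csv('C:/your_path/main.csv', chunksize=100000)):
    # index=False keeps pandas from adding an extra row-number column
    chunk.to_csv('chunk{}.csv'.format(i), index=False)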
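
The pandas loop splits by row count, while the question asks for one file per table. Since the layout of the dump is not shown, here is only a rough sketch under a loud assumption: each table's section begins with a line that looks like `CREATE TABLE <name>`, and everything up to the next such line belongs to that table. The function name `split_by_table`, the `TABLE_START` pattern, and the `.txt` extension are all made up for illustration. The file is read line by line, so memory use stays small regardless of file size.

```python
import re
from pathlib import Path

# ASSUMPTION: a table's section starts with a line like "CREATE TABLE name".
# Change this pattern to match whatever actually marks a new table in the dump.
TABLE_START = re.compile(
    r'^\s*CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?[`"\[]?([\w.]+)',
    re.IGNORECASE,
)


def split_by_table(dump_path, out_dir, encoding="utf-8"):
    """Stream dump_path line by line and write one <table>.txt file per table.

    Returns the table names in order of first appearance.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    seen = []    # table names found so far
    out = None   # handle of the file currently being written
    try:
        with open(dump_path, "r", encoding=encoding, errors="replace") as src:
            for line in src:
                match = TABLE_START.match(line)
                if match:
                    if out is not None:
                        out.close()
                    name = match.group(1)
                    # Append if the same table shows up again later in the dump.
                    mode = "a" if name in seen else "w"
                    if name not in seen:
                        seen.append(name)
                    out = open(out_dir / f"{name}.txt", mode, encoding=encoding)
                if out is not None:  # lines before the first table are skipped
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return seen
```

A call might look like `split_by_table('C:/your_path/main.txt', 'C:/your_path/tables')`, where both paths are placeholders. If the dump marks tables some other way, only the `TABLE_START` pattern should need to change.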

### R
setwd("C:/your_path/")
mydata <- read.csv("annualsinglefile.csv")

# If you want 5 random chunks with the same number of lines, say 30 each
# (this assumes the data has exactly 5 * 30 = 150 rows):
# Chunks <- split(mydata, sample(rep(1:5, 30)))  ## 5 chunks of 30 lines each

# If you want the first 100000 rows, take them directly
First_chunk <- head(mydata, 100000)  ## this would contain the first 100000 rows

# Or you can take any range of rows within the data
# Second_chunk <- mydata[100:71, ]  ## the last 30 rows in reverse order if your data had 100 rows

# If you want to write these chunks out to a csv file
# (write.csv always writes the header row, so col.names is not passed):
write.csv(First_chunk, file = "First_chunk.csv", quote = FALSE, row.names = FALSE)
# write.csv(Second_chunk, file = "Second_chunk.csv", quote = FALSE, row.names = FALSE)
