R / Linux shell: batch-converting multi-sheet xls files to csv

Date: 2021-05-24 16:05:29

In R, I have a script that reads the content of multiple xls files (see "Loop over directory to get Excel content").

Each file is about 2 MB. The script takes a few seconds for 3 files, but on 120 files it has now been running for 6 hours on a Debian i7 system without producing any result.

A better solution is therefore (hopefully) to first convert all xls files to csv with ssconvert, using a bash script (see "Linux Shell Script For Each File in a Directory Grab the filename and execute a program"):

for f in *.xls ; do xls2csv "$f" "${f%.xls}.csv" ; done

This script does the job (I replaced 'xls2csv' with 'ssconvert'). However, my content is in sheet number 14, whereas the csv files produced by this script contain only the first sheet.

Can this script be adapted to pick up only sheet number 14 of each workbook?

2 answers

#1


2  

If you know the worksheet name, you can do this:

for f in *.xls ; do xls2csv -x "$f" -w sheetName -c "${f%.xls}.csv" ; done

For all the xls2csv details, see here.

EDIT

The OP found the right answer, so I am editing mine to add it:

for f in *.xls ; do xls2csv -x "$f" -f -n 14 -c "${f%.xls}.csv" ; done
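
For 120 files it may help to know which conversions failed rather than have the loop carry on silently. Below is a minimal sketch of the same loop wrapped in a function; the function name convert_sheet and its arguments are hypothetical, and it assumes xls2csv accepts exactly the flags shown in the one-liner above (-x, -f, -n, -c).

```shell
# Sketch only: convert worksheet number N of every .xls file in a directory.
# convert_sheet is a hypothetical helper; the xls2csv flags are the ones
# used in the answer above and are assumed to behave as shown there.
convert_sheet() {
    local dir=$1 sheet=$2 f failed=0
    for f in "$dir"/*.xls; do
        # If nothing matches, the glob stays literal; skip that case.
        [ -e "$f" ] || continue
        if ! xls2csv -x "$f" -f -n "$sheet" -c "${f%.xls}.csv"; then
            echo "failed: $f" >&2
            failed=1
        fi
    done
    # Non-zero if at least one file could not be converted.
    return "$failed"
}
```

It would be called as: convert_sheet . 14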

#2


1  

For this job I use a Python script named ssconverter.py (which you can find here; scroll down and download the two attachments, ssconverter.py and ooutils.py), which I call directly from R using system().

It can extract a specific sheet from the workbook, either by name or by sheet number, for example:

ssconverter.py infile.xls:2 outfile.csv

extracts the second sheet.

You need to have python and python-uno installed.
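
Applied to the question, the same infile.xls:N form could be used in a loop over all workbooks. This is a rough sketch, not part of the original answer: the function name convert_with_ssconverter and the SSCONVERTER variable are hypothetical, and it assumes ssconverter.py is executable and on the PATH.

```shell
# Sketch only: extract one sheet number from each given workbook using the
# "infile.xls:N outfile.csv" form shown above.
# SSCONVERTER is a hypothetical override for the script's location.
convert_with_ssconverter() {
    local sheet=$1 f
    shift
    for f in "$@"; do
        "${SSCONVERTER:-ssconverter.py}" "$f:$sheet" "${f%.xls}.csv" \
            || echo "failed: $f" >&2
    done
}
```

It would be called as: convert_with_ssconverter 14 *.xls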

