Split a string on commas, but ignore commas inside double quotes, in a .csv file using a shell script?

Time: 2021-08-21 21:43:04

The sample file 'null.csv' contains:

71131940,2015-05-01,"JEWELLERY,ITEM",P,,W

I have a .csv file in which I want to handle commas (,) and null values (,,). When I split each line of the file on commas, commas within double quotes should be ignored, so that I do not get output like this:

71131940,2015-05-01,JEWELLERY,ITEM,P,,W

I handled null values, i.e. (,,), by replacing them with (,0,) using the sed command:

sed -i -e "s/,,/,0,/g" null.csv

and got output like this:

71131940,2015-05-01,JEWELLERY,ITEM,P,0,W

The problem is that I don't want "JEWELLERY,ITEM" to be split into JEWELLERY and ITEM.

Any kind of help will be appreciated.

1 solution

#1


I'm sure this has been asked and answered a million times but in any case, for input formatted as simply as you have shown (e.g. no quoted quotes or newlines within quotes):

$ awk -v FPAT='[^,]*|"[^"]*"' '{for (i=1;i<=NF;i++) print i, $i}' file
1 71131940
2 2015-05-01
3 "JEWELLERY,ITEM"
4 P
5
6 W

The above uses GNU awk for FPAT (see https://www.gnu.org/software/gawk/manual/gawk.html#Splitting-By-Content).
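
If GNU awk is not available, roughly the same split can be sketched in plain POSIX awk by walking each line one character at a time and tracking whether the current position is inside double quotes. This is only a sketch under the same assumptions as above (no quoted quotes and no newlines within quotes); the function name split_csv is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): prints "index field"
# for every field of every input line, without relying on gawk's FPAT.
# Assumes simple CSV: no escaped quotes ("") and no newlines inside quotes.
split_csv() {
    awk '{
        n = 0; field = ""; inquote = 0
        len = length($0)
        for (i = 1; i <= len; i++) {
            c = substr($0, i, 1)
            if (c == "\"") {
                inquote = !inquote   # toggle quoted state, keep the quote character
                field = field c
            } else if (c == "," && !inquote) {
                print ++n, field     # an unquoted comma ends the current field
                field = ""
            } else {
                field = field c      # ordinary character, or a comma inside quotes
            }
        }
        print ++n, field             # the last field has no trailing comma
    }' "$@"
}

# Usage with the sample line from the question
printf '%s\n' '71131940,2015-05-01,"JEWELLERY,ITEM",P,,W' | split_csv
```

Like the FPAT version, this keeps the surrounding quotes on field 3 and reports the empty field 5 as an empty string. With no arguments it reads standard input; otherwise it reads the named files, e.g. split_csv null.csv.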
