Sample 'null.csv' file contains
71131940,2015-05-01,"JEWELLERY,ITEM",P,,W
I have a .csv file in which I want to handle commas (,) and null values (,,), so that when I split each line on commas, the commas inside double quotes are ignored and I do not get output like this:
71131940,2015-05-01,JEWELLERY,ITEM,P,,W
I handled the null values, i.e. (,,), by replacing them with (,0,) using the sed command:
sed -i -e "s/,,/,0,/g" null.csv
and got output something like
71131940,2015-05-01,JEWELLERY,ITEM,P,0,W
But the problem is that I don't want "JEWELLERY,ITEM" to be split into JEWELLERY and ITEM.
Any kind of help will be appreciated.
1 Solution
#1
2
I'm sure this has been asked and answered a million times but in any case, for input formatted as simply as you have shown (e.g. no quoted quotes or newlines within quotes):
$ awk -v FPAT='[^,]*|"[^"]*"' '{for (i=1;i<=NF;i++) print i, $i}' file
1 71131940
2 2015-05-01
3 "JEWELLERY,ITEM"
4 P
5
6 W
The above uses GNU awk for FPAT
(see https://www.gnu.org/software/gawk/manual/gawk.html#Splitting-By-Content).
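If GNU awk is not available, roughly the same split can be done by hand in any POSIX awk. The following is only a sketch under the same assumptions as above (no escaped quotes and no newlines inside quoted fields); the function name `split_csv` is made up for illustration and is not part of any tool:

```shell
# split_csv (hypothetical helper): print each field of simple CSV lines,
# one per output line, prefixed with its field number.
# Assumes no escaped quotes ("") and no newlines inside quoted fields.
split_csv() {
  awk '{
    line = $0
    n = 0
    while (1) {
      if (match(line, /^"[^"]*"/)) {
        # Quoted field: take everything through the closing quote.
        field = substr(line, 1, RLENGTH)
        line = substr(line, RLENGTH + 1)
      } else {
        # Unquoted (possibly empty) field: take up to the next comma.
        i = index(line, ",")
        if (i == 0) i = length(line) + 1
        field = substr(line, 1, i - 1)
        line = substr(line, i)
      }
      print ++n, field
      if (line == "") break
      line = substr(line, 2)   # drop the separating comma
    }
  }' "$@"
}

# Usage with the sample line from the question:
printf '%s\n' '71131940,2015-05-01,"JEWELLERY,ITEM",P,,W' | split_csv
```

For the sample line this should print the same six numbered fields as the FPAT version, with field 5 empty. One difference worth noting: an entirely empty input line is reported here as a single empty field.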