I have a script that extracts values from files. My problem is that when a file contains more than one value (specifically for the number variable), I don't know how to separate them. So far I have only managed to print them in comma-separated form.
#!/bin/bash
egrep -l 'NG' /home/archive/* > /home/archive/console.txt
chmod 644 /home/archive/console.txt
while read FILE
do
file=$FILE
cat $FILE | awk '{gsub("-",RS);print}' > file1.txt
chmod 644 file1.txt
cat file1.txt | awk '{gsub("*",RS);print}' > file2.txt
chmod 644 file2.txt
date=`sed -n '10p' < file2.txt `
number=`awk 'BEGIN {FS="*"; i=0; ORS=""}
$1=="NG" {a[i++]=$4}
END {
if (i>1) {
print "" a[0]
for (j = 1; j < i; j++)
print "," a[j]
}
if (i==1)
print "" a[0] ""
}' file1.txt`
echo "name: ${file}"
echo "creation_date: $date"
echo "number: ${number}"
done < /home/archive/console.txt
And here is the sample file.txt content:
ABC*72832*123*782327*1234*SRET *723825W43*834734*73298 *2014Nov30*STRR*K*2014*WRERFD*0*P*-TREKTKR*FGSRTR*SRET*72382543*20140613*1805*78403698*O*005010-FD*7C*78405492-NG*VL*847*347*23-KDJRFK*6729*124713-7238S*A-6283HD*723H*124714-7120373*A-8723D*Y*2*2*2-
HSSDQW*7E*78405493-NG*VL*847*347*24-RDERE*872*124715-
My actual output using above script is this:
name: file1.txt
creation_date: 2014Nov30
number: 23,24
On the other hand, my expected output would be this one:
name: file1.txt
creation_date: 2014Nov30
number: 23
name: file1.txt
creation_date: 2014Nov30
number: 24
The name and creation_date must be the same in every block (since they come from the same file), while each number value must go into its own block. Is this feasible?
1 solution
#1
You are needlessly splitting the task into many small steps. It is simpler to do everything in a single Awk script.
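#!/bin/bash
# List the files that contain "NG", then report each one with a single Awk call.
grep -El 'NG' /home/archive/* |
while IFS= read -r file    # note -r: keeps backslashes in file names intact
do
    awk -v RS='-|\n' -v FS='*' '
        # A regex RS needs gawk, mawk or similar; records end at "-" or newline.
        # Assume date is always on first line, please check?
        NR==1 { date=$10 }
        # The question reads $4; in the posted sample 23 and 24 sit in $5.
        $1=="NG" { x[++i]=$4 }
        END {
            for (j = 1; j <= i; j++)
                printf "name: %s\ncreation_date: %s\nnumber: %i\n",
                    FILENAME, date, x[j]
        }' "$file"
done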
Notice how the script avoids the use of temporary files by keeping all processing in a single pipeline.
I'm not sure the Awk script is entirely correct; but without knowledge of the file you are processing, there's a lot of guesswork involved. If you have an actual specification for the input file format, that would help reduce uncertainty.
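For anyone who wants to try the idea against the sample from the question, here is a minimal sketch of the same approach wrapped in a function. The name report_ng and its optional second argument are hypothetical, introduced only for illustration. It swaps the regex RS for a tr call so it does not depend on a particular Awk, and it assumes, based only on the posted sample, that the date is field 10 of the first segment and that the number is field 5 of each NG segment (the question's own script reads field 4, so pass a different index if the real data differs).

```shell
#!/bin/sh
# report_ng FILE [FIELD]  (hypothetical helper, not part of the original answer)
# Prints one name/creation_date/number block per "NG" segment in FILE.
# FIELD is the *-separated field holding the number; 5 is an assumption
# taken from the posted sample.
report_ng() {
    # Turn every "-" into a newline so each segment becomes one record.
    tr '-' '\n' < "$1" |
    awk -F'[*]' -v name="$1" -v field="${2:-5}" '
        # Assumption: the date is field 10 of the first segment.
        NR == 1    { date = $10 }
        # Emit a complete block for every NG segment instead of joining them.
        $1 == "NG" {
            printf "name: %s\ncreation_date: %s\nnumber: %s\n", name, date, $field
        }
    '
}
```

It could then be driven by the same file listing as above, for example: grep -El 'NG' /home/archive/* | while IFS= read -r f; do report_ng "$f"; done. Printing inside the NG rule rather than in END works here because the date is already known from the first record.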