I am very new to bash scripting. I have a network trace file I want to parse. Part of the trace file is (two packets):
[continues...]
+---------+---------------+----------+
05:00:00,727,744 ETHER
|0
|00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|
+---------+---------------+----------+
05:00:00,727,751 ETHER
|0
|00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|56|00|00|3a|01|
[continues...]
For each packet, I want to print the timestamp and the length of the packet (the number of hex values on the line that follows the |0 header), so the output will look like:
05:00:00.727744 20 bytes
05:00:00.727751 24 bytes
I can get the line with time stamp and the packets separately using grep in bash:
times=$(grep '..\:..\:' $fileName)
packets=$(grep '..|..|' $fileName)
But I can't work with the separate output lines after that. The whole result is concatenated in the two variables "times" and "packets". How can I get the length of each packet?
P.S. a good reference that really explains how to do bash programming, rather than just doing examples would be appreciated.
2 solutions
#1
1
You really don't want to do such things with your shell.
You want to write a real parser that understands the format and outputs the information you need.
For a quick and dirty hack you can do something like this:
perl -wne 'print "$& " if /^\d\S*/; print split(/\|/)-2, " bytes\n" if /^\|..\|/'
#2
2
Okay, with plain old shell...
You can get the length of the line like this:
line="|00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|"
wc -c <<<"$line"
62
There are sixty-two characters in that line. Think of each byte as |00, where 00 can be any two hex digits. On top of that, there's an extra | on the end. Plus, the count from wc -c includes the NL (newline) on the end.
So, if we take the value of wc -c and subtract 2, we get 60. If we divide that by 3, we get 20, which is the number of bytes in the packet.
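The arithmetic above can be checked directly in the shell. This is only a minimal sketch: the function name packet_bytes is made up for illustration, and it uses the ${#var} length expansion instead of wc -c, so there is no newline in the count and only the one trailing | has to be subtracted.

```shell
#!/bin/bash
# packet_bytes is a hypothetical helper name, not part of the original answer.
# ${#1} is the length of the string without any newline, so only the
# trailing "|" is subtracted before dividing by the 3 characters of "|xx".
packet_bytes() {
    echo $(( (${#1} - 1) / 3 ))
}

line="|00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|"
echo "wc -c reports: $(printf '%s\n' "$line" | wc -c)"
echo "bytes: $(packet_bytes "$line")"
```

Both routes should agree: (62 - 2) / 3 with wc -c, or (61 - 1) / 3 with ${#line}.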
Okay, now we need a little loop, figure out the various lines, and then parse them:
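#! /bin/bash
while read -r line
do
    if [[ $line =~ ^[[:digit:]]{2} ]]
    then
        # Timestamp line: strip " ETHER" and stay on the same output line.
        echo -n "${line% *} "
    elif [[ $line =~ ^\|[[:digit:]]{2} ]]
    then
        # Packet line: 3 characters per byte, plus the trailing "|" and the newline.
        length=$(wc -c <<<"$line")
        ((length-=2))
        ((length/=3))
        echo "$length bytes"
    fi
done < test.txt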
There's a (nearly) PURE BASH solution to your problem! The only external command it calls is wc.
You're a beginning Bash programmer, and you have no idea what's going on...
Let's take this one step at a time:
A common way to loop through a file in Bash is using a while read loop. This combines the while with a read:
while read line
do
    echo "My line is '$line'"
done < test.txt
Each line in test.txt is read into the $line shell variable.
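As a small sketch of the same loop pattern, the function below numbers the lines of a file. The name number_lines is made up for illustration, and two details are my additions rather than part of the answer: -r stops read from treating backslashes as escapes, and the extra [ -n "$line" ] test picks up a last line that has no trailing newline.

```shell
#!/bin/bash
# number_lines is a hypothetical helper; $1 is the file to read.
number_lines() {
    local line n=0
    # read returns non-zero at end of file; the extra test keeps a final
    # line that is missing its trailing newline.
    while read -r line || [ -n "$line" ]
    do
        n=$((n + 1))
        printf '%d: %s\n' "$n" "$line"
    done < "$1"
}
```

You would call it as number_lines test.txt. Note that the redirection sits on done, so it feeds the whole loop rather than a single read.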
Let's take the next one:
if [[ $line =~ ^[[:digit:]]{2} ]]
This is an if statement. Prefer the [[ ... ]] brackets in Bash: unquoted variables inside them are not word-split or glob-expanded, which avoids a whole class of quoting bugs. Plus, they have a bit more power, such as the =~ operator used here.
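To see the difference in practice, here is a rough sketch that runs the same unquoted comparison through both kinds of brackets. The function name compare_both is hypothetical, and the missing quotes around $value in the [ ... ] test are deliberate, to provoke the failure.

```shell
#!/bin/bash
# compare_both is a hypothetical demo function.
compare_both() {
    local value="two words"
    # Inside [[ ]] the unquoted variable stays a single word.
    if [[ $value == "two words" ]]; then
        echo "[[ ]]: match"
    fi
    # Inside [ ] the unquoted variable splits into two arguments, so the
    # test command sees too many arguments and fails.
    if [ $value = "two words" ] 2>/dev/null; then
        echo "[ ]: match"
    else
        echo "[ ]: failed"
    fi
}
compare_both
```

Quoting "$value" would make the [ ... ] version work too; [[ ... ]] just removes that trap.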
The =~ is a regular expression match. The [[:digit:]] matches any digit. The ^ anchors the regular expression to the beginning of the line, and {2} means I want exactly two of these. This says: if the line starts with two digits (which is your timestamp line), execute this if clause.
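Here is a minimal sketch of that match on its own, so you can try it against single lines. The function name classify_line is made up for illustration; BASH_REMATCH is the array Bash fills in after a successful =~ match, with element 0 holding the text the whole pattern matched.

```shell
#!/bin/bash
# classify_line is a hypothetical helper that labels one trace line.
classify_line() {
    if [[ $1 =~ ^[[:digit:]]{2} ]]; then
        # BASH_REMATCH[0] holds the part of the line the pattern matched.
        echo "timestamp (matched '${BASH_REMATCH[0]}')"
    else
        echo "other"
    fi
}

classify_line "05:00:00,727,744 ETHER"
classify_line "+---------+---------------+----------+"
```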
${line% *} is a pattern filter. The % removes the shortest match of the glob pattern that follows it (here, a space followed by anything) from the right-hand end of $line. I use this to remove the ETHER from my line. The -n tells echo not to print a NL (newline) at the end.
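The % filter has three siblings, and it may help to see all four side by side on the timestamp line from the trace. This is just an illustrative sketch; the patterns other than " *" are my own picks.

```shell
#!/bin/bash
line="05:00:00,727,744 ETHER"

echo "${line% *}"     # shortest suffix matching " *" removed -> 05:00:00,727,744
echo "${line%%,*}"    # longest suffix matching ",*" removed  -> 05:00:00
echo "${line#*,}"     # shortest prefix matching "*," removed -> 727,744 ETHER
echo "${line##* }"    # longest prefix matching "* " removed  -> ETHER
```

A single % or # takes the shortest match, the doubled form takes the longest, and $line itself is never modified.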
Let's take my elif, which is an else if clause.
elif [[ $line =~ ^\|[[:digit:]]{2} ]]
Again, I am matching a regular expression. This regular expression starts (the ^) with a |. I have to put a backslash in front because | is a magical regular expression character and \ kills the magic. It's now just a pipe. Then, that's followed by two digits. Note this skips |0 but catches |00. It also relies on the first byte being written with decimal digits only, as in your sample; a first byte such as |a0 would need [[:xdigit:]] instead of [[:digit:]].
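Because the packet bytes are hexadecimal, a first byte such as a0 contains a character that [[:digit:]] does not accept. The sketch below compares the two character classes on a few lines; the helper names matches_digit and matches_xdigit are hypothetical, and the extra \| after the two hex digits in the second pattern is my own tightening, not part of the answer.

```shell
#!/bin/bash
# Hypothetical helpers: each returns success when its pattern matches $1.
matches_digit()  { [[ $1 =~ ^\|[[:digit:]]{2} ]]; }
matches_xdigit() { [[ $1 =~ ^\|[[:xdigit:]]{2}\| ]]; }

for line in "|00|03|a0|" "|a0|09|5c|" "|0"; do
    if matches_digit "$line"; then d=yes; else d=no; fi
    if matches_xdigit "$line"; then x=yes; else x=no; fi
    echo "$line digit=$d xdigit=$x"
done
```

Only the [[:xdigit:]] version should accept the |a0|09|5c| line, and both should still skip the |0 header.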
Now, we have to do some calculations:
length=$(wc -c<<<$line)
The $(...) says to execute the enclosed command and substitute its output back into the line. The wc -c counts the characters, and <<<$line is what we're counting. This gave us 62 characters. We have to subtract 2, then divide by 3. That's the next two lines:
((length-=2))
((length/=3))
The ((...)) allows me to do integer-based math. The first subtracts 2 from $length and the next divides it by 3. Now, I can echo this out:
echo "$length bytes"
And that's our pure Bash answer to this question.
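If you want to drop wc altogether and also get the timestamp in the 05:00:00.727744 form shown in the question, something along these lines might work. Treat it as a sketch under assumptions: the function name summarize_trace and the two regex variables are made up, it assumes every timestamp looks like HH:MM:SS,mmm,uuu, and it assumes each packet's bytes sit on a single line.

```shell
#!/bin/bash
# summarize_trace is a hypothetical name; $1 is the trace file.
summarize_trace() {
    local line stamp
    # Keeping the patterns in variables avoids quoting surprises inside [[ ]].
    local ts_re='^([0-9]{2}:[0-9]{2}:[0-9]{2}),([0-9]{3}),([0-9]{3})'
    local pkt_re='^(\|[[:xdigit:]]{2})+\|$'

    while read -r line || [[ -n $line ]]
    do
        if [[ $line =~ $ts_re ]]
        then
            # Rebuild the timestamp as HH:MM:SS.micros, as in the question.
            stamp="${BASH_REMATCH[1]}.${BASH_REMATCH[2]}${BASH_REMATCH[3]}"
        elif [[ $line =~ $pkt_re ]]
        then
            # ${#line} has no newline: drop the final "|" and divide by 3.
            echo "$stamp $(( (${#line} - 1) / 3 )) bytes"
        fi
    done < "$1"
}
```

Called as summarize_trace test.txt, it prints one line per packet. The packet pattern requires at least one full |xx pair and a closing |, so the |0 header and the +---+ separators are skipped.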