I am struggling with the following task (I've been searching for an answer for a while).
I need to extract the text between START_PATTERN and END_PATTERN1.
Given a file structured like this:
text
text
...
START_PATTERN
line1
line2
END_PATTERN2
text
text
...
START_PATTERN
line1
line2
END_PATTERN1
text
text
...
The search should restart if END_PATTERN2 is found before END_PATTERN1. Thus the command output should be:
START_PATTERN
line1
line2
END_PATTERN1
Thank you for your time!
3 solutions
#1
2
This line should work for your example:
tac file|sed '/END_PATTERN1/,/START_PAT/!d'|tac
Test (I added xx to the expected block lines):
kent$ cat f
text
text
...
START_PATTERN
line1
line2
END_PATTERN2
text
text
...
START_PATTERN
xxline1
xxline2
END_PATTERN1
text
kent$ tac f|sed '/END_PATTERN1/,/START_PAT/!d'|tac
START_PATTERN
xxline1
xxline2
END_PATTERN1
Edit: to take only the first match, using awk alone:
awk '{a[NR]=$0}
/START_PAT/{s=NR}
/END_PATTERN2/{s=0}
/END_PATTERN1/&&s{f=1;exit}
END{if(f)for(i=s;i<=NR;i++)print a[i]}' file
#2
0
I'd go about this by keeping a buffer of lines once START_PATTERN is found and resetting it if END_PATTERN2 is found:
awk 'x { next }
/START_PATTERN/ { n = 1; f = 1 }
f { lines[n++] = $0 }
/END_PATTERN1/ { f = 0; x = 1 }
/END_PATTERN2/ { n = 1; f = 0 }
END { for (i = 1; i < n; ++i) print lines[i] }' file
f is a flag that determines whether the current line is saved to the buffer lines. n is a counter used to index the buffer; it always points at the next free slot. Once the file is processed, the first n - 1 lines in the buffer are printed.
I've also added a variable x which, once set, causes all remaining lines to be skipped. This means that only the first matching block is saved.
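The buffer approach described above can be wrapped in a small shell function for reuse. This is only a sketch under assumptions: the function name extract_block and the awk variable names found, inblock and buf are hypothetical, and the script prints nothing unless a complete block was actually captured.

```shell
# Hypothetical helper: print the first START_PATTERN..END_PATTERN1 block
# that is not interrupted by END_PATTERN2. Takes the file name as $1.
extract_block() {
    awk '
        found { next }                                  # a complete block is already stored
        /START_PATTERN/ { n = 0; inblock = 1 }          # (re)start the buffer
        inblock { buf[++n] = $0 }                       # collect lines while inside a block
        inblock && /END_PATTERN1/ { inblock = 0; found = 1 }   # block is complete
        /END_PATTERN2/ { n = 0; inblock = 0 }           # wrong terminator: discard the buffer
        END { if (found) for (i = 1; i <= n; i++) print buf[i] }
    ' "$1"
}
```

Calling extract_block f on the test file shown earlier should print the block that starts with START_PATTERN and ends with END_PATTERN1. Guarding the END rule with found avoids printing a partial block when the file ends before END_PATTERN1 is seen.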
#3
0
This might work for you (GNU sed):
sed -n '/START_PATTERN/!d;:a;N;/END_PATTERN2/d;/END_PATTERN1/!ba;p;d' file
Use the -n switch to suppress automatic printing (grep-like behaviour). Start collecting lines on finding START_PATTERN. Delete the collection if END_PATTERN2 is found. On finding END_PATTERN1, print the collected lines.
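For readers who find the one-liner hard to follow, the same sed commands can be laid out one per line with comments. This is a sketch assuming GNU sed, as in the answer; the function name extract_blocks_sed is hypothetical.

```shell
# Hypothetical wrapper around the sed script above, one command per line.
# Takes the file name as $1. Assumes GNU sed.
extract_blocks_sed() {
    sed -n '
        # Skip every line until a START_PATTERN line is seen.
        /START_PATTERN/!d
        :a
        # Append the next input line to the pattern space.
        N
        # Wrong terminator: drop everything collected and start a new cycle.
        /END_PATTERN2/d
        # Not finished yet: loop back and keep collecting.
        /END_PATTERN1/!ba
        # Complete block: print it, then clear the pattern space.
        p
        d
    ' "$1"
}
```

Note that, as written, this prints every uninterrupted block in the file rather than only the first one. If only the first block is wanted, replacing the final d with q should make sed quit after printing it.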