How do I get a list of file names, and the lines in them, that contain a specific search string?

Time: 2022-09-13 08:01:19

I tried the following command on a Unix machine:

ls -l | awk '{print $9}' | xargs -I {} cat  {}  | grep {"String to search"}

This works with text files, but when I try it with XML files it does not display just the matching text. Instead, it displays the whole XML file.

I think the likely reason is the absence of newline characters in the XML files I use, so grep treats the entire file as a single line.

Example: Search string: "/1031/"

XML line containing the search string: <eventtype uri="{any_url}/1031/"/>

To clarify a bit:

ls -l | awk '{print $9}' | xargs -I {} cat {} | grep -o "/1031"

This gives the following output:

/1031
/1031
/1031...

I also want the name of the file that each match belongs to.

2 solutions

#1


4  

grep has an -o flag that outputs only the matching text instead of the whole line.

ls -l | awk '{print $9}' | xargs -I {} cat {} | grep -o {"String to search"}

From your edit, it looks like you also need the "line" that contains the URL. By default, grep's quantifiers are greedy, so on a file with no line breaks a pattern such as ".*" written to account for the XML formatting will match far more than one element and still give you the wrong result.

I can think of 2 possible options.

For the following examples, test.xml contains the string:

<eventtype uri="{www.example1.com}/1031/"/><eventtype uri="{www.example2.com}/1031/"/><eventtype uri="{www.example3.com}/1031/"/>

The first is to use grep's -P flag (available in GNU grep) to enable Perl-compatible regular expressions, which support lazy (non-greedy) matching with ".*?".

grep -Po '".*?/1031/"' test.xml 

This outputs:

"{www.example1.com}/1031/"
"{www.example2.com}/1031/"
"{www.example3.com}/1031/"
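To see the greedy behaviour described above next to a bounded match, here is a minimal sketch. Note that it swaps in a different technique from the answer: a negated character class (`[^"]*`) instead of `-P` with `.*?`, which may help where grep was built without `-P` support. The sample content and the variable names `tmp`, `greedy`, and `bounded` are made up for illustration.

```shell
# Hypothetical sample: two elements on one line, as in the question.
tmp=$(mktemp)
printf '%s\n' '<eventtype uri="{a}/1031/"/><eventtype uri="{b}/1031/"/>' > "$tmp"

# Greedy: ".*" runs to the last possible /1031/" on the line,
# so both elements come back as a single match.
greedy=$(grep -o '".*/1031/"' "$tmp")

# Negated class: [^"]* cannot cross a double quote,
# so each quoted URI is matched on its own.
bounded=$(grep -o '"[^"]*/1031/"' "$tmp")

printf 'greedy:\n%s\n\nbounded:\n%s\n' "$greedy" "$bounded"
rm -f "$tmp"
```

The bounded form relies on the value being wrapped in double quotes with no embedded quote characters, which holds for the sample in this thread but is an assumption about your real files.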

The second is to use sed to append a newline after each match and pipe the result to grep:

sed 's/1031/1031\n/g' test.xml | grep 1031

Outputs:

<eventtype uri="{www.example1.com}/1031
/"/><eventtype uri="{www.example2.com}/1031
/"/><eventtype uri="{www.example3.com}/1031

I believe both methods should also work on plain text files, although you may want to apply them only to files with an .xml extension.
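One way to apply a method conditionally by extension is sketched below. The helper name `search_dir` and its two arguments are hypothetical, and the search string is assumed to contain no regex metacharacters. For .xml files it prints only the quoted value containing the string (using a negated character class rather than `-P`); for everything else it prints the whole matching line. The `-H` flag adds the file name to every result.

```shell
# search_dir DIR NEEDLE  (hypothetical helper, not part of any tool)
# Picks the grep style from the file extension.
search_dir() {
    dir=$1
    needle=$2
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        case $f in
            *.xml)
                # XML may be one long line: print only the quoted value.
                grep -Ho "\"[^\"]*${needle}[^\"]*\"" "$f" || true ;;
            *)
                # Line-oriented text: print the whole matching line.
                grep -HF -- "$needle" "$f" || true ;;
        esac
    done
}
```

For example, `search_dir . '/1031/'` would search the regular files directly inside the current directory; the `|| true` just means a file without matches is not treated as an error.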

#2


1  

You could use the command below:

find -type f -exec grep -HPo '".*?/1031/"' {} \;

Sample output:

[root@MUM03S001 ~]# find -type f -exec grep -HPo '".*?/1031/"' {} \;
./File:"{www.example1.com}/1031/"
./File:"{www.example2.com}/1031/"
./File:"{www.example3.com}/1031/"
[root@MUM03S001 ~]#
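A variation on the command above, as a sketch under a few assumptions: it names the starting directory explicitly (some find implementations require it), restricts the search to `*.xml`, ends `-exec` with `+` so that many files are passed to one grep process, and uses a negated character class in place of `-P`. The demo tree, file names, and URIs are invented for illustration.

```shell
# Hypothetical demo tree; file names and URIs are made up.
root=$(mktemp -d)
mkdir "$root/sub"
printf '%s' '<eventtype uri="{a}/1031/"/><eventtype uri="{b}/1031/"/>' > "$root/one.xml"
printf '%s' '<eventtype uri="{c}/1031/"/>' > "$root/sub/two.xml"
printf '%s' '<eventtype uri="{d}/2042/"/>' > "$root/sub/none.xml"

# "+" batches files into one grep call; -H keeps the file name prefix
# even if a batch happens to contain a single file.
matches=$(find "$root" -type f -name '*.xml' \
    -exec grep -Ho '"[^"]*/1031/"' {} + | sort)

printf '%s\n' "$matches"
rm -rf "$root"
```

Each output line has the form `path:match`, so the file name and the matching text arrive together, which is what the question asked for.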
