This is the HTML I'm parsing:
<li id="dl_linux_32">
<a href="link">Link</a>
</li>
<li id="dl_linux_64">
<a href="another_link">Another Link</a>
</li>
With this command
curl URL 2>&1 | grep -oE 'href="([^"#]+)"' | sed "s/ /%20/g" | cut -f2 -d "="
I'm able to get all of the href values. However, I only want the href value of the anchor inside the li whose id is dl_linux_32.
Can someone help me finish the regex?
4 Solutions
#1
1
Perl One-Liner
The regex must match across multiple lines, so a Perl one-liner works well here. The -0777 switch reads the whole file as one string, and \K keeps everything before it out of the match, so the command prints href="link".
perl -0777 -ne 'print "$&\n" if /<li id="dl_linux_32">\s*<a \Khref="[^"]+"/' yourfile
#2
0
Using GNU awk:
$ awk -F'"' -v RS="</li>" '/<li\s*id=\"dl_linux_32\">/{print $4}' file
link
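The one-liner above appears to rely on features outside POSIX awk (a multi-character RS and \s) and on the href being the fourth quote-delimited field of the record. A line-by-line alternative is sketched below; it assumes the li with the wanted id is opened before its anchor, and the names extract_href_awk, html and inside are invented for the example.

```shell
# Sample input copied from the question.
html='<li id="dl_linux_32">
<a href="link">Link</a>
</li>
<li id="dl_linux_64">
<a href="another_link">Another Link</a>
</li>'

# Hypothetical helper: a small state machine that reads HTML on stdin.
extract_href_awk() {
  awk '
    # Entering the list item we care about.
    /<li[^>]*id="dl_linux_32"/ { inside = 1 }

    # First href seen while inside that item: print its value and stop.
    inside && match($0, /href="[^"]*"/) {
      # Skip the 6 characters of href=" and drop the closing quote.
      print substr($0, RSTART + 6, RLENGTH - 7)
      exit
    }

    # Leaving any list item resets the state.
    /<\/li>/ { inside = 0 }
  '
}

printf '%s\n' "$html" | extract_href_awk
```

Because it only uses match, substr, RSTART and RLENGTH, this version should not need gawk specifically.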
#3
0
The regex I was looking for is dl_linux_32.+href="([^"#]+)". It matches every href value that is preceded by dl_linux_32 and one or more characters.
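One caveat with that pattern: grep works line by line, so it only matches when the id and the href are on the same line, and because .+ is greedy it can run on to a later href once the input is flattened. The sketch below is one possible workaround, not the poster's method: it joins the lines with tr and uses negated character classes in sed so the match stops at the first anchor after the id. The names extract_href_sed and html are assumptions made for the example.

```shell
# Sample input copied from the question.
html='<li id="dl_linux_32">
<a href="link">Link</a>
</li>
<li id="dl_linux_64">
<a href="another_link">Another Link</a>
</li>'

# Hypothetical helper: reads HTML on stdin and prints one href value.
extract_href_sed() {
  # tr joins everything into a single line so the pattern can see both tags.
  # [^>]*> finishes the <li ...> tag, [^<]* skips the whitespace before <a,
  # and \1 is the href value (fragment-only links are excluded by [^"#]).
  tr '\n' ' ' |
    sed -nE 's/.*dl_linux_32"[^>]*>[^<]*<a [^>]*href="([^"#]+)".*/\1/p'
}

printf '%s\n' "$html" | extract_href_sed
```

The pattern still assumes the anchor is the first tag inside the li; anything more irregular is probably better handled by a real parser.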
#4
0
If the HTML is valid XML, you can use a tool that supports XPath queries, such as xmlstarlet:
echo '<html>
<li id="dl_linux_32">
<a href="link">Link</a>
</li>
<li id="dl_linux_64">
<a href="another_link">Another Link</a>
</li>
</html>
' | xmlstarlet sel -t -v '//li[@id="dl_linux_32"]/a/@href'
link