I need to lift the YouTube link from some text which looks like this:
[youtube=http://www.youtube.com/v/qpbAe2HyzqA&hl=en&fs=1&]
Can anyone help?
2 Solutions
#1
Try something like this:
\[youtube=(https?://[^\]]+)\]
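As a rough sketch of how that pattern might be applied from a shell, assuming a grep that supports -o (GNU and BSD grep do) and a sed that supports -E. One thing to watch: inside a POSIX bracket expression a backslash is literal, so [^\]] has to be written [^]] for these tools. The name extract_youtube_urls is only an illustrative helper, not something from the question.

```shell
# Hypothetical helper: print every URL found in [youtube=...] tags on stdin.
extract_youtube_urls() {
  # grep -o prints each matching tag on its own line;
  # sed then strips the leading "[youtube=" and the trailing "]".
  grep -oE '\[youtube=https?://[^]]+]' | sed -E 's/^\[youtube=//; s/]$//'
}

# Example run on the text from the question.
printf '%s\n' 'Intro [youtube=http://www.youtube.com/v/qpbAe2HyzqA&hl=en&fs=1&] outro' |
  extract_youtube_urls
```

Splitting the work between grep and sed means several tags on one line each produce their own output line.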
#2
You could use awk.
awk -F'[' '{ print $NF }' file_with_text > temp.txt
awk -F']' 'NF > 1 { print $(NF-1) }' temp.txt > results.txt
It is in two parts to make it clearer. The first command keeps everything after the last [ on each line, and the second keeps the text just before the last ]. Set the field separator with -F (or in a BEGIN block); writing FS="[" in front of the action assigns it only after the first line has been read, so the first line of file_with_text is typically split on whitespace instead. If you want just the URL and not the leading youtube=, run one more awk pass with that as the field separator, like -F'youtube='. The NF > 1 guard skips lines that contain no ]; without it, $(NF-1) prints the whole line when there is a single field and refers to a negative field on an empty line, which makes awk error.
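The two passes can also be collapsed into one. The following is a minimal sketch using awk's built-in match() function instead of field splitting, which is a different technique from the commands above; extract_with_awk is a hypothetical name chosen for illustration.

```shell
# Hypothetical helper: single-pass extraction with awk's match().
extract_with_awk() {
  awk '{
    line = $0
    # match() sets RSTART and RLENGTH for the leftmost "[youtube=...]" tag.
    while (match(line, /\[youtube=[^]]+]/)) {
      # Skip the 9-character "[youtube=" prefix and drop the closing "]".
      print substr(line, RSTART + 9, RLENGTH - 10)
      # Continue searching in the rest of the line.
      line = substr(line, RSTART + RLENGTH)
    }
  }'
}

extract_with_awk < file_with_text > results.txt
```

Lines without a tag produce no output, so no temporary file or guard on NF is needed.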
Edit: Removed the use of cat. It seems less clear as a teaching answer, but it is more concise.