I'm only learning to use REGEX, AWK and SED. I currently have a group of files that I'd like to rename - they all sit in one directory.
The naming pattern is consistent, but I would like to re-arrange the filenames, here is the format:
01._HORRIBLE_HISTORIES_S2.mp4
02._HORRIBLE_HISTORIES_S2.mp4
I'd like to rename them to HORRIBLE_HISTORIES_s02e01.mp4, where the e01 is gleaned from the first column. I know that I want to grab "01" from the first column, stuff it in a variable, then paste it after the S2 in each filename. At the same time I want to remove it from the beginning of the filename along with the "._". Additionally, I want to change the "S2" to "s02".
If anyone would be so kind, could you help me write something using awk/sed and explain the procedure, that I might learn from it?
5 Answers
#1
7
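# %02d zero-pads the season number portably; the 0 flag with %s is not
# honored by every awk implementation.
for f in *.mp4; do
  echo mv "$f" \
    "$(awk -F '[._]' '{ si = sprintf("%02d", substr($5,2));
         print $3 "_" $4 "_s" si "e" $1 "." $6 }' <<<"$f")"
done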
- Loops over all *.mp4 files.
- Renames each to the result of the awk command, provided via command substitution ($(...)).
- The awk command splits the input filename into tokens by "." or "_" (which makes the first token available as $1, the second as $2, ...).
- First, the number in "_S{number}" is left-padded to 2 digits with a 0 (i.e., a 0 is only prepended if the number doesn't already have 2 digits) and stored in variable si (season index); if it's OK to always prepend 0, the awk "program" can be simplified to: { print $3 "_" $4 "_s0" substr($5,2) "e" $1 "." $6 }
- The result, along with the remaining tokens, is then rearranged to form the desired filename.
Note the echo before mv to allow you to safely preview the resulting command - remove it to perform actual renaming.
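To see why the name parts land in $3 through $6, it can help to print the tokens awk produces for a single filename. The following is only a small sketch: it assumes the sample filename from the question, and the helper name show_fields is made up for illustration.

```shell
# Hypothetical helper: print each awk field of a filename on its own line.
show_fields() {
  printf '%s\n' "$1" |
    awk -F '[._]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

show_fields '01._HORRIBLE_HISTORIES_S2.mp4'
```

The second field comes out empty because the "." and "_" after the episode number are two adjacent separators, which is why the title starts at $3 rather than $2.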
Alternative: a pure bash solution using a regular expression:
for f in *.mp4; do
  # Skip any file whose name doesn't fit the expected pattern.
  [[ $f =~ ^([0-9]+)\._([^.]+)_S([^.]+)\.(.+)$ ]] || continue
  echo mv "$f" \
    "${BASH_REMATCH[2]}_s0${BASH_REMATCH[3]}e${BASH_REMATCH[1]}.${BASH_REMATCH[4]}"
done
- Uses bash's regular-expression matching operator, =~, with capture groups (the substrings in (...)) to match against each filename and extract substrings of interest.
- The matching results are stored in the special array variable $BASH_REMATCH, with element 0 containing the entire match, 1 containing what matches the first capture group, 2 the second, and so on.
- The mv command's target argument then assembles the capture-group matches in the desired order; note that in this case, for simplicity, I've made the zero-padding of s{number} unconditional - a 0 is simply prepended.
As above, you need to remove echo before mv to perform actual renaming.
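Because the loop above prepends 0 unconditionally, a two-digit season such as S12 would come out as s012. One possible way to pad only when needed is printf's %02d format. This is a sketch rather than part of the original answer; the helper name pad2 is hypothetical, and it assumes the season is made of digits only.

```shell
# Hypothetical helper: left-pad a number to two digits.
# Leading zeros are stripped first so that values like "08" are not
# rejected as invalid octal by printf.
pad2() {
  n=${1#"${1%%[!0]*}"}     # drop any leading zeros
  printf '%02d' "${n:-0}"  # an all-zero input collapses to 0
}

for season in 2 08 12; do
  printf 's%s\n' "$(pad2 "$season")"
done
```

In the loop, the s0${BASH_REMATCH[3]} part of the target name could then be written as s$(pad2 "${BASH_REMATCH[3]}").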
#2
8
A common way of renaming multiple files according to a pattern is to use the Perl command rename. It uses Perl regular expressions and is very powerful. Use -n -v to test the pattern without touching the files:
$ rename -n -v 's/^(\d+)\._(.+)_S2\.mp4/$2_s02e$1.mp4/' *.mp4
01._HORRIBLE_HISTORIES_S2.mp4 renamed as HORRIBLE_HISTORIES_s02e01.mp4
02._HORRIBLE_HISTORIES_S2.mp4 renamed as HORRIBLE_HISTORIES_s02e02.mp4
Use parentheses to capture strings into variables $1 (first capture), $2 (second capture) etc:
- ^(\d+) captures the digits at the beginning of the filename (into $1)
- \._(.+)_S2\.mp4 captures everything between ._ and _S2.mp4 (into $2); the dots are escaped so that they match a literal "." rather than any character
- $2_s02e$1.mp4 assembles your new filename with the captured data as you want it
When you are happy with the result, remove -n from the command and it will rename all the files for real.
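rename is often available by default on Linux, but be aware that two different tools go by that name: the util-linux version takes plain from/to strings and does not understand Perl expressions, while the Perl-based version used here is, depending on the distribution, packaged as rename, prename or perl-rename. There is a similar discussion here on SO with more details about finding/installing the right command.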
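Where the Perl rename is not installed, the same substitution idea can be sketched with sed, which the question asked about. This is only an illustration under the assumption that every file follows the NN._TITLE_SN.mp4 layout; the helper name new_name_sed is made up.

```shell
# Hypothetical helper: compute the new name for one file with sed.
# -E enables extended regular expressions (supported by GNU and BSD sed).
new_name_sed() {
  printf '%s\n' "$1" |
    sed -E 's/^([0-9]+)\._(.+)_S([0-9]+)\.mp4$/\2_s0\3e\1.mp4/'
}

# Dry run: print the mv commands instead of executing them.
for f in *.mp4; do
  [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
  echo mv "$f" "$(new_name_sed "$f")"
done
```

A name that does not match the pattern passes through sed unchanged, so the preview would show the same name on both sides of mv, which is an easy way to spot files that would not be renamed.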
#3
1
You can do it with almost pure bash (using substring expansion):
for f in *.mp4; do
  # Offsets assume the exact layout 01._HORRIBLE_HISTORIES_S2.mp4:
  # characters 0-1 are the episode, the 18-character title starts at offset 4.
  newfilename="${f:4:18}_s02e${f:0:2}.mp4"
  echo mv "$f" "$newfilename"
done
If the output from this command suits your needs, you may remove the echo from the loop, or more simply (if your last command was the above) issue: !! | bash
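Fixed offsets only work while every title has the same length. A variation that avoids counting characters is to cut the name apart with pattern-based parameter expansion. The sketch below is an assumption-laden illustration, not part of the original answer: the helper name new_name_expand is hypothetical, and it expects the NN._TITLE_SN.ext layout from the question.

```shell
# Hypothetical helper: rebuild the name using pattern-based expansions.
new_name_expand() {
  f=$1
  ep=${f%%.*}          # text before the first "."      -> 01
  rest=${f#*._}        # drop through the first "._"    -> HORRIBLE_HISTORIES_S2.mp4
  ext=${rest##*.}      # text after the last "."        -> mp4
  stem=${rest%.*}      # drop the extension             -> HORRIBLE_HISTORIES_S2
  title=${stem%_S*}    # drop the trailing "_S<number>" -> HORRIBLE_HISTORIES
  season=${stem##*_S}  # text after the last "_S"       -> 2
  printf '%s_s0%se%s.%s\n' "$title" "$season" "$ep" "$ext"
}

new_name_expand '01._HORRIBLE_HISTORIES_S2.mp4'
```

These expansions are plain POSIX shell, so the helper does not depend on bash's ${var:offset:length} syntax.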
#4
0
Put the filenames into a text file, then use a loop and awk to rename the files.
while read -r oldname; do
  # First awk: move the episode number to the end -> HORRIBLE_HISTORIES_S2_e01.mp4
  # Second awk: rejoin the title and turn "S2" + "e01.mp4" into "s02e01.mp4"
  newname=$(awk -F'.' '{ print substr($2, 2) "_e" $1 "." $3 }' <<< "${oldname}" | \
    awk -F'_' '{ print $1 "_" $2 "_s0" substr($3, 2) $4 }');
  mv "${oldname}" "${newname}";
done<input.txt
#5
0
If you're willing to use gawk, the regex matching really comes in handy. I find this pipe-based solution a little nicer than worrying about looping constructs.
ls -1 | \
gawk 'match($0, /.../, a) { printf ... | "sh" } \
END { close("sh") }'
For ease of reading I've replaced the regex and the mv command with ellipses.
- Line 1 lists all the file names in the current directory, one per line, and pipes that to the gawk command.
- Line 2 runs the regex match, assigning captured groups to the array variable a. The action converts this into our desired command with printf, which is itself piped to sh to execute.
- Line 3 closes the shell that was implicitly opened when we started piping things to it.
So then you just fill in your regex and command syntax (borrowing from mklement0). For example (LIVE CODE WARNING):
ls -1 | \
gawk 'match($0, /^([0-9]+)\._([^.]+)_S([^.]+)\.(.+)$/, a) { printf "mv %s %s_s0%se%s.%s\n",a[0],a[2],a[3],a[1],a[4] | "sh" } \
END { close("sh") }'
To preview that command (as you should) you can simply remove the | "sh" from the second line.