Extracting a date from a file name in Unix using a shell script

Time: 2022-01-29 00:15:20

I am working on a shell script and want to extract the date from a file name.

The file name is: abcd_2014-05-20.tar.gz

I want to extract the date from it: 2014-05-20

6 solutions

#1


14  

echo abcd_2014-05-20.tar.gz | grep -Eo '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}'

Output:

2014-05-20

grep reads its input from echo through standard input; if the strings are stored in a file, grep can read that file directly (or through cat).

-E Interpret PATTERN as an extended regular expression.

-o Print only the part of a matching line that matches PATTERN.

[[:digit:]] Matches a single digit.

{N} Matches the preceding item exactly N times: 4 digits for the year, 2 each for the month and the day.

Most importantly, the pattern does not depend on separators such as "_" and ".", which is why this is the most flexible solution.
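To reuse this in a script, the pipeline can be wrapped in a small function. This is only a sketch: the name extract_date is made up for illustration, and it assumes a grep that supports -o (GNU and BSD grep do, but the option is not required by POSIX).

```shell
# Hypothetical helper: print the first YYYY-MM-DD date found in the argument.
# Assumes grep supports -o (true for GNU and BSD grep).
extract_date() {
    printf '%s\n' "$1" |
        grep -Eo '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}' |
        head -n 1    # keep only the first match if the name holds several
}

extract_date abcd_2014-05-20.tar.gz    # prints 2014-05-20
```

The result can be captured with d=$(extract_date "$file"); the output is empty when the name contains no date.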

#2


9  

Using awk with a custom field separator, this is quite simple:

echo 'abcd_2014-05-20.tar.gz' | awk -F '[_.]' '{print $2}'
2014-05-20
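Note that the field-separator approach relies on the date being the second field, so a name with an extra "_" or "." before the date would give a different result. A possible alternative, sketched here with a made-up function name awk_date, is to let awk search for the date pattern with match(). The digits are spelled out one by one because some awk implementations do not support {N} repetition.

```shell
# Hypothetical helper: print the first YYYY-MM-DD date in the argument,
# wherever it appears in the name.
awk_date() {
    printf '%s\n' "$1" |
        awk 'match($0, /[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/) {
                 # RSTART and RLENGTH are set by match()
                 print substr($0, RSTART, RLENGTH)
             }'
}

awk_date my_backup_2014-05-20.tar.gz    # prints 2014-05-20
```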

#3


5  

Use grep:

$ ls -1 abcd_2014-05-20.tar.gz | grep -oP '[\d]+-[\d]+-[\d]+'
2014-05-20
  • -o causes grep to print only the matching part

  • -P interprets the pattern as a Perl-compatible regular expression (a GNU grep option)

  • [\d]+-[\d]+-[\d]+ matches three runs of one or more digits separated by dashes, which is the shape of your date.

#4


1  

I would use a regular expression with the "grep" command, depending on how your file names are built.

If the date always follows a "_" character, I would use something like this:

ls -l | grep '_[REGEXP]'

where [REGEXP] is a placeholder for a regular expression that matches your date format.

Take a look here: http://www.linuxnix.com/2011/07/regular-expressions-linux-i.html
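For the date format in the question, a concrete version might look like the sketch below. It swaps the ls | grep pipeline for a shell glob, which avoids parsing ls output; the function name list_dated_archives and the directory argument are assumptions made for illustration.

```shell
# Hypothetical helper: list files in a directory whose names end in
# _YYYY-MM-DD.tar.gz, using a glob instead of ls | grep.
list_dated_archives() {
    dir=$1
    for path in "$dir"/*_[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].tar.gz; do
        [ -e "$path" ] || continue      # the glob matched nothing
        printf '%s\n' "${path##*/}"     # print the name without the directory
    done
}

list_dated_archives .
```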

#5


1  

There are multiple ways to do it:

echo abcd_2014-05-20.tar.gz | sed -n 's/.*_\(.*\)\.tar\.gz$/\1/p'

sed extracts the date and prints it.

Another way:

filename=abcd_2014-05-20.tar.gz
temp=${filename#*_}
date=${temp%.tar.gz}

Here temp holds the part of the file name after the first "_", i.e. 2014-05-20.tar.gz. You can then extract the date by removing .tar.gz from the end.
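The parameter-expansion approach needs no external commands, but it returns whatever sits between the first "_" and .tar.gz, date or not. A minimal sketch that adds a shape check with case is shown below; the function name date_from_name and its variable names are made up for this example.

```shell
# Hypothetical helper: extract the date with parameter expansion only and
# fail (status 1) when the extracted text does not look like YYYY-MM-DD.
date_from_name() {
    name=${1##*/}                # drop any leading directory
    rest=${name#*_}              # text after the first "_"
    candidate=${rest%.tar.gz}    # strip the extension
    case $candidate in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
            printf '%s\n' "$candidate" ;;
        *)
            return 1 ;;
    esac
}

date_from_name /backups/abcd_2014-05-20.tar.gz    # prints 2014-05-20
```

Because the function returns a non-zero status on a mismatch, it can be used directly in a condition: if d=$(date_from_name "$f"); then ... fi.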

#6


0  

Here are a few more examples:

  1. Using the cut command (like awk, cut keeps the command readable)

echo "abcd_2014-05-20.tar.gz" | cut -d "_" -f2 | cut -d "." -f1

Output is:

2014-05-20
  2. Using the grep command

echo "abcd_2014-05-20.tar.gz" | grep -Eo "[0-9]{4}-[0-9]{2}-[0-9]{2}"

Output is:

2014-05-20

Another advantage of the grep approach is that it can also extract multiple dates, like this:

echo "ab2014-15-12_cd_2014-05-20.tar.gz" | grep -Eo "[0-9]{4}-[0-9]{2}-[0-9]{2}"

Output is:

2014-15-12
2014-05-20
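The first match above, 2014-15-12, has the right shape but month 15 is not a real month, because the pattern only counts digits. If that matters, the pattern can be tightened to restrict the month to 01-12 and the day to 01-31, as in this sketch (the function name valid_dates is made up; it still accepts impossible combinations such as 2014-02-31):

```shell
# Hypothetical helper: print every date-shaped substring whose month is
# 01-12 and whose day is 01-31. Assumes grep supports -o.
valid_dates() {
    printf '%s\n' "$1" |
        grep -Eo '[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])'
}

valid_dates "ab2014-15-12_cd_2014-05-20.tar.gz"    # prints 2014-05-20 only
```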
