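The tr Command Explained
What is the tr command? tr is short for translate. The dictionary entry for translate reads:
[trænsˈleit]
vi. to translate; to be translatable
vt. to translate, interpret, convert, transform, transfer
The sense that applies here is "convert" or "transform". On Linux, run tr --help to see the usage summary: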
amosli@amosli-pc:~$ tr --help
Usage: tr [OPTION]... SET1 [SET2]
Translate, squeeze, and/or delete characters from standard input,
writing to standard output.

  -c, -C, --complement    use the complement of SET1
  -d, --delete            delete characters in SET1, do not translate
  -s, --squeeze-repeats   replace each input sequence of a repeated character
                            that is listed in SET1 with a single occurrence
                            of that character
  -t, --truncate-set1     first truncate SET1 to length of SET2
      --help     display this help and exit
      --version  output version information and exit

SETs are specified as strings of characters.  Most represent themselves.
Interpreted sequences are:

  \NNN            character with octal value NNN (1 to 3 octal digits)
  \\              backslash
  \a              audible BEL
  \b              backspace
  \f              form feed
  \n              new line
  \r              return
  \t              horizontal tab
  \v              vertical tab
  CHAR1-CHAR2     all characters from CHAR1 to CHAR2 in ascending order
  [CHAR*]         in SET2, copies of CHAR until length of SET1
  [CHAR*REPEAT]   REPEAT copies of CHAR, REPEAT octal if starting with 0
  [:alnum:]       all letters and digits
  [:alpha:]       all letters
  [:blank:]       all horizontal whitespace
  [:cntrl:]       all control characters
  [:digit:]       all digits
  [:graph:]       all printable characters, not including space
  [:lower:]       all lower case letters
  [:print:]       all printable characters, including space
  [:punct:]       all punctuation characters
  [:space:]       all horizontal or vertical whitespace
  [:upper:]       all upper case letters
  [:xdigit:]      all hexadecimal digits
  [=CHAR=]        all characters which are equivalent to CHAR

Translation occurs if -d is not given and both SET1 and SET2 appear.
-t may be used only when translating.  SET2 is extended to length of
SET1 by repeating its last character as necessary.  Excess characters
of SET2 are ignored.  Only [:lower:] and [:upper:] are guaranteed to
expand in ascending order; used in SET2 while translating, they may
only be used in pairs to specify case conversion.  -s uses SET1 if not
translating nor deleting; else squeezing uses SET2 and occurs after
translation or deletion.
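That is a lot to take in at once, so here is a condensed summary with a few notes:
tr [OPTION]... SET1 [SET2]
Options:
  -c, -C, --complement    use the complement of SET1, that is, every character not listed in SET1
  -d, --delete            delete the characters in SET1 instead of translating them
  -s, --squeeze-repeats   squeeze each run of a repeated character into a single occurrence of that character
  -t, --truncate-set1     first truncate SET1 to the length of SET2
Characters and ranges that may appear in a set:
  \NNN            character with octal value NNN (1 to 3 octal digits)
  \\              backslash
  \a              Ctrl-G  bell
  \b              Ctrl-H  backspace
  \f              Ctrl-L  form feed
  \n              Ctrl-J  new line
  \r              Ctrl-M  carriage return
  \t              Ctrl-I  horizontal tab
  \v              Ctrl-K  vertical tab
  CHAR1-CHAR2     all characters from CHAR1 to CHAR2, in ascending ASCII order
  [CHAR*]         in SET2, copies of CHAR until the length of SET1 is reached
  [CHAR*REPEAT]   REPEAT copies of CHAR; REPEAT is read as octal if it starts with 0
  [:alnum:]       all letters and digits
  [:alpha:]       all letters
  [:blank:]       horizontal whitespace (space and tab)
  [:cntrl:]       all control characters
  [:digit:]       all digits
  [:graph:]       all printable characters, not including space
  [:lower:]       all lower case letters
  [:print:]       all printable characters, including space
  [:punct:]       all punctuation characters
  [:space:]       all horizontal or vertical whitespace
  [:upper:]       all upper case letters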
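The help text should give a rough idea of what tr does. tr is an elegant little tool in the UNIX command-line expert's toolbox, and it is often used to write neat one-liners. Its main job is to map characters read from standard input from set1 to set2 and write the result to stdout (standard output). set1 and set2 are character classes or character sets. If set2 is shorter than set1, set2 is extended by repeating its last character until it is as long as set1. If set2 is longer than set1, the characters of set2 beyond the length of set1 are ignored.
The invocation format, as shown above, is: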
tr [OPTION]... SET1 [SET2]
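The two length rules described above can be checked directly. The following is a minimal sketch, assuming GNU tr (the padding of a short SET2 is what the help text above documents; other implementations may differ). The function names pad_demo and excess_demo are made up for illustration.

```shell
# SET2 ('xy') is shorter than SET1 ('a-e'): GNU tr pads SET2 with its
# last character, so a maps to x, and b, c, d, e all map to y.
pad_demo() {
    printf '%s\n' 'abcde' | tr 'a-e' 'xy'
}

# SET2 ('xyzw') is longer than SET1 ('abc'): the extra 'w' is never used.
excess_demo() {
    printf '%s\n' 'abc' | tr 'abc' 'xyzw'
}

pad_demo      # prints xyyyy
excess_demo   # prints xyz
```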
Practical use 1, case conversion:
amosli@amosli-pc:~$ echo "HI_AMOS" | tr "A-Z" 'a-z'
hi_amos
"a-z" and "A-Z" are both sets. A range set is written very simply as "first character-last character".
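Practical use 2, encryption and decryption:
amosli@amosli-pc:~$ echo 12345 | tr '0-9' '9876543210'   # encrypt
87654
amosli@amosli-pc:~$ echo 87654 | tr '9876543210' '0-9'   # decrypt
12345
This is a fun little example: a simple character mapping gives a toy encryption and decryption scheme. SET2 lists all ten digits so that every digit has a unique counterpart and the mapping can be reversed. Once this example is clear, the next step is ROT13, a variant of the Caesar cipher invented in ancient Rome:
amosli@amosli-pc:~$ echo "hi,this is amosli" | tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' 'NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm'
uv,guvf vf nzbfyv
amosli@amosli-pc:~$ echo "uv,guvf vf nzbfyv" | tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' 'NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm'
hi,this is amosli
ROT13 is its own inverse: each letter is shifted by 13 places, and shifting twice covers all 26 letters and returns to the start. To undo ROT13, apply the same transformation again, so one command serves for both encryption and decryption. Quite neat!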
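The same ROT13 mapping can be written more compactly with ranges instead of spelling out both alphabets. This is a small sketch; the function name rot13 is a hypothetical helper, not something tr provides.

```shell
# rot13: read text on stdin and write its ROT13 transform to stdout.
# SET1 is A-Z followed by a-z (52 characters); SET2 is the same letters
# rotated by 13 places: N-Z, A-M, n-z, a-m.
rot13() {
    tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

echo "hi,this is amosli" | rot13           # prints uv,guvf vf nzbfyv
echo "hi,this is amosli" | rot13 | rot13   # prints hi,this is amosli
```

Characters outside the two ranges, such as the comma and the spaces, pass through unchanged.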
Practical use 3, deleting characters:
ls | tr -d '\n'   # delete newlines (joins all the output into one line)
amosli@amosli-pc:~$ echo "hello 132 world 56 " | tr -d '0-9'   # delete digits
hello  world
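Another common use of -d is stripping carriage returns from text that came from a system using CRLF line endings. The sketch below is only an illustration; strip_cr is a made-up helper name, and the sample input is generated with printf rather than read from a real file.

```shell
# strip_cr: delete every carriage return (\r) from stdin,
# turning CRLF line endings into plain LF.
strip_cr() {
    tr -d '\r'
}

# Two CRLF-terminated lines go in, two LF-terminated lines come out.
printf 'line one\r\nline two\r\n' | strip_cr
```

Note that this removes every \r in the input, not only those directly before a newline.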
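Practical use 4, complementing a character set:
tr -c [set1] [set2]
The complement of set1 is the set of all characters that are not in set1.
The most typical use is to delete from the input every character that is not in set1, that is, every character in the complement. For example:
amosli@amosli-pc:~$ echo "hello 123 world " | tr -d -c '0-9 \n'
 123
Here the complement contains every character other than the digits, the space character, and the newline. Because -d is given, all of those characters are deleted. The spaces around 123 survive because the space character is part of set1.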
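The -c option also works when translating, not only with -d. A minimal sketch, assuming GNU tr: every character outside the letters and digits is turned into a newline, and -s squeezes runs of those newlines, which splits the input into one word per line. The helper name words is hypothetical.

```shell
# words: print each run of letters/digits from stdin on its own line.
# -c takes the complement of 'A-Za-z0-9' (everything that is not a letter
# or digit), maps it to '\n', and -s collapses consecutive newlines.
words() {
    tr -cs 'A-Za-z0-9' '\n'
}

echo "hello, world! tr-is_fun 42" | words
```

With the sample input this prints hello, world, tr, is, fun, and 42, each on a separate line.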
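Practical use 5, squeezing characters with tr:
amosli@amosli-pc:~$ echo "GNU is     not    UNIX .  Recursive   right?" | tr -s ' '
GNU is not UNIX . Recursive right?
The general form is tr -s '[set]'.
The -s option squeezes each run of a repeated character from the set into a single occurrence of that character.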
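Squeezing is not limited to spaces. As a small sketch, squeezing the newline character removes blank lines, because a blank line is simply two or more newlines in a row. The function name squeeze_blank_lines is made up for this illustration.

```shell
# squeeze_blank_lines: collapse every run of newlines into one newline,
# which drops empty lines from the input.
squeeze_blank_lines() {
    tr -s '\n'
}

# Prints a, b and c on three consecutive lines.
printf 'a\n\n\nb\n\nc\n' | squeeze_blank_lines
```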
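Here is another example:
amosli@amosli-pc:~/learn$ cat sum.txt
5
4
3
5
4
3
amosli@amosli-pc:~/learn$ cat sum.txt | echo $[ $(tr '\n' '+' ) 0 ]
24
amosli@amosli-pc:~/learn$ cat sum.txt | echo $[ $(tr '\n' '+' ) ]
bash: 5+4+3+5+4+3+ : syntax error: operand expected (error token is "+ ")
Here tr is used to perform addition. tr '\n' '+' replaces every newline with '+', which joins the numbers into 5+4+3+5+4+3+. The trailing '+' would leave the expression incomplete, as the second command shows, so a 0 is appended to make it a valid sum.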
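The $[ ... ] form used above is an older bash syntax for arithmetic; the bash manual marks it as deprecated in favor of $(( ... )). The following sketch wraps the same trick in the newer form. sum_lines is a hypothetical helper, and it assumes the input is one integer per line with a final newline and no blank lines.

```shell
# sum_lines: add up the integers on stdin, one per line.
# tr turns "5\n4\n3\n" into "5+4+3+", and the appended 0 completes
# the expression so that it evaluates as 5+4+3+0.
sum_lines() {
    echo $(( $(tr '\n' '+') 0 ))
}

printf '5\n4\n3\n5\n4\n3\n' | sum_lines   # prints 24
```

If the last line lacks a trailing newline, the expression ends in "3 0" rather than "3+ 0" and the arithmetic expansion fails, so the input format matters here.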
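Practical use 6, character classes:
tr can take character classes in the same way as sets. The available classes were listed above:
......
[:digit:] all digits
[:graph:] all printable characters, not including space
[:lower:] all lower case letters
[:print:] all printable characters, including space
[:punct:] all punctuation characters
[:space:] all horizontal or vertical whitespace
[:upper:] all upper case letters
.......
An example: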
amosli@amosli-pc:~/learn$ echo amosli | tr '[:lower:]' '[:upper:]'
AMOSLI
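Character classes combine naturally with -d and -s. The sketch below, assuming GNU tr, strips punctuation and then turns every run of whitespace (spaces, tabs, newlines) into a single space. The name normalize is a hypothetical helper chosen for this illustration.

```shell
# normalize: remove punctuation, then map all whitespace to a space
# and squeeze runs of spaces into one.
# In the second tr, SET2 is the single character ' ', which GNU tr
# repeats to cover every member of [:space:].
normalize() {
    tr -d '[:punct:]' | tr -s '[:space:]' ' '
}

printf 'Hello,   world!\tHow\n\nare you?\n' | normalize
```

Because the final newline is also whitespace, the output ends with a space instead of a newline: "Hello world How are you ".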