awk (gawk): a pattern scanning and processing language
Basic usage: awk [options] 'program' FILE
program: PATTERN {action statements}
Statements are separated by semicolons.
Output commands: print, printf
[root@makeISO ~]# cat /etc/fstab |grep '^[^#]'|awk '{print $1,$2}'
/dev/mapper/vg_makeiso-lv_root /
UUID=e20786ee-7475-4afd-b28b-0585dcb3c1a7 /boot
/dev/mapper/vg_makeiso-lv_swap swap
tmpfs /dev/shm
devpts /dev/pts
sysfs /sys
proc /proc
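Options:
-F: sets the input field separator; by default, fields are separated by runs of whitespace (spaces and tabs)
-v var=value: defines a custom variable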
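A minimal sketch of the two options together, assuming a POSIX-compatible awk. The sample records and the variable name label are made up for illustration and are not taken from a real system:

```shell
# Two invented records in /etc/passwd layout (illustrative data only).
sample='alice:x:1001:1001::/home/alice:/bin/bash
bob:x:1002:1002::/home/bob:/bin/sh'

# -F: splits each record on ":"; -v sets the awk variable "label" before the program runs.
out=$(printf '%s\n' "$sample" | awk -F: -v label='shell' '{print $1, label, $7}')
printf '%s\n' "$out"
```

Feeding the data through a pipe keeps the sketch independent of the files on any particular machine.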
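1. print
print item1,item2
Key points:
1. Items are separated by commas; in the output they are joined by OFS (a single space by default)
2. Each item can be a string or a number: a field of the current record, a variable, or an awk expression
3. If no item is given, print outputs $0, i.e. the entire current record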
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{print }'
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{print $0}'
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{print $1 $2}'
rootx
binx
daemonx
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{print $1 "\t" $2}'
root	x
bin	x
daemon	x
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{print $1 , $2}'
root x
bin x
daemon x
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{print $1 "---" $2}'
root---x
bin---x
daemon---x
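2. Variables
2.1 Built-in variables
FS: input field separator, whitespace by default (-v FS=":")
OFS: output field separator, a single space by default (-v OFS="-")
RS: input record separator, a newline by default
ORS: output record separator, a newline by default
NF: number of fields in the current record    awk '{print NF}' file
NR: number of records read so far, counted across all input files
FNR: record number within the current file; the count restarts for each file
FILENAME: name of the current input file
ARGC: number of command-line arguments
ARGV: array holding the command-line arguments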
[root@makeISO ~]# head -3 /etc/passwd |awk -v FS=: -v OFS=- '{print $1,$2}'
root-x
bin-x
daemon-x
[root@makeISO ~]# head -2 /etc/passwd |awk -v RS=: '{print $0}'
root
x
0
0
root
/root
/bin/bash
bin
x
1
1
bin
/bin
/sbin/nologin

[root@makeISO ~]# head -2 /etc/passwd |awk -v RS=: '{print $0}' |awk -v ORS=# '{print $0}'
root#x#0#0#root#/root#/bin/bash#bin#x#1#1#bin#/bin#/sbin/nologin##[root@makeISO ~]#
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{print NF}'
7
7
7
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{print $NF}'
/bin/bash
/sbin/nologin
/sbin/nologin
[root@makeISO ~]# cat file1
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
[root@makeISO ~]# cat file2
pulse:x:497:496:PulseAudio System Daemon:/var/run/pulse:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
[root@makeISO ~]# awk -F: '{print FILENAME,NR,$1}' file1 file2
file1 1 root
file1 2 bin
file1 3 daemon
file2 4 pulse
file2 5 sshd
file2 6 tcpdump
[root@makeISO ~]# awk -F: '{print FILENAME,FNR,$1}' file1 file2
file1 1 root
file1 2 bin
file1 3 daemon
file2 1 pulse
file2 2 sshd
file2 3 tcpdump
[root@makeISO ~]# awk -F: '{print ARGC}' file1 file2
3
3
3
3
3
3
[root@makeISO ~]# awk -F: '{print ARGV[0]}' file1 file2
awk
awk
awk
awk
awk
awk
[root@makeISO ~]# awk -F: '{print ARGV[1]}' file1 file2
file1
file1
file1
file1
file1
file1
[root@makeISO ~]# awk -F: '{print ARGV[2]}' file1 file2
file2
file2
file2
file2
file2
file2
[root@makeISO ~]# awk -F: 'BEGIN{print ARGC}' file1 file2
3
[root@makeISO ~]# awk -F: 'BEGIN{print ARGV[0]}' file1 file2
awk
[root@makeISO ~]# awk -F: 'BEGIN{print ARGV[1]}' file1 file2
file1
[root@makeISO ~]# awk -F: 'BEGIN{print ARGV[2]}' file1 file2
file2
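One detail about OFS that is easy to miss: it is applied when print joins comma-separated items, or when awk rebuilds $0 after a field assignment. A bare print of an unmodified record keeps the original separators. The sketch below assumes a POSIX-compatible awk; the no-op assignment $1 = $1 is a common idiom for forcing the rebuild, not something the notes above prescribe:

```shell
line='root:x:0:0:root:/root:/bin/bash'

# Assigning to a field makes awk rebuild $0, joining the fields with OFS.
rebuilt=$(printf '%s\n' "$line" | awk -F: -v OFS='-' '{$1 = $1; print}')

# Without any field assignment, $0 is printed exactly as it was read.
untouched=$(printf '%s\n' "$line" | awk -F: -v OFS='-' '{print}')

printf '%s\n%s\n' "$rebuilt" "$untouched"
```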
2.2 Custom variables
(1) -v var=value
Variable names are case-sensitive
(2) Defined directly in the program
awk '{var="value" ; print var}' file
[root@makeISO ~]# awk -v test='this is a test !!!' '{print test}' file1
this is a test !!!
this is a test !!!
this is a test !!!
[root@makeISO ~]# awk -v test='this is a test !!!' 'BEGIN{print test}' file1
this is a test !!!
[root@makeISO ~]# awk '{test="this is a test \!\!\!"; print test }' file1
awk: warning: escape sequence `\!' treated as plain `!'
this is a test !!!
this is a test !!!
this is a test !!!
[root@makeISO ~]# awk 'BEGIN{test="this is a test \!\!\!"; print test }' file1
awk: warning: escape sequence `\!' treated as plain `!'
this is a test !!!
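A common use of -v is handing a shell variable to the awk program. A small sketch, assuming a POSIX-compatible awk; the names threshold and min and the two records are invented for illustration:

```shell
threshold=1000    # hypothetical cutoff held in a shell variable

# -v copies the shell value into the awk variable "min" before any input is read.
out=$(printf 'root:x:0\nalice:x:1001\n' | awk -F: -v min="$threshold" '$3 >= min {print $1}')
printf '%s\n' "$out"
```

Passing the value with -v avoids splicing shell text into the awk program itself, which keeps the quoting simple.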
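3. printf: formatted output
printf FORMAT,item1,item2
(1) FORMAT is required
(2) No newline is added automatically; write \n explicitly where a line break is needed
(3) FORMAT needs one format specifier for each item that follows
Format specifiers:
%c prints a single character (a numeric argument is treated as a character code)
%d,%i decimal integer
%e,%E number in scientific notation
%f floating-point number
%g,%G scientific or floating-point notation, whichever is shorter
%s string
%u unsigned integer
%% a literal %
Modifiers:
%s -->> %#[.#]s
%#[.#]s the first number sets the display width; the second is the precision (digits after the decimal point for %f, maximum number of characters for %s)
%-#[.#]s left-aligns the content within the fixed width; the default is right alignment
Note: when the width is smaller than the content, columns no longer line up, but no content is dropped
%d -->> %+d
Negative numbers always show -; + forces a sign on positive numbers as well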
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "User"NR " is:%s\n",$1}'
User1 is:root
User2 is:bin
User3 is:daemon
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%s \t UID : %d \n",$1,$3}'
username :root 	 UID : 0
username :bin 	 UID : 1
username :daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%1s \t UID : %d \n",$1,$3}'
username :root 	 UID : 0
username :bin 	 UID : 1
username :daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%2s \t UID : %d \n",$1,$3}'
username :root 	 UID : 0
username :bin 	 UID : 1
username :daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%3s \t UID : %d \n",$1,$3}'
username :root 	 UID : 0
username :bin 	 UID : 1
username :daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%4s \t UID : %d \n",$1,$3}'
username :root 	 UID : 0
username : bin 	 UID : 1
username :daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%5s \t UID : %d \n",$1,$3}'
username : root 	 UID : 0
username :  bin 	 UID : 1
username :daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%6s \t UID : %d \n",$1,$3}'
username :  root 	 UID : 0
username :   bin 	 UID : 1
username :daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%7s \t UID : %d \n",$1,$3}'
username :   root 	 UID : 0
username :    bin 	 UID : 1
username : daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%8s \t UID : %d \n",$1,$3}'
username :    root 	 UID : 0
username :     bin 	 UID : 1
username :  daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%9s \t UID : %d \n",$1,$3}'
username :     root 	 UID : 0
username :      bin 	 UID : 1
username :   daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%-9s \t UID : %d \n",$1,$3}'
username :root      	 UID : 0
username :bin       	 UID : 1
username :daemon    	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%-1s \t UID : %d \n",$1,$3}'
username :root 	 UID : 0
username :bin 	 UID : 1
username :daemon 	 UID : 2
[root@makeISO ~]# head -3 /etc/passwd |awk -F: '{printf "username :%-8s \t UID : %+d \n",$1,$3}'
username :root     	 UID : +0
username :bin      	 UID : +1
username :daemon   	 UID : +2
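The modifiers are easier to compare side by side when the padding is made visible. A compact sketch, assuming a POSIX-compatible awk; the square brackets are only there to mark where each field starts and ends:

```shell
out=$(awk 'BEGIN {
    printf "[%8s]\n", "root"        # right-aligned in 8 columns (the default)
    printf "[%-8s]\n", "root"       # left-aligned in 8 columns
    printf "[%.2s]\n", "root"       # precision on %s keeps at most 2 characters
    printf "[%2s]\n", "daemon"      # width smaller than content: nothing is truncated
    printf "[%6.2f]\n", 3.14159     # width 6, 2 digits after the decimal point
    printf "[%+d] [%+d]\n", 5, -5   # + forces a sign on positive numbers
    printf "[%c]\n", 65             # numeric argument: the character with that code
}')
printf '%s\n' "$out"
```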
4. Operators
Arithmetic operators
x+y
x-y
x*y
x/y
x^y
x%y
-x
+x converts x to a number
String operator: concatenation has no symbol; adjacent strings are simply joined
Assignment operators:
=
+=
-=
*=
/=
%=
^=
++
--
Comparison operators
>
>=
<
<=
!=
==
Pattern-matching operators
~ matches
!~ does not match
Logical operators:
&&
||
!
Function calls
function_name (argu1,argu2,...)
Conditional expression
selector?if-true-expression:if-false-expression
[root@makeISO ~]# awk -F: '{$3>500?tt="oth user":tt="sys user";printf "%15s:%5d:\t%s\n",$1,$3,tt}' /etc/passwd
           root:    0:	sys user
            bin:    1:	sys user
         daemon:    2:	sys user
            adm:    3:	sys user
             lp:    4:	sys user
           sync:    5:	sys user
       shutdown:    6:	sys user
           halt:    7:	sys user
           mail:    8:	sys user
           uucp:   10:	sys user
       operator:   11:	sys user
          games:   12:	sys user
         gopher:   13:	sys user
            ftp:   14:	sys user
         nobody:   99:	sys user
           dbus:   81:	sys user
        usbmuxd:  113:	sys user
           vcsa:   69:	sys user
            rpc:   32:	sys user
          rtkit:  499:	sys user
  avahi-autoipd:  170:	sys user
           abrt:  173:	sys user
        rpcuser:   29:	sys user
      nfsnobody:65534:	oth user
      haldaemon:   68:	sys user
            gdm:   42:	sys user
            ntp:   38:	sys user
         apache:   48:	sys user
       saslauth:  498:	sys user
        postfix:   89:	sys user
          pulse:  497:	sys user
           sshd:   74:	sys user
        tcpdump:   72:	sys user
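The conditional expression is often written with the assignment outside the ?: so the result is stored once. This is a stylistic alternative to the form used above, sketched under the assumption of a POSIX-compatible awk; the variable name kind and the two sample records are illustrative:

```shell
# The ?: result is assigned to "kind" in a single statement.
out=$(printf 'root:x:0\nnfsnobody:x:65534\n' |
    awk -F: '{kind = ($3 > 500) ? "oth user" : "sys user"; printf "%s:%d:%s\n", $1, $3, kind}')
printf '%s\n' "$out"
```

Parenthesizing the condition makes the precedence explicit and should parse the same way in the common awk implementations.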
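5. PATTERN
(1) empty: matches every line
(2) /regular expression/: processes only the lines that match
(3) relational expression: evaluates to true or false; only lines where the result is true (non-zero and non-empty) are processed
(4) address range: a range of lines, from a line matching the start pattern through the next line matching the end pattern
awk -F":" '/^root/,/^lp/ {print $1}' /etc/passwd
(5) BEGIN/END patterns:
BEGIN{} runs once, before any input is processed
END{} runs once, after all input has been processed
awk 'BEGIN{setup code}{main code}END{cleanup code}' file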
[root@makeISO ~]# awk '/^U/{print $1}' /etc/fstab
UUID=e20786ee-7475-4afd-b28b-0585dcb3c1a7
[root@makeISO ~]# awk '!/^#/{print $1}' /etc/fstab
/dev/mapper/vg_makeiso-lv_root
UUID=e20786ee-7475-4afd-b28b-0585dcb3c1a7
/dev/mapper/vg_makeiso-lv_swap
tmpfs
devpts
sysfs
proc
[root@makeISO ~]# awk -F":" '/^root/,/^lp/ {print $1}' /etc/passwd
root
bin
daemon
adm
lp
[root@makeISO ~]# cat -n /etc/passwd |grep -C 2 adm
     2	bin:x:1:1:bin:/bin:/sbin/nologin
     3	daemon:x:2:2:daemon:/sbin:/sbin/nologin
     4	adm:x:3:4:adm:/var/adm:/sbin/nologin
     5	lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
     6	sync:x:5:0:sync:/sbin:/bin/sync
[root@makeISO ~]# awk -F":" '(NR>=3&&NR<=6) {print $1}' /etc/passwd
daemon
adm
lp
sync
[root@makeISO ~]# awk -F: 'BEGIN{print "username\tuid\n------------------------"}{printf "%-15s %-5d\n",$1,$3}END{print "------------------------\nuser count:"NR}' /etc/passwd
username	uid
------------------------
root            0
bin             1
daemon          2
adm             3
lp              4
sync            5
shutdown        6
halt            7
mail            8
uucp            10
operator        11
games           12
gopher          13
ftp             14
nobody          99
dbus            81
usbmuxd         113
vcsa            69
rpc             32
rtkit           499
avahi-autoipd   170
abrt            173
rpcuser         29
nfsnobody       65534
haldaemon       68
gdm             42
ntp             38
apache          48
saslauth        498
postfix         89
pulse           497
sshd            74
tcpdump         72
------------------------
user count:33
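The pattern types from this section can sit side by side in one program; every rule is tested against each record, in the order written. A sketch assuming a POSIX-compatible awk, with six invented records standing in for /etc/passwd:

```shell
out=$(printf 'root:x:0\nbin:x:1\ndaemon:x:2\nadm:x:3\nlp:x:4\nsync:x:5\n' | awk -F: '
    BEGIN              { print "start" }            # runs once, before any input is read
    /^root/            { print "regex:", $1 }       # regular-expression pattern
    $3 == 5            { print "relational:", $1 }  # relational-expression pattern
    /^daemon/, /^adm/  { print "range:", $1 }       # range pattern, both ends inclusive
    END                { print "records:", NR }     # runs once, after all input
')
printf '%s\n' "$out"
```

In END, NR still holds the total number of records read, which is why it works as a counter there.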
6. Common actions
(1) Expressions
(2) Control statements: if, while
(3) Compound statements
(4) Input statements
(5) Output statements
7. Control statements
if(condition){}
if(condition){}else{}
while(condition){}
do{}while(condition)
for(expr1;expr2;expr3){}
break
continue
delete array[index]
delete array
exit
{}
7.1 if-else
Use case: testing a condition on the whole line, or on a single field, that awk has read
awk '{if(condition) statement [else statement]}'
[root@makeISO ~]# awk -F: '{if($3>10){print $1,$3}else{print $3,$1}}' /etc/passwd
0 root
1 bin
2 daemon
3 adm
4 lp
5 sync
6 shutdown
7 halt
8 mail
10 uucp
operator 11
games 12
gopher 13
ftp 14
nobody 99
[root@makeISO ~]# awk '{if(NF>5)print $0}' /etc/fstab
# Created by anaconda on Mon Nov 23 10:15:17 2015
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
/dev/mapper/vg_makeiso-lv_root / ext4 defaults 1 1
UUID=e20786ee-7475-4afd-b28b-0585dcb3c1a7 /boot ext4 defaults 1 2
/dev/mapper/vg_makeiso-lv_swap swap swap defaults 0 0
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
[root@makeISO ~]# awk '{if(NF>6)print $0}' /etc/fstab
# Created by anaconda on Mon Nov 23 10:15:17 2015
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
[root@makeISO ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_makeiso-lv_root
                       18G  8.2G  8.1G  51% /
tmpfs                 1.9G  228K  1.9G   1% /dev/shm
/dev/sda1             477M   34M  419M   8% /boot
[root@makeISO ~]# df -h |awk -F% '{print $1}'
Filesystem            Size  Used Avail Use
/dev/mapper/vg_makeiso-lv_root
                       18G  8.2G  8.1G  51
tmpfs                 1.9G  228K  1.9G   1
/dev/sda1             477M   34M  419M   8
[root@makeISO ~]# df -h |awk -F% '{print $1}' |awk '/[[:digit:]]$/{if($NF>=1)print $0}'
                       18G  8.2G  8.1G  51
tmpfs                 1.9G  228K  1.9G   1
/dev/sda1             477M   34M  419M   8
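The usage check at the end can also be done in a single awk call with if-else. A sketch assuming a POSIX-compatible awk; the two df-style lines and the 90 percent threshold are invented for illustration:

```shell
# Hypothetical df-like report: device, size, used, avail, use%, mount point.
report='/dev/sda1 477M 34M 419M 8% /boot
/dev/sdb1 18G 16G 1.1G 94% /data'

out=$(printf '%s\n' "$report" | awk '{
    use = $5 + 0    # "94%" converts to the number 94; the trailing % is ignored
    if (use >= 90) {
        print $6, "is nearly full"
    } else {
        print $6, "is fine"
    }
}')
printf '%s\n' "$out"
```

Adding 0 forces a numeric conversion that reads the leading digits of the field, so no second awk process is needed to strip the percent sign.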
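7.2 while
Syntax: while(condition) statement
The loop is entered while the condition is true and exits once it becomes false
Use case: applying the same processing to each field of a line, or to each element of an array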
[root@bl2cxtru1341 ~]# awk '/^[[:space:]]*linux/{i=1;while(i<=NF) {print $i,length($i);i++}}' /etc/grub2.cfg
linux16 7
/boot/vmlinuz-3.10.0-229.el7.x86_64 35
root=UUID=a74e310c-ffa3-485e-9744-bf2828cf826f 46
ro 2
crashkernel=auto 16
rhgb 4
quiet 5
LANG=en_US.UTF-8 16
linux16 7
/boot/vmlinuz-0-rescue-ca2e0b5862c148cabe8c774423b52bce 55
root=UUID=a74e310c-ffa3-485e-9744-bf2828cf826f 46
ro 2
crashkernel=auto 16
rhgb 4
quiet 5
[root@bl2cxtru1341 ~]# awk '/^[[:space:]]*linux/{i=1;while(i<=NF) {if(length($i)>7)print $i,length($i);i++}}' /etc/grub2.cfg
/boot/vmlinuz-3.10.0-229.el7.x86_64 35
root=UUID=a74e310c-ffa3-485e-9744-bf2828cf826f 46
crashkernel=auto 16
LANG=en_US.UTF-8 16
/boot/vmlinuz-0-rescue-ca2e0b5862c148cabe8c774423b52bce 55
root=UUID=a74e310c-ffa3-485e-9744-bf2828cf826f 46
crashkernel=auto 16
[root@bl2cxtru1341 ~]# awk '/^[[:space:]]*linux/{i=1;while(i<=NF) {if(length($i)>7){print $i,length($i)};i++}}' /etc/grub2.cfg
/boot/vmlinuz-3.10.0-229.el7.x86_64 35
root=UUID=a74e310c-ffa3-485e-9744-bf2828cf826f 46
crashkernel=auto 16
LANG=en_US.UTF-8 16
/boot/vmlinuz-0-rescue-ca2e0b5862c148cabe8c774423b52bce 55
root=UUID=a74e310c-ffa3-485e-9744-bf2828cf826f 46
crashkernel=auto 16
7.3 do-while
Syntax: do statement while (condition)
The body runs at least once
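The "at least once" behaviour shows up on an empty record, where NF is 0 and a plain while loop would not run at all. A small sketch, assuming a POSIX-compatible awk; the input is one three-field line followed by one empty line:

```shell
out=$(printf 'a b c\n\n' | awk '{
    i = 1
    do {
        # The body runs before the condition is checked, even when NF is 0.
        printf "record %d field %d: [%s]\n", NR, i, $i
        i++
    } while (i <= NF)
}')
printf '%s\n' "$out"
```

The last output line comes from the empty record: $1 is read as an empty string and the loop ends after that single pass.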
7.4 for loop
Syntax: for(expr1;expr2;expr3) statement
A second form iterates over the elements of an array
Syntax: for(var in array) {for-body}
[root@bl2cxtru1341 ~]# awk '/^[[:space:]]*linux/{for(i=1;i<=NF;i++) if(length($i)>7) print $i,length($i) }' /etc/grub2.cfg
/boot/vmlinuz-3.10.0-229.el7.x86_64 35
root=UUID=a74e310c-ffa3-485e-9744-bf2828cf826f 46
crashkernel=auto 16
LANG=en_US.UTF-8 16
/boot/vmlinuz-0-rescue-ca2e0b5862c148cabe8c774423b52bce 55
root=UUID=a74e310c-ffa3-485e-9744-bf2828cf826f 46
crashkernel=auto 16
[root@bl2cxtru1341 ~]# awk '/^[[:space:]]*linux/{for(i=1;i<=NF;i++) {if(length($i)>7) print $i,length($i)} }' /etc/grub2.cfg
/boot/vmlinuz-3.10.0-229.el7.x86_64 35
root=UUID=a74e310c-ffa3-485e-9744-bf2828cf826f 46
crashkernel=auto 16
LANG=en_US.UTF-8 16
/boot/vmlinuz-0-rescue-ca2e0b5862c148cabe8c774423b52bce 55
root=UUID=a74e310c-ffa3-485e-9744-bf2828cf826f 46
crashkernel=auto 16
7.5 switch (a gawk extension)
Syntax: switch(expression){case VALUE1 or /REGEXP/: statement; case VALUE2 or /REGEXP2/: statement;...;default: statement}
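7.6 break and continue
break: leaves the innermost enclosing loop (unlike the shell builtin, it takes no count argument)
continue: skips the rest of the loop body and starts the next iteration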
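A short sketch of both statements inside one for loop, assuming a POSIX-compatible awk; the numbers on the input line are arbitrary:

```shell
out=$(printf '3 8 -1 5 12 7\n' | awk '{
    for (i = 1; i <= NF; i++) {
        if ($i < 0) continue      # skip negative values, keep looping
        if ($i > 10) break        # leave the loop at the first value above 10
        print $i
    }
}')
printf '%s\n' "$out"
```

Here -1 is skipped, 12 ends the loop, and 7 is never reached.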
7.7 next
Stops processing the current line immediately and moves on to the next line
[root@bl2cxtru1341 ~]# awk -F: '{if($3%2!=0) next ; print $1,$3}' /etc/passwd
root 0
daemon 2
lp 4
shutdown 6
mail 8
games 12
ftp 14
avahi-autoipd 170
sshd 74
[root@bl2cxtru1341 ~]# awk -F: '{if($3%2==0) next ; print $1,$3}' /etc/passwd
bin 1
adm 3
sync 5
halt 7
operator 11
nobody 99
dbus 81
polkitd 999
tss 59
postfix 89
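next is also handy as a filter in the first rule of a program, so that later rules never see unwanted lines. A sketch assuming a POSIX-compatible awk, using an invented fstab-like input with a comment line and a blank line:

```shell
out=$(printf '# comment\n/dev/sda1 /boot ext4\n\ntmpfs /dev/shm tmpfs\n' | awk '
    /^#/ || NF == 0 { next }     # skip comments and blank lines; later rules never run for them
    { print $1, $3 }
')
printf '%s\n' "$out"
```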
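8. Arrays
Associative array: array[index-expression]
index-expression
1. Any string can be used as an index; string literals must be enclosed in double quotes ""
2. If an array element does not exist yet, referencing it makes awk create it automatically and initialize its value to the empty string
To test whether an element exists in an array, use the form "index in array"
To iterate over every element of an array, use a for loop
for (var in array) {for-body;}
var takes each index of array in turn (the indexes, not the values; the order is unspecified)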
[root@localhost ~]# awk 'BEGIN{week["1"]="mon";week["2"]="tue";print week["1"]}'
mon
[root@localhost ~]# awk 'BEGIN{week["1"]="mon";week["2"]="tue";print week["2"]}'
tue
[root@localhost ~]# awk 'BEGIN{week["1"]="mon";week["2"]="tue";for(i in week)print week[i]}'
mon
tue
[root@makeISO ~]# awk '{ip[$1]++}END{for(i in ip){print i,"\t"ip[i]}}' access.log | sort -k2 -n |tail -10
180.104.17.116 	1217
27.40.123.82 	1219
1.80.100.57 	1272
116.21.39.78 	1300
180.140.5.152 	1434
36.63.173.86 	1442
120.35.197.97 	1786
117.28.110.114 	1898
218.9.162.248 	2451
124.163.206.86 	3013
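The difference between testing with "in" and simply referencing an element can be seen directly. A sketch assuming a POSIX-compatible awk; the array contents are invented:

```shell
out=$(awk 'BEGIN {
    week["mon"] = 1; week["tue"] = 2
    if ("wed" in week) { print "wed present" } else { print "wed absent" }  # "in" does not create the element
    if (week["thu"] == "") { print "thu is empty" }       # referencing thu creates it as an empty string
    if ("thu" in week) { print "thu now exists" }
    delete week["thu"]                                    # remove the accidental element again
    n = 0
    for (day in week) n++       # iteration order is unspecified, so only count here
    print "elements:", n
}')
printf '%s\n' "$out"
```

This is why existence checks should use "in": a plain reference silently grows the array.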
Exercise 1: count how many times each filesystem type is used in /etc/fstab
[root@localhost ~]# awk '/^UUID/{fs[$3]++}END{for(i in fs){print i,fs[i]}}' /etc/fstab
ext4 1
Exercise 2: count how many times each word occurs in a file
[root@makeISO ~]# awk '{for(i=1;i<=NF;i++){fs[$i]++}}END{for(i in fs){print fs[i]"\t",i}}' /etc/fstab |sort -n -r |head
10	 0
7	 #
6	 defaults
3	 1
2	 tmpfs
2	 sysfs
2	 swap
2	 proc
2	 ext4
2	 devpts
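9. Functions
9.1 Built-in functions
Numeric:
rand() returns a random number between 0 and 1
String handling:
length([s]) returns the length of the given string ($0 if s is omitted)
sub(r,s,[t]) searches t for r and replaces it with s; only the first match is replaced
gsub(r,s,[t]) searches t for r and replaces it with s; every match is replaced
split(s,a[,r]) splits the string s using r as the separator and stores the pieces in the array a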
[root@makeISO ~]# awk 'BEGIN{print rand(),rand(),rand()}'
0.237788 0.291066 0.845814
[root@localhost ~]# netstat -tan |awk '/^tcp\>/{split($5,ip,":");count[ip[1]]++}END{for (i in count) {print count[i]"\t",i}}'
2	 0.0.0.0
3	 123.119.159.114
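The string functions can be tried without any input file by running them in a BEGIN block. A sketch assuming a POSIX-compatible awk; the strings are invented, and copies of s are modified because sub and gsub change their target in place:

```shell
out=$(awk 'BEGIN {
    s = "a-b-c"
    t = s; sub(/-/, "+", t); print t                 # first match only
    u = s; n = gsub(/-/, "+", u); print u, n         # every match; gsub returns the replacement count
    k = split("192.168.1.10:22", part, ":")          # part[1]="192.168.1.10", part[2]="22"
    print k, part[1], part[2]                        # split returns the number of pieces
    print length("daemon")
}')
printf '%s\n' "$out"
```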
9.2 User-defined functions
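The notes stop at the heading, so the following is only a minimal sketch of the standard awk form, function name(parameters) { statements }, assuming a POSIX-compatible awk. The function name classify, its parameter, the 500 cutoff, and the sample records are all hypothetical:

```shell
out=$(printf 'root:x:0\nrtkit:x:499\nnfsnobody:x:65534\n' | awk -F: '
    # classify is a hypothetical helper; uid + 0 forces a numeric comparison.
    function classify(uid) {
        return (uid + 0 > 500) ? "oth user" : "sys user"
    }
    { printf "%s %s\n", $1, classify($3) }
')
printf '%s\n' "$out"
```

Functions are defined at the top level of the program, outside any pattern-action rule, and are called with no space between the name and the opening parenthesis.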