The problem is simple. I have an AWK script and two strings (names). If they have the same length, I need to pick the one that comes first alphabetically, in ASCII order.
first example:
1st string = "aac", 2nd string = "aab"
result: aab
second example:
1st string = "Donald J Cat", 2nd string = "Donald J Bat"
result: Donald J Bat
Is there a simple way to do this in AWK?
3 Solutions
#1
2
With awk:
if ("aab" < "aac") {print "aab is sooner"}
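One caveat worth sketching: awk compares two fields as numbers when both look numeric, so names such as 10 and 9 would not be ordered as strings. Concatenating an empty string is a common way to force a string comparison. The helper name `sooner` and the comma-separated input format are assumptions made for this illustration, not part of the answer above.

```shell
# Hypothetical helper: print whichever of two comma-separated names sorts first.
sooner() {
    printf '%s\n' "$1" | awk -F, '{
        # Appending "" turns each field into a plain string,
        # so the comparison is never done numerically.
        a = $1 ""
        b = $2 ""
        print (a < b ? a : b)
    }'
}

sooner "aac,aab"                      # prints: aab
sooner "Donald J Cat,Donald J Bat"    # prints: Donald J Bat
sooner "10,9"                         # prints: 10 (string order, not numeric)
```

Without the forced string conversion, the last call would compare 10 and 9 as numbers and print 9.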
#2
0
Assume the fields to compare are the first and second. Print the shorter one or, if they have equal length, the one that comes first in lexical (dictionary) order:
awk '...
len1=length($1); len2=length($2);
f = len1<len2 || (len1==len2 && $1<$2);
print f?$1:$2; ...'
If you want a case-insensitive comparison, change the test to tolower($1)<tolower($2).
#3
0
If you are only dealing with two strings, you can use awk's behavior with string comparison and a ternary to assign the two strings to a single string in the order you describe:
$ echo "aac,aab
Donald J Cat,Donald J Bat
zoom batman,ahem Mr President
zzzzzz,a
aa,z" | awk -F, '{s=$1<$2 ? $1 "," $2 : $2 "," $1; print s}'
aab,aac
Donald J Bat,Donald J Cat
ahem Mr President,zoom batman
a,zzzzzz
aa,z
This will print the two words in asciibetical order; an "a" beats a full hand of "zzzz"s.
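Whether the order is strictly byte-by-byte can depend on the awk implementation and the current locale, since POSIX describes string comparison in terms of the locale's collating sequence. A cautious sketch, assuming plain ASCII order is wanted everywhere, is to run awk under the C locale. The helper name `ascii_pair` is made up for this example.

```shell
# Hypothetical helper: reads "name1,name2" lines and prints each pair in ASCII order.
ascii_pair() {
    # LC_ALL=C asks for byte-wise collation, so uppercase letters sort before lowercase.
    LC_ALL=C awk -F, '{
        print (($1 "") < ($2 "") ? $1 "," $2 : $2 "," $1)
    }'
}

printf '%s\n' "apple,Donald" "zzzzzz,a" | ascii_pair
# prints:
# Donald,apple
# a,zzzzzz
```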
If you want to sort more than two strings, and you have a recent gawk (rather than a POSIX awk), you can use PROCINFO to traverse an array sorted by values:
echo "aac,aab,Donald J Cat,Donald J Bat
zoom batman,ahem Mr President,zzzzzz,a,aa,zz" | awk -F, '{s="";split("",a);
for (i=1;i<=NF;i++) a[i]=$i
PROCINFO["sorted_in"] = "@val_str_asc"   # compare the values as strings
for (e in a) s=s a[e] ","
print gensub(",$","","1",s)}'
Donald J Bat,Donald J Cat,aab,aac
a,aa,ahem Mr President,zoom batman,zz,zzzzzz
Note that in asciibetical sorting 'D'<'a'. In gawk, it is easy to write a custom comparison function if needed.
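As an illustration of such a custom comparison function, here is a minimal sketch that orders names by length first and by string value second, which is close to what the question asks for. It assumes gawk 4.0 or later, where PROCINFO["sorted_in"] may name a user-defined function; the names `sort_names` and `by_len_then_str` are invented for this example.

```shell
# Hypothetical helper (requires gawk): sort the comma-separated names on each line.
sort_names() {
    gawk -F, '
    # gawk passes (index1, value1, index2, value2) and expects a result
    # that is negative, zero or positive. s1 and s2 are local variables.
    function by_len_then_str(i1, v1, i2, v2,    s1, s2) {
        if (length(v1) != length(v2))
            return length(v1) - length(v2)
        s1 = v1 ""          # force string comparison
        s2 = v2 ""
        if (s1 < s2) return -1
        return s1 > s2
    }
    {
        split("", a)        # clear the array for each record
        for (i = 1; i <= NF; i++) a[i] = $i
        PROCINFO["sorted_in"] = "by_len_then_str"
        s = ""; sep = ""
        for (e in a) { s = s sep a[e]; sep = "," }
        print s
    }'
}

# Usage:
#   printf "%s\n" "aac,aab,Donald J Cat,Donald J Bat" | sort_names
#   prints: aab,aac,Donald J Bat,Donald J Cat
```

Putting the length test first means a shorter name always sorts ahead of a longer one, and the string comparison only breaks ties.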