Using AWK to find the smallest and largest number in a column?

Time: 2022-10-11 19:59:33

I have a file with a few columns, and I want to use an AWK command to show the largest and the smallest number in a particular column.

Example:

a  212
b  323
c  23
d  45
e  54
f  102

I want one command to show that the lowest number is 23 and another to show that the highest number is 323.

I have no idea why the answers are not working! Here is a more realistic example of my file (I should probably mention that it is tab-delimited):

##FORMAT=<ID=DP,Number=1,Type=Integer,Description="# high-quality bases">
##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled strand bias P-value">
##FORMAT=<ID=PL,Number=-1,Type=Integer,Description="List of Phred-scaled genotype likelihoods, number of values is (#ALT+1)*(#ALT+2)/2">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  rmdup_wl_25248.bam
Chr10   247     .       T       C       7.8     .       DP=37;AF1=0.5;CI95=0.5,0.5;DP4=7,1,19,0;MQ=15;FQ=6.38;PV4=0.3,1,0.038,1 GT:PL:GQ        0/1:37,0,34:36
Chr10   447     .       A       C       75      .       DP=30;AF1=1;CI95=1,1;DP4=0,0,22,5;MQ=14;FQ=-108 GT:PL:GQ        1/1:108,81,0:99
Chr10   449     .       G       C       35.2    .       DP=33;AF1=1;CI95=0.5,1;DP4=3,2,20,3;MQ=14;FQ=-44;PV4=0.21,1.7e-06,1,0.34        GT:PL:GQ        1/1:68,17,0:31
Chr10   517     .       G       A       222     .       DP=197;AF1=1;CI95=1,1;DP4=0,0,128,62;MQ=24;FQ=-282      GT:PL:GQ        1/1:255,255,0:99
Chr10   761     .       G       A       27      .       DP=185;AF1=0.5;CI95=0.5,0.5;DP4=24,71,8,54;MQ=20;FQ=30;PV4=0.07,8.4e-50,1,1     GT:PL:GQ        0/1:57,0,149:60
Chr10   1829    .       A       G       3.01    .       DP=74;AF1=0.4998;CI95=0.5,0.5;DP4=18,0,54,0;MQ=19;FQ=4.68;PV4=1,9.1e-12,0.003,1 GT:PL:GQ        0/1:30,0,45:28

I should add that I already exclude the lines that start with #, so these are the commands I use:

awk '$1 !~/#/' | awk -F'\t' 'BEGIN{first=1;} {if (first) { max = min = $6; first = 0; next;} if (max < $6) max=$6; if (min > $6) min=$6; } END { print min, max }' wl_25210_filtered.vcf

awk '$1 !~/#/' | awk -F'\t' 'BEGIN{getline;min=max=$6} NF{ max=(max>$6)?max:$6 min=(min>$6)?$6:min} END{print min,max}' wl_25210_filtered.vcf

and

awk '$1 !~/#/' | awk -F'\t' '
NR==2{min=max=$6;next}
NR>2 && NF{
    max=(max>$6)?max:$6
    min=(min>$6)?$6:min
}
END{print min,max}' wl_25210_filtered.vcf

6 answers

#1


4  

You can define two functions and call them as needed, which gives a more generic solution.

[jaypal:~/Temp] cat file
a  212
b  323
c  23
d  45
e  54
f  102
[jaypal:~/Temp] awk '
function max(x){i=0;for(val in x){if(i<=x[val]){i=x[val];}}return i;}
function min(x){i=max(x);for(val in x){if(i>x[val]){i=x[val];}}return i;}
{a[$2]=$2;next}
END{minimum=min(a);maximum=max(a);print "Maximum = "maximum " and Minimum = "minimum}' file
Maximum = 323 and Minimum = 23

The solution above defines two functions, max and min. Column 2 is stored in an array, and any other column can be stored the same way. In the END block, each function is called, its result is stored in a variable, and both values are printed.

Hope this helps!

Update:

I ran the following against the latest example:

[jaypal:~/Temp] awk '
function max(x){i=0;for(val in x){if(i<=x[val]){i=x[val];}}return i;}
function min(x){i=max(x);for(val in x){if(i>x[val]){i=x[val];}}return i;}
/^#/{next}
{a[$6]=$6;next}
END{minimum=min(a);maximum=max(a);print "Maximum = "maximum " and Minimum = "minimum}' sample
Maximum = 222 and Minimum = 3.01
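
Two details may be worth spelling out here. First, the update filters the # lines inside the same awk program. In the commands from the question, the second awk is given a file name, so it reads that file directly and ignores whatever the first awk sends through the pipe, which is probably why the header lines were never excluded. Second, max above starts from i=0, so a column that contains only negative numbers would report 0. A minimal sketch of the same function-based idea that seeds from the first array element instead, assuming tab-separated input; the wrapper name col_extremes and the counter n are made up for illustration:

```shell
# col_extremes COLUMN FILE: hypothetical wrapper, not part of the original answer.
col_extremes() {
    awk -F'\t' -v col="$1" '
        # Largest value in array x; the first element visited seeds m.
        function max(x,   k, m, started) {
            for (k in x) if (!started++ || x[k] > m) m = x[k]
            return m
        }
        # Smallest value in array x; same seeding trick.
        function min(x,   k, m, started) {
            for (k in x) if (!started++ || x[k] < m) m = x[k]
            return m
        }
        /^#/ || NF < col { next }       # skip header lines and short lines
        { a[$col] = $col + 0; n++ }     # "+ 0" forces a numeric value
        END { if (n) print "Maximum = " max(a) " and Minimum = " min(a) }
    ' "$2"
}
```

Called as col_extremes 6 sample, this should print the same line as the update above for that data.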

#2


6  

If your file contains empty lines, none of the posted solutions will work. To handle empty lines correctly, try this:

$ cat f.awk
BEGIN{getline;min=max=$6}
NF{
    max=(max>$6)?max:$6
    min=(min>$6)?$6:min
}
END{print min,max} 

Then run this command:

sed "/^#/d" my_file | awk -f f.awk

First it reads the first line of the input to set min and max. Then, for each non-empty line, it uses the ternary operator to check whether a new min or max was found. At the end, the result is printed.
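
One edge case: if the first line left after the sed filter happens to be empty, getline seeds min and max from an empty field, and the reported minimum would likely come out wrong. A minimal sketch that seeds from the first non-empty line instead, assuming tab-separated input with the value in column 6; the wrapper name minmax6 and the variables seeded and v are made up for illustration:

```shell
# minmax6 FILE: hypothetical wrapper; prints "min max" for column 6.
minmax6() {
    sed '/^#/d' "$1" | awk -F'\t' '
        !NF { next }                                # ignore empty lines, even the first
        !seeded { min = max = $6 + 0; seeded = 1 }  # first non-empty line seeds both
        {
            v = $6 + 0                              # force a numeric comparison
            max = (v > max) ? v : max
            min = (v < min) ? v : min
        }
        END { if (seeded) print min, max }
    '
}
```

The ternary updates are the same as in f.awk; only the seeding differs.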

HTH Chris

#3


2  

awk 'BEGIN {max = 0} {if ($6>max) max=$6} END {print max}' yourfile.txt

#4


1  

awk 'BEGIN{first=1;} 
     {if (first) { max = min = $2; first = 0; next;}
      if (max < $2) max=$2; if (min > $2) min=$2; }
     END { print min, max }' file

#5


1  

Use the BEGIN and END blocks to initialize and print the variables that keep track of the min and max, e.g.:

awk 'BEGIN{max=0;min=512} { if (max < $1){ max = $1 }; if(min > $1){ min = $1 } } END{ print max, min}'

#6


1  

The min and max can be found with:

awk 'BEGIN {min=1000000; max=0;}; { if($2<min && $2 != "") min = $2; if($2>max && $2 != "") max = $2; } END {print min, max}' file

This prints the minimum and maximum separated by a space (awk's default output field separator).
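
The comma in print min, max inserts awk's output field separator OFS, which is a single space unless changed. If comma-separated output is actually wanted, OFS can be set on the command line. A minimal sketch, assuming whitespace-separated input with the value in column 2; the wrapper name minmax_csv is made up, and the values are seeded from the first data line rather than from fixed bounds such as 1000000 and 0:

```shell
# minmax_csv: hypothetical wrapper; reads stdin, prints "min,max" for column 2.
minmax_csv() {
    awk -v OFS=',' '
        $2 == "" { next }                     # skip lines without a second field
        !n++ { min = max = $2 + 0; next }     # first data line seeds both values
        { v = $2 + 0; if (v < min) min = v; if (v > max) max = v }
        END { if (n) print min, max }         # the comma here becomes OFS
    '
}

printf 'a 212\nb 323\nc 23\n' | minmax_csv
```

For the three sample lines this prints 23,323.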
