Sorting an associative array in awk

Time: 2021-05-21 16:01:49

I have an associative array in awk that gets populated like this:

chr_count[$3]++

When I try to print my chr_counts, I use this:

for (i in chr_count) {
    print i,":",chr_count[i];
}

But not surprisingly, the order of i is not sorted in any way. Is there an easy way to iterate over the sorted keys of chr_count?

5 Answers

#1


23  

Instead of asort, use asorti(source, destination), which sorts the indices into a new array, so you won't have to copy the array yourself.

You can then use the elements of the destination array as keys into the source array.

For your example, you would use it like this:

n=asorti(chr_count, sorted)
for (i=1; i<=n; i++) {
        print sorted[i] " : " chr_count[sorted[i]]
}

#2


11  

You can pipe the output through the sort command, e.g.:

for ( i in data )
 print i ":", data[i]  | "sort"
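
As a fuller sketch of the same idea, the snippet below wraps the loop in a complete program. The function name count_by_field3 and the sample input are made up for illustration, and forcing LC_ALL=C is an assumption added here so the ordering does not depend on the locale. Closing the pipe in END makes awk wait for sort to finish.

```sh
# Hypothetical helper: counts the third field and prints the counts sorted by key.
count_by_field3() {
  awk '
    { chr_count[$3]++ }                  # same population step as in the question
    END {
      cmd = "LC_ALL=C sort"              # assumption: byte-wise ordering is acceptable
      for (i in chr_count)
        print i ":", chr_count[i] | cmd  # every line goes to one sort process
      close(cmd)                         # flush the pipe and wait for sort to exit
    }
  '
}

# Made-up sample input with the key in the third column.
printf 'x y chr2\nx y chrX\nx y chr1\nx y chr2\n' | count_by_field3
```

Note that sort compares whole lines as text here, so a key such as chr10 would sort before chr2.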

#3


7  

Note that asort() and asorti() are gawk extensions. They are not part of POSIX awk, so other awk implementations may not provide them. For plain awk, you can write your own sort function or get one from elsewhere.
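
A minimal sketch of rolling your own sort in plain awk might look like the following. It copies the keys into a numerically indexed array and runs an insertion sort on them. The names sorted_counts_posix, isort and keys are hypothetical, the sample input is made up, and insertion sort is just one simple choice that is adequate for small arrays.

```sh
# Hypothetical helper: prints counts ordered by key without any gawk extensions.
sorted_counts_posix() {
  awk '
    # Insertion sort of arr[1..n] in place, comparing elements as strings.
    function isort(arr, n,    i, j, tmp) {
      for (i = 2; i <= n; i++) {
        tmp = arr[i]
        # Shift larger elements one slot to the right.
        for (j = i - 1; j >= 1 && (arr[j] "") > (tmp ""); j--)
          arr[j + 1] = arr[j]
        arr[j + 1] = tmp
      }
    }
    { chr_count[$3]++ }
    END {
      n = 0
      for (k in chr_count) keys[++n] = k   # copy the keys into keys[1..n]
      isort(keys, n)
      for (i = 1; i <= n; i++)
        print keys[i], ":", chr_count[keys[i]]
    }
  '
}

# Made-up sample input with the key in the third column.
printf 'x y chr2\nx y chrX\nx y chr1\nx y chr2\n' | sorted_counts_posix
```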

#4


4  

This is taken directly from the documentation:

 # populate the array data (not shown)
 # copy indices
 j = 1
 for (i in data) {
     ind[j] = i    # index value becomes element value
     j++
 }
 n = asort(ind)    # index values are now sorted
 for (i = 1; i <= n; i++) {
     # do something with ind[i]: work with sorted indices directly
     # ...
     # do something with data[ind[i]]: access original array via sorted indices
 }

#5


1  

I recently came across this issue and found that with gawk I could set the value of PROCINFO["sorted_in"] to control iteration order. I found a list of valid values for this by searching for PROCINFO online and landed on this GNU Awk User's Guide page: https://www.gnu.org/software/gawk/manual/html_node/Controlling-Scanning.html

This lists options of the form @{ind|val}_{num|type|str}_{asc|desc}, where:

  • ind sorts by index (key) and val sorts by value.
  • num compares numerically, str compares as strings, and type orders by assigned type (type is only available with val).
  • asc gives ascending order and desc gives descending order.

I simply used:

PROCINFO["sorted_in"] = "@val_num_desc"
for (i in map) print i, map[i]

And the output was sorted in descending order of values.
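
Since the original question asked for sorted keys rather than sorted values, the same mechanism could be applied as in the sketch below, using the "@ind_str_asc" setting from the page linked above. This assumes gawk 4.0 or later. The function name sorted_counts_gawk and the sample input are made up for illustration.

```sh
# Hypothetical helper: requires gawk, because PROCINFO["sorted_in"] is a gawk feature.
sorted_counts_gawk() {
  gawk '
    { chr_count[$3]++ }
    END {
      # Make every following for-in loop visit indices in ascending string order.
      PROCINFO["sorted_in"] = "@ind_str_asc"
      for (i in chr_count)
        print i, ":", chr_count[i]
    }
  '
}

# Made-up sample input with the key in the third column.
printf 'x y chr2\nx y chrX\nx y chr1\nx y chr2\n' | sorted_counts_gawk
```

The setting stays in effect for later for-in loops until PROCINFO["sorted_in"] is changed again.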
