Finding common and unique items between two arrays

Time: 2021-03-14 12:47:13

I use the ec2.py dynamic inventory script with Ansible to extract a list of EC2 hosts and their tag names. It returns JSON like the following:

  "tag_aws_autoscaling_groupName_asg_test": [
    "aa.b.bb.55",
    "1b.b.c.d"
  ],

  "tag_aws_autoscaling_groupName_asg_unknown": [
    "aa.b.bb.55",
    "1b.b.c.e"
  ],

I'm using jq for parsing this output.

  1. How can I extract only the items common to both of these ASGs?
  2. How can I extract only the items unique to each of these ASGs?

2 answers

#1


difference/2

Because of the way jq's "-" operator is defined on arrays, one invocation of unique is sufficient to produce a "uniquified" answer:

def difference($a; $b): ($a | unique) - $b;
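
As a quick check of why one call to unique is enough, here is a small sketch with made-up arrays (the numbers are illustrative only, and the $-parameter syntax assumes jq 1.5 or later). The "-" operator removes every occurrence of each element of the right-hand array, so duplicates in $b do no harm:

```shell
# -n: run without reading input; -c: compact output.
# The two arrays are invented sample values, not data from the question.
result=$(jq -nc 'def difference($a; $b): ($a | unique) - $b;
difference([3, 1, 1, 2, 3]; [3, 3, 4])')
echo "$result"   # prints [1,2]
```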

Similarly, for the symmetric difference, a single sorting operation is sufficient to produce a "uniquified" value:

def sdiff($a; $b): (($a-$b) + ($b-$a)) | unique;
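
Applied to the two arrays from the question, sdiff should return the hosts that appear in exactly one ASG. This is only a sketch: the arrays are pasted in as literals rather than read from the inventory output, and jq 1.5 or later is assumed:

```shell
# The arrays are copied from the question's sample output.
result=$(jq -nc 'def sdiff($a; $b): (($a-$b) + ($b-$a)) | unique;
sdiff(["aa.b.bb.55", "1b.b.c.d"]; ["aa.b.bb.55", "1b.b.c.e"])')
echo "$result"   # prints ["1b.b.c.d","1b.b.c.e"]
```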

intersect/2

Here is a faster version of intersect/2 that should work with all versions of jq -- it eliminates group_by in favor of sort:

def intersect(x;y):
  ( (x|unique) + (y|unique) | sort) as $sorted
  | reduce range(1; $sorted|length) as $i
      ([];
       if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
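
A short usage sketch with the question's sample arrays written as literals. After the merged list is sorted, a value shared by both inputs shows up as two adjacent equal entries, which is what the reduce picks out:

```shell
# x and y are filter arguments, so literal arrays can be passed directly.
result=$(jq -nc 'def intersect(x;y):
  ( (x|unique) + (y|unique) | sort) as $sorted
  | reduce range(1; $sorted|length) as $i
      ([];
       if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
intersect(["aa.b.bb.55", "1b.b.c.d"]; ["aa.b.bb.55", "1b.b.c.e"])')
echo "$result"   # prints ["aa.b.bb.55"]
```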

intersection/2

If you have jq 1.5, then here's a similar but still measurably faster set-intersection function: it produces a stream of the elements in the set-intersection of the two arrays:

def intersection(x;y):
  (x|unique) as $x | (y|unique) as $y
  | ($x|length) as $m
  | ($y|length) as $n
  | if $m == 0 or $n == 0 then empty
    else { i:-1, j:-1, ans:false }
    | while(  .i < $m and .j < $n;
        $x[.i+1] as $nextx
        | if $nextx == $y[.j+1] then {i:(.i+1), j:(.j+1), ans: true, value: $nextx}
          elif  $nextx < $y[.j+1] then .i += 1 | .ans = false
          else  .j += 1 | .ans = false
          end )
    end
  | if .ans then .value else empty end ;
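
Because intersection/2 emits a stream rather than an array, one way to use it is to wrap the call in [...] to collect the results. The sketch below assumes the inventory output has been wrapped in braces to form a valid JSON object; the shell variable names input and result are illustrative only:

```shell
# Sample data from the question, wrapped in {} so it is valid JSON.
input='{"tag_aws_autoscaling_groupName_asg_test": ["aa.b.bb.55", "1b.b.c.d"],
        "tag_aws_autoscaling_groupName_asg_unknown": ["aa.b.bb.55", "1b.b.c.e"]}'

result=$(printf '%s\n' "$input" | jq -c 'def intersection(x;y):
  (x|unique) as $x | (y|unique) as $y
  | ($x|length) as $m
  | ($y|length) as $n
  | if $m == 0 or $n == 0 then empty
    else { i:-1, j:-1, ans:false }
    | while(  .i < $m and .j < $n;
        $x[.i+1] as $nextx
        | if $nextx == $y[.j+1] then {i:(.i+1), j:(.j+1), ans: true, value: $nextx}
          elif  $nextx < $y[.j+1] then .i += 1 | .ans = false
          else  .j += 1 | .ans = false
          end )
    end
  | if .ans then .value else empty end ;
# Collect the stream into an array.
[intersection(.tag_aws_autoscaling_groupName_asg_test;
              .tag_aws_autoscaling_groupName_asg_unknown)]')
echo "$result"   # prints ["aa.b.bb.55"]
```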

#2


To find items common between two arrays, just perform a set intersection between the two. There's no intersection function available but it should be simple enough to define on your own. Take the unique items of each array, group them up by value, then take the items where there are more than 1 in a group.

def intersect($a; $b): [($a | unique)[], ($b | unique)[]]
    | [group_by(.)[] | select(length > 1)[0]];

Using this, to find the common elements (assuming your input is actually a valid JSON object):

$ jq 'def intersect($a; $b): [($a | unique)[], ($b | unique)[]]
    | [group_by(.)[] | select(length > 1)[0]];
intersect(.tag_aws_autoscaling_groupName_asg_test;
          .tag_aws_autoscaling_groupName_asg_unknown)' < input.json
[
  "aa.b.bb.55"
]

To find items unique to an array, just perform the set difference.

$ jq 'def difference($a; $b): ($a | unique) - ($b | unique);
difference(.tag_aws_autoscaling_groupName_asg_test;
           .tag_aws_autoscaling_groupName_asg_unknown)' < input.json
[
  "1b.b.c.d"
]
