Creating nested JSON objects using jq

Date: 2020-12-22 14:29:53

I need to take each value in the "Exons" metric of the JSON input below, split it on ";", and convert the pieces to nested JSON, as shown in the Expected Output section.

Sample Input

  {  
   "regions":[  
      {  
         "metric":"GENE1",
         "value":[  
            {  
               "metric":"Exons",
               "value":[  
                  "GENE1;chr1;45656;5656667"    
               ],
               "type":"set"
            },
            {  
               "metric":"Precent_no_call",
               "value":4.22623,
               "type":"simple"
            },
            {  
               "metric":"Total_NoCall_bases",
               "value":112533,
               "type":"simple"
            }
         ],
         "type":"metrics-set"
      },
      {  
         "metric":"GENE2",
         "value":[  
            {  
               "metric":"Exons",
               "value":[  
                  "GENE2_Exon5;chr1;45656;5656667",
                  "GENE2_Exon10;chr1;45656;5656667"                 
               ],
               "type":"set"
            },
            {  
               "metric":"Precent_no_call",
               "value":0.746464,
               "type":"simple"
            },
            {  
               "metric":"Total_NoCall_bases",
               "value":16842,
               "type":"simple"
            }
         ],
         "type":"metrics-set"
      }
   ]
}

Expected Output

{  
   "regions":[  
      {  
         "metric":"GENE1",
         "value":[  
            {  
               "metric":"Exons",
               "value":[  
                  "GENE1",
                  {  
                     "chromosome":"chr1",
                     "start":45656,
                     "end":5656667
                  }
               ],
               "type":"set"
            },
            {  
               "metric":"Precent_no_call",
               "value":4.22623,
               "type":"simple"
            },
            {  
               "metric":"Total_NoCall_bases",
               "value":112533,
               "type":"simple"
            }
         ],
         "type":"metrics-set"
      },
      {  
         "metric":"GENE2",
         "value":[  
            {  
               "metric":"Exons",
               "value":[  
                  "GENE2_Exon5",
                  {  
                     "chromosome":"chr1",
                     "start":45656,
                     "end":5656667
                  },
                  "GENE2_Exon10",
                  {  
                     "chromosome":"chr1",
                     "start":45656,
                     "end":5656667
                  }
               ],
               "type":"set"
            },
            {  
               "metric":"Precent_no_call",
               "value":0.746464,
               "type":"simple"
            },
            {  
               "metric":"Total_NoCall_bases",
               "value":16842,
               "type":"simple"
            }
         ],
         "type":"metrics-set"
      }
   ]
}

NOTE

This is related to my earlier question: Converting comma separated file to nested objects json in jq

Thanks for your help in advance.

Solution I tried, starting from a comma-separated input file (see the other question I posted):

def parse:
  [
      inputs                     # read lines
    | split(",")                 # split into columns
    | select(length>0)           # eliminate blanks
    | .[:1] + [.[1:-3]] + .[-3:] # normalize columns

  ]
;
def simple(n;v): {metric:n, value:v|tonumber, type:"simple"};
def set(n;v):    {metric:n, value:v,          type:"set"};
def chr(c;s;e):  {chromosome:c, start:s, end:e};
def region:
  set(.[0]; [
      set("Exons";  (.[1] | tostring | split(";") |.[0]); 
      chr((.[1] | tostring | split(";") |.[1]),(.[1] | tostring | split(";") |.[2]),(.[1] | tostring | split(";") |.[3]))
     ]

     ),
      simple("Fraction of bases"; .[5]),
      simple("Total_bases"; .[6])
    ]
  )
;
{
   "Regions": parse | map(region)
}

I was unable to loop it and read recursively.

1 Solution

Since the low-level requirements are clear enough, I've assembled the following solution, which behaves exactly in accordance with the example. However, the higher-level requirements are rather sketchy, so you may need to make some adjustments.

The low-level requirement (about converting the strings) can be implemented as follows:

# Input: a string
def gene2object:
  split(";")
  | [.[0], { chromosome: .[1], 
             start: (.[2]|tonumber),
             end:   (.[3]|tonumber)} ];
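As a cross-check outside jq, the same string conversion can be sketched in Python. The function name `gene_to_pair` is hypothetical, and the sketch assumes every entry has exactly four ";"-separated fields with integer coordinates (jq's tonumber would also accept non-integer numbers).

```python
def gene_to_pair(entry):
    """Split 'NAME;CHROM;START;END' into [name, {chromosome, start, end}].

    Assumption: exactly four fields, with integer start and end.
    """
    name, chromosome, start, end = entry.split(";")
    return [name, {"chromosome": chromosome, "start": int(start), "end": int(end)}]


print(gene_to_pair("GENE1;chr1;45656;5656667"))
# ['GENE1', {'chromosome': 'chr1', 'start': 45656, 'end': 5656667}]
```

An entry with a different number of fields raises ValueError here, whereas the jq version would fail only when tonumber receives null.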

A solution can now be written quite simply as follows:

walk( if type == "object" and .metric == "Exons" 
      then .value |= (map(gene2object)|add) 
      else .
      end )
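To illustrate what the walk-based filter does, here is a rough Python sketch of the same bottom-up traversal. The names `walk`, `convert_exons`, and `gene_to_pair` are hypothetical helpers written for this illustration, not part of jq, and the sample document is a trimmed copy of the GENE2 region from the question.

```python
import json


def gene_to_pair(entry):
    # Assumption: 'NAME;CHROM;START;END' with integer coordinates.
    name, chromosome, start, end = entry.split(";")
    return [name, {"chromosome": chromosome, "start": int(start), "end": int(end)}]


def walk(node, f):
    """Apply f to every value, children first, then the node itself."""
    if isinstance(node, dict):
        node = {key: walk(value, f) for key, value in node.items()}
    elif isinstance(node, list):
        node = [walk(value, f) for value in node]
    return f(node)


def convert_exons(node):
    """Rewrite the value of any object whose metric is "Exons"."""
    if isinstance(node, dict) and node.get("metric") == "Exons":
        flat = []
        for entry in node["value"]:
            flat.extend(gene_to_pair(entry))  # flatten the [name, object] pairs
        return {**node, "value": flat}
    return node


doc = json.loads("""
{"regions": [{"metric": "GENE2", "value": [
  {"metric": "Exons",
   "value": ["GENE2_Exon5;chr1;45656;5656667", "GENE2_Exon10;chr1;45656;5656667"],
   "type": "set"},
  {"metric": "Precent_no_call", "value": 0.746464, "type": "simple"}
], "type": "metrics-set"}]}
""")

result = walk(doc, convert_exons)
print(json.dumps(result, indent=2))
```

One difference from the jq filter: for an empty "Exons" list this sketch produces an empty list, while jq's add on an empty array yields null.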

The standard invocation (along the lines of: jq -f program.jq input.json) produces the output exactly as described, so I won't repeat it here.

If your jq does not have walk/1, then you can snarf its official definition from https://github.com/stedolan/jq/blob/master/src/builtin.jq (that is, search for: def walk).
