Merging values into one key in an OrderedDict

Date: 2022-09-28 19:32:00

I was wondering whether there is a more elegant way than my current implementation to merge values of an ordered dict.

I have an ordered dict that looks like this:

'fields': OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
    ("Ethnicity/ Race", "Race"),
]),

If I pass in a list of indices as a parameter, like so

[2,3] or [2,4,5]

is there an elegant way to merge the values at those indices together under a new key, so that

[2,3], "Random_Key"

would return

'fields': OrderedDict([
        ("Sample Code", "Vendor Sample ID"),
        ("Donor ID", "Vendor Subject ID"),
        ("Random Key", "Material Format Sample Type"),  # merged entry
        ("Age", "Age"),
        ("Gender", "Gender"),
        ("Ethnicity/ Race", "Race"),
    ]),

while also deleting the original keys from the dictionary?

3 Answers

#1



This can also be done nicely with a generator.

The generator yields each key/value pair unchanged if it does not have to be squashed. If it does, the generator saves the value until it reaches the last index to be merged, and then yields a single pair made of the new key and the saved values joined together.

A new OrderedDict can then be constructed from the generator.

from collections import OrderedDict

def squashDict(d, ind, new_key):
    """Take an OrderedDict d and yield its (key, value) pairs, except
    the ones at an index in ind. Those values are merged and yielded
    once, at the position of the last index in ind, under new_key.
    """
    if not all(0 <= x < len(d) for x in ind):
        raise IndexError("Index out of bounds")
    vals = []
    for n, (k, v) in enumerate(d.items()):
        if n in ind:
            vals.append(v)         # collect the value to be merged
            if n == max(ind):      # last merged position: emit the joined values
                yield (new_key, " ".join(vals))
        else:
            yield (k, v)           # untouched pair, passed through in order

d = OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
])

t = OrderedDict(squashDict(d, [2, 3], "Random"))
print(t)
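
For comparison, here is a minimal self-contained sketch of a variant of the same generator idea. It is not the code from the answer above: the name `squash_pairs` is made up for illustration, it assumes every value is a string, and it places the merged entry at the first of the given indices (where the question shows it) rather than the last.

```python
from collections import OrderedDict

def squash_pairs(d, indices, new_key):
    """Yield (key, value) pairs of d, replacing the entries at `indices`
    with one (new_key, joined values) pair at the first of those positions."""
    wanted = set(indices)
    if not all(0 <= i < len(d) for i in wanted):
        raise IndexError("Index out of bounds")
    values = list(d.values())
    # join the selected values in dictionary order
    merged = " ".join(values[i] for i in sorted(wanted))
    first = min(wanted) if wanted else None
    for n, (key, value) in enumerate(d.items()):
        if n == first:
            yield (new_key, merged)   # merged entry takes the first slot
        elif n not in wanted:
            yield (key, value)        # everything else passes through

fields = OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
])

result = OrderedDict(squash_pairs(fields, [2, 3], "Random Key"))
print(result)
```

Because a new OrderedDict is built from the generator, the original `fields` is left unchanged.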

#2



I'm not sure there is an elegant way. OrderedDict has a move_to_end method that moves a key to the start or the end, but not to an arbitrary position.
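
To illustrate what move_to_end can and cannot do, here is a small sketch on a toy dictionary (the keys "a", "b", "c" are made up for the example). It only moves a key to one of the two ends:

```python
from collections import OrderedDict

d = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

d.move_to_end("a")               # last=True is the default: "a" goes to the end
order_after_end = list(d)        # ['b', 'c', 'a']

d.move_to_end("c", last=False)   # last=False moves the key to the front
order_after_front = list(d)      # ['c', 'b', 'a']

print(order_after_end, order_after_front)
```

There is no argument for a target index, which is why the approach below rebuilds the dictionary from a list of items instead.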

I'd try to be as efficient as possible and minimize loops:

  • get a list of the keys
  • find the index of the key you want to merge with the following one
  • remove the next key from the dictionary, keeping its value
  • create a list from the remaining items of d
  • replace the entry at the stored index with the new key and the merged value
  • rebuild an OrderedDict from that list

Like this (I removed some keys to shorten the example):

from collections import OrderedDict

d = OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
])

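lk = list(d.keys())
index = lk.index("Sample Type")   # position of the key to merge
v = d.pop(lk[index + 1])          # remove the following key, keep its value

t = list(d.items())
# replace the pair at index with the new key and the two values joined
t[index] = ("new key", t[index][1] + " " + v)

d = OrderedDict(t)

print(d)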

Result:

OrderedDict([('Sample Code', 'Vendor Sample ID'), ('Donor ID', 'Vendor Subject ID'), ('Format', 'Material Format'), ('new key', 'Sample Type Age'), ('Gender', 'Gender')])
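
The steps above could be wrapped in a small reusable function. This is only a sketch: the name `merge_with_next` and the `sep` parameter are assumptions for illustration, and it returns a new OrderedDict instead of rebinding `d`.

```python
from collections import OrderedDict

def merge_with_next(d, key, new_key, sep=" "):
    """Return a new OrderedDict in which `key` and the key after it are
    replaced by one entry `new_key` holding both values joined by `sep`."""
    items = list(d.items())
    index = list(d.keys()).index(key)   # raises ValueError if key is missing
    if index + 1 >= len(items):
        raise IndexError("no following key to merge with")
    merged = (new_key, items[index][1] + sep + items[index + 1][1])
    # slice assignment swaps the two old pairs for the single merged pair
    items[index:index + 2] = [merged]
    return OrderedDict(items)

d = OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
])

merged = merge_with_next(d, "Sample Type", "new key")
print(merged)
```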


#3



You could optimize this by sorting the indices in descending order and using dict.pop(key, None) to retrieve and remove each key/value pair in one step. I decided against it so that the values are appended in the order in which they occur in indices.

from collections import OrderedDict
from pprint import pprint

def mergeEm(d, indices, key):
    """Merge the values at the positions given by 'indices' on OrderedDict d into a list.
    Append this list to the dict under 'key'. Delete the keys used to build the list."""

    if not all(0 <= x < len(d) for x in indices):
        raise IndexError("Index out of bounds")

    vals = []                        # the values to be merged, in the order of indices
    allkeys = list(d.keys())
    for i in indices:
        vals.append(d[allkeys[i]])   # append to temporary list
    d[key] = vals                    # add to dict; use ' '.join(vals) to combine str
    for i in indices:                # remove all keys that were merged
        d.pop(allkeys[i], None)
    pprint(d)


fields= OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
    ("Ethnicity/ Race", "Race"),
    ("Organ", "Organ"),
    ("Pathological Diagnosis", "Diagnosis"),
    ("Detailed Pathological Diagnosis", "Detailed Diagnosis"),
    ("Clinical Diagnosis/Cause of Death", "Detailed Diagnosis option 2"),
    ("Dissection", "Dissection"),
    ("Quantity (g, ml, or ug)", "Quantity"),
    ("HIV", "HIV"),
    ("HEP B", "HEP B")
])
pprint(fields)
mergeEm(fields, [5,4,2], "tata")

Output:

OrderedDict([('Sample Code', 'Vendor Sample ID'),
             ('Donor ID', 'Vendor Subject ID'),
             ('Format', 'Material Format'),
             ('Sample Type', 'Sample Type'),
             ('Age', 'Age'),
             ('Gender', 'Gender'),
             ('Ethnicity/ Race', 'Race'),
             ('Organ', 'Organ'),
             ('Pathological Diagnosis', 'Diagnosis'),
             ('Detailed Pathological Diagnosis', 'Detailed Diagnosis'),
             ('Clinical Diagnosis/Cause of Death',
              'Detailed Diagnosis option 2'),
             ('Dissection', 'Dissection'),
             ('Quantity (g, ml, or ug)', 'Quantity'),
             ('HIV', 'HIV'),
             ('HEP B', 'HEP B')])


OrderedDict([('Sample Code', 'Vendor Sample ID'),
             ('Donor ID', 'Vendor Subject ID'),
             ('Sample Type', 'Sample Type'),
             ('Ethnicity/ Race', 'Race'),
             ('Organ', 'Organ'),
             ('Pathological Diagnosis', 'Diagnosis'),
             ('Detailed Pathological Diagnosis', 'Detailed Diagnosis'),
             ('Clinical Diagnosis/Cause of Death',
              'Detailed Diagnosis option 2'),
             ('Dissection', 'Dissection'),
             ('Quantity (g, ml, or ug)', 'Quantity'),
             ('HIV', 'HIV'),
             ('HEP B', 'HEP B'),
             ('tata', ['Gender', 'Age', 'Material Format'])])
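
The descending-order idea mentioned at the top of this answer could be combined with keeping the merged entry in place, as the question asks. The following is a rough sketch under assumptions, not the answer's code: the name `merge_at_first_index` and the `sep` parameter are made up, all values are assumed to be strings, and the merged entry is put at the smallest of the given indices. Deleting from the back keeps the earlier positions from shifting.

```python
from collections import OrderedDict

def merge_at_first_index(d, indices, new_key, sep=" "):
    """Merge the values at `indices` into one string stored under `new_key`
    at the smallest index; the values are joined in the order of `indices`."""
    if not indices:
        return d
    items = list(d.items())
    if not all(0 <= i < len(items) for i in indices):
        raise IndexError("Index out of bounds")
    # join in the order the caller listed the indices
    merged = sep.join(items[i][1] for i in indices)
    # delete from the back so the remaining positions do not shift
    for i in sorted(set(indices), reverse=True):
        del items[i]
    items.insert(min(indices), (new_key, merged))
    d.clear()            # rebuild the same dict object in the new order
    d.update(items)
    return d

fields = OrderedDict([
    ("Sample Code", "Vendor Sample ID"),
    ("Donor ID", "Vendor Subject ID"),
    ("Format", "Material Format"),
    ("Sample Type", "Sample Type"),
    ("Age", "Age"),
    ("Gender", "Gender"),
    ("Ethnicity/ Race", "Race"),
])

merge_at_first_index(fields, [5, 4, 2], "tata")
print(fields)
```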
