How to split two nested lists and combine the parts to create two new nested lists

Time: 2022-08-12 15:47:28

I'm trying to code a simple genetic programming utility in Python, but right now I'm stuck on the crossover/mate function for my trees. The trees are built from nested lists and look something like this:

# f = internal node (a function), c = leaf node (a constant)
tree1 = [f, [f, [f, c, c], [f, c, c]], [f, [f, c, c], [f, c, c]]]
tree2 = [f, [f, [f, c, c], c], [f, [f, c, c], c]]

I want to randomly select a split point in each tree and then combine one part from each tree into a new tree. There is also a maximum depth that must not be exceeded, so the split points can't be chosen just anywhere in the tree, as that might create a tree that is too large. Below is an example of how it should work:

# f:n, where n is the number of arguments the function takes
#               + split here  
tree1 = [f:2, [f:3, a, a, a], a]
#                            + split here
tree2 = [f:2, [f:2, a, a], [f:1, a]]

tree_child1 = [f:2, [f:1, a], a]
tree_child2 = [f:2, [f:2, a, a], [f:3, a, a, a]]
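
For reference, the example above can be reproduced by hand with plain list indexing. This is only a sketch: the strings "f:2", "a", and so on stand in for the real function and constant objects, and the split points are hard-coded rather than chosen randomly.

```python
import copy

# Placeholder strings stand in for the real function/constant objects.
tree1 = ["f:2", ["f:3", "a", "a", "a"], "a"]
tree2 = ["f:2", ["f:2", "a", "a"], ["f:1", "a"]]

# Work on copies so the parents are left untouched.
tree_child1 = copy.deepcopy(tree1)
tree_child2 = copy.deepcopy(tree2)

# Split points from the example: slot 1 of tree1 and slot 2 of tree2.
tree_child1[1], tree_child2[2] = tree_child2[2], tree_child1[1]

print(tree_child1)
print(tree_child2)
```

The hard part of the question is therefore not the swap itself but choosing the two slots at random while respecting the depth limit.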

I have no idea (at the moment) on how to solve this. Any tips or solutions are more than welcome!

(Added my parse function as it might help someone to understand the structure better.)

# My recursive code to parse the tree.
def parse(self, node=None):
    # Start at the root when no node is given.
    if node is None:
        node = self.root

    if isinstance(node, list):
        # Internal node: element 0 is the function, followed by its arguments.
        function = node[0]
        res = []
        for child in node[1:function.arity+1]:
            res.append(self.parse(child))
        value = function.parse(*res) # function
    else:
        value = node.parse() # constant
    return value

2 solutions

#1


I ended up implementing most of this as an exercise.

First, find the number of possible locations to split: every argument slot in the tree, i.e. every node except the root function.

def count(obj):
    total = 0
    for o in obj[1:]:
        # Add the node itself.
        total += 1

        if isinstance(o, list):
            total += count(o)
    return total

Then, a helper: given an index in the above range, figure out where it is.

def find_idx(tree, idx):
    """
    Return the node containing the idx'th function parameter (0-based), and the index
    of that parameter.  If idx is out of range, return (None, None).
    """
    if not isinstance(idx, list):
        # Stash this in a list, so recursive calls share the same value.
        idx = [idx]

    for i, o in enumerate(tree):
        # Skip the function itself.
        if i == 0:
            continue

        if idx[0] == 0:
            return tree, i

        idx[0] -= 1
        if isinstance(o, list):
            container, result_index = find_idx(o, idx)
            if container is not None:
                return container, result_index

    return None, None

Doing the swap is pretty simple now:

def random_swap(tree1, tree2):
    from random import randrange
    pos_in_1 = randrange(0, count(tree1))
    pos_in_2 = randrange(0, count(tree2))

    parent1, idx1 = find_idx(tree1, pos_in_1)
    parent2, idx2 = find_idx(tree2, pos_in_2)

    # Swap:
    parent1[idx1], parent2[idx2] = parent2[idx2], parent1[idx1]

c = 1
tree1 = ["f:2", c, ["f:1", c]]
tree2 = ["f:2", ["f:2", ["f:2", c, c], ["f:2", c, c]], ["f:3", ["f:4", c, c, c, c], ["f:2", c, c], c]]

# Swap a few times (bounded so the demo terminates) and show the results.
for _ in range(5):
    random_swap(tree1, tree2)
    print(tree1)
    print(tree2)

This doesn't implement a max depth, but it's a start.
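
One way the missing depth limit could be added is to retry the swap until both results fit. The sketch below is not the code above: depth, slots, crossover, and the attempts parameter are names made up for illustration, it works on deep copies so the parents stay intact, and it assumes every tree has at least one argument slot.

```python
import copy
import random


def depth(tree):
    """Depth of a nested-list tree; a leaf (non-list) has depth 0."""
    if not isinstance(tree, list):
        return 0
    return 1 + max((depth(child) for child in tree[1:]), default=0)


def slots(tree):
    """Yield (parent, index) for every argument slot, in pre-order."""
    for i in range(1, len(tree)):
        yield tree, i
        if isinstance(tree[i], list):
            yield from slots(tree[i])


def crossover(tree1, tree2, max_depth, attempts=20, rng=random):
    """Return two new trees; fall back to plain copies if no swap fits."""
    for _ in range(attempts):
        child1, child2 = copy.deepcopy(tree1), copy.deepcopy(tree2)
        parent1, i = rng.choice(list(slots(child1)))
        parent2, j = rng.choice(list(slots(child2)))
        parent1[i], parent2[j] = parent2[j], parent1[i]
        # Accept the swap only if neither child exceeds the limit.
        if depth(child1) <= max_depth and depth(child2) <= max_depth:
            return child1, child2
    return copy.deepcopy(tree1), copy.deepcopy(tree2)
```

Rejection sampling like this is simple but can waste attempts when the limit is tight; filtering the candidate slots by the depth they would produce is an alternative. Note also that copy.deepcopy copies whatever objects sit in the lists, which may or may not be what you want for your function nodes.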

This will also never replace the root node, i.e. the case where a node in tree1 becomes the whole new tree2 and all of tree2 becomes a node in tree1. A workaround would be to wrap each tree in e.g. [lambda a: a, tree], so editable nodes always have a parent node.
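
A minimal sketch of that wrapping idea, with made-up helper names (identity, wrap, unwrap) and one hard-coded swap instead of a random one:

```python
def identity(a):
    """Pass-through function used as the dummy root."""
    return a


def wrap(tree):
    """Give the real root a parent so it occupies slot 1 of a wrapper."""
    return [identity, tree]


def unwrap(wrapped):
    """Recover the (possibly replaced) real root."""
    return wrapped[1]


tree1 = ["f:2", 1, ["f:1", 1]]
tree2 = ["f:1", 2]
w1, w2 = wrap(tree1), wrap(tree2)

# Swap the ["f:1", 1] subtree of tree1 with the whole of tree2.
w1[1][2], w2[1] = w2[1], w1[1][2]

new1, new2 = unwrap(w1), unwrap(w2)
print(new1)
print(new2)
```

The wrapper adds one level of depth, so any depth check should account for it. If the dummy root is ever evaluated by the parse method from the question, it would presumably also need an arity of 1.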

This isn't very efficient. Maintaining node counts could make it faster, but then you'd need to store a reference to the parent, too, in order to update the counts efficiently. If you go that route, you'll really want to find or implement a real tree class.

#2


If you store in each internal node a count of the nodes in each branch, then you could pick a split point by generating a random number from 1 to 1 + the total number of descendants. If the answer is 1, split at that node; otherwise use the number to figure out which subtree to descend into, and repeat the process there.
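
A rough sketch of this idea follows. It uses a hypothetical Node class rather than the nested lists from the question, because the counts need somewhere to live, and it numbers nodes from 0 (0 meaning "this node") instead of from 1. The names Node, pick, and random_node are invented for the example.

```python
import random


class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        # Cached size: this node plus all of its descendants.
        self.size = 1 + sum(child.size for child in self.children)


def pick(node, k):
    """Return the k-th node in pre-order (0-based) using the cached sizes."""
    while k != 0:
        k -= 1  # skip the current node itself
        for child in node.children:
            if k < child.size:
                node = child  # the target lies inside this subtree
                break
            k -= child.size  # skip the whole subtree
        else:
            raise IndexError("k is larger than the tree")
    return node


def random_node(root, rng=random):
    """Pick a node uniformly at random without visiting the whole tree."""
    return pick(root, rng.randrange(root.size))
```

Each step only looks at the children of one node, so a pick costs roughly the depth of the tree times the arity. The cached sizes go stale after a swap and would have to be updated along the path to the root.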
