为什么我的链表没有删除多个重复项?

时间:2021-11-17 19:31:02

I have the following code:

我有以下代码:

class LinkedList:

  def __init__(self):
    self.head = None

I added the remove_duplicate() function to this LinkedList class which removes any duplicates except for the first instance in the list.

我将remove_duplicate()函数添加到此LinkedList类,该类除去列表中的第一个实例之外的任何重复项。

  def remove_duplicate(self, value):
    prev = None
    curr = self.head
    count = 0

    while curr:
        if curr.get_value() == value:
            count += 1
            if count > 1 :
                prev.set_next_node(curr.get_next_node())

        prev = curr
        curr = curr.get_next_node()

In my main function, I am doing these series of calls.

在我的主要功能中,我正在进行这些系列的调用。

linked_list = LinkedList()
linked_list.add("john")
linked_list.add("john")
linked_list.add("john")
linked_list.remove_duplicate("john")
print(linked_list)

I expected to get

我期望得到

['john']

But instead I got

但相反,我得到了

['john', 'john']

Why doesn't my code remove duplicates like it's supposed to?

为什么我的代码不会像它应该删除重复项?

p.s. there is Node code that i wrote earlier

附:我之前写的节点代码

class Node:
    def __init__(self, new_value):
        self.value = new_value
        self.next_node = None

    def get_value(self):
        return self.value

    def get_next_node(self):
        return self.next_node

    def set_value(self, new_value):
        self.value = new_value

    def set_next_node(self, new_next):
        self.next_node = new_next

1 个解决方案

#1


0  

Right, I think the first time your loop runs, you set curr to head, find a match to value, so increase count then move curr to prev and get the second element as curr.

是的,我认为你的循环第一次运行时,你将curr设置为head,找到一个匹配的值,所以增加count然后将curr移动到prev并将第二个元素作为curr。

Now, you find a match and drop curr from the list. This attaches the next element to prev.next (which is head). But you haven't changed what curr is, so when you leave the if , you then replace prev with curr which you had tried to drop.

现在,您可以从列表中找到匹配并删除curr。这将下一个元素附加到prev.next(即head)。但是你没有改变curr是什么,所以当你离开if时,你就用你想要掉落的curr替换prev。

So the next element you try to drop after the first, you can't do - i.e. any sequence to two duplicates, the second is ignored before you get back onto the correct chain.

因此,您尝试在第一个元素之后删除的下一个元素,您不能这样做 - 即任何序列到两个重复,第二个元素在您返回到正确的链之前被忽略。

I've found it easier to follow by spelling it out:

我发现通过拼写来说更容易理解:

# self.head
#     |
# [ node1 ].next > [ node2 ].next > [ node3 ].next > None
#     |                |                |
#   val1             val2             val3

prev = None, curr = node1
prev = curr # prev = node1 
curr = curr.next # curr = node2 

prev = node1, curr = node2
# here we change the contents of one of the nodes (node1.next)
prev.next = curr.next # node1.next = node3
# but here we are then placing the `disposed of` node as the previous one
prev = curr # prev = node2
curr = curr.next # curr = curr3

# prev is incorrect here
prev = node2, curr = node3
# here, when theres a match to delete, we then delete the link from the wrong node
prev.next = curr.next # node2.next = None
prev = curr # prev = node 3
curr = curr.next # curr = None

As a solution, you need to treat when you find a double and all other times slightly differently, possibly like below:

作为一种解决方案,您需要在找到双倍时以及所有其他时间稍微不同时对待,可能如下所示:

def remove_duplicate(self, value):
    prev = None
    curr = self.head
    count = 0

    while curr:
        # this is just to see which Nodes weren't being removed.
        if curr.value.startswith(value):
            count += 1
            # if this is false, your original update is still used
            if count > 1:
                # here, the first thing to do is get rid of curr,
                curr = curr.get_next_node()
                # and then attach the new curr to the previous Node
                prev.set_next_node(curr)
                # and restart the loop
                continue
        # this is done for any Node which isn't a multiple of value
        prev = curr
        curr = curr.get_next_node()

I have made a few guesses at bits of the rest of your code, but if this is run as

我已经对你的其余代码进行了一些猜测,但是如果这样运行的话

linked_list = LinkedList()
linked_list.add("john1")
linked_list.add("john2")
linked_list.add("fred")
linked_list.add("john3")
linked_list.add("john4")
linked_list.add("alice")
linked_list.add("john5")
linked_list.remove_duplicate("john")
print(linked_list)

you get

~$>./linked_list.py 
['john1', 'fred', 'alice']

#1


0  

Right, I think the first time your loop runs, you set curr to head, find a match to value, so increase count then move curr to prev and get the second element as curr.

是的,我认为你的循环第一次运行时,你将curr设置为head,找到一个匹配的值,所以增加count然后将curr移动到prev并将第二个元素作为curr。

Now, you find a match and drop curr from the list. This attaches the next element to prev.next (which is head). But you haven't changed what curr is, so when you leave the if , you then replace prev with curr which you had tried to drop.

现在,您可以从列表中找到匹配并删除curr。这将下一个元素附加到prev.next(即head)。但是你没有改变curr是什么,所以当你离开if时,你就用你想要掉落的curr替换prev。

So the next element you try to drop after the first, you can't do - i.e. any sequence to two duplicates, the second is ignored before you get back onto the correct chain.

因此,您尝试在第一个元素之后删除的下一个元素,您不能这样做 - 即任何序列到两个重复,第二个元素在您返回到正确的链之前被忽略。

I've found it easier to follow by spelling it out:

我发现通过拼写来说更容易理解:

# self.head
#     |
# [ node1 ].next > [ node2 ].next > [ node3 ].next > None
#     |                |                |
#   val1             val2             val3

prev = None, curr = node1
prev = curr # prev = node1 
curr = curr.next # curr = node2 

prev = node1, curr = node2
# here we change the contents of one of the nodes (node1.next)
prev.next = curr.next # node1.next = node3
# but here we are then placing the `disposed of` node as the previous one
prev = curr # prev = node2
curr = curr.next # curr = curr3

# prev is incorrect here
prev = node2, curr = node3
# here, when theres a match to delete, we then delete the link from the wrong node
prev.next = curr.next # node2.next = None
prev = curr # prev = node 3
curr = curr.next # curr = None

As a solution, you need to treat when you find a double and all other times slightly differently, possibly like below:

作为一种解决方案,您需要在找到双倍时以及所有其他时间稍微不同时对待,可能如下所示:

def remove_duplicate(self, value):
    prev = None
    curr = self.head
    count = 0

    while curr:
        # this is just to see which Nodes weren't being removed.
        if curr.value.startswith(value):
            count += 1
            # if this is false, your original update is still used
            if count > 1:
                # here, the first thing to do is get rid of curr,
                curr = curr.get_next_node()
                # and then attach the new curr to the previous Node
                prev.set_next_node(curr)
                # and restart the loop
                continue
        # this is done for any Node which isn't a multiple of value
        prev = curr
        curr = curr.get_next_node()

I have made a few guesses at bits of the rest of your code, but if this is run as

我已经对你的其余代码进行了一些猜测,但是如果这样运行的话

linked_list = LinkedList()
linked_list.add("john1")
linked_list.add("john2")
linked_list.add("fred")
linked_list.add("john3")
linked_list.add("john4")
linked_list.add("alice")
linked_list.add("john5")
linked_list.remove_duplicate("john")
print(linked_list)

you get

~$>./linked_list.py 
['john1', 'fred', 'alice']