Migrating a Django model to a unique_together constraint

Time: 2022-06-01 11:47:44

I have a model with three fields

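from django.db import models

class MyModel(models.Model):
    # on_delete is required from Django 2.0 onward; CASCADE is an assumed choice
    a = models.ForeignKey(A, on_delete=models.CASCADE)
    b = models.ForeignKey(B, on_delete=models.CASCADE)
    c = models.ForeignKey(C, on_delete=models.CASCADE)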

I want to enforce a uniqueness constraint across these three fields, and found Django's unique_together, which seems to be the solution. However, I already have an existing database that contains many duplicates. I know that since unique_together works at the database level, I need to de-duplicate the rows first and then run the migration.

Is there a good way to go about removing duplicates (where a duplicate has the same (a, b, c)) so that I can run the migration to get the unique_together constraint?
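
To illustrate why the duplicates have to be cleaned up first, here is a minimal sketch using Python's standard-library sqlite3 module rather than Django. The table name `mymodel`, the columns `a_id`, `b_id`, `c_id`, and the helper `try_add_unique_index` are assumptions made for this illustration. A unique index stands in for what a unique_together migration would ask the database to create.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mymodel (id INTEGER PRIMARY KEY, a_id INTEGER, b_id INTEGER, c_id INTEGER)"
)
# Rows 1 and 2 share the same (a_id, b_id, c_id), so they are duplicates.
conn.executemany(
    "INSERT INTO mymodel (id, a_id, b_id, c_id) VALUES (?, ?, ?, ?)",
    [(1, 10, 20, 30), (2, 10, 20, 30), (3, 10, 20, 31)],
)


def try_add_unique_index(connection):
    """Return True if the unique index could be created, False otherwise."""
    try:
        connection.execute(
            "CREATE UNIQUE INDEX mymodel_abc_uniq ON mymodel (a_id, b_id, c_id)"
        )
        return True
    except sqlite3.DatabaseError:
        # The database refuses the constraint while duplicate rows exist.
        return False


index_created = try_add_unique_index(conn)
print(index_created)
```

In this sketch the index creation is rejected because two rows share the same three values, which is the situation the question describes.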

1 solution

#1


22  

If you are happy to choose one of the duplicates arbitrarily, I think the following might do the trick. Perhaps not the most efficient but simple enough and I guess you only need to run this once. Please verify this all works yourself on some test data in case I've done something silly, since you are about to delete a bunch of data.

First, we find the groups of objects that are duplicates of each other. For each group, we (arbitrarily) pick a "master" that we are going to keep. Our chosen method is to pick the one with the lowest pk:

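from django.db.models import Count, Min

# Group by the model's field names (lowercase a, b, c); keep groups with more than one row
master_pks = MyModel.objects.values('a', 'b', 'c'
    ).annotate(Min('pk'), count=Count('pk')
    ).filter(count__gt=1
    ).values_list('pk__min', flat=True)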
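
As a rough illustration of what the grouping query is meant to compute, the same logic can be sketched in plain Python over (pk, a, b, c) tuples. The sample rows and the helper name `find_master_pks` are made up for this example and are not part of Django.

```python
from collections import defaultdict


def find_master_pks(rows):
    """Return the lowest pk of every (a, b, c) group that has duplicates."""
    groups = defaultdict(list)
    for pk, a, b, c in rows:
        groups[(a, b, c)].append(pk)
    # Only groups with more than one row contain duplicates (count > 1).
    return sorted(min(pks) for pks in groups.values() if len(pks) > 1)


# Hypothetical sample data: (pk, a, b, c)
rows = [
    (1, 10, 20, 30),
    (2, 10, 20, 30),
    (3, 10, 20, 31),
    (4, 11, 20, 30),
    (5, 11, 20, 30),
    (6, 10, 20, 30),
]
master_pks = find_master_pks(rows)
print(master_pks)  # [1, 4]
```

Row 3 has no duplicate, so it never shows up as a master. Only groups that actually need cleaning are returned.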

We then loop over each master and delete all its duplicates:

# Load the master objects, keyed by pk
masters = MyModel.objects.in_bulk(list(master_pks))

for master in masters.values():
    # Delete every row sharing (a, b, c) with the master, except the master itself
    MyModel.objects.filter(a=master.a, b=master.b, c=master.c
        ).exclude(pk=master.pk).delete()
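
The same cleanup could probably also be done in a single SQL statement instead of a Python loop. The following is only a sketch using the standard-library sqlite3 module. The table and column names (`mymodel`, `a_id`, `b_id`, `c_id`) are assumptions, and the exact SQL may need adjusting for other databases. It keeps the lowest id in each group, then adds a unique index similar to the one a unique_together migration would create.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mymodel (id INTEGER PRIMARY KEY, a_id INTEGER, b_id INTEGER, c_id INTEGER)"
)
conn.executemany(
    "INSERT INTO mymodel (id, a_id, b_id, c_id) VALUES (?, ?, ?, ?)",
    [(1, 10, 20, 30), (2, 10, 20, 30), (3, 10, 20, 31), (4, 10, 20, 30)],
)

with conn:  # commit on success, roll back if anything raises
    # Keep the row with the lowest id in each (a_id, b_id, c_id) group.
    conn.execute(
        """
        DELETE FROM mymodel
        WHERE id NOT IN (
            SELECT MIN(id) FROM mymodel GROUP BY a_id, b_id, c_id
        )
        """
    )
    # With the duplicates gone, the unique index can be created.
    conn.execute(
        "CREATE UNIQUE INDEX mymodel_abc_uniq ON mymodel (a_id, b_id, c_id)"
    )

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM mymodel ORDER BY id")]
print(remaining_ids)  # [1, 3]
```

In Django itself, the corresponding step after the cleanup would be to add unique_together = ('a', 'b', 'c') to the model's Meta class and then run makemigrations and migrate.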
