Suppose we have a model in Django defined as follows:
from django.db import models

class Literal(models.Model):
    name = models.CharField(...)
    ...
The name field is not unique, so it can contain duplicate values. I need to accomplish the following task: select all rows from the model whose name value also appears in at least one other row.
I know how to do it using plain SQL (this may not be the best solution):
select * from literal where name in (
    select name from literal group by name having count(name) > 1
);
So, is it possible to select this using the Django ORM? Or is there a better SQL solution?
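For reference, the SQL in the question can be tried without a Django project. The following is only a minimal sketch using Python's built-in sqlite3 module; the table layout, the sample names, and the helper rows_with_duplicate_names are assumptions made for illustration and are not part of the original question.

```python
import sqlite3


def rows_with_duplicate_names(conn):
    """Return (id, name) rows whose name occurs more than once."""
    sql = """
        SELECT id, name FROM literal
        WHERE name IN (
            SELECT name FROM literal GROUP BY name HAVING COUNT(name) > 1
        )
        ORDER BY id
    """
    return conn.execute(sql).fetchall()


# Hypothetical in-memory table standing in for the Literal model's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE literal (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO literal (name) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("b",)],
)

duplicates = rows_with_duplicate_names(conn)
print(duplicates)
```

The row named "c" is left out because it appears only once; every row that shares its name with another row is returned.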
5 answers
#1
135
Try:
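from django.db.models import Count

# Group by name, count the rows in each group, keep groups with more than one row.
# The empty order_by() clears any default ordering so it is not added to the GROUP BY.
(Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))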
This is as close as you can get with Django. The problem is that this returns a ValuesQuerySet (a values() queryset of dictionaries in Django 1.9 and later) containing only the name and the count. However, you can then use it to construct a regular QuerySet by feeding it back into another query:
dupes = (Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))
# Second query: fetch the full model instances for the duplicated names.
Literal.objects.filter(name__in=[item['name'] for item in dupes])
#2
27
This was rejected as an edit, so here it is as a separate answer:
dups = (
    Literal.objects.values('name')
    .annotate(count=Count('id'))
    .values('name')
    .order_by()
    .filter(count__gt=1)
)
This returns a ValuesQuerySet with all of the duplicated names. You can then use it to construct a regular QuerySet by feeding it back into another query. The Django ORM is smart enough to combine the two into a single query:
Literal.objects.filter(name__in=dups)
The extra call to .values('name') after the annotate call looks a little strange, but without it the subquery fails. The extra values call tricks the ORM into selecting only the name column for the subquery.
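The likely reason the subquery fails without the second .values('name') is that a subquery on the right-hand side of IN has to return exactly one column when the left-hand side is a single column. The sketch below illustrates that rule at the SQL level with Python's sqlite3 module. It is an assumption that this is what happens underneath the ORM, and the table, the sample data, and the helper in_subquery_error are made up for illustration.

```python
import sqlite3


def in_subquery_error(conn, subquery_columns):
    """Run an IN (...) query with the given subquery columns.

    Returns None when the query runs, or the database error message otherwise.
    """
    sql = (
        "SELECT id FROM literal WHERE name IN ("
        f"SELECT {subquery_columns} FROM literal "
        "GROUP BY name HAVING COUNT(id) > 1)"
    )
    try:
        conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None


# Hypothetical in-memory table standing in for the Literal model's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE literal (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO literal (name) VALUES (?)", [("a",), ("a",), ("b",)]
)

# One column in the subquery: accepted.
one_column = in_subquery_error(conn, "name")
# Name plus the count, as selected without the extra values('name'): rejected.
two_columns = in_subquery_error(conn, "name, COUNT(id)")
print(one_column, two_columns)
```

Selecting only name keeps the subquery to one column, which is what the extra .values('name') call achieves in the ORM version.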
#3
9
Try using aggregation:
Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)
#4
1
If you use PostgreSQL, you can do something like this:
from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Func, Value

duplicate_ids = (Literal.objects.values('name')
    .annotate(ids=ArrayAgg('id'))
    .annotate(c=Func('ids', Value(1), function='array_length'))
    .filter(c__gt=1)
    .annotate(ids=Func('ids', function='unnest'))
    .values_list('ids', flat=True))
It results in this rather simple SQL query:
SELECT unnest(ARRAY_AGG("app_literal"."id")) AS "ids"
FROM "app_literal"
GROUP BY "app_literal"."name"
HAVING array_length(ARRAY_AGG("app_literal"."id"), 1) > 1
#5
0
If you want only a list of the duplicated names rather than the objects, you can use the following query:
repeated_names = (Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1)
    .values_list('name', flat=True))