The ORM in Django lets us easily annotate (add fields to) querysets based on related data; however, I can't find a way to get multiple annotations for different filtered subsets of related data.
This is being asked in relation to django-helpdesk, an open-source Django-powered trouble-ticket tracker. I need to have data pivoted like this for charting and reporting purposes.
Consider these models:
from django.db import models

CHOICE_LIST = (
    ('open', 'Open'),
    ('closed', 'Closed'),
)

class Queue(models.Model):
    name = models.CharField(max_length=40)

class Issue(models.Model):
    subject = models.CharField(max_length=40)
    # on_delete is required from Django 2.0 onward; CASCADE is assumed here.
    queue = models.ForeignKey(Queue, on_delete=models.CASCADE)
    status = models.CharField(max_length=10, choices=CHOICE_LIST)
And this dataset:
Queues:
ID | Name
---+------------------------------
 1 | Product Information Requests
 2 | Service Requests
Issues:
ID | Queue | Status
---+-------+---------
 1 |     1 | open
 2 |     1 | open
 3 |     1 | closed
 4 |     2 | open
 5 |     2 | closed
 6 |     2 | closed
 7 |     2 | closed
I would like to see an annotation/aggregate look something like this:
Queue ID | Name                         | open | closed
---------+------------------------------+------+--------
       1 | Product Information Requests |    2 |      1
       2 | Service Requests             |    1 |      3
This is basically a crosstab or pivot table, in Excel parlance. I currently build this output with some custom SQL queries; however, if I can move to the Django ORM, I can filter the data dynamically more easily, without doing dodgy insertion of WHERE clauses into my SQL.
For "bonus points": How would one do this where the pivot field (status in the example above) was a date, and we wanted the columns to be months / weeks / quarters / days?
2 solutions
#1
6
You have Python, use it.
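from collections import defaultdict

# Count issues per (queue, status) pair.
summary = defaultdict(int)
# select_related avoids one extra query per issue when reading issue.queue.
for issue in Issue.objects.select_related('queue'):
    summary[issue.queue, issue.status] += 1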
Now your summary object has (queue, status) as a two-tuple key. You can display it directly, using various template techniques.
Or, you can regroup it into a table-like structure, if that's simpler.
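table = []
# Each queue appears in several (queue, status) keys, so collect it once.
queues = {q for q, _ in summary}
# Model instances have no natural ordering, so sort by primary key.
for q in sorted(queues, key=lambda q: q.id):
    table.append((q.id, q.name, summary[q, 'open'], summary[q, 'closed']))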
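The counting and regrouping steps above can be tried without a database. This is a minimal sketch that uses plain tuples in place of Issue rows; the names `issues` and `queue_names` are stand-ins invented for illustration and hold the sample data from the question.

```python
from collections import defaultdict

# Stand-in data mirroring the question's tables (hypothetical names).
queue_names = {1: "Product Information Requests", 2: "Service Requests"}
issues = [  # (issue_id, queue_id, status)
    (1, 1, "open"), (2, 1, "open"), (3, 1, "closed"),
    (4, 2, "open"), (5, 2, "closed"), (6, 2, "closed"), (7, 2, "closed"),
]

# Count issues per (queue, status) pair.
summary = defaultdict(int)
for _issue_id, queue_id, status in issues:
    summary[queue_id, status] += 1

# Regroup into one row per queue, with one column per status.
# Missing pairs read as 0 because summary is a defaultdict(int).
table = [
    (queue_id, name, summary[queue_id, "open"], summary[queue_id, "closed"])
    for queue_id, name in sorted(queue_names.items())
]

for row in table:
    print(row)
```

The rows in `table` line up with the crosstab requested in the question: queue ID, name, open count, closed count.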
You have lots and lots of Python techniques for doing pivot tables.
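One such technique also covers the "bonus points" case from the question, where the pivot column is a date. The sketch below is only an illustration: the question's models have no date field, so the `created` values and the helper names (`month_bucket`, `quarter_bucket`, `pivot_by`) are assumptions. The idea is to map each date to a bucket key and count per (queue, bucket).

```python
from collections import defaultdict
from datetime import date

# Hypothetical (queue_id, created) pairs; not part of the original models.
issues = [
    (1, date(2009, 1, 5)), (1, date(2009, 1, 20)), (1, date(2009, 2, 3)),
    (2, date(2009, 1, 9)), (2, date(2009, 3, 14)),
]

def month_bucket(d):
    """Map a date to a (year, month) pivot column."""
    return (d.year, d.month)

def quarter_bucket(d):
    """Map a date to a (year, quarter) pivot column."""
    return (d.year, (d.month - 1) // 3 + 1)

def pivot_by(rows, bucket):
    """Count rows per (queue_id, bucket(created)) pair."""
    counts = defaultdict(int)
    for queue_id, created in rows:
        counts[queue_id, bucket(created)] += 1
    return dict(counts)

by_month = pivot_by(issues, month_bucket)
columns = sorted({col for _, col in by_month})
for queue_id in sorted({q for q, _ in by_month}):
    # .get() fills in 0 for months in which a queue had no issues.
    print(queue_id, [by_month.get((queue_id, col), 0) for col in columns])
```

Swapping in `quarter_bucket`, or a similar function based on `d.isocalendar()` for weeks, changes the column granularity without touching the counting loop.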
If you measure, you may find that a mostly-Python solution like this is actually faster than a pure SQL solution. Why? Mappings can be faster than SQL algorithms which require a sort as part of a GROUP-BY.
#2
2
Django has added a lot of functionality to the ORM since this question was originally asked. Since Django 1.8, the way to pivot data is to use the Case/When conditional expressions. There is also a third-party app that will do that for you (PyPI and documentation).
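Case/When expressions express conditional aggregation, i.e. one `SUM(CASE WHEN ...)` column per pivot value. Because a Django project cannot run as a standalone snippet, the sketch below shows that SQL pattern with the standard-library `sqlite3` module instead of the ORM; the table and column names (`queue`, `issue`, `open_count`, `closed_count`) are assumptions chosen to mirror the question's models. A rough ORM counterpart, assuming the default reverse name `issue`, is given in the comment.

```python
import sqlite3

# Rough ORM counterpart (Django 1.8+), untested here and shown for orientation:
#   from django.db.models import Case, IntegerField, Sum, When
#   Queue.objects.annotate(
#       open=Sum(Case(When(issue__status='open', then=1),
#                     default=0, output_field=IntegerField())),
#       closed=Sum(Case(When(issue__status='closed', then=1),
#                       default=0, output_field=IntegerField())),
#   )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE queue (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE issue (id INTEGER PRIMARY KEY, queue_id INTEGER, status TEXT);
    INSERT INTO queue VALUES
        (1, 'Product Information Requests'), (2, 'Service Requests');
    INSERT INTO issue VALUES
        (1, 1, 'open'), (2, 1, 'open'), (3, 1, 'closed'),
        (4, 2, 'open'), (5, 2, 'closed'), (6, 2, 'closed'), (7, 2, 'closed');
""")

# One conditional SUM per pivot column; LEFT JOIN keeps queues with no issues.
query = """
    SELECT q.id, q.name,
           SUM(CASE WHEN i.status = 'open' THEN 1 ELSE 0 END) AS open_count,
           SUM(CASE WHEN i.status = 'closed' THEN 1 ELSE 0 END) AS closed_count
    FROM queue AS q
    LEFT JOIN issue AS i ON i.queue_id = q.id
    GROUP BY q.id, q.name
    ORDER BY q.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Filtering stays dynamic in the ORM form because the annotated queryset can still be chained with `.filter()` calls, which is what the question was after.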