We have tables in the following format:
Order(OrderID,CustomerID,OrderDate,CreatedByUserID,LastModifiedByUserID)
OrderItem(OrderID,ProductID,ProviderID,ItemStatus,CompletedByUserID)
Companies(CompanyID, CompanyName, CompanyParentID, CompanyRegionID)
The full data set needed to generate the Orders reports requires almost 12 joins and about 250 fields. Below is a short example:
SELECT o.OrderID, o.CustomerID ... FROM Orders AS o
INNER JOIN OrderItems AS items ON o.OrderID = items.OrderID
INNER JOIN Products AS p ON items.ProductID = p.ProductID
INNER JOIN Companies AS cust ON o.CustomerID = cust.CompanyID
LEFT OUTER JOIN Companies AS prov ON items.ProviderID = prov.CompanyID
INNER JOIN Users AS u1 ON o.CreatedByUserID = u1.UserID
INNER JOIN Users AS u2 ON o.LastModifiedByUserID = u2.UserID
LEFT OUTER JOIN Users AS ui1 ON items.CompletedByUserID = ui1.UserID
LEFT OUTER JOIN Users AS ui2 ON items.VerifiedByUserID = ui2.UserID
LEFT OUTER JOIN Companies AS parent ON cust.CompanyParentID = parent.CompanyID
LEFT OUTER JOIN Companies AS region ON cust.CompanyRegionID = region.CompanyID
My question is: since this is a reporting application, should we run this SQL on a schedule (e.g. every hour) and copy the data to a temp table from which the reports are run, or should we run all these joins every time a user requests a report?
Note:
- The reports can be up to an hour out of date, since they are typically run on a weekly/monthly basis.
- The data is multi-tenant, i.e. it is filtered depending on who is running the reports (customers, parent companies, regional offices, product providers, etc.).
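The first option in the question (periodically copying the joined data into a table that the reports read from) could look roughly like the sketch below. It is only an illustration, written in Python against an in-memory SQLite database rather than the asker's actual server: the OrderReport table, the function names, and the sample rows are all hypothetical, and only a handful of the roughly 250 fields are included. The tenant filter is applied when the flattened table is read, not when it is built.

```python
import sqlite3


def build_sample_db() -> sqlite3.Connection:
    """Create a tiny, made-up subset of the schema from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Companies (CompanyID INTEGER PRIMARY KEY, CompanyName TEXT,
                                CompanyParentID INTEGER, CompanyRegionID INTEGER);
        CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
                             OrderDate TEXT);
        CREATE TABLE OrderItems (OrderID INTEGER, ProductID INTEGER,
                                 ProviderID INTEGER, ItemStatus TEXT);
        -- Hypothetical flattened table that the reports read from.
        CREATE TABLE OrderReport (OrderID INTEGER, ProductID INTEGER, ItemStatus TEXT,
                                  CustomerID INTEGER, CustomerName TEXT,
                                  ParentID INTEGER,
                                  ProviderID INTEGER, ProviderName TEXT);
        INSERT INTO Companies VALUES (1, 'Parent Co', NULL, NULL),
                                     (2, 'Customer A', 1, NULL),
                                     (3, 'Customer B', NULL, NULL),
                                     (4, 'Provider X', NULL, NULL);
        INSERT INTO Orders VALUES (10, 2, '2024-01-05'), (11, 3, '2024-01-06');
        INSERT INTO OrderItems VALUES (10, 100, 4, 'Done'),
                                      (10, 101, NULL, 'Open'),
                                      (11, 100, 4, 'Done');
    """)
    return conn


def refresh_order_report(conn: sqlite3.Connection) -> None:
    """Reload OrderReport from the live tables (run this on a schedule)."""
    with conn:  # DELETE and INSERT are committed together, or rolled back together
        conn.execute("DELETE FROM OrderReport")
        conn.execute("""
            INSERT INTO OrderReport
            SELECT o.OrderID, items.ProductID, items.ItemStatus,
                   cust.CompanyID, cust.CompanyName, cust.CompanyParentID,
                   prov.CompanyID, prov.CompanyName
            FROM Orders AS o
            INNER JOIN OrderItems AS items ON o.OrderID = items.OrderID
            INNER JOIN Companies AS cust ON o.CustomerID = cust.CompanyID
            LEFT OUTER JOIN Companies AS prov ON items.ProviderID = prov.CompanyID
        """)


def report_rows_for(conn: sqlite3.Connection, company_id: int) -> list:
    """Return the rows visible to one tenant: a customer, its parent, or a provider."""
    return conn.execute(
        """
        SELECT OrderID, ProductID FROM OrderReport
        WHERE CustomerID = :c OR ParentID = :c OR ProviderID = :c
        ORDER BY OrderID, ProductID
        """,
        {"c": company_id},
    ).fetchall()


if __name__ == "__main__":
    db = build_sample_db()
    refresh_order_report(db)
    print(report_rows_for(db, 4))  # rows visible to the provider
```

Because the refresh deletes and reloads inside one transaction, running it repeatedly leaves exactly one copy of the data; a production version would more likely refresh incrementally and index the tenant columns.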
2 Solutions
#1
It's always a good idea to separate OLTP and reporting tasks, ideally in different DB instances.
But you must take into account how recent the data in the reports needs to be.
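As a small illustration of the freshness point, the check below decides whether a separate reporting copy is too old to serve. It is a sketch only: the one-hour limit comes from the question, while the function name and the idea of storing a last-refresh timestamp are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# From the question: reports may be up to an hour out of date.
MAX_AGE = timedelta(hours=1)


def needs_refresh(last_refresh: Optional[datetime],
                  now: datetime,
                  max_age: timedelta = MAX_AGE) -> bool:
    """Return True when the reporting copy is missing or older than max_age."""
    if last_refresh is None:  # never refreshed yet
        return True
    return now - last_refresh >= max_age


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(needs_refresh(now - timedelta(minutes=90), now))
```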
#2
It really depends on what you need: does the data have to be completely up to date every time the report is run? If not, you have a few options (personally, I wouldn't use a temp table):
Cached Reports - You can get the report server to cache a copy of a report rather than generate it each time; the report is regenerated once the cached copy expires:
Caching a report
Report Snapshots - You can get the report server to create a snapshot of the data at a certain point in time; the reports then run against this snapshot:
Report processing properties
Ultimately, if you have many reports of this type with many joins, the best option is to implement a data-warehouse-style solution with a schema that is optimised for reporting, rather than the highly normalised schemas found in OLTP systems.
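To make the idea of a reporting-optimised schema a little more concrete, here is a minimal sketch in Python with an in-memory SQLite database. The DimCompany and FactOrderItem tables, their columns, and the sample rows are hypothetical and not taken from the question. The point it illustrates is that the parent and region names are stored directly on the company dimension, so a report needs a single join instead of several self-joins on Companies.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a tiny, made-up star schema: one fact table and one dimension."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension: parent and region are flattened into plain columns.
        CREATE TABLE DimCompany (CompanyKey INTEGER PRIMARY KEY,
                                 CompanyName TEXT,
                                 ParentName TEXT,
                                 RegionName TEXT);
        -- Fact: one row per order item, pointing at the customer dimension row.
        CREATE TABLE FactOrderItem (OrderID INTEGER,
                                    ProductID INTEGER,
                                    CustomerKey INTEGER REFERENCES DimCompany(CompanyKey),
                                    OrderDate TEXT,
                                    ItemStatus TEXT);
        INSERT INTO DimCompany VALUES (1, 'Customer A', 'Parent Co', 'North'),
                                      (2, 'Customer B', NULL, 'South');
        INSERT INTO FactOrderItem VALUES (10, 100, 1, '2024-01-05', 'Done'),
                                         (10, 101, 1, '2024-01-05', 'Open'),
                                         (11, 100, 2, '2024-01-06', 'Done');
    """)
    return conn


def items_per_region(conn: sqlite3.Connection) -> list:
    """Count order items per customer region using a single join."""
    return conn.execute("""
        SELECT d.RegionName, COUNT(*) AS ItemCount
        FROM FactOrderItem AS f
        INNER JOIN DimCompany AS d ON f.CustomerKey = d.CompanyKey
        GROUP BY d.RegionName
        ORDER BY d.RegionName
    """).fetchall()


if __name__ == "__main__":
    print(items_per_region(build_warehouse()))
```

The trade-off is that the dimension has to be reloaded when a company's parent or region changes, which is usually acceptable for reports that tolerate some staleness.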