How can I improve query performance when running views built on more than 50 tables?

Time: 2021-08-08 08:23:11

I have a bit of a situation with some SSRS reports that I have built and that are currently under development.

Some background on the DB: we have CRM 2015 on-prem, which has a SQL Server database on the back end. My SSRS reports are based on Filtered Views, whose names match the names in the CRM front end. So I have to pick the fields I need from the filtered view and then add the SQL logic.

Most reports are based on the new Admission and Service Activities views, each of which has 1-N relationships. Both views are growing rapidly day by day.

If I just run Select * from ServiceActivitesFilteredView, it takes more than 15 minutes to return around 500,000 rows, and the row count is growing by 2,000 a day. This view is based on more than 50 tables; from what I checked, they are mostly connected on the back end with Left Outer Joins.

And if I just run Select * from AdmissionFileteredView, it takes around 7 minutes (and growing day by day) and returns around 215,000 rows.

So when I have to build a report that includes both of the filtered views above, it becomes a nightmare. There are two situations, though:

  1. If I put many parameters in SSRS and drill down to the client level (the most granular level), which returns one or a few rows, the SSRS report works fine.
  2. But when a report needs data at the Office or Area level, which may cover a few hundred clients, it takes more than 20 minutes to return the results, depending on the office (see the code below; it returns no more than 100 rows).

I have created an ODS for a few reports where one-day-old or one-week-old data is OK. But a few reports need live data, and their performance is getting worse day by day. I tried "SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;" and also the "with(nolock)" option in the stored procedures where I use these views. Just FYI, we are not ready at this point to go the data warehouse route.

Here is the stored procedure that forced me to ask this question on this forum. Basically, our company has a policy that a Supervisor visits the house of a client who has agreed to an air conditioner installation, on the day the installers install the AC. What I am trying to do here is get the list of clients who missed the Initial visit from the supervisor when their air conditioner was installed, together with the next booked or rescheduled service date for the same client, so that the Initial visit can be completed on that date as the policy requires.

Select data.ServiceproviderName,data.new_clientidname,data.new_subprogramname,data.createdon,data.new_addresscity,data.new_workgroupidname,nextdate.NextVisitdate,data.new_sitename from 

        (Select distinct

            fa.new_sitename,
            fa.new_clientidname,
            fs.new_subprogramname,
            fa.new_servicename,
            fa.createdon,
            fa.new_admissionid,
            fa.ServiceproviderName,
            fa.new_addresscity,
            fa.new_workgroupidname

        from AdmissionFilteredView fa with(nolock)
        left join ServiceAppoinmentFilteredView fs with(nolock)
        on
        fa.new_admissionid=fs.regardingobjectid

        where 
        fa.new_sitename IN (SELECT value FROM dbo.udf_Split(@Office, ','))
        and cast(fa.createdon as date) BETWEEN cast(@Start as date) AND cast(@End as date)
        and fa.new_admissionstatusname In ('Admitted')
        and fa.new_servicename like 'AC Repair%'
        and fs.new_visittypename <> 'Initial'

        group by
        fa.new_sitename,fa.new_clientidname,fa.new_admissionid,fa.new_servicename,fa.createdon,fs.new_subprogramname,fa.ServiceproviderName,fa.new_admissionid,fa.new_addresscity,fa.new_workgroupidname)  data

        left join
        (Select distinct new_clientidname,min(fs.scheduledstart) as NextVisitdate
        from
        AdmissionFilteredView fa with(nolock)
        left join ServiceAppoinmentFilteredView fs with(nolock)
        on
        fa.new_admissionid=fs.regardingobjectid
        where fa.new_sitename IN (SELECT value FROM dbo.udf_Split(@Office, ','))
        and cast(fa.createdon as date) BETWEEN cast(@Start as date) AND cast(@End as date)
        and fa.new_admissionstatusname In ('Admitted')
        and fa.new_servicename like 'AC Repair%'
        and fs.new_visittypename <> 'Initial'
        and fs.statuscodename IN ('Booked','Rescheduled')
        group by 
        new_clientidname) nextdate
        on data.new_clientidname=nextdate.new_clientidname 

This takes roughly 25 minutes in SSMS and 35 minutes in SSRS in SSDT, and it doesn't even run from CRM, where it fails with a SQL timeout error. I can't create an ODS, since this report needs live data.

The only thing I can think of is to find the actual tables from which these two views are created and rewrite this stored procedure against those tables, or to create two tables from these two views and write code to keep the data in them up to date. I am not sure whether that is even possible, for example with Change Data Capture, an incremental load, or updating these two tables every time there is a new entry in the views or in the tables behind them.

Please help, considering the bigger picture and not just this stored procedure.

Thanks in advance.


5 answers

#1


1  

You can use the snapshot option in SSRS, so the report will not keep loading at the client end. On the database side, have you tried creating indexes on your tables?

#2


1  

I agree with the comments regarding your split function. You could store that overhead in a table variable, and then just reference the variable in your query:

DECLARE @Start DATETIME, @End DATETIME 
;

DECLARE @office VARCHAR(123) = 'Office A,Office B,Office C'
;

DECLARE @officeList TABLE ( 
    Office VARCHAR(100)
);

INSERT INTO @officeList
SELECT Value FROM dbo.udf_Split(@office, ',')
;

DECLARE @local_StartDate DATE = cast(@Start as date), 
        @local_EndDate DATE = cast(@End as date);

-- from your query
where 
    fa.new_sitename IN (SELECT Office FROM @officeList)
    and cast(fa.createdon as date) BETWEEN @local_StartDate AND @local_EndDate
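
A possible variation on the snippet above, offered as a sketch and not as the answerer's method: a temp table can stand in for the table variable (temp tables carry column statistics, table variables do not), and a half-open date range keeps fa.createdon free of function calls. The parameter values, the #officeList name, and the NVARCHAR(100) type are assumptions for illustration; dbo.udf_Split is the asker's own function and is assumed to return a column named value.

```sql
-- Sketch only: placeholder parameter values standing in for the report parameters.
DECLARE @Office VARCHAR(4000) = 'Office A,Office B,Office C';
DECLARE @Start DATETIME = '20210101', @End DATETIME = '20210131';

-- Temp table instead of a table variable; the collation clause avoids a
-- collation conflict if tempdb and the CRM database differ.
CREATE TABLE #officeList (
    Office NVARCHAR(100) COLLATE DATABASE_DEFAULT NOT NULL PRIMARY KEY
);

INSERT INTO #officeList (Office)
SELECT DISTINCT LTRIM(RTRIM(value))
FROM dbo.udf_Split(@Office, ',');

-- Half-open range: midnight of @Start up to, but excluding, the day after @End.
-- This selects the same rows as CAST(createdon AS DATE) BETWEEN the two dates.
DECLARE @RangeStart DATETIME = CAST(CAST(@Start AS DATE) AS DATETIME);
DECLARE @RangeEnd   DATETIME = DATEADD(DAY, 1, CAST(CAST(@End AS DATE) AS DATETIME));

SELECT fa.new_admissionid, fa.new_clientidname, fa.createdon
FROM AdmissionFilteredView AS fa
JOIN #officeList AS o
  ON o.Office = fa.new_sitename
WHERE fa.createdon >= @RangeStart
  AND fa.createdon <  @RangeEnd;

DROP TABLE #officeList;
```

Whether this helps on a view built over 50 tables depends on the actual execution plan, so it is worth comparing plans before and after.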

#3


1  

  1. Create a snapshot of the database, also called a reporting database. You can take the snapshot hourly or weekly, depending on how often the reports are run.
  2. Run reports as a background task: you can create jobs that run at night, build the reports by querying the reporting database, and have them ready when you come in in the morning. Or you can create a background task that sends you an email when the report is ready, so you do not have to wait 15 minutes for it to be generated.
  3. Use non-clustered indexes or filtered indexes on the columns you return and use in WHERE clauses.
  4. Create a new table and insert the results of your reporting query into it at night. A plain select * from the new table then returns the report very quickly, because the data was already built overnight.

If you cannot improve your query and the data is increasing every day, then a snapshot/reporting database is your best bet.
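
The nightly-table idea in the list above could look roughly like the following. This is a minimal sketch under stated assumptions: dbo.rpt_Admission and its index name are hypothetical, the column types are guesses to be matched against the real view, and new_admissionid is assumed to be unique per row in the view. The account that runs the job also needs CRM read access, since filtered views apply CRM security to the caller.

```sql
-- Sketch only: a full nightly rebuild of a hypothetical reporting table.
SET XACT_ABORT ON;  -- roll the whole reload back if any statement fails

IF OBJECT_ID('dbo.rpt_Admission', 'U') IS NULL
BEGIN
    CREATE TABLE dbo.rpt_Admission (
        new_admissionid         UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,  -- assumed unique
        new_clientidname        NVARCHAR(200) NULL,
        new_sitename            NVARCHAR(200) NULL,
        new_servicename         NVARCHAR(200) NULL,
        new_admissionstatusname NVARCHAR(200) NULL,
        createdon               DATETIME NULL
    );

    -- Supports the office and date filters the reports use.
    CREATE INDEX IX_rpt_Admission_site_created
        ON dbo.rpt_Admission (new_sitename, createdon);
END;

BEGIN TRANSACTION;

    TRUNCATE TABLE dbo.rpt_Admission;

    INSERT INTO dbo.rpt_Admission
        (new_admissionid, new_clientidname, new_sitename,
         new_servicename, new_admissionstatusname, createdon)
    SELECT new_admissionid, new_clientidname, new_sitename,
           new_servicename, new_admissionstatusname, createdon
    FROM AdmissionFilteredView;

COMMIT TRANSACTION;
```

Scheduled as a SQL Server Agent job step, this keeps the slow view out of the report's path; reports that read the table while the reload runs will wait until the transaction commits.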

#4


1  

How often do you need to run the report, and how current does it need to be? Would the users be OK with near-real-time data instead of real-time data? Perhaps you could pre-execute the heavy queries with a SQL job, store the results in staging tables, and then report off the staging table or a combination of staging tables. Perhaps some of the 50 operational tables could be warehoused into dimensional tables or staging tables designed for reporting. Also, what version of SQL Server are you using? That will help us figure out what might be available in your bag of tricks.
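
If near-real-time data is acceptable, a staging table can be topped up every few minutes instead of rebuilt. The sketch below assumes the view exposes a modifiedon column that is set on every insert and update, and that new_admissionid is unique; dbo.rpt_AdmissionLive is a hypothetical name, and the column list is trimmed for brevity. It does not remove rows deleted in CRM, which would need separate handling.

```sql
-- Sketch only: incremental load driven by a modifiedon watermark (assumed column).
SET XACT_ABORT ON;

IF OBJECT_ID('dbo.rpt_AdmissionLive', 'U') IS NULL
    CREATE TABLE dbo.rpt_AdmissionLive (
        new_admissionid  UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,  -- assumed unique
        new_clientidname NVARCHAR(200) NULL,
        new_sitename     NVARCHAR(200) NULL,
        createdon        DATETIME NULL,
        modifiedon       DATETIME NULL
    );

-- Watermark: the newest change already copied (or a floor date on the first run).
DECLARE @LastLoad DATETIME;
SELECT @LastLoad = ISNULL(MAX(modifiedon), '19000101')
FROM dbo.rpt_AdmissionLive;

MERGE dbo.rpt_AdmissionLive AS tgt
USING (
    SELECT new_admissionid, new_clientidname, new_sitename, createdon, modifiedon
    FROM AdmissionFilteredView
    WHERE modifiedon >= @LastLoad   -- >= so rows sharing the watermark are not missed
) AS src
   ON tgt.new_admissionid = src.new_admissionid
WHEN MATCHED AND src.modifiedon > tgt.modifiedon THEN
    UPDATE SET new_clientidname = src.new_clientidname,
               new_sitename     = src.new_sitename,
               createdon        = src.createdon,
               modifiedon       = src.modifiedon
WHEN NOT MATCHED BY TARGET THEN
    INSERT (new_admissionid, new_clientidname, new_sitename, createdon, modifiedon)
    VALUES (src.new_admissionid, src.new_clientidname, src.new_sitename,
            src.createdon, src.modifiedon);
```

Filtered views return dates in the calling user's time zone, so the job should always run under the same account to keep the watermark comparison consistent.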

#5


0  

The first issue I noticed is that you have 'select distinct' right at the top. This type of statement locks up tables and negates the use of with(nolock) and the left join.

You need to rewrite your queries so they do NOT use any 'select distinct' clauses.
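
One way the question's query might be written without DISTINCT, as a sketch and not a tested replacement: both DISTINCT keywords there sit on subqueries that already GROUP BY the same columns, so they can simply be dropped. The version below also reads the two views once into a temp table and reuses it, and uses an inner join because the filter on fs.new_visittypename already discards admissions with no matching appointment. The parameter values and the #visits name are placeholders.

```sql
-- Sketch only: placeholder parameter values; dbo.udf_Split is the asker's function.
DECLARE @Office VARCHAR(4000) = 'Office A,Office B';
DECLARE @Start DATETIME = '20210101', @End DATETIME = '20210131';

-- Read the two views once and keep every column both later steps need.
SELECT fa.new_admissionid, fa.new_sitename, fa.new_clientidname, fa.new_servicename,
       fa.createdon, fa.ServiceproviderName, fa.new_addresscity, fa.new_workgroupidname,
       fs.new_subprogramname, fs.statuscodename, fs.scheduledstart
INTO #visits
FROM AdmissionFilteredView AS fa
JOIN ServiceAppoinmentFilteredView AS fs
  ON fa.new_admissionid = fs.regardingobjectid
WHERE fa.new_sitename IN (SELECT value FROM dbo.udf_Split(@Office, ','))
  AND CAST(fa.createdon AS DATE) BETWEEN CAST(@Start AS DATE) AND CAST(@End AS DATE)
  AND fa.new_admissionstatusname = 'Admitted'
  AND fa.new_servicename LIKE 'AC Repair%'
  AND fs.new_visittypename <> 'Initial';

SELECT d.ServiceproviderName, d.new_clientidname, d.new_subprogramname, d.createdon,
       d.new_addresscity, d.new_workgroupidname, n.NextVisitdate, d.new_sitename
FROM (
    -- GROUP BY alone removes the duplicates; the column list matches the original.
    SELECT new_sitename, new_clientidname, new_subprogramname, new_servicename,
           createdon, new_admissionid, ServiceproviderName,
           new_addresscity, new_workgroupidname
    FROM #visits
    GROUP BY new_sitename, new_clientidname, new_subprogramname, new_servicename,
             createdon, new_admissionid, ServiceproviderName,
             new_addresscity, new_workgroupidname
) AS d
LEFT JOIN (
    -- Earliest booked or rescheduled visit per client.
    SELECT new_clientidname, MIN(scheduledstart) AS NextVisitdate
    FROM #visits
    WHERE statuscodename IN ('Booked', 'Rescheduled')
    GROUP BY new_clientidname
) AS n
  ON n.new_clientidname = d.new_clientidname;

DROP TABLE #visits;
```

The expensive part, the join across the two filtered views, now runs once instead of twice; the two aggregations work on the much smaller temp table.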
