Creating a variable on the database to hold global stats

Time: 2021-08-18 00:10:29

Let's pretend I've got a social network. I'm always showing the user how many users are registered and have activated their profile. So every time a single user logs in, the site goes to the DB and runs:

select count(*) from users where status = 'activated'

So if 5,000 users log in, or simply refresh the page, it will run the SQL above 5,000 times. I was wondering whether it would be better to have a variable somewhere (I still have no idea where to put it) that gets incremented by 1 every time a user activates their profile. Then, when I want to show how many users are registered on the social network, I'd only read the value of this variable.

How can I do this? Is it really a better solution than what I've got?

5 answers

#1


2  

You could use an indexed view, which SQL Server will maintain automatically:

create table dbo.users (
    ID int not null,
    Activated bit not null
)
go
create view dbo.user_status_stats (Activated,user_count)
with schemabinding
as
    select Activated,COUNT_BIG(*) from dbo.users group by Activated
go
create unique clustered index IX_user_status_stats on dbo.user_status_stats (Activated)
go

This has just two possible statuses, but it could be extended to more by using a different data type. As I say, SQL Server will maintain the counts behind the scenes in this case, so you can just query the view:

SELECT user_count from user_status_stats with (NOEXPAND) where Activated = 1

and it won't have to query the underlying table. You need the WITH (NOEXPAND) hint on editions other than Enterprise/Developer, where the optimizer does not use the view's index automatically.


Although, as @Jim suggested, doing a COUNT(*) against an index should also be pretty quick when the index column(s) can satisfy the query criteria using equality comparisons.

#2


1  

As you've already guessed, it's not a great idea to calculate this value every time someone hits the site.

You could do as you suggest and update a central value as users are added, although you'll have to make sure that two processes can't end up updating the number at the same time.
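One way to avoid the simultaneous-update problem is to let the database do the read-modify-write in a single statement instead of reading the value into the application and writing it back. Below is a minimal sketch of that idea, using Python's built-in `sqlite3` module as a stand-in for SQL Server. The `stats` table, the `activated_users` row, and the function names are assumptions made for illustration; they are not part of the original question.

```python
import sqlite3


def create_stats_table(conn):
    """Create a one-row-per-statistic table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE stats (name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO stats (name, value) VALUES ('activated_users', 0)")
    conn.commit()


def increment_activated(conn):
    """Add 1 to the counter in a single UPDATE.

    There is no separate SELECT followed by a write from the application,
    so two callers cannot both read the same old value and overwrite
    each other's increment.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE stats SET value = value + 1 WHERE name = 'activated_users'"
        )


def activated_count(conn):
    """Read the stored counter instead of counting the users table."""
    row = conn.execute(
        "SELECT value FROM stats WHERE name = 'activated_users'"
    ).fetchone()
    return row[0]
```

The application would call `increment_activated` from whatever code path activates a profile, and `activated_count` when rendering the page.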

Alternatively, you could have a job that runs your SQL on a schedule and updates the central 'user count' value.

Alternatively #2, you could use something like MemCache to hold the calculated value for a period of time, and recalculate it when the cache entry expires.
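The expiring-cache idea can be sketched in a few lines. This is an in-process illustration only, not a MemCache client: the `ExpiringValue` class and its parameters are hypothetical names, and `compute` stands for whatever function runs the real COUNT query.

```python
import time


class ExpiringValue:
    """Hold a computed value and recompute it only after it expires."""

    def __init__(self, compute, ttl_seconds, clock=time.monotonic):
        self._compute = compute      # function that runs the expensive query
        self._ttl = ttl_seconds      # how long a value stays fresh
        self._clock = clock          # injectable so expiry can be tested
        self._value = None
        self._expires_at = None      # None means nothing has been cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache miss or expired entry: recompute and restart the timer.
            self._value = self._compute()
            self._expires_at = now + self._ttl
        return self._value
```

With a 60 second lifetime, 5,000 page views in a minute would trigger roughly one query instead of 5,000, at the cost of showing a number that may be up to a minute old.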

#3


0  

There are a few options you could consider:

1) Like you say, maintain a global count that is updated each time a profile is activated, to save hitting the users table on every page view. You could store that count in a "Stats" table and query the value from there.

2) Don't show the actual "live" count; show a count that's "pretty much up to date". For example, cache the count in your application and have the value expire periodically, so that you requery the count less frequently. Or, if you store the count in a "Stats" table as above, you could have a scheduled job that updates the count every hour instead of every time a profile is activated.
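The scheduled-job variant of option 2 could look roughly like the sketch below. It uses Python's `sqlite3` module as a stand-in for SQL Server, and the schema and function names (`stats`, `refresh_activated_count`, and so on) are assumptions for illustration. The scheduler itself (SQL Server Agent, cron, or similar) is left out; it would simply call the refresh function at whatever interval is acceptable.

```python
import sqlite3


def create_schema(conn):
    """Hypothetical minimal schema: the users table plus a Stats table."""
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE stats (name TEXT PRIMARY KEY, value INTEGER NOT NULL);
        """
    )


def refresh_activated_count(conn):
    """What the scheduled job runs: recount once and store the result."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO stats (name, value) "
            "SELECT 'activated_users', COUNT(*) FROM users "
            "WHERE status = 'activated'"
        )


def read_activated_count(conn):
    """What every page view runs: a single-row lookup, possibly stale."""
    row = conn.execute(
        "SELECT value FROM stats WHERE name = 'activated_users'"
    ).fetchone()
    return row[0] if row else 0
```

Between refreshes the stored value lags behind the real count, which is exactly the trade-off described in option 2.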

It depends on whether you want to show the exact figure in real time or can live with a delay. Obviously, data volumes matter too: if you have a large database, a slightly out-of-date cached value could be worthwhile.

#4


0  

From a purely SQL Server standpoint, no, you are not going to find a better way of doing this, unless perhaps your social network is Facebook-sized. Denormalizing your data design (such as keeping a count in a separate table) creates a possible source of data getting out of sync. It doesn't have to get out of sync if it is coded properly, but it can...

Just make sure that you have an index on Status. SQL Server will then scan the index for the count instead of the table. The index is much smaller than the table (that is, more rows fit in a disk page). If you were to convert your status to an int, smallint, or tinyint, you would get even more index entries per page and thus much less IO. To get your description ('activated', etc.), use a reference table. The reference table would be so small that SQL Server would just keep the whole thing in RAM after the first access.
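The index-plus-reference-table layout can be tried out with the sketch below. It runs against SQLite through Python's `sqlite3` module, so the query plan it prints is SQLite's, not SQL Server's; the table, column, and index names (`user_status`, `status_id`, `ix_users_status`) are assumptions for illustration.

```python
import sqlite3


def build(conn):
    """Small integer status codes, a reference table, and an index."""
    conn.executescript(
        """
        CREATE TABLE user_status (
            id INTEGER PRIMARY KEY,
            description TEXT NOT NULL
        );
        INSERT INTO user_status (id, description)
        VALUES (0, 'pending'), (1, 'activated');

        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            status_id INTEGER NOT NULL REFERENCES user_status(id)
        );
        CREATE INDEX ix_users_status ON users (status_id);
        """
    )


def count_by_status(conn, description):
    """Count users by looking the code up in the reference table."""
    row = conn.execute(
        "SELECT COUNT(*) FROM users "
        "WHERE status_id = (SELECT id FROM user_status WHERE description = ?)",
        (description,),
    ).fetchone()
    return row[0]


def query_plan(conn):
    """Return SQLite's plan text for the count, to check the index is used."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM users WHERE status_id = 1"
    ).fetchall()
    return " ".join(r[-1] for r in rows)  # last column holds the plan detail
```

Printing `query_plan(conn)` should mention `ix_users_status`, which indicates the count is answered from the index rather than from the table. On SQL Server the equivalent check would be to look at the actual execution plan.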

Now, if you still think this is too much overhead (and it shouldn't be), you could come up with a hybrid method. You could store your count in a separate table (which SQL Server would keep in RAM if it is just the one record) or, assuming your site is in ASP.NET, you could create an Application variable to keep track of the count. You could increment it in Session_Start and decrement it in Session_End. But you will have to come up with a way of making the increment and decrement thread safe, so two sessions don't try to update the value at the same time.
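The thread-safety concern can be illustrated with a lock around the shared counter. The sketch below is in Python rather than ASP.NET, and the class and method names are hypothetical; `session_started` and `session_ended` play the roles of Session_Start and Session_End. In .NET the analogous tool would be something like `Interlocked.Increment` or a `lock` block.

```python
import threading


class SessionCounter:
    """An application-wide counter that is safe to update from many threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def session_started(self):
        # The lock makes the read-add-write sequence atomic.
        with self._lock:
            self._count += 1

    def session_ended(self):
        with self._lock:
            self._count -= 1

    @property
    def value(self):
        with self._lock:
            return self._count
```

Without the lock, two threads could read the same old value and each write back old + 1, losing one of the updates.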

#5


0  

You can also use a global temporary table. Retrieval will always be fast, even if you are pinging every 30 seconds. Triggers (Example Trigger Link1, Example Trigger Link2) will maintain such activities in this table.
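Since the linked trigger examples are not available here, the following is only a guess at the general shape: triggers on the users table keep a per-status count in a side table. It is written for SQLite via Python's `sqlite3` module, and it uses an ordinary table rather than a SQL Server global temporary table; all table, trigger, and function names are assumptions for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE user_counts (status TEXT PRIMARY KEY, n INTEGER NOT NULL);

-- New user: make sure a row exists for the status, then add 1.
CREATE TRIGGER users_after_insert AFTER INSERT ON users BEGIN
    INSERT OR IGNORE INTO user_counts (status, n) VALUES (NEW.status, 0);
    UPDATE user_counts SET n = n + 1 WHERE status = NEW.status;
END;

-- Deleted user: subtract 1 from the old status.
CREATE TRIGGER users_after_delete AFTER DELETE ON users BEGIN
    UPDATE user_counts SET n = n - 1 WHERE status = OLD.status;
END;

-- Status change (for example pending to activated): move 1 across.
CREATE TRIGGER users_after_update AFTER UPDATE OF status ON users BEGIN
    UPDATE user_counts SET n = n - 1 WHERE status = OLD.status;
    INSERT OR IGNORE INTO user_counts (status, n) VALUES (NEW.status, 0);
    UPDATE user_counts SET n = n + 1 WHERE status = NEW.status;
END;
"""


def create_schema(conn):
    conn.executescript(SCHEMA)


def count_for(conn, status):
    """Read the trigger-maintained count; 0 if the status was never seen."""
    row = conn.execute(
        "SELECT n FROM user_counts WHERE status = ?", (status,)
    ).fetchone()
    return row[0] if row else 0
```

Because the triggers run inside the same transaction as the change to `users`, the counts stay consistent with the table as long as every change goes through it.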
