How to reduce the size of a SQL Server table that grew after a data type change

Time: 2022-03-06 16:58:46

I have a table on SQL Server 2005 that was about 4 GB in size (about 17 million records).

I changed one of the fields from char(30) to char(60). (There are 25 fields in total, most of which are char(10), so the char columns add up to about 300 characters per row.)

This caused the table to double in size (to over 9 GB).

I then changed the char(60) to varchar(60) and ran a function to trim the extra whitespace from the data (reducing the average length of the data in that field to about 15 characters).

This did not reduce the table size. Shrinking the database did not help either.

Short of actually recreating the table structure and copying the data over (that's 17 million records!) is there a less drastic way of getting the size back down again?

5 solutions

#1


16  

Well, it's clear you're not getting any space back! :-)

When you changed your text field to CHAR(60), every value was padded with spaces up to the full width. So ALL the values in that field are now really 60 characters long.

Changing the field to VARCHAR(60) won't help by itself: the stored values are still all 60 characters long, trailing spaces included.
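
The padding is easy to see on a small scale. The following is only a sketch with made-up variable names; it mimics with variables what the answer describes happening to stored values when a CHAR value is carried over into a VARCHAR:

```sql
-- Hypothetical variables; DECLARE and SET are kept separate so this also runs on SQL Server 2005.
DECLARE @fixed CHAR(60);
DECLARE @variable VARCHAR(60);

SET @fixed = 'abc';        -- CHAR pads the value with spaces up to 60 characters
SET @variable = @fixed;    -- the padding is copied along with the data

SELECT DATALENGTH(@fixed)           AS fixed_bytes,     -- 60
       DATALENGTH(@variable)        AS variable_bytes,  -- 60, trailing spaces kept
       DATALENGTH(RTRIM(@variable)) AS trimmed_bytes;   -- 3
```

DATALENGTH is used rather than LEN because LEN ignores trailing spaces and would report 3 in all three cases.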

What you really need to do is run a trim function over the affected fields (RTRIM, since SQL Server 2005 has no TRIM function) to cut them back to their trimmed length, and then shrink the database.
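
A minimal sketch of such a trim, assuming the column has already been changed to VARCHAR; `dbo.YourTable` and `YourColumn` are placeholder names, not taken from the question:

```sql
-- Hypothetical names: dbo.YourTable and YourColumn stand in for the real table and column.
UPDATE dbo.YourTable
SET    YourColumn = RTRIM(YourColumn)
-- '=' and '<>' ignore trailing spaces, so compare byte lengths to find padded rows.
WHERE  DATALENGTH(YourColumn) > DATALENGTH(RTRIM(YourColumn));
```

On 17 million rows a single statement like this runs as one large transaction, so it may be worth splitting it into batches.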

After you've done that, you need to REBUILD your clustered index in order to reclaim some of that wasted space. The clustered index is really where your data lives - you can rebuild it like this:

ALTER INDEX IndexName ON YourTable REBUILD 

By default, your primary key is your clustered index (unless you've specified otherwise).
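
If you are not sure whether the table has a clustered index, or what it is called, the catalog views can tell you. This is only a sketch, and `dbo.YourTable` is a placeholder name:

```sql
-- Hypothetical table name; sys.indexes is available from SQL Server 2005 onwards.
SELECT i.name, i.type_desc, i.is_primary_key
FROM   sys.indexes AS i
WHERE  i.object_id = OBJECT_ID(N'dbo.YourTable')
  AND  i.type IN (0, 1);   -- 0 = heap (no clustered index), 1 = clustered index
```

A row with type_desc = HEAP and a NULL name means there is no clustered index to rebuild.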

Marc

#2


24  

You have not cleaned or compacted any data, even with a "shrink database".

DBCC CLEANTABLE

Reclaims space from dropped variable-length columns in tables or indexed views.

However, if there is a clustered index, a simple index rebuild should also do it:

ALTER INDEX ALL ON dbo.Mytable REBUILD
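
To check whether either step actually frees anything, the table size can be measured before and after. A rough sketch, reusing the table name from the statement above and assuming the script is run in the database that owns the table; the batch size of 5000 is an arbitrary choice:

```sql
-- Hypothetical table name, matching the statement above.
EXEC sp_spaceused N'dbo.Mytable';          -- size before

-- Only relevant if variable-length columns have been dropped from the table.
-- 0 = current database; 5000 = rows processed per transaction.
DBCC CLEANTABLE (0, N'dbo.Mytable', 5000);

ALTER INDEX ALL ON dbo.Mytable REBUILD;    -- rewrites the pages compactly

EXEC sp_spaceused N'dbo.Mytable';          -- size after
```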

A worked example from Tony Rogerson

#3


2  

I know I'm not answering the question as you asked it, but have you considered archiving some of the data to a history table and working with fewer rows?

At first glance you might think that you need all that data all the time, but when you actually sit down and examine it, there are cases where that's not true. Or at least I've experienced that situation before.

#4


0  

I had a similar problem, described in "SQL Server, Converting NTEXT to NVARCHAR(MAX)", after changing a column from ntext to nvarchar(max).

I had to do an UPDATE MyTable SET MyValue = MyValue in order to get it to resize everything nicely.

This obviously takes quite a long time with a lot of records. There were a number of suggestions on how to do it better. The key one was to add a temporary flag indicating whether each row had been done, and then to update a few thousand rows at a time in a loop until it was all done. This meant I had "some" control over how much work it was doing.
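
A sketch of that batched approach, under assumptions: the flag column `Resized` and the batch size of 5000 are invented for illustration, and `MyTable` / `MyValue` are the names used in the UPDATE statement above.

```sql
-- Hypothetical flag column; a nullable column is used so that adding it does not rewrite every row.
ALTER TABLE dbo.MyTable ADD Resized BIT NULL;
GO
DECLARE @rows INT;
SET @rows = 1;

WHILE @rows > 0
BEGIN
    -- Rewrite one batch of rows that has not been processed yet.
    UPDATE TOP (5000) dbo.MyTable
    SET    MyValue = MyValue,
           Resized = 1
    WHERE  Resized IS NULL;

    SET @rows = @@ROWCOUNT;   -- 0 once every row has been flagged
END
GO
ALTER TABLE dbo.MyTable DROP COLUMN Resized;
GO
```

Each batch commits on its own, so the loop can be stopped and restarted without redoing finished rows. Without an index on the flag, every iteration has to search for unflagged rows, which gets slower towards the end.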

On another note, if you really want to shrink the database as much as possible, it can help to switch the recovery model to simple, shrink the transaction log, reorganise the data in the pages, and then set the database back to the full recovery model. Be careful though: shrinking databases is generally not advisable, and if you lower the recovery model of a live database you are asking for something to go wrong.
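
Roughly, that sequence could look like the sketch below. All names are placeholders (`YourDb` for the database, `YourDb_log` for the logical name of its log file), and the 1024 MB target is arbitrary:

```sql
-- Hypothetical names: YourDb is the database, YourDb_log the logical name of its log file.
ALTER DATABASE YourDb SET RECOVERY SIMPLE;
GO
USE YourDb;
GO
DBCC SHRINKFILE (N'YourDb_log', 1024);   -- target size in MB
GO
ALTER DATABASE YourDb SET RECOVERY FULL;
GO
-- Switching to SIMPLE breaks the log backup chain; take a full (or differential)
-- backup afterwards so that log backups can resume.
BACKUP DATABASE YourDb TO DISK = N'<backup path>';
GO
```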

#5


0  

Alternatively, you could do a full table rebuild to ensure there's no extra data hanging around anywhere:

CREATE TABLE tmp_table(<column definitions>);
GO
INSERT INTO tmp_table(<columns>) SELECT <columns> FROM <table>;
GO
DROP TABLE <table>;
GO
EXEC sp_rename N'tmp_table', N'<table>';
GO

Of course, things get more complicated with identity columns, indexes, etc.
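
For the identity case, one possible shape of the copy step is sketched below. The schema is invented for illustration (an `Id` identity column and a single `Name` column); the real column list would come from the existing table:

```sql
-- Hypothetical schema: Id is an IDENTITY column and Name an ordinary column.
CREATE TABLE dbo.tmp_table
(
    Id   INT IDENTITY(1, 1) NOT NULL,
    Name VARCHAR(60) NULL,
    CONSTRAINT PK_tmp_table PRIMARY KEY CLUSTERED (Id)
);
GO
SET IDENTITY_INSERT dbo.tmp_table ON;    -- allow explicit values for Id

INSERT INTO dbo.tmp_table (Id, Name)
SELECT Id, RTRIM(Name)                   -- trim while copying
FROM   dbo.MyTable;

SET IDENTITY_INSERT dbo.tmp_table OFF;
GO
```

Nonclustered indexes, foreign keys and permissions on the old table are not carried over by this and would have to be recreated on the new table before or after the rename.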
