How can I speed up SVN updates?

We have a rather large SVN repository. SVN updates are taking longer and longer as we add more code. We added svn:externals for folders that were repeated across projects, such as FCKeditor on various websites. This helped, but not by much.

What is the best way to reduce update time and boost SVN speed?

12 solutions

#1

If it's an older SVN repository (or even a fairly new one that wasn't set up optimally), it may be using the older BDB (Berkeley DB) repository backend. http://svn.apache.org/repos/asf/subversion/trunk/notes/fsfs has notes on the newer FSFS backend. Changing from one to the other isn't too hard: dump the entire history, re-initialise the repository with the new filesystem format, and re-import the dump. It may also be useful at the same time to filter the dump to remove entire check-ins of useless information (I, for example, have removed 20MB+ tarball files that someone had checked in).
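The dump, filter, and reload steps described above could be sketched roughly as follows. This is only a dry-run planner that prints the command lines rather than running them; the repository paths, dump file name, and excluded path prefix are hypothetical placeholders, and it assumes the standard `svnadmin` and `svndumpfilter` tools are on the PATH of whoever runs the printed commands.

```python
import shlex


def migration_plan(old_repo, new_repo, dump_file, exclude_prefixes=()):
    """Return shell command lines for a dump/reload into a fresh FSFS repository.

    All arguments are placeholders supplied by the caller; nothing is executed here.
    """
    q = shlex.quote
    # Step 1: dump the entire history of the old repository.
    plan = [f"svnadmin dump {q(old_repo)} > {q(dump_file)}"]
    load_from = dump_file
    if exclude_prefixes:
        # Optional step: drop unwanted paths (e.g. large tarballs) from the dump.
        filtered = dump_file + ".filtered"
        prefixes = " ".join(q(p) for p in exclude_prefixes)
        plan.append(
            f"svndumpfilter exclude {prefixes} < {q(dump_file)} > {q(filtered)}"
        )
        load_from = filtered
    # Step 2: create a new, empty repository using the FSFS backend.
    plan.append(f"svnadmin create --fs-type fsfs {q(new_repo)}")
    # Step 3: load the (possibly filtered) history into the new repository.
    plan.append(f"svnadmin load {q(new_repo)} < {q(load_from)}")
    return plan
```

For example, `print("\n".join(migration_plan("/srv/svn/old", "/srv/svn/new", "/tmp/repo.dump", ["trunk/tarballs"])))` lists the four commands in order. It is probably wise to try the procedure on a copy of the repository first, since filtering can fail when later revisions copy from an excluded path.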

As far as general speed goes, a fast hard drive and extra memory for OS-level caching are hard to fault as ways to speed up SVN.

On the client side, if you have TortoiseSVN set up through PuTTY's agent (Pageant) for SSH access to an external repository machine, you can also enable SSH compression, which can help.

Edit: SVN v1.5 also has the fsfs-reshard.py tool, which can help split an FSFS-based SVN repository into a number of directories, which can themselves be linked onto different drive spindles. If you have thousands of revisions, that can also help, if for no other reason than that finding one file among thousands takes time (you can tell whether that's a problem by looking at the IOwait times).

#2

Disable virus checking on folders that contain working copy code. This caused my updates to become twice as fast.

#3

Not really an answer, but it may be interesting to know that one of the reasons svn is so I/O-heavy is that it stores one extra copy of each file in the .svn/text-base directory. This makes local diff operations fast, but eats lots of hard disk space and I/O.

http://subversion.tigris.org/issues/show_bug.cgi?id=525 has the details.
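To get a feel for how much of a working copy is taken up by those extra copies, something like the following sketch could be used. It simply adds up file sizes inside and outside any `.svn` directory; the function name is made up for illustration, and it counts everything under `.svn`, not only text-base.

```python
import os


def working_copy_usage(root):
    """Return (bytes under .svn directories, bytes in ordinary working files)."""
    meta = 0
    work = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # A directory counts as metadata if any path component is ".svn".
        parts = os.path.relpath(dirpath, root).split(os.sep)
        in_meta = ".svn" in parts
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so sizes are not double-counted
            size = os.path.getsize(path)
            if in_meta:
                meta += size
            else:
                work += size
    return meta, work
```

Calling `working_copy_usage("/path/to/checkout")` on a real checkout (the path is a placeholder) gives a rough idea of how much extra data each update has to write.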

#4

Sounds like you've got multiple projects in one repository. Splitting them up where appropriate will give you a big boost.

Supposedly Git is much faster than Subversion due to the way it stores/processes changes, but I have no first-hand experience with it.

#5

Make sure your connection to the server is as fast as can be (gigabit ethernet). Make sure the server has fast disks in an array. And, of course, only check out what you need.

#6

There are some common performance tweaks. SVN is very I/O heavy, so faster hard disks are an option (on both ends). Add more memory to your server. Make sure your clients have a defragmented hard disk (for Windows).

What access method you use also matters. Repositories stored on remote filesystems (accessed via file:///) are going to be much slower than either svnserve or Apache with mod_dav_svn. Consider using one of the latter two if you have the repository on a simple file share.
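If you are not sure which access method a working copy uses, the `URL:` line printed by `svn info` tells you. A small sketch for pulling the scheme out of that output follows; the function name is made up, and the text passed in is assumed to be the plain output of `svn info` run inside the working copy.

```python
from urllib.parse import urlsplit


def access_scheme(svn_info_text):
    """Return the URL scheme (e.g. 'file', 'svn', 'http') from `svn info` output.

    Returns None if no line starting with 'URL:' is found.
    """
    for line in svn_info_text.splitlines():
        # Match only the plain "URL:" field, not "Relative URL:".
        if line.startswith("URL:"):
            url = line[len("URL:"):].strip()
            return urlsplit(url).scheme or None
    return None
```

A result of `file` for a repository that lives on a network share is the slow case described above; `svn`, `svn+ssh`, `http`, or `https` means a server process is doing the work.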

#7

By default, TortoiseSVN watches for file changes in the background, and I have seen that slow down my machine. I changed the config to exclude everything and then include only the directories where I have checkouts. You can also turn off the background checks. Both of these settings are in the Icon Overlays settings node.

#8

Sometimes slow svn operation, especially with many externals, is DNS-related. It looks like svn performs a DNS lookup for every svn:external, even relative ones. Adding your svn server hostname to /etc/hosts or fixing resolv.conf can be useful.
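One quick way to check whether name resolution is the bottleneck is to time a few lookups of the server's hostname, as in this rough sketch. The function name is made up, and the hostname you pass in is whatever your repository URL uses.

```python
import socket
import time


def time_dns_lookup(hostname, attempts=3):
    """Return the wall-clock seconds taken by each lookup of hostname.

    Raises socket.gaierror if the name cannot be resolved at all.
    """
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        socket.getaddrinfo(hostname, None)
        timings.append(time.perf_counter() - start)
    return timings
```

If `time_dns_lookup("svn.example.com")` (a placeholder hostname) consistently reports a large fraction of a second or more per lookup, an /etc/hosts entry for the server is likely worth trying. Note that later attempts may be answered from a local resolver cache, so the first timing is often the most telling.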

#9

I've found in my own experience (i.e. not through any actual tests) that using externals seems to slow things down, especially if the SVN repo server is remote. If you've got duplicated code (like your FCKeditor) in multiple places, I would tend to stick with externals, since keeping those files synchronised and manageable is more important than update speed. That said, you could look at using symbolic links to bring in the duplicated code instead. (If you're using Windows XP, you can use junction points.)

#10

We've split our code base into several sibling modules and written the Ant scripts so that one developer can work on one module at a time without bothering too much about what's happening in the other modules.

* A top-level build script triggers all modules' build scripts.
* External libraries are not stored in Subversion but pulled from a network drive using Apache Ivy (think of it as an in-house Maven repository).
* Dependencies between modules are also managed using Ivy.

Typically, developers will need to update their entire tree a couple times a week but it can easily be done before going to lunch/coffee break.

#11

Using read-access rights (i.e. restricting read access to certain persons/groups) will slow down the repository a lot, especially when the authentication is done in some special way, e.g. against a Windows domain. The same holds true for write-access rights, of course, but writing is less frequent than reading, and restricting write access can be more important than restricting read access.

#12

If you have many folders in the root of the repository and your local copy mirrors the repository, try splitting the monolithic working copy into many separate checkouts, one per folder, and update those folders separately. It will be much faster than one big folder.
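As a rough sketch of the "update each folder separately" idea, the helper below looks for immediate subfolders that are their own working copies and builds one `svn update` command per folder. It assumes each folder was checked out separately and so has its own `.svn` directory; the function name and the parent directory are placeholders.

```python
import os


def update_commands(parent_dir):
    """Return one ['svn', 'update', path] command per checkout under parent_dir."""
    commands = []
    for name in sorted(os.listdir(parent_dir)):
        path = os.path.join(parent_dir, name)
        # Treat a folder as a separate working copy only if it has its own .svn.
        if os.path.isdir(os.path.join(path, ".svn")):
            commands.append(["svn", "update", path])
    return commands
```

Each command can then be run with `subprocess.run(cmd, check=True)`, either for all folders in turn or just for the one you are about to work on.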
