How can I put a database under git (version control)?

I'm building a web app, and I need to make a branch for some major changes. These changes require changes to the database schema, so I'd like to put the entire database under git as well.

How do I do that? Is there a specific folder that I can keep under a git repository? How do I know which one? How can I be sure that I'm putting the right folder there?

I need to be sure, because these changes are not backward compatible; I can't afford to screw up.

The database in my case is PostgreSQL.

Edit:

Someone suggested taking backups and putting the backup file under version control instead of the database. To be honest, I find that really hard to swallow.

~~There has to be a better way.~~

Update:

OK, so there's no better way, but I'm still not quite convinced, so I will change the question a bit:

I'd like to put the entire database under version control. What database engine can I use so that I can put the actual database under version control instead of its dump?

Would sqlite be git-friendly?

Since this is only the development environment, I can choose whatever database I want.

Edit2:

What I really want is not to track my development history, but to be able to switch from my "new radical changes" branch to the "current stable branch" and be able, for instance, to fix some bugs/issues with the current stable branch, such that when I switch branches, the database auto-magically becomes compatible with the branch I'm currently on. I don't really care much about the actual data.

23 solutions

#1

115

Take a database dump, and version control that instead. This way it is a flat text file.

Personally I suggest that you keep both a data dump and a schema dump. This way, using diff, it becomes fairly easy to see what changed in the schema from revision to revision.

If you are making big changes, you should have a secondary database that you make the new schema changes to, and not touch the old one, since, as you said, you are making a branch.
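A minimal sketch of keeping the schema dump and the data dump as separate text, so that a schema diff stays small. It assumes SQLite and Python's `sqlite3` module purely for illustration (the question uses PostgreSQL, where `pg_dump` plays this role); the `users` table and the function name are hypothetical.

```python
import sqlite3


def dump_schema_and_data(conn):
    """Split a text dump into schema statements and data statements."""
    schema_lines, data_lines = [], []
    for statement in conn.iterdump():
        # iterdump() yields one SQL statement per item; rows come out as INSERTs.
        if statement.startswith("INSERT INTO"):
            data_lines.append(statement)
        elif statement.startswith(("CREATE", "ALTER")):
            schema_lines.append(statement)
    return "\n".join(schema_lines) + "\n", "\n".join(data_lines) + "\n"


# Demo on a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")
conn.commit()

schema_sql, data_sql = dump_schema_and_data(conn)
```

Writing `schema_sql` and `data_sql` to two separate files and committing both would give one file whose diffs show only schema changes and another that carries the rows.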

#2

Check out Refactoring Databases (http://databaserefactoring.com/) for a bunch of good techniques for maintaining your database in tandem with code changes.

Suffice it to say that you're asking the wrong questions. Instead of putting your database into git, you should be decomposing your changes into small verifiable steps so that you can migrate/rollback schema changes with ease.

If you want to have full recoverability, you should consider archiving your postgres WAL logs and using PITR (point-in-time recovery) to play transactions back/forward to specific known good states.

#3

I'm starting to think of a really simple solution, don't know why I didn't think of it before!!

Duplicate the database (both the schema and the data).
In the branch for the new-major-changes, simply change the project configuration to use the new duplicate database.

This way I can switch branches without worrying about database schema changes.

EDIT:

By duplicate, I mean create another database with a different name (like my_db_2); not doing a dump or anything like that.
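The idea above can be sketched as a small piece of project configuration that picks the database from the checked-out branch. This is only an illustration: the branch names, the `BRANCH_DATABASES` mapping and both function names are assumptions, and `my_db` is assumed to be the name of the original database.

```python
import subprocess

# Hypothetical mapping from git branch name to database name.
BRANCH_DATABASES = {
    "master": "my_db",
    "new-major-changes": "my_db_2",
}


def current_branch():
    """Ask git for the name of the checked-out branch."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def database_for_branch(branch, default="my_db"):
    """Pick the database for a branch, falling back to the stable one."""
    return BRANCH_DATABASES.get(branch, default)
```

In a settings module this would be used as `database_for_branch(current_branch())`. On PostgreSQL the duplicate itself can be created server-side with `CREATE DATABASE my_db_2 TEMPLATE my_db;`, which needs the source database to have no other active connections.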

#4

Instead of manually dumping your DB and saving it into git, use Offscale DataGrove.

DataGrove is basically DB version control: it tracks changes to the entire DB (schema AND data) and allows you to tag versions into its repository. You can use it alongside git and have it tag a version each time you check in code, and load the right DB state whenever you pull code.

Specifically regarding "Edit 2": with DataGrove you can simply have two branches of the DB, one for each of your code branches. When you load a certain branch of the code, DataGrove will automagically re-create the entire DB state, with all the data inside for that version/branch. This means you can switch between development branches with a single, simple command.

#5

Use something like Liquibase; it lets you keep your Liquibase files under revision control. You can tag changes for production only, and have Liquibase keep your DB up to date for either production or development (or whatever scheme you want).

#6

There is a tool under heavy development called Klonio, whose beta release is available for use. It supports MongoDB and MySQL as of now.

Of course, it has git integration, and you can snapshot either your schema alone or the data as well.

#7

There is a great project called Migrations under Doctrine that was built just for this purpose.

It's still in alpha state and built for PHP.

http://docs.doctrine-project.org/projects/doctrine-migrations/en/latest/index.html

#8

Take a look at RedGate SQL Source Control.

http://www.red-gate.com/products/sql-development/sql-source-control/

This tool is a SQL Server Management Studio snap-in which will allow you to place your database under source control with Git.

It's a bit pricey at $495 per user, but there is a 28-day free trial available.

NOTE: I am not affiliated with RedGate in any way whatsoever.

#9

I've come across this question because I have a similar problem: something approximating a DB-based directory structure stores 'files', and I need git to manage it. It's distributed across a cloud using replication, hence its access point will be via MySQL.

The gist of the above answers seems to be to suggest an alternative solution to the problem asked, which rather misses the point of using git to manage something in a database, so I'll attempt to answer that question.

Git is a system which, in essence, stores a database of deltas (differences) that can be reassembled, in order, to reproduce a context. The normal usage of git assumes that the context is a filesystem and that those deltas are diffs in that filesystem, but really all git is, is a hierarchical database of deltas (hierarchical, because in most cases each delta is a commit with at least one parent, arranged in a tree).

As long as you can generate a delta, in theory, git can store it. The problem is that git normally expects the context on which it's generating deltas to be a filesystem, and similarly, when you check out a point in the git hierarchy, it expects to generate a filesystem.

If you want to manage change in a database, you have two discrete problems, and I would address them separately (if I were you). The first is schema, the second is data (although in your question, you state data isn't something you're concerned about). A problem I had in the past was a Dev and Prod database, where Dev could take incremental changes to the schema, and those changes had to be documented in CVS and propagated to live, along with additions to one of several 'static' tables. We did that by having a third database, called Cruise, which contained only the static data. At any point the schemas from Dev and Cruise could be compared, and we had a script to take the diff of those two files and produce an SQL file containing ALTER statements to apply it. Similarly, any new data could be distilled to an SQL file containing INSERT commands. As long as fields and tables are only added, and never deleted, the process could automate generating the SQL statements to apply the delta.
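The additive-only diff described above can be sketched roughly as follows. This is not the original script; it assumes each schema has already been read into a plain dictionary of table name to column definitions, and the table and column names are made up for the example.

```python
def additive_schema_diff(old, new):
    """Return SQL that adds tables/columns present in `new` but not in `old`.

    Both arguments map table name -> {column name: column type}.
    Removed or changed columns are deliberately ignored, matching the
    "only added, never deleted" restriction.
    """
    statements = []
    for table, columns in new.items():
        if table not in old:
            cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
            statements.append(f"CREATE TABLE {table} ({cols});")
            continue
        for name, ctype in columns.items():
            if name not in old[table]:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {ctype};")
    return statements


# Hypothetical schemas: `live` is the older one, `dev` has additions.
live = {"users": {"id": "INTEGER", "name": "TEXT"}}
dev = {
    "users": {"id": "INTEGER", "name": "TEXT", "email": "TEXT"},
    "comments": {"id": "INTEGER", "body": "TEXT"},
}
statements = additive_schema_diff(live, dev)
```

Because deletions and type changes are skipped, the generated statements are always safe to append to an SQL file, but anything beyond additions still needs a hand-written script.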

The mechanism by which git generates deltas is diff, and the mechanism by which it combines one or more deltas with a file is called merge. If you can come up with a method for diffing and merging from a different context, git should work, but as has been discussed, you may prefer a tool that does that for you. My first thought towards solving that is https://git-scm.com/book/en/v2/Customizing-Git-Git-Configuration#External-Merge-and-Diff-Tools which details how to replace git's internal diff and merge tool. I'll update this answer as I come up with a better solution to the problem, but in my case I expect to only have to manage data changes, insofar as a DB-based filestore may change, so my solution may not be exactly what you need.

#10

You can't do it without atomicity, and you can't get atomicity without using either pg_dump or a snapshotting filesystem.

My postgres instance is on ZFS, which I snapshot occasionally. It's approximately instant and consistent.

#11

What you want, in spirit, is perhaps something like Post Facto, which stores versions of a database in a database. Check this presentation.

The project apparently never really went anywhere, so it probably won't help you immediately, but it's an interesting concept. I fear that doing this properly would be very difficult, because even version 1 would have to get all the details right in order for people to trust their work to it.

#12

I've released a tool for sqlite that does what you're asking for. It uses a custom diff driver leveraging the sqlite project's tool 'sqldiff', UUIDs as primary keys, and leaves off the sqlite rowid. It is still in alpha, so feedback is appreciated.

Postgres and mysql are trickier, as the binary data is kept in multiple files and may not even be valid if you were able to snapshot it.

https://github.com/cannadayr/git-sqlite

#13

I want to do something similar: add my database changes to my version control system.

I am going to follow the ideas in this post from Vladimir Khorikov, "Database versioning best practices". In summary, I will:

store both the schema and the reference data in a source control system;
create a separate SQL script with the changes for every modification.

In case it helps!

#14

I think X-Istence is on the right track, but there are a few more improvements you can make to this strategy. First, use:

$ pg_dump --schema ...

to dump the tables, sequences, etc. and place this file under version control. You'll use this to separate the compatibility changes between your branches.

Next, perform a data dump for the set of tables that contain configuration required for your application to operate (you should probably skip user data, etc.), like form defaults and other non-user-modifiable data. You can do this selectively by using:

$ pg_dump --table=.. <or> --exclude-table=..

This is a good idea because the repo can get really clunky when your database gets to 100MB+ when doing a full data dump. A better idea is to back up a more minimal set of data that you require to test your app. If your default data is very large, though, this may still cause problems.
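A rough sketch of how the two dumps could be scripted. It only builds the `pg_dump` argument lists; the database name, output file names and table names are placeholders, not taken from the answer. It uses `--schema-only`, which is the `pg_dump` option for dumping object definitions without rows (a bare `--schema` selects a named schema/namespace instead), together with `--data-only`, `--table` and `--file`.

```python
import subprocess


def schema_dump_command(dbname, outfile):
    """pg_dump invocation that writes object definitions only (no rows)."""
    return ["pg_dump", "--schema-only", f"--file={outfile}", dbname]


def config_data_dump_command(dbname, outfile, tables):
    """pg_dump invocation that writes rows for the listed tables only."""
    command = ["pg_dump", "--data-only", f"--file={outfile}"]
    command += [f"--table={table}" for table in tables]
    return command + [dbname]


def run(command):
    """Run a dump command; raises CalledProcessError if pg_dump fails."""
    subprocess.run(command, check=True)
```

Calling `run(schema_dump_command("my_db", "schema.sql"))` followed by `run(config_data_dump_command("my_db", "config_data.sql", ["form_defaults"]))` would leave two plain-text files to commit, assuming `pg_dump` is on the PATH and can connect to the database.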

If you absolutely need to place full backups in the repo, consider doing it in a branch outside of your source tree. An external backup system with some reference to the matching svn rev is likely best for this, though.

Also, I suggest using text format dumps over binary for revision purposes (for the schema at least), since these are easier to diff. You can always compress these to save space prior to checking in.

Finally, have a look at the postgres backup documentation if you haven't already. The way you're commenting on backing up 'the database' rather than a dump makes me wonder if you're thinking of file system based backups (see section 23.2 for caveats).

#15

This question is pretty much answered, but I would like to complement X-Istence's and Dana the Sane's answers with a small suggestion.

If you need revision control with some degree of granularity, say daily, you could couple the text dump of both the tables and the schema with a tool like rdiff-backup, which does incremental backups. The advantage is that instead of storing snapshots of daily backups, you simply store the differences from the previous day.

With this you have the advantage of revision control without wasting too much space.
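To make the "store only the differences" idea concrete, here is a small illustration using Python's `difflib`. It is not how rdiff-backup stores its increments; it only shows that the delta between two consecutive text dumps is much smaller than a second full copy. The dump contents and file labels are invented.

```python
import difflib


def dump_delta(previous_dump, current_dump):
    """Return a unified diff between two text dumps."""
    return "".join(difflib.unified_diff(
        previous_dump.splitlines(keepends=True),
        current_dump.splitlines(keepends=True),
        fromfile="yesterday.sql",
        tofile="today.sql",
    ))


# Two hypothetical daily dumps that differ by one inserted row.
yesterday = "CREATE TABLE users (id INTEGER);\nINSERT INTO users VALUES (1);\n"
today = yesterday + "INSERT INTO users VALUES (2);\n"
delta = dump_delta(yesterday, today)
```

Here `delta` records only the one added INSERT line plus a little context, which is the space saving the answer is after.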

In any case, using git directly on big flat files which change very frequently is not a good solution. If your database becomes too big, git will start to have problems managing the files.

#16

I would recommend neXtep for version controlling the database. It has a good set of documentation and forums that explain how to install it and the errors you may encounter. I have tested it with PostgreSQL 9.1 and 9.3; I was able to get it working for 9.1, but for 9.3 it doesn't seem to work.

#17

What I do in my personal projects is store my whole database in Dropbox and then point my MAMP/WAMP workflow to use it right from there. That way the database is always up to date wherever I need to do some developing. But that's just for dev! Live sites use their own server for that, of course! :)

#18

Storing each level of database changes under git version control is like pushing your entire database with each commit and restoring your entire database with each pull. If your database is so prone to crucial changes that you cannot afford to lose them, you can just update your pre-commit and post-merge hooks. I did the same with one of my projects, and you can find the directions here.

#19

That's how I do it:

Since you have free choice of DB type, use a file-based DB such as Firebird.

Create a template DB which has the schema that fits your actual branch, and store it in your repository.

When executing your application, programmatically create a copy of your template DB, store it somewhere else, and just work with that copy.

This way you can put your DB schema under version control without the data. And if you change your schema, you just have to change the template DB.
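A minimal sketch of the template-copy step. It swaps in SQLite for Firebird so that it runs with only the Python standard library; the file names, the `users` table and the helper name are assumptions made for the example.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def working_copy_from_template(template, workdir):
    """Copy the versioned template database to an untracked working file."""
    workdir.mkdir(parents=True, exist_ok=True)
    target = workdir / "working.db"
    shutil.copyfile(template, target)
    return target


# Demo: build a tiny template (schema only), then work only on the copy.
root = Path(tempfile.mkdtemp())
template = root / "template.db"
conn = sqlite3.connect(template)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.commit()
conn.close()

working = working_copy_from_template(template, root / "run")
conn = sqlite3.connect(working)
conn.execute("INSERT INTO users (name) VALUES ('alice')")
conn.commit()
conn.close()
```

Only `template.db` would be committed; the working copy's directory would go into .gitignore, so switching branches swaps the template and the next start of the application picks up the matching schema.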

#20

We used to run a social website on a standard LAMP configuration. We had a Live server, a Test server, and a Development server, as well as the local developers' machines. All were managed using Git.

On each machine, we had the PHP files, but also the MySQL service, and a folder with images that users would upload. The Live server grew to have some 100K (!) recurrent users, the dump was about 2GB (!), and the image folder was some 50GB (!). By the time I left, our server was reaching the limit of its CPU, RAM, and most of all, the concurrent network connection limits (we even compiled our own version of the network card driver to max out the server, lol). We could not (nor should you assume you can with your website) put 2GB of data and 50GB of images in Git.

To manage all this under Git easily, we would ignore the binary folders (the folders containing the images) by inserting these folder paths into .gitignore. We also had a folder called SQL outside the Apache document root. In that SQL folder, we would put the developers' SQL files with incremental numbering (001.florianm.sql, 001.johns.sql, 002.florianm.sql, etc.). These SQL files were managed by Git as well. The first SQL file would indeed contain a large set of DB schema. We didn't add user data to Git (e.g. the records of the users table or the comments table), but data like configs, topology, or other site-specific data was maintained in the SQL files (and hence by Git). Mostly it's the developers (who know the code best) who determine what is and what is not maintained by Git with regard to SQL schema and data.

When it came to a release, the administrator logged in to the dev server, merged the live branch with all developers' and other needed branches on the dev machine into an update branch, and pushed it to the test server. On the test server, he checked whether the updating process for the Live server was still valid and, in quick succession, pointed all traffic in Apache to a placeholder site, created a DB dump, pointed the working directory from 'live' to 'update', executed all new SQL files in MySQL, and pointed the traffic back to the correct site. When all stakeholders agreed after reviewing the test server, the administrator did the same thing from the Test server to the Live server. Afterwards, he merged the live branch on the production server into the master branch across all servers, and rebased all live branches. The developers were themselves responsible for rebasing their branches, but they generally know what they are doing.

If there were problems on the test server, e.g. the merges had too many conflicts, then the code was reverted (pointing the working branch back to 'live') and the SQL files were never executed. Once the SQL files were executed, this was considered a non-reversible action at the time. If the SQL files were not working properly, then the DB was restored using the dump (and the developers told off for providing ill-tested SQL files).

Today, we maintain both a sql-up and a sql-down folder, with equivalent filenames, where the developers have to test that the upgrading SQL files can equally be downgraded. This could ultimately be executed with a bash script, but it's a good idea for human eyes to keep monitoring the upgrade process.
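The up/down check can be sketched like this: apply the upgrade script, apply its downgrade counterpart, and confirm the set of tables is back where it started. The sketch uses SQLite in memory rather than MySQL so that it is self-contained, and the two SQL strings stand in for a hypothetical pair of files with the same name in sql-up and sql-down.

```python
import sqlite3

# Hypothetical migration pair; stand-ins for same-named files in
# the sql-up and sql-down folders.
UP_SQL = "CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT);"
DOWN_SQL = "DROP TABLE comments;"


def table_names(conn):
    """Return the set of tables currently in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {name for (name,) in rows}


def round_trips(conn, up_sql, down_sql):
    """Apply up then down and report whether the table set is unchanged."""
    before = table_names(conn)
    conn.executescript(up_sql)
    conn.executescript(down_sql)
    return table_names(conn) == before


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
ok = round_trips(conn, UP_SQL, DOWN_SQL)
```

This only compares table names; a stricter check would also compare column definitions and any reference data the scripts touch.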

It's not great, but it's manageable. Hope this gives an insight into a real-life, practical, relatively high-availability site. It may be a bit outdated, but it is still followed.

#21

Use a tool like iBatis Migrations (manual, short tutorial video), which allows you to version control the changes you make to a database throughout the lifecycle of a project, rather than the database itself.

This allows you to selectively apply individual changes to different environments, keep a changelog of which changes are in which environments, create scripts to apply changes A through N, roll back changes, etc.

#22

I'd like to put the entire database under version control, what database engine can I use so that I can put the actual database under version control instead of its dump?

This is not database engine dependent. For Microsoft SQL Server there are lots of version control programs. I don't think that problem can be solved with git; you have to use a pgsql-specific schema version control system. I don't know whether such a thing exists or not...

#23

Here is what I am trying to do in my projects:

Separate data, schema, and default data.

The database configuration is stored in a configuration file that is not under version control (.gitignore).

The database defaults (for setting up new projects) are a simple SQL file under version control.

For the database schema, create a database schema dump under version control.

The most common way is to have update scripts that contain SQL statements (ALTER TABLE .. or UPDATE). You also need a place in your database where you save the current version of your schema.

Take a look at other big open source database projects (Piwik, or your favorite CMS); they all use update scripts (1.sql, 2.sql, 3.sh, 4.php, 5.sql).

But this is a very time-intensive job: you have to create and test the update scripts, and you need to run a common update script that compares the version and runs all necessary update scripts.
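A minimal sketch of such a common update script, assuming SQLite and Python's `sqlite3` module for the sake of a runnable example. The `schema_version` table name, the `MIGRATIONS` dictionary and the SQL inside it are all hypothetical; in a real project each entry would be one of the numbered script files.

```python
import sqlite3

# Hypothetical update scripts, keyed by the schema version they produce.
MIGRATIONS = {
    1: "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);",
    2: "ALTER TABLE users ADD COLUMN email TEXT;",
    3: "CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT);",
}


def current_version(conn):
    """Read the stored schema version, creating the bookkeeping table if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0  # MAX() over an empty table is NULL, i.e. version 0


def upgrade(conn):
    """Apply, in order, every migration newer than the stored version."""
    applied = []
    start = current_version(conn)
    for version in sorted(MIGRATIONS):
        if version <= start:
            continue
        conn.executescript(MIGRATIONS[version])
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied


conn = sqlite3.connect(":memory:")
first_run = upgrade(conn)
second_run = upgrade(conn)
```

Running `upgrade` after every branch switch brings an older database forward to what the code expects; going back to an older branch still needs either downgrade scripts or a separate database per branch.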

So theoretically (and that's what I am looking for) you could dump the database schema after each change (manually, cron job, git hooks (maybe before commit)), and only in some very special cases create update scripts.

After that, in your common update script, run the normal update scripts (for the special cases), then compare the schemas (the dump and the current database), and then automatically generate the necessary ALTER statements. There are some tools that can do this already, but I haven't found a good one yet.
