In a corporate development environment that writes mostly administrative software, should every developer use their own database instance, or should everyone share a central database instance during development? What are the advantages and disadvantages of each approach? What about other environments and other products?
19 answers
#1
34
If you all share the same database, you can run into problems when someone makes a structural change to the database and the code is not "synchronized" with it.
I highly recommend one DB per developer, if only because you don't want to run a "write" test and have someone else overwrite your data right afterwards. A simple example: you are working on displaying products on a website. Everything works until all the products disappear. The problem? Another developer decided to play with the product's "Active" flag to test something else. In a case like that, a transaction might not even help. End of story: you spend time debugging someone else's actions.
I also highly recommend replicating the staging database to the developer databases once in a while to synchronize the structure (or, better, having a tool that rebuilds a database from scratch).
Of course, we require scripts for every change to the database, and EVERYTHING is in source control.
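To make the "rebuild a database from scratch" idea concrete, here is a minimal sketch. It is not the poster's actual tool: it assumes SQLite (via Python's standard `sqlite3` module) as a stand-in for the real DBMS, and the script names, the `product` table and the `rebuild_database` helper are all hypothetical.

```python
import os
import sqlite3

# Hypothetical, ordered schema scripts. In practice these would be files
# checked out from source control rather than inline strings.
SCHEMA_SCRIPTS = [
    ("001_create_product.sql",
     "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);"),
    ("002_add_active_flag.sql",
     "ALTER TABLE product ADD COLUMN active INTEGER NOT NULL DEFAULT 1;"),
]


def rebuild_database(path=":memory:"):
    """Throw away any existing database at `path` and replay every script."""
    if path != ":memory:" and os.path.exists(path):
        os.remove(path)  # start from nothing, not from whatever was left behind
    conn = sqlite3.connect(path)
    for _name, sql in SCHEMA_SCRIPTS:
        conn.executescript(sql)  # a script may contain several statements
    conn.commit()
    return conn


conn = rebuild_database()
columns = [row[1] for row in conn.execute("PRAGMA table_info(product)")]
print(columns)
```

Because the database is always built from the scripts, a developer who pulls the latest scripts and rebuilds ends up with the same structure as everyone else.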
#2
15
The days when database environments had to be scarce are long gone. I'm writing this post on an XW9300 with 5 x 15k SCSI disks in it. This machine will run a substantial ETL job in a fairly reasonable length of time and (in mid-2007) cost me about £1,700 on eBay, including the disks. From a developer's perspective, especially on database-centric projects like data warehousing, the line between a developer and a DBA is quite blurred. As I write this I am building a partition management framework for a SQL Server 2005 data warehouse.
Developers should have one or more development databases of their own, for (IMO) these reasons:
- It requires people to keep stored procedures, patch scripts and schema definition files in source control. Applying the patches can be automated to a fairly large extent. There are even tools, such as Redgate SQL Compare Pro, that do much of the grunt work for this.
- It encourages an application architecture that facilitates easy configuration management and deployment, as people have to deploy onto their own workstations. Many deployment wrinkles will get sorted out long before they hit production, or before people even realise they could have gone wrong.
- It avoids developers tripping over each other's work. On something like a data warehouse, where people are working with ETL code, this is an even bigger win.
- It encourages a degree of responsibility, as developers have to learn basic database administration. This also eliminates a lot of the need for operational support staff and some of the dev-vs-ops friction.
- If you have your own database, there are no gatekeepers obstructing experimentation or other work on it. The politics around managing 'servers' disappear as there are no 'servers'. This is a productivity win in any environment with significant incumbent bureaucracy.
- For small data volumes an ordinary PC is fast enough for this. Developer editions or licensing are available for most if not all database management systems and will run on a desktop OS. If you're working with Linux or Unix this is even less of an issue. For larger data volumes, up to and including most MIS applications, a workstation like an HP XW9400 or Lenovo D10 can be outfitted with five 15k disks for less than the cost of a lot of professional development tooling. (Yes, I know it's dual licence, but a commercial all-platform licence for Qt is about £4000 a seat.) A machine like this will run an ETL process over tens to hundreds of millions of rows faster than you might think.
- It facilitates setting up more than one environment for smoke testing or reconciliation purposes. As you have complete control over the machine, you have quite a lot of scope for mocking up the conditions of a production environment. For example, I once made a simple emulator for Control-M by just bodging some of its runtime scripts. Where you have this level of control and transparency over the environment, you can produce a fairly robustly tested deployment process, which does quite a lot to eliminate opportunities for finger-pointing in production deployment.
I've seen small teams working with 14 environments, and I have had 7 active on a workstation at the same time. On database-heavy work such as ETL, where you're working with whole tables, working in a single dev environment is a recipe for wasted time or for spending your days walking on eggshells.
Also, you can use single-user development licences for database platforms, which can save you the cost of the workstations in database licensing alone. Most developer licences (Microsoft and OTN are a couple of examples I'm familiar with) will let you use the system on a single workstation for a single developer, free or for a nominal price.
Conversely, licensing terms on shared development servers are often somewhat murky, and I've seen vendors try to shake customers down for licensing on dev servers on more than one occasion.
#3
7
Each of our developers has a fully functional database. Changes are scripted and source-controlled like any other code.
#4
4
Ideally, yes: each developer should have a "sandbox" development environment, so they can test their code before deploying it to a shared testing/staging environment.
Each developer's environment should run scripted tests that reset the database to a known state. This is impossible to do in a shared environment.
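As an illustration of scripted tests that reset the database to a known state, here is a small sketch using Python's standard `unittest` and `sqlite3` modules. SQLite stands in for whatever DBMS is really in use, and the `product` table, the fixture data and the `reset_database` helper are hypothetical names chosen for the example.

```python
import sqlite3
import unittest

# Hypothetical fixture: schema plus a small, fixed data set.
KNOWN_STATE = """
DROP TABLE IF EXISTS product;
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, active INTEGER);
INSERT INTO product (name, active) VALUES ('widget', 1), ('gadget', 1);
"""


def reset_database(conn):
    """Put the developer's private database back into the known state."""
    conn.executescript(KNOWN_STATE)


class ProductListingTest(unittest.TestCase):
    conn = sqlite3.connect(":memory:")  # private to this developer's test run

    def setUp(self):
        reset_database(self.conn)  # every test starts from the same data

    def active_names(self):
        rows = self.conn.execute(
            "SELECT name FROM product WHERE active = 1 ORDER BY name")
        return [row[0] for row in rows]

    def test_deactivating_hides_product(self):
        self.conn.execute("UPDATE product SET active = 0 WHERE name = 'widget'")
        self.assertEqual(self.active_names(), ["gadget"])

    def test_all_products_visible_after_reset(self):
        # Passes whatever the test order, because setUp undoes the UPDATE.
        self.assertEqual(self.active_names(), ["gadget", "widget"])
```

Run it with `python -m unittest`. Dropping and recreating tables before every test is only safe because nobody else is using this database, which is exactly why it cannot be done on a shared instance.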
The cost of giving each developer their own instance is less than the cost of the chaos that results from multiple developers trying to test volatile changes together in a shared environment.
On the other hand, in many IT shops the system uses complex infrastructure, involving multiple application servers or multiple physical nodes. Then the economics change: it's less expensive for people to cooperate and avoid stepping on each other's work than it would be to replicate the infrastructure for each developer. This is especially true if you integrate expensive third-party systems that don't give you licenses for multiple development environments.
So the answer is yes and no. :-) Do give each developer their own environment if that environment can be reproduced inexpensively.
#5
3
My recommendation is to have two levels of development environment:
- Each developer has their own personal development system, with its own DB, web servers, etc. This allows them to code against a known setup, write automated (system-level) tests that initialize their database and systems to a known state, etc.
- The development integration environment is shared by all developers and is used to make sure everything works together as expected before handing it off to QA. Code is checked out from source control and installed there, and there's only a single instance of any server (DB or otherwise).
#6
3
This question hints at what a developer needs to do his or her job. Certainly a private DB instance should be provided. Equally important, I would make sure that the DB is the same product and version as the one you intend to deploy to. Don't develop on MySQL 6.x and deploy to MySQL 5.x. (This goes for app servers and web servers as well!)
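One way to enforce "same product and version" is a start-up check that compares the developer database's version string with the deployment target's. The sketch below is only an illustration: the function names are hypothetical, comparing just major and minor numbers is an assumption about what counts as "the same version", and obtaining the strings (for example with `SELECT VERSION()` on MySQL) is left to the caller.

```python
import re


def release_series(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '5.7.44-log'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_same_series(dev_version: str, target_version: str) -> None:
    """Raise if the developer database is on a different release series."""
    if release_series(dev_version) != release_series(target_version):
        raise RuntimeError(
            f"dev DB is {dev_version}, deployment target is {target_version}")


# Same 5.7 series, so this passes silently.
check_same_series("5.7.44-log", "5.7.12")
```

Calling `check_same_series("6.0.1", "5.7.44")` would raise, which is the situation the answer warns against.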
Having a developer DB doesn't necessarily mean you need it hosted on your local machine. You could have a central DBMS host machine with all the dev DBs located on it. The pros: a guarantee that you develop against the target DB, and less overhead on dev boxes, which leaves more space and horsepower for beefy IDEs and app servers. The cons: a single point of failure for all devs (if the DBMS server goes down, nobody can work); a lack of dev exposure to setting up and administering the DBMS; and devs cannot experiment as easily with upcoming DB releases or alternative DB choices to solve tough problems.
Some of the pros can be cons and vice versa, depending on your organization and structure. Maybe you don't want devs administering the DBMS. Maybe you do plan to support varying DB platforms. The decision boils down to your organization as well as your target platform choices. If you plan to target a variety of DB/OS/app server combinations, then each dev should not only have their own DB but should also work with a different combination (MySQL/Tomcat/OSX for one, DB2/Jetty/Linux for another, Postgres/Geronimo/WinXP for a third, etc.). If, on the other hand, you set up an ASP (Application Service Provider) type shop on an iSeries, then of course you'll likely have a central host with all the dev DBMSes; still, each dev should have at least a separate DB instance to allow structural changes to the schema.
#7
1
I have an instance of SQL Server Developer Edition installed locally. We have a QA DB server, as well as multiple production servers. All development and integration testing is done using my local server (or other developers' local servers). New releases are staged to the QA server. Each release, after acceptance by the customer, is put into production.
Since I mostly do web development, I use the web server bundled with VS2008 for development and local testing, then publish the web app to a QA web server hosted on a VM. Once accepted by the customer, it is published to one of several different production web servers (some virtual, some not, depending on the application).
#8
1
My department only has a limited number of development environments, purely because of the cost of support and hardware. We have a couple of environments that are based on t-1 nightly refreshes from production, and some static ones.
Ideally, everyone should have their own, but in many cases this is going to be impractical when the following are true:
- you have a large number of developers needing resources (our department has maybe 80)
- each developer needs multiple resources (I typically use 4-5 different DBs each day)
- up-to-date data is important (you just can't refresh them fast enough)
In these cases, shared instances and good communication are what's needed.
#9
0
One advantage of one database per developer: each developer has a snapshot of their own data in a "known" state.
#10
0
I like the idea of using a local version when a developer must be isolated: developing a schema change, performance testing, setting up specific scenarios, etc.
At other times, use the shared version to ensure everything stays in sync.
#11
0
I think there's a terminology problem here. It's been a while since I've worn my DBA hat (golly gee, almost 10 years), so someone else can chime in and correct me.
I think everyone agrees that each developer should have his own sandbox schema set.
In MySQL and Sybase/MS SQL Server, each database engine can support multiple databases. Each database is (normally) fully independent of the others. So you can have one database engine instance and give each developer his own database space to do with as he wishes. The only problem is if the developers are using tempdb: there can be collisions there (I think; you will need to look this up). Just be careful not to use cross-database queries with fixed database names.
In Oracle, the database engine instance is tied to a particular schema set. If you have multiple developers on the same engine, they are all pointing to the same tables. In this case, yes, you will need to run multiple instances.
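The warning about fixed database names can be illustrated with a small sketch. It assumes SQLite, where `ATTACH DATABASE` plays the role of "several databases on one engine"; the `sales` base name, the developer names and both helper functions are hypothetical. The point is only that the database name is derived from configuration instead of being hard-coded into the query text.

```python
import re
import sqlite3

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def dev_database_name(base: str, developer: str) -> str:
    """Return a per-developer database name such as 'sales_alice'."""
    name = f"{base}_{developer}"
    if not _IDENTIFIER.match(name):
        # Identifiers cannot be bound as SQL parameters, so validate them.
        raise ValueError(f"unsafe database name: {name!r}")
    return name


def count_orders(conn, sales_db: str) -> int:
    """Cross-database query whose database name comes from the caller."""
    return conn.execute(f"SELECT COUNT(*) FROM {sales_db}.orders").fetchone()[0]


conn = sqlite3.connect(":memory:")  # one engine instance
for developer in ("alice", "bob"):
    db = dev_database_name("sales", developer)
    conn.execute(f"ATTACH DATABASE ':memory:' AS {db}")  # one database each
    conn.execute(f"CREATE TABLE {db}.orders (id INTEGER PRIMARY KEY)")

# Alice inserts a row; Bob's copy of the table is unaffected.
conn.execute("INSERT INTO sales_alice.orders DEFAULT VALUES")
print(count_orders(conn, dev_database_name("sales", "alice")))
print(count_orders(conn, dev_database_name("sales", "bob")))
```

Had `count_orders` hard-coded a single database name, every developer's code would have read the same tables regardless of which sandbox they thought they were using.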
#12
0
Each of our developers has a local database. We store the create script AND a dump of the "standard data" in our SVN repo. We have an extensive set of tests that must pass against this test data. We also have a "sandbox" database where people can put data that they want merged into the standard data. This works well for us: developers can modify their local copies of the data to test things, but we control what gets passed on to other developers. We also strictly control schema changes, so we don't encounter the problems that someone else mentioned.
#13
0
It really depends on the nature of your application. If yours is a client-server architecture in a distributed environment, it is best to have a central database that everyone uses. If the product gives users an environment with local database instances, you can use that. It is best if your development environment mirrors the real-world environment as closely as possible.
It also depends on what stage of development you are in. In the early stages, you probably don't want to get bogged down by connectivity, network and distributed-environment issues and just want to be up and running. In such a case, you can start with a database-instance-per-user model before switching to the central model as the product reaches some level of maturity.
#14
0
In my company we tend to copy the entire DB when working on non-trivial new features. The reasoning is that disk space is cheap, whereas accidental data loss (even if it's test data) isn't.
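For a sense of how cheap such a copy can be, here is a sketch that uses SQLite's online backup through `sqlite3.Connection.backup` (available from Python 3.7). SQLite is an assumption standing in for the real DBMS, and the `customer` table and `copy_database` helper are hypothetical.

```python
import sqlite3


def copy_database(source: sqlite3.Connection) -> sqlite3.Connection:
    """Return an independent in-memory copy of `source`."""
    scratch = sqlite3.connect(":memory:")
    source.backup(scratch)  # copies every page of the source database
    return scratch


main_db = sqlite3.connect(":memory:")
main_db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
main_db.execute("INSERT INTO customer (name) VALUES ('Acme')")
main_db.commit()

scratch_db = copy_database(main_db)
scratch_db.execute("DELETE FROM customer")  # an experiment gone wrong
scratch_db.commit()

# The original data is untouched by what happened in the copy.
remaining = main_db.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(remaining)
```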
#15
0
I've worked in both types of development environment. Personally, I prefer to have my own DB/app server. However, there may be some advantages to using a shared infrastructure for development.
The main one is that a shared environment more closely resembles a real-world scenario: you are more likely to uncover problems with locking or transactions when all developers share a DB. Giving each developer their own DB may lead to "it works on my DB" syndrome.
However, if you need to apply and test schema changes or optimisations, then I can see problems with this sort of set-up.
Maybe a compromise solution would work best: all developers share infrastructure, and if someone needs to test schema changes, they create their own temporary DB instance (maybe there is one just sitting there for this purpose?) until they are happy to commit the new schema to source control.
You do have your entire schema (and test data) in source control, right? Right???
#16
0
I like the compromise solution: all developers share infrastructure, and if someone needs to test schema changes, they create their own temporary DB instance (maybe there is one just sitting there for this purpose?) until they are happy to commit the new schema to source control.
#17
0
One DB per developer. No question. But the real issue is how to script entire databases and "control data", and how to version them. My solution is here: http://dbsourcetools.codeplex.com/ Have fun. - Nathan.
#18
0
The database schemas should be held in source control, and developers should own the changesets they check in for code and DB together. Prior to check-in, the developer should be working on his own database. After check-in, an automated build (e.g. on check-in, nightly, etc.) should update a central integrated DB, along with the apps themselves.
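The "automated build updates the integrated DB" step can be sketched as a runner that applies only the changesets the target has not yet seen. This is an illustration under stated assumptions rather than any particular tool: SQLite stands in for the real DBMS, and the `schema_version` bookkeeping table, the `CHANGESETS` mapping and the `invoice` table are hypothetical.

```python
import sqlite3

# Hypothetical changesets keyed by version number; in practice these would be
# the scripts checked in alongside the code.
CHANGESETS = {
    1: "CREATE TABLE invoice (id INTEGER PRIMARY KEY, total REAL);",
    2: "ALTER TABLE invoice ADD COLUMN paid INTEGER NOT NULL DEFAULT 0;",
}


def apply_pending(conn):
    """Apply changesets not yet recorded in schema_version; return their numbers."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    applied = []
    for version in sorted(CHANGESETS):
        if version in done:
            continue  # this database already has the change
        conn.executescript(CHANGESETS[version])
        conn.execute(
            "INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied


integrated_db = sqlite3.connect(":memory:")
first_run = apply_pending(integrated_db)   # applies both changesets
second_run = apply_pending(integrated_db)  # nothing left to apply
print(first_run, second_run)
```

The same runner works against a developer's private database before check-in and against the integrated database afterwards, so both are brought forward by the same scripts.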
At the developer-instance level, the data loaded should be appropriate for unit testing, at least. At the integrated level, the shared DB should also hold data appropriate for testing, but should not rely on production replication; that is just a slack substitute for managed test data.
In my experience, the only reason developers opt for a shared DB is that they believe developing and running on recent production data is somehow 'real' and means they can put less effort into testing. They prefer to tread on each other's toes and put up with a shared DB that slowly corrupts before the next production refresh rather than write and manage proper tests. It's this kind of management practice that gives the IT world its current poor reputation for delivery.
#19
-1
I'd suggest using one instance of the database. You don't want your database to be a moving target.
#1
34
If you all share the same database, you might have some issues if someone make a structure change to the database and that the code is not "Synchronized" with it.
如果您共享同一个数据库,如果有人对数据库进行结构更改,并且代码没有与之“同步”,那么您可能会遇到一些问题。
I highly recommend one DB per developer for the only reason that you don't want to do "write" test to see someone else override you right after. A simple exemple? You try to display product for a website. Everything works until all the products disappear. Problem? Another developer decided to play with the "Active" flag of the product to test something else. In cases like that, a transaction might not even work. End of the story, you spend time debugging for someone else action.
我强烈推荐每个开发人员使用一个DB,唯一的原因是您不希望进行“写”测试,以便看到其他人马上覆盖您。一个简单的例子吗?你尝试为一个网站显示产品。所有的东西都有效,直到所有的产品消失。问题吗?另一个开发人员决定使用产品的“活动”标志来测试其他东西。在这样的情况下,事务甚至可能无法工作。故事的结尾,你花时间调试别人的行动。
I highly recommend replicating the staging database to the developer database once in a while to synchronize the structure (or better, have a tool to rebuild a database from scratch).
我强烈建议将登台数据库偶尔复制到开发人员数据库,以便同步结构(或者更好的方法是使用工具从头重新构建数据库)。
Of course, we require scripts for changes to the database and EVERYTHING is in a Source Control.
当然,我们需要对数据库进行修改的脚本,并且所有内容都在源代码控件中。
#2
15
The days when database environments should be scarce are long gone. I'm writing this posting on a XW9300 with 5x15k SCSI disks in it. This machine will run a substantial ETL job in a fairly reasonable length of time and (in mid-2007) cost me about £1,700 on ebay including the disks. From a developer's perspective, especially on database centric projects like data warehousing, the line between a developer and a DBA is quite blurred. As I write this I am building a partition management framework for a SQL Server 2005 data warehouse.
数据库环境应该稀缺的日子早已一去不复返了。我在XW9300上写这篇文章,里面有5x15k的SCSI磁盘。这台机器将运行一个实质性的ETL作业一个相当合理的时间长度和(2007年中期)花了我在ebay上约£1700,包括磁盘。从开发人员的角度来看,特别是在数据仓库等以数据库为中心的项目中,开发人员和DBA之间的界限非常模糊。在编写本文时,我正在为SQL Server 2005数据仓库构建一个分区管理框架。
Developers should have one or more development databases of their own for (IMO) these reasons:
开发人员应该拥有一个或多个自己的开发数据库(IMO),原因如下:
-
Requires people to keep stored procedures, patch scripts and schema definition files in source control. Applying the patches can be automated to a fairly large extent. There are even tools such as Redgate SQL Compare Pro that do much of the grunt work for this.
要求人们在源代码控制中保存存储过程、补丁脚本和模式定义文件。应用补丁可以在很大程度上实现自动化。甚至还有Redgate SQL Compare Pro这样的工具,它们为此做了大量繁重的工作。
-
Encourages an application architecture that facilitates easy configuration management and deployment, as people have to deploy onto their own workstations. Many deployment wrinkles will get sorted out long before they hit production or people even realise they could have gone wrong.
鼓励易于配置管理和部署的应用程序体系结构,因为人们必须将其部署到自己的工作站上。许多部署问题在进入生产阶段之前就会得到解决,或者人们甚至会意识到它们可能会出错。
-
Avoids developers tripping up on each other's work. On something like a data warehouse where people are working with ETL code this is an even bigger win.
避免开发人员互相干扰对方的工作。在数据仓库等人们使用ETL代码的地方,这是一个更大的胜利。
-
It encourages a degree of responsibility as developers have to learn basic database administration. This also eliminates a lot of the requirements for operational support staff and some of the dev-vs. ops friction.
它鼓励一定程度的责任,因为开发人员必须学习基本的数据库管理。这也消除了许多业务支持人员和一些开发人员的需求。运维摩擦。
-
If you have your own database, there are no gatekeepers obstructing experimentation or other work on it. The politics around managing 'servers' disappear as there are no 'servers'. This is a productivity win in an any environment with significant incumbent bureaucracy.
如果您有自己的数据库,则不会有看门人妨碍实验或其他工作。管理“服务器”的政治色彩随着没有“服务器”而消失。这是一个生产力的胜利,在任何一个现任官僚的环境中。
-
For small data volumes an ordinary PC is fast enough for this. Developer editions or licencing are available for most if not all database management systems and will run on a desktop O/S. If you're working with Linux or Unix this is even less of an issue. For larger data volumes, up to and including most MIS applications, a workstation like an HP XW9400 or Lenovo D10 can be outfitted with 5 15k disks for less than the cost of a lot of professional development tooling. (Yes, I know it's dual licence, but a commercial all-platform licence for QT is about £4000 a seat). A machine like this will run an ETL process with 10's to 100's of millions of rows faster than you might think.
对于小数据量,一台普通的PC就足够快了。大多数数据库管理系统(如果不是全部的话)都可以使用开发人员版本或许可,并将在桌面O/S上运行。如果您使用的是Linux或Unix,那么这个问题就更小了。对于更大的数据量(包括大多数MIS应用程序),一台像惠普XW9400或联想D10这样的工作站可以配备5个15k磁盘,其成本低于许多专业开发工具的成本。(是的,我知道这是双牌照,但是商业所有平台牌照QT约£4000个座位)。像这样的机器将运行ETL进程,速度比您想象的要快10到100万行。
-
It facilitates setting up more than one environment for smoke testing or reconciliation purposes. As you have complete control over the machine, you have quite a lot of scope for mocking up conditions in a production environment. For example, I once made a simple emulator for Control-M by just bodging some of its runtime scripts. Where you have this level of control and transparency over the environment you can produce a fairly robustly tested deployment process which does quite a lot to eliminate opportunities for finger-pointing in production deployment.
它有助于为烟雾测试或调解目的建立多个环境。由于您对机器有完全的控制,所以在生产环境中,您有相当大的范围来模拟环境。例如,我曾经为Control-M创建了一个简单的仿真器,方法是只修改它的一些运行时脚本。当您拥有对环境的这种级别的控制和透明性时,您就可以生成一个相当健壮的测试部署过程,这对于消除在生产部署中指手画脚的机会有很大的作用。
I've seen small teams working with 14 environments, and had 7 active on a workstation at the same time. On database heavy work such as ETL, where you're with with whole tables, working in a single dev environment is a recipe for time wastage or spending your time walking on eggshells.
我见过一些团队在14个环境中工作,同时在工作站上有7个活动。在ETL这样的数据库繁重工作中,您需要处理整个表,在一个开发环境中工作是浪费时间或浪费时间的方法。
Also, you can use single user development licences for database platforms, which can save you the cost of the workstations just in database licencing. Most developer licences (Microsoft and OTN are a couple of examples I'm familiar with) will let you use the system on a single workstation for a single developer free or for a nominal price.
此外,您还可以为数据库平台使用单个用户开发许可证,这可以为您节省仅在数据库许可中使用工作站的成本。大多数开发人员许可证(Microsoft和OTN是我所熟悉的几个例子)将允许您在单个开发人员的工作站上免费或以名义价格使用系统。
Conversely, licencing terms on shared development servers are often somewhat murky and I've seen vendors try to shake customers down for licencing on dev servers on more than one occasion.
相反,共享开发服务器上的许可条款通常比较模糊,我曾看到供应商不止一次地在开发服务器上试图让客户失望。
#3
7
Each of our developers has a fully functional database. Changes are scripted and source controlled like any other code.
我们的每个开发人员都有一个功能齐全的数据库。更改是脚本化的,并且像其他代码一样由源代码控制。
#4
4
Ideally, yes, each developer should have a "sandbox" development environment, so they can test their code even before deploying it to a shared testing/staging environment.
理想情况下,每个开发人员都应该有一个“沙箱”开发环境,这样他们就可以在将代码部署到共享测试/交付环境之前测试他们的代码。
Each developer's environment should run scripted tests that reset the database to a known state. This is impossible to do in a shared environment.
每个开发人员的环境都应该运行脚本化的测试,将数据库重置为已知状态。这在共享环境中是不可能实现的。
The cost of giving each developer their own instance is less than the cost of the chaos resulting from multiple developers trying to test volatile changes together in a shared environment.
为每个开发人员提供他们自己的实例的成本,要低于多个开发人员试图在共享环境中一起测试不稳定变化所造成的混乱的成本。
On the other hand, in many IT shops the system uses complex infrastructure, involving multiple application servers or multiple physical nodes. Then the economics change; it's less expensive for people to cooperate and avoid stepping on each other's work than it would be to replicate it for each developer. Especially true if you integrate expensive third-party systems that don't give you licenses for multiple development environments.
另一方面,在许多IT商店中,系统使用复杂的基础设施,涉及多个应用服务器或多个物理节点。然后经济变化;与为每个开发人员复制工作相比,人们合作和避免相互干涉工作的成本更低。特别是如果您集成了昂贵的第三方系统,而这些系统不会为多个开发环境提供许可。
So the answer is yes and no. :-) Do give each developer their own environment if that environment can be reproduced inexpensively.
所以答案是肯定和否定。:-)如果能以低成本复制每个开发人员自己的环境。
#5
3
My recommendation is to have 2 levels of development environment:
我的建议是有两个层次的开发环境:
Each developer has their own personal development system, with its own dp, web servers, etc. This allows them to code against a known setup, write automated (system level) tests that initialize their database and systems to a known state, etc.
每个开发人员都有自己的个人开发系统,有自己的dp、web服务器等。这允许他们根据已知的设置编写代码,编写自动(系统级)测试,将数据库和系统初始化为已知状态,等等。
The development integration environment is shared by all developers and used to make sure everything is working together as expected before handing it off to QA. Code is checked out from source control and installed there, and there's only a single instance of any servers (db or otherwise).
开发集成环境由所有开发人员共享,并用于确保所有的开发人员在将其交付给QA之前都按照预期工作。代码从源代码控制中检出并安装在那里,并且任何服务器(db或其他)只有一个实例。
#6
3
This question hints at what a developer needs to do his/her job. Certainly a private DB instance should be provided. Equally important, I would make sure that the DB is the same product/version as what you intend to deploy to. Don't develop on MySQL 6.x and deploy to MySQL 5.x. (This goes for app servers, and web servers as well!)
这个问题暗示了开发人员需要做什么工作。当然,应该提供一个私有的DB实例。同样重要的是,我将确保DB与您打算部署的产品/版本相同。不要在MySQL 6上开发。部署到MySQL 5.x。(这也适用于应用服务器和web服务器!)
Having a developer DB doesn't necessarily ean you need it hosted on your local machine. You could have a central DBMS host machine with all dev dbs located on it. The pros are the garauntee that you develop against the target DB. Less overhead on dev boxes, more space/horsepower for beefy IDEs and app servers. The cons are single point of failure for all devs. (The DBMS server goes down nobody can work.) Lack of dev exposure to setting up and administering the DBMS. Devs cannot experiment as easily with upcoming DB releases or alternate DB choices to solve tough problems.
拥有一个开发人员DB并不一定需要将它托管在本地机器上。您可以有一个*DBMS主机,其中包含所有的开发dbs。优点是针对目标DB开发的结束语。开发框上的开销更少,健壮的ide和应用服务器的空间更大。缺点是所有开发人员的单一失败点。(DBMS服务器坏了,没人能工作。)缺少建立和管理DBMS的开发公开。Devs不能很容易地使用即将发布的DB版本或替代DB的选择来解决棘手的问题。
Some of the pros can be cons and vice-versa depending on your organization and structure. Maybe you don't want devs administering the DBMS. Maybe you do plan to support varying DB platforms. The decision boils down to your organization as well as your target platform choices. If you plan to target a variety of DB/OS/app server combinations then each dev should not only have their own DB but should work in a unique combination. (MySQL/Tomcat/OSX for one DB2/Jetty/Linux for another Postegres/Geronimo/WinXP for a 3rd, etc.) If you setup an ASP (Application Service Provider) type shop on an iSeries on the other hand then of course you'll likely have a central host with all dev dbmses still each dev should have at least a separate db instance to allow structural changes to schema.
根据您的组织和结构,一些优点可能是缺点,反之亦然。也许您不希望开发人员管理DBMS。也许您确实计划支持不同的DB平台。决定归结为您的组织以及您的目标平台选择。如果您计划针对各种DB/OS/app服务器组合,那么每个开发人员不仅应该有自己的DB,而且应该以独特的组合工作。(MySQL / Tomcat / OSX一个DB2 / Jetty / Linux为另一个Postegres / Geronimo / WinXP第三,等等)如果你设置一个ASP(应用服务提供商)类型店一个iSeries另一方面当然你可能会有一个*主机和所有dev dbms仍然每个开发者都应该至少有一个单独的数据库实例允许结构变化模式。
#7
1
I have an instance of SQL Server Developer Edition installed locally. We have a QA DB server, as well as multiple production servers. All development and integration testing is done using my local server (or other developers' local servers). New releases are staged to the QA server. Each release, after acceptance by the customer, is put into production.
Since I mostly do web development, I use the web server bundled with VS2008 for development and local test, then publish the web app to a QA web server hosted on a VM. Once accepted by the customer, it is published to one of several different production web servers -- some virtual, some not, depending on the application.
#8
1
My department at my company only has limited development environments, purely because of cost of support and hardware. We have a couple of environments which are based on t-1 nightly refreshes from production, and some static ones.
Ideally, everyone should have their own, but in many cases this is going to be impractical when the following are true:
- you have a large number of developers needing resources (our department has maybe 80)
- each developer needs multiple resources (typically I use 4-5 different DBs each day)
- up-to-date data is important (you just can't refresh them fast enough)
In these cases, shared instances and good communication are what's needed.
#9
0
One advantage of one database per developer: each developer has a snapshot of their own data in a "known" state.
#10
0
I like the idea of using a local version when a developer must be isolated - developing a schema change, performance testing, setting up specific scenarios, etc...
At other times, use the shared version to ensure everything is in sync.
#11
0
I think there's a terminology problem here. It's been a while since I've worn my DBA hat (golly gee, almost 10 years) - so someone else can chime in and correct me.
I think everyone is in agreement that each developer should have his own sandbox schema set.
In MySQL and Sybase/MS SQL Server, each database engine can support multiple databases. Each database is (normally) fully independent of the others. So you can have one database engine instance and give each developer his own database space to do with as he wishes. The only problem is if the developers are using tempdb: there can be collisions there (I think; you will need to look this up). Just be careful not to use cross-database queries with fixed database names.
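To illustrate the point about fixed database names, here is a minimal Python sketch. It assumes MySQL-style `database.table` naming (SQL Server would use three-part names), and the helper names `dev_database_name` and `qualified`, the `app_dev_user` naming scheme, and the `orders`/`invoices` tables are all made up for the example rather than taken from the answer.

```python
import getpass
import re


def dev_database_name(app, user=None):
    """Return a per-developer database name such as 'orders_dev_alice'."""
    user = user or getpass.getuser()
    # Keep only characters that are safe in an unquoted identifier.
    safe_user = re.sub(r"[^a-z0-9_]", "_", user.lower())
    return f"{app}_dev_{safe_user}"


def qualified(table, database):
    """Qualify a table with a database name taken from configuration."""
    return f"{database}.{table}"


# Hypothetical cross-database query: both database names are derived from the
# developer's own settings instead of being fixed in the SQL text.
orders_db = dev_database_name("orders", "alice")
billing_db = dev_database_name("billing", "alice")
query = (
    f"SELECT o.id, i.total FROM {qualified('orders', orders_db)} AS o "
    f"JOIN {qualified('invoices', billing_db)} AS i ON i.order_id = o.id"
)
print(query)
```

With something like this, the same query text works against any developer's pair of databases on the shared engine, because nothing in it names a specific database.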
In Oracle, the database engine instance is tied to a particular schema set. If you have multiple developers on the same engine, they are all pointing to the same tables. In this case, yes, you will need to run multiple instances.
#12
0
Each of our developers has a local database. We store the create script AND a dump of the "standard data" in our SVN repo. We have an extensive set of tests that must pass against this test data. We also have a "sandbox" database that is available for people to put data in that they want shared into the standard data. This works well for us and allows us to let developers modify their local copies of data to test things, but we control what gets passed to other developers. We also strictly control schema changes, so we don't encounter the problems that someone else mentioned.
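A rough sketch of the "create script plus standard data" idea, in Python with the standard-library `sqlite3` module standing in for the real DBMS. The script contents, the `product` table and the function names are invented for illustration; in practice the two scripts would be files checked out from the repository.

```python
import sqlite3

# Stand-ins for the files that would live in version control
# (hypothetical contents; the answer does not show its scripts).
CREATE_SCRIPT = """
CREATE TABLE product (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    active INTEGER NOT NULL DEFAULT 1
);
"""

STANDARD_DATA = """
INSERT INTO product (id, name, active) VALUES (1, 'Widget', 1);
INSERT INTO product (id, name, active) VALUES (2, 'Gadget', 0);
"""


def rebuild_local_db(path=":memory:"):
    """Create a fresh database in the known 'standard data' state."""
    conn = sqlite3.connect(path)
    conn.executescript("DROP TABLE IF EXISTS product;")
    conn.executescript(CREATE_SCRIPT)
    conn.executescript(STANDARD_DATA)
    conn.commit()
    return conn


def active_product_names(conn):
    """Return the names of active products, sorted alphabetically."""
    rows = conn.execute("SELECT name FROM product WHERE active = 1 ORDER BY name")
    return [name for (name,) in rows]


conn = rebuild_local_db()
# A developer can change local data freely...
conn.execute("UPDATE product SET active = 0")
# ...and get back to the shared baseline by rebuilding.
conn = rebuild_local_db()
print(active_product_names(conn))
```

Tests written against the standard data can then assume the baseline state, whatever experiments were run locally beforehand.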
#13
0
It really depends on the nature of your application. If yours is a client-server architecture in a distributed environment, it is best to have a central database that everyone uses. If the product gives users an environment with local database instances, you can use that. It is best if your development mirrors the real world environment as closely as possible.
It also depends on what stage of development you are in. In the early stages, you probably don't want to get bogged down by connectivity, network and distributed environment issues; you just want to be up and running. In such a case, you can start with a database instance-per-user model before switching to the central model as the product reaches some level of maturity.
#14
0
In my company we tend to copy the entire DB when working on non-trivial new features. The reasoning is that disk space is cheap, whereas accidental data loss (even if it's test data) isn't.
#15
0
I've worked in both types of development environments. Personally, I prefer to have my own DB/app server. However, there may be some advantages to using a shared infrastructure for development.
The main one is that a shared environment more closely resembles a real-world scenario: you are more likely to uncover problems with locking or transactions when all developers share a DB. Giving each developer their own DB may lead to "it works on my DB" syndrome.
However, if you need to apply and test schema changes or optimisations, then I can see problems in this sort of set-up.
Maybe a compromise solution would work best: all developers share infrastructure, and if someone needs to test schema changes, they create their own temporary DB instance (maybe there is one just sitting there for this purpose?) until they are happy to commit the new schema to source control.
You do have your entire schema (and test data) in source control, right? Right???
#16
0
I like the compromise solution (all developers share infrastructure, and if someone needs to test schema changes, they create their own temporary DB instance (maybe there is one just sitting there for this purpose?) until they are happy to commit the new schema to source control.)
#17
0
One DB per developer. No question. But the real issue is how to script entire databases and "control data", and how to version them. My solution is here: http://dbsourcetools.codeplex.com/ Have fun. - Nathan.
#18
0
The database schemas should be held in source control, and developers should own the changesets checked in for code and DB together. Prior to check-in, the developer should be working on his own database. After check-in, an automated build (e.g. on check-in, nightly, etc.) should update a central integrated DB, along with the apps themselves.
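One way the "automated build updates the integrated DB" step could look, sketched in Python with the standard-library `sqlite3` module as a stand-in for the real DBMS. The `schema_version` tracking table, the migration names and the `apply_pending` function are assumptions made for this example, not something the answer prescribes.

```python
import sqlite3

# Hypothetical ordered change scripts, as they might be checked in next to the code.
MIGRATIONS = [
    ("001_create_customer",
     "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    ("002_add_customer_email",
     "ALTER TABLE customer ADD COLUMN email TEXT"),
]


def apply_pending(conn, migrations=MIGRATIONS):
    """Apply migrations not yet recorded in schema_version; return their names."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (name TEXT PRIMARY KEY)")
    applied = {name for (name,) in conn.execute("SELECT name FROM schema_version")}
    newly_applied = []
    for name, sql in migrations:
        if name in applied:
            continue  # already present in this database
        conn.execute(sql)
        conn.execute("INSERT INTO schema_version (name) VALUES (?)", (name,))
        newly_applied.append(name)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
print(apply_pending(conn))  # first run applies both scripts
print(apply_pending(conn))  # second run finds nothing to do
```

The same routine can run against a developer's own database before check-in and against the integrated one in the build, so both move through the same sequence of changes.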
At developer instance level the data loaded should be appropriate for unit testing, at least. At integrated level, the shared db should hold data also appropriate for testing, but should not rely on production replication - this is just a slack substitute for managed test data.
In my experience the only reason that developers opt for a shared DB is that they believe that developing and running on recent production data is somehow 'real' and means that they can put less effort into testing. They prefer to tread on each other's toes and put up with a shared DB that slowly corrupts before the next production refresh rather than write and manage proper tests. It's this kind of management practice that gives the IT world its current poor reputation for delivery.
#19
-1
I'd suggest using one instance of the database. You don't want your database to be a moving target.