对于数据库,每个开发人员应该了解什么?

时间:2022-09-11 10:19:01

Whether we like it or not, many if not most of us developers either regularly work with databases or may have to work with one someday. And considering the amount of misuse and abuse in the wild, and the volume of database-related questions that come up every day, it's fair to say that there are certain concepts that developers should know - even if they don't design or work with databases today. So:

不管我们喜欢与否,我们中的大多数开发人员要么经常使用数据库,要么有朝一日不得不使用数据库。考虑到大量的滥用和滥用,以及每天都会出现的大量与数据库相关的问题,可以公平地说,开发人员应该知道某些概念——即使他们现在没有设计或使用数据库。所以:



What are the important concepts that developers and other software professionals ought to know about databases?


Guidelines for Responses:


Keep your list short.
One concept per answer is best.

保持你的短名单。每个答案都有一个概念是最好的。

Be specific.
"Data modelling" may be an important skill, but what does that mean precisely?

要具体。“数据建模”可能是一项重要的技能,但这究竟意味着什么呢?

Explain your rationale.
Why is your concept important? Don't just say "use indexes." Don't fall into "best practices." Convince your audience to go learn more.

解释你的理由。为什么你的概念很重要?不要只说“使用索引”。不要陷入“最佳实践”。说服你的听众去了解更多。

Upvote answers you agree with.
Read other people's answers first. One high-ranked answer is a more effective statement than two low-ranked ones. If you have more to add, either add a comment or reference the original.

赞成你的回答。先读别人的答案。一个高排名的答案比两个低排名的答案更有效。如果要添加更多内容,可以添加注释或引用原文。

Don't downvote something just because it doesn't apply to you personally.
We all work in different domains. The objective here is to provide direction for database novices to gain a well-founded, well-rounded understanding of database design and database-driven development, not to compete for the title of most-important.

不要因为某件事不适合你,就对它投反对票。我们都在不同的领域工作。这里的目标是为数据库新手提供方向,使他们对数据库设计和数据库驱动开发有一个充分的、全面的了解,而不是竞争最重要的头衔。

31 个解决方案

#1


104  

The very first thing developers should know about databases is this: what are databases for? Not how do they work, nor how do you build one, nor even how do you write code to retrieve or update the data in a database. But what are they for?

开发人员首先应该知道的是:数据库的用途是什么?不是如何工作,也不是如何构建,甚至不是如何编写代码来检索或更新数据库中的数据。但是它们是用来干什么的呢?

Unfortunately, the answer to this one is a moving target. In the heydey of databases, the 1970s through the early 1990s, databases were for the sharing of data. If you were using a database, and you weren't sharing data you were either involved in an academic project or you were wasting resources, including yourself. Setting up a database and taming a DBMS were such monumental tasks that the payback, in terms of data exploited multiple times, had to be huge to match the investment.

不幸的是,这个问题的答案是一个移动的目标。在数据库的heydey中,20世纪70年代到90年代早期,数据库是用来共享数据的。如果你正在使用一个数据库,而你没有共享数据,你要么参与了一个学术项目,要么你在浪费资源,包括你自己。建立数据库和管理数据库管理系统是一项非常重要的任务,因此,在数据被多次利用的情况下,回报必须是巨大的,才能与投资相匹配。

Over the last 15 years, databases have come to be used for storing the persistent data associated with just one application. Building a database for MySQL, or Access, or SQL Server has become so routine that databases have become almost a routine part of an ordinary application. Sometimes, that initial limited mission gets pushed upward by mission creep, as the real value of the data becomes apparent. Unfortunately, databases that were designed with a single purpose in mind often fail dramatically when they begin to be pushed into a role that's enterprise wide and mission critical.

在过去的15年中,数据库被用于存储与一个应用程序关联的持久数据。为MySQL或SQL Server构建数据库已经变得如此常规,以至于数据库几乎成了普通应用程序的常规部分。有时,最初的有限任务被任务蠕变推高,因为数据的真正价值变得明显。不幸的是,设计时只考虑了一个目的的数据库,当它们开始被推到企业范围和任务至关重要的角色时,常常会发生戏剧性的失败。

The second thing developers need to learn about databases is the whole data centric view of the world. The data centric world view is more different from the process centric world view than anything most developers have ever learned. Compared to this gap, the gap between structured programming and object oriented programming is relatively small.

开发人员需要了解的第二件事是数据库的整个数据中心视图。以数据为中心的世界观与以流程为中心的世界观比大多数开发人员都学过的东西更不同。与此差距相比,结构化编程与面向对象编程之间的差距相对较小。

The third thing developers need to learn, at least in an overview, is data modeling, including conceptual data modeling, logical data modeling, and physical data modeling.

开发人员需要学习的第三件事是数据建模,至少在概述中是这样,包括概念数据建模、逻辑数据建模和物理数据建模。

Conceptual data modeling is really requirements analysis from a data centric point of view.

概念数据建模实际上是从数据中心的角度进行需求分析。

Logical data modeling is generally the application of a specific data model to the requirements discovered in conceptual data modeling. The relational model is used far more than any other specific model, and developers need to learn the relational model for sure. Designing a powerful and relevant relational model for a nontrivial requirement is not a trivial task. You can't build good SQL tables if you misunderstand the relational model.

逻辑数据建模通常是特定数据模型对概念数据建模中发现的需求的应用。关系模型比其他任何特定的模型使用得都多,开发人员肯定需要学习关系模型。为一个重要的需求设计一个强大的、相关的关系模型并不是一项简单的任务。如果错误理解关系模型,就无法构建良好的SQL表。

Physical data modeling is generally DBMS specific, and doesn't need to be learned in much detail, unless the developer is also the database builder or the DBA. What developers do need to understand is the extent to which physical database design can be separated from logical database design, and the extent to which producing a high speed database can be accomplished just by tweaking the physical design.

物理数据建模通常是特定于DBMS的,不需要详细了解,除非开发人员也是数据库构建者或DBA。开发人员需要理解的是物理数据库设计可以从逻辑数据库设计中分离出来的程度,以及通过调整物理设计来实现高速数据库的程度。

The next thing developers need to learn is that while speed (performance) is important, other measures of design goodness are even more important, such as the ability to revise and extend the scope of the database down the road, or simplicity of programming.

开发人员需要学习的下一件事是,虽然速度(性能)是重要的,但是其他设计良善的度量方法更重要,比如修改和扩展数据库范围的能力,或者编程的简单性。

Finally, anybody who messes with databases needs to understand that the value of data often outlasts the system that captured it.

最后,任何把数据库搞砸的人都需要明白,数据的价值往往比捕获数据的系统更持久。

Whew!

唷!

#2


72  

Good question. The following are some thoughts in no particular order:

好问题。以下是一些没有特别顺序的想法:

  1. Normalization, to at least the second normal form, is essential.

    标准化,至少是第二种范式,是必不可少的。

  2. Referential integrity is also essential, with proper cascading delete and update considerations.

    引用完整性也很重要,需要适当的级联删除和更新注意事项。

  3. Good and proper use of check constraints. Let the database do as much work as possible.

    良好和正确地使用检查约束。让数据库做尽可能多的工作。

  4. Don't scatter business logic in both the database and middle tier code. Pick one or the other, preferably in middle tier code.

    不要在数据库和中间层代码中分散业务逻辑。选择其中之一,最好是在中间层代码中。

  5. Decide on a consistent approach for primary keys and clustered keys.

    确定主键和集群键的一致方法。

  6. Don't over index. Choose your indexes wisely.

    不要在索引。明智地选择你的索引。

  7. Consistent table and column naming. Pick a standard and stick to it.

    一致的表和列命名。选择一个标准并坚持下去。

  8. Limit the number of columns in the database that will accept null values.

    限制数据库中接受空值的列的数量。

  9. Don't get carried away with triggers. They have their use but can complicate things in a hurry.

    不要被触发器冲昏了头脑。它们有自己的用途,但可能会使事情迅速复杂化。

  10. Be careful with UDFs. They are great but can cause performance problems when you're not aware how often they might get called in a query.

    小心使用udf。它们很好,但是当您不知道它们在查询中被调用的频率时,可能会导致性能问题。

  11. Get Celko's book on database design. The man is arrogant but knows his stuff.

    获取Celko关于数据库设计的书。这个人很傲慢,但他知道自己的事情。

#3


22  

First, developers need to understand that there is something to know about databases. They're not just magic devices where you put in the SQL and get out result sets, but rather very complicated pieces of software with their own logic and quirks.

首先,开发人员需要了解关于数据库的一些信息。它们不仅是输入SQL并输出结果集的神奇设备,而且是非常复杂的具有自己逻辑和怪癖的软件。

Second, that there are different database setups for different purposes. You do not want a developer making historical reports off an on-line transactional database if there's a data warehouse available.

其次,不同的数据库设置有不同的目的。如果有可用的数据仓库,您不希望开发人员从联机事务数据库中生成历史报告。

Third, developers need to understand basic SQL, including joins.

第三,开发人员需要理解基本的SQL,包括连接。

Past this, it depends on how closely the developers are involved. I've worked in jobs where I was developer and de facto DBA, where the DBAs were just down the aisle, and where the DBAs are off in their own area. (I dislike the third.) Assuming the developers are involved in database design:

除此之外,还取决于开发人员的参与程度。我曾在开发人员和实际上的DBA的工作中工作过,在那里DBA刚刚起步,DBA也在他们自己的领域。(我不喜欢第三个。)假设开发人员参与数据库设计:

They need to understand basic normalization, at least the first three normal forms. Anything beyond that, get a DBA. For those with any experience with US courtrooms (and random television shows count here), there's the mnemonic "Depend on the key, the whole key, and nothing but the key, so help you Codd."

他们需要了解基本的标准化,至少前三种正常形式。除此之外,找一个DBA吧。对于那些有美国法庭经验的人(和随机的电视节目在这里算),有一个助记符“取决于关键,整个关键,除了关键,什么都没有,所以帮助你Codd。”

They need to have a clue about indexes, by which I mean they should have some idea what indexes they need and how they're likely to affect performance. This means not having useless indices, but not being afraid to add them to assist queries. Anything further (like the balance) should be left for the DBA.

他们需要知道索引,我的意思是他们应该知道他们需要什么索引,以及它们可能如何影响性能。这意味着没有无用的索引,但是不要害怕添加它们来帮助查询。任何更进一步的东西(如余额)都应该留给DBA。

They need to understand the need for data integrity, and be able to point to where they're verifying the data and what they're doing if they find problems. This doesn't have to be in the database (where it will be difficult to issue a meaningful error message for the user), but has to be somewhere.

他们需要理解数据完整性的需要,并且能够指出他们正在验证数据的位置,以及如果发现问题,他们正在做什么。这并不一定要在数据库中(在数据库中很难为用户发出有意义的错误消息),但必须在某个地方。

They should have the basic knowledge of how to get a plan, and how to read it in general (at least enough to tell whether the algorithms are efficient or not).

他们应该具备如何制定计划以及如何通读计划的基本知识(至少足以判断算法是否有效)。

They should know vaguely what a trigger is, what a view is, and that it's possible to partition pieces of databases. They don't need any sort of details, but they need to know to ask the DBA about these things.

他们应该模糊地知道什么是触发器,什么是视图,并且可以对数据库进行分区。他们不需要任何细节,但他们需要知道如何向DBA询问这些事情。

They should of course know not to meddle with production data, or production code, or anything like that, and they should know that all source code goes into a VCS.

他们当然应该知道不要干预生产数据,或者生产代码,或者类似的东西,并且他们应该知道所有的源代码都属于vc。

I've doubtless forgotten something, but the average developer need not be a DBA, provided there is a real DBA at hand.

我肯定忘记了一些事情,但是一般的开发人员不需要是DBA,只要有一个真正的DBA在手边。

#4


19  

Basic Indexing

I'm always shocked to see a table or an entire database with no indexes, or arbitrary/useless indexes. Even if you're not designing the database and just have to write some queries, it's still vital to understand, at a minimum:

看到没有索引或任意/无用索引的表或整个数据库时,我总是感到震惊。即使您没有设计数据库,只需要编写一些查询,至少理解它仍然是至关重要的:

  • What's indexed in your database and what's not:
  • 你的数据库中有哪些索引,哪些没有:
  • The difference between types of scans, how they're chosen, and how the way you write a query can influence that choice;
  • 扫描类型之间的差异,它们的选择方式,以及编写查询的方式如何影响选择;
  • The concept of coverage (why you shouldn't just write SELECT *);
  • 覆盖率的概念(为什么不写SELECT *);
  • The difference between a clustered and non-clustered index;
  • 集群索引和非集群索引之间的差异;
  • Why more/bigger indexes are not necessarily better;
  • 为什么更多/更大的索引不一定更好;
  • Why you should try to avoid wrapping filter columns in functions.
  • 为什么应该尽量避免在函数中包装过滤器列。

Designers should also be aware of common index anti-patterns, for example:

设计人员还应该注意常见的索引反模式,例如:

  • The Access anti-pattern (indexing every column, one by one)
  • 访问反模式(对每个列逐一进行索引)
  • The Catch-All anti-pattern (one massive index on all or most columns, apparently created under the mistaken impression that it would speed up every conceivable query involving any of those columns).
  • 包罗万象的反模式(所有或大多数列上的一个大型索引,显然是在错误的印象中创建的,以为它会加快涉及这些列中的任何一个的所有可能查询)。

The quality of a database's indexing - and whether or not you take advantage of it with the queries you write - accounts for by far the most significant chunk of performance. 9 out of 10 questions posted on SO and other forums complaining about poor performance invariably turn out to be due to poor indexing or a non-sargable expression.

数据库索引的质量——以及您是否利用它编写查询——是到目前为止性能最重要的部分。在SO和其他论坛上发布的10个问题中,有9个问题都抱怨表现不佳,但结果却无一例外地是由于糟糕的或非糟糕的索引表达式。

#5


16  

Normalization

It always depresses me to see somebody struggling to write an excessively complicated query that would have been completely straightforward with a normalized design ("Show me total sales per region.").

看到有人在努力编写一个过于复杂的查询,而这完全是一种标准化的设计(“让我看到每个区域的总销售额”),这总是让我感到沮丧。

If you understand this at the outset and design accordingly, you'll save yourself a lot of pain later. It's easy to denormalize for performance after you've normalized; it's not so easy to normalize a database that wasn't designed that way from the start.

如果你一开始就明白了这一点,并据此进行设计,那么以后你就可以避免很多痛苦。在规范化之后,很容易去规范化性能;要规范化一个从一开始就没有这样设计的数据库并不容易。

At the very least, you should know what 3NF is and how to get there. With most transactional databases, this is a very good balance between making queries easy to write and maintaining good performance.

至少,你应该知道3NF是什么,以及如何到达那里。对于大多数事务数据库,这是使查询易于编写和保持良好性能之间的非常好的平衡。

#6


14  

How Indexes Work

It's probably not the most important, but for sure the most underestimated topic.

这可能不是最重要的,但肯定是最被低估的话题。

The problem with indexing is that SQL tutorials usually don't mention them at all and that all the toy examples work without any index.

索引的问题在于,SQL教程通常没有提到它们,而且所有的玩具示例都没有索引。

Even more experienced developers can write fairly good (and complex) SQL without knowing more about indexes than "An index makes the query fast".

甚至更有经验的开发人员也可以编写相当好的(和复杂的)SQL,而不需要比“索引使查询速度更快”更了解索引。

That's because SQL databases do a very good job working as black-box:

这是因为SQL数据库在黑盒方面做得很好:

Tell me what you need (gimme SQL), I'll take care of it.

告诉我你需要什么(给我SQL),我会处理好的。

And that works perfectly to retrieve the correct results. The author of the SQL doesn't need to know what the system is doing behind the scenes--until everything becomes sooo slooooow.....

这能很好地得到正确的结果。SQL的作者不需要知道系统在后台做什么——直到所有的事情都变成sooo slooooow…

That's when indexing becomes a topic. But that's usually very late and somebody (some company?) is already suffering from a real problem.

这时索引就成了一个话题。但那通常已经很晚了,一些人(一些公司?)已经在遭受真正的问题。

That's why I believe indexing is the No. 1 topic not to forget when working with databases. Unfortunately, it is very easy to forget it.

这就是为什么我认为在使用数据库时,索引是最重要的话题。不幸的是,很容易忘记它。

Disclaimer

免责声明

The arguments are borrowed from the preface of my free eBook "Use The Index, Luke". I am spending quite a lot of my time explaining how indexes work and how to use them properly.

这些论点是从我的免费电子书“使用索引,卢克”的序言中借来的。我花了很多时间解释索引是如何工作的,以及如何正确地使用它们。

#7


12  

I just want to point out an observation - that is that it seems that the majority of responses assume database is interchangeable with relational databases. There are also object databases, flat file databases. It is important to asses the needs of the of the software project at hand. From a programmer perspective the database decision can be delayed until later. Data modeling on the other hand can be achieved early on and lead to much success.

我只想指出一个观察结果——似乎大多数响应都假定数据库可以与关系数据库互换。还有对象数据库,平面文件数据库。评估手头软件项目的需求是很重要的。从程序员的角度来看,数据库决策可以延迟到稍后。另一方面,数据建模可以在早期实现,并取得很大的成功。

I think data modeling is a key component and is a relatively old concept yet it is one that has been forgotten by many in the software industry. Data modeling, especially conceptual modeling, can reveal the functional behavior of a system and can be relied on as a road map for development.

我认为数据建模是一个关键的组成部分,是一个相对古老的概念,但是它已经被许多软件行业的人遗忘了。数据建模,特别是概念建模,可以揭示系统的功能行为,可以作为开发的路线图。

On the other hand, the type of database required can be determined based on many different factors to include environment, user volume, and available local hardware such as harddrive space.

另一方面,需要的数据库类型可以根据许多不同的因素确定,包括环境、用户数量和可用的本地硬件,如harddrive空间。

#8


11  

Avoiding SQL injection and how to secure your database

避免SQL注入和如何保护数据库

#9


9  

Every developer should know that this is false: "Profiling a database operation is completely different from profiling code."

每个开发人员都应该知道这是错误的:“分析数据库操作与分析代码完全不同。”

There is a clear Big-O in the traditional sense. When you do an EXPLAIN PLAN (or the equivalent) you're seeing the algorithm. Some algorithms involve nested loops and are O( n ^ 2 ). Other algorithms involve B-tree lookups and are O( n log n ).

在传统意义上有一个明显的大o。当你做一个解释计划(或类似的计划)时,你看到的是算法。有些算法包括嵌套循环和O(n ^ 2)。其他算法包括b树查找和O(n log n)。

This is very, very serious. It's central to understanding why indexes matter. It's central to understanding the speed-normalization-denormalization tradeoffs. It's central to understanding why a data warehouse uses a star-schema which is not normalized for transactional updates.

这是非常非常严重的。这是理解索引为何重要的关键。它对于理解速度-规范化-非规范化权衡至关重要。这对于理解为什么数据仓库使用不为事务更新规范化的星型模式至关重要。

If you're unclear on the algorithm being used do the following. Stop. Explain the Query Execution plan. Adjust indexes accordingly.

如果您对所使用的算法不清楚,请执行以下操作。停止。解释查询执行计划。相应地调整索引。

Also, the corollary: More Indexes are Not Better.

此外,推论是:索引越多并不越好。

Sometimes an index focused on one operation will slow other operations down. Depending on the ratio of the two operations, adding an index may have good effects, no overall impact, or be detrimental to overall performance.

有时专注于一个操作的索引会降低其他操作的速度。根据这两个操作的比率,添加一个索引可能会有良好的效果,不会产生总体影响,或者对整体性能有害。

#10


8  

I think every developer should understand that databases require a different paradigm.

我认为每个开发人员都应该理解,数据库需要不同的范式。

When writing a query to get at your data, a set-based approach is needed. Many people with an interative background struggle with this. And yet, when they embrace it, they can achieve far better results, even though the solution may not be the one that first presented itself in their iterative-focussed minds.

在编写查询以获取数据时,需要一种基于集合的方法。很多人都有这样的背景。然而,当他们接受它时,他们可以获得更好的结果,即使解决方案可能不是第一次出现在他们的迭代集中的头脑中。

#11


8  

Excellent question. Let's see, first no one should consider querying a datbase who does not thoroughly understand joins. That's like driving a car without knowing where the steering wheel and brakes are. You also need to know datatypes and how to choose the best one.

非常好的问题。让我们看看,首先不应该考虑查询不完全理解连接的数据库。这就像开车时不知道方向盘和刹车在哪里。您还需要了解数据类型以及如何选择最好的数据类型。

Another thing that developers should understand is that there are three things you should have in mind when designing a database:

开发人员应该理解的另一件事是,在设计数据库时,您应该考虑以下三点:

  1. Data integrity - if the data can't be relied on you essentially have no data - this means do not put required logic in the application as many other sources may touch the database. Constraints, foreign keys and sometimes triggers are necessary to data integrity. Don't fail to use them because you don't like them or don't want to be bothered to understand them.

    数据完整性——如果数据不能依赖于您实际上没有数据——这意味着不要在应用程序中放置必需的逻辑,因为许多其他数据源可能会触及数据库。约束、外键和有时触发器是数据完整性所必需的。不要因为你不喜欢或不愿费心去理解它们而不使用它们。

  2. Performance - it is very hard to refactor a poorly performing database and performance should be considered from the start. There are many ways to do the same query and some are known to be faster almost always, it is short-sighted not to learn and use these ways. Read some books on performance tuning before designing queries or database structures.

    性能——很难重构性能较差的数据库,从一开始就应该考虑性能。有许多方法可以执行相同的查询,有些几乎总是更快,不学习和使用这些方法是短视的。在设计查询或数据库结构之前,请阅读一些关于性能调优的书籍。

  3. Security - this data is the life-blood of your company, it also frequently contains personal information that can be stolen. Learn to protect your data from SQL injection attacks and fraud and identity theft.

    安全——这些数据是你公司的命脉,它也经常包含可以被窃取的个人信息。学习保护您的数据免受SQL注入攻击、欺诈和身份盗窃。

When querying a database, it is easy to get the wrong answer. Make sure you understand your data model thoroughly. Remember often actual decisions are made based on the data your query returns. When it is wrong, the wrong business decisions are made. You can kill a company from bad queries or loose a big customer. Data has meaning, developers often seem to forget that.

当查询数据库时,很容易得到错误的答案。确保您彻底了解了您的数据模型。记住,实际的决策通常是基于查询返回的数据做出的。当它是错误的,就会做出错误的商业决策。你可能会因为糟糕的查询或失去一个大客户而毁掉一家公司。数据有意义,开发人员似乎常常忘记这一点。

Data almost never goes away, think in terms of storing data over time instead of just how to get it in today. That database that worked fine when it had a hundred thousand records, may not be so nice in ten years. Applications rarely last as long as data. This is one reason why designing for performance is critical.

数据几乎永远不会消失,考虑到随着时间的推移如何存储数据,而不仅仅是如何将数据导入。这个数据库在它有十万张记录的时候运行得很好,十年后可能不会那么好了。应用程序很少能像数据那样持久。这是设计性能至关重要的原因之一。

Your database will probaly need fields that the application doesn't need to see. Things like GUIDs for replication, date inserted fields. etc. You also may need to store history of changes and who made them when and be able to restore bad changes from this storehouse. Think about how you intend to do this before you come ask a web site how to fix the problem where you forgot to put a where clause on an update and updated the whole table.

您的数据库可能需要应用程序不需要看到的字段。诸如用于复制的GUIDs、日期插入字段。等等。您还可能需要存储更改的历史记录,以及更改是谁创建的,并且能够从这个库中恢复糟糕的更改。在询问web站点如何修复问题之前,请考虑您打算如何进行此操作,您忘记在更新和更新整个表时添加where子句。

Never develop in a newer version of a database than the production version. Never, never, never develop directly against a production database.

不要在数据库的更新版本中开发,而要在生产版本中开发。决不,决不,决不,决不直接针对生产数据库开发。

If you don't have a database administrator, make sure someone is making backups and knows how to restore them and has tested restoring them.

如果没有数据库管理员,请确保有人正在进行备份,并知道如何恢复它们,并测试了如何恢复它们。

Database code is code, there is no excuse for not keeping it in source control just like the rest of your code.

数据库代码是代码,没有理由不把它放在源代码控制中,就像其他代码一样。

#12


6  

Evolutionary Database Design. http://martinfowler.com/articles/evodb.html

进化的数据库设计。http://martinfowler.com/articles/evodb.html

These agile methodologies make database change process manageable, predictable and testable.

这些敏捷方法使数据库更改过程可管理、可预测和可测试。

Developers should know, what it takes to refactor a production database in terms of version control, continious integration and automated testing.

开发人员应该知道,在版本控制、持续集成和自动化测试方面重构生产数据库需要什么。

Evolutionary Database Design process has administrative aspects, for example a column is to be dropped after some life time period in all databases of this codebase.

进化数据库设计过程具有管理方面的内容,例如,在此代码基的所有数据库中,一个列将在某个生命周期后被删除。

At least know, that Database Refactoring concept and methodologies exist. http://www.agiledata.org/essays/databaseRefactoringCatalog.html

至少要知道,数据库重构的概念和方法是存在的。http://www.agiledata.org/essays/databaseRefactoringCatalog.html

Classification and process description makes it possible to implement tooling for these refactorings too.

分类和过程描述使实现这些重构工具成为可能。

#13


5  

From my experience with relational databases, every developer should know:

从我使用关系数据库的经验来看,每个开发人员都应该知道:

- The different data types:

-不同的数据类型:

Using the correct type for the correct job will make your DB design more robust, your queries faster and your life easier.

为正确的工作使用正确的类型将使DB设计更加健壮,查询速度更快,生活更轻松。

- Learn about 1xM and MxM:

-了解1xM和MxM:

This is the bread and butter for relational databases. You need to understand one-to-many and many-to-many relations and apply then when appropriate.

这是关系数据库的基础。您需要理解一对多关系和多对多关系,然后在适当的时候应用它们。

- "K.I.S.S." principle applies to the DB as well:

-“K.I.S.S.”原则亦适用于DB:

Simplicity always works best. Provided you have studied how DB work, you will avoid unnecessary complexity which will lead to maintenance and speed problems.

简单总是效果最好。如果您已经研究了DB的工作方式,您将避免不必要的复杂性,这将导致维护和速度问题。

- Indices:

——指数:

It's not enough if you know what they are. You need to understand when to used them and when not to.

如果你知道它们是什么,那是不够的。你需要了解何时使用它们,何时不使用它们。


also:

另外:

  • Boolean algebra is your friend
  • 布尔代数是你的朋友
  • Images: Don't store them on the DB. Don't ask why.
  • 图像:不要将它们存储在数据库中。不要问为什么。
  • Test DELETE with SELECT
  • 测试与选择删除

#14


5  

I would like everyone, both DBAs and developer/designer/architects, to better understand how to properly model a business domain, and how to map/translate that business domain model into both a normalized database logical model, an optimized physical model, and an appropriate object oriented class model, each one of which is (can be) different, for various reasons, and understand when, why, and how they are (or should be) different from one another.

我想每个人,dba和开发人员/设计师/建筑师,为了更好地理解如何正确模型一个业务领域,以及如何映射/业务领域模型转化为一个规范化的数据库逻辑模型,优化物理模型,和一个合适的面向对象的类模型,每一个(可以)不同,由于种种原因,和理解的时候,为什么,以及它们如何(或应该)不同。

#15


5  

I would say strong basic SQL skills. I've seen a lot of developers so far who know a little about databases but are always asking for tips about how to formulate a quite simple query. Queries are not always that easy and simple. You do have to use multiple joins (inner, left, etc.) when querying a well normalized database.

我认为有很强的SQL基本技能。到目前为止,我已经见过许多开发人员,他们对数据库略知一二,但他们总是询问如何构造一个非常简单的查询。查询并不总是那么简单。在查询一个规范化良好的数据库时,必须使用多个连接(内部连接、左连接等)。

#16


5  

About the following comment to Walter M.'s answer:

关于以下对Walter M的评论。的回答:

"Very well written! And the historical perspective is great for people who weren't doing database work at that time (i.e. me)".

“写得很好!历史视角对于那些当时没有做过数据库工作的人(比如我)来说是很好的”。

The historical perspective is in a certain sense absolutely crucial. "Those who forget history, are doomed to repeat it.". Cfr XML repeating the hierarchical mistakes of the past, graph databases repeating the network mistakes of the past, OO systems forcing the hierarchical model upon users while everybody with even just a tenth of a brain should know that the hierarchical model is not suitable for general-purpose representation of the real world, etcetera, etcetera.

历史的观点在某种意义上是绝对重要的。“忘记历史的人,注定要重蹈覆辙。”Cfr XML重复分层过去的错误,图形数据库重复过去的网络错误,OO系统迫使层次模型在用户甚至在每个人只有十分之一的大脑应该知道层次模型不适合的通用表示现实世界中,等等,等等。

As for the question itself:

至于问题本身:

Every database developer should know that "Relational" is not equal to "SQL". Then they would understand why they are being let down so abysmally by the DBMS vendors, and why they should be telling those same vendors to come up with better stuff (e.g. DBMS's that are truly relational) if they want to go on sucking hilarious amounts of money out of their customers for such crappy software).

每个数据库开发人员都应该知道“关系”并不等于“SQL”。然后他们会理解为什么他们被DBMS失望所以深不可测地供应商,和为什么他们应该告诉这些供应商想出更好的东西(例如DBMS的真正关系),如果他们想继续吸吮的钱从他们的客户对于这样糟糕的软件)。

And every database developer should know everything about the relational algebra. Then there would no longer be a single developer left who had to post these stupid "I don't know how to do my job and want someone else to do it for me" questions on Stack Overflow anymore.

而且每个数据库开发人员都应该了解关系代数的所有知识。那么,就不会再有一个开发人员不得不在Stack Overflow上发布这些愚蠢的问题:“我不知道如何完成我的工作,希望别人来为我做这件事”了。

#17


5  

I think a lot of the technical details have been covered here and I don't want to add to them. The one thing I want to say is more social than technical, don't fall for the "DBA knowing the best" trap as an application developer.

我认为这里已经介绍了很多技术细节,我不想对它们进行补充。我想说的一件事是,与其说是技术上的,不如说是社会的,不要让“DBA知道最好的”陷阱作为应用程序开发人员。

If you are having performance issues with query take ownership of the problem too. Do your own research and push for the DBAs to explain what's happening and how their solutions are addressing the problem.

如果您正在使用查询的性能问题,那么也要考虑到问题的所有权。做你自己的研究,推动dba解释发生了什么,以及他们的解决方案如何解决问题。

Come up with your own suggestions too after you have done the research. That is, I try to find a cooperative solution to the problem rather than leaving database issues to the DBAs.

在你做了调查之后,也要提出你自己的建议。也就是说,我试图找到一个合作的解决方案来解决问题,而不是把数据库问题留给dba。

#18


5  

Simple respect.

简单的尊重。

  • It's not just a repository
  • 它不仅仅是一个存储库
  • You probably don't know better than the vendor or the DBAs
  • 您可能比供应商或dba更了解情况
  • You won't support it at 3 a.m. with senior managers shouting at you
  • 你不会支持在凌晨3点,高级经理对你大喊大叫

#19


3  

Consider Denormalization as a possible angel, not the devil, and also consider NoSQL databases as an alternative to relational databases.

考虑非规范化作为一个可能的天使,而不是魔鬼,并且考虑NoSQL数据库作为关系数据库的替代。

Also, I think the Entity-Relation model is a must-know for every developper even if you don't design databases. It'll let you understand thoroughly what's your database all about.

而且,我认为实体关系模型是每个开发人员都必须知道的,即使您不设计数据库。它将使您彻底了解数据库的全部内容。

#20


3  

Never insert data with the wrong text encoding.

不要用错误的文本编码插入数据。

Once your database becomes polluted with multiple encodings, the best you can do is apply some kind combination of heuristics and manual labor.

一旦你的数据库被多重编码污染,你所能做的最好的就是应用一些启发式和体力劳动的组合。

#21


3  

Aside from syntax and conceptual options they employ (such as joins, triggers, and stored procedures), one thing that will be critical for every developer employing a database is this:

除了它们所使用的语法和概念选项(例如连接、触发器和存储过程)之外,对于使用数据库的每个开发人员来说,有一件事是至关重要的:

Know how your engine is going to perform the query you are writing with specificity.

了解您的引擎将如何执行您正在编写的查询。

The reason I think this is so important is simply production stability. You should know how your code performs so you're not stopping all execution in your thread while you wait for a long function to complete, so why would you not want to know how your query will affect the database, your program, and perhaps even the server?

我认为这很重要的原因是生产稳定性。您应该知道代码是如何执行的,这样就不会在等待长函数完成时停止线程中的所有执行,所以为什么您不想知道查询将如何影响数据库、程序甚至服务器呢?

This is actually something that has hit my R&D team more times than missing semicolons or the like. The presumtion is the query will execute quickly because it does on their development system with only a few thousand rows in the tables. Even if the production database is the same size, it is more than likely going to be used a lot more, and thus suffer from other constraints like multiple users accessing it at the same time, or something going wrong with another query elsewhere, thus delaying the result of this query.

这其实是我的研发团队遇到的问题,比漏掉分号之类的问题要多。假设查询将快速执行,因为它在开发系统上执行,表中只有几千行。即使生产数据库是相同的大小,它更有可能会使用更多,因此遭受其他约束等多个用户同时访问它,或另一个查询问题,在其他地方,因此推迟这个查询的结果。

Even simple things like how joins affect performance of a query are invaluable in production. There are many features of many database engines that make things easier conceptually, but may introduce gotchas in performance if not thought of clearly.

即使是连接如何影响查询的性能这样简单的事情在生产中也是无价的。许多数据库引擎有许多特性,它们在概念上使事情变得更容易,但是如果不考虑清楚的话,可能会在性能上引入陷阱。

Know your database engine execution process and plan for it.

了解数据库引擎的执行过程并为之计划。

#22


3  

For a middle-of-the-road professional developer who uses databases a lot (writing/maintaining queries daily or almost daily), I think the expectation should be the same as any other field: You wrote one in college.

对于一个经常使用数据库(每天或几乎每天都编写/维护查询)的中级专业开发人员来说,我认为他们的期望应该和其他任何领域一样:你在大学里就写过一个。

Every C++ geek wrote a string class in college. Every graphics geek wrote a raytracer in college. Every web geek wrote interactive websites (usually before we had "web frameworks") in college. Every hardware nerd (and even software nerds) built a CPU in college. Every physician dissected an entire cadaver in college, even if she's only going to take my blood pressure and tell me my cholesterol is too high today. Why would databases be any different?

在大学里,每一个c++的极客都写过一个字符串类。在大学里,每个图形极客都写过一个射线追踪器。在大学里,每个网络极客都会写交互式网站(通常在我们有“网络框架”之前)。每个硬件书呆子(甚至软件书呆子)在大学里都有一个CPU。在大学里,每个医生都解剖过一具尸体,即使她只会给我量血压,告诉我我的胆固醇太高。为什么数据库会有所不同呢?

Unfortunately, they do seem different, today, for some reason. People want .NET programmers to know how strings work in C, but the internals of your RDBMS shouldn't concern you too much.

不幸的是,由于某种原因,他们今天看起来不一样。人们希望。net程序员知道字符串在C中是如何工作的,但是RDBMS的内部特性不应该太关心您。

It's virtually impossible to get the same level of understanding from just reading about them, or even working your way down from the top. But if you start at the bottom and understand each piece, then it's relatively easy to figure out the specifics for your database. Even things that lots of database geeks can't seem to grok, like when to use a non-relational database.

几乎不可能从阅读他们的文章中获得同样程度的理解,甚至不可能从最上面的文章中得到理解。但是,如果您从底层开始并理解每个部分,那么就比较容易理解数据库的细节。即使是那些大量的数据库极客们也无法理解的事情,比如何时使用非关系数据库。

Maybe that's a bit strict, especially if you didn't study computer science in college. I'll tone it down some: You could write one today, completely, from scratch. I don't care if you know the specifics of how the PostgreSQL query optimizer works, but if you know enough to write one yourself, it probably won't be too different from what they did. And you know, it's really not that hard to write a basic one.

这可能有点严格,尤其是如果你在大学里没有学过计算机科学的话。我把它写下来:你今天可以写一篇,完全从头开始。我不关心您是否了解PostgreSQL查询优化器如何工作的细节,但是如果您足够了解如何编写一个,那么它可能与它们所做的工作没有太大的不同。你知道,写出一个基本的题目并不难。

#23


2  

The order of columns in a non-unique index is important.

非唯一索引中的列的顺序很重要。

The first column should be the column that has the most variability in its content (i.e. cardinality).

第一列应该是内容中变化最大的列(即基数)。

This is to aid SQL Server ability to create useful statistics in how to use the index at runtime.

这有助于SQL Server在运行时如何使用索引时创建有用的统计信息。

#24


2  

Understand the tools that you use to program the database!!!

了解用于编程数据库的工具!!

I wasted so much time trying to understand why my code was mysteriously failing.

我浪费了很多时间试图理解为什么我的代码神秘地失败了。

If you're using .NET, for example, you need to know how to properly use the objects in the System.Data.SqlClient namespace. You need to know how to manage your SqlConnection objects to make sure they are opened, closed, and when necessary, disposed properly.

例如,如果您正在使用。net,您需要知道如何正确地使用System.Data中的对象。SqlClient名称空间。您需要知道如何管理SqlConnection对象,以确保它们被打开、关闭,并在必要时正确地处理。

You need to know that when you use a SqlDataReader, it is necessary to close it separately from your SqlConnection. You need to understand how to keep connections open when appropriate to how to minimize the number of hits to the database (because they are relatively expensive in terms of computing time).

您需要知道,当您使用SqlDataReader时,有必要将其与SqlConnection分开关闭。您需要了解如何在适当的时候保持连接的打开,以及如何最小化对数据库的命中(因为在计算时间方面,这些连接比较昂贵)。

#25


2  

  • Basic SQL skills.
  • 基本的SQL技能。
  • Indexing.
  • 索引。
  • Deal with different incarnations of DATE/ TIME/ TIMESTAMP.
  • 处理不同形式的日期/时间/时间戳。
  • JDBC driver documentation for the platform you are using.
  • 您正在使用的平台的JDBC驱动程序文档。
  • Deal with binary data types (CLOB, BLOB, etc.)
  • 处理二进制数据类型(CLOB、BLOB等)

#26


1  

For some projects, and Object-Oriented model is better.

对于某些项目来说,面向对象的模型更好。

For other projects, a Relational model is better.

对于其他项目,关系模型更好。

#27


1  

The impedance mismatch problem, and know the common deficiencies or ORMs.

阻抗失配问题,并了解常见的不足或缺陷。

#28


1  

RDBMS Compatibility

RDBMS兼容性

Look if it is needed to run the application in more than one RDBMS. If yes, it might be necessary to:

看看是否需要在多个RDBMS中运行应用程序。如果有,可能有必要:

  • avoid RDBMS SQL extensions
  • 避免RDBMS SQL扩展
  • eliminate triggers and store procedures
  • 消除触发器和存储过程
  • follow strict SQL standards
  • 遵循严格的SQL标准
  • convert field data types
  • 字段数据类型转换
  • change transaction isolation levels
  • 改变事务隔离级别

Otherwise, these questions should be treated separately and different versions (or configurations) of the application would be developed.

否则,这些问题应该单独处理,并开发应用程序的不同版本(或配置)。

#29


1  

Don't depend on the order of rows returned by an SQL query.

不要依赖于SQL查询返回的行的顺序。

#30


1  

http://www.reddit.com/r/programming/comments/azdd7/programmers_sit_your_butt_down_i_need_to_have_a/

http://www.reddit.com/r/programming/comments/azdd7/programmers_sit_your_butt_down_i_need_to_have_a/

#1


104  

The very first thing developers should know about databases is this: what are databases for? Not how do they work, nor how do you build one, nor even how do you write code to retrieve or update the data in a database. But what are they for?

开发人员首先应该知道的是:数据库的用途是什么?不是如何工作,也不是如何构建,甚至不是如何编写代码来检索或更新数据库中的数据。但是它们是用来干什么的呢?

Unfortunately, the answer to this one is a moving target. In the heydey of databases, the 1970s through the early 1990s, databases were for the sharing of data. If you were using a database, and you weren't sharing data you were either involved in an academic project or you were wasting resources, including yourself. Setting up a database and taming a DBMS were such monumental tasks that the payback, in terms of data exploited multiple times, had to be huge to match the investment.

不幸的是,这个问题的答案是一个移动的目标。在数据库的heydey中,20世纪70年代到90年代早期,数据库是用来共享数据的。如果你正在使用一个数据库,而你没有共享数据,你要么参与了一个学术项目,要么你在浪费资源,包括你自己。建立数据库和管理数据库管理系统是一项非常重要的任务,因此,在数据被多次利用的情况下,回报必须是巨大的,才能与投资相匹配。

Over the last 15 years, databases have come to be used for storing the persistent data associated with just one application. Building a database for MySQL, or Access, or SQL Server has become so routine that databases have become almost a routine part of an ordinary application. Sometimes, that initial limited mission gets pushed upward by mission creep, as the real value of the data becomes apparent. Unfortunately, databases that were designed with a single purpose in mind often fail dramatically when they begin to be pushed into a role that's enterprise wide and mission critical.

在过去的15年中,数据库被用于存储与一个应用程序关联的持久数据。为MySQL或SQL Server构建数据库已经变得如此常规,以至于数据库几乎成了普通应用程序的常规部分。有时,最初的有限任务被任务蠕变推高,因为数据的真正价值变得明显。不幸的是,设计时只考虑了一个目的的数据库,当它们开始被推到企业范围和任务至关重要的角色时,常常会发生戏剧性的失败。

The second thing developers need to learn about databases is the whole data centric view of the world. The data centric world view is more different from the process centric world view than anything most developers have ever learned. Compared to this gap, the gap between structured programming and object oriented programming is relatively small.

开发人员需要了解的第二件事是数据库的整个数据中心视图。以数据为中心的世界观与以流程为中心的世界观比大多数开发人员都学过的东西更不同。与此差距相比,结构化编程与面向对象编程之间的差距相对较小。

The third thing developers need to learn, at least in an overview, is data modeling, including conceptual data modeling, logical data modeling, and physical data modeling.

开发人员需要学习的第三件事是数据建模,至少在概述中是这样,包括概念数据建模、逻辑数据建模和物理数据建模。

Conceptual data modeling is really requirements analysis from a data centric point of view.

概念数据建模实际上是从数据中心的角度进行需求分析。

Logical data modeling is generally the application of a specific data model to the requirements discovered in conceptual data modeling. The relational model is used far more than any other specific model, and developers need to learn the relational model for sure. Designing a powerful and relevant relational model for a nontrivial requirement is not a trivial task. You can't build good SQL tables if you misunderstand the relational model.

逻辑数据建模通常是特定数据模型对概念数据建模中发现的需求的应用。关系模型比其他任何特定的模型使用得都多,开发人员肯定需要学习关系模型。为一个重要的需求设计一个强大的、相关的关系模型并不是一项简单的任务。如果错误理解关系模型,就无法构建良好的SQL表。

Physical data modeling is generally DBMS specific, and doesn't need to be learned in much detail, unless the developer is also the database builder or the DBA. What developers do need to understand is the extent to which physical database design can be separated from logical database design, and the extent to which producing a high speed database can be accomplished just by tweaking the physical design.

物理数据建模通常是特定于DBMS的,不需要详细了解,除非开发人员也是数据库构建者或DBA。开发人员需要理解的是物理数据库设计可以从逻辑数据库设计中分离出来的程度,以及通过调整物理设计来实现高速数据库的程度。

The next thing developers need to learn is that while speed (performance) is important, other measures of design goodness are even more important, such as the ability to revise and extend the scope of the database down the road, or simplicity of programming.

开发人员需要学习的下一件事是,虽然速度(性能)是重要的,但是其他设计良善的度量方法更重要,比如修改和扩展数据库范围的能力,或者编程的简单性。

Finally, anybody who messes with databases needs to understand that the value of data often outlasts the system that captured it.

最后,任何把数据库搞砸的人都需要明白,数据的价值往往比捕获数据的系统更持久。

Whew!

唷!

#2


72  

Good question. The following are some thoughts in no particular order:

好问题。以下是一些没有特别顺序的想法:

  1. Normalization, to at least the second normal form, is essential.

    标准化,至少是第二种范式,是必不可少的。

  2. Referential integrity is also essential, with proper cascading delete and update considerations.

    引用完整性也很重要,需要适当的级联删除和更新注意事项。

  3. Good and proper use of check constraints. Let the database do as much work as possible.

    良好和正确地使用检查约束。让数据库做尽可能多的工作。

  4. Don't scatter business logic in both the database and middle tier code. Pick one or the other, preferably in middle tier code.

    不要在数据库和中间层代码中分散业务逻辑。选择其中之一,最好是在中间层代码中。

  5. Decide on a consistent approach for primary keys and clustered keys.

    确定主键和集群键的一致方法。

  6. Don't over index. Choose your indexes wisely.

    不要在索引。明智地选择你的索引。

  7. Consistent table and column naming. Pick a standard and stick to it.

    一致的表和列命名。选择一个标准并坚持下去。

  8. Limit the number of columns in the database that will accept null values.

    限制数据库中接受空值的列的数量。

  9. Don't get carried away with triggers. They have their use but can complicate things in a hurry.

    不要被触发器冲昏了头脑。它们有自己的用途,但可能会使事情迅速复杂化。

  10. Be careful with UDFs. They are great but can cause performance problems when you're not aware how often they might get called in a query.

    小心使用udf。它们很好,但是当您不知道它们在查询中被调用的频率时,可能会导致性能问题。

  11. Get Celko's book on database design. The man is arrogant but knows his stuff.

    获取Celko关于数据库设计的书。这个人很傲慢,但他知道自己的事情。

#3


22  

First, developers need to understand that there is something to know about databases. They're not just magic devices where you put in the SQL and get out result sets, but rather very complicated pieces of software with their own logic and quirks.

首先,开发人员需要了解关于数据库的一些信息。它们不仅是输入SQL并输出结果集的神奇设备,而且是非常复杂的具有自己逻辑和怪癖的软件。

Second, that there are different database setups for different purposes. You do not want a developer making historical reports off an on-line transactional database if there's a data warehouse available.

其次,不同的数据库设置有不同的目的。如果有可用的数据仓库,您不希望开发人员从联机事务数据库中生成历史报告。

Third, developers need to understand basic SQL, including joins.

第三,开发人员需要理解基本的SQL,包括连接。

Past this, it depends on how closely the developers are involved. I've worked in jobs where I was developer and de facto DBA, where the DBAs were just down the aisle, and where the DBAs are off in their own area. (I dislike the third.) Assuming the developers are involved in database design:

除此之外,还取决于开发人员的参与程度。我曾在开发人员和实际上的DBA的工作中工作过,在那里DBA刚刚起步,DBA也在他们自己的领域。(我不喜欢第三个。)假设开发人员参与数据库设计:

They need to understand basic normalization, at least the first three normal forms. Anything beyond that, get a DBA. For those with any experience with US courtrooms (and random television shows count here), there's the mnemonic "Depend on the key, the whole key, and nothing but the key, so help you Codd."

他们需要了解基本的标准化,至少前三种正常形式。除此之外,找一个DBA吧。对于那些有美国法庭经验的人(和随机的电视节目在这里算),有一个助记符“取决于关键,整个关键,除了关键,什么都没有,所以帮助你Codd。”

They need to have a clue about indexes, by which I mean they should have some idea what indexes they need and how they're likely to affect performance. This means not having useless indices, but not being afraid to add them to assist queries. Anything further (like the balance) should be left for the DBA.

他们需要知道索引,我的意思是他们应该知道他们需要什么索引,以及它们可能如何影响性能。这意味着没有无用的索引,但是不要害怕添加它们来帮助查询。任何更进一步的东西(如余额)都应该留给DBA。

They need to understand the need for data integrity, and be able to point to where they're verifying the data and what they're doing if they find problems. This doesn't have to be in the database (where it will be difficult to issue a meaningful error message for the user), but has to be somewhere.

他们需要理解数据完整性的需要,并且能够指出他们正在验证数据的位置,以及如果发现问题,他们正在做什么。这并不一定要在数据库中(在数据库中很难为用户发出有意义的错误消息),但必须在某个地方。

They should have the basic knowledge of how to get a plan, and how to read it in general (at least enough to tell whether the algorithms are efficient or not).

他们应该具备如何制定计划以及如何通读计划的基本知识(至少足以判断算法是否有效)。

They should know vaguely what a trigger is, what a view is, and that it's possible to partition pieces of databases. They don't need any sort of details, but they need to know to ask the DBA about these things.

他们应该模糊地知道什么是触发器,什么是视图,并且可以对数据库进行分区。他们不需要任何细节,但他们需要知道如何向DBA询问这些事情。

They should of course know not to meddle with production data, or production code, or anything like that, and they should know that all source code goes into a VCS.

他们当然应该知道不要干预生产数据,或者生产代码,或者类似的东西,并且他们应该知道所有的源代码都属于vc。

I've doubtless forgotten something, but the average developer need not be a DBA, provided there is a real DBA at hand.

我肯定忘记了一些事情,但是一般的开发人员不需要是DBA,只要有一个真正的DBA在手边。

#4


19  

Basic Indexing

I'm always shocked to see a table or an entire database with no indexes, or arbitrary/useless indexes. Even if you're not designing the database and just have to write some queries, it's still vital to understand, at a minimum:

看到没有索引或任意/无用索引的表或整个数据库时,我总是感到震惊。即使您没有设计数据库,只需要编写一些查询,至少理解它仍然是至关重要的:

  • What's indexed in your database and what's not:
  • 你的数据库中有哪些索引,哪些没有:
  • The difference between types of scans, how they're chosen, and how the way you write a query can influence that choice;
  • 扫描类型之间的差异,它们的选择方式,以及编写查询的方式如何影响选择;
  • The concept of coverage (why you shouldn't just write SELECT *);
  • 覆盖率的概念(为什么不写SELECT *);
  • The difference between a clustered and non-clustered index;
  • 集群索引和非集群索引之间的差异;
  • Why more/bigger indexes are not necessarily better;
  • 为什么更多/更大的索引不一定更好;
  • Why you should try to avoid wrapping filter columns in functions.
  • 为什么应该尽量避免在函数中包装过滤器列。

Designers should also be aware of common index anti-patterns, for example:

设计人员还应该注意常见的索引反模式,例如:

  • The Access anti-pattern (indexing every column, one by one)
  • 访问反模式(对每个列逐一进行索引)
  • The Catch-All anti-pattern (one massive index on all or most columns, apparently created under the mistaken impression that it would speed up every conceivable query involving any of those columns).
  • 包罗万象的反模式(所有或大多数列上的一个大型索引,显然是在错误的印象中创建的,以为它会加快涉及这些列中的任何一个的所有可能查询)。

The quality of a database's indexing - and whether or not you take advantage of it with the queries you write - accounts for by far the most significant chunk of performance. 9 out of 10 questions posted on SO and other forums complaining about poor performance invariably turn out to be due to poor indexing or a non-sargable expression.

数据库索引的质量——以及您是否利用它编写查询——是到目前为止性能最重要的部分。在SO和其他论坛上发布的10个问题中,有9个问题都抱怨表现不佳,但结果却无一例外地是由于糟糕的或非糟糕的索引表达式。

#5


16  

Normalization

It always depresses me to see somebody struggling to write an excessively complicated query that would have been completely straightforward with a normalized design ("Show me total sales per region.").

看到有人在努力编写一个过于复杂的查询,而这完全是一种标准化的设计(“让我看到每个区域的总销售额”),这总是让我感到沮丧。

If you understand this at the outset and design accordingly, you'll save yourself a lot of pain later. It's easy to denormalize for performance after you've normalized; it's not so easy to normalize a database that wasn't designed that way from the start.

如果你一开始就明白了这一点,并据此进行设计,那么以后你就可以避免很多痛苦。在规范化之后,很容易去规范化性能;要规范化一个从一开始就没有这样设计的数据库并不容易。

At the very least, you should know what 3NF is and how to get there. With most transactional databases, this is a very good balance between making queries easy to write and maintaining good performance.

至少,你应该知道3NF是什么,以及如何到达那里。对于大多数事务数据库,这是使查询易于编写和保持良好性能之间的非常好的平衡。

#6


14  

How Indexes Work

It's probably not the most important, but for sure the most underestimated topic.

这可能不是最重要的,但肯定是最被低估的话题。

The problem with indexing is that SQL tutorials usually don't mention them at all and that all the toy examples work without any index.

索引的问题在于,SQL教程通常没有提到它们,而且所有的玩具示例都没有索引。

Even more experienced developers can write fairly good (and complex) SQL without knowing more about indexes than "An index makes the query fast".

甚至更有经验的开发人员也可以编写相当好的(和复杂的)SQL,而不需要比“索引使查询速度更快”更了解索引。

That's because SQL databases do a very good job working as black-box:

这是因为SQL数据库在黑盒方面做得很好:

Tell me what you need (gimme SQL), I'll take care of it.

告诉我你需要什么(给我SQL),我会处理好的。

And that works perfectly to retrieve the correct results. The author of the SQL doesn't need to know what the system is doing behind the scenes--until everything becomes sooo slooooow.....

这能很好地得到正确的结果。SQL的作者不需要知道系统在后台做什么——直到所有的事情都变成sooo slooooow…

That's when indexing becomes a topic. But that's usually very late and somebody (some company?) is already suffering from a real problem.

这时索引就成了一个话题。但那通常已经很晚了,一些人(一些公司?)已经在遭受真正的问题。

That's why I believe indexing is the No. 1 topic not to forget when working with databases. Unfortunately, it is very easy to forget it.

这就是为什么我认为在使用数据库时,索引是最重要的话题。不幸的是,很容易忘记它。

Disclaimer

免责声明

The arguments are borrowed from the preface of my free eBook "Use The Index, Luke". I am spending quite a lot of my time explaining how indexes work and how to use them properly.

这些论点是从我的免费电子书“使用索引,卢克”的序言中借来的。我花了很多时间解释索引是如何工作的,以及如何正确地使用它们。

#7


12  

I just want to point out an observation - that is that it seems that the majority of responses assume database is interchangeable with relational databases. There are also object databases, flat file databases. It is important to asses the needs of the of the software project at hand. From a programmer perspective the database decision can be delayed until later. Data modeling on the other hand can be achieved early on and lead to much success.

我只想指出一个观察结果——似乎大多数响应都假定数据库可以与关系数据库互换。还有对象数据库,平面文件数据库。评估手头软件项目的需求是很重要的。从程序员的角度来看,数据库决策可以延迟到稍后。另一方面,数据建模可以在早期实现,并取得很大的成功。

I think data modeling is a key component and is a relatively old concept yet it is one that has been forgotten by many in the software industry. Data modeling, especially conceptual modeling, can reveal the functional behavior of a system and can be relied on as a road map for development.

我认为数据建模是一个关键的组成部分,是一个相对古老的概念,但是它已经被许多软件行业的人遗忘了。数据建模,特别是概念建模,可以揭示系统的功能行为,可以作为开发的路线图。

On the other hand, the type of database required can be determined based on many different factors to include environment, user volume, and available local hardware such as harddrive space.

另一方面,需要的数据库类型可以根据许多不同的因素确定,包括环境、用户数量和可用的本地硬件,如harddrive空间。

#8


11  

Avoiding SQL injection and how to secure your database

避免SQL注入和如何保护数据库

#9


9  

Every developer should know that this is false: "Profiling a database operation is completely different from profiling code."

每个开发人员都应该知道这是错误的:“分析数据库操作与分析代码完全不同。”

There is a clear Big-O in the traditional sense. When you do an EXPLAIN PLAN (or the equivalent) you're seeing the algorithm. Some algorithms involve nested loops and are O( n ^ 2 ). Other algorithms involve B-tree lookups and are O( n log n ).

在传统意义上有一个明显的大o。当你做一个解释计划(或类似的计划)时,你看到的是算法。有些算法包括嵌套循环和O(n ^ 2)。其他算法包括b树查找和O(n log n)。

This is very, very serious. It's central to understanding why indexes matter. It's central to understanding the speed-normalization-denormalization tradeoffs. It's central to understanding why a data warehouse uses a star-schema which is not normalized for transactional updates.

这是非常非常严重的。这是理解索引为何重要的关键。它对于理解速度-规范化-非规范化权衡至关重要。这对于理解为什么数据仓库使用不为事务更新规范化的星型模式至关重要。

If you're unclear on the algorithm being used do the following. Stop. Explain the Query Execution plan. Adjust indexes accordingly.

如果您对所使用的算法不清楚,请执行以下操作。停止。解释查询执行计划。相应地调整索引。

Also, the corollary: More Indexes are Not Better.

此外,推论是:索引越多并不越好。

Sometimes an index focused on one operation will slow other operations down. Depending on the ratio of the two operations, adding an index may have good effects, no overall impact, or be detrimental to overall performance.

有时专注于一个操作的索引会降低其他操作的速度。根据这两个操作的比率,添加一个索引可能会有良好的效果,不会产生总体影响,或者对整体性能有害。

#10


8  

I think every developer should understand that databases require a different paradigm.

我认为每个开发人员都应该理解,数据库需要不同的范式。

When writing a query to get at your data, a set-based approach is needed. Many people with an interative background struggle with this. And yet, when they embrace it, they can achieve far better results, even though the solution may not be the one that first presented itself in their iterative-focussed minds.

在编写查询以获取数据时,需要一种基于集合的方法。很多人都有这样的背景。然而,当他们接受它时,他们可以获得更好的结果,即使解决方案可能不是第一次出现在他们的迭代集中的头脑中。

#11


8  

Excellent question. Let's see, first no one should consider querying a datbase who does not thoroughly understand joins. That's like driving a car without knowing where the steering wheel and brakes are. You also need to know datatypes and how to choose the best one.

非常好的问题。让我们看看,首先不应该考虑查询不完全理解连接的数据库。这就像开车时不知道方向盘和刹车在哪里。您还需要了解数据类型以及如何选择最好的数据类型。

Another thing that developers should understand is that there are three things you should have in mind when designing a database:

开发人员应该理解的另一件事是,在设计数据库时,您应该考虑以下三点:

  1. Data integrity - if the data can't be relied on you essentially have no data - this means do not put required logic in the application as many other sources may touch the database. Constraints, foreign keys and sometimes triggers are necessary to data integrity. Don't fail to use them because you don't like them or don't want to be bothered to understand them.

    数据完整性——如果数据不能依赖于您实际上没有数据——这意味着不要在应用程序中放置必需的逻辑,因为许多其他数据源可能会触及数据库。约束、外键和有时触发器是数据完整性所必需的。不要因为你不喜欢或不愿费心去理解它们而不使用它们。

  2. Performance - it is very hard to refactor a poorly performing database and performance should be considered from the start. There are many ways to do the same query and some are known to be faster almost always, it is short-sighted not to learn and use these ways. Read some books on performance tuning before designing queries or database structures.

    性能——很难重构性能较差的数据库,从一开始就应该考虑性能。有许多方法可以执行相同的查询,有些几乎总是更快,不学习和使用这些方法是短视的。在设计查询或数据库结构之前,请阅读一些关于性能调优的书籍。

  3. Security - this data is the life-blood of your company, it also frequently contains personal information that can be stolen. Learn to protect your data from SQL injection attacks and fraud and identity theft.

    安全——这些数据是你公司的命脉,它也经常包含可以被窃取的个人信息。学习保护您的数据免受SQL注入攻击、欺诈和身份盗窃。

When querying a database, it is easy to get the wrong answer. Make sure you understand your data model thoroughly. Remember often actual decisions are made based on the data your query returns. When it is wrong, the wrong business decisions are made. You can kill a company from bad queries or loose a big customer. Data has meaning, developers often seem to forget that.

当查询数据库时,很容易得到错误的答案。确保您彻底了解了您的数据模型。记住,实际的决策通常是基于查询返回的数据做出的。当它是错误的,就会做出错误的商业决策。你可能会因为糟糕的查询或失去一个大客户而毁掉一家公司。数据有意义,开发人员似乎常常忘记这一点。

Data almost never goes away, think in terms of storing data over time instead of just how to get it in today. That database that worked fine when it had a hundred thousand records, may not be so nice in ten years. Applications rarely last as long as data. This is one reason why designing for performance is critical.

数据几乎永远不会消失,考虑到随着时间的推移如何存储数据,而不仅仅是如何将数据导入。这个数据库在它有十万张记录的时候运行得很好,十年后可能不会那么好了。应用程序很少能像数据那样持久。这是设计性能至关重要的原因之一。

Your database will probaly need fields that the application doesn't need to see. Things like GUIDs for replication, date inserted fields. etc. You also may need to store history of changes and who made them when and be able to restore bad changes from this storehouse. Think about how you intend to do this before you come ask a web site how to fix the problem where you forgot to put a where clause on an update and updated the whole table.

您的数据库可能需要应用程序不需要看到的字段。诸如用于复制的GUIDs、日期插入字段。等等。您还可能需要存储更改的历史记录,以及更改是谁创建的,并且能够从这个库中恢复糟糕的更改。在询问web站点如何修复问题之前,请考虑您打算如何进行此操作,您忘记在更新和更新整个表时添加where子句。

Never develop in a newer version of a database than the production version. Never, never, never develop directly against a production database.

不要在数据库的更新版本中开发,而要在生产版本中开发。决不,决不,决不,决不直接针对生产数据库开发。

If you don't have a database administrator, make sure someone is making backups and knows how to restore them and has tested restoring them.

如果没有数据库管理员,请确保有人正在进行备份,并知道如何恢复它们,并测试了如何恢复它们。

Database code is code, there is no excuse for not keeping it in source control just like the rest of your code.

数据库代码是代码,没有理由不把它放在源代码控制中,就像其他代码一样。

#12


6  

Evolutionary Database Design. http://martinfowler.com/articles/evodb.html

进化的数据库设计。http://martinfowler.com/articles/evodb.html

These agile methodologies make database change process manageable, predictable and testable.

这些敏捷方法使数据库更改过程可管理、可预测和可测试。

Developers should know, what it takes to refactor a production database in terms of version control, continious integration and automated testing.

开发人员应该知道,在版本控制、持续集成和自动化测试方面重构生产数据库需要什么。

Evolutionary Database Design process has administrative aspects, for example a column is to be dropped after some life time period in all databases of this codebase.

进化数据库设计过程具有管理方面的内容,例如,在此代码基的所有数据库中,一个列将在某个生命周期后被删除。

At least know, that Database Refactoring concept and methodologies exist. http://www.agiledata.org/essays/databaseRefactoringCatalog.html

至少要知道,数据库重构的概念和方法是存在的。http://www.agiledata.org/essays/databaseRefactoringCatalog.html

Classification and process description makes it possible to implement tooling for these refactorings too.

分类和过程描述使实现这些重构工具成为可能。

#13


5  

From my experience with relational databases, every developer should know:

从我使用关系数据库的经验来看,每个开发人员都应该知道:

- The different data types:

-不同的数据类型:

Using the correct type for the correct job will make your DB design more robust, your queries faster and your life easier.

为正确的工作使用正确的类型将使DB设计更加健壮,查询速度更快,生活更轻松。

- Learn about 1xM and MxM:

-了解1xM和MxM:

This is the bread and butter for relational databases. You need to understand one-to-many and many-to-many relations and apply then when appropriate.

这是关系数据库的基础。您需要理解一对多关系和多对多关系,然后在适当的时候应用它们。

- "K.I.S.S." principle applies to the DB as well:

-“K.I.S.S.”原则亦适用于DB:

Simplicity always works best. Provided you have studied how DB work, you will avoid unnecessary complexity which will lead to maintenance and speed problems.

简单总是效果最好。如果您已经研究了DB的工作方式,您将避免不必要的复杂性,这将导致维护和速度问题。

- Indices:

——指数:

It's not enough if you know what they are. You need to understand when to used them and when not to.

如果你知道它们是什么,那是不够的。你需要了解何时使用它们,何时不使用它们。


also:

另外:

  • Boolean algebra is your friend
  • 布尔代数是你的朋友
  • Images: Don't store them on the DB. Don't ask why.
  • 图像:不要将它们存储在数据库中。不要问为什么。
  • Test DELETE with SELECT
  • 测试与选择删除

#14


5  

I would like everyone, both DBAs and developer/designer/architects, to better understand how to properly model a business domain, and how to map/translate that business domain model into both a normalized database logical model, an optimized physical model, and an appropriate object oriented class model, each one of which is (can be) different, for various reasons, and understand when, why, and how they are (or should be) different from one another.

我想每个人,dba和开发人员/设计师/建筑师,为了更好地理解如何正确模型一个业务领域,以及如何映射/业务领域模型转化为一个规范化的数据库逻辑模型,优化物理模型,和一个合适的面向对象的类模型,每一个(可以)不同,由于种种原因,和理解的时候,为什么,以及它们如何(或应该)不同。

#15


5  

I would say strong basic SQL skills. I've seen a lot of developers so far who know a little about databases but are always asking for tips about how to formulate a quite simple query. Queries are not always that easy and simple. You do have to use multiple joins (inner, left, etc.) when querying a well normalized database.

我认为有很强的SQL基本技能。到目前为止,我已经见过许多开发人员,他们对数据库略知一二,但他们总是询问如何构造一个非常简单的查询。查询并不总是那么简单。在查询一个规范化良好的数据库时,必须使用多个连接(内部连接、左连接等)。

#16


5  

About the following comment to Walter M.'s answer:

关于以下对Walter M的评论。的回答:

"Very well written! And the historical perspective is great for people who weren't doing database work at that time (i.e. me)".

“写得很好!历史视角对于那些当时没有做过数据库工作的人(比如我)来说是很好的”。

The historical perspective is in a certain sense absolutely crucial. "Those who forget history, are doomed to repeat it.". Cfr XML repeating the hierarchical mistakes of the past, graph databases repeating the network mistakes of the past, OO systems forcing the hierarchical model upon users while everybody with even just a tenth of a brain should know that the hierarchical model is not suitable for general-purpose representation of the real world, etcetera, etcetera.

历史的观点在某种意义上是绝对重要的。“忘记历史的人,注定要重蹈覆辙。”Cfr XML重复分层过去的错误,图形数据库重复过去的网络错误,OO系统迫使层次模型在用户甚至在每个人只有十分之一的大脑应该知道层次模型不适合的通用表示现实世界中,等等,等等。

As for the question itself:

至于问题本身:

Every database developer should know that "Relational" is not equal to "SQL". Then they would understand why they are being let down so abysmally by the DBMS vendors, and why they should be telling those same vendors to come up with better stuff (e.g. DBMS's that are truly relational) if they want to go on sucking hilarious amounts of money out of their customers for such crappy software).

每个数据库开发人员都应该知道“关系”并不等于“SQL”。然后他们会理解为什么他们被DBMS失望所以深不可测地供应商,和为什么他们应该告诉这些供应商想出更好的东西(例如DBMS的真正关系),如果他们想继续吸吮的钱从他们的客户对于这样糟糕的软件)。

And every database developer should know everything about the relational algebra. Then there would no longer be a single developer left who had to post these stupid "I don't know how to do my job and want someone else to do it for me" questions on Stack Overflow anymore.

而且每个数据库开发人员都应该了解关系代数的所有知识。那么,就不会再有一个开发人员不得不在Stack Overflow上发布这些愚蠢的问题:“我不知道如何完成我的工作,希望别人来为我做这件事”了。

#17


5  

I think a lot of the technical details have been covered here and I don't want to add to them. The one thing I want to say is more social than technical, don't fall for the "DBA knowing the best" trap as an application developer.

我认为这里已经介绍了很多技术细节,我不想对它们进行补充。我想说的一件事是,与其说是技术上的,不如说是社会的,不要让“DBA知道最好的”陷阱作为应用程序开发人员。

If you are having performance issues with query take ownership of the problem too. Do your own research and push for the DBAs to explain what's happening and how their solutions are addressing the problem.

如果您正在使用查询的性能问题,那么也要考虑到问题的所有权。做你自己的研究,推动dba解释发生了什么,以及他们的解决方案如何解决问题。

Come up with your own suggestions too after you have done the research. That is, I try to find a cooperative solution to the problem rather than leaving database issues to the DBAs.

在你做了调查之后,也要提出你自己的建议。也就是说,我试图找到一个合作的解决方案来解决问题,而不是把数据库问题留给dba。

#18


5  

Simple respect.

简单的尊重。

  • It's not just a repository
  • 它不仅仅是一个存储库
  • You probably don't know better than the vendor or the DBAs
  • 您可能比供应商或dba更了解情况
  • You won't support it at 3 a.m. with senior managers shouting at you
  • 你不会支持在凌晨3点,高级经理对你大喊大叫

#19


3  

Consider Denormalization as a possible angel, not the devil, and also consider NoSQL databases as an alternative to relational databases.

考虑非规范化作为一个可能的天使,而不是魔鬼,并且考虑NoSQL数据库作为关系数据库的替代。

Also, I think the Entity-Relation model is a must-know for every developper even if you don't design databases. It'll let you understand thoroughly what's your database all about.

而且,我认为实体关系模型是每个开发人员都必须知道的,即使您不设计数据库。它将使您彻底了解数据库的全部内容。

#20


3  

Never insert data with the wrong text encoding.

不要用错误的文本编码插入数据。

Once your database becomes polluted with multiple encodings, the best you can do is apply some kind combination of heuristics and manual labor.

一旦你的数据库被多重编码污染,你所能做的最好的就是应用一些启发式和体力劳动的组合。

#21


3  

Aside from syntax and conceptual options they employ (such as joins, triggers, and stored procedures), one thing that will be critical for every developer employing a database is this:

除了它们所使用的语法和概念选项(例如连接、触发器和存储过程)之外,对于使用数据库的每个开发人员来说,有一件事是至关重要的:

Know how your engine is going to perform the query you are writing with specificity.

了解您的引擎将如何执行您正在编写的查询。

The reason I think this is so important is simply production stability. You should know how your code performs so you're not stopping all execution in your thread while you wait for a long function to complete, so why would you not want to know how your query will affect the database, your program, and perhaps even the server?

我认为这很重要的原因是生产稳定性。您应该知道代码是如何执行的,这样就不会在等待长函数完成时停止线程中的所有执行,所以为什么您不想知道查询将如何影响数据库、程序甚至服务器呢?

This is actually something that has hit my R&D team more times than missing semicolons or the like. The presumtion is the query will execute quickly because it does on their development system with only a few thousand rows in the tables. Even if the production database is the same size, it is more than likely going to be used a lot more, and thus suffer from other constraints like multiple users accessing it at the same time, or something going wrong with another query elsewhere, thus delaying the result of this query.

这其实是我的研发团队遇到的问题,比漏掉分号之类的问题要多。假设查询将快速执行,因为它在开发系统上执行,表中只有几千行。即使生产数据库是相同的大小,它更有可能会使用更多,因此遭受其他约束等多个用户同时访问它,或另一个查询问题,在其他地方,因此推迟这个查询的结果。

Even simple things like how joins affect performance of a query are invaluable in production. There are many features of many database engines that make things easier conceptually, but may introduce gotchas in performance if not thought of clearly.

即使是连接如何影响查询的性能这样简单的事情在生产中也是无价的。许多数据库引擎有许多特性,它们在概念上使事情变得更容易,但是如果不考虑清楚的话,可能会在性能上引入陷阱。

Know your database engine execution process and plan for it.

了解数据库引擎的执行过程并为之计划。

#22


3  

For a middle-of-the-road professional developer who uses databases a lot (writing/maintaining queries daily or almost daily), I think the expectation should be the same as any other field: You wrote one in college.

对于一个经常使用数据库(每天或几乎每天都编写/维护查询)的中级专业开发人员来说,我认为他们的期望应该和其他任何领域一样:你在大学里就写过一个。

Every C++ geek wrote a string class in college. Every graphics geek wrote a raytracer in college. Every web geek wrote interactive websites (usually before we had "web frameworks") in college. Every hardware nerd (and even software nerds) built a CPU in college. Every physician dissected an entire cadaver in college, even if she's only going to take my blood pressure and tell me my cholesterol is too high today. Why would databases be any different?

在大学里,每一个c++的极客都写过一个字符串类。在大学里,每个图形极客都写过一个射线追踪器。在大学里,每个网络极客都会写交互式网站(通常在我们有“网络框架”之前)。每个硬件书呆子(甚至软件书呆子)在大学里都有一个CPU。在大学里,每个医生都解剖过一具尸体,即使她只会给我量血压,告诉我我的胆固醇太高。为什么数据库会有所不同呢?

Unfortunately, they do seem different, today, for some reason. People want .NET programmers to know how strings work in C, but the internals of your RDBMS shouldn't concern you too much.

不幸的是,由于某种原因,他们今天看起来不一样。人们希望。net程序员知道字符串在C中是如何工作的,但是RDBMS的内部特性不应该太关心您。

It's virtually impossible to get the same level of understanding from just reading about them, or even working your way down from the top. But if you start at the bottom and understand each piece, then it's relatively easy to figure out the specifics for your database. Even things that lots of database geeks can't seem to grok, like when to use a non-relational database.

几乎不可能从阅读他们的文章中获得同样程度的理解,甚至不可能从最上面的文章中得到理解。但是,如果您从底层开始并理解每个部分,那么就比较容易理解数据库的细节。即使是那些大量的数据库极客们也无法理解的事情,比如何时使用非关系数据库。

Maybe that's a bit strict, especially if you didn't study computer science in college. I'll tone it down some: You could write one today, completely, from scratch. I don't care if you know the specifics of how the PostgreSQL query optimizer works, but if you know enough to write one yourself, it probably won't be too different from what they did. And you know, it's really not that hard to write a basic one.

这可能有点严格,尤其是如果你在大学里没有学过计算机科学的话。我把它写下来:你今天可以写一篇,完全从头开始。我不关心您是否了解PostgreSQL查询优化器如何工作的细节,但是如果您足够了解如何编写一个,那么它可能与它们所做的工作没有太大的不同。你知道,写出一个基本的题目并不难。

#23


2  

The order of columns in a non-unique index is important.

非唯一索引中的列的顺序很重要。

The first column should be the column that has the most variability in its content (i.e. cardinality).

第一列应该是内容中变化最大的列(即基数)。

This is to aid SQL Server ability to create useful statistics in how to use the index at runtime.

这有助于SQL Server在运行时如何使用索引时创建有用的统计信息。

#24


2  

Understand the tools that you use to program the database!!!

了解用于编程数据库的工具!!

I wasted so much time trying to understand why my code was mysteriously failing.

我浪费了很多时间试图理解为什么我的代码神秘地失败了。

If you're using .NET, for example, you need to know how to properly use the objects in the System.Data.SqlClient namespace. You need to know how to manage your SqlConnection objects to make sure they are opened, closed, and when necessary, disposed properly.

例如,如果您正在使用。net,您需要知道如何正确地使用System.Data中的对象。SqlClient名称空间。您需要知道如何管理SqlConnection对象,以确保它们被打开、关闭,并在必要时正确地处理。

You need to know that when you use a SqlDataReader, it is necessary to close it separately from your SqlConnection. You need to understand how to keep connections open when appropriate to how to minimize the number of hits to the database (because they are relatively expensive in terms of computing time).

您需要知道,当您使用SqlDataReader时,有必要将其与SqlConnection分开关闭。您需要了解如何在适当的时候保持连接的打开,以及如何最小化对数据库的命中(因为在计算时间方面,这些连接比较昂贵)。

#25


2  

  • Basic SQL skills.
  • 基本的SQL技能。
  • Indexing.
  • 索引。
  • Deal with different incarnations of DATE/ TIME/ TIMESTAMP.
  • 处理不同形式的日期/时间/时间戳。
  • JDBC driver documentation for the platform you are using.
  • 您正在使用的平台的JDBC驱动程序文档。
  • Deal with binary data types (CLOB, BLOB, etc.)
  • 处理二进制数据类型(CLOB、BLOB等)

#26


1  

For some projects, and Object-Oriented model is better.

对于某些项目来说,面向对象的模型更好。

For other projects, a Relational model is better.

对于其他项目,关系模型更好。

#27


1  

The impedance mismatch problem, and know the common deficiencies or ORMs.

阻抗失配问题,并了解常见的不足或缺陷。

#28


1  

RDBMS Compatibility

RDBMS兼容性

Look if it is needed to run the application in more than one RDBMS. If yes, it might be necessary to:

看看是否需要在多个RDBMS中运行应用程序。如果有,可能有必要:

  • avoid RDBMS SQL extensions
  • 避免RDBMS SQL扩展
  • eliminate triggers and store procedures
  • 消除触发器和存储过程
  • follow strict SQL standards
  • 遵循严格的SQL标准
  • convert field data types
  • 字段数据类型转换
  • change transaction isolation levels
  • 改变事务隔离级别

Otherwise, these questions should be treated separately and different versions (or configurations) of the application would be developed.

否则,这些问题应该单独处理,并开发应用程序的不同版本(或配置)。

#29


1  

Don't depend on the order of rows returned by an SQL query.

不要依赖于SQL查询返回的行的顺序。

#30


1  

http://www.reddit.com/r/programming/comments/azdd7/programmers_sit_your_butt_down_i_need_to_have_a/

http://www.reddit.com/r/programming/comments/azdd7/programmers_sit_your_butt_down_i_need_to_have_a/