What do I need to know about databases?

In general, I think I do alright when it comes to coding in programming languages, but I think I'm missing something huge when it comes to databases.

I see job ads requesting knowledge of MySQL, MSSQL, Oracle, etc. but I'm at a loss to determine what the differences would be.

You see, like so many new programmers, I tend to treat my databases as a dumping ground for data. Most of what I do comes down to relatively simple SQL (INSERT this, SELECT that, DELETE this_other_thing), which is mostly independent of the engine I'm using (with minor exceptions, of course, mostly minor tweaks for syntax).

Could someone explain some common use cases for databases where the specific platform comes into play?

I'm sure things like stored procedures are a big one, but (a) these are mostly written in a specific language (T-SQL, etc) which would be a different job ad requirement than the specific RDBMS itself, and (b) I've heard from various sources that stored procedures are on their way out and that in a lot of cases they shouldn't be used now anyway. I believe Jeff Atwood is a member of this camp.

Thanks.

The above concepts do not vary much for MySQL, SQL Server, Oracle, etc.

With this question, I'm mostly trying to determine the important differences between these. That is, why would a job ad demand n years' experience with MySQL when most common use cases are relatively stable across RDBMS platforms?

CRUD statements, joins, indexes... all of these are relatively straightforward within the confines of a certain engine. The concepts are easily transferable if you know a different RDBMS.

What I'm looking for are the specifics that would cause an employer to name a particular engine rather than "experience using common database engines."

9 answers

#1

I believe that the essential knowledge about databases should be:

What databases are for
Basic CRUD operations
SELECT queries with JOINs
Normalization
Basic indexing
Referential integrity with foreign key constraints
Basic check constraints

The above concepts do not vary much between MySQL, SQL Server, Oracle, Postgres, and other relational database systems. However, you'd find a different set of concepts for the now-popular NoSQL databases, such as CouchDB, MongoDB, SimpleDB, Cassandra, Bigtable, and many others.
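
To make the last two items of the list concrete, here is a minimal sketch of a foreign key and a check constraint rejecting bad rows. It assumes Python's built-in sqlite3 module as a stand-in engine; the customer and invoice tables are made up for illustration, and other engines spell some details differently.

```python
import sqlite3


def demo_constraints():
    """Return the labels of the inserts the database rejected, plus the row count."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customer (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE invoice (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(id),
            amount      NUMERIC NOT NULL CHECK (amount > 0)
        );
        INSERT INTO customer (id, name) VALUES (1, 'Ada');
        INSERT INTO invoice (customer_id, amount) VALUES (1, 25);
    """)
    attempts = {
        "unknown customer": (99, 10),  # violates the foreign key
        "negative amount": (1, -5),    # violates the CHECK constraint
    }
    rejected = []
    for label, row in attempts.items():
        try:
            conn.execute(
                "INSERT INTO invoice (customer_id, amount) VALUES (?, ?)", row
            )
        except sqlite3.IntegrityError:
            rejected.append(label)
    count = conn.execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
    conn.close()
    return rejected, count
```

Both bad rows are refused by the database itself, so only the one valid invoice remains, no matter which application tried to write the data.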

#2

After the CRUD statements, I think JOINs are among the most important things to understand if you want to be an effective DB programmer. Understand the difference between LEFT and RIGHT, OUTER and INNER joins, and know when to use each. Most importantly, know what the database is actually constructing when it performs a JOIN.
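
A small sketch of what the database constructs for an INNER versus a LEFT join, assuming Python's built-in sqlite3 module and two made-up tables (author, book). A RIGHT join is the same as a LEFT join with the two tables swapped, so it is not run here.

```python
import sqlite3


def join_demo():
    """Return the rows produced by an INNER JOIN and a LEFT JOIN on the same data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
        INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO book VALUES (10, 'Book A', 1), (11, 'Book B', 2), (12, 'Book C', 2);
    """)
    # INNER JOIN: only author/book pairs that actually match.
    inner = conn.execute("""
        SELECT a.name, b.title
        FROM author a
        INNER JOIN book b ON b.author_id = a.id
        ORDER BY a.id, b.id
    """).fetchall()
    # LEFT JOIN: every author, with NULL (None) where no book matches.
    left = conn.execute("""
        SELECT a.name, b.title
        FROM author a
        LEFT JOIN book b ON b.author_id = a.id
        ORDER BY a.id, b.id
    """).fetchall()
    conn.close()
    return inner, left
```

The author with no books (Cy) disappears from the inner join but shows up in the left join with a NULL title, which is usually the first surprise people hit.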

For me, the Wikipedia article was very helpful.

Also, indexing is very important: it is how relational databases can perform fast queries. Understand how to use indexes and what happens under the hood.
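
One way to peek under the hood is to ask the engine for its query plan before and after adding an index. This is a rough sketch assuming Python's built-in sqlite3 module; the users table and the idx_users_email name are made up, and the exact plan text differs between SQLite versions and between engines.

```python
import sqlite3


def plan_for(conn, sql, params=()):
    """Return the engine's plan for a query as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable description.
    return " | ".join(row[-1] for row in rows)


def index_demo():
    """Return the plan for the same lookup before and after creating an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
    conn.executemany(
        "INSERT INTO users (email, name) VALUES (?, ?)",
        [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
    )
    query = "SELECT name FROM users WHERE email = ?"
    args = ("user500@example.com",)
    before = plan_for(conn, query, args)  # full table scan
    conn.execute("CREATE INDEX idx_users_email ON users (email)")
    after = plan_for(conn, query, args)   # lookup through the index
    conn.close()
    return before, after
```

Without the index the plan is a scan of the whole table; with it, the plan names idx_users_email. Other engines expose the same idea through their own EXPLAIN variants.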

Wikipedia article on DB indexing.

You should also know how to construct a many-to-one relationship (using foreign keys) and a many-to-many relationship (using join tables, also called junction tables).
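
A minimal sketch of a many-to-many relationship built with a join table, assuming Python's built-in sqlite3 module; the student, course and enrollment names are made up for illustration.

```python
import sqlite3


def enrollment_demo():
    """Return (course title, student name) pairs resolved through a join table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        -- Join table: one row per (student, course) pair.
        CREATE TABLE enrollment (
            student_id INTEGER NOT NULL REFERENCES student(id),
            course_id  INTEGER NOT NULL REFERENCES course(id),
            PRIMARY KEY (student_id, course_id)
        );
        INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO course  VALUES (1, 'Databases'), (2, 'Compilers');
        INSERT INTO enrollment VALUES (1, 1), (1, 2), (2, 1);
    """)
    rows = conn.execute("""
        SELECT c.title, s.name
        FROM enrollment e
        JOIN student s ON s.id = e.student_id
        JOIN course  c ON c.id = e.course_id
        ORDER BY c.title, s.name
    """).fetchall()
    conn.close()
    return rows
```

Each foreign key in enrollment is an ordinary many-to-one link; two of them side by side give the many-to-many relationship, and the composite primary key stops the same pair from being stored twice.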

I know that in your question you're asking about specific DB implementations, but if you're to be taken literally and you only know about SELECT, INSERT, UPDATE, and DELETE, then the above concepts will be far more valuable than learning the intricacies of a particular implementation.

#3

It's not just stored procs and functions. Each database has fundamental differences and quirks that are important to understand even though SQL works more or less the same.

Examples:

Oracle and MySQL handle locking differently, in different situations.
Oracle traditionally had no auto-incrementing primary keys like MySQL and SQL Server; you used a sequence instead (identity columns only arrived in Oracle 12c).
Subtle vendor-specific behavior, like the way Oracle sorts VARCHARs differently depending on locale.
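
To illustrate the auto-increment point, here is a rough side-by-side of how a generated key is declared on a few engines. The vendor statements are kept as plain strings for comparison and are not executed; only the SQLite variant runs, through Python's built-in sqlite3 module. The item table is made up for illustration.

```python
import sqlite3

# DDL per engine, as text only. The Oracle sequence variant needs the
# INSERT to supply item_seq.NEXTVAL for the id column.
AUTO_KEY_DDL = {
    "mysql": "CREATE TABLE item (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(50))",
    "sqlserver": "CREATE TABLE item (id INT IDENTITY(1,1) PRIMARY KEY, name VARCHAR(50))",
    "oracle_sequence": "CREATE SEQUENCE item_seq",
    "oracle_12c": (
        "CREATE TABLE item ("
        "id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name VARCHAR2(50))"
    ),
    "sqlite": "CREATE TABLE item (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)",
}


def sqlite_auto_ids(names):
    """Insert the given names and return the keys SQLite generated for them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(AUTO_KEY_DDL["sqlite"])
    ids = []
    for name in names:
        cur = conn.execute("INSERT INTO item (name) VALUES (?)", (name,))
        ids.append(cur.lastrowid)  # key generated for the row just inserted
    conn.close()
    return ids
```

Even reading the generated key back is engine-specific: lastrowid here, and a different mechanism on each of the other platforms.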

If you really want to improve your applications, you eventually have to become familiar with the details about how your specific database works. Most of the time it doesn't make a lot of difference, but when it does matter, it usually makes a big difference, especially when it comes to performance.

#4

Some things which seem to come up when talking with my Database-keen colleagues:

Row vs. page vs. table lock escalation when doing multiple complex joins, which sometimes means doing very different things on different vendors' databases. This is where the theory really hits the tarmac, and it is often non-intuitive.
Differences in how cursors are best used on different vendors' DB implementations.
Odd stuff in the stored proc language variants, like how best to handle failure cases.
Differences in how temporary tables and views are best used, depending on the underlying implementation.

None of these things really matter until you are trying to solve something that has to do at least one of the following:

Run very fast
Contain lots and lots of data
Get very big and complex (i.e. multiple queries hitting the same tables simultaneously)

These are the kinds of things that DBAs should be helping with, so it depends on whether you are aiming to be a DBA or a programmer. None of the above has really hurt me yet, because I've not worked on DB-intensive systems, but I've worked near a few, and the programmers on those end up knowing a lot about the internals, restrictions, and good features of the specific database they are using.

The best way to get knowledge like that (other than on the job) is to read the manuals or hang out with people who already know and ask them about it.

#5

Don't forget relational schemas, primary and foreign keys, and how they are related. To start with databases, I would use MySQL and MSSQL, as these are the most common in the market. I consider Oracle a more advanced and complex DB.

#6

As for the level of difference between vendors: SQL is a standard (http://en.wikipedia.org/wiki/SQL#Standardization), and vendors implement that standard differently.

Each of these vendors tries to offer extras to get the crowd on its side... that's why you see functions available in one and not the other. But sometimes such a function makes its way into the standard, so it's not always a bad thing.

As for stored procs, I would agree: ORMs and today's practices tend toward a greater separation of concerns, removing business logic from the database and treating it as "only" a repository.

My 2 cents

#7

I see job ads requesting knowledge of MySQL, MSSQL, Oracle, etc. but I'm at a loss to determine what the differences would be.

I'm what's called a SQL Developer. You won't see the differences much when you are doing run-of-the-mill database work (CRUD). However, the differences become quite apparent when you are dealing with the database's own brand of SQL.

When talking SQL outside of the standards, there are 4 distinctive types of commands. These are:

Data Manipulation Language (DML)
Data Definition Language (DDL)
Data Control Language (DCL)
Transactional Control Language (TCL)

The biggest differences come in the last two, DCL and TCL. Those have a LOT of database-specific, non-standard SQL commands. The first two, DML and DDL, are very similar across any database that uses the relational model.
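
For anyone who hasn't met these acronyms, here is a small sketch with one statement from each family. It assumes Python's built-in sqlite3 module and a made-up account table. SQLite has no users or GRANT statement, so the DCL example is kept as a plain string and not executed.

```python
import sqlite3

# DCL, shown as text only: it would run on engines that have users and privileges.
DCL_EXAMPLE = "GRANT SELECT ON account TO report_user"


def command_families_demo():
    """Run DDL, DML and TCL statements and return the final balances."""
    # isolation_level=None means autocommit, so the explicit BEGIN / COMMIT /
    # ROLLBACK statements below are the only transaction control in play.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # DDL: define the structure.
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    # DML: change the data.
    conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
    # TCL: group two updates so they succeed or fail together.
    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
    conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")
    conn.execute("COMMIT")
    # TCL again: this change is thrown away by the ROLLBACK.
    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = 0")
    conn.execute("ROLLBACK")
    balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
    conn.close()
    return balances
```

The committed transfer sticks and the rolled-back update leaves no trace. Savepoints, isolation levels and privilege models are where the engines start to diverge.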

Also, the bigger database vendors have given their SQL implementations their own names. Here's a short sample:

SQL Server : T-SQL
Oracle : PL/SQL
PostgreSQL : PL/pgSQL
Firebird : PSQL
MySQL : no widely used nickname (mSQL is a separate product, Mini SQL)

The list goes on, but you get the point. Wikipedia has good articles on the different command acronyms.

I have found that most employers won't be able to articulate this, because most use non-technical managers and/or HR to do the hiring. They are basically told by the tech managers that the new hires need to know X technology. On top of that, the majority are too lazy to hire for intelligence, so they fall back on the "We have X, so darn it, we need to hire somebody that knows X!" meme. The differences are actually not that hard to learn for the people who frequent this site. I'm confident that anybody here can learn these fairly fast.

#8

Even something as simple as an auto-incrementing primary key can be very different in Oracle, MySQL, and SQL Server.

Some other important differences:

SQL Server makes a distinction between clustering key and primary key; other databases do not. This choice comes with major performance implications.

SQL Server allows the UPDATE ... SET @Total = Total = @Total + Amount syntax for fast computation of things like running totals. MySQL lets you use a user variable in a similar way (I think). In other databases you'd probably have to use a correlated subquery. Huge difference in performance.

SQL Server can generate "sequential GUIDs" with NEWSEQUENTIALID(). I'm not sure which other databases have this feature, but as with the above two points, there are significant performance implications to using a traditional GUID as opposed to a sequential or comb GUID.

Oracle's CONNECT BY is a very useful and pretty unique syntax. Recursive Common Table Expressions in SQL Server and MySQL are similar but not exactly the same.

Support for ranking/ordering functions varies vastly across different databases. I'm constantly posting answers here invoking ROW_NUMBER. A lot of queries are much harder to write without it, but at the same time, abusing it can hurt performance.

XML support is all over the map. Most databases have reasonably good support for it now, but both syntax and semantics are completely different on every platform.

Date/time handling can be quite different. Oracle has several different date/time-related types, some including time zone information. In general, Oracle is way better than other databases at managing temporal data, and has several features that you will miss if you switch. Until SQL Server 2008, Microsoft didn't have the date and time types, just datetime, which was much harder to normalize.

Spatial types are different and/or nonexistent in different databases. MySQL exposes an entire OpenGIS model; Microsoft's support is a bit more basic but still competent. Oracle has it, but it's a little hard to find information on, and it's some sort of optional add-on. I think DB2 is starting to get it, but support is still a little spotty.

MySQL actually lets you choose how to store an index (e.g. btree or hash). This is also an important performance consideration.

SQL Server allows you to INCLUDE columns in an index, which is very important for performance.

Oracle allows you to create function-based indexes, bitmap indexes, and so on. These can be pretty difficult to wrap your head around.

Oracle can perform "skip seeks" (index skip scans) in very specific situations, something that I don't believe is supported in other databases (yet). This might factor into how you order index columns.

SQL Server has CLR types/functions/aggregates. Obviously not supported in any other database product.

Trigger support varies significantly. SQL Server has AFTER and INSTEAD OF. MySQL has BEFORE and AFTER. Oracle has all of those and more. These all behave quite differently.
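
For comparison with the vendor-specific tricks above, here is a sketch of the more portable spellings of two of them: a running total and ROW_NUMBER via window functions, and a hierarchy walk via a recursive CTE (roughly what CONNECT BY is used for). It assumes Python's built-in sqlite3 module backed by SQLite 3.25 or later; the payment and employee tables are made up. Whether these forms perform as well as the native tricks on a given engine is a separate question.

```python
import sqlite3


def window_demo():
    """Per-customer row numbers and an overall running total."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE payment (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);
        INSERT INTO payment VALUES
            (1, 'ann', 10), (2, 'bob', 5), (3, 'ann', 20), (4, 'bob', 15);
    """)
    rows = conn.execute("""
        SELECT id,
               customer,
               ROW_NUMBER() OVER (PARTITION BY customer ORDER BY id) AS nth_payment,
               SUM(amount)  OVER (ORDER BY id) AS running_total
        FROM payment
        ORDER BY id
    """).fetchall()
    conn.close()
    return rows


def hierarchy_demo():
    """Walk a manager/employee tree from the root with a recursive CTE."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
        INSERT INTO employee VALUES
            (1, 'root', NULL), (2, 'lead', 1), (3, 'dev', 2), (4, 'ops', 1);
    """)
    rows = conn.execute("""
        WITH RECURSIVE chain(id, name, depth) AS (
            SELECT id, name, 0 FROM employee WHERE manager_id IS NULL
            UNION ALL
            SELECT e.id, e.name, c.depth + 1
            FROM employee e
            JOIN chain c ON e.manager_id = c.id
        )
        SELECT name, depth FROM chain ORDER BY depth, id
    """).fetchall()
    conn.close()
    return rows
```

The same queries need small changes, or behave differently under load, from one engine to the next, which is the kind of knowledge a platform-specific job ad is asking for.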

I'm sure that there are many, many more differences, but that should give you at least a basic idea of why 5 years of experience with Oracle is completely different from 5 years of experience with SQL Server.

#9

That databases are encoded collections of assertions of fact.
That the logical structure of the tables corresponds to the syntactical structure of those "assertions of fact".
That normalization theory helps you find the optimal logical structure of the database by minimizing redundancy, i.e. minimizing the possibility for contradictions in said assertions of fact to occur.
That database constraints are really nothing other than business rules, expressed in a formal way and in terms of the components of the database.
That really every and any business rule can be expressed as a database constraint.
That it is therefore possible for the DBMS to enforce any and every business rule you can imagine.
That there is a very important difference between logical design and physical design.
That SQL and SQL systems are, erm, not really helpful (and that's putting it mildly) in supporting developers to recognise this important distinction.
That SQL and SQL systems are, erm, significantly deficient (and that's putting it mildly) in their support for database constraints.
That these latter two examples are a very good illustration of the importance of the difference between a model (Codd's RM) and its implementation (some particular SQL system). As far as relational database technology is concerned, the latter deviate ever more preposterously from the former.
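
As a rough illustration of a business rule living in the database, here is a sketch assuming Python's built-in sqlite3 module and made-up room and booking tables. The single-row rule fits in a CHECK constraint. The cross-row rule ("no more bookings than the room's capacity") has no declarative spelling in most SQL products, so a trigger stands in for it here, which rather supports the point about weak constraint support.

```python
import sqlite3


def business_rules_demo():
    """Try three bookings for a two-person room and report each outcome."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE room (
            id       INTEGER PRIMARY KEY,
            capacity INTEGER NOT NULL CHECK (capacity > 0)  -- single-row rule
        );
        CREATE TABLE booking (
            id      INTEGER PRIMARY KEY,
            room_id INTEGER NOT NULL REFERENCES room(id),
            guest   TEXT NOT NULL
        );
        -- Cross-row rule: a room may not hold more bookings than its capacity.
        CREATE TRIGGER booking_capacity BEFORE INSERT ON booking
        BEGIN
            SELECT RAISE(ABORT, 'room is full')
            WHERE (SELECT COUNT(*) FROM booking WHERE room_id = NEW.room_id)
                  >= (SELECT capacity FROM room WHERE id = NEW.room_id);
        END;
        INSERT INTO room VALUES (1, 2);
    """)
    outcomes = []
    for guest in ("ann", "bob", "cy"):
        try:
            conn.execute(
                "INSERT INTO booking (room_id, guest) VALUES (1, ?)", (guest,)
            )
            outcomes.append((guest, "accepted"))
        except sqlite3.DatabaseError as exc:
            outcomes.append((guest, "rejected: " + str(exc)))
    conn.close()
    return outcomes
```

The third booking is refused by the database itself, whichever application issued it. Trigger syntax and semantics are again engine-specific, so treat this as an SQLite-flavoured sketch only.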

And whatever else I forgot to remember.
