Historically, what made relational databases popular?

Time: 2022-09-03 16:57:35

EDIT: I've just started skimming Codd's famous 1970 paper, the one that started it all and that Oracle was based on (A Relational Model of Data for Large Shared Data Banks [pdf]), and was amazed to find that it seems to answer this SO question. It discusses the databases on the market at that time ("hierarchical" and "network" - like NoSQL?) and the need for independence from internal representation, and it gives a clear explanation of how to apply mathematical "relations" to a database.


Historically, which feature of relational databases delivered which benefit, so that businesses adopted them and made them massively successful?

Today, there are many reasons to use an RDB: it's standard, the products are mature, debugged and full-featured, there's a choice of vendors, there's support, there's a trained workforce and so on. But why did it become so popular in the first place?

I've heard "hierarchical databases" were popular before relational databases - they sound like a key-value store, where the value can be another set of key-values. If so, that is similar to the object oriented databases that were publicized a decade or two ago; and also to XML/document databases and NoSQL.

Maybe ACID transactions (atomicity etc)? But that doesn't seem specific to RDB.

Maybe because relational databases enabled you to define a data schema that was purely about the data - independent of a particular programming language, of the version of an application (evolution), and of the purpose of the application (which makes an "impedance mismatch" inevitable)? But any database with a data schema has this feature.

Maybe because the relational model is mathematically sound? But this doesn't sound like it would convince managers to adopt it - what would the business benefit be?

Maybe because the mathematical model gives you a way to rearrange the database into different normal forms to give different performance characteristics, which are mathematically guaranteed to not change the meaning of the data? This seems plausible, and my uni textbooks make a big deal of it, but it doesn't sound very compelling to me as a practical business benefit (maybe I'm missing something)?
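
To make the "doesn't change the meaning" claim concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are made up for illustration, and the sketch assumes every customer has exactly one city; that dependency is what lets the split tables be joined back without loss.

```python
import sqlite3

# Hypothetical data: a wide table that repeats each customer's city per order.
WIDE_ROWS = [
    (1, "alice", "Paris", "book"),
    (2, "alice", "Paris", "pen"),
    (3, "bob", "Oslo", "lamp"),
]


def decompose_and_rejoin(rows):
    """Split the wide table into two narrower ones, then join them back."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE wide (order_id INTEGER, customer TEXT, city TEXT, item TEXT)"
    )
    con.executemany("INSERT INTO wide VALUES (?, ?, ?, ?)", rows)

    # The city is now stored once per customer instead of once per order.
    con.execute("CREATE TABLE customers AS SELECT DISTINCT customer, city FROM wide")
    con.execute("CREATE TABLE orders AS SELECT order_id, customer, item FROM wide")

    rejoined = con.execute(
        "SELECT o.order_id, o.customer, c.city, o.item "
        "FROM orders AS o JOIN customers AS c ON c.customer = o.customer "
        "ORDER BY o.order_id"
    ).fetchall()
    customer_count = con.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    con.close()
    return rejoined, customer_count


rejoined_rows, customer_count = decompose_and_rejoin(WIDE_ROWS)
print(rejoined_rows == WIDE_ROWS)  # the join reproduces the original rows
print(customer_count)              # one row per distinct customer
```

The two layouts store the data differently (and would perform differently for updates and reads), yet the join gives back exactly the original rows.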

To summarise: historically, what made the relational model win so decisively over the hierarchical model? I'm also interested in whether RDB still have some special quality that actively makes them a better practical choice for businesses (other than the benefits of being a standard mentioned above).

Many thanks if you can shed some light - I've long been curious about this.

5 solutions

#1


6  

For the same reason that scripting languages are popular.

You can write a query in your favorite text editor and just issue it, without bothering about the actual physical schema.
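
As a rough illustration of "without bothering about the physical schema", the sketch below uses Python's built-in sqlite3 module and a made-up employees table. It runs the same query text before and after a purely physical change (adding an index) and compares the results.

```python
import sqlite3

# The query text never mentions indexes or storage layout.
QUERY = "SELECT name FROM employees WHERE dept = ? ORDER BY name"


def run_before_and_after_index():
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
    )
    con.executemany(
        "INSERT INTO employees (name, dept) VALUES (?, ?)",
        [("carol", "sales"), ("alice", "sales"), ("bob", "ops")],
    )
    before = con.execute(QUERY, ("sales",)).fetchall()
    # Physical change only: add an index. The query is not touched.
    con.execute("CREATE INDEX idx_employees_dept ON employees (dept)")
    after = con.execute(QUERY, ("sales",)).fetchall()
    con.close()
    return before, after


before, after = run_before_and_after_index()
print(before == after)
```

The engine may pick a different access path once the index exists, but that decision is its business, not the query author's.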

It's not the fastest model, not the most reliable model — it's just the most productive model. You can write ten times as many queries in an hour.

You may want to read this article in my blog which compares the most popular database models:

#2


4  

The concept of making a logical representation of data abstracted from its physical representation was perhaps the most game-changing aspect of Codd's idea. He was apparently the first person who fully realised the benefits of separating logical and physical concerns and therefore the first to devise a data model worthy of the name. By describing a model based on relations, without navigational links or pointer structures he also created something uniquely powerful, flexible and of lasting relevance.

To be accurate it must be said that it was the SQL model rather than the relational one which eventually proved more successful commercially. SQL is a long way from a truly relational data model or language even though it would not have come into being without Codd's ideas to inspire it. The relational model's creator was naturally disappointed that SQL rather than relational became the database standard. Four decades later I think we have plenty of cause to regret that Codd's relational model isn't better supported by DBMS software.

#3


2  

As far as I know, it was normalization theory (Codd's well-known Third Normal Form), which defines a relational data model that is easy and efficient to store and retrieve. This was followed by the Structured Query Language (SQL), which allowed the same language to be used across relational database systems. Standardization was definitely lacking back then, which also made this appealing to many.

#4


2  

I believe that none of the answers really nailed it, since you are interested in the historical aspect of it.

If I were to single out a reason, I would say Quassnoi was close, but SQL was not just any language - it has one great feature that was not common in the 70s:

  • it is declarative, it describes what you want to get and does not prescribe how to do it

I don't think it is possible to overstate this.
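
A small sketch of the difference, in Python with the built-in sqlite3 module. The data and names are invented, and the nested loops are only a stand-in for navigational access (this is not CODASYL or IMS code). The first function spells out how to walk the data; the second only states what is wanted.

```python
import sqlite3

# Hypothetical data: dept_id -> name, and (employee, dept_id, salary).
DEPARTMENTS = {1: "sales", 2: "ops"}
EMPLOYEES = [("alice", 1, 50), ("bob", 2, 40), ("carol", 1, 70)]


def total_salary_navigational(dept_name):
    """Procedural style: the program decides how to walk the data."""
    total = 0
    for dept_id, name in DEPARTMENTS.items():
        if name != dept_name:
            continue
        for _employee, emp_dept_id, salary in EMPLOYEES:
            if emp_dept_id == dept_id:
                total += salary
    return total


def total_salary_declarative(dept_name):
    """Declarative style: state the result; the engine picks the access path."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT)")
    con.execute("CREATE TABLE employees (name TEXT, dept_id INTEGER, salary INTEGER)")
    con.executemany("INSERT INTO departments VALUES (?, ?)", list(DEPARTMENTS.items()))
    con.executemany("INSERT INTO employees VALUES (?, ?, ?)", EMPLOYEES)
    (total,) = con.execute(
        "SELECT COALESCE(SUM(e.salary), 0) "
        "FROM employees AS e JOIN departments AS d ON d.id = e.dept_id "
        "WHERE d.name = ?",
        (dept_name,),
    ).fetchone()
    con.close()
    return total


print(total_salary_navigational("sales") == total_salary_declarative("sales"))
```

Note that the join matches rows purely on values (d.id = e.dept_id), with no stored pointers between records.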

Of course, the relational model is also a big factor (and is related to the above):

  • also, not all databases that have a schema are only about data; hierarchical and network databases are also about how to get at the data. Their structure determines the most efficient way to access the data, and therefore the structure influences how you model the problem. In other words, the modelling process is influenced by the way the application will use the data. One effect is that another application that needs to use the same database (IMS, for example) would be at a severe disadvantage; another is that some changes in the application (for example, a need to sort by some new column) would require changes to the database design to achieve good performance.

Note: SQL and 'R'DBMSes did not fully deliver on some of the above intentions (and/or other models overcame their problems in some way), but these were the initial intentions, and considering how stable SQL has been over the past ~40 years or so (here is a link to IBM's paper from 1974), it did not do such a bad job either.

There is also this quote from here

“Ted’s basic idea was that relationships between data items should be based on the item’s values, and not on separately specified linking or nesting. This notion greatly simplified the specification of queries and allowed unprecedented flexibility to exploit existing data sets in new ways,” said Don Chamberlin, co-inventor of SQL, the industry-standard language for querying relational databases, and a research staff member at Almaden. “He believed that computer users should be able to work at a more natural-language level and not be concerned about the details of where or how the data was stored.”

At a 1995 reunion of IBM’s early relational database scientists, Chamberlin recalled having an epiphany as he first heard Codd describe his relational model at an internal seminar.

“Codd had a bunch of fairly complicated queries,” Chamberlin said. “And since I’d been studying CODASYL (the language used to query navigational databases), I could imagine how those queries would have been represented in CODASYL by programs that were five pages long that would navigate through this labyrinth of pointers and stuff. Codd would sort of write them down as one-liners. ... (T)hey weren’t complicated at all. I said, ‘Wow.’ This was kind of a conversion experience for me. I understood what the relational thing was about after that.”

I seem to remember a transcript of an interesting interview about the beginnings of SQL, but I can't track it down.

#5


1  

One key was the self-contained products - you no longer had to manually define and maintain your key files (indexes), and you could change the data model with less effort. Combined with the set-based structures, that made for a compelling product set to work with. Add the SQL language on top of that to return data, and it was a win-win situation compared with the traditional ISAM data structures primarily associated with COBOL.
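
For a feel of what that looks like in practice, here is a minimal sketch with Python's built-in sqlite3 module (the parts table and its columns are hypothetical). The index is declared once and the engine keeps it up to date on every insert, and a new column is added to the live table with a single statement.

```python
import sqlite3


def evolve_schema():
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE parts (part_no INTEGER PRIMARY KEY, name TEXT)")
    con.execute("CREATE INDEX idx_parts_name ON parts (name)")
    # The DBMS keeps idx_parts_name in sync; there is no key file to update by hand.
    con.executemany("INSERT INTO parts VALUES (?, ?)", [(1, "bolt"), (2, "nut")])
    # Change the data model in place; existing rows get the default value.
    con.execute("ALTER TABLE parts ADD COLUMN weight_g INTEGER DEFAULT 0")
    con.execute("UPDATE parts SET weight_g = 12 WHERE name = 'bolt'")
    rows = con.execute(
        "SELECT part_no, name, weight_g FROM parts ORDER BY part_no"
    ).fetchall()
    con.close()
    return rows


rows = evolve_schema()
print(rows)
```

With an ISAM-style file, both steps would typically have been the application programmer's job.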
