I have worked with relational databases, but now I want to learn about graph databases. I have learned that these two, Neo4j and Titan, are graph databases. What is the difference between them, and which one should I prefer?
3 answers
#1
11
Titan was originally backed by Aurelius, which was bought by DataStax in 2015. This move was designed to give DataStax a jump-start into the graph DB world, as they now offer their own "DSE Graph" enterprise product. Titan has since been forked (as previously mentioned) into JanusGraph.
The nice thing about Titan/Janus (IMO) is that it is "pluggable" with other existing back-end and search technologies. So it will "play nice" with things like Cassandra, HBase, Hadoop, Solr, and Elasticsearch.
The drawback is that community support is thin. The Titan project has effectively been killed, and Janus scores only 0.23 on DB-Engines at the time of writing. That makes it the 16th most popular graph DB (231st overall), which is pretty low.
Neo4j is backed by Neo Technology, and is regarded as the front-runner in the graph DB community (score of 38.52 right now, 1st among graph DBs and 21st overall). It is open source, but controlled by Neo Technology, so they can dictate the difference in feature set between the open source and enterprise editions.
The nice thing about Neo4j is that it has a lot of tutorials and learning aids built right into the Neo4j Browser, which is a nice, user-friendly web interface. The documentation is top-notch, easy to read and search through, and Neo4j has a pretty good following here on Stack Overflow.
[Screenshot: Neo4j Browser]
The drawback of Neo4j is that some features (like clustering) are only available in the enterprise version. But if you work for a big company that doesn't mind shelling out $ for an enterprise license, that may not be a big deal.
Consistency: Titan/Janus is part of the "eventual consistency" crowd, while Neo4j aims to be strongly consistent (especially in a causal clustering scenario). Consistency can be tuned through configuration in both, but with Titan/Janus it also depends on your choice of pluggable backend (e.g. typically strongly consistent with HBase, but eventually consistent with Cassandra).
Recommendation:
If you're just starting to learn graph databases and modeling, you can't go wrong with Neo4j. Simply download and install the community edition, run it, and execute :play movies as your first command (a tutorial that walks you through loading, modeling, and querying movie relationships).
If you have some experience with graphs, and you don't mind troubleshooting/googling to figure things out (like how to set the max frame size for Thrift), then you could probably do some really cool things with Titan.
Try each one out, and see which works for you.
#2
10
One approach is to simply try to choose one database over the other. For example, you might quickly search around and find that Titan has been forked to JanusGraph, where it is more actively maintained. In your research you may find that there are other open source graph databases as well, like OrientDB, ChronoGraph, or Sqlg, as well as commercial alternatives like Microsoft's Cosmos DB, DSE Graph, or IBM Graph. How do you decide now?
There is a graph framework that ties together all of these graphs, including Neo4j and Titan (and more than those listed here): Apache TinkerPop. TinkerPop provides an abstraction over different graph databases and graph processors, allowing the same code to be used with different configurable backends. This pattern is quite similar to the one you find in SQL with JDBC, which helps make your code vendor-agnostic.
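To make the JDBC comparison concrete, here is a minimal sketch in plain Python of the general pattern: application code written once against a small interface, with interchangeable storage behind it. This is not TinkerPop's API; the names GraphBackend, AdjacencyBackend, TripleBackend, friends_of_friends, and load_sample are all made up for illustration.

```python
from typing import Dict, List, Protocol, Set, Tuple


class GraphBackend(Protocol):
    """Hypothetical minimal interface (an assumption, not TinkerPop's real API)."""

    def add_edge(self, src: str, label: str, dst: str) -> None: ...

    def out(self, src: str, label: str) -> List[str]: ...


class AdjacencyBackend:
    """Stores edges in a nested dict keyed by source vertex, then edge label."""

    def __init__(self) -> None:
        self._adj: Dict[str, Dict[str, List[str]]] = {}

    def add_edge(self, src: str, label: str, dst: str) -> None:
        self._adj.setdefault(src, {}).setdefault(label, []).append(dst)

    def out(self, src: str, label: str) -> List[str]:
        return sorted(self._adj.get(src, {}).get(label, []))


class TripleBackend:
    """Stores edges as a flat set of (src, label, dst) triples."""

    def __init__(self) -> None:
        self._triples: Set[Tuple[str, str, str]] = set()

    def add_edge(self, src: str, label: str, dst: str) -> None:
        self._triples.add((src, label, dst))

    def out(self, src: str, label: str) -> List[str]:
        return sorted(d for s, lbl, d in self._triples if s == src and lbl == label)


def friends_of_friends(graph: GraphBackend, person: str) -> List[str]:
    """Application logic: depends only on the interface, not on the storage."""
    found = set()
    for friend in graph.out(person, "knows"):
        for candidate in graph.out(friend, "knows"):
            if candidate != person:
                found.add(candidate)
    return sorted(found)


def load_sample(graph: GraphBackend) -> GraphBackend:
    """Load the same hand-made sample edges into whichever backend is passed in."""
    for src, dst in [("alice", "bob"), ("bob", "carol"), ("bob", "dave"), ("carol", "alice")]:
        graph.add_edge(src, "knows", dst)
    return graph


if __name__ == "__main__":
    for backend in (AdjacencyBackend(), TripleBackend()):
        print(type(backend).__name__, friends_of_friends(load_sample(backend), "alice"))
```

Swapping the backend changes nothing in friends_of_friends, which is the property that makes trying several databases cheap.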
You can try all of the different supported graph databases before you make a choice, and you can do this type of prototyping/benchmarking fairly quickly with the Gremlin Console. You will then be able to make an informed choice about the best way to go for your project.
It occurs to me as I come to the end of this post that I haven't directly answered your question. If you are just getting started and are just interested in learning about graph databases, then I likely wouldn't recommend starting with Titan/JanusGraph, as it requires a bit of configuration to get started (schemas, backend selection, etc.). Start with TinkerGraph or Neo4j, using the Gremlin Console to try out some simple graph traversals, and go from there.
#3
3
There are far more than two graph databases - there are dozens. That being said, there are two with real market share: Neo4j and Titan/JanusGraph. The dozens of other graph databases each have interesting strengths for different specific application spaces, but I wouldn't dig into all of the niche players to start with - you can learn the basic idea of graph databases with one of the two lead players.
Neo4j is the most mature, with the most nicely packaged install and documentation, tons of reference code, and support from a wide range of partners.
Titan/JanusGraph is the next most popular, as it's free/open source and has very strong support (e.g. IBM, Google, Hortonworks, AWS, ...). A recent complication is that the leaders of the Titan project were acquired, freezing the Titan project. But the community forked the project into JanusGraph. So while JanusGraph is a new project, it's literally the same Titan code, with even broader industry support than Titan had.
Related to this choice is the language used to work with the graphs. Neo4j uses its proprietary language, Cypher, while nearly everyone else uses Gremlin and the TinkerPop open source tool set (which is one of the Apache open source projects). Nearly all graph databases, including Neo4j, support Gremlin and TinkerPop. So, for example, you can use either Cypher or Gremlin to query Neo4j, though Neo (and some other proprietary graph database vendors) support Gremlin as a second-class citizen, so to speak. For example, you can connect to Neo using Gremlin from the (external) Gremlin console, but you can't use Gremlin in the (very nice) Neo4j console.
Note that there are many graph databases that support Gremlin other than Titan/JanusGraph. One new entrant that's very interesting is Microsoft's Azure Cosmos DB, which is a managed graph database that's "cheap and easy" if you use Azure already. And there are several vendors that provide managed JanusGraph.
For personal learning, I'd say that Neo4j is the easiest to set up and learn - you download and run it, and open a web browser onto its web-based console, which only takes a few minutes. That being said, if you're comfortable on a command line, JanusGraph only took me half an hour to install and get running, so it's not too hard.
For learning the concepts, Neo4j is great. Neo4j's query language, Cypher, and JanusGraph's query language, Gremlin, cover the same core concepts (nodes, relationships, properties, traversals), just spelled differently, so you'll learn the concepts either way.
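As an illustration of that shared model, the sketch below builds a tiny property graph in plain Python and answers one question ("which movies did this person act in?") by stepping along relationships. The comments show roughly how the same question might be written in Cypher and in Gremlin; those queries assume the labels Person and Movie, the relationship type ACTED_IN, and the properties name and title, which are choices made for this example. PropertyGraph, build_sample, and movies_for_actor are hypothetical helper names, and the sample data is hand-made.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PropertyGraph:
    """Tiny in-memory property graph: labelled nodes with properties, typed edges."""

    nodes: Dict[str, Tuple[str, Dict[str, str]]] = field(default_factory=dict)
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node_id: str, label: str, **props: str) -> None:
        self.nodes[node_id] = (label, props)

    def add_edge(self, src: str, rel: str, dst: str) -> None:
        self.edges.append((src, rel, dst))


def build_sample() -> PropertyGraph:
    """Hand-made sample data in the spirit of a movie graph."""
    g = PropertyGraph()
    g.add_node("keanu", "Person", name="Keanu Reeves")
    g.add_node("carrie", "Person", name="Carrie-Anne Moss")
    g.add_node("lana", "Person", name="Lana Wachowski")
    g.add_node("matrix", "Movie", title="The Matrix")
    g.add_node("speed", "Movie", title="Speed")
    g.add_edge("keanu", "ACTED_IN", "matrix")
    g.add_edge("keanu", "ACTED_IN", "speed")
    g.add_edge("carrie", "ACTED_IN", "matrix")
    g.add_edge("lana", "DIRECTED", "matrix")
    return g


def movies_for_actor(graph: PropertyGraph, actor_name: str) -> List[str]:
    # Roughly the same question in Cypher (pattern matching):
    #   MATCH (p:Person {name: 'Keanu Reeves'})-[:ACTED_IN]->(m:Movie)
    #   RETURN m.title
    # Roughly the same question in Gremlin (step-by-step traversal):
    #   g.V().has('Person', 'name', 'Keanu Reeves').out('ACTED_IN').values('title')

    # Step 1: find the starting Person node(s) by property.
    starts = {
        node_id
        for node_id, (label, props) in graph.nodes.items()
        if label == "Person" and props.get("name") == actor_name
    }
    # Step 2: follow outgoing ACTED_IN edges and read the title property.
    titles = [
        graph.nodes[dst][1]["title"]
        for src, rel, dst in graph.edges
        if src in starts and rel == "ACTED_IN"
    ]
    return sorted(titles)


if __name__ == "__main__":
    print(movies_for_actor(build_sample(), "Keanu Reeves"))  # ['Speed', 'The Matrix']
```

Either query language expresses the same two steps shown in the function: pick a starting node by label and property, then follow a typed relationship.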
For building a real system, either could work (and there are many successful systems following both approaches).
As for which to choose, you'll want to think about whether you want to be strategically tied to a single vendor (Neo4j) or to a broader standards-based community. There's a comfort level in picking the market leader with the most mature product - Neo4j. And there's a comfort level in picking open standards with strong industry support - JanusGraph. So IMO there's no "wrong" answer - people using either one are happy and successful. But since you have to pick, you'll need to think about which you're more comfortable with long-term.