Between MySQL and PostgreSQL, which is better suited to very large amounts of data, for example millions of records? I think I should use PostgreSQL. Any suggestions?
4 answers
#1
4
I think it depends a lot on what you mean by "better". You should probably identify your needs before choosing one or the other.
Faster? More reliable? Allows replication? Can do more complex queries? Is your application amenable to "sharding"? If so, you probably want a database that can be clustered and administered more easily. Or do you need everything in one massive set of linked tables? In that case you probably want good support for many cores and large memory. Do you have a complex authentication setup, or is it a simple "one user" web application? Is the bulk of the data in binary objects, or is it simple numbers and strings? How will you do your backups?
MySQL and PostgreSQL both seem to be very capable databases, and both have been used successfully at large scale, so I'd suggest you need to identify the specific needs of your application first.
My inclination would be towards PostgreSQL, but that's mainly because I had a few disasters with MySQL losing data a few years ago and I haven't come to trust it again. PostgreSQL has been very nice in terms of being able to make backups easily.
#2
5
I've used both in similar situations, and the sheer size of the DB doesn't seem to affect their scaling in substantially different ways. PostgreSQL is much more complete and solid, and supports complex queries and their optimization much better, while MySQL may shine in retrieval speed for extremely simple queries; but these aspects are independent of the size issue.
#3
4
Postgres has a richer set of abilities and a better optimizer; its ability to do hash joins often makes it much faster than MySQL for joins. MySQL is rumored to be faster for simple table scans. The storage engine you use underneath matters a lot, as well.
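For anyone unsure what a hash join buys you, here is a rough Python sketch of the idea next to a naive nested-loop join. This is only an illustration of the general technique, not how either database implements it internally; the function names and the sample rows are made up for the example.

```python
from collections import defaultdict


def nested_loop_join(left, right, key):
    """Compare every left row with every right row: O(len(left) * len(right))."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def hash_join(left, right, key):
    """Build a hash table on one input, then probe it with the other."""
    buckets = defaultdict(list)
    for r in right:  # build phase: one pass over the right input
        buckets[r[key]].append(r)
    joined = []
    for l in left:  # probe phase: one dictionary lookup per left row
        for r in buckets.get(l[key], []):
            joined.append({**l, **r})
    return joined


# Hypothetical sample data, purely for illustration.
users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 1, "item": "pen"},
    {"user_id": 3, "item": "cup"},
]

print(hash_join(users, orders, "user_id"))
```

Both functions return the same rows; the hash version just avoids comparing every pair, which is where the speedup on large joins comes from.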
At some point, scaling becomes a choice between two options: scale by buying bigger hardware, or scale by introducing new machines (which you can shard the data across, use as slave replicas, or run in a master-master setup). Both Postgres and MySQL have solutions of varying quality for these sorts of things.
A few million rows of table data fit in a standard server's memory these days; if that's all you are doing, you don't need to worry about this stuff. Just tune whatever database you are most comfortable with: make sure the proper indexes are created, everything is cached (with something like memcached where appropriate), and so on.
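To make the point about indexes concrete, here is a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in so it runs anywhere; MySQL and PostgreSQL have their own EXPLAIN output, but the idea is the same. The table, column, and index names are invented for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    ((f"user{i}@example.com", f"user{i}") for i in range(10000)),
)

lookup = "SELECT name FROM users WHERE email = ?"
arg = ("user42@example.com",)

plan_before = query_plan(conn, lookup, arg)  # no index yet: full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, arg)  # now an index search

print("before:", plan_before)
print("after: ", plan_after)
```

Checking the plan before and after adding an index is the habit worth copying, whichever database you end up on.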
People mention that Facebook uses MySQL; that's kind of true. Only "kind of", because what they are actually doing is using hundreds (thousands now?) of MySQL databases, each responsible for its own little cross-section of the data. If you think you can load Facebook into a single MySQL (or Postgres, or Oracle) instance... well, they'd probably love to hear from you ;-).
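In case "each database owns a cross-section of the data" sounds abstract, a minimal sketch of hash-based shard routing looks something like this. It is not a description of Facebook's actual setup; the shard names and the choice of hashing on a user id are assumptions for illustration.

```python
import hashlib

# Hypothetical shard names; in practice these would be connection settings.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(user_id: int, shards=SHARDS) -> str:
    """Map a user id to one shard, using a hash that is stable across runs."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


for uid in (1, 2, 3, 42):
    print(uid, "->", shard_for(uid))
```

The application asks `shard_for` where a user lives and then talks only to that database. Note that plain modulo routing like this forces a lot of data movement when the number of shards changes, which is one reason real deployments tend to use something more elaborate.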
Once you get into the terabyte land, things get difficult. There are specialized solutions like Vertica, Greenplum, Aster Data. There are the various "nosql" datastores like Cassandra, Voldemort, and HBase. But I doubt you need to go to such an extreme. Just buy a bit more RAM.
#4
2
Well, it ultimately depends on what you are most comfortable with. According to MySQL, there is no imposed theoretical limit on the size of the database; it depends on the capability of the hardware supporting it. As for how much a table can hold, with InnoDB the theoretical limit is 256 terabytes. The reason I keep saying "theoretical" is that there is very little chance you could actually index 256 terabytes of data, so that figure is only their approximation of where a limit might be. If you hit that max, you've got bigger problems. Current production users of MySQL that I can think of are YouTube and Facebook. Those are probably the two largest, and they appear to be faring well.
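For what it's worth, here is where a figure like 256 terabytes can come from. As far as I understand the MySQL documentation on InnoDB limits, a tablespace is capped at roughly 2^32 pages, so the ceiling depends on the configured page size. Treat the constant below as an assumption and check the manual for your version before relying on it.

```python
MAX_PAGES = 2 ** 32  # assumption: InnoDB addresses pages with a 32-bit page number
TIB = 1024 ** 4


def max_tablespace_tib(page_size_kib: int) -> int:
    """Largest tablespace, in TiB, for a given page size in KiB."""
    return MAX_PAGES * page_size_kib * 1024 // TIB


for kib in (4, 8, 16, 32, 64):
    print(f"{kib:>2} KiB pages -> {max_tablespace_tib(kib)} TiB")
```

Under that assumption the common 16 KiB page size works out to 64 TiB, and 256 TiB only appears with 64 KiB pages.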
But once again, as I stated above, it comes down to whatever you are most comfortable with.