Is there any difference between a CMS and a high-traffic website (such as a news portal) in application logic and in database design and optimization (PHP and MySQL)? I have searched for PHP site scalability, and memcached comes up in the majority of results. Are there techniques for MySQL optimization? (I'm also looking for a book on this topic. I have searched on Amazon, but I don't know which is the best choice.) Thanks in advance.
2 answers
#1
3
This isn't so easy to answer. There are different approaches and a variety of opinions, but I'll try to cover some common scenarios. First, some basics.
Most web applications can be separated into an application tier and a database tier. Database usage can in turn be separated into transactional (OLTP) and analytical (OLAP) workloads.
In the best case you can just start a number of application servers and distribute traffic among them. They all connect to the same database server and can otherwise work independently. You can accomplish this by simply adding multiple IP addresses to your domain name in DNS, or by using load-balancing techniques to forward clients to different servers. It gets harder, however, if the servers have to share other state, such as sessions.
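To make the traffic distribution concrete, here is a minimal round-robin sketch. It is written in Python only for brevity; the same logic applies in PHP or inside a real load balancer. The server addresses and the `make_round_robin` helper are made up for illustration, and a real setup would also have to deal with health checks and shared sessions.

```python
from itertools import cycle

# Hypothetical application server addresses (documentation-range IPs).
APP_SERVERS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]


def make_round_robin(servers):
    """Return a function that hands out the servers in rotation."""
    rotation = cycle(servers)
    return lambda: next(rotation)


pick_server = make_round_robin(APP_SERVERS)

# Six incoming requests are spread evenly over the three servers.
assignments = [pick_server() for _ in range(6)]
```

Because any request can land on any server, session data has to live somewhere all of them can reach (a shared store or the database) rather than on one server's local disk.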
Scaling the application is generally very easy. Scaling the database is much more complex.
The first thing to do is usually to set up one or more replication servers that hold the same data as the main database. They can be cascaded, but they have one serious disadvantage: their data is not always up to date. In general it is no more than a few seconds old, but the lag can grow under load. For many use cases this is fine. Big sites that just display information can replicate their database to some slave servers and set up some application servers, and everything is fine. (It's good practice to run one slave and one application server on the same machine and let that application server read from that slave.)
Every OLAP query can be directed to a slave. By OLAP queries I mean, loosely, those that don't modify anything and don't need 100% up-to-date data.
Every write, on the other hand, has to go to the one source database server from which every other server gets its copy. Every comment on an article is an example.
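The read/write split described above can be sketched roughly like this. It is only an illustration in Python: the host names and the `choose_server` function are assumptions, and the routing rule (only `SELECT` statements that tolerate stale data go to a slave) is a deliberately conservative simplification.

```python
import random

# Hypothetical host names; real code would hold database connections instead.
PRIMARY = "db-master"
REPLICAS = ["db-slave-1", "db-slave-2"]


def choose_server(sql, needs_fresh_data=False, rng=random):
    """Pick the server a statement should run on.

    Only SELECT statements that tolerate slightly stale data go to a slave;
    everything else (writes, or reads that must be current) goes to the master.
    """
    words = sql.split()
    verb = words[0].upper() if words else ""
    if verb != "SELECT" or needs_fresh_data:
        return PRIMARY
    return rng.choice(REPLICAS)
```

The `needs_fresh_data` flag covers cases like showing a user their own comment right after they posted it, where replication lag would otherwise make it look lost.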
If this bottleneck gets too tight, you can go in two directions:
- sharding
- master-master replication
Sharding means the application decides where to store and where to fetch each piece of data. For example, every comment that starts with "a" goes to server a, every comment that starts with "b" goes to server b, and so on. That's a silly example, but it's basically how it works; in practice the shard is mostly chosen from some internal ID. If possible, shard the data so that everything a query needs can be pulled from a single server again. In the example above, if I wanted all comments for an article, I would have to ask every server from a to z and merge the results. This is inefficient but possible, because those servers can be replicated as well. This is called mapping (you could look at Google's well-known MapReduce model, which basically does just this).
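A rough sketch of the better variant, sharding comments by article ID so that one server can answer "all comments for this article", might look like the following. The shard names, the modulo rule and the in-memory `storage` dictionary are all stand-ins chosen for illustration, not how any particular system does it.

```python
# Hypothetical shard names, one per database server.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

# In-memory stand-in for the comment table on each shard.
storage = {name: [] for name in SHARDS}


def shard_for_article(article_id):
    """Map an article ID to the shard that holds all of its comments."""
    return SHARDS[article_id % len(SHARDS)]


def add_comment(article_id, text):
    storage[shard_for_article(article_id)].append((article_id, text))


def comments_for_article(article_id):
    """A query aligned with the shard key is answered by a single shard."""
    rows = storage[shard_for_article(article_id)]
    return [text for aid, text in rows if aid == article_id]


def comments_containing(word):
    """A query not aligned with the shard key asks every shard and merges."""
    return [text for rows in storage.values() for _, text in rows if word in text]


add_comment(42, "first")
add_comment(42, "second")
add_comment(7, "first again")
```

Here `comments_for_article(42)` touches one shard only, while `comments_containing("first")` has to visit all four and merge. Note that a plain modulo rule like this forces data to be moved when the number of shards changes.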
Master-master replication means that you write your data to different master servers and they synchronize with each other, so the data isn't stored separately as it is with sharding. You need this if your application can't decide on its own where to store and fetch data. You just write to any master, every server gets everything, and everybody is happy? No, because this introduces another serious problem: conflicts. Imagine two users each enter a comment. commentA is stored on serverA and commentB on serverB. Which ID should each one get? Which one comes first? The best approach is to design the application so that these cases are avoided, for example by using different keys on each server. What usually happens instead is conflict resolution, prioritization and so on. Oracle has a lot of features at this level and MySQL is still behind. But the trend is toward much more complex data architectures, like clouds, anyway...
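One common way to get "different keys" on each master is to interleave the ID sequences, which is what MySQL's `auto_increment_increment` and `auto_increment_offset` system variables are for. The Python below only simulates that numbering scheme; the `id_generator` helper and the two-server setup are assumptions for illustration.

```python
from itertools import count


def id_generator(offset, increment):
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    return count(start=offset, step=increment)


# Two masters, so the step is 2 and each master gets its own offset.
server_a_ids = id_generator(offset=1, increment=2)
server_b_ids = id_generator(offset=2, increment=2)

ids_a = [next(server_a_ids) for _ in range(3)]  # odd IDs for serverA
ids_b = [next(server_b_ids) for _ in range(3)]  # even IDs for serverB
```

This only removes the ID collision. It does not answer "which one comes first", because the IDs no longer reflect insertion order across servers, so ordering needs something else, such as a timestamp column.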
Well, I don't think I explained this well, but you should at least get some keywords from the text that you can investigate further.
#2
1
Sure, there are all sorts of things you can do to optimize your PHP/MySQL web applications for high traffic websites. However, most of them depend on your specific situation, which you haven't given in your question.
Your database should be well structured regardless of whether you have a high-traffic site or not. If you use an off-the-shelf CMS, this is typically fine. Aside from good application architecture, there is no one-size-fits-all solution.