Scenario
Say you have a website or app with heavy traffic. Even with a database connection pool, performance is taking a real hit (the site/app may even be crashing) because there are too many concurrent connections.
Question
What are the options for dealing with this problem?
My thoughts
I was thinking someone with this problem could create multiple databases (possibly on different machines, although I'm not sure that's necessary), each holding the same information and updated at the same time, which would multiply the number of connections available compared with a single database. But if the database is large, that doesn't seem like a very viable solution.
7 Solutions
#1
10
The question is not specific enough to give a firm suggestion, but the list of what could be done is as follows:
- Database cluster: Suitable when you don't want to change your application layer and the database is all you touch. There's a limit to how much you can get out of a database cluster; if your request volume keeps growing, this solution will eventually fail as well. The good news is that you keep all the functionality you already have in an ordinary single-instance MySQL.
- Sharding: Since your question is tagged MySQL, which does not support sharding on its own, you need to implement this solution in your application layer. You scatter your data logically over multiple databases (preferably multiple MySQL instances on separate hardware), and it becomes your responsibility to find the database holding the data you need. It's one of the most effective solutions, but it's not always feasible. Its biggest flaw is that data scattered across two or more databases cannot be included in a single transaction.
- Replication: Depending on your scenario, you might be able to set up database replication and keep copies of your data on other servers. You can then connect to those copies instead of the master database and reduce the load on it. The default replication setup is master/slave, in which data flows one way, from the master to the slave. Changes you make on a slave are applied on that slave but do not affect the master. There is also a master/master replication configuration in which data flows both ways, but you cannot assume atomic integrity for concurrent data changes across the two masters. In the end, this solution is most effective in master/slave mode with the slaves used for read-only access.
- Caching: Perhaps this solution should not be included here, but since your question does not rule it out, here it goes. One way to reduce database load is to cache data once it has been extracted. This is especially beneficial if extracting the data is expensive. There are many cache servers out there, such as memcached or redis. This way you can avoid many database connections, but only for reads.
- Other storage engines: You can always switch to a more performant engine if your current one does not give you what you need. Of course, this is only feasible if your requirements allow it. Nowadays there are NoSQL engines, much more performant than an RDBMS, that support sharding natively and that you can scale linearly with minimal effort. There are also Lucene-based solutions with powerful full-text search capabilities that provide the same automatic sharding. In fact, the only reason to use a traditional RDBMS is the atomic behavior of transactions. If transactions are not a must, there are much better solutions than an RDBMS.
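As a rough illustration of the caching option above, here is a minimal cache-aside sketch in Python. It is only an assumption-laden stand-in: `TTLCache` is an in-process dictionary playing the role of a cache server such as memcached or redis, and `load_user_from_db` is a hypothetical expensive query, not anything from the question.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """In-process stand-in for a cache server such as memcached or redis."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get_or_load(self, key: str, loader: Callable[[], object]) -> object:
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no database connection needed
        value = loader()  # cache miss or expired entry: one trip to the database
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


calls = {"db": 0}  # counts how often the "database" is actually hit


def load_user_from_db() -> dict:
    """Hypothetical expensive query; here it only records that it ran."""
    calls["db"] += 1
    return {"id": 42, "name": "example"}


cache = TTLCache(ttl_seconds=30.0)
for _ in range(1000):
    user = cache.get_or_load("user:42", load_user_from_db)
```

In this sketch a thousand reads of the same key reach the loader once. Writes still have to go to the database, and cached entries can be stale until they expire, which is why this only helps on the read side.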
#2
3
If you don't already, you could try running your application on an application server to get some middleware behind your app. Most application servers do their own connection pooling (because getting a connection from a web app to a database connection pool is still really expensive). Additionally, you should be able to configure your application server to use shared connections, which, as the name implies, allows connections to be shared wherever possible.
In short, use an app server. If you already do, mention which one you're using and we can look at optimizing the server config from there.
#3
3
Replication -- Master plus any number of slaves. This gives you "unlimited" read scaling.
Disconnect -- The app should not keep a connection open longer than necessary.
Unix, not Windows -- Need I elaborate?
InnoDB -- Use InnoDB, not MyISAM.
SlowLog -- Set long_query_time to 1 and watch for the top couple of queries; optimize them. See pt-query-digest for help in summarizing the slowlog.
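To show what the read scaling from replication might look like on the application side, here is a small Python sketch of read/write splitting. Everything in it is an assumption for illustration: the `ReadWriteRouter` class and the host names are hypothetical, and the "starts with SELECT" rule is deliberately naive.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Send reads to replicas in round-robin order and everything else to the master."""

    def __init__(self, master: str, replicas: List[str]) -> None:
        self.master = master
        # With no replicas configured, reads simply fall back to the master.
        self._replica_cycle = itertools.cycle(replicas or [master])

    def host_for(self, sql: str) -> str:
        words = sql.split()
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._replica_cycle)
        return self.master  # INSERT, UPDATE, DELETE, DDL, and anything unrecognized


# Hypothetical host names, not real servers.
router = ReadWriteRouter(
    master="db-master.internal",
    replicas=["db-replica-1.internal", "db-replica-2.internal"],
)
read_host = router.host_for("SELECT name FROM users WHERE id = 42")
write_host = router.host_for("UPDATE users SET name = 'x' WHERE id = 42")
```

A real router needs more care than this: statements such as SELECT ... FOR UPDATE belong on the master, and with asynchronous replication a replica can lag, so a read issued right after a write may not see it.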
#4
2
This is a typical app scaling problem and many solutions have been devised, Google Bigtable and Amazon Elastic products for instance. If moving into a cloud and taking advantage of the auto-scaling options they all provide is not an option, then you'll need to create your own setup. Take a look at the docs for Postgres and MySQL, and you'll find that the ideas are pretty similar, including the concepts of:
- Sharding: spread your client data across several databases and route client requests to the right database instance.
- Load balancing: deploy your app on several servers and use middleware to route requests based on server load. This requires some kind of DB synchronization tool like SymmetricDS to keep the databases in sync.
This is by no means a full-blown overview of all your options but might help you get started.
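For the sharding idea, a minimal Python sketch of routing a client to a database instance could look like the following. The shard host names and the `shard_for` helper are made up for illustration, and hash-modulo is just one possible placement rule, not something the Postgres or MySQL docs prescribe.

```python
import hashlib
from typing import List

# Hypothetical shard hosts; each would be a separate database instance.
SHARDS: List[str] = [
    "shard0.example.internal",
    "shard1.example.internal",
    "shard2.example.internal",
    "shard3.example.internal",
]


def shard_for(client_id: str, shards: List[str] = SHARDS) -> str:
    """Map a client id to one shard using a stable hash.

    hashlib is used instead of the built-in hash() because the built-in
    string hash is randomized per process by default.
    """
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


host = shard_for("client-1001")
```

The same client id always lands on the same shard, so all of that client's rows can live in one place. The catch with plain modulo placement is that changing the number of shards moves most keys, so growing the cluster means migrating data.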
#5
0
There are many things you should investigate for this problem.
- Find out how many simultaneous connections there are. You can add RAM and raise the maximum number of connections. MySQL's max_connections setting defaults to 151 and can be raised as far as 100000, memory permitting.
- Make sure your app is closing connections. Even with a pool, the app has to return connections to the pool.
- Run the database on a separate server.
- Make sure you have optimized your queries. One long-running query can slow a system down.
- Finally, use MySQL Cluster if the other approaches fail. With a high-traffic site you may want to consider this anyway to avoid a single point of failure.
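On the point about returning connections to the pool, here is a small Python sketch of one way to make that hard to forget. `FakeConnection` and `BoundedPool` are hypothetical stand-ins rather than any real driver's API; the idea is only that a context manager hands the connection back even when the query fails.

```python
import queue
from contextlib import contextmanager
from typing import Iterator


class FakeConnection:
    """Hypothetical stand-in for a real database connection."""

    def __init__(self, conn_id: int) -> None:
        self.conn_id = conn_id


class BoundedPool:
    """Fixed-size pool: callers wait for a free connection instead of opening more."""

    def __init__(self, size: int) -> None:
        self._idle = queue.Queue(maxsize=size)
        for i in range(size):
            self._idle.put(FakeConnection(i))

    @contextmanager
    def connection(self, timeout: float = 1.0) -> Iterator[FakeConnection]:
        # Raises queue.Empty if no connection becomes free within the timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always returned, even if the caller raised

    def idle_count(self) -> int:
        return self._idle.qsize()


pool = BoundedPool(size=2)
try:
    with pool.connection() as conn:
        raise RuntimeError("query failed")  # simulate an error mid-request
except RuntimeError:
    pass
```

After the failed request both connections are back in the pool. Without the try/finally, every error path would leak one connection until the pool ran dry.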
#6
0
I was running into similar issues. Even though the app was supposedly closing its connections, I could see them stacking up in SQL as sleeping connections. After looking into the issue, I added the following to my connection string in web.config as a test:
Connection Lifetime=600
This should have killed any sleeping connections after 10 mins - but it didn't...
On further review, I found pending Windows updates on both my web server and SQL server. I installed them and, magically, the problem went away!
I wish I could give you a more specific answer, but somewhere between adding that "Connection Lifetime" setting and getting my web and SQL servers up to date with patches, the issue was completely eliminated for me. I've been clean for 3 weeks now, no issues.
#7
0
In our case, we faced the same issue when MySQL concurrent connections reached 100.
Finally, we found a great npm module, express-myconnection (https://www.npmjs.com/package/express-myconnection). It automatically releases connections when done. It supports Single and Pool connection strategies.
It works fine.