We have a service that sees several hundred simultaneous connections throughout the day, peaking at about 2000, for about 3 million hits a day and growing. With each request I need to log 4 or 5 pieces of data to MySQL. We originally used the logging that came with the app we were using, but it was terribly inefficient: it ran my db server at more than 3x the average CPU load and would eventually bring the server to its knees.
At this point we are going to add our own logging to the application (PHP). The only option I have for logging the data is the MySQL db, as this is the only common resource available to all of the HTTP servers. The data will be mostly writes, but every day we generate reports from it, then crunch and archive the old data.
What recommendations can be made to ensure that the logging itself doesn't take down our services?
2 Answers
#1
2
The solution we took with this problem was to create an archive table, then regularly (every 15 minutes, on an app server) crunch the data and move it into the tables used to generate reports. The archive table of course has no indices; the tables the reports are generated from have several.
Some stats on this approach:
Short Version: >360 times faster
Long Version:
The original code/model did direct inserts into the indexed table, and the average insert took 0.036 seconds; with the new code/model, inserts took less than 0.0001 seconds (I was not able to get an accurate reading for a single insert, so I measured 100,000 inserts and averaged the insert time). The post-processing (crunch) took an average of 12 seconds for several tens of thousands of records. Overall we were greatly pleased with this approach, and so far it has worked incredibly well for us.
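A minimal sketch of the pattern (the table names, columns, and connection details below are illustrative placeholders, not the original schema):

```php
<?php
// Hot path (per request): one cheap INSERT into the unindexed archive table.
$pdo = new PDO('mysql:host=dbhost;dbname=app', 'user', 'pass');

$userId = 42; $action = 'page_view'; $bytes = 1024; // stand-ins for the 4-5 logged fields
$pdo->prepare(
    'INSERT INTO log_archive (user_id, action, bytes, created_at)
     VALUES (?, ?, ?, NOW())'
)->execute([$userId, $action, $bytes]);

// Crunch job (cron, every 15 minutes on an app server): atomically swap in
// an empty archive table, then drain the old one into the indexed reporting
// table so request-path inserts never wait on the crunch.
$pdo->exec('CREATE TABLE log_archive_new LIKE log_archive');
$pdo->exec('RENAME TABLE log_archive TO log_archive_old, log_archive_new TO log_archive');
$pdo->exec('INSERT INTO log_report SELECT * FROM log_archive_old');
$pdo->exec('DROP TABLE log_archive_old');
```

The RENAME TABLE swap is one way to keep the crunch from racing concurrent inserts; the answer doesn't specify how the handoff was actually done.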
#2
0
Based on what you describe, I recommend you leverage the fact that you don't need to read this data immediately and pursue a "periodic bulk commit" route. That is, buffer the logging data in RAM on the app servers and do periodic bulk commits. If you have multiple application nodes, some sort of randomized approach helps even more (e.g., commit the buffered data every 5 +/- 2 minutes).
The main drawback of this approach is that if an app server fails, you lose the buffered data. However, that's only bad if (a) you absolutely need all of the data and (b) your app servers crash regularly. There is only a small chance that both are true, but if they are, you can simply persist the buffer to local disk (temporarily) on the app server.
The main idea is:
- buffering the data
- periodic bulk commits (leveraging some sort of randomization in a distributed system helps; see the sketch below)
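One rough way to realize this in PHP, sketched under the assumption that each app server buffers to a local file and a cron-driven flusher pushes everything to MySQL in a single multi-row INSERT (paths, table, and column names are made up for illustration):

```php
<?php
// Request side: append the entry to a local buffer file; no DB hit here.
$entry = json_encode(['user_id' => 42, 'action' => 'page_view', 'bytes' => 1024]);
file_put_contents('/var/spool/applog/buffer.jsonl', $entry . "\n", FILE_APPEND | LOCK_EX);

// Flusher (cron on each app server): a random sleep spreads commits across
// nodes (the "5 +/- 2 minutes" idea), then the whole buffer is committed
// to MySQL as one multi-row INSERT.
sleep(random_int(0, 240));

$path = '/var/spool/applog/buffer.jsonl';
if (!is_file($path)) {
    exit;
}
rename($path, $path . '.flushing'); // claim the buffer; new requests start a fresh file
$lines = file($path . '.flushing', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
if (!$lines) {
    unlink($path . '.flushing');
    exit;
}

$pdo = new PDO('mysql:host=dbhost;dbname=app', 'user', 'pass');
$placeholders = implode(',', array_fill(0, count($lines), '(?, ?, ?)'));
$params = [];
foreach ($lines as $line) {
    $row = json_decode($line, true);
    array_push($params, $row['user_id'], $row['action'], $row['bytes']);
}
$pdo->prepare("INSERT INTO log_report (user_id, action, bytes) VALUES $placeholders")
    ->execute($params);
unlink($path . '.flushing');
```

Buffering to a local file rather than pure RAM also covers the crash concern above at almost no extra cost.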
Another approach is to stop opening and closing connections on every request, if possible (e.g., keep longer-lived connections open). That's likely a good first step, but it may require a fair amount of work in a part of the system you may not have control over. If you do, though, it's worth exploring.
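With PDO, for example, persistent connections are a one-line option (connection details illustrative; whether this helps depends on how the app and web server manage processes):

```php
<?php
// Reuse a pooled connection instead of paying the TCP + MySQL handshake
// cost on every request.
$pdo = new PDO('mysql:host=dbhost;dbname=app', 'user', 'pass', [
    PDO::ATTR_PERSISTENT => true,
]);
```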