I have an array of about 50,000 mobile numbers. I'm trying to process them and send bulk SMS to these numbers through a third-party API, but the browser freezes for several minutes. I'm looking for a better option.
Processing the data involves checking the mobile number type (e.g. CDMA), assigning a unique ID to each number for later reference, checking for network- or country-specific charges, etc.
I thought of queuing the data in the database and using cron to send batches of about 5,000 every minute, but that will take a long time if there are many messages. What are my other options?
I'm using CodeIgniter 2 on an XAMPP server.
4 answers
#1
36
I would write two scripts:
File index.php:
<iframe src="job.php" frameborder="0" scrolling="no" width="1" height="1"></iframe>
<script type="text/javascript">
    function progress(percent){
        document.getElementById('done').innerHTML=percent+'%';
    }
</script>
<div id="done">0%</div>
File job.php:
set_time_limit(0); // ignore php timeout
ignore_user_abort(true); // keep on going even if user pulls the plug*
while(ob_get_level())ob_end_clean(); // remove output buffers
ob_implicit_flush(true); // output stuff directly
// * This absolutely depends on whether you want the user to stop the process
// or not. For example: You might create a stop button in index.php like so:
// <a href="javascript:window.frames[0].location='';">Stop!</a>
// <a href="javascript:window.frames[0].location='job.php';">Start</a>
// But of course, you will need the ignore_user_abort() line commented out for this feature to work.
function progress($percent){
    echo '<script type="text/javascript">parent.progress('.$percent.');</script>';
}
// $mobiles is the array of numbers to send to, loaded before this point
$total=count($mobiles);
echo '<!DOCTYPE html><html><head></head><body>'; // webkit hotfix
foreach($mobiles as $i=>$mobile){
    // send sms
    progress(round($i/$total*100)); // report a whole-number percentage
}
progress(100);
echo '</body></html>'; // webkit hotfix
#2
1
I'm assuming these numbers are in a database; if so, you should add a new column named isSent (or whatever you fancy).
The processing you describe in this next paragraph should be queued and run nightly, weekly, or whenever appropriate. Unless you have a specific reason to, it shouldn't be done in bulk on demand. You can even add a column to the database recording when each number was last checked, so that if a number hasn't been checked in at least X days you can check that number on demand.
Processing of the data involves checking mobile number type (e.g. CDMA), assigning unique ids to all the numbers for further referencing, check for network/country unique charges, etc.
But that still leads you back to the same question of how to do this for 50,000 numbers at once. Since you mentioned cron jobs, I'm assuming you have SSH access to your server, which means you don't need a browser. These cron jobs can be executed from the command line like this:
/usr/bin/php /home/username/example.com/myscript.php
My recommendation is to process 1,000 numbers at a time every 10 minutes via cron, time how long each run takes, and save that to the database. Since you're considering a cron job, these don't seem to be time-sensitive SMS messages, so they can be spread out. Once you know how long it took this script to run 50 times (50 * 1,000 = 50k), you can update your cron job to run more or less frequently.
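$time_start = microtime(true);
set_time_limit(0);
// send this batch here; doSendSMS() is a placeholder for your own sending function
doSendSMS($phoneNum, $msg, $blah);
$time_end = microtime(true);
$time = $time_end - $time_start; // elapsed time in seconds
saveTimeRequiredToSendMessagesInDB($time);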
Also, you might have noticed the set_time_limit(0) call; it tells PHP not to time out after the default 30 seconds. If you are able to modify the php.ini file, you don't need this line of code. Even so, I would still recommend not changing the limit in php.ini, since you probably want other pages to time out.
http://php.net/manual/en/function.set-time-limit.php
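The cron-driven batching described in this answer could be sketched roughly as follows. This is only an illustration under stated assumptions: the $rows array stands in for database rows carrying the isSent column, sendNextBatch() is a made-up helper name, and $fakeSend is a placeholder for whatever wraps the third-party SMS API. It uses type declarations, so it assumes a reasonably recent PHP.

```php
<?php
// Sketch only: $rows stands in for database rows with an isSent column,
// and $sendSms is a hypothetical wrapper around the third-party SMS API.

/**
 * Attempt up to $batchSize unsent numbers, mark successes as sent,
 * and report how many were sent and how long the run took.
 */
function sendNextBatch(array &$rows, int $batchSize, callable $sendSms): array
{
    $start = microtime(true);
    $attempted = 0;
    $sent = 0;

    foreach ($rows as &$row) {
        if ($attempted >= $batchSize) {
            break; // leave the rest for the next cron run
        }
        if ($row['isSent']) {
            continue; // already handled by an earlier run
        }
        $attempted++;
        if ($sendSms($row['mobile'])) {
            $row['isSent'] = true;
            $sent++;
        }
    }
    unset($row); // break the reference left behind by foreach

    return ['sent' => $sent, 'seconds' => microtime(true) - $start];
}

// Demo: five made-up numbers, two per run, with a fake sender that always succeeds.
$rows = [];
foreach (['15550000001', '15550000002', '15550000003', '15550000004', '15550000005'] as $mobile) {
    $rows[] = ['mobile' => $mobile, 'isSent' => false];
}
$fakeSend = function (string $mobile): bool {
    return true;
};

$run1 = sendNextBatch($rows, 2, $fakeSend);
$run2 = sendNextBatch($rows, 2, $fakeSend);
$run3 = sendNextBatch($rows, 2, $fakeSend);
echo $run1['sent'], ' ', $run2['sent'], ' ', $run3['sent'], "\n";
```

In a real cron script each run would load its batch with a query on isSent and store the returned 'seconds' value, which gives the timing data the answer suggests collecting. Numbers that keep failing would be retried on every run in this sketch, so a retry counter may be worth adding.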
#3
1
If this isn't a one-off situation, consider engineering a better solution.
What you basically want is a queue that your browser-bound process can write to, and that 1 to N worker processes can read from and update.
Putting work in the queue should be rather inexpensive: perhaps a bunch of simple INSERT statements to a SQL RDBMS.
Then you can have a daemon or two (or 100, distributed across multiple servers) that read from the queue and process the work. You'll want to be careful here to avoid two workers taking on the same task, but that's not hard to code around.
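One possible way to code around the "two workers, same task" problem is a conditional UPDATE that only succeeds while the row is still pending. The sketch below is an assumption-laden illustration, not a prescribed design: the sms_queue table, its columns, and the enqueue() and claimNextJob() helpers are all hypothetical names, and an in-memory SQLite database (which requires the pdo_sqlite extension) is used purely so the example is self-contained.

```php
<?php
// Hypothetical queue table: sms_queue(id, mobile, status, worker).

/** Cheap enqueue: plain INSERTs wrapped in a single transaction. */
function enqueue(PDO $db, array $mobiles): void
{
    $stmt = $db->prepare("INSERT INTO sms_queue (mobile, status) VALUES (?, 'pending')");
    $db->beginTransaction();
    foreach ($mobiles as $mobile) {
        $stmt->execute([$mobile]);
    }
    $db->commit();
}

/** Claim one pending job for this worker, or return null if none are left. */
function claimNextJob(PDO $db, string $workerId): ?array
{
    while (true) {
        $row = $db->query(
            "SELECT id, mobile FROM sms_queue WHERE status = 'pending' ORDER BY id LIMIT 1"
        )->fetch(PDO::FETCH_ASSOC);
        if ($row === false) {
            return null; // queue is empty
        }
        // The UPDATE only matches if the row is still pending, so if two
        // workers pick the same row, only one of them changes it.
        $claim = $db->prepare(
            "UPDATE sms_queue SET status = 'processing', worker = ? WHERE id = ? AND status = 'pending'"
        );
        $claim->execute([$workerId, $row['id']]);
        if ($claim->rowCount() === 1) {
            return $row; // this worker owns the job
        }
        // Another worker won the race; loop and try the next pending row.
    }
}

// Demo with an in-memory SQLite database and made-up numbers.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE sms_queue (id INTEGER PRIMARY KEY, mobile TEXT, status TEXT, worker TEXT)");
enqueue($db, ['15550000001', '15550000002']);

$first = claimNextJob($db, 'worker-a');
$second = claimNextJob($db, 'worker-b');
$third = claimNextJob($db, 'worker-a'); // nothing left to claim
```

A worker would then send the message for the claimed row and set its status to something like 'sent' or 'failed'. A "queue status" page can simply count rows per status.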
So your browser-bound workflow is: click a button that adds a bunch of work to the queue, then redirect to a "queue status" interface where the user can watch the system chew through all their work.
A system like this is nice because it's easy to scale horizontally quite a long way.
EDIT: Christian Sciberras' answer is going in this direction, except that the browser ends up driving both sides (it adds to the queue, then drives the worker process).
#4
0
A cron job would be your best bet. I don't see why it would take any longer than doing it in the browser, if your only problem at the moment is the browser timing out.
If you insist on doing it via the browser, the other solution would be to do it in batches of, say, 1,000 and redirect to the same script each time, passing a reference to where it got up to last time in a $_GET variable.
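The browser-driven batching described here might look something like the following sketch. The helper names batchFor() and nextBatchUrl(), the offset parameter name, and the send.php script name are all assumptions made for illustration; the actual sending and the redirect are left as comments so the example runs on its own.

```php
<?php
/** Return the slice of numbers to handle for the given offset. */
function batchFor(array $mobiles, int $offset, int $batchSize): array
{
    return array_slice($mobiles, max(0, $offset), $batchSize);
}

/** Build the URL for the next batch, or null once everything is done. */
function nextBatchUrl(string $script, int $offset, int $batchSize, int $total): ?string
{
    $next = $offset + $batchSize;
    if ($next >= $total) {
        return null; // this was the last batch
    }
    return $script . '?' . http_build_query(['offset' => $next]);
}

// In the real page the offset would come from the query string, e.g.:
//   $offset = isset($_GET['offset']) ? (int) $_GET['offset'] : 0;
//   foreach (batchFor($mobiles, $offset, 1000) as $mobile) { /* send sms */ }
//   $url = nextBatchUrl('send.php', $offset, 1000, count($mobiles));
//   if ($url !== null) { header('Location: ' . $url); exit; }

// Demo with 2,500 placeholder entries and batches of 1,000.
$mobiles = range(1, 2500);
$total = count($mobiles);
$urlAfterFirst = nextBatchUrl('send.php', 0, 1000, $total);
$urlAfterSecond = nextBatchUrl('send.php', 1000, 1000, $total);
$urlAfterThird = nextBatchUrl('send.php', 2000, 1000, $total);
$lastBatchSize = count(batchFor($mobiles, 2000, 1000));
```

Browsers typically give up on long chains of HTTP redirects, so with 50 batches it may be safer to move to the next batch with a meta refresh or a small piece of JavaScript rather than a Location header.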