I have written a Python program that scrapes information from a website using regex. My goal is to create a cron job to run this scraper each month.
I have gone into the Linux terminal, typed in crontab -e, and added the following to the bottom of the crontab file:
#!/usr/bin/python chmod +x
30 8 1 * * /home/pi/Nikita/The_Scraper/thescraper.py
PATH=/home/pi/Nikita/The_Scraper/thescraper.py
MAILTO="myemailaddress@gmail.com"
I am wondering:
- If this is the correct text to include in the crontab file
- How to verify that my scraper program will run each month and that the cron job I created is working
1 solution
#1
Where do I start:
1st: Cron starts the script via a shell (/bin/sh by default), so /home/pi/Nikita/The_Scraper/thescraper.py has to have execute permission (chmod +x). The #!/usr/bin/python line belongs at the top of the script itself, not in the crontab, where a line starting with # is only a comment.
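The two requirements in this first point can be checked from Python before trusting the cron job. This is only a minimal sketch: the helper names check_script and make_executable are made up for illustration, and it assumes a Unix-like system where the execute bit and the shebang line decide whether a file can be run directly.

```python
import os
import stat


def check_script(path):
    """Return a list of problems that would stop cron from running the script directly."""
    if not os.path.isfile(path):
        return ["file does not exist"]
    problems = []
    # The current user needs execute permission on the file.
    if not os.access(path, os.X_OK):
        problems.append("missing execute permission")
    # The first line must be a shebang such as #!/usr/bin/python.
    with open(path, "rb") as handle:
        if not handle.readline().startswith(b"#!"):
            problems.append("first line is not a shebang")
    return problems


def make_executable(path):
    """Add the owner's execute bit, similar to `chmod u+x`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)
```

Calling check_script("/home/pi/Nikita/The_Scraper/thescraper.py") as the user who owns the crontab should return an empty list once both conditions are met.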
2nd: PATH is the name of the environment variable that lists the directories the shell searches for a command given without a path. It should contain only directories, never the script file itself.
3rd: The crontab is read from top to bottom, so settings such as MAILTO must come before the jobs they apply to. It should be enough to use:
MAILTO="myemailaddress@gmail.com"
30 8 1 * * /home/pi/Nikita/The_Scraper/thescraper.py
This should run your script on the 1st day of every month at 8:30.
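To make the meaning of the five time fields explicit, here is a small illustrative sketch that splits a crontab job line into named fields. The function name split_cron_line is hypothetical, and the sketch only labels the fields; it does not validate ranges or handle variable assignments like MAILTO.

```python
# Order of the five time fields in a crontab job line.
FIELD_NAMES = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_line(line):
    """Split a crontab job line into a dict of time fields and the command."""
    parts = line.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five time fields followed by a command")
    schedule = dict(zip(FIELD_NAMES, parts[:5]))
    return schedule, parts[5]


schedule, command = split_cron_line(
    "30 8 1 * * /home/pi/Nikita/The_Scraper/thescraper.py"
)
print(schedule)
# {'minute': '30', 'hour': '8', 'day_of_month': '1', 'month': '*', 'day_of_week': '*'}
print(command)
# /home/pi/Nikita/The_Scraper/thescraper.py
```

Minute 30, hour 8, day of month 1, with * for month and day of week, is what gives "08:30 on the 1st of every month".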
The MAILTO setting specifies the address that cron mails the job's output to. For an external address such as this one, you need a properly configured MTA (the program that delivers the mail) running. The content of that mail is the STDOUT and STDERR of the job.
To test, you could temporarily set the schedule to a time in the near future (say, 5 minutes from now) and see what happens. You can also redirect the output to a file; then, even if sending mail does not work, you can see whether the job ran and what its output was.
30 8 1 * * /home/pi/Nikita/The_Scraper/thescraper.py > /some/test/file/where/your/user/can/write 2>&1
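As an alternative to shell redirection, the scraper could write its own timestamped log, which leaves a record of every run even when mail is not set up. This is a sketch under assumptions: the function name setup_run_log and the logger name are invented for illustration, and the log path has to be one your user can write to.

```python
import logging


def setup_run_log(log_path):
    """Return a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger("thescraper")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)  # opens the file in append mode
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


# Possible use at the top of the scraper (the path is a placeholder):
#   log = setup_run_log("/some/test/file/where/your/user/can/write")
#   log.info("scraper started")
#   ... scraping ...
#   log.info("scraper finished")
```

After the scheduled time has passed, a new "scraper started" line in that file shows that cron really launched the script.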
If you have access to the system logs, you can also check there whether the cron job has been executed.
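A rough way to search the logs from Python is sketched below. It assumes cron writes lines containing "CRON" together with the command it started, which is common on Debian-based systems; the function name cron_runs is hypothetical, and the log location (/var/log/syslog on many such systems, the systemd journal on others) is an assumption you will need to confirm for your machine.

```python
def cron_runs(log_lines, command):
    """Return the log lines in which cron mentions starting `command`."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if "CRON" in line and command in line
    ]


# Possible use (the log path is an assumption and may need root to read):
#   with open("/var/log/syslog") as log_file:
#       for hit in cron_runs(log_file, "/home/pi/Nikita/The_Scraper/thescraper.py"):
#           print(hit)
```

An empty result after the scheduled time suggests the job did not start, or that cron logs somewhere else on your system.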