How do I automate a scraper program I wrote in Python to run every month?

Time: 2021-02-15 09:24:29

I have written a Python program that scrapes information from a website using regex. My goal is to create a cron job to run this scraper each month.

I have gone into the Linux terminal, typed in crontab -e, and added to the bottom of the crontab file:

#!/usr/bin/python chmod +x
30 8 1 * * /home/pi/Nikita/The_Scraper/thescraper.py
PATH=/home/pi/Nikita/The_Scraper/thescraper.py
MAILTO="myemailaddress@gmail.com"

I am wondering:

  1. Whether this is the correct text to include in the crontab file

  2. How to verify that my scraper program will run each month and that the cron job I created is working

1 solution

#1


Where do I start:

1st: Cron starts the script via a shell (/bin/sh by default), so /home/pi/Nikita/The_Scraper/thescraper.py has to have execute permission. The #!/usr/bin/python line belongs at the top of the script itself, and chmod +x is a command you run once in the terminal; neither goes in the crontab.
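
The two requirements in the first point can be checked from Python. The following is only a minimal sketch: the function name check_cron_script is hypothetical, and it assumes a Linux system where the script is started directly rather than through an explicit interpreter call.

```python
import os


def check_cron_script(path):
    """Return a list of problems that would stop cron from running `path` directly."""
    if not os.path.isfile(path):
        return [f"{path} does not exist or is not a regular file"]

    problems = []
    # The file needs the execute bit so the shell can start it.
    if not os.access(path, os.X_OK):
        problems.append("missing execute permission (run chmod +x on the script)")

    # The first line must be a shebang so the kernel knows which interpreter to use.
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append("first line is not a shebang such as #!/usr/bin/python")

    return problems
```

Calling check_cron_script("/home/pi/Nikita/The_Scraper/thescraper.py") should return an empty list once the script is ready for cron.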

2nd: PATH is the name of the environment variable that lists the directories the shell searches when a command is given without a path. It should contain only directories, not the path to your script.
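
To illustrate the second point, here is a small sketch that flags PATH entries which are not directories. The helper name non_directory_entries is an assumption made for this example, not part of cron or Python.

```python
import os


def non_directory_entries(path_value):
    """Return the entries of a PATH-style string that are not existing directories."""
    # PATH entries are separated by os.pathsep (":" on Linux); skip empty entries.
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]
```

Passed the value from the question, PATH=/home/pi/Nikita/The_Scraper/thescraper.py, it would report that single entry, because it names a file rather than a directory.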

3rd: The crontab is read from top to bottom, so a setting such as MAILTO only affects the job lines below it. It should be enough to use:

MAILTO="myemailaddress@gmail.com"
30 8 1 * * /home/pi/Nikita/The_Scraper/thescraper.py

This should run your script at 8:30 on the 1st day of every month.
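
To see what the schedule 30 8 1 * * means in practice, the next run time can be worked out by hand. This is only a rough sketch, not how cron itself is implemented: next_monthly_run is a hypothetical helper, and it only handles a day of the month that exists in every month (1 to 28).

```python
from datetime import datetime


def next_monthly_run(now, day=1, hour=8, minute=30):
    """Return the first datetime strictly after `now` matching 'minute hour day * *'."""
    candidate = now.replace(day=day, hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # This month's slot has already passed, so use the same slot next month.
        if now.month == 12:
            year, month = now.year + 1, 1
        else:
            year, month = now.year, now.month + 1
        candidate = candidate.replace(year=year, month=month)
    return candidate


# Using the time the question was posted as "now":
print(next_monthly_run(datetime(2021, 2, 15, 9, 24, 29)))  # prints 2021-03-01 08:30:00
```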

The MAILTO setting specifies the mail address that receives the job's output; here it is an external address. For that you need a properly configured MTA (the program that delivers mail) running. The content of that mail is the STDOUT and STDERR of the job.

To test, you could temporarily set the schedule to a time in the near future (say, 5 minutes from now) and see what happens. You can also redirect the output to a file; then you can see whether the job ran and what its output was, even if sending mail does not work.

30 8 1 * * /home/pi/Nikita/The_Scraper/thescraper.py > /some/test/file/where/your/user/can/write 2>&1
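
The temporary test entry described above can be generated rather than typed by hand. The sketch below is an illustration only: temporary_test_entry is a made-up helper, and the log file path in the usage line is a hypothetical location that your user must be able to write to.

```python
from datetime import datetime, timedelta


def temporary_test_entry(command, log_file, now, minutes_ahead=5):
    """Build a crontab line that runs `command` once, `minutes_ahead` minutes after `now`."""
    run_at = now + timedelta(minutes=minutes_ahead)
    # Fields: minute, hour, day of month, month, day of week (left as *).
    schedule = f"{run_at.minute} {run_at.hour} {run_at.day} {run_at.month} *"
    # "> file 2>&1" sends both STDOUT and STDERR to the log file.
    return f"{schedule} {command} > {log_file} 2>&1"


# Hypothetical log file path, for illustration only.
print(temporary_test_entry(
    "/home/pi/Nikita/The_Scraper/thescraper.py",
    "/home/pi/scraper_test.log",
    datetime.now(),
))
```

Paste the printed line into crontab -e, wait for the time to pass, inspect the log file, and then put the monthly schedule back.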

If you have access to the system logs, you can also check there whether the cron job was executed.
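
Searching the system log can be sketched as below. Both the log location and the log format are assumptions here: many Debian-style systems write cron activity to /var/log/syslog, but yours may use a different file or journald instead. The function names are hypothetical.

```python
def cron_log_entries(lines, script_path):
    """Return the log lines that mention both cron and the given script."""
    return [
        line.rstrip("\n")
        for line in lines
        if "cron" in line.lower() and script_path in line
    ]


def read_cron_entries(log_path, script_path):
    """Read `log_path` (an assumed location such as /var/log/syslog) and filter it."""
    with open(log_path, errors="replace") as log:
        return cron_log_entries(log, script_path)
```

For example, read_cron_entries("/var/log/syslog", "thescraper.py") would list the matching lines, provided that file exists and your user may read it.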
