I am writing a backup system in Python, with a Django front-end. I have decided to implement the scheduling in a slightly strange way - the client will poll the server (every 10 minutes or so), for a list of backups that need doing. The server will only respond when the time to backup is reached. This is to keep the system platform independent - so that I don't rely on cronjobs or suchlike. Therefore the Django front-end (which exposes an XML-RPC API) has to store the schedule in a database, and interpret that schedule to decide if a client should start backing up or not.
At present, the schedule is stored using 3 fields: days, hours and minutes. These are comma-separated lists of integers, representing the days of the week (0-6), hours of the day (0-23) and minutes of the hour (0-59). Deciding whether a client should start backing up is a horribly inefficient operation - Python must loop over all the days since a point 7 days in the past, then the hours, then the minutes. I have done some optimization to make sure it doesn't loop too much - but still!
This works relatively well, although the implementation is pretty ugly. The problem I have is how to display and interpret this information via the HTML form on the front-end. Currently I just have huge lists of multi-select fields, which obviously doesn't work well.
Can anyone suggest a different method for implementing the schedule that would be more efficient, and also easier to represent in an HTML form?
2 Answers
#1
Take a look at django-chronograph. It has a pretty nice interface for scheduling jobs at all sorts of intervals. You might be able to borrow some ideas from that. It relies on python-dateutil, which you might also find useful for specifying repeating events.
#2
Your question is a bit ambiguous. Do you mean: "Back up every Sunday, Monday and Friday at time X"?
If so, use a bitmask to store the recurring schedule as an integer:
Let's say that you want a backup as mentioned above, on Sundays, Mondays and Fridays. Encode the days of the week as an integer (shown here in binary, with Sunday as the most significant bit):
S M T W T F S
1 1 0 0 0 1 0 = 98
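As a minimal sketch of that encoding, the helpers below convert between a list of day names and the integer mask. The names `DAYS`, `encode_days` and `decode_days` are hypothetical and not part of any library; the bit layout simply follows the table above.

```python
# Hypothetical helpers; Sunday is the most significant of 7 bits,
# Saturday the least significant, matching the table above.
DAYS = ["sun", "mon", "tue", "wed", "thu", "fri", "sat"]


def encode_days(days):
    """Return the integer bitmask for an iterable of day names."""
    mask = 0
    for day in days:
        mask |= 1 << (6 - DAYS.index(day))
    return mask


def decode_days(mask):
    """Return the day names whose bits are set in mask, Sunday first."""
    return [day for i, day in enumerate(DAYS) if mask & (1 << (6 - i))]


print(encode_days(["sun", "mon", "fri"]))  # 98
print(decode_days(98))                     # ['sun', 'mon', 'fri']
```

This may also help with the HTML form: seven checkboxes could map onto `encode_days` when saving and `decode_days` when rendering, instead of a large multi-select.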
To find out if today (e.g. Friday) is a backup day, simply do a bitwise AND:
>>> 0b1100010 & 0b0000010 != 0
True
To get the current day as an integer, you need to offset it by one, since weekday() assumes the week starts on Monday:
from django.utils import timezone
current_day = (timezone.now().weekday() + 1) % 7  # Sunday=0 ... Saturday=6
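One step is left implicit here: the day index (Sunday = 0) still has to be turned into the matching bit before the bitwise AND. A rough sketch, assuming the Sunday-first layout from the table above and using the standard library's `datetime` instead of Django's `timezone` so it runs on its own; `day_bit` and `is_backup_day` are hypothetical names:

```python
import datetime


def day_bit(date):
    """Bit for the date's weekday: Sunday = 0b1000000 ... Saturday = 0b0000001."""
    current_day = (date.weekday() + 1) % 7  # Sunday=0 ... Saturday=6
    return 1 << (6 - current_day)


def is_backup_day(days_recurrence, date):
    """True if the weekday of date is set in the days_recurrence mask."""
    return (days_recurrence & day_bit(date)) != 0


friday = datetime.date(2024, 1, 5)   # a Friday
tuesday = datetime.date(2024, 1, 2)  # a Tuesday
print(is_backup_day(0b1100010, friday))   # True
print(is_backup_day(0b1100010, tuesday))  # False
```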
In summary, the schema for your Schedule object would look something like:
from django.db import models

class Schedule(models.Model):
    days_recurrence = models.PositiveSmallIntegerField(db_index=True)  # weekday bitmask
    time = models.TimeField()  # time of day to start the backup
With this schema, you would need a new Schedule object for each time of the day you would like to back up. The bitwise test itself is very cheap (a single AND), which should cut down your complexity considerably compared with looping over days, hours and minutes. One caveat: the index on days_recurrence gives an O(log n) lookup for exact mask values, but a bitwise test against the column generally cannot use an ordinary index, so treat the index as a help for exact-value queries only. If you want to squeeze more performance out of this, you can also store the hours as a bitmask and keep the minute as a separate field.
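The hour-bitmask idea could look roughly like the sketch below. Everything here is an assumption for illustration: the 24-bit layout (bit 0 = hour 0), the names `encode_hours` and `is_due`, and the idea of checking the window between two polls, which is borrowed from the 10-minute polling described in the question rather than from this answer.

```python
import datetime


def encode_hours(hours):
    """Return a 24-bit mask with bit h set for each hour h (0-23)."""
    mask = 0
    for h in hours:
        mask |= 1 << h
    return mask


def is_due(days_mask, hours_mask, minute, last_poll, now):
    """True if a scheduled minute falls in the window (last_poll, now]."""
    one_minute = datetime.timedelta(minutes=1)
    # Start at the first whole minute after the previous poll.
    t = last_poll.replace(second=0, microsecond=0) + one_minute
    while t <= now:
        day_bit = 1 << (6 - (t.weekday() + 1) % 7)  # Sunday = most significant bit
        if days_mask & day_bit and hours_mask & (1 << t.hour) and t.minute == minute:
            return True
        t += one_minute
    return False


days = 0b1100010               # Sunday, Monday, Friday
hours = encode_hours([2, 14])  # 02:xx and 14:xx
print(is_due(days, hours, 30,
             datetime.datetime(2024, 1, 5, 2, 25),
             datetime.datetime(2024, 1, 5, 2, 35)))  # True (Friday 02:30)
```

With a 10-minute polling interval the loop runs about ten times per check, instead of walking back over a whole week.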