4. Workflow logs: A common XML format
In this section we focus on the syntax and semantics of the information stored in the workflow log. We do this by presenting a tool-independent XML format that is used by each of the mining approaches/tools described in the remainder of this paper. Fig. 4 shows that this XML format connects transactional systems such as workflow management systems, ERP systems, CRM systems, and case handling systems. In principle, any system that registers events related to the execution of tasks for cases can use this tool-independent format to store and exchange logs. The XML format is used as input for the analysis tools presented in Sections 5–9. The goal of using a single format is to reduce the implementation effort and to promote the use of these mining techniques in multiple contexts.
Table 3 shows the Document Type Definition (DTD) [12] for workflow logs. This DTD specifies the syntax of a workflow log. A workflow log is a consistent XML document, i.e., a well-formed and valid XML file with top element WorkFlow_log (see Table 3). As shown, a workflow log consists of (optional) information about the source program and information about one or more workflow processes. Each workflow process (element process) consists of a sequence of cases (element case) and each case consists of a sequence of log lines (element log_line). Both processes and cases have an id and a description. Each line in the log contains the name of a task (element task_name). In addition, the line may contain information about the task instance (element task_instance), the type of event (element event), the date (element date), and the time of the event (element time).
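To make this structure concrete, the following sketch parses a small log with Python's standard xml.etree.ElementTree module. The element names WorkFlow_log, process, case, log_line, task_name, task_instance, event, date, and time come from the text above. Everything else is an assumption made for illustration, since Table 3 is not reproduced here: that id and description are attributes, that the event type is carried in a kind attribute, and all sample values. The optional source element is omitted.

```python
import xml.etree.ElementTree as ET

# Illustrative log. Element names follow the text; the id/description
# attributes, the "kind" attribute on <event>, and all values are assumptions.
SAMPLE_LOG = """\
<WorkFlow_log>
  <process id="p1" description="example process">
    <case id="c1" description="example case">
      <log_line>
        <task_name>A</task_name>
        <event kind="normal"/>
        <date>01-01-2002</date>
        <time>09:00</time>
      </log_line>
      <log_line>
        <task_name>B</task_name>
        <task_instance>1</task_instance>
        <event kind="normal"/>
      </log_line>
    </case>
  </process>
</WorkFlow_log>
"""

def read_log(xml_text):
    """Return {(process id, case id): [(task_name, event kind), ...]}."""
    root = ET.fromstring(xml_text)
    traces = {}
    for process in root.iter("process"):
        for case in process.iter("case"):
            lines = []
            for line in case.iter("log_line"):
                task = line.findtext("task_name")
                event = line.find("event")
                # The event element is optional; record None when it is absent.
                kind = event.get("kind") if event is not None else None
                lines.append((task, kind))
            traces[(process.get("id"), case.get("id"))] = lines
    return traces

traces = read_log(SAMPLE_LOG)
```

Note that ElementTree only checks well-formedness; validating a log against the DTD would require a validating parser.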
We advise making the process description and the case description unique for each process and case, respectively. The task name should be a unique identifier for a task within a process. If there are two or more tasks in a process with the same task name, they are assumed to refer to the same task. (For example, in Staffware it is possible to have two tasks with different names but the same description. Since the task description and not the task name appears in the log, this can lead to confusion.) Although we assume tasks to have a unique name, there may be multiple instances of the same task. Consider, for example, a loop which causes a task to be executed multiple times for a given case. Therefore, one can add the element task_instance to a log line. This element will typically be a number, e.g., if task A is executed for the fifth time, element task_name is “A” and element task_instance is “5”. The date and time elements are also optional. The date element must be in the following format: dd-mm-yyyy. So each date consists of exactly 10 characters, of which there are 2 for the day, 2 for the month (i.e., 01 for January), and 4 for the year. The time element must be in the following format: HH:MM(:ss(:mmmmmm)). So each time element consists of five, eight, or fifteen characters (separators included), of which there are two for the hour (00–23), two for the minutes (00–59), optionally two for the seconds (00–59), and again optionally six for the fraction of a second that has passed. The complete log has to be sorted in the following way: per case, all log entries have to appear in the order in which they took place.
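The date and time formats can be checked mechanically. The sketch below is one possible check using regular expressions; the function names are hypothetical, and the patterns only test layout and value ranges (they do not reject impossible dates such as 31-02-2002). The sort_key helper shows one way to compare entries chronologically when verifying the per-case ordering.

```python
import re

# dd-mm-yyyy: day 01-31, month 01-12, four-digit year.
DATE_RE = re.compile(r"(0[1-9]|[12]\d|3[01])-(0[1-9]|1[0-2])-\d{4}")
# HH:MM(:ss(:mmmmmm)): seconds and the six-digit fraction are optional.
TIME_RE = re.compile(r"([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(:\d{6})?)?")

def is_valid_date(text):
    return DATE_RE.fullmatch(text) is not None

def is_valid_time(text):
    return TIME_RE.fullmatch(text) is not None

def sort_key(date, time):
    """Chronological sort key; assumes both strings were validated first."""
    day, month, year = date.split("-")
    parts = time.split(":")
    # Pad missing seconds/fraction so short and long forms compare consistently.
    parts += ["00", "000000"][len(parts) - 2:]
    return (year, month, day, *parts)

def is_sorted(entries):
    """entries: list of (date, time) pairs for one case, in log order."""
    keys = [sort_key(d, t) for d, t in entries]
    return keys == sorted(keys)
```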
If information is not available, one can enter default values. For example, when storing the information shown in Table 2 the event type will be set to normal and the date and time will be set to some arbitrary value. Note that it is also fairly straightforward to map the Staffware log of Table 1 onto the XML format.
Table 3 specifies the syntax of the XML file. The semantics of most constructs are self-explanatory except for the element event, i.e., the type of event. We identify eight event types: normal, schedule, start, withdraw, suspend, resume, abort, and complete. To explain these event types we use the FSM shown in Fig. 5. The FSM describes all possible states of a task from creation to completion. The arrows in this figure describe all possible transitions between states, and we assume these transitions to be atomic events (i.e., events that take no time). State New is the state in which the task starts. From this state only the event Schedule is possible. This event occurs when the task becomes ready to be executed (i.e., enabled or scheduled). The resulting state is the state Scheduled. In this state the task is typically in the worklist of one or more workers. From state Scheduled two events are possible: Start and Withdraw. If a task is withdrawn, it is deleted from the worklist and the resulting state is Terminated. If the task is started, it is also removed from the worklist but the resulting state is Active. In state Active the actual processing of the task takes place. If the processing is successful, the task is moved to state Completed via event Complete. If for some reason it is not possible to complete, the task can be moved to state Terminated via an event of type Abort. In state Active it is also possible to suspend a task (event Suspend). Suspended tasks (i.e., tasks in state Suspended) can move back to state Active via the event Resume.
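The life cycle just described can be written down as a transition table. The following is a minimal sketch that encodes only the transitions named in the text; the function name replay and the lowercase event spellings are choices made for this illustration.

```python
# (current state, event) -> next state, as described in the text.
TRANSITIONS = {
    ("New", "schedule"): "Scheduled",
    ("Scheduled", "start"): "Active",
    ("Scheduled", "withdraw"): "Terminated",
    ("Active", "complete"): "Completed",
    ("Active", "abort"): "Terminated",
    ("Active", "suspend"): "Suspended",
    ("Suspended", "resume"): "Active",
}

def replay(events, state="New"):
    """Apply events in order and return the final state.

    Raises ValueError if an event is not allowed in the current state.
    """
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state
```

Replaying the events logged for one task instance in this way is a simple consistency check: a sequence such as schedule, start, suspend, resume, complete ends in Completed, whereas a start without a preceding schedule is rejected.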
The events shown in Fig. 5 are at a more fine-grained level than the events shown in Table 2. Sometimes it is convenient to simply consider tasks as atomic events which do not take any time and always complete successfully. For this purpose, we use the event type Normal, which is not shown in Fig. 5. Events of type Normal can be considered as the execution of events Schedule, Start, and Complete in one atomic action.
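Read this way, an event of type Normal is shorthand for three life-cycle events. A small sketch of that expansion follows; the function name and the (task, event type) tuple representation are assumptions of this example.

```python
def expand_normal(events):
    """Replace each (task, "normal") entry by schedule, start, complete."""
    expanded = []
    for task, event in events:
        if event == "normal":
            expanded.extend((task, e) for e in ("schedule", "start", "complete"))
        else:
            expanded.append((task, event))
    return expanded
```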
Some systems log events at an even more fine-grained level than Fig. 5. Other systems only log some of the events shown in Fig. 5. Moreover, the naming of the event types is typically different. As an example, we consider the workflow management system Staffware and the log shown in Table 1. Staffware records the Schedule, Withdraw, and Complete events. These events are named “Processed To”, “Withdrawn”, and “Released By”, respectively (see Table 1). Staffware does not record start, suspend, resume, and abort events. Moreover, it records event types not in Fig. 5, e.g., “Start”, [2] “Expired”, and “Terminated”. When mapping Staffware logs onto the XML format, one can choose to simply filter out these events or to map them onto events of type normal.
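The Staffware mapping described above amounts to a lookup table plus a policy for the remaining event types. The sketch below illustrates both options; the function name, the unknown parameter, and the (task, event name) input representation are assumptions, since the raw log layout of Table 1 is not reproduced here.

```python
# Staffware event name -> common event type, as stated in the text.
STAFFWARE_TO_COMMON = {
    "Processed To": "schedule",
    "Withdrawn": "withdraw",
    "Released By": "complete",
}

def map_staffware(events, unknown="filter"):
    """Map (task, Staffware event name) pairs to common event types.

    unknown="filter" drops event types not covered by the table;
    unknown="normal" maps them onto events of type normal.
    """
    if unknown not in ("filter", "normal"):
        raise ValueError("unknown must be 'filter' or 'normal'")
    mapped = []
    for task, name in events:
        if name in STAFFWARE_TO_COMMON:
            mapped.append((task, STAFFWARE_TO_COMMON[name]))
        elif unknown == "normal":
            mapped.append((task, "normal"))
        # With unknown="filter" the event is skipped.
    return mapped
```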
The FSM representing the potential event orders recorded by IBM MQSeries Workflow is quite different from the common FSM shown in Fig. 5. (For more details see [33].) Therefore, we need a mapping of events and event sequences originating from MQSeries into events of the common FSM. MQSeries records corresponding events for all events represented in the common FSM. Because the MQSeries FSM has more states and transitions than the common FSM, there are sets of events that must be mapped into a single event of the common FSM. The most frequent event sequence in MQSeries is Activity ready–Activity started–Activity implementation completed. This sequence is logged whenever an activity is executed without any exceptions or complications. It is mapped into Schedule–Start–Complete. Furthermore, there are many special cases. For example, an activity may be cancelled while in state Scheduled. The order of events in the common FSM is Schedule–Withdraw. The equivalent first event in the MQSeries FSM is Activity ready. The second event could be Activity inError, Activity expired, User issued a terminate command, Activity force-finished, or Activity terminated. So, a sequence with Activity ready as its first part and one of these five events as its second part is mapped into Schedule–Withdraw. Another special case is that an activity may be cancelled while running, i.e., while it is in state Active. MQSeries logs this case in the form of a sequence starting with Activity ready, proceeding with Activity started, and ending with one of the five events specified above. Such a sequence is mapped into Schedule–Start–Abort. Besides these examples, there are many more cases that have to be handled. The tool QuaXMap (MQSeries Audit Trail XML Mapper [53]) implements the complete mapping.
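The three sequence mappings spelled out above can be sketched as follows. This covers only the cases named in the text and is not the QuaXMap implementation; the function name and the list-of-strings representation of one activity's event sequence are assumptions.

```python
# The five MQSeries events that signal cancellation, as listed in the text.
CANCEL_EVENTS = {
    "Activity inError",
    "Activity expired",
    "User issued a terminate command",
    "Activity force-finished",
    "Activity terminated",
}

def map_mqseries(sequence):
    """Map one activity's MQSeries event sequence to common FSM events.

    Returns None for sequences not covered by the three cases in the text.
    """
    seq = list(sequence)
    if seq == ["Activity ready", "Activity started",
               "Activity implementation completed"]:
        return ["schedule", "start", "complete"]
    if len(seq) == 2 and seq[0] == "Activity ready" and seq[1] in CANCEL_EVENTS:
        # Cancelled while still in state Scheduled.
        return ["schedule", "withdraw"]
    if (len(seq) == 3 and seq[:2] == ["Activity ready", "Activity started"]
            and seq[2] in CANCEL_EVENTS):
        # Cancelled while running, i.e., in state Active.
        return ["schedule", "start", "abort"]
    return None
```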
Let us return to Fig. 4. At this moment, we have developed translations from the log files of workflow management systems Staffware (Staffware PLC [58]), InConcert (TIBCO [60]), and MQSeries Workflow (IBM [32]) to our XML format. In the future, we plan to provide more translations from a wide range of systems (ERP, CRM, case handling, and B2B systems). Experience shows that it is also fairly simple to extract information from enterprise-specific information systems and translate this to the XML format (as long as the information is there). Fig. 4 also shows some of the mining tools available. These tools support the approaches presented in the remainder of this paper and can all read the XML format.
[2] The start event in Staffware denotes the creation of a case and should not be confused with the start event in Fig. 5.