Viewing Job History in PBS

Date: 2021-07-14 21:54:45

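By default, qstat in PBS does not show history jobs, that is, jobs that have already finished. To query them, change the PBS server configuration and set job_history_enable=True. The commands are: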

# qmgr
Qmgr: set server job_history_enable=True
Qmgr: set server job_history_duration=336:00:00

The commands above keep history jobs for two weeks (336 hours). The duration format is [[hours:]minutes:]seconds[.milliseconds].
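To check a value before passing it to job_history_duration, the duration format above can be converted to seconds. The following is a minimal Python sketch; the function name `duration_to_seconds` is hypothetical and not part of PBS, and the parser only covers the format as it is written above.

```python
def duration_to_seconds(text: str) -> float:
    """Convert a [[hours:]minutes:]seconds[.milliseconds] string to seconds.

    Assumption: this follows the format quoted in the article; it is an
    illustrative helper, not something shipped with PBS.
    """
    parts = text.strip().split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected at most three ':'-separated fields: {text!r}")
    # The last field is seconds and is the only one that may be fractional.
    *leading, seconds = parts
    total = float(seconds)
    # Walk the remaining fields right to left: minutes first, then hours.
    for field, scale in zip(reversed(leading), (60, 3600)):
        total += int(field) * scale
    return total


two_weeks = duration_to_seconds("336:00:00")
print(two_weeks / 86400)  # number of days
```

For the value used here, 336:00:00 is 336 hours, which is 1209600 seconds or 14 days.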

Once this is configured, the following command lists history jobs. In the S column, F marks a finished job.

$ qstat -x
Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
6.pbsmaster STDIN admin 00:00:00 F workq
8.pbsmaster STDIN admin 00:00:00 F workq
9[].pbsmaster STDIN admin 0 F workq
10.pbsmaster STDIN admin 00:00:00 F workq
11.pbsmaster STDIN admin 00:00:00 F workq
15[].pbsmaster STDIN admin 0 F workq
16.pbsmaster STDIN admin 00:00:00 F workq
17.pbsmaster STDIN admin 00:00:00 F workq
18.pbsmaster STDIN admin 0 F workq
19.pbsmaster STDIN admin 00:00:00 F workq
20.pbsmaster STDIN admin 0 F workq
21.pbsmaster STDIN admin 00:00:00 F workq
22.pbsmaster STDIN admin 00:00:00 F workq
23[].pbsmaster STDIN admin 0 F workq
26.pbsmaster STDIN admin 00:00:00 F workq
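If the listing needs further processing, for example to collect the IDs of finished jobs, the table can be split into fields. The sketch below is in Python and works on a few rows copied from the output above. The names `JobRow` and `parse_qstat_table` are hypothetical, and the parser assumes that no column value contains a space.

```python
from typing import List, NamedTuple


class JobRow(NamedTuple):
    job_id: str
    name: str
    user: str
    time_use: str
    state: str
    queue: str


def parse_qstat_table(output: str) -> List[JobRow]:
    """Parse the default `qstat -x` table into rows.

    Assumption: every value is free of whitespace, so each job line splits
    into exactly six fields.
    """
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # The header splits into eight words, so the length check drops it;
        # the dashed separator has six fields and is dropped explicitly.
        if len(fields) != 6 or set(fields[0]) == {"-"}:
            continue
        rows.append(JobRow(*fields))
    return rows


SAMPLE = """\
Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
6.pbsmaster STDIN admin 00:00:00 F workq
9[].pbsmaster STDIN admin 0 F workq
22.pbsmaster STDIN admin 00:00:00 F workq
"""

rows = parse_qstat_table(SAMPLE)
finished = [row.job_id for row in rows if row.state == "F"]
print(finished)
```

Job arrays such as 9[].pbsmaster come through as ordinary rows, with the brackets kept in the job ID.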

You can also view the full details of a history job:

# qstat -x -f 22
Job Id: 22.pbsmaster
Job_Name = STDIN
Job_Owner = admin@pbsmaster
resources_used.cpupercent = 0
resources_used.cput = 00:00:00
resources_used.mem = 11852kb
resources_used.ncpus = 16
resources_used.vmem = 53280kb
resources_used.walltime = 00:02:00
job_state = F
queue = workq
server = pbsmaster
Checkpoint = u
ctime = Mon Oct 10 11:02:03 2016
Error_Path = pbsmaster:/home/admin/STDIN.e22
exec_host = pbsmaster/0*8+pbsslave/0*8
exec_vnode = (pbsmaster:ncpus=8)+(pbsslave:ncpus=8)
Hold_Types = n
Join_Path = n
Keep_Files = n
Mail_Points = a
mtime = Mon Oct 10 11:05:21 2016
Output_Path = pbsmaster:/home/admin/STDIN.o22
Priority = 0
qtime = Mon Oct 10 11:02:03 2016
Rerunable = True
Resource_List.ncpus = 16
Resource_List.nodect = 2
Resource_List.place = free
Resource_List.select = 2:ncpus=8
schedselect = 2:ncpus=8
stime = Mon Oct 10 11:03:20 2016
session_id = 3383
jobdir = /home/admin
substate = 92
Variable_List = PBS_O_HOME=/home/admin,PBS_O_LANG=en_US.UTF-8,
PBS_O_LOGNAME=admin,
PBS_O_PATH=/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt
/pbs/bin:/home/admin/.local/bin:/home/admin/bin,
PBS_O_MAIL=/var/spool/mail/admin,PBS_O_SHELL=/bin/bash,
PBS_O_WORKDIR=/home/admin,PBS_O_SYSTEM=Linux,PBS_O_QUEUE=workq,
PBS_O_HOST=pbsmaster
euser = admin
egroup = admin
hashname = 22.pbsmaster
queue_rank = 15
queue_type = E
comment = Job run at Mon Oct 10 at 11:03 on (pbsmaster:ncpus=8)+(pbsslave:n
cpus=8) and finished
etime = Mon Oct 10 11:02:03 2016
run_count = 1
Stageout_status = 1
Exit_status = 0
Submit_arguments = -lselect=2:ncpus=8
history_timestamp = 1476097521
project = _pbs_project_default
run_version = 1
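Long values such as comment and Variable_List are wrapped across several lines in this output, which makes them awkward to search. The Python sketch below joins them back together. It assumes that, in the terminal, attribute lines are indented with spaces and wrapped continuation lines start with a tab; that indentation is not visible in the listing above, so check it against your own output. The name `parse_qstat_full` is hypothetical.

```python
def parse_qstat_full(output: str) -> dict:
    """Parse `qstat -x -f` output for one job into a dict of strings.

    Assumption: continuation lines of a wrapped value begin with a tab,
    while "name = value" lines are indented with spaces.
    """
    attrs = {}
    key = None
    for line in output.splitlines():
        if line.startswith("\t") and key is not None:
            # Wrapped value: drop only the leading tab and append as-is.
            attrs[key] += line[1:]
        elif line.startswith("Job Id:"):
            key = "Job Id"
            attrs[key] = line.split(":", 1)[1].strip()
        elif " = " in line:
            key, value = line.strip().split(" = ", 1)
            attrs[key] = value
    return attrs


# A shortened copy of the output above, with the assumed indentation.
FULL_SAMPLE = (
    "Job Id: 22.pbsmaster\n"
    "    Job_Name = STDIN\n"
    "    job_state = F\n"
    "    comment = Job run at Mon Oct 10 at 11:03 on (pbsmaster:ncpus=8)+(pbsslave:n\n"
    "\tcpus=8) and finished\n"
    "    Exit_status = 0\n"
)

attrs = parse_qstat_full(FULL_SAMPLE)
print(attrs["comment"])
```

All values stay strings, so Exit_status comes back as "0" and needs an explicit int() if it is compared numerically.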

The printjob command (run as root here) also shows the details of a history job:

# printjob 22
---------------------------------------------------
jobid: 22.pbsmaster
---------------------------------------------------
state: 0x9
substate: 0x5c (92)
svrflgs: 0x21 (33)
ordering: 0
inter prior: 0
stime: 1476097400
file base:
queue: workq
union type exec:
momaddr 18446744072301379588
exits 0
--attributes--
hashname = 22.pbsmaster
comment = Job run at Mon Oct 10 at 11:03 on (pbsmaster:ncpus=8)+(pbsslave:ncpus=8) and finished
run_count = 1
job_kill_delay = 10
run_version = 1
Job_Name = STDIN
Job_Owner = admin@pbsmaster
job_state = F
queue = workq
server = pbsmaster
Checkpoint = u
ctime = 1476097323
Error_Path = pbsmaster:/home/admin/STDIN.e22
Hold_Types = n
Join_Path = n
Keep_Files = n
Mail_Points = a
mtime = 1476097521
Output_Path = pbsmaster:/home/admin/STDIN.o22
Priority = 0
qtime = 1476097323
Rerunable = True
Resource_List.ncpus = 16
Resource_List.nodect = 2
Resource_List.place = free
Resource_List.select = 2:ncpus=8
schedselect = 2:ncpus=8
substate = 92
Variable_List = PBS_O_HOME=/home/admin,PBS_O_LANG=en_US.UTF-8,PBS_O_LOGNAME=admin,PBS_O_PATH=/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/pbs/bin:/home/admin/.local/bin:/home/admin/bin,PBS_O_MAIL=/var/spool/mail/admin,PBS_O_SHELL=/bin/bash,PBS_O_WORKDIR=/home/admin,PBS_O_SYSTEM=Linux,PBS_O_QUEUE=workq,PBS_O_HOST=pbsmaster
euser = admin
egroup = admin
hop_count = 1
queue_rank = 15
queue_type = E
etime = 1476097323
Submit_arguments = <jsdl-hpcpa:Argument>-lselect=2:ncpus=8</jsdl-hpcpa:Argument>
project = _pbs_project_default
exec_host = pbsmaster/0*8+pbsslave/0*8
exec_host2 = pbsmaster:15002/0*8+pbsslave:15002/0*8
exec_vnode = (pbsmaster:ncpus=8)+(pbsslave:ncpus=8)
resources_used.cpupercent = 0
resources_used.cput = 00:00:00
resources_used.mem = 11852kb
resources_used.ncpus = 16
resources_used.vmem = 53280kb
resources_used.walltime = 00:02:00
stime = 1476097400
session_id = 3383
jobdir = /home/admin
Stageout_status = 1
Exit_status = 0
history_timestamp = 1476097521
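Unlike qstat, printjob prints ctime, qtime, stime, mtime and history_timestamp as seconds since the Unix epoch. The Python sketch below converts them into the date format that qstat -f uses. It assumes the server clock is in UTC, which is what makes the values above line up with the qstat output; on a server in another time zone, pass that zone instead. The function name `epoch_to_pbs_time` is hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_pbs_time(epoch: int) -> str:
    """Format epoch seconds like the dates in `qstat -f` output.

    Assumption: the PBS server reports local time in UTC.
    """
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%a %b %d %H:%M:%S %Y")


# Values copied from the printjob output above.
timestamps = {"ctime": 1476097323, "stime": 1476097400, "mtime": 1476097521}
for name, epoch in timestamps.items():
    print(name, "=", epoch_to_pbs_time(epoch))
```

With the UTC assumption, mtime 1476097521 becomes Mon Oct 10 11:05:21 2016, the same value that qstat -x -f reports for this job.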

If you repost this article, please credit it with a link to the original.
Article link: http://blog.csdn.net/kongxx/article/details/52790801