I have an R script that performs an analysis on one chromosome. I want to run it once for each chromosome (1-22, X and Y). The script currently accepts one command-line argument, the chromosome number. Since the analysis for a single chromosome takes a few hours, I want to submit the jobs to my server (a Sun Grid Engine server) in parallel. After trying a few options and searching around, I'm still not sure what the best approach is, as I've never submitted parallel jobs to a server before. I looked into GNU parallel, but I'm not sure how to use it or whether it even works with R scripts. Should I just put everything in a shell script and submit that to the server? This is a pretty basic question, but any direction would be greatly appreciated!
2 Answers
#1
parallel Rscript plot_LRR_BAF_chromosome_parallel ::: {1..22} X Y
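Since the question mentions Sun Grid Engine, the same fan-out could also be written as an SGE array job, where the scheduler starts one task per chromosome. This is only a sketch under assumptions: the job name `chrom_analysis` and the helper `chrom_for_task` are made up for illustration, and the script name is taken from the command above. It relies on the standard `-t` array option and the `SGE_TASK_ID` variable that SGE sets for each task.

```shell
#!/bin/sh
#$ -N chrom_analysis   # hypothetical job name
#$ -cwd                # run in the directory the job was submitted from
#$ -t 1-24             # array job: one task per chromosome

# Map an array task index (1-24) to a chromosome name (1-22, X, Y).
chrom_for_task() {
    case "$1" in
        23) echo X ;;
        24) echo Y ;;
        *)  echo "$1" ;;
    esac
}

# SGE sets SGE_TASK_ID for array tasks (and to "undefined" for non-array jobs),
# so the analysis only runs when a real task index is present.
if [ -n "${SGE_TASK_ID:-}" ] && [ "$SGE_TASK_ID" != "undefined" ]; then
    Rscript plot_LRR_BAF_chromosome_parallel "$(chrom_for_task "$SGE_TASK_ID")"
fi
```

Saved under a name such as `run_chrom.sh` (hypothetical), this would be submitted once with `qsub run_chrom.sh`, and the scheduler decides how many of the 24 tasks run at the same time.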
#2
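Use GNU make with the -j option. Put a __CHROM__ placeholder in your R script wherever the chromosome name is needed; the Makefile below replaces it with the actual chromosome name for each job.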
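# One output file per chromosome (1-22, X and Y).
chroms=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y
# Rule template: $(1) is the chromosome name; $$ defers expansion until $(eval).
define method1
$(1).out : script.R
	sed 's/__CHROM__/$(1)/g' $$< | R --no-save > $$@
endef
.PHONY: all
all: $(addsuffix .out,$(chroms))
# Instantiate the rule once per chromosome. No spaces around $(C): call keeps them.
$(foreach C,$(chroms),$(eval $(call method1,$(C))))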
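To make the sed step in the recipe concrete, here is a small sketch of what it does to the script text before R reads it. The one-line `template` is a hypothetical stand-in for `script.R`, which the answer does not show, and `render_chrom` is a helper name invented for this illustration.

```shell
# Hypothetical one-line stand-in for script.R (the real script is not shown in the answer).
template='chrom <- "__CHROM__"'

# Replace every __CHROM__ placeholder with the given chromosome name,
# mirroring the sed command in the Makefile recipe.
render_chrom() {
    printf '%s\n' "$template" | sed "s/__CHROM__/$1/g"
}

# Print the rendered line for a few chromosomes.
for c in 1 X Y; do
    render_chrom "$c"
done
```

With the Makefile in place, running something like `make -j 8` lets make run up to eight of these per-chromosome jobs at once; the number is a choice that depends on how many cores the machine has.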