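Converting InterProScan results to another format
Oddly, the TSV output contains no GO, KEGG, or InterPro domain information, while the XML file does. TSV is easier to process, so the XML file is first converted to TSV using the tool bundled with the software: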
The convert mode is designed to work only for XML documents created with the same version. This makes sure we can introduce new schema updates in the future. However, the XML schema is stable and will only change if, for instance, we need to add new features.
You can use InterProScan 5's CONVERT mode to reformat your XML result file into any other possible output format (TSV, GFF3, SVG and HTML). For compatibility reasons you can also convert XML results into InterProScan 4.8 raw format. This will give our users enough time to migrate their pipeline to InterProScan 5.
Please note it is NOT possible to reformat any non-XML format. XML is the richest data type and is therefore the only format which allows us to produce any other format of interest.
To enable InterProScan 5 to run in CONVERT mode you need to set the mode option to 'CONVERT'.
Usage instructions
./interproscan.sh -mode convert
You will see the following usage instructions:
Welcome to InterProScan v5 CONVERT mode.
usage: java -XX:+UseParallelGC -XX:+AggressiveOpts -XX:+UseFastAccessorMethods -Xms512M -Xmx2048M -jar interproscan-5.jar

Please give us your feedback by sending an email to interhelp@ebi.ac.uk

 -b,--output-file-base <OUTPUT-FILE-BASE>
        Optional, base output filename. Note that this option and the --outfile (-o) option are mutually exclusive. The appropriate file extension for the output format(s) will be appended automatically. By default the input file path/name will be used.
 -d,--output-dir <OUTPUT-DIR>
        Optional, output directory. Note that this option and the --outfile (-o) option or the --output-file-base (-b) option are mutually exclusive. The appropriate file extension for the output format(s) will be appended automatically. By default the input file path/name will be used.
 -f,--formats <OUTPUT-FORMATS>
        Optional, case-insensitive, comma separated list of output formats. Available formats are TSV, GFF3 (default set) and RAW (InterProScan 4 TSV), HTML, SVG.
 -i,--xml <XML-FILE-PATH>
        Mandatory, path to the IMPACT XML file that should be loaded and converted.
 -o,--outfile <EXPLICIT_OUTPUT_FILENAME>
        Optional explicit output file name. Note that this option and the --output-file-base (-b) option are mutually exclusive. If this option is given, you MUST specify a single output format using the -f option. The output file name will not be modified. Note that specifying an output file name using this option OVERWRITES ANY EXISTING FILE.
 -T,--tempdir <TEMP-DIR>
        Optional, specify temporary file directory. The default location is /temp.

Actual usage:
/share/bioinfo/miaochenyong/interproscan-software/tars/interproscan-5.7-48.0/interproscan.sh -mode convert -f tsv -i your_xml_file
If the input file is test.xml, the default output file is test.xml.tsv.
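When there are many XML result files, the same command can be wrapped in a loop. The following is only a rough sketch, not part of InterProScan: the function name convert_xml_dir and the IPRSCAN variable are made up for illustration, and it assumes the default naming described above (input.xml becomes input.xml.tsv). Only the -mode, -f and -i options shown earlier are used.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert every InterProScan XML file in a directory to TSV.
# IPRSCAN is an assumed variable; point it at your own interproscan.sh.
IPRSCAN="${IPRSCAN:-./interproscan.sh}"

convert_xml_dir() {
    local dir="$1"
    local xml
    for xml in "$dir"/*.xml; do
        # If the glob matched nothing, the pattern stays literal; skip it.
        [ -e "$xml" ] || continue
        # Same call as the single-file example; stop at the first failure.
        "$IPRSCAN" -mode convert -f tsv -i "$xml" || return 1
        # With the default naming, the result is written next to the input.
        echo "converted: $xml -> $xml.tsv"
    done
}
```

Calling convert_xml_dir /path/to/xml_dir then converts each *.xml file in that directory in turn. Remember that convert mode only works on XML produced by the same InterProScan version.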
by freemao
FAFU