Downloading genomes on Linux: how to batch-download genomes from NCBI

Date: 2024-12-16 20:15:23

First, download the assembly summary files

The assembly_summary files report metadata for the genome assemblies on the NCBI genomes FTP site.

Four master files reporting data for either GenBank or RefSeq genome assemblies are available under ftp:///genomes/ASSEMBLY_REPORTS/

assembly_summary_genbank.txt - current GenBank genome assemblies

assembly_summary_genbank_historical.txt - replaced and suppressed GenBank genome assemblies

assembly_summary_refseq.txt - current RefSeq genome assemblies

assembly_summary_refseq_historical.txt - replaced and suppressed RefSeq genome assemblies

assembly_summary_genbank.txt and assembly_summary_genbank_historical.txt are also available at:

ftp:///genomes/genbank/assembly_summary_genbank.txt

ftp:///genomes/genbank/assembly_summary_genbank_historical.txt

assembly_summary_refseq.txt and assembly_summary_refseq_historical.txt are also available at:

ftp:///genomes/refseq/assembly_summary_refseq.txt

ftp:///genomes/refseq/assembly_summary_refseq_historical.txt

The assembly_summary.txt files in the directories named for taxonomic groups or species contain the relevant subsets of the data from the master files.

Separate summary files for individual groups (bacteria, fungi, viral, etc.) can also be downloaded from ftp:///genomes/genbank/

The same per-group summary files are available for RefSeq from ftp:///genomes/refseq/
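Once a summary file has been downloaded, it can be read as a plain tab-separated table. The sketch below is one possible way to do that in Python, not an official parser. It assumes that comment lines start with "#" and that the last comment line before the data names the columns. The sample text, the accessions, the example.org host, and the function name read_summary are all made up for illustration, and only a few columns are shown.

```python
import io

# Hypothetical excerpt in the layout assumed here: tab-separated fields,
# comment lines starting with "#", the last of which names the columns.
SAMPLE = (
    "# hypothetical comment line\n"
    "#assembly_accession\torganism_name\tassembly_level\tftp_path\n"
    "GCF_000000001.1\tExample bacterium\tComplete Genome\t"
    "ftp://example.org/genomes/all/GCF_000000001.1_ASM1v1\n"
    "GCF_000000002.1\tExample fungus\tScaffold\tna\n"
)


def read_summary(handle):
    """Return the data rows of a summary file as a list of dicts."""
    header = None
    rows = []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Only comment lines that precede the data can be the header.
            if not rows:
                header = line.lstrip("#").strip().split("\t")
            continue
        if header is None:
            raise ValueError("no header line found before the first data row")
        rows.append(dict(zip(header, line.split("\t"))))
    return rows


rows = read_summary(io.StringIO(SAMPLE))
complete = [r for r in rows if r["assembly_level"] == "Complete Genome"]
```

For a real file, open it with open(path, encoding="utf-8") and pass the handle to read_summary in place of the StringIO object. Filtering on a column such as assembly_level, as in the last line, narrows the list before anything is downloaded.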

The ftp_path column of the summary file gives the location from which each genome and its related files can be downloaded.
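As a rough illustration of this step, the sketch below turns an ftp_path value into the URL of one file in that directory. It relies on two assumptions that are not stated above: that each file in an assembly directory is named after the directory itself plus a suffix such as _genomic.fna.gz, and that rows without a directory carry the placeholder na. The function name genome_fasta_url and the example.org path are hypothetical.

```python
import posixpath


def genome_fasta_url(ftp_path, suffix="_genomic.fna.gz"):
    """Build a file URL from an ftp_path value, or return None if unusable.

    Assumes the file name is the directory's last path component plus
    `suffix`; this naming convention is an assumption of this sketch.
    """
    if not ftp_path or ftp_path == "na":
        return None
    directory = ftp_path.rstrip("/")
    name = posixpath.basename(directory)
    return f"{directory}/{name}{suffix}"


# Hypothetical ftp_path values, as they might appear in the ftp_path column.
paths = [
    "ftp://example.org/genomes/all/GCF_000000001.1_ASM1v1",
    "na",
]
urls = [u for u in map(genome_fasta_url, paths) if u is not None]
```

Writing the resulting URLs to a text file, one per line, lets a downloader such as wget fetch them all in one run with its -i option. Passing a different suffix selects another file from the same directory, provided a file with that suffix exists there.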