001. Download a test GFF file, using the goat GFF annotation file as an example
[root@pc1 test]# wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/001/704/415/GCF_001704415.2_ARS1.2/GCF_001704415.2_ARS1.2_genomic.gff.gz
[root@pc1 test]# ls
GCF_001704415.2_ARS1.2_genomic.gff.gz
[root@pc1 test]# gunzip GCF_001704415.2_ARS1.2_genomic.gff.gz    ## decompress
[root@pc1 test]# ls
GCF_001704415.2_ARS1.2_genomic.gff
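Before converting, it can help to see which feature types the annotation contains. The following is only a minimal sketch, assuming a tab-delimited GFF3 file; the helper name `gff_feature_counts` and the tiny `demo.gff` file are made up for illustration and are not real goat data.

```shell
# Hypothetical helper: count feature types (column 3) in a GFF file,
# skipping comment and directive lines that start with '#'.
gff_feature_counts() {
    awk -F'\t' '!/^#/ && NF >= 9 { n[$3]++ } END { for (t in n) print t "\t" n[t] }' "$1" | sort
}

# Tiny made-up GFF3 file for demonstration (not real annotation data).
printf '##gff-version 3\n' > demo.gff
printf 'chr1\tdemo\tgene\t1\t100\t.\t+\t.\tID=gene1\n' >> demo.gff
printf 'chr1\tdemo\tmRNA\t1\t100\t.\t+\t.\tID=rna1;Parent=gene1\n' >> demo.gff
printf 'chr1\tdemo\texon\t1\t50\t.\t+\t.\tParent=rna1\n' >> demo.gff
printf 'chr1\tdemo\texon\t60\t100\t.\t+\t.\tParent=rna1\n' >> demo.gff

# Print one line per feature type with its count.
gff_feature_counts demo.gff
```

On the real file, the same function would be called as `gff_feature_counts GCF_001704415.2_ARS1.2_genomic.gff`.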
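002. Download the gffread software directly from GitHub
01. Search for gffread
02. Open the repository link
03. Click Releases on the right
04. Download the latest version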
wget https://github.com/gpertea/gffread/releases/download/v0.12.7/gffread-0.12.7.Linux_x86_64.tar.gz
[root@pc1 gffread]# ls
gffread-0.12.7.Linux_x86_64.tar.gz
[root@pc1 gffread]# tar -xzvf gffread-0.12.7.Linux_x86_64.tar.gz    ## extract
gffread-0.12.7.Linux_x86_64/
gffread-0.12.7.Linux_x86_64/README.md
gffread-0.12.7.Linux_x86_64/gffread
gffread-0.12.7.Linux_x86_64/LICENSE
[root@pc1 gffread]# ls
gffread-0.12.7.Linux_x86_64  gffread-0.12.7.Linux_x86_64.tar.gz
[root@pc1 gffread]# cd gffread-0.12.7.Linux_x86_64/    ## enter the extracted directory
[root@pc1 gffread-0.12.7.Linux_x86_64]# ls    ## no compilation needed; the executable is already present
gffread  LICENSE  README.md
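Because the release ships a prebuilt binary, one optional convenience is to put its directory on `PATH` so that `gffread` can be called by name instead of by full path. This is a sketch only: the variable name `GFFREAD_DIR` is hypothetical, and the directory is assumed to be the install location used elsewhere in this walkthrough; adjust it to wherever you extracted the archive.

```shell
# Assumed location of the unpacked gffread binary; change this to your own path.
GFFREAD_DIR="/home/software/gffread/gffread-0.12.7.Linux_x86_64"

# Prepend the directory to PATH only if it is not already listed.
case ":$PATH:" in
    *":$GFFREAD_DIR:"*) ;;
    *) PATH="$GFFREAD_DIR:$PATH"; export PATH ;;
esac

# Report whether the shell can now find gffread.
if command -v gffread >/dev/null 2>&1; then
    echo "gffread found at: $(command -v gffread)"
else
    echo "gffread not found; check GFFREAD_DIR"
fi
```

The change lasts only for the current shell session; adding the same lines to a shell startup file such as `~/.bashrc` would make it persistent.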
003. Convert the GFF annotation file to GTF format
[root@pc1 test]# ls    ## test GFF file
GCF_001704415.2_ARS1.2_genomic.gff
[root@pc1 test]# /home/software/gffread/gffread-0.12.7.Linux_x86_64/gffread GCF_001704415.2_ARS1.2_genomic.gff -T -o result.gtf    ## run the conversion
[root@pc1 test]# ls    ## conversion result
GCF_001704415.2_ARS1.2_genomic.gff  result.gtf
[root@pc1 test]# head -n 2 result.gtf    ## show the first two lines
NC_030808.1 tRNAscan-SE transcript 60028 60099 . + . transcript_id "rna-TRNAC-GCA"; gene_id "gene-TRNAC-GCA"; gene_name "TRNAC-GCA"
NC_030808.1 tRNAscan-SE exon 60028 60099 . + . transcript_id "rna-TRNAC-GCA"; gene_id "gene-TRNAC-GCA"; gene_name "TRNAC-GCA";
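A quick sanity check on the converted file is to count how many distinct `transcript_id` values it contains. The sketch below assumes the GTF attribute style seen in the output above (`key "value";`); the helper name `gtf_transcript_count` and the file `demo.gtf` are hypothetical, and the two demo records simply reuse the lines printed by `head`.

```shell
# Hypothetical helper: count distinct transcript_id values in a GTF file.
gtf_transcript_count() {
    grep -v '^#' "$1" | grep -o 'transcript_id "[^"]*"' | sort -u | wc -l | tr -d ' '
}

# Two-record demo file built from the first two lines of result.gtf shown above.
printf 'NC_030808.1\ttRNAscan-SE\ttranscript\t60028\t60099\t.\t+\t.\ttranscript_id "rna-TRNAC-GCA"; gene_id "gene-TRNAC-GCA"; gene_name "TRNAC-GCA";\n' > demo.gtf
printf 'NC_030808.1\ttRNAscan-SE\texon\t60028\t60099\t.\t+\t.\ttranscript_id "rna-TRNAC-GCA"; gene_id "gene-TRNAC-GCA"; gene_name "TRNAC-GCA";\n' >> demo.gtf

# Both records belong to the same transcript, so only one ID is counted.
gtf_transcript_count demo.gtf
```

Running `gtf_transcript_count result.gtf` on the real output gives the number of transcripts that gffread wrote.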
004. Convert the GTF file back to GFF
[root@pc1 test]# ls
GCF_001704415.2_ARS1.2_genomic.gff  result.gtf
[root@pc1 test]# /home/software/gffread/gffread-0.12.7.Linux_x86_64/gffread result.gtf -o result.gff    ## run the conversion
[root@pc1 test]# ls    ## check the conversion result
GCF_001704415.2_ARS1.2_genomic.gff  result.gff  result.gtf
[root@pc1 test]# head -n 2 result.gff
##gff-version 3
# gffread v0.12.7
[root@pc1 test]# tail -n 2 result.gff    ## show the last two lines
NC_005044.2 RefSeq transcript 15365 15430 . - . ID=rna-KEF96_t22;geneID=gene-KEF96_t22
NC_005044.2 RefSeq exon 15365 15430 . - . Parent=rna-KEF96_t22
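The GFF output stores attributes as `key=value` pairs separated by semicolons, unlike the quoted `key "value";` pairs in GTF. As a rough illustration of reading that column, the sketch below pulls the `ID` attribute out of each record. The helper name `gff_ids` and the file `demo_roundtrip.gff` are hypothetical; the demo records reuse the two lines printed by `tail` above, and a tab-delimited file is assumed.

```shell
# Hypothetical helper: print the value of the ID attribute (column 9) for
# every GFF record that has one; records with only Parent= are skipped.
gff_ids() {
    awk -F'\t' '!/^#/ && $9 ~ /(^|;)ID=/ {
        n = split($9, attrs, ";")
        for (i = 1; i <= n; i++)
            if (attrs[i] ~ /^ID=/) print substr(attrs[i], 4)
    }' "$1"
}

# Demo file built from the header and the last two lines of result.gff shown above.
printf '##gff-version 3\n' > demo_roundtrip.gff
printf 'NC_005044.2\tRefSeq\ttranscript\t15365\t15430\t.\t-\t.\tID=rna-KEF96_t22;geneID=gene-KEF96_t22\n' >> demo_roundtrip.gff
printf 'NC_005044.2\tRefSeq\texon\t15365\t15430\t.\t-\t.\tParent=rna-KEF96_t22\n' >> demo_roundtrip.gff

# Prints the transcript ID; the exon line carries no ID of its own.
gff_ids demo_roundtrip.gff
```

Comparing `gff_ids result.gff | sort -u | wc -l` with the number of distinct `transcript_id` values in `result.gtf` is one way to check that no transcripts were lost in the round trip.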
From: https://www.cnblogs.com/liujiaxin2018/p/17010874.html