001、
[b20223040323@admin1 test]$ ls                  ## the test GFF file
exons_only.gff
[b20223040323@admin1 test]$ gff2bed < exons_only.gff > exons_only.bed                  ## convert with gff2bed
Warning: If your Wiggle data is a significant portion of available system memory, use the --max-mem and --sort-tmpdir options, or use --do-not-sort to disable post-conversion sorting. See --help for more information.
[b20223040323@admin1 test]$ ls                  ## conversion result
exons_only.bed  exons_only.gff
[b20223040323@admin1 test]$ awk -F "\t" '{OFS = "\t"; print $1, $4 - 1, $5, $6, $8, $7, $2, $3, ".", $NF}' exons_only.gff > tem.gff                  ## rearrange the columns; the start coordinate becomes 0-based ($4 - 1)
[b20223040323@admin1 test]$ cut -f 1 tem.gff | sort | uniq | while read i; do grep $i tem.gff | sort -k 2n -k 3n >> result.bed; done                  ## sort by start, then end, within each sequence
[b20223040323@admin1 test]$ ls                  ## result files
exons_only.bed  exons_only.gff  result.bed  tem.gff
[b20223040323@admin1 test]$ diff exons_only.bed result.bed                  ## compare the gff2bed output with the shell-script output: one line differs?? (the gff2bed line ends with an extra ;zero_length_insertion=True)
54392c54392
< NC_052532.1 67271350 67271351 . . - Gnomon exon . ID=exon-XM_015290272.4-6;Parent=rna-XM_015290272.4;Dbxref=GeneID:418207,Genbank:XM_015290272.4,CGNC:10484;experiment=COORDINATES: cap analysis [ECO:0007248] and polyA evidence [ECO:0006239];gbkey=mRNA;gene=KRAS;product=KRAS proto-oncogene%2C GTPase%2C transcript variant X3;transcript_id=XM_015290272.4;zero_length_insertion=True
---
> NC_052532.1 67271350 67271351 . . - Gnomon exon . ID=exon-XM_015290272.4-6;Parent=rna-XM_015290272.4;Dbxref=GeneID:418207,Genbank:XM_015290272.4,CGNC:10484;experiment=COORDINATES: cap analysis [ECO:0007248] and polyA evidence [ECO:0006239];gbkey=mRNA;gene=KRAS;product=KRAS proto-oncogene%2C GTPase%2C transcript variant X3;transcript_id=XM_015290272.4
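The per-sequence loop in the transcript runs grep once per sequence name, and grep treats the name as a regular expression that may match anywhere on the line. A single-pass variant is sketched below, assuming the same column mapping as the awk command in the transcript. The function names gff_to_bed and single_base_features are hypothetical, and skipping comment lines and lines with fewer than 9 fields is an added assumption. The differing record has BED coordinates 67271350 to 67271351, which corresponds to a GFF feature whose start equals its end; gff2bed appears to tag such records with zero_length_insertion=True, so the second helper lists them for inspection. Records with identical sequence, start, and end may come out in a different order than in the gff2bed output.

```shell
#!/usr/bin/env bash
# Sketch only: gff_to_bed and single_base_features are hypothetical helper names.

# Rearrange GFF columns into the 10-column layout used in the transcript
# and sort by sequence name, then start, then end, in one pass.
gff_to_bed() {
    # $1: path to a GFF file
    awk -F '\t' 'BEGIN { OFS = "\t" }
        /^#/ || NF < 9 { next }   # assumption: skip comments and short lines
        { print $1, $4 - 1, $5, $6, $8, $7, $2, $3, ".", $NF }' "$1" |
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n -k3,3n
}

# Print GFF records whose start equals their end (single-base features),
# the kind of record that differed in the diff above.
single_base_features() {
    # $1: path to a GFF file
    awk -F '\t' '!/^#/ && NF >= 9 && $4 == $5' "$1"
}
```

A possible use is `gff_to_bed exons_only.gff > result.bed`, followed by the same `diff exons_only.bed result.bed`, and `single_base_features exons_only.gff` to see which input records have equal start and end.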
From: https://www.cnblogs.com/liujiaxin2018/p/16884190.html