
Merging the g.vcf files of multiple samples and calling variants with GATK

Date: 2022-10-29 01:33:35
Tags: vcf, sample, gz, combine, SNP, gatk

 

001. Merge the per-sample g.vcf files into a single cohort g.vcf

gatk CombineGVCFs -R GCF_000001735.4_TAIR10.1_genomic.fna --variant SRR21814498.g.vcf --variant SRR21814509.g.vcf --variant SRR21814514.g.vcf -O cohort.g.vcf.gz
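
With more than a handful of samples, typing one --variant per file gets error-prone. A minimal sketch of building those arguments from a list of file names follows; variant_args is a hypothetical helper name, not part of GATK, and the sketch assumes the file names contain no newlines.

```shell
# Sketch: emit one "--variant" / "<file>" pair per g.vcf file, one token per line.
# variant_args is a hypothetical helper, not a GATK command.
variant_args() {
    local f
    for f in "$@"; do
        printf -- '--variant\n%s\n' "$f"
    done
}

# Usage in bash, run in the directory holding the per-sample files:
#   mapfile -t args < <(variant_args *.g.vcf)
#   gatk CombineGVCFs -R GCF_000001735.4_TAIR10.1_genomic.fna "${args[@]}" -O cohort.g.vcf.gz
```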

 

 

 

002. The g.vcf files can also be written into a single list file

gatk CombineGVCFs -R GCF_000001735.4_TAIR10.1_genomic.fna --variant gvcf.list -O cohort.g.vcf.gz

 

Format of gvcf.list (one g.vcf path per line):

SRR21814498.g.vcf
SRR21814509.g.vcf
SRR21814514.g.vcf
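
The list file does not have to be written by hand. The sketch below collects every *.g.vcf file in a directory into a list, one path per line. make_gvcf_list is a hypothetical helper name. As far as I know, GATK recognizes such a file by its .list extension, so the output name keeps that suffix.

```shell
# Sketch: write the path of every *.g.vcf file in a directory to a list file.
# make_gvcf_list is a hypothetical helper, not a GATK command.
make_gvcf_list() {
    dir=$1
    out=$2
    : > "$out"                          # create or truncate the list file
    for f in "$dir"/*.g.vcf; do         # the glob expands in sorted order
        if [ -e "$f" ]; then            # skip the literal pattern when nothing matches
            printf '%s\n' "$f" >> "$out"
        fi
    done
}

# Usage: make_gvcf_list . gvcf.list
```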

 

 

 

003. Call variants (joint genotyping) and generate the VCF file

gatk --java-options "-Xmx400g -Xms400g -XX:+UseSerialGC" GenotypeGVCFs -R GCF_000001735.4_TAIR10.1_genomic.fna -V cohort.g.vcf.gz -O combine.call.vcf.gz
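
A quick sanity check after this step is to confirm that every merged sample shows up in the header of the output. The sketch below prints the sample columns of the #CHROM line. vcf_samples is a hypothetical helper name, and the sketch assumes the output is gzip-compatible, which bgzipped .vcf.gz files are.

```shell
# Sketch: print the sample names from the header of a (b)gzipped VCF.
# vcf_samples is a hypothetical helper, not a GATK tool.
vcf_samples() {
    # Columns 1-9 of the #CHROM line are fixed VCF fields; samples start at column 10.
    gzip -dc "$1" | awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
}

# Usage: vcf_samples combine.call.vcf.gz
```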

 

 

 

004. Extract the SNPs

gatk --java-options "-Xmx400g -Xms400g -XX:+UseSerialGC" SelectVariants -R GCF_000001735.4_TAIR10.1_genomic.fna -V combine.call.vcf.gz -select-type SNP -O combine.SNP.vcf.gz

 

 

 

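005. Hard-filter the SNPs (records that match the expression are tagged "Filter" in the FILTER column; they are not removed yet)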

gatk --java-options "-Xmx400g -Xms400g -XX:+UseSerialGC" VariantFiltration -R GCF_000001735.4_TAIR10.1_genomic.fna -V combine.SNP.vcf.gz --filter-expression "QD < 2.0 || MQ < 40.0 || FS > 60.0 || SOR > 3.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0" --filter-name "Filter" -O combine.SNP.filter.vcf.gz
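
VariantFiltration only annotates the FILTER column, so it can be useful to see how many records were flagged before dropping them. The sketch below tallies the FILTER values of a compressed VCF; filter_counts is a hypothetical helper name.

```shell
# Sketch: count records per FILTER value (for example PASS vs. Filter).
# filter_counts is a hypothetical helper, not a GATK tool.
filter_counts() {
    # Column 7 of a VCF record is FILTER; header lines start with "#".
    gzip -dc "$1" |
        awk -F'\t' '!/^#/ { n[$7]++ } END { for (k in n) print k "\t" n[k] }' |
        sort
}

# Usage: filter_counts combine.SNP.filter.vcf.gz
```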

 

 

 

006. Extract the SNPs that passed the filter

gatk --java-options "-Xmx400g -Xms400g -XX:+UseSerialGC" SelectVariants -R GCF_000001735.4_TAIR10.1_genomic.fna -V combine.SNP.filter.vcf.gz --exclude-filtered -O combine.SNP.filtered.vcf.gz
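
Steps 003 to 006 can be chained so that a failure stops the run before the next step reads a missing or partial file. The following is a rough sketch under stated assumptions, not a tested pipeline: run and call_snps are hypothetical helper names, the file names are the ones used above, and the 400 GB heap setting is copied from the commands above and should be sized to the machine.

```shell
# Sketch: steps 003-006 as one function. With DRY_RUN=1 the commands are
# printed instead of executed (the printed lines lose their shell quoting).
REF=GCF_000001735.4_TAIR10.1_genomic.fna
JAVA_OPTS="-Xmx400g -Xms400g -XX:+UseSerialGC"   # copied from the steps above; adjust to your machine

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

call_snps() {
    # "&&" stops the chain at the first step that fails.
    run gatk --java-options "$JAVA_OPTS" GenotypeGVCFs -R "$REF" \
        -V cohort.g.vcf.gz -O combine.call.vcf.gz &&
    run gatk --java-options "$JAVA_OPTS" SelectVariants -R "$REF" \
        -V combine.call.vcf.gz -select-type SNP -O combine.SNP.vcf.gz &&
    run gatk --java-options "$JAVA_OPTS" VariantFiltration -R "$REF" \
        -V combine.SNP.vcf.gz \
        --filter-expression "QD < 2.0 || MQ < 40.0 || FS > 60.0 || SOR > 3.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0" \
        --filter-name "Filter" -O combine.SNP.filter.vcf.gz &&
    run gatk --java-options "$JAVA_OPTS" SelectVariants -R "$REF" \
        -V combine.SNP.filter.vcf.gz --exclude-filtered -O combine.SNP.filtered.vcf.gz
}

# Usage: DRY_RUN=1 call_snps   # preview the four commands
#        call_snps             # run them
```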

 

 

Reference: https://www.jianshu.com/p/7c124d5bbd4d

 

From: https://www.cnblogs.com/liujiaxin2018/p/16837944.html
