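Sentieon ● Somatic Variant Detection Series, Part 2
Sentieon aims to remove the speed and accuracy bottlenecks in bioinformatics data analysis. Through deep algorithmic optimization and enterprise-grade software engineering, it substantially improves the efficiency, accuracy, and reliability of NGS data processing.
For somatic variant detection, the Sentieon software provides two modules: TNscope and TNhaplotyper2.
TNscope: this module uses Sentieon's own algorithm and offers faster computation (a speedup of 10x or more) and higher accuracy; it is particularly well suited to clinical genetic diagnostic samples.
TNhaplotyper2: this module matches the results of Mutect2 (currently matched to version 4.1.9) while running more than 10x faster.
ctDNA Variant Detection Analysis
The step-by-step scripts below are intended mainly for ctDNA and other high-depth sequencing samples (2000-5000x depth, AF > 0.3%).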
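To make the depth and allele-fraction range above concrete, here is a minimal sketch that computes how many variant-supporting reads one would expect at a given depth and allele fraction. The function name `expected_alt_reads` is hypothetical and is not part of Sentieon; the calculation ignores sampling noise and sequencing error, so the result is only a rough expectation.

```shell
# Hypothetical helper (not a Sentieon command):
# expected alt-supporting reads = depth * allele fraction.
expected_alt_reads() {
    awk -v depth="$1" -v af="$2" 'BEGIN { printf "%.1f\n", depth * af }'
}

expected_alt_reads 2000 0.003   # lower end of the depth range, AF = 0.3%
expected_alt_reads 5000 0.003   # upper end of the depth range, AF = 0.3%
```

At AF = 0.3% this works out to roughly 6 supporting reads at 2000x and 15 at 5000x, which is why the calling thresholds for this kind of data have to be so permissive.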
Step 1: Alignment
# ******************************************
# 1a. Mapping reads with BWA-MEM, sorting for tumor sample
# ******************************************
( sentieon bwa mem -M -R "@RG\tID:$tumor\tSM:$tumor\tPL:$platform" \
    -t $nt -K 10000000 $fasta $tumor_fastq_1 $tumor_fastq_2 || \
    echo -n 'error' ) | \
    sentieon util sort -o tumor_sorted.bam -t $nt --sam2bam -i -

# ******************************************
# 1b. Mapping reads with BWA-MEM, sorting for normal sample
# ******************************************
( sentieon bwa mem -M -R "@RG\tID:$normal\tSM:$normal\tPL:$platform" \
    -t $nt -K 10000000 $fasta $normal_fastq_1 $normal_fastq_2 || \
    echo -n 'error' ) | \
    sentieon util sort -o normal_sorted.bam -t $nt --sam2bam -i -
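The `|| echo -n 'error'` in the alignment commands appears intended to make the sort step fail when the aligner fails: a pipeline's exit status is normally that of its last command, so an aligner crash could otherwise leave a truncated but valid-looking BAM. The sketch below illustrates the idiom with stand-in functions. `producer`, `consumer`, and `run_pipeline` are hypothetical names, the consumer's check is a simplification of what a real SAM parser does, and `printf` is used in place of `echo -n` only for portability.

```shell
# Stand-in for the aligner: emits one header line, or fails on request.
producer() {
    if [ "$1" = "fail" ]; then
        return 1
    fi
    printf '@HD\tVN:1.6\n'
}

# Stand-in for the sorter: fails if it sees the sentinel text.
consumer() {
    # '|| [ -n "$line" ]' also handles a final line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            error*) return 1 ;;
        esac
    done
    return 0
}

# Same shape as the alignment commands: on failure, inject the sentinel
# so that the downstream command rejects its input and exits non-zero.
run_pipeline() {
    ( producer "$1" || printf 'error' ) | consumer
}

run_pipeline ok   && echo "ok run: pipeline succeeded"
run_pipeline fail || echo "failed run: pipeline reported the failure"
```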
Step 2: PCR Duplicate Removal (Skip For Amplicon)
# ******************************************
# 2a. Remove duplicate reads for tumor sample.
# ******************************************
sentieon driver -t $nt -i tumor_sorted.bam \
    --algo LocusCollector \
    --fun score_info \
    tumor_score.txt
sentieon driver -t $nt -i tumor_sorted.bam \
    --algo Dedup \
    --score_info tumor_score.txt \
    --metrics tumor_dedup_metrics.txt \
    tumor_deduped.bam

# ******************************************
# 2b. Remove duplicate reads for normal sample.
# ******************************************
sentieon driver -t $nt -i normal_sorted.bam \
    --algo LocusCollector \
    --fun score_info \
    normal_score.txt
sentieon driver -t $nt -i normal_sorted.bam \
    --algo Dedup \
    --score_info normal_score.txt \
    --metrics normal_dedup_metrics.txt \
    normal_deduped.bam
Step 3: Base Quality Score Recalibration (Skip For Small Panel)
# ******************************************
# 3a. Base recalibration for tumor sample
# ******************************************
sentieon driver -r $fasta -t $nt -i tumor_deduped.bam --interval $BED \
    --algo QualCal \
    -k $dbsnp \
    -k $known_Mills_indels \
    -k $known_1000G_indels \
    tumor_recal_data.table

# ******************************************
# 3b. Base recalibration for normal sample
# ******************************************
sentieon driver -r $fasta -t $nt -i normal_deduped.bam --interval $BED \
    --algo QualCal \
    -k $dbsnp \
    -k $known_Mills_indels \
    -k $known_1000G_indels \
    normal_recal_data.table
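Step 3 only writes a recalibration table; the scripts in this post do not show where that table is consumed. As far as I know, the Sentieon driver applies a table at calling time through its `-q` option, but treat that as an assumption and check the manual for your version. Because this step is skipped for small panels, a script may want to add the option only when the table exists. A minimal sketch, with the hypothetical helper name `recal_args`:

```shell
# Hypothetical helper: print "-q <table>" only when the recalibration
# table exists and is non-empty, so the script still works when
# Step 3 has been skipped.
recal_args() {
    if [ -s "$1" ]; then
        printf '%s %s' -q "$1"
    fi
}

# Usage sketch (assumes the variables used elsewhere in this post and
# a table path without spaces, since the result is word-split):
# sentieon driver -r $fasta -t $nt -i tumor_deduped.bam \
#     $(recal_args tumor_recal_data.table) --algo ...
```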
Step 4: Variant Calling (Tumor Only)
# The bracketed --pon argument is optional; remove the brackets if you supply a panel of normals,
# or delete that line otherwise.
sentieon driver -r $fasta -t $nt -i tumor_deduped.bam --interval $BED --interval_padding 10 \
    --algo TNscope \
    --tumor_sample $TUMOR_SM \
    --dbsnp $dbsnp \
    --disable_detector sv \
    --min_tumor_allele_frac 3e-3 \
    --filter_t_alt_frac 3e-3 \
    --clip_by_minbq 1 \
    --min_init_tumor_lod 3.0 \
    --min_tumor_lod 3.0 \
    --assemble_mode 4 \
    --resample_depth 100000 \
    [--pon panel_of_normal.vcf] \
    output_tnscope.pre_filter.vcf.gz
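The calling command passes `$TUMOR_SM` to `--tumor_sample`, while Step 1 wrote the sample name into the read group as `SM:$tumor`. The two values need to agree, so it can help to read the sample names back from the BAM header before calling. The sketch below is one way to do that; `rg_samples` is a hypothetical helper, and using `samtools` to print the header is an assumption (any tool that emits the SAM header would do).

```shell
# Hypothetical helper: list the distinct SM values found in @RG header lines.
# Reads SAM header text on stdin.
rg_samples() {
    awk -F '\t' '/^@RG/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SM:/) print substr($i, 4)
    }' | LC_ALL=C sort -u
}

# Usage sketch:
# samtools view -H tumor_deduped.bam | rg_samples
```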
Step 5: Variant Filtration (Tumor Only)
bcftools annotate -x "FILTER/triallelic_site" output_tnscope.pre_filter.vcf.gz | \
    bcftools filter -m + -s "low_qual" -e "QUAL < 10" | \
    bcftools filter -m + -s "short_tandem_repeat" -e "RPA[0]>=10" | \
    bcftools filter -m + -s "read_pos_bias" -e "FMT/ReadPosRankSumPS[0] < -5" | \
    bcftools norm -f $fasta -m +any | \
    sentieon util vcfconvert - output_tnscope.filtered.vcf.gz

From: https://www.cnblogs.com/chsnp/p/17523178.html
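The filtering chain only adds labels to the FILTER column (`-m +` appends rather than replaces), so the output still contains every record. A quick way to see what the filters did is to tally records by FILTER value. The following is a minimal sketch using only `awk` and `sort`; `count_filters` is a hypothetical helper, not a bcftools or Sentieon command.

```shell
# Hypothetical helper: tally VCF records by their FILTER column (column 7).
# Reads uncompressed VCF text on stdin; header lines starting with '#' are skipped.
count_filters() {
    awk -F '\t' '!/^#/ { n[$7]++ }
        END { for (f in n) printf "%s\t%d\n", f, n[f] }' |
        LC_ALL=C sort
}

# Usage sketch:
# gzip -dc output_tnscope.filtered.vcf.gz | count_filters
```

Records that fail several filters appear under a combined label such as `low_qual;read_pos_bias`, so each distinct combination is counted separately.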