首页 > 其他分享 >samtools 实现对bam文件统计测序深度 和 比对率

samtools 实现对bam文件统计测序深度 和 比对率

时间:2023-02-11 22:56:44浏览次数:51  
标签:samtools 比对率 003070.9 NC 测序 admin1 b20223040323 test bam

 

001、利用samtools计算每一个位点的测序深度

[b20223040323@admin1 test]$ ls                             ## 测试bam文件
SRR21814498.sorted.markdup.bam  SRR21814498.sorted.markdup.bam.bai
[b20223040323@admin1 test]$ samtools depth -aa SRR21814498.sorted.markdup.bam > all_depth.txt   ## 利用samtools统计参考基因组所有位点的测序深度
[b20223040323@admin1 test]$ ls                             ## 结果文件
all_depth.txt  SRR21814498.sorted.markdup.bam  SRR21814498.sorted.markdup.bam.bai
[b20223040323@admin1 test]$ head all_depth.txt             ## 查看前10行
NC_003070.9     1       0
NC_003070.9     2       0
NC_003070.9     3       0
NC_003070.9     4       0
NC_003070.9     5       0
NC_003070.9     6       0
NC_003070.9     7       0
NC_003070.9     8       0
NC_003070.9     9       0
NC_003070.9     10      0

 

 

002、统计比对到参考基因组的位点的平均测序深度

[b20223040323@admin1 test]$ ls
all_depth.txt  SRR21814498.sorted.markdup.bam  SRR21814498.sorted.markdup.bam.bai
[b20223040323@admin1 test]$ head all_depth.txt           ## 所有位点的测序深度信息
NC_003070.9     1       0
NC_003070.9     2       0
NC_003070.9     3       0
NC_003070.9     4       0
NC_003070.9     5       0
NC_003070.9     6       0
NC_003070.9     7       0
NC_003070.9     8       0
NC_003070.9     9       0
NC_003070.9     10      0
[b20223040323@admin1 test]$ awk -F "\t" 'BEGIN{a = 0; b = 0} {if($3 > 0) {a++; b += $3}} END {print b/a}' all_depth.txt       ## 统计测到的位点的平均的测序深度
9.88724

 

 

003、统计基因组所有位点,包括未测到的位点的平均测序深度

[b20223040323@admin1 test]$ ls
all_depth.txt  SRR21814498.sorted.markdup.bam  SRR21814498.sorted.markdup.bam.bai
[b20223040323@admin1 test]$ head all_depth.txt
NC_003070.9     1       0
NC_003070.9     2       0
NC_003070.9     3       0
NC_003070.9     4       0
NC_003070.9     5       0
NC_003070.9     6       0
NC_003070.9     7       0
NC_003070.9     8       0
NC_003070.9     9       0
NC_003070.9     10      0
[b20223040323@admin1 test]$ awk -F "\t" 'BEGIN{a = 0} {if($3 > 0) {a += $3}} END {print a/NR}' all_depth.txt    ## 统计参考基因组所有位点平均测序深度
8.84949

 

 

004、统计覆盖度

[b20223040323@admin1 test]$ ls
all_depth.txt  SRR21814498.sorted.markdup.bam  SRR21814498.sorted.markdup.bam.bai
[b20223040323@admin1 test]$ head all_depth.txt
NC_003070.9     1       0
NC_003070.9     2       0
NC_003070.9     3       0
NC_003070.9     4       0
NC_003070.9     5       0
NC_003070.9     6       0
NC_003070.9     7       0
NC_003070.9     8       0
NC_003070.9     9       0
NC_003070.9     10      0
[b20223040323@admin1 test]$ awk -F "\t" 'BEGIN{a = 0} {if($3 > 0) {a++}} END {print a/NR}' all_depth.txt       ## 统计测序覆盖度
0.895042

 

 

参考:https://www.jianshu.com/p/60b0fda0a591

 

标签:samtools,比对率,003070.9,NC,测序,admin1,b20223040323,test,bam
From: https://www.cnblogs.com/liujiaxin2018/p/17112743.html

相关文章