首页 > 其他分享 >Downsampling Bam file | 平衡测序深度

Downsampling Bam file | 平衡测序深度

时间:2022-12-04 01:55:05浏览次数:64  
标签:samtools bam Downsampling 测序 reads cut file total dir

 

目前对peak的数据处理上,发现测序深度对peak的数量有很大影响,即使做了normalization也没办法,所以这里希望从原始的bam文件开始做downsampling。

参考一:Downsample BAM file to specific amount of reads

input_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/processed_bam/Exp1/
output_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/downsampling_bam/Exp1

cd $input_dir

for bam in `ls *.bam`
do
echo $bam
reads=5000000
fraction=$(samtools idxstats $bam | cut -f3 | awk -v ct=$reads 'BEGIN {total=0} {total += $1} END {print ct/total}')
samtools view -b -s ${fraction} $bam > $output_dir/$bam
done

  

以上脚本的不完善的地方就是当测序read低于5000000,它还是执行downsampling,这不是我们想看到的。

 

参考二:Easy bam downsampling

input_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/processed_bam/Exp1/
output_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/downsampling_bam2/Exp1

cd $input_dir

for bam in `ls *.bam`
do
echo $bam
reads=5000000
# fraction=$(samtools idxstats $bam | cut -f3 | awk -v ct=$reads 'BEGIN {total=0} {total += $1} END {print ct/total}')
frac=$( samtools idxstats $bam | cut -f3 | awk -v ct=$reads 'BEGIN {total=0} {total += $1} END {frac=ct/total; if (frac > 1) {print 0.99} else {print frac}}' )
echo $frac
samtools view -b -s ${frac} $bam > $output_dir/$bam
done

  

samtools view: Incorrect sampling argument "1"

  

标签:samtools,bam,Downsampling,测序,reads,cut,file,total,dir
From: https://www.cnblogs.com/leezx/p/16949289.html

相关文章