GATK

fastp

bwa: generating the SAM file

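bwa offers three algorithms (BWA-backtrack, BWA-SW and BWA-MEM); mem is the most broadly applicable of the three.
bwa index ref.fa
Download the reference genome FASTA file first, then build the index from it before aligning.
I am not sure whether bwa index accepts a compressed reference. The tutorial I followed decompressed the file, so I decompressed mine as well.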
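Since building the index only has to happen once per reference, a small check can skip it on later runs. This is a minimal sketch, assuming the classic bwa index output files (ref.fa.amb, .ann, .bwt, .pac and .sa); the function name has_bwa_index is made up for illustration and is not part of bwa.

```shell
# Hypothetical helper: succeed only if all five classic bwa index
# files exist next to the reference FASTA given as $1.
has_bwa_index() {
  ref=$1
  for ext in amb ann bwt pac sa; do
    [ -f "$ref.$ext" ] || return 1
  done
  return 0
}

# Usage (not executed here): index only when something is missing.
# has_bwa_index ref.fa || bwa index ref.fa
```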
bwa mem ref.fa read1.fq read2.fq > aln-pe.sam # without the -R option
The input .fq files here may be compressed.
bwa mem -R '@RG\tID:group_n\tLB:library_n\tPL:illumina\tPU:unit1\tSM:sample_n' ref.fa read1.fq read2.fq > aln-pe.R.sam # with the -R option
GATK requires read group information in this format.
A read group line starts with @RG.
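Typing the -R string by hand is error-prone, so it can help to assemble it from variables. The sketch below only builds and prints the string; the function name make_rg and its argument order are assumptions for illustration. Note that bwa mem expects the two-character sequence \t in the argument, which is why the backslashes are doubled in the printf format.

```shell
# Hypothetical helper: print an @RG string suitable for bwa mem -R.
# Arguments: ID, LB, PU, SM, and optionally PL (defaults to ILLUMINA).
make_rg() {
  id=$1 lb=$2 pu=$3 sm=$4 pl=${5:-ILLUMINA}
  # '\\t' in the format prints a literal backslash followed by t.
  printf '@RG\\tID:%s\\tLB:%s\\tPL:%s\\tPU:%s\\tSM:%s\n' \
    "$id" "$lb" "$pl" "$pu" "$sm"
}

make_rg group_n library_n unit1 sample_n

# Usage (not executed here):
# bwa mem -R "$(make_rg group_n library_n unit1 sample_n)" ref.fa read1.fq read2.fq > aln-pe.R.sam
```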

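ID = Read group identifier
Each read group has its own unique ID.
In Illumina sequencing data, read group IDs are built from the flowcell name and the lane number.
During base quality score recalibration, read group IDs are needed to separate technical batch effects: all reads in the same read group are assumed to share the same technical error profile.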

PU = Platform Unit
The platform unit has three parts: {FLOWCELL_BARCODE}.{LANE}.{SAMPLE_BARCODE}
{FLOWCELL_BARCODE} is the unique identifier of a particular flow cell;
{LANE} is the lane of that flow cell;
{SAMPLE_BARCODE} is a sample/library-specific identifier.
GATK does not require PU, but when both PU and ID are present, PU takes precedence over ID.
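The flowcell, lane and sample barcode can often be read straight from the first header line of an Illumina FASTQ file. The sketch below assumes the Casava 1.8+ header layout (@instrument:run:flowcell:lane:tile:x:y read:filtered:control:barcode); the function name rg_from_header and the example header are made up for illustration, and older or non-Illumina headers will not parse this way.

```shell
# Hypothetical helper: derive ID and PU from one FASTQ header line,
# assuming the Casava 1.8+ layout described above.
rg_from_header() {
  printf '%s\n' "$1" | awk '{
    sub(/^@/, "", $1)                 # drop the leading @
    split($1, name, ":")              # name[3] = flowcell, name[4] = lane
    n = split($2, comment, ":")       # last comment field = sample barcode
    printf "ID=%s.%s\nPU=%s.%s.%s\n", name[3], name[4], name[3], name[4], comment[n]
  }'
}

# Example with an invented header line.
rg_from_header '@A00123:45:HXXXXXXXX:2:1101:1000:2000 1:N:0:ATCACG'

# Usage on a real file (not executed here):
# rg_from_header "$(gzip -cd read1.fq.gz | head -n 1)"
```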

SM = Sample
The name of the sample the reads belong to. Set SM correctly, because GATK uses the same name for the sample in the VCF files it produces.

PL = Platform/technology used to produce the reads
The sequencing platform used: ILLUMINA, SOLID, LS454, HELICOS and PACBIO.

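LB = DNA preparation library identifier
When duplicates are marked, LB is used to work out which read groups may contain duplicates of the same molecules, since one library is sometimes sequenced on several lanes. To keep the lanes distinguishable, reads produced on different lanes should each get their own read group ID, whether they come from the same library or from different ones.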
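As an illustration of the rule above, one library sequenced on two lanes ends up with two read groups that share LB and SM but differ in ID and PU. This is a minimal sketch; the function name lane_read_groups and the flowcell, library and sample values are placeholders, and PU is shortened to flowcell.lane with the sample barcode left out.

```shell
# Hypothetical helper: print one @RG string per lane for a single library.
# Arguments: flowcell, library, sample, then one or more lane numbers.
lane_read_groups() {
  flowcell=$1 library=$2 sample=$3
  shift 3
  for lane in "$@"; do
    # Same LB and SM on every line; ID and PU change with the lane.
    printf '@RG\\tID:%s.%s\\tLB:%s\\tPL:ILLUMINA\\tPU:%s.%s\\tSM:%s\n' \
      "$flowcell" "$lane" "$library" "$flowcell" "$lane" "$sample"
  done
}

lane_read_groups HXXXXXXXX library_n sample_n 1 2
```

Each printed line would be passed to its own bwa mem -R run for the FASTQ files of that lane.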

Author: JeremyL
Link: https://www.jianshu.com/p/9a29bfc87a50
Source: Jianshu

Tags: fq, group, read, bwa, reads, ID
From: https://www.cnblogs.com/wang-yuheng/p/17930891.html


