<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Sequence Alignment Research]]></title><description><![CDATA[<p dir="auto">1. What is sequence alignment?<br />
<a href="https://en.wikipedia.org/wiki/Sequence_alignment" rel="nofollow ugc">https://en.wikipedia.org/wiki/Sequence_alignment</a><br />
Sequence alignment is the procedure of comparing two (pairwise alignment) or more (multiple sequence alignment) sequences by searching for a series of individual characters or character patterns that are in the same order in the sequences.</p>
<p dir="auto">2. Types of sequence alignment<br />
2.1 <strong>Pairwise Sequence Alignment (PSA)</strong> is used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences (protein or nucleic acid).<br />
<a href="https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Pairwise+Sequence+Alignment" rel="nofollow ugc">https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Pairwise+Sequence+Alignment</a><br />
Pairwise alignment is further divided into the following types:<br />
<img src="/assets/uploads/files/1599898281626-21bbeda6-b24e-4bd1-8b7e-32c38ae8ee34-image.png" alt="21bbeda6-b24e-4bd1-8b7e-32c38ae8ee34-image.png" class=" img-responsive img-markdown" /><br />
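Global alignment attempts to align the sequences over their entire length; it is best suited to sequences that are highly similar and of roughly equal length.<br />
Local alignment attempts to find contiguous subsequences that match locally; it is the better choice when the sequences differ in length but share common features.<br />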
<img src="/assets/uploads/files/1599898971226-553ecd46-7704-47fd-8124-c8cf575ad832-image.png" alt="553ecd46-7704-47fd-8124-c8cf575ad832-image.png" class=" img-responsive img-markdown" /></p>
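<p dir="auto">The difference between the two modes can be sketched with score-only dynamic programming. The recurrences below are the standard Needleman-Wunsch (global) and Smith-Waterman (local) ones, which this post does not name itself; the scoring values (match +1, mismatch -1, gap -1) and the example sequences are assumptions chosen purely for illustration.</p>

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score: both sequences are aligned end to end."""
    # Row 0: aligning an empty prefix of a against b costs one gap per base.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman score: best pair of substrings; cell scores floor at 0."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# Hypothetical example: a short motif embedded in a longer sequence.
motif, longer = "GATTACA", "CCCCCGATTACACCCCC"
print("global:", global_score(motif, longer))
print("local: ", local_score(motif, longer))
```

<p dir="auto">With these assumed scores the global score is -3, because the ten unmatched flanking bases must each be paid for with a gap, while the local score is 7, because only the shared motif is scored. This is why local alignment suits sequences of different lengths.</p>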
<p dir="auto">Pairwise alignment can be performed with the following methods:</p>
<ol>
<li>Dot matrix analysis<br />
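Unless the sequences are known to be very similar, this method should be tried first, because it displays all possible alignments, including insertions, deletions, and repeats.<br />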
<img src="/assets/uploads/files/1599904214971-dot-matrix-method-of-two-dna-sequences-figure-modified-from-junqueira-et-al-2014.png" alt="Dot-matrix-method-of-two-DNA-sequences-Figure-modified-from-Junqueira-et-al-2014.png" class=" img-responsive img-markdown" /></li>
</ol>
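<p dir="auto">As a rough illustration of the idea, a dot matrix can be built by marking every cell where the two sequences carry the same character. This is a minimal sketch with hypothetical function names and example sequences; it marks single-character identities only and does not apply the kind of noise filtering that dedicated dot-plot tools provide.</p>

```python
def dot_matrix(seq_a, seq_b):
    """Return one string per base of seq_a: '*' where it equals a base of seq_b."""
    return ["".join("*" if x == y else "." for y in seq_b) for x in seq_a]


def print_dot_matrix(seq_a, seq_b):
    """Print the matrix with seq_b across the top and seq_a down the side."""
    print("  " + seq_b)
    for base, row in zip(seq_a, dot_matrix(seq_a, seq_b)):
        print(base + " " + row)


# Hypothetical example: the two sequences differ by one substitution.
print_dot_matrix("GATTACA", "GATCACA")
```

<p dir="auto">Runs of marks along a diagonal correspond to aligned stretches; a shift from one diagonal to another suggests an insertion or deletion, and parallel diagonals suggest repeats.</p>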
<p dir="auto"><a href="https://www.sanger.ac.uk/tool/seqtools/" rel="nofollow ugc">https://www.sanger.ac.uk/tool/seqtools/</a><br />
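SeqTools includes a tool called Dotter that can display dot-matrix plots.<br />
2. The dynamic programming (or DP) algorithm<br />
Memory-intensive, but guaranteed to find the optimal alignment under the chosen scoring scheme.<br />
3. Word or k-tuple methods, such as those used by the programs FASTA and BLAST<br />
Heuristic and much faster, so well suited to searching large databases.</p>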
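<p dir="auto">The core idea behind word (k-tuple) methods can be sketched as follows: index every length-k word of the target, look up the words of the query, and see which diagonal collects the most hits. This is only an illustration of the seeding step, not the actual FASTA or BLAST implementation; the function names, the value k=3, and the sequences are assumptions.</p>

```python
from collections import defaultdict


def kmer_index(seq, k):
    """Map every length-k word in seq to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def shared_words(query, target, k=3):
    """Return (query_pos, target_pos) pairs where the same k-word occurs in both."""
    index = kmer_index(target, k)
    hits = []
    for q in range(len(query) - k + 1):
        for t in index.get(query[q:q + k], []):
            hits.append((q, t))
    return hits


def best_diagonal(hits):
    """Group hits by diagonal (target_pos - query_pos); return (diagonal, count)."""
    counts = defaultdict(int)
    for q, t in hits:
        counts[t - q] += 1
    return max(counts.items(), key=lambda item: item[1]) if counts else None


# Hypothetical example: the query occurs inside the target at offset 2.
hits = shared_words("GATTACA", "CCGATTACACC", k=3)
print(hits)
print(best_diagonal(hits))
```

<p dir="auto">Here all five word hits fall on diagonal 2, which points to the region worth aligning in detail. Because only exact word matches are looked up, the search avoids filling a full DP matrix for every database sequence, at the cost of possibly missing weak similarities.</p>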
<p dir="auto">2.2 <strong>Multiple Sequence Alignment (MSA)</strong> is generally the alignment of three or more biological sequences (protein or nucleic acid) of similar length. From the output, homology can be inferred and the evolutionary relationships between the sequences studied.</p>
<p dir="auto">By contrast, Pairwise Sequence Alignment tools are used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences. <a href="https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Multiple+Sequence+Alignment" rel="nofollow ugc">https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Multiple+Sequence+Alignment</a></p>
<p dir="auto"><img src="/assets/uploads/files/1599897265359-%E4%B8%A4%E4%B8%A4%E6%AF%94%E5%AF%B9%E5%92%8C%E5%A4%9A%E5%A4%9A%E6%AF%94%E5%AF%B9%E7%9A%84%E5%B7%AE%E5%88%AB.png" alt="两两比对和多多比对的差别.png" class=" img-responsive img-markdown" /> <a href="https://www.semanticscholar.org/paper/Comparative-Analysis-of-Multiple-Sequence-Alignment-Mohamed-Mousa/c8c60c0708d1953196f6a558bab896c6e0ec9a1e" rel="nofollow ugc">https://www.semanticscholar.org/paper/Comparative-Analysis-of-Multiple-Sequence-Alignment-Mohamed-Mousa/c8c60c0708d1953196f6a558bab896c6e0ec9a1e</a></p>
<p dir="auto">3. Applications of sequence alignment<br />
3.1 Species classification<br />
<a href="https://help.ezbiocloud.net/pairwise-nucleotide-sequence-alignment/" rel="nofollow ugc">https://help.ezbiocloud.net/pairwise-nucleotide-sequence-alignment/</a><br />
<img src="/assets/uploads/files/1599897579316-%E6%AF%94%E5%AF%B9%E5%9C%A8%E7%89%A9%E7%A7%8D%E5%88%86%E7%B1%BB%E4%B8%AD%E7%9A%84%E5%BA%94%E7%94%A8.png" alt="比对在物种分类中的应用.png" class=" img-responsive img-markdown" /><br />
Distance score:<br />
mismatches/(matches+mismatches)<br />
3.2 Overlap determination in genome sequence assembly [4]<br />
3.3 Gene finding and comparison [4]<br />
3.4 Protein sequence comparison [4]</p>
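<p dir="auto">The distance score given in 3.1, mismatches/(matches+mismatches), can be computed from an existing pairwise alignment roughly as follows. This is a sketch: the function name and example alignment are hypothetical, and skipping columns that contain a gap is an assumption made because the formula counts only matches and mismatches.</p>

```python
def distance_score(aligned_a, aligned_b, gap="-"):
    """Return mismatches / (matches + mismatches) over the alignment columns.

    Assumption: columns containing a gap are skipped, since the formula
    counts only matches and mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = mismatches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gap column: neither a match nor a mismatch
        if x == y:
            matches += 1
        else:
            mismatches += 1
    compared = matches + mismatches
    if compared == 0:
        raise ValueError("no comparable columns")
    return mismatches / compared


# Hypothetical alignment: 7 matches, 1 mismatch, 2 gap columns.
print(distance_score("ACGT-ACGTA", "ACCTTACG-A"))
```

<p dir="auto">For this example the score is 1/8 = 0.125; a score of 0 means the compared positions are identical.</p>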
<p dir="auto">References:<br />
1. Bioinformatics: Sequence and Genome Analysis<br />
2. <a href="https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Pairwise+Sequence+Alignment" rel="nofollow ugc">https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Pairwise+Sequence+Alignment</a><br />
3. <a href="https://www.codeproject.com/Articles/304772/DNA-Sequence-Alignment-using-Dynamic-Programming-A" rel="nofollow ugc">https://www.codeproject.com/Articles/304772/DNA-Sequence-Alignment-using-Dynamic-Programming-A</a><br />
4. Reducing storage requirements for biological sequence comparison<br />
<a href="https://github.com/Peteraya/fer_bioinformatics" rel="nofollow ugc">https://github.com/Peteraya/fer_bioinformatics</a></p>
]]></description><link>http://an.forum.genostack.com/topic/82/序列比对研究</link><generator>RSS for Node</generator><lastBuildDate>Sat, 13 Jun 2026 09:36:56 GMT</lastBuildDate><atom:link href="http://an.forum.genostack.com/topic/82.rss" rel="self" type="application/rss+xml"/><pubDate>Sat, 12 Sep 2020 07:17:10 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Sequence Alignment Research on Fri, 25 Sep 2020 08:27:56 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://teacheng.illinois.edu/SequenceAlignmentDP/" rel="nofollow ugc">https://teacheng.illinois.edu/SequenceAlignmentDP/</a></p>
<p dir="auto">An interactive demonstration of the algorithm.</p>
]]></description><link>http://an.forum.genostack.com/post/131</link><guid isPermaLink="true">http://an.forum.genostack.com/post/131</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Fri, 25 Sep 2020 08:27:56 GMT</pubDate></item><item><title><![CDATA[Reply to Sequence Alignment Research on Fri, 25 Sep 2020 06:33:41 GMT]]></title><description><![CDATA[<p dir="auto"><strong>Dynamic Programming</strong><br />
Dynamic Programming is mainly an optimization over plain recursion. Wherever we see a recursive solution that makes repeated calls for the same inputs, we can optimize it using Dynamic Programming. The idea is simply to store the results of subproblems so that we do not have to recompute them when they are needed later. This simple optimization often reduces time complexity from exponential to polynomial. For example, a simple recursive solution for the Fibonacci numbers has exponential time complexity; if we optimize it by storing the solutions of subproblems, the time complexity drops to linear.<br />
<img src="/assets/uploads/files/1600256588032-68ef3a6a-bae2-4456-b131-af700d66312e-image.png" alt="68ef3a6a-bae2-4456-b131-af700d66312e-image.png" class=" img-responsive img-markdown" /></p>
<p dir="auto"><a href="/assets/uploads/files/1601015613834-what-is-dynamic-programming.pdf">What is dynamic programming.pdf</a></p>
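<p dir="auto">The Fibonacci example described above can be sketched in Python as follows. The function names are hypothetical; the memoized version uses the standard library's functools.lru_cache to store subproblem results, and the bottom-up version is one possible alternative that keeps only the last two values.</p>

```python
from functools import lru_cache


def fib_naive(n):
    """Plain recursion: recomputes the same subproblems, exponential time."""
    return n if n in (0, 1) else fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recurrence, but each result is cached, so each n is computed once."""
    return n if n in (0, 1) else fib_memo(n - 1) + fib_memo(n - 2)


def fib_table(n):
    """Bottom-up variant: iterates from the base cases, constant extra memory."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print(fib_naive(20), fib_memo(20), fib_table(20))
```

<p dir="auto">All three functions return the same values; the difference is that fib_naive becomes impractically slow as n grows, while the other two need only a linear number of additions.</p>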
]]></description><link>http://an.forum.genostack.com/post/122</link><guid isPermaLink="true">http://an.forum.genostack.com/post/122</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Fri, 25 Sep 2020 06:33:41 GMT</pubDate></item></channel></rss>