暗能星系


    Sequence Alignment Research

    • A
      anneng (last edited by anneng)

      1. What is sequence alignment
      https://en.wikipedia.org/wiki/Sequence_alignment
      Sequence alignment is the procedure of comparing two (pairwise alignment) or more (multiple sequence alignment) sequences by searching for a series of individual characters or character patterns that are in the same order in the sequences.

      2. Types of sequence alignment
      2.1 Pairwise Sequence Alignment (PSA) is used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences (protein or nucleic acid).
      https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Pairwise+Sequence+Alignment
      Pairwise alignment is further divided into the following types:
      21bbeda6-b24e-4bd1-8b7e-32c38ae8ee34-image.png
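      Global alignment tries to align the sequences over their entire length. It suits sequences that are highly similar and of roughly equal length.
      Local alignment tries to find contiguous subsequences that match locally. It suits sequences that differ in length but share common features.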
      553ecd46-7704-47fd-8124-c8cf575ad832-image.png
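To make the global/local distinction concrete, here is a minimal scoring sketch. It is not taken from any of the linked pages: the function name `alignment_score` and the scores (match +1, mismatch -1, linear gap -2) are assumptions chosen for illustration. With `local=False` it fills the table in the Needleman-Wunsch style (the whole of both sequences must be aligned); with `local=True` it uses the Smith-Waterman style (cells are floored at 0 and the best cell anywhere is returned).

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Best alignment score of a and b by dynamic programming.

    local=False: global score over the full length of both sequences.
    local=True:  local score of the best-matching pair of substrings.
    The scoring values are illustrative assumptions, not a standard.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # Global alignment: leading gaps are charged in the first row/column.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # Local alignment: a negative prefix is dropped (restart at 0).
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell

    return best if local else score[-1][-1]
```

For two sequences that only share a core region, for example `"TTTGATTACATTT"` and `"CCGATTACACC"`, the local score rewards the shared `GATTACA` core, while the global score is pulled down by the unrelated flanks.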

      Pairwise alignment can be done with the following methods:

      1. Dot matrix analysis
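        Unless the sequences are known to be very similar, this method should be used first, because it displays all possible alignments, including insertions, deletions and repeats.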
        Dot-matrix-method-of-two-DNA-sequences-Figure-modified-from-Junqueira-et-al-2014.png

      https://www.sanger.ac.uk/tool/seqtools/
      SeqTools includes a tool called Dotter that can display dot-matrix plots.
      2. The dynamic programming (or DP) algorithm
        Relatively memory-hungry, but well suited to finding the optimal alignment.
      3. Word or k-tuple methods, such as used by the programs FASTA and BLAST
        Well suited to searching large databases.
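A rough sketch of methods 1 and 3 from the list above, under simplifying assumptions. `dot_matrix` marks every position where the two sequences carry the same character, so shared regions show up as diagonal runs. `shared_words` indexes the k-letter words of one sequence and looks up the words of the other, which is only the seeding idea behind word methods; FASTA and BLAST add scoring, extension and statistics that are not shown here. Both function names and the default `k=3` are made up for this example.

```python
from collections import defaultdict


def dot_matrix(a, b):
    """Return one text row per character of a: '*' where a[i] == b[j], else '.'."""
    return ["".join("*" if x == y else "." for y in b) for x in a]


def shared_words(a, b, k=3):
    """Return (i, j) pairs where the k-letter word starting at a[i] equals the one at b[j]."""
    # Index every k-letter word of b by its start position.
    index = defaultdict(list)
    for j in range(len(b) - k + 1):
        index[b[j:j + k]].append(j)

    # Look up each k-letter word of a in the index.
    hits = []
    for i in range(len(a) - k + 1):
        for j in index.get(a[i:i + k], []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    for row in dot_matrix("GATTACA", "ATTAC"):
        print(row)
    print(shared_words("GATTACA", "ATTAC"))
```

Hits that share the same value of `i - j` lie on one diagonal of the dot matrix, which is why word hits are a cheap way to find candidate regions before running a full dynamic-programming alignment on them.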

      2.2 Multiple Sequence Alignment (MSA) is generally the alignment of three or more biological sequences (protein or nucleic acid) of similar length. From the output, homology can be inferred and the evolutionary relationships between the sequences studied.

      By contrast, Pairwise Sequence Alignment tools are used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences.
      https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Multiple+Sequence+Alignment

      Figure: differences between pairwise and multiple sequence alignment (.png)
      https://www.semanticscholar.org/paper/Comparative-Analysis-of-Multiple-Sequence-Alignment-Mohamed-Mousa/c8c60c0708d1953196f6a558bab896c6e0ec9a1e

      3. Applications of sequence alignment
      3.1 Species classification
      https://help.ezbiocloud.net/pairwise-nucleotide-sequence-alignment/
      Figure: use of alignment in species classification (.png)
      Distance score:
      mismatches / (matches + mismatches)
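The distance score above can be computed from two already-aligned sequences roughly as follows. One assumption to flag: the formula does not say how gapped columns are counted, so this sketch skips any column containing the gap character `-`. The function name `distance_score` is hypothetical.

```python
def distance_score(aligned_a, aligned_b, gap="-"):
    """mismatches / (matches + mismatches) over the aligned columns.

    Assumption: columns where either sequence has a gap are ignored,
    since the formula only mentions matches and mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = mismatches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gapped column: not counted (assumption)
        if x == y:
            matches += 1
        else:
            mismatches += 1

    compared = matches + mismatches
    if compared == 0:
        raise ValueError("no comparable columns")
    return mismatches / compared
```

A score of 0 means every compared column is identical, and larger values mean more divergent sequences.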
      3.2 Overlap determination in genome sequence assembly [4]
      3.3 Gene finding and comparison [4]
      3.4 Protein sequence comparison [4]

      References:
      1. Bioinformatics: Sequence and Genome Analysis
      2. https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Pairwise+Sequence+Alignment
      3. https://www.codeproject.com/Articles/304772/DNA-Sequence-Alignment-using-Dynamic-Programming-A
      4. Reducing storage requirements for biological sequence comparison
      https://github.com/Peteraya/fer_bioinformatics

      • A
        anneng (last edited by anneng)

        Dynamic Programming
        Dynamic Programming is mainly an optimization over plain recursion. Wherever we see a recursive solution that makes repeated calls for the same inputs, we can optimize it using Dynamic Programming. The idea is simply to store the results of subproblems so that we do not have to recompute them when they are needed later. This simple optimization reduces time complexity from exponential to polynomial. For example, if we write a simple recursive solution for Fibonacci numbers, we get exponential time complexity, and if we optimize it by storing the solutions of subproblems, the time complexity reduces to linear.
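The Fibonacci example from the paragraph above, as a small sketch. The three function names are made up for illustration: `fib_naive` is the plain recursion, `fib_memo` stores subproblem results through Python's `functools.lru_cache`, and `fib_table` is a bottom-up variant that keeps only the last two values.

```python
from functools import lru_cache


def fib_naive(n):
    """Plain recursion: recomputes the same subproblems, exponential time."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Top-down DP: each fib_memo(k) is computed once and then cached."""
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)


def fib_table(n):
    """Bottom-up DP: linear time, constant extra memory."""
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev


if __name__ == "__main__":
    print(fib_naive(10), fib_memo(50), fib_table(50))
```

The same "store the subproblem results" idea is what the alignment table does: each cell is the solution for a pair of prefixes and is computed once.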
        68ef3a6a-bae2-4456-b131-af700d66312e-image.png

        What is dynamic programming.pdf

        • A
          anneng (last edited)

          https://teacheng.illinois.edu/SequenceAlignmentDP/

          An interactive demonstration of the algorithm.
