暗能星系

    blast

    • anneng, last edited by anneng

      BLAST support for reverse-complement sequences:
      Understanding BLAST and Reverse Complementary RNA, and Discussing miRcore’s Future Leadership – miRcore.pdf
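As far as I know, blastn searches both strands of a nucleotide query by default (-strand both), so a hit on the reverse complement is reported without extra work. To check such a hit by hand, a small helper along these lines may be enough. The name revcomp is made up for this sketch, and only plain A/C/G/T (upper or lower case) is complemented; any other character is passed through as-is.

```sh
# revcomp SEQ
# Prints the reverse complement of a DNA string.
# Assumption: only A/C/G/T are complemented; other characters are left unchanged.
revcomp() {
    printf '%s\n' "$1" |
        tr 'ACGTacgt' 'TGCAtgca' |
        awk '{
            out = ""
            # Walk the complemented string from its last character to its first.
            for (i = length($0); i >= 1; i--) out = out substr($0, i, 1)
            print out
        }'
}

revcomp ATGC
```

A sequence that equals its own reverse complement (for example the HindIII site AAGCTT) comes back unchanged, which makes a quick sanity check.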
      BLAST appears to use this database (LMDB):
      http://www.lmdb.tech/doc/

      Installation:
      • For get_species_taxids.sh
      ⚬ E-Direct: https://dataguide.nlm.nih.gov/edirect/install.html

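      # Run from the home directory, because PATH below points at $HOME/edirect
      cd ~
      # Download the EDirect archive from the NCBI FTP server
      perl -MNet::FTP -e \
          '$ftp = new Net::FTP("ftp.ncbi.nlm.nih.gov", Passive => 1);
          $ftp->login; $ftp->binary;
          $ftp->get("/entrez/entrezdirect/edirect.tar.gz");'
      # Unpack the archive (creates ./edirect)
      gunzip -c edirect.tar.gz | tar xf -
      # Make the EDirect tools available in the current shell session
      export PATH="${PATH}:${HOME}/edirect"
      # Perl dependencies, installed here via apt (Debian/Ubuntu)
      sudo apt install libxml-simple-perl
      sudo apt --fix-broken install libio-socket-ssl-perl libnet-ssleay-perl perl-openssl-abi-1.1
      ./edirect/setup.sh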
      

      • For update_blastdb.pl
      ⚬ Perl: https://www.perl.org/

      Importing a BLAST database into Elasticsearch (ES):
      https://bitbucket.org/hspsdb/hspsdb-indexer/src/master/
      It is not clear whether ES can perform a comparable similarity search.

      • anneng

        I built my own database, but the search found no matching sequences:
        https://bioinformatics.stackexchange.com/questions/4226/blastn-no-hits-found

        • anneng

          https://blastalgorithm.com/

          • anneng

            SparkBLAST:
            https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-017-1723-8
            ScalaBLAST:
            https://www.researchgate.net/publication/3301079_ScalaBLAST_A_Scalable_Implementation_of_BLAST_for_High-Performance_Data-Intensive_Bioinformatics_Analysis

            • anneng

              https://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/algo/blast/
              BLAST database file formats (for a nucleotide database):
              .nhr is the header file,
              .nin is the index file,
              .nsq is the sequence file
              https://www.biostars.org/p/111501/
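Protein databases use the same three roles with a p prefix instead of n (.phr, .pin, .psq). A rough sketch that lists the core files to expect for a given database name; blastdb_files is a made-up helper name, and newer database versions write additional files that are not listed here.

```sh
# blastdb_files NAME TYPE
# Prints the three core file names for a database called NAME.
# TYPE is "nucl" (n-prefixed extensions) or "prot" (p-prefixed extensions).
blastdb_files() {
    case "$2" in
        nucl) prefix=n ;;
        prot) prefix=p ;;
        *) echo "unknown database type: $2" >&2; return 1 ;;
    esac
    # header, index, sequence
    printf '%s\n' "$1.${prefix}hr" "$1.${prefix}in" "$1.${prefix}sq"
}

blastdb_files mydb nucl
```

Checking that these files exist next to each other is a quick way to see whether makeblastdb finished for a given name.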

              • anneng, last edited by anneng

                BLAST database format:
                https://www.yumpu.com/en/document/view/31537242/ncbi-blast-database-format-janelia-farm-research-campus
                NCBI BLAST Database Format - Janelia Farm Research Campus.pdf

                • anneng

                  http://sequenceserver.com
                  A BLAST front end with a friendlier interface

                  • anneng

                    https://www.ncbi.nlm.nih.gov/IEB/ToolBox/SDKDOCS/DATAMODL.HTML
                    The NCBI data model

                    • anneng

                      cfa237a1-d69f-416e-a28f-28cd8e867dd2-image.png

                      https://open.oregonstate.education/computationalbiology/chapter/command-line-blast/

                      • anneng

                        BLASTn (Nucleotide BLAST): compares one or more nucleotide query sequences to a subject nucleotide sequence or a database of nucleotide sequences. This is useful when trying to determine the evolutionary relationships among different organisms.
                        BLASTx (translated nucleotide sequence searched against protein sequences): compares a nucleotide query sequence that is translated in six reading frames (resulting in six protein sequences) against a database of protein sequences. Because blastx translates the query sequence in all six reading frames and provides combined significance statistics for hits to different frames, it is particularly useful when the reading frame of the query sequence is unknown or it contains errors that may lead to frame shifts or other coding errors. Thus blastx is often the first analysis performed with a newly determined nucleotide sequence.
                        tBLASTn (protein sequence searched against translated nucleotide sequences): compares a protein query sequence against the six-frame translations of a database of nucleotide sequences. tblastn is useful for finding homologous protein coding regions in unannotated nucleotide sequences such as expressed sequence tags (ESTs) and draft genome records (HTG), located in the BLAST databases est and htgs, respectively. ESTs are short, single-read cDNA sequences. They comprise the largest pool of sequence data for many organisms and contain portions of transcripts from many uncharacterized genes. Since ESTs have no annotated coding sequences, there are no corresponding protein translations in the BLAST protein databases. Hence a tblastn search is the only way to search for these potential coding regions at the protein level. The HTG sequences, draft sequences from various genome projects or large genomic clones, are another large source of unannotated coding regions.
                        BLASTp (Protein BLAST): compares one or more protein query sequences to a subject protein sequence or a database of protein sequences. This is useful when trying to identify a protein.
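The four descriptions above boil down to a lookup on two things: the type of the query and the type of the database. A minimal sketch of that lookup follows; pick_blast is a hypothetical helper name, and tblastx (translated query against translated database) is left out because it is not in the list above.

```sh
# pick_blast QUERY_TYPE DB_TYPE
# Each argument is "nucl" or "prot"; prints the matching BLAST+ program name.
pick_blast() {
    case "$1:$2" in
        nucl:nucl) echo blastn ;;
        nucl:prot) echo blastx ;;   # query is translated in six frames
        prot:nucl) echo tblastn ;;  # database is translated in six frames
        prot:prot) echo blastp ;;
        *) echo "expected nucl or prot, got: $1 $2" >&2; return 1 ;;
    esac
}

pick_blast nucl prot   # prints blastx
```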

                        • anneng

                          https://www.ncbi.nlm.nih.gov/books/NBK279688/
                          Building a BLAST database with your (local) sequences

                          $ makeblastdb -in test.fsa -parse_seqids -blastdb_version 5 -taxid_map test_map.txt -title "Cookbook demo" -dbtype prot
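For trying this out without real data, the two input files can be faked. The sketch below writes a tiny protein FASTA and a matching taxid map (one line per sequence: sequence ID, then taxonomy ID), then runs the same makeblastdb call if BLAST+ is installed. The file names demo.fsa and demo_map.txt, the sequence IDs, and the residues are all made up for illustration; 9606 and 10090 are the NCBI taxonomy IDs for human and mouse.

```sh
# Hypothetical two-sequence protein FASTA; IDs and residues are invented.
cat > demo.fsa <<'EOF'
>seq1 hypothetical protein one
MKTAYIAKQR
>seq2 hypothetical protein two
MSDNELLKKA
EOF

# Taxid map: <SequenceId> <TaxonomyId>, one pair per line.
cat > demo_map.txt <<'EOF'
seq1 9606
seq2 10090
EOF

# Only build the database when BLAST+ is actually on the PATH.
if command -v makeblastdb >/dev/null 2>&1; then
    makeblastdb -in demo.fsa -parse_seqids -blastdb_version 5 \
        -taxid_map demo_map.txt -title "Cookbook demo" -dbtype prot
else
    echo "makeblastdb not found, input files written only"
fi
```

The IDs in the map have to match the IDs parsed from the FASTA headers, which is why -parse_seqids is used together with -taxid_map.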
                          
                          • anneng

                            Compare two sequences directly:
                            blastn -query test.fasta -subject ac008901.fasta

                            Alternatively, turn one of the sequences into a database with makeblastdb and then query it.
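The database route could look roughly like this, reusing the two file names from the command above. The database name is simply the subject file name without its extension, and -outfmt 6 (tabular output) is my own choice for readability, not something the original command used. The guard skips everything when BLAST+ or the input files are missing.

```sh
subject=ac008901.fasta
query=test.fasta
# Database name: subject file name without the .fasta suffix.
db=${subject%.fasta}

if command -v makeblastdb >/dev/null 2>&1 && [ -f "$subject" ] && [ -f "$query" ]; then
    # Build a nucleotide database from the subject sequence ...
    makeblastdb -in "$subject" -dbtype nucl -out "$db"
    # ... and search it with the query, printing tabular hits.
    blastn -query "$query" -db "$db" -outfmt 6
else
    echo "skipped: BLAST+ or input files not found"
fi
```

For a one-off comparison -subject is less work; building a database pays off once the same subject is searched repeatedly.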

                            • anneng

                              BLAST algorithm flow:
                              1eee102e-4268-4a63-a4ad-caec9006809f-image.png
                              https://pdf.sciencedirectassets.com/280203/1-s2.0-S1877050918X00039/1-s2.0-S1877050918301108/main.pdf (ScienceDirect PII S1877050918301108)
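BLAST is usually described as a seed-and-extend heuristic: find short exact word matches between query and subject, then extend them into alignments. Below is a toy illustration of the seeding step only, written with awk; it is not NCBI's implementation, and there is no extension or scoring. The function name seed_hits and the word size of 4 in the example are assumptions chosen to keep the demo short; real nucleotide searches use considerably larger words.

```sh
# seed_hits QUERY SUBJECT W
# Prints "query_pos subject_pos word" (1-based positions) for every
# exact word of length W shared by QUERY and SUBJECT.
seed_hits() {
    awk -v q="$1" -v s="$2" -v w="$3" 'BEGIN {
        # Index every length-w word of the query by its start positions.
        for (i = 1; i + w - 1 <= length(q); i++) {
            word = substr(q, i, w)
            pos[word] = (word in pos) ? (pos[word] " " i) : i
        }
        # Scan the subject and report each word that also occurs in the query.
        for (j = 1; j + w - 1 <= length(s); j++) {
            word = substr(s, j, w)
            if (word in pos) {
                n = split(pos[word], qp, " ")
                for (k = 1; k <= n; k++) print qp[k], j, word
            }
        }
    }'
}

seed_hits ACGTAC TTACGT 4   # prints: 1 3 ACGT
```

Each printed line is one seed; a full implementation would try to extend the alignment left and right from every seed and keep only the high-scoring ones.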

                              • anneng

                                https://open.oregonstate.education/computationalbiology/chapter/command-line-blast/
                                A BLAST tutorial
