暗能星系

    Metagenome assembly

    Microbiome analysis
    • anneng

      https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0169662
      This article evaluates metagenome assembly software. Rather than the mock data from the "Critical Assessment of Metagenomic Interpretation" (CAMI), it uses real NGS data.

      1. Why assemble?
      Read lengths of modern sequencing technologies are increasing as well (S1 Table), making a large depth of phylogenetic and community-based functional analyses already possible by directly examining the unassembled sequencing reads. However, the assembly of overlapping reads into continuous or semi-continuous genome fragments–so called contigs or scaffolds—allows an even more detailed view of different aspects within a genomic context. This allows the reconstruction of full-length gene sequences (and even better gene clusters), which can be much more reliably assigned to specific functions or taxa compared to partial gene fragments found on unassembled reads. Longer assembled sequences also enable a more sensitive detection of larger complex genomic features such as Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), polyketide synthase (PKS) or non-ribosomal peptide synthase (NRPS) gene clusters encoding for secondary metabolites.
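
The idea of joining overlapping reads into a longer fragment can be illustrated with a toy sketch. This is only an exact suffix-prefix merge of two reads, not how the assemblers discussed in this thread work internally; the function names and the `min_overlap` parameter are made up for illustration.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no overlap of at least `min_overlap` bases exists.
    """
    max_len = min(len(left), len(right))
    # Try the longest candidate overlap first and shrink until one matches.
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left, right, min_overlap=3):
    """Merge two reads into one contig-like string, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    # Keep `left` whole and append only the non-overlapping tail of `right`.
    return left + right[size:]
```

For example, `merge_reads("ATGGCGTAC", "GTACCTTGA")` joins the two reads on their shared `GTAC`. Real reads contain sequencing errors and repeats, which is why practical assemblers rely on graph-based methods rather than pairwise exact merging.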

      2. Why bin?
      In addition, the broader genomic context of interesting features may be further elucidated by sorting (or “binning”) partially assembled genome fragments into categories (so-called “bins”). The aim of this approach is to separate fragments that likely originate from different species while grouping those together that likely belong to the same species, leading to partial or even complete reconstruction of genomes from metagenomic datasets.
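
As a rough illustration of what "sorting fragments into bins" can be based on, the sketch below computes two sequence-composition features for a contig: GC content and a tetranucleotide frequency profile. It assumes that contigs from the same genome tend to have similar composition; binning tools typically combine features like these with read coverage, and the function names here are illustrative only.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_profile(seq):
    """Relative frequency of each tetranucleotide, as a 256-element list.

    Windows containing ambiguous bases (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]
```

Contigs whose profiles lie close together (by Euclidean distance, for instance) are candidates for the same bin; clustering those vectors is left out of this sketch.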

      The article's results favor SPAdes.

      https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053804/
      The article also mentions a pipeline, MetAMOS.

      • anneng

        https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5502489/
        Assembling metagenomes, one community at a time
        This article recommends a process for choosing an assembler.

        • anneng

          https://www.lcsciences.com/documents/sample_data/metagenomics/Metagenomics_html_report_DEMO.html

          Sample metagenomic analysis report

          • anneng

            https://github.com/metagenome-atlas/atlas

            • anneng

              https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8611953/

              • anneng

                https://blog.csdn.net/weixin_39633455/article/details/116951189
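                Tool 1: cutadapt
                # The input holds interleaved paired-end reads (R1 and R2 alternate in one file)
                input=test.fq.gz
                mkdir -p cutadapt
                cutadapt_input=$input
                cutadapt_out=cutadapt/trimed.fastq.gz
                interleaved=--interleaved
                # -a/-A: 3' adapter for R1/R2; -q 30: quality-trim 3' ends; -m 20: discard reads shorter than 20 nt
                # --trim-n: trim Ns from read ends; -O 10: require at least 10 nt of overlap with the adapter
                cutadapt $interleaved -a AGATCGGAAGAGC -A AGATCGGAAGAGC -q 30 -m 20 --trim-n -O 10 -o "$cutadapt_out" "$cutadapt_input"

                Tool 2: megahit
                # --12: interleaved paired-end input; megahit creates the output directory itself
                input_fa=$cutadapt_out
                assembly_out=assembly_out
                megahit --12 "$input_fa" --k-max 149 --max-tip-len 200 --min-contig-len 300 -o "$assembly_out"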
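
After assembly it is common to summarize contig lengths before moving on to gene prediction. The sketch below is one way to do that, assuming the contigs are in a plain FASTA file such as the `final.contigs.fa` used in the next step; the function names are illustrative, and the N50 definition used is the usual one (the length of the contig at which the cumulative length, from longest to shortest, reaches half of the total).

```python
def fasta_lengths(lines):
    """Return the sequence length of every record in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """N50 of a list of contig lengths; 0 for an empty list."""
    if not lengths:
        return 0
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
```

With an open file handle, usage would look like `with open("assembly_out/final.contigs.fa") as fh: print(n50(fasta_lengths(fh)))`.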

                Tool 3: MetaGeneMark
                mkdir -p predict_gene
                input_dir=assembly_out
                predict_gene_out=predict_gene
                model_file=../MetaGeneMark_linux_64/mgm/MetaGeneMark_v1.mod
                # gmhmmp reads its license key from ~/.gm_key
                cp ../MetaGeneMark_linux_64/gm_key ~/.gm_key
                # -f G: GFF output; -A / -D: write predicted protein / nucleotide sequences to FASTA files
                gmhmmp -d -f G -m "$model_file" -o "$predict_gene_out/out.gff" -A "$predict_gene_out/final.prot.fa" -D "$predict_gene_out/final.nucl.fa" "$input_dir/final.contigs.fa"

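                Tool 4: cd-hit
                mkdir -p unigene_set
                # filter_predict_nucl.py is the poster's own script (not shown)
                python filter_predict_nucl.py "$predict_gene_out/final.nucl.fa" "$predict_gene_out/filter_final.nucl.fa"
                # The input is nucleotide sequence, so use cd-hit-est (plain cd-hit clusters proteins)
                # -c 0.95: 95% identity; -aS 0.9: alignment must cover 90% of the shorter sequence; -T 0: use all CPUs
                cd-hit-est -i "$predict_gene_out/filter_final.nucl.fa" -o unigene_set/unigene.fa -c 0.95 -aS 0.9 -d 0 -M 10000 -T 0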
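
The thread does not show `filter_predict_nucl.py`, so what it filters on is unknown. Purely as a hypothetical stand-in, the sketch below drops predicted genes shorter than a minimum length; the 100 nt default, the function names, and the criterion itself are all assumptions, not the poster's script.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_length=100):
    """Keep records whose sequence has at least `min_length` bases (assumed criterion)."""
    return [(header, seq) for header, seq in records if len(seq) >= min_length]


def filter_fasta_file(in_path, out_path, min_length=100):
    """Read a FASTA file, apply the length filter, and write the survivors."""
    with open(in_path) as src:
        kept = filter_by_length(parse_fasta(src), min_length)
    with open(out_path, "w") as dst:
        for header, seq in kept:
            dst.write(header + "\n" + seq + "\n")
    return len(kept)
```

To match the command line used above, a script could call `filter_fasta_file(sys.argv[1], sys.argv[2])` from its entry point.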

                Tool 5: diamond
                mkdir -p function_anno
                # Database files must be downloaded separately and formatted with diamond makedb
                database_eggNOG=.../metagenomics/function/database/e5.proteomes
                diamond_eggNOG=function_anno/unigene.e5
                database_CARD=.../metagenomics/function/database/CARD/CARD.protein
                diamond_CARD=function_anno/unigene.CARD
                database_CAZy=.../metagenomics/function/database/CAZy/CAZyDB.07202017
                diamond_CAZy=function_anno/unigene.CAZyDB
                database_PHI=.../metagenomics/function/database/PHI/phi-base_current
                diamond_PHI=function_anno/unigene.phi
                # blastx translates the nucleotide unigenes and searches them against each protein database
                diamond blastx -d "$database_eggNOG" -q unigene_set/unigene.fa -o "$diamond_eggNOG" --evalue 0.00001
                diamond blastx -d "$database_CARD" -q unigene_set/unigene.fa -o "$diamond_CARD" --evalue 0.00001
                diamond blastx -d "$database_CAZy" -q unigene_set/unigene.fa -o "$diamond_CAZy" --evalue 0.00001
                diamond blastx -d "$database_PHI" -q unigene_set/unigene.fa -o "$diamond_PHI" --evalue 0.00001
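
The commands above do not set an output format, so DIAMOND writes its default BLAST-style tabular output with 12 tab-separated columns (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score). A common next step is to keep one hit per unigene. The sketch below picks the highest bit score per query; that choice of criterion and the function name are assumptions rather than something stated in the thread.

```python
def best_hits(lines):
    """Map each query ID to (subject ID, e-value, bit score) of its highest-scoring hit.

    `lines` is an iterable of rows in the 12-column BLAST tabular format.
    """
    best = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip rows that do not have all 12 columns
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        # Keep the first hit seen for a query unless a later one scores strictly higher.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

With one of the result files from above, usage would look like `with open("function_anno/unigene.CARD") as fh: hits = best_hits(fh)`.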

                • anneng

                  https://teaching.healthtech.dtu.dk/22126/index.php/Metagenomic_assembly_exercise

                  • anneng

                    biomolecules-11-00530-v2.pdf

                    • anneng

                      https://link.springer.com/article/10.1186/s12864-019-6289-6

                      • anneng

                        https://www.mdpi.com/2076-2607/8/5/669/htm#fig_body_display_microorganisms-08-00669-f001

                        • anneng

                          https://bioinformaticsworkbook.org/dataAnalysis/Metagenomics/MetagenomicsP1.html#gsc.tab=0

                          • anneng

                            https://blogs.iu.edu/ncgas/2021/01/26/scaffold-length-histograms/
