<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[宏基因的组装 metagenomics assembly]]></title><description><![CDATA[<p dir="auto"><a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0169662" rel="nofollow ugc">https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0169662</a><br />
This article evaluates metagenome assembly software. Instead of the mock datasets from the Critical Assessment of Metagenomic Interpretation (CAMI), it uses real NGS data.<br />
<img src="/assets/uploads/files/1648006586843-28125a85-2779-4681-bfd0-2bf30029f371-image-resized.png" alt="28125a85-2779-4681-bfd0-2bf30029f371-image.png" class=" img-responsive img-markdown" /></p>
<p dir="auto">1. Why assemble?<br />
Read lengths of modern sequencing technologies are increasing as well (S1 Table), making a large depth of phylogenetic and community-based functional analyses already possible by directly examining the unassembled sequencing reads. However, the assembly of overlapping reads into continuous or semi-continuous genome fragments–so called contigs or scaffolds—allows an even more detailed view of different aspects within a genomic context. This allows the reconstruction of full-length gene sequences (and even better gene clusters), which can be much more reliably assigned to specific functions or taxa compared to partial gene fragments found on unassembled reads. Longer assembled sequences also enable a more sensitive detection of larger complex genomic features such as Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), polyketide synthase (PKS) or non-ribosomal peptide synthase (NRPS) gene clusters encoding for secondary metabolites.</p>
<p dir="auto">2. Why bin?<br />
In addition, the broader genomic context of interesting features may be further elucidated by sorting (or “binning”) partially assembled genome fragments into categories (so-called “bins”). The aim of this approach is to separate fragments that likely originate from different species while grouping those together that likely belong to the same species, leading to partial or even complete reconstruction of genomes from metagenomic datasets.</p>
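<p dir="auto">To make the idea of sorting contigs into bins a little more concrete, here is a toy sketch that computes per-contig GC content and splits contigs at a cutoff. This is only an illustration: real binners combine richer signals such as tetranucleotide frequencies and read coverage. The contig names, sequences, the 0.5 cutoff, and the function name gc_groups are all made up for the example.</p>

```shell
set -eu

# Toy contigs (hypothetical data, not from any real assembly).
workdir=$(mktemp -d)
fasta="$workdir/demo_contigs.fa"
printf '%s\n' \
  '>contig_a' 'GGGCCCGGGC' \
  '>contig_b' 'ATATATATGC' \
  '>contig_c' 'GCGCGCGCAT' > "$fasta"

# Print tab-separated name, GC fraction and group for each contig.
# The cutoff is an arbitrary illustration value passed as the second argument.
gc_groups() {
  awk -v cutoff="$2" '
    function emit(   s, gc, frac) {
      if (name == "" || length(seq) == 0) return
      s = toupper(seq)
      gc = gsub(/[GC]/, "", s)        # gsub returns the number of G/C bases removed
      frac = gc / length(seq)
      printf "%s\t%.2f\t%s\n", name, frac, (frac >= cutoff ? "high_gc" : "low_gc")
    }
    /^>/ { emit(); name = substr($0, 2); seq = ""; next }
    { seq = seq $0 }                  # sequences may span several lines
    END { emit() }' "$1"
}

groups=$(gc_groups "$fasta" 0.5)
echo "$groups"
```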
<p dir="auto">The article's results indicate that SPAdes performs comparatively well.</p>
<p dir="auto"><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053804/" rel="nofollow ugc">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053804/</a><br />
The article also mentions a pipeline, MetAMOS.<br />
<img src="/assets/uploads/files/1648008689357-4d9804e5-20a6-4bf3-b4c4-bb29c76c070b-image.png" alt="4d9804e5-20a6-4bf3-b4c4-bb29c76c070b-image.png" class=" img-responsive img-markdown" /></p>
]]></description><link>http://an.forum.genostack.com/topic/600/宏基因的组装-metagenomics-assembly</link><generator>RSS for Node</generator><lastBuildDate>Sat, 13 Jun 2026 14:27:39 GMT</lastBuildDate><atom:link href="http://an.forum.genostack.com/topic/600.rss" rel="self" type="application/rss+xml"/><pubDate>Wed, 23 Mar 2022 04:11:32 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Sat, 26 Mar 2022 11:14:44 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://blogs.iu.edu/ncgas/2021/01/26/scaffold-length-histograms/" rel="nofollow ugc">https://blogs.iu.edu/ncgas/2021/01/26/scaffold-length-histograms/</a><br />
<img src="/assets/uploads/files/1648293282844-a793f3b2-9a06-4d57-a018-b3f6bfccef46-image.png" alt="a793f3b2-9a06-4d57-a018-b3f6bfccef46-image.png" class=" img-responsive img-markdown" /></p>
]]></description><link>http://an.forum.genostack.com/post/1330</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1330</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Sat, 26 Mar 2022 11:14:44 GMT</pubDate></item><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Thu, 24 Mar 2022 11:51:51 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://bioinformaticsworkbook.org/dataAnalysis/Metagenomics/MetagenomicsP1.html#gsc.tab=0" rel="nofollow ugc">https://bioinformaticsworkbook.org/dataAnalysis/Metagenomics/MetagenomicsP1.html#gsc.tab=0</a><br />
<img src="/assets/uploads/files/1648122710547-95068b96-96ca-4599-bff6-78b675db003b-image.png" alt="95068b96-96ca-4599-bff6-78b675db003b-image.png" class=" img-responsive img-markdown" /></p>
]]></description><link>http://an.forum.genostack.com/post/1316</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1316</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 24 Mar 2022 11:51:51 GMT</pubDate></item><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Thu, 24 Mar 2022 11:46:32 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://www.mdpi.com/2076-2607/8/5/669/htm#fig_body_display_microorganisms-08-00669-f001" rel="nofollow ugc">https://www.mdpi.com/2076-2607/8/5/669/htm#fig_body_display_microorganisms-08-00669-f001</a></p>
]]></description><link>http://an.forum.genostack.com/post/1315</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1315</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 24 Mar 2022 11:46:32 GMT</pubDate></item><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Thu, 24 Mar 2022 04:17:08 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://link.springer.com/article/10.1186/s12864-019-6289-6" rel="nofollow ugc">https://link.springer.com/article/10.1186/s12864-019-6289-6</a><br />
<img src="/assets/uploads/files/1648095426857-739fe5b4-7cf4-4451-bdd0-af2aca113b07-image.png" alt="739fe5b4-7cf4-4451-bdd0-af2aca113b07-image.png" class=" img-responsive img-markdown" /></p>
]]></description><link>http://an.forum.genostack.com/post/1311</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1311</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 24 Mar 2022 04:17:08 GMT</pubDate></item><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Thu, 24 Mar 2022 04:03:03 GMT]]></title><description><![CDATA[<p dir="auto"><a href="/assets/uploads/files/1648094485056-biomolecules-11-00530-v2.pdf">biomolecules-11-00530-v2.pdf</a><br />
<img src="/assets/uploads/files/1648094558571-bb7837c5-8dad-462d-9ea6-5732af8ee530-image.png" alt="bb7837c5-8dad-462d-9ea6-5732af8ee530-image.png" class=" img-responsive img-markdown" /></p>
<p dir="auto"><img src="/assets/uploads/files/1648094581744-b1a5a4a4-a762-4cd6-b352-77d787072e0a-image.png" alt="b1a5a4a4-a762-4cd6-b352-77d787072e0a-image.png" class=" img-responsive img-markdown" /></p>
]]></description><link>http://an.forum.genostack.com/post/1310</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1310</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 24 Mar 2022 04:03:03 GMT</pubDate></item><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Thu, 24 Mar 2022 03:55:47 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://teaching.healthtech.dtu.dk/22126/index.php/Metagenomic_assembly_exercise" rel="nofollow ugc">https://teaching.healthtech.dtu.dk/22126/index.php/Metagenomic_assembly_exercise</a></p>
]]></description><link>http://an.forum.genostack.com/post/1309</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1309</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 24 Mar 2022 03:55:47 GMT</pubDate></item><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Thu, 24 Mar 2022 03:47:00 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://blog.csdn.net/weixin_39633455/article/details/116951189" rel="nofollow ugc">https://blog.csdn.net/weixin_39633455/article/details/116951189</a><br />
Tool 1: cutadapt</p>
<p dir="auto">input=test.fq.gz</p>
<p dir="auto">mkdir -p cutadapt</p>
<p dir="auto">cutadapt_input=$input</p>
<p dir="auto">cutadapt_out=cutadapt/trimed.fastq.gz</p>
<p dir="auto">interleaved=--interleaved</p>
<p dir="auto">cutadapt $interleaved -a AGATCGGAAGAGC -A AGATCGGAAGAGC -q 30 -m 20 --trim-n -O 10 -o $cutadapt_out $cutadapt_input</p>
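<p dir="auto">Because --interleaved tells cutadapt that the two reads of each pair alternate within a single file, it can be worth checking the input before trimming. The sketch below assumes plain four-line FASTQ records; demo.fq.gz and the function name check_interleaved are hypothetical stand-ins, and the reads are invented.</p>

```shell
set -eu

# Toy interleaved FASTQ with two read pairs (made-up reads).
workdir=$(mktemp -d)
fq="$workdir/demo.fq.gz"
printf '%s\n' \
  '@r1/1' 'ACGT' '+' 'IIII' \
  '@r1/2' 'TTGA' '+' 'IIII' \
  '@r2/1' 'GGCA' '+' 'IIII' \
  '@r2/2' 'CCAT' '+' 'IIII' | gzip -c > "$fq"

# Report record and pair counts; exit non-zero if the file cannot hold
# interleaved pairs (assumes four lines per FASTQ record).
check_interleaved() {
  gzip -dc "$1" | awk '
    END {
      if (NR % 4 != 0) { print "line count is not a multiple of 4"; exit 1 }
      records = NR / 4
      if (records % 2 != 0) { print "odd number of records"; exit 1 }
      printf "records=%d pairs=%d\n", records, records / 2
    }'
}

summary=$(check_interleaved "$fq")
echo "$summary"
```

<p dir="auto">Running the same function on the trimmed output gives a quick before/after read count.</p>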
<p dir="auto">Tool 2: megahit</p>
<p dir="auto">input_fa=$cutadapt_out</p>
<p dir="auto">assembly_out=assembly_out</p>
<p dir="auto">megahit --12 $input_fa --k-max 149 --max-tip-len 200 --min-contig-len 300 -o $assembly_out</p>
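<p dir="auto">After megahit finishes, a quick way to judge the assembly is to look at contig count, total length and N50. Here is a minimal sketch using only awk and sort. demo_contigs.fa is a toy file standing in for $assembly_out/final.contigs.fa, and the function names contig_lengths and contig_stats are my own.</p>

```shell
set -eu

# Toy FASTA standing in for $assembly_out/final.contigs.fa (hypothetical data).
workdir=$(mktemp -d)
fasta="$workdir/demo_contigs.fa"
printf '%s\n' \
  '>c1' 'ACGTACGTAC' \
  '>c2' 'ACGTAC' 'GT' \
  '>c3' 'ACGT' \
  '>c4' 'AC' > "$fasta"

# Print one length per contig; a sequence may span several lines.
contig_lengths() {
  awk '/^>/ { if (seen) print len; len = 0; seen = 1; next }
       { len += length($0) }
       END { if (seen) print len }' "$1"
}

# N50: the length L such that contigs of length L or longer
# together cover at least half of the assembly.
contig_stats() {
  contig_lengths "$1" | sort -rn | awk '
    { lens[NR] = $1; total += $1 }
    END {
      run = 0
      for (i = 1; i <= NR; i++) {
        run += lens[i]
        if (run * 2 >= total) { n50 = lens[i]; break }
      }
      printf "contigs=%d total=%d n50=%d\n", NR, total, n50
    }'
}

stats=$(contig_stats "$fasta")
echo "$stats"
```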
<p dir="auto">Tool 3: MetaGeneMark</p>
<p dir="auto">mkdir -p predict_gene</p>
<p dir="auto">input_dir=assembly_out</p>
<p dir="auto">predict_gene_out=predict_gene</p>
<p dir="auto">model_file=../MetaGeneMark_linux_64/mgm/MetaGeneMark_v1.mod</p>
<p dir="auto">cp ../MetaGeneMark_linux_64/gm_key ~/.gm_key</p>
<p dir="auto">gmhmmp -d -f G -m $model_file -o $predict_gene_out/out.gff -A $predict_gene_out/final.prot.fa -D $predict_gene_out/final.nucl.fa $input_dir/final.contigs.fa</p>
<p dir="auto">Tool 4: cd-hit</p>
<p dir="auto">mkdir -p unigene_set</p>
<p dir="auto">python filter_predict_nucl.py $predict_gene_out/final.nucl.fa $predict_gene_out/filter_final.nucl.fa # custom script</p>
<p dir="auto">cd-hit-est -i $predict_gene_out/filter_final.nucl.fa -o unigene_set/unigene.fa -c 0.95 -aS 0.9 -d 0 -M 10000 -T 0</p>
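<p dir="auto">The post does not show what filter_predict_nucl.py does. One common reason to filter predicted genes before clustering is to drop very short sequences, so the awk sketch below does that. Both the behavior and the cutoff of 10 nt (chosen only to suit the toy data) are my assumptions, not the original script. The function name filter_by_length is hypothetical.</p>

```shell
set -eu

# Hypothetical stand-in for predict_gene/final.nucl.fa.
workdir=$(mktemp -d)
in_fa="$workdir/final.nucl.fa"
out_fa="$workdir/filter_final.nucl.fa"
printf '%s\n' \
  '>gene_1' 'ATGAAACCCGGGTTTTAA' \
  '>gene_2' 'ATGTAA' \
  '>gene_3' 'ATGAAA' 'CCCGGGTTTTAG' > "$in_fa"

# Keep records whose total sequence length is at least min_len (assumed rule).
# Multi-line sequences are joined onto a single line in the output.
filter_by_length() {
  awk -v min_len="$3" '
    function emit() {
      if (header != "" && length(seq) >= min_len) print header "\n" seq
    }
    /^>/ { emit(); header = $0; seq = ""; next }
    { seq = seq $0 }
    END { emit() }' "$1" > "$2"
}

filter_by_length "$in_fa" "$out_fa" 10
kept=$(grep -c '^>' "$out_fa")
echo "kept $kept of $(grep -c '^>' "$in_fa") sequences"
```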
<p dir="auto">Tool 5: diamond</p>
<p dir="auto">mkdir -p function_anno</p>
<p dir="auto"># Database files must be downloaded separately</p>
<p dir="auto">database_eggNOG=.../metagenomics/function/database/e5.proteomes</p>
<p dir="auto">diamond_eggNOG=function_anno/unigene.e5</p>
<p dir="auto">database_CARD=.../metagenomics/function/database/CARD/CARD.protein</p>
<p dir="auto">diamond_CARD=function_anno/unigene.CARD</p>
<p dir="auto">database_CAZy=.../metagenomics/function/database/CAZy/CAZyDB.07202017</p>
<p dir="auto">diamond_CAZy=function_anno/unigene.CAZyDB</p>
<p dir="auto">database_PHI=.../metagenomics/function/database/PHI/phi-base_current</p>
<p dir="auto">diamond_PHI=function_anno/unigene.phi</p>
<p dir="auto">diamond blastx -d $database_eggNOG -q unigene_set/unigene.fa -o $diamond_eggNOG --evalue 0.00001</p>
<p dir="auto">diamond blastx -d $database_CARD -q unigene_set/unigene.fa -o $diamond_CARD --evalue 0.00001</p>
<p dir="auto">diamond blastx -d $database_CAZy -q unigene_set/unigene.fa -o $diamond_CAZy --evalue 0.00001</p>
<p dir="auto">diamond blastx -d $database_PHI -q unigene_set/unigene.fa -o $diamond_PHI --evalue 0.00001</p>
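<p dir="auto">The diamond commands above write the default tabular output, which has 12 tab-separated columns and can list several hits per query. For annotation it is often convenient to keep just the top-scoring hit for each unigene. The sketch below does that on a made-up file; unigene.demo.tsv, the IDs and the scores are invented, and best_hits is a hypothetical helper name.</p>

```shell
set -eu

# Toy file in DIAMOND's default tabular layout (12 tab-separated columns:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore).
workdir=$(mktemp -d)
hits="$workdir/unigene.demo.tsv"
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  gene_1 protA 90.0 100 10 0 1 300 1 100 1e-50 200 \
  gene_1 protB 95.0 100 5 0 1 300 1 100 1e-60 250 \
  gene_2 protC 80.0 50 10 0 1 150 1 50 1e-20 90 > "$hits"

# Sort by query ID, then by bitscore (column 12) descending,
# and keep only the first row seen for each query.
best_hits() {
  LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k12,12nr "$1" | awk -F '\t' '!seen[$1]++'
}

best=$(best_hits "$hits" | cut -f1,2,12)
echo "$best"
```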
]]></description><link>http://an.forum.genostack.com/post/1308</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1308</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 24 Mar 2022 03:47:00 GMT</pubDate></item><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Thu, 24 Mar 2022 03:46:38 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8611953/" rel="nofollow ugc">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8611953/</a><br />
<img src="/assets/uploads/files/1648093595748-a1924a93-6d11-4073-b4e5-c9b281c62c41-image.png" alt="a1924a93-6d11-4073-b4e5-c9b281c62c41-image.png" class=" img-responsive img-markdown" /></p>
]]></description><link>http://an.forum.genostack.com/post/1307</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1307</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 24 Mar 2022 03:46:38 GMT</pubDate></item><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Wed, 23 Mar 2022 08:20:47 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://github.com/metagenome-atlas/atlas" rel="nofollow ugc">https://github.com/metagenome-atlas/atlas</a><br />
<img src="/assets/uploads/files/1648023645592-fdcb4c2e-04ed-4130-a579-8c5e665b4aad-image.png" alt="fdcb4c2e-04ed-4130-a579-8c5e665b4aad-image.png" class=" img-responsive img-markdown" /></p>
]]></description><link>http://an.forum.genostack.com/post/1301</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1301</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Wed, 23 Mar 2022 08:20:47 GMT</pubDate></item><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Wed, 23 Mar 2022 07:53:31 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://www.lcsciences.com/documents/sample_data/metagenomics/Metagenomics_html_report_DEMO.html" rel="nofollow ugc">https://www.lcsciences.com/documents/sample_data/metagenomics/Metagenomics_html_report_DEMO.html</a></p>
<p dir="auto">Sample metagenomics analysis report<br />
<img src="/assets/uploads/files/1648022009093-6a55ff64-cc73-4e8a-abf5-3e6e77321c28-image.png" alt="6a55ff64-cc73-4e8a-abf5-3e6e77321c28-image.png" class=" img-responsive img-markdown" /></p>
]]></description><link>http://an.forum.genostack.com/post/1298</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1298</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Wed, 23 Mar 2022 07:53:31 GMT</pubDate></item><item><title><![CDATA[Reply to 宏基因的组装 metagenomics assembly on Wed, 23 Mar 2022 04:14:19 GMT]]></title><description><![CDATA[<p dir="auto"><img src="/assets/uploads/files/1648008837058-43d28883-9ed4-4648-9dcc-646b9cea5aa2-image.png" alt="43d28883-9ed4-4648-9dcc-646b9cea5aa2-image.png" class=" img-responsive img-markdown" /><br />
<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5502489/" rel="nofollow ugc">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5502489/</a><br />
Assembling metagenomes, one community at a time<br />
The selection process recommended by this article</p>
]]></description><link>http://an.forum.genostack.com/post/1297</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1297</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Wed, 23 Mar 2022 04:14:19 GMT</pubDate></item></channel></rss>