<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Species analysis with BLAST + MEGAN]]></title><description><![CDATA[<p dir="auto">1. Run the alignment with BLAST (this step is very slow and needs our big-data solution)<br />
nohup blastn -query barcode05.fasta -db /ceph_disk1/gene_data/MetaDatabase/NCBI_blast_db_FASTA/nt/ntdata/nt -num_threads 20 -out barcode05.m8 -outfmt 6 -evalue 0.001 &amp;<br />
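2. Preparation: build the sequence-ID-to-taxid mapping file by converting NCBI's mapping file into MEGAN's own format.<br />
In the MEGAN.vmoptions file, set the maximum Java heap size:<br />
-Xmx80000M (otherwise the tool fails with an out-of-memory error)</p>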
<pre><code>./make-acc2ncbi -i nucl_gb.accession2taxid.gz 
Version   MEGAN Ultimate Edition (version 6.21.2, built 14 Mar 2021)
Author(s) Daniel H. Huson
Copyright (C) 2020 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.
Computing map:
Processing file: nucl_gb.accession2taxid.gz
10% 20% 30% 40% 100% (193.8s)
Building table:
(Bits: 26, buckets: 67,108,864, bucket size: 4)
Sorting map...
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (210.1s)
Writing table...
10% 20% 100% (38.7s)
(Bucket avg size: 3.9, max size: 19, used: 98%)
(Index size:  536,870,916, data size: 3,603,383,706)
Merging files
10% 100% (42.0s)
Opening file: acc2tax.abin
Size:  259,629,219
Total in:  259,629,220 
Total out: 259,629,219 
Total time:  497s
Peak memory: 60.2 of 78.1G

</code></pre>
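<p dir="auto">The generated m8 file can be imported directly into MEGAN for analysis. However, these files are fairly large and analysis on a PC is slow, so we still recommend that users upload their data to our server for analysis.</p>
<p dir="auto">3. Annotate species with MEGAN's blast2lca</p>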
<pre><code>/opt/megan-ce/tools/blast2lca -i barcode05.m8.2 -f BlastTab -m BlastN -o barcode05.lca.norank -a2t /opt/megan-ue/tools/ncbi/acc2tax.abin -sr false
</code></pre>
<p dir="auto">An example of the generated output is shown below; we need to present this format as charts in the UI:<br />
f3d2e450-007a-4c9e-b43b-602e6302df32; ; Eukaryota; 100; Metazoa; 100; Chordata; 100; Mammalia; 100; Primates; 100; Hominidae; 50; Homo; 50; Homo sapiens; 50;</p>
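<p dir="auto">To feed a chart, each output line first has to be split into taxon/percentage pairs. The following is a minimal sketch, assuming every line has the layout of the sample above (read ID, one empty field, then alternating taxon name and percentage, with a trailing semicolon). The function names parse_blast2lca_line and deepest_taxon are hypothetical and not part of MEGAN.</p>

```python
def parse_blast2lca_line(line):
    """Split one blast2lca output line into (read_id, [(taxon, percent), ...]).

    Assumes the layout of the sample line in this post:
    read ID; empty field; then taxon; percent; taxon; percent; ...
    """
    fields = [field.strip() for field in line.strip().split(";")]
    # The trailing ';' produces one empty field at the end; drop it.
    if fields and fields[-1] == "":
        fields.pop()
    read_id = fields[0]
    # fields[1] is empty in the sample line, so the lineage starts at index 2.
    rest = fields[2:]
    lineage = []
    for i in range(0, len(rest) - 1, 2):
        lineage.append((rest[i], float(rest[i + 1])))
    return read_id, lineage


def deepest_taxon(lineage, min_percent):
    """Return the last taxon whose percentage reaches min_percent, or None."""
    best = None
    for taxon, percent in lineage:
        if percent >= min_percent:
            best = taxon
    return best


sample = (
    "f3d2e450-007a-4c9e-b43b-602e6302df32; ; Eukaryota; 100; Metazoa; 100; "
    "Chordata; 100; Mammalia; 100; Primates; 100; Hominidae; 50; Homo; 50; "
    "Homo sapiens; 50;"
)
read_id, lineage = parse_blast2lca_line(sample)
print(read_id, deepest_taxon(lineage, 100.0))
```

<p dir="auto">Counting the result of deepest_taxon over all reads would give one possible per-taxon table for a bar or pie chart; the threshold of 100 is only an example value.</p>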
<p dir="auto">Common errors:<br />
Error: the input format is not accepted<br />
/opt/megan-ce/tools/blast2lca -i barcode05.m8 -f BlastTab -m BlastN -o barcode05.lca -a2t ~/.basta/taxonomy/nucl_gb.accession2taxid<br />
Warning: Might not be a BLAST file in TAB format: barcode05.m8<br />
Error parsing file near line: 12028904: String index out of range: 53<br />
Error parsing file near line: 12028905: String index out of range: 53<br />
Error parsing file near line: 17258173: String index out of range: 53</p>
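<p dir="auto">Before handing a large m8 file to blast2lca, it can help to look for lines that do not have the 12 tab-separated fields of -outfmt 6. The sketch below is only a pre-check under that assumption; it does not claim to reproduce MEGAN's own parser or to explain the exact error above, and find_malformed_lines is a hypothetical helper name.</p>

```python
def find_malformed_lines(lines, expected_fields=12):
    """Return the 1-based numbers of lines that do not have expected_fields
    tab-separated fields. Empty lines and lines starting with '#' are skipped."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text or text.startswith("#"):
            continue
        if len(text.split("\t")) != expected_fields:
            bad.append(number)
    return bad


# Made-up rows for illustration: the second one is truncated.
demo = [
    "r1\tA1\t98.5\t100\t1\t0\t1\t100\t5\t104\t1e-50\t180\n",
    "r2\tB1\t90.0\t80\n",
    "r3\tC1\t95.0\t90\t4\t0\t1\t90\t1\t90\t1e-30\t120\n",
]
print(find_malformed_lines(demo))
```

<p dir="auto">For a real file, pass an open file handle instead of the demo list so that the file is read line by line and not loaded into memory at once.</p>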
]]></description><link>http://an.forum.genostack.com/topic/261/使用blast-megan-来分析物种</link><generator>RSS for Node</generator><lastBuildDate>Sat, 13 Jun 2026 09:36:42 GMT</lastBuildDate><atom:link href="http://an.forum.genostack.com/topic/261.rss" rel="self" type="application/rss+xml"/><pubDate>Fri, 26 Mar 2021 12:02:40 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Species analysis with BLAST + MEGAN on Fri, 08 Apr 2022 04:08:06 GMT]]></title><description><![CDATA[<h3>Preparation: building the sequence-ID-to-taxid mapping file</h3>
<p dir="auto"><strong>Download location:</strong><br />
<a href="https://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/" rel="nofollow ugc">https://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/</a></p>
<ul>
<li>prot.accession2taxid.gz</li>
<li>nucl_gb.accession2taxid.gz</li>
</ul>
]]></description><link>http://an.forum.genostack.com/post/1372</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1372</guid><dc:creator><![CDATA[ice-melt]]></dc:creator><pubDate>Fri, 08 Apr 2022 04:08:06 GMT</pubDate></item><item><title><![CDATA[Reply to Species analysis with BLAST + MEGAN on Thu, 07 Apr 2022 10:34:14 GMT]]></title><description><![CDATA[<p dir="auto"><a href="http://megan.informatik.uni-tuebingen.de/t/problems-importing-blast-output-into-megan/1178" rel="nofollow ugc">http://megan.informatik.uni-tuebingen.de/t/problems-importing-blast-output-into-megan/1178</a></p>
]]></description><link>http://an.forum.genostack.com/post/1371</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1371</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 07 Apr 2022 10:34:14 GMT</pubDate></item><item><title><![CDATA[Reply to Species analysis with BLAST + MEGAN on Thu, 07 Apr 2022 10:14:35 GMT]]></title><description><![CDATA[<p dir="auto">Column header of the NCBI BLAST tabular output produced with -outfmt 6 (or -m8, an option that newer versions no longer have):</p>
<p dir="auto">query_id        subject_id      pct_identity    aln_length      n_of_mismatches gap_openings    q_start q_end   s_start   s_end   e_value bit_score</p>
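<p dir="auto">The column names above can be attached to each row when reading the file. This is a minimal sketch using Python's csv module; it assumes the file is plain 12-column -outfmt 6 output, and read_hits and best_hit_per_query are hypothetical helper names. The rows in the example are made up.</p>

```python
import csv
import io

# Column names as listed in the header line of this post.
COLUMNS = [
    "query_id", "subject_id", "pct_identity", "aln_length",
    "n_of_mismatches", "gap_openings", "q_start", "q_end",
    "s_start", "s_end", "e_value", "bit_score",
]


def read_hits(handle):
    """Yield one dict per BLAST hit, with the numeric score columns converted."""
    reader = csv.DictReader(
        handle, fieldnames=COLUMNS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        row["pct_identity"] = float(row["pct_identity"])
        row["e_value"] = float(row["e_value"])
        row["bit_score"] = float(row["bit_score"])
        yield row


def best_hit_per_query(hits):
    """Keep, for every query ID, the hit with the highest bit score."""
    best = {}
    for hit in hits:
        current = best.get(hit["query_id"])
        if current is None or hit["bit_score"] > current["bit_score"]:
            best[hit["query_id"]] = hit
    return best


# Made-up rows for illustration only.
sample = (
    "r1\tA1\t98.5\t100\t1\t0\t1\t100\t5\t104\t1e-50\t180\n"
    "r1\tA2\t99.0\t100\t1\t0\t1\t100\t5\t104\t1e-60\t190.5\n"
    "r2\tB1\t90.0\t80\t8\t0\t1\t80\t1\t80\t1e-20\t95\n"
)
best = best_hit_per_query(read_hits(io.StringIO(sample)))
for query_id, hit in best.items():
    print(query_id, hit["subject_id"], hit["bit_score"])
```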
]]></description><link>http://an.forum.genostack.com/post/1370</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1370</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 07 Apr 2022 10:14:35 GMT</pubDate></item><item><title><![CDATA[Reply to Species analysis with BLAST + MEGAN on Thu, 07 Apr 2022 08:29:32 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5978398/" rel="nofollow ugc">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5978398/</a><br />
BLAST-based validation of metagenomic sequence assignments<br />
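Species annotation using BLAST<br />
In forensic settings, accuracy is critical. This paper uses BLAST as a second-pass validation of the results produced by lower-accuracy software.<br />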
<img src="/assets/uploads/files/1649319484621-2926d040-418b-43c5-a60a-e56133ae1ca0-image.png" alt="2926d040-418b-43c5-a60a-e56133ae1ca0-image.png" class=" img-responsive img-markdown" /></p>
]]></description><link>http://an.forum.genostack.com/post/1369</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1369</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 07 Apr 2022 08:29:32 GMT</pubDate></item><item><title><![CDATA[Reply to Species analysis with BLAST + MEGAN on Sat, 27 Mar 2021 09:44:49 GMT]]></title><description><![CDATA[<p dir="auto">The program make-acc2ncbi can be used to create a new accession to taxonomy mapping file from files downloaded from <a href="ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid" rel="nofollow ugc">ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid</a>.<br />
This tool is only available in the MEGAN UE (Ultimate Edition) version.</p>
]]></description><link>http://an.forum.genostack.com/post/515</link><guid isPermaLink="true">http://an.forum.genostack.com/post/515</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Sat, 27 Mar 2021 09:44:49 GMT</pubDate></item><item><title><![CDATA[Reply to Species analysis with BLAST + MEGAN on Thu, 07 Apr 2022 09:10:17 GMT]]></title><description><![CDATA[<p dir="auto">The following tools also implement MEGAN's LCA-related algorithms and are worth a close look:<br />
<a href="https://github.com/fungs/taxator-tk/" rel="nofollow ugc">https://github.com/fungs/taxator-tk/</a><br />
<a href="https://github.com/emepyc/Blast2lca" rel="nofollow ugc">https://github.com/emepyc/Blast2lca</a> (outdated)<br />
<a href="https://github.com/etheleon/pymegan" rel="nofollow ugc">https://github.com/etheleon/pymegan</a>   <a href="https://etheleon.github.io/articles/pythonMEGAN/" rel="nofollow ugc">https://etheleon.github.io/articles/pythonMEGAN/</a></p>
<p dir="auto"><a href="https://www.biostars.org/p/362985/" rel="nofollow ugc">https://www.biostars.org/p/362985/</a>  an explanation of blast2lca<br />
<a href="https://www.jianshu.com/p/3d3253c59545" rel="nofollow ugc">https://www.jianshu.com/p/3d3253c59545</a> (2022-03-18) summarizing MEGAN blast2lca results with Python pandas</p>
]]></description><link>http://an.forum.genostack.com/post/514</link><guid isPermaLink="true">http://an.forum.genostack.com/post/514</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 07 Apr 2022 09:10:17 GMT</pubDate></item><item><title><![CDATA[Reply to Species analysis with BLAST + MEGAN on Sat, 27 Mar 2021 09:44:13 GMT]]></title><description><![CDATA[<p dir="auto">BLAST can report species names directly, but note that an environment variable has to be set first. Without it, the output looks like this:<br />
ad1bd439-6b2b-4178-91d5-8c6f36fd8d6f,N/A,58.4</p>
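<p dir="auto">If the species is shown as N/A, add an environment variable to your bashrc so that blastn can find the taxonomy database files:<br />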
export BLASTDB=/ceph_disk1/gene_data/MetaDatabase/NCBI_blast_db_FASTA/nt/ntdata/</p>
<pre><code>nohup blastn -query barcode05.fasta -db /ceph_disk1/gene_data/MetaDatabase/NCBI_blast_db_FASTA/nt/ntdata/nt -num_threads 20 -out barcode05.m8 -outfmt "6 qseqid,sscinames,bitscore" -evalue 0.001 &amp;

</code></pre>
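<p dir="auto">Once species names appear in the output, they can be tallied per read. The sketch below assumes three comma-separated fields per line (read ID, species name, bit score), as in the sample line above; if your file is tab-separated, pass a different separator. species_counts is a hypothetical helper name, and all rows except the first are made up.</p>

```python
from collections import Counter


def species_counts(lines, sep=","):
    """Count species over reads, using only the highest-scoring hit per read."""
    best = {}
    for line in lines:
        text = line.strip()
        if not text:
            continue
        # Split off the read ID from the left and the score from the right,
        # so a separator inside the species name does not break the parse.
        read_id, rest = text.split(sep, 1)
        species, score = rest.rsplit(sep, 1)
        score = float(score)
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (species, score)
    return Counter(species for species, _ in best.values())


demo = [
    "ad1bd439-6b2b-4178-91d5-8c6f36fd8d6f,N/A,58.4",
    "r2,Homo sapiens,120.0",
    "r2,Pan troglodytes,110.0",
    "r3,Homo sapiens,99.1",
]
counts = species_counts(demo)
print(counts.most_common())
```

<p dir="auto">A large N/A count in the result is a quick sign that the BLASTDB variable is still not being picked up.</p>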
]]></description><link>http://an.forum.genostack.com/post/513</link><guid isPermaLink="true">http://an.forum.genostack.com/post/513</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Sat, 27 Mar 2021 09:44:13 GMT</pubDate></item></channel></rss>