暗能星系


    HBV analysis

    • anneng

      https://hivdb.stanford.edu/HBV/releaseNotes/

      • anneng

        https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC4382110/

        Genotypes samples by placing reads directly on a phylogenetic tree, and can also genotype mixed samples.

        • anneng

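          HBV analysis notes for the Fourth Military Medical University (四医大) samples
          1. Merge R1 and R2 with bbmerge. The script below uses find to collect the sample names, then passes each name to parallel for concurrent processing. {} is the sample path without the read suffix and {/} is its basename, so the merged files land in the current directory.

          find ../all_data/*_L001_R1_001.fastq.gz | sed 's/_L001_R1_001\.fastq\.gz$//' | parallel 'bbmerge.sh in1={}_L001_R1_001.fastq.gz in2={}_L001_R2_001.fastq.gz out={/}.fastq outu1={/}.R1.unmerged outu2={/}.R2.unmerged'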
          

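          Among the 327 samples, a few have different numbers of reads in R1 and R2. For those samples, assemble with SPAdes and carry the longest sequence into step 2.
          Because assembly is involved, mixed-sample analysis is not possible for them, so these samples are treated as single samples.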
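One way to find the samples whose R1 and R2 counts disagree is to count the FASTQ records in each file. This is a minimal sketch, assuming standard 4-line FASTQ records and the `_L001_R1_001.fastq.gz` naming used above; the function names `count_reads` and `list_mismatched_pairs` are illustrative and do not come from any tool.

```shell
# Count reads in a gzipped FASTQ file (assumes 4 lines per record).
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Print "sample<TAB>R1_count<TAB>R2_count" for every pair in a directory
# whose R1 and R2 read counts differ.
list_mismatched_pairs() {
    dir=$1
    for r1 in "$dir"/*_L001_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue
        r2=${r1%_L001_R1_001.fastq.gz}_L001_R2_001.fastq.gz
        sample=$(basename "$r1" _L001_R1_001.fastq.gz)
        n1=$(count_reads "$r1")
        n2=$(count_reads "$r2")
        if [ "$n1" != "$n2" ]; then
            printf '%s\t%s\t%s\n' "$sample" "$n1" "$n2"
        fi
    done
}
```

Running `list_mismatched_pairs ../all_data` would then list the samples to route to the assembly path.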
          Convert all FASTQ files to FASTA (BLAST only accepts FASTA).

          parallel 'seqtk seq -a {}> {.}.fasta' ::: *.fastq
          

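          2. Genotype the sequences in each sample with BLAST to get, for each sample, the number of sequences assigned to each genotype.
          Build the BLAST database.
          The reference sequences downloaded from HBVdb include a category labelled RF. For example, for the sequence https://www.ncbi.nlm.nih.gov/nucleotide/EU871985.1?report=genbank&log$=nuclalign&blast_rank=1&RID=Z8DW1MY8016 NCBI does not state a genotype, while HBVdb annotates it as a B/C recombinant. For now we remove these RF sequences.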

          makeblastdb -in all_hbvdb_Genomes.fas -dbtype nucl
          
          blastn -task blastn -max_target_seqs 1 -query ../0-merging-pe/100_S42.fasta -db ../hbvdb/all_hbvdb_Genomes.fas -num_threads 10 -out 100_S42.m8 -outfmt 6
          
          nohup bash -c "find ../0-merging-pe/*.fasta | sed 's/\.fasta$//' | parallel --joblog ./logs -j40 blastn -task blastn -max_target_seqs 1 -query {}.fasta -db ../hbvdb/A-H/HBV_A_H.fas -out {/}.m8 -outfmt 6" &
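The per-sample genotype counts can then be tallied from each .m8 file. This is a rough sketch, not the exact procedure used here: it assumes the genotype is whatever follows the last "-" in the subject ID (column 2), which has to be adapted to the actual headers in the reference FASTA, and the function name `count_genotypes` is illustrative. Because a read can produce several HSP lines even with -max_target_seqs 1, only the first line per read is counted.

```shell
# Summarise one BLAST tabular (-outfmt 6) file as "genotype<TAB>read_count".
# ASSUMPTION: the genotype is the text after the last "-" in the subject ID
# (column 2); adapt the extraction to your own FASTA headers.
count_genotypes() {
    awk -F '\t' '
        !seen[$1]++ {               # keep only the first line of each read
            n = split($2, parts, "-")
            count[parts[n]]++
        }
        END {
            for (g in count) printf "%s\t%d\n", g, count[g]
        }
    ' "$1" | sort
}
```

For example, `count_genotypes 100_S42.m8` prints one line per genotype seen in that sample.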
          
          

          3. Alignment

          bwa index AB033556_hbc_type_C.fasta
          nohup bash -c "find ../all_data/*_L001_R1_001.fastq.gz | sed 's/_L001_R1_001\.fastq\.gz$//' | parallel 'bwa mem -M AB033556_hbc_type_C.fasta {}_L001_R1_001.fastq.gz {}_L001_R2_001.fastq.gz > {/}.sam'" &
          
          nohup parallel "samtools view -bF 4 {} > {/.}.bam" ::: ./sam/*.sam &
          parallel samtools sort {} -o {.}.sorted.bam ::: *.bam
          

          4. Variant calling

          nohup parallel "lofreq indelqual {} --dindel -f ../3-mapping/AB033556_hbc_type_C.fasta -o {/.}.sorted.dindel.bam " ::: ../3-mapping/bam/*.sorted.bam &
          
          nohup parallel "lofreq call {} --call-indels -f ../3-mapping/AB033556_hbc_type_C.fasta -o {/.}.vcf " ::: *.bam &
          

          5. Haplotype analysis

          find /ceph_disk2/siyida_327_sample/3-mapping/sam/ -name "*.sam" -exec basename {} .sam \; | parallel 'java -jar clique-snv.jar -m snv-illumina -in /ceph_disk2/siyida_327_sample/3-mapping/sam/{}.sam'
          
          • anneng

            SPAdes assembly

            /home/bioinfo/miniconda2/envs/assembly/bin/spades.py -1 /ceph_disk3/hbv/HBV_illumina/106/106_S46_L001_R1_001.fastq -2 /ceph_disk3/hbv/HBV_illumina/106/106_S46_L001_R2_001.fastq -o /ceph_disk3/hbv/HBV_illumina/106/spades
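The notes above say the longest assembled sequence is carried into the BLAST step. A small sketch of that selection, assuming the contigs are read from the contigs.fasta file that SPAdes writes into its output directory; the function name `longest_contig` is illustrative.

```shell
# Print the longest record of a (possibly multi-line) FASTA file
# as a header line followed by the sequence on a single line.
longest_contig() {
    awk '
        /^>/ {
            if (length(seq) > best_len) {
                best_len = length(seq); best_hdr = hdr; best_seq = seq
            }
            hdr = $0; seq = ""
            next
        }
        { seq = seq $0 }
        END {
            if (length(seq) > best_len) { best_hdr = hdr; best_seq = seq }
            if (best_hdr != "") print best_hdr "\n" best_seq
        }
    ' "$1"
}
```

For the sample above this could be run as `longest_contig /ceph_disk3/hbv/HBV_illumina/106/spades/contigs.fasta > 106_S46.fasta`, where the output file name is only an example.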
            
            • anneng

              https://www.sciencedirect.com/science/article/pii/S1386653218300970
              Frequency of hepatitis B surface antigen variants (HBsAg) in hepatitis B virus genotype B and C infected East- and Southeast Asian patients: Detection by the Elecsys® HBsAg II assay

              • anneng

                https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0172101
                Ultra-deep sequencing reveals high prevalence and broad structural diversity of hepatitis B surface antigen mutations in a global population
                https://github.com/spabinger/HBV_data_publication_2016_07
                an MHR variant was defined as a nucleotide sequence change in the S gene region (encoding amino acids 99 to 170) with an allele frequency >5% (in both sequencing directions) and at least 3 variant reads present on the forward as well as on the reverse strand.
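The strand and frequency part of this criterion can be approximated on a LoFreq VCF, whose INFO field carries DP4 (ref-forward, ref-reverse, alt-forward, alt-reverse counts). This is only a sketch under stated assumptions: the function name `filter_strand_supported` is illustrative, the per-strand frequency is computed from DP4 alone, and the restriction to the S gene region is left out because its coordinates depend on the reference in use.

```shell
# Keep VCF records with >= 3 variant reads on each strand and a
# per-strand variant frequency > 5%, using the DP4 INFO field.
# ASSUMPTION: DP4=ref_fwd,ref_rev,alt_fwd,alt_rev as written by LoFreq.
filter_strand_supported() {
    awk -F '\t' '
        /^#/ { print; next }                  # keep header lines
        {
            if (!match($8, /DP4=[0-9]+,[0-9]+,[0-9]+,[0-9]+/)) next
            split(substr($8, RSTART + 4, RLENGTH - 4), d, ",")
            alt_f = d[3] + 0; alt_r = d[4] + 0
            fwd = d[1] + alt_f; rev = d[2] + alt_r
            if (alt_f >= 3 && alt_r >= 3 && fwd > 0 && rev > 0 &&
                alt_f / fwd > 0.05 && alt_r / rev > 0.05) print
        }
    ' "$1"
}
```

Records without a DP4 annotation are dropped rather than passed through, which is a deliberate conservative choice in this sketch.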

                • anneng

                  https://sci-hub.st/10.1159/000361076
                  Hepatitis B Virus Drug Resistance Tools: One Sequence, Two Predictions
                  www.genafor.org/services.php

                  HIV-GRADE HBV

                  The article lists several tools for genotyping, drug-resistance, and immune-escape analysis.

                  • anneng

                    Genetic Diversity of Hepatitis B Virus Strains Derived Worldwide: Genotypes, Subgenotypes, and HBsAg Subtypes

                    https://sci-hub.st/10.1159/000080872
                    Phylogenetic tree analysis of HBV. It also discusses the complex correspondence between serotypes and genotypes.
                    Software used:
                    DNADIST and NEIGHBOR from the Phylip program package version 3.53

                    PUZZLE

                    Bootstrap on 1,000 replicas was performed with SEQBOOT, DNADIST, NEIGHBOR, and CONSENSE from the Phylip package.

                    • anneng

                      Global Occurrence of Clinically Relevant Hepatitis B Virus
                      viruses-12-01344-v3.pdf

                      Predicting the serotype from the protein sequence

                      • anneng

                        https://www.aimspress.com/article/doi/10.3934/microbiol.2020024?viewType=HTML
                        This paper summarizes the possible effects of mutations.

                        • anneng

                          https://www.nature.com/articles/s41598-019-43524-9
                          Illumina and Nanopore methods for whole genome sequencing of hepatitis B virus (HBV)

                          • anneng

                            https://www.frontiersin.org/articles/10.3389/fmicb.2020.616023/full
                            Comprehensive Analysis of Clinically Significant Hepatitis B Virus Mutations in Relation to Genotype, Subgenotype and Geographic Region
                            Analyzes HBV mutations using public data.
                            Table_1_Comprehensive Analysis of Clinically Significant Hepatitis B Virus Mutations in Relation to Genotype, Subgenotype and Geographic Region.XLSX

                            The format of this table can serve as a template for our analysis:
                            rows are samples, and columns are mutation positions or codes for important indicators.

                            • anneng

                              https://www.hiv.lanl.gov/content/sequence/ENTROPY/entropy.html

                              Shannon entropy calculator

                              • anneng

                                https://zhanglab.ccmb.med.umich.edu/I-TASSER/
                                Structure prediction

                                • anneng

                                  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7229894/
                                  A paper shared by Teacher Xiao (肖老师) of the Fourth Military Medical University (四医大). It sequenced the full-length HBV genome using clone sequencing.
                                  Assembly: Contig-Express and CodonCode Aligner
                                  Sequence alignment: MEGA X, Clustal X

                                  • anneng

                                    Inference with viral quasispecies diversity indices: Clonal and NGS approaches

                                    Gives a detailed analysis of mutation frequency and Shannon entropy.
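For reference, Shannon entropy over haplotype abundances is H = -sum(p_i * ln p_i). The sketch below computes it with awk, assuming a hypothetical input file with one haplotype count or frequency per line (for example, extracted by hand from a haplotype caller's output); the function name `shannon_entropy` is illustrative, and the natural logarithm is used.

```shell
# Shannon entropy H = -sum(p_i * ln p_i) of haplotype abundances.
# Input: one non-negative count or frequency per line; values are
# normalised to sum to 1, and zero entries are skipped.
shannon_entropy() {
    awk '
        $1 > 0 { x[++n] = $1; total += $1 }
        END {
            h = 0
            for (i = 1; i <= n; i++) {
                p = x[i] / total
                h -= p * log(p)
            }
            printf "%.4f\n", h
        }
    ' "$1"
}
```

Four equally abundant haplotypes give ln 4, which is about 1.3863, while a sample with a single haplotype gives 0.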

                                    • anneng

                                      https://www.yacinemahdid.com/shannon-entropy-from-theory-to-python/
                                        A Python implementation of Shannon entropy

                                        • anneng

                                        https://elifesciences.org/articles/61803
                                        The haplotypes for each sample were reconstructed for each gene segment using a previously published pipeline (Cacciabue et al., 2020). In brief, FastQC (Andrews, 2010) was used for quality assurance of the NGS paired-end raw reads followed by BBtools (Bushnell, 2014), for removing and filtering adapters and low-quality reads. Bowtie2 (Langmead and Salzberg, 2012), an aligner tool to align the trimmed reads to the selected reference of the influenza strain (i.e. the inoculum), was then used. Samtools suite (Li et al., 2009) was used to sort, index, and generate depth and coverage statistics for read alignment files. Next, CliqueSNV (Knyazev, 2020) was used to infer the haplotypes and frequencies for all eight gene segments for each sample.

                                        • anneng

                                          https://www.sciencedirect.com/science/article/pii/S004268221630037X

                                          • anneng

                                            Walking through the QAP pipeline
                                            1. fqc

                                            docker run -v /home/bioinfo/hbv_pipeline/data:/data anneng01:8090/app/fqc fqc qc hbv s1 /data/SRR6378032_1.fastq.gz --r2 /data/SRR6378032_2.fastq.gz -o /data/qc/
                                            

                                            2. cutadapt
                                            For details of the algorithm, see
                                            https://cutadapt.readthedocs.io/en/stable/guide.html?highlight=max-n#dealing-with-n-bases

                                            docker run -v /home/bioinfo/hbv_pipeline/:/workplace pegi3s/cutadapt -q 1 --max-n 0 --minimum-length 10 -o /workplace/data/SRR6378032_1.cleaned.fastq.gz -p /workplace/data/SRR6378032_2.cleaned.fastq.gz /workplace/data/SRR6378032_1.fastq.gz /workplace/data/SRR6378032_2.fastq.gz
                                            

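                                            -q: trim low-quality read ends using this quality cutoff
                                            --max-n: discard reads with more than this number of N bases
                                            --minimum-length: discard reads shorter than this length
                                            -o: output file for R1
                                            -p: output file for R2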

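                                            3. QAP's circular reference genome repair step relies on several of its own Perl and R scripts, and the quality of the algorithm is unclear. We use the following tool instead: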
                                            https://github.com/apeltzer/CircularMapper
                                            java -jar generator-1.93.5.jar CircularGenerator -e 20 -i ../data/demo.fasta -s "AB033556.1"
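                                            This tool has a bug: the sequence ID in the FASTA file must not contain spaces. If it does, the sequence is not found and nothing is processed. The algorithm itself is simple: it takes a given number of bases from the start of the sequence and appends them to the end.
                                            This step should be an optional step in the CWL workflow, because only circular genomes need it.
                                            Once the reference has been processed, reads can be aligned against the newly generated reference (BWA, Bowtie2, etc.), but after alignment another module (RealignSAMFile.jar) must be run to realign the reads.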
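The elongation described above can be illustrated with a few lines of awk. This is a stand-in sketch of the idea, not CircularMapper's own code: it truncates each FASTA header at the first space (which also sidesteps the ID problem noted above) and appends the first N bases to the end of the sequence. The function name `extend_circular_reference` is illustrative.

```shell
# Append the first N bases of each sequence to its end and keep only the
# part of the header before the first space.
# Usage: extend_circular_reference <reference.fasta> <N>
extend_circular_reference() {
    awk -v n="$2" '
        function emit() {
            if (hdr != "") print hdr "\n" seq substr(seq, 1, n)
        }
        /^>/ { emit(); hdr = $1; seq = ""; next }
        { seq = seq $0 }
        END { emit() }
    ' "$1"
}
```

Reads mapped against a reference extended this way still need their coordinates folded back onto the original genome, which is the job of the realignment module mentioned above.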

                                            4. Read alignment
                                            docker run -v /home/bioinfo/hbv_pipeline/data/:/data anneng01:8090/library/angs_bwa:1.0.0 bwa mem -M /data/HBV_C_AB033556_150.fasta /data/SRR6377924_1.fastq.gz /data/SRR6377924_2.fastq.gz -o /data/SRR6377924.sam

                                            // The command below does not support bwa mem output, so it is skipped for now
                                            java -jar RealignSAMFile.jar -e 500 -i SRR6377924.sam -r HBV_C_AB033556.fasta
                                            

                                            Filter out unmapped reads
                                            docker run -v /home/bioinfo/hbv_pipeline/:/work jweinstk/samtools samtools view -bF 4 /work/mapping/SRR6377924.sam -o /work/mapping/SRR6377924.bam

                                            5. Remove PCR duplicates

                                            docker run -v /home/bioinfo/hbv_pipeline/:/work quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools collate -o /work/mapping/SRR6377924.namecollate.bam /work/mapping/SRR6377924.bam 
                                            
                                            docker run -v /home/bioinfo/hbv_pipeline/:/work quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools fixmate -m /work/mapping/SRR6377924.namecollate.bam  /work/mapping/SRR6377924.fixmate.bam
                                            
                                            docker run -v /home/bioinfo/hbv_pipeline/:/work quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools sort -o /work/mapping/SRR6377924.sorted.bam /work/mapping/SRR6377924.fixmate.bam
                                            
                                            
                                            docker run -v /home/bioinfo/hbv_pipeline/:/work quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools markdup -r  /work/mapping/SRR6377924.sorted.bam /work/mapping/SRR6377924.sorted.rmdup.bam
                                            

                                            6. Call SNPs

                                            docker run -v /home/bioinfo/hbv_pipeline/:/workplace quay.io/biocontainers/lofreq:broken---2.5.1--py38h1bd3507_2 lofreq call -f /workplace/data/HBV_C_AB033556_150.fasta -o /workplace/calling/SRR6377924.vcf /workplace/mapping/SRR6377924.sorted.rmdup.bam
                                            

                                            7. Merge R1 and R2

                                            docker run -v /home/bioinfo/hbv_pipeline/:/workhome quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools view -h -o /workhome/mapping/SRR6377924.sorted.rmdup.sam /workhome/mapping/SRR6377924.sorted.rmdup.bam
                                            
                                            docker run -v /home/bioinfo/hbv_pipeline/:/workhome quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools fasta -1 /workhome/merging/SRR6377924_R1.fasta -2 /workhome/merging/SRR6377924_R2.fasta  -0 /dev/null -s /dev/null -n  /workhome/mapping/SRR6377924.sorted.rmdup.sam
                                            
                                            docker run -v /home/bioinfo/hbv_pipeline/:/workplace staphb/bbtools bbmerge.sh in1=/workplace/merging/SRR6377924_R1.fasta in2=/workplace/merging/SRR6377924_R2.fasta out=/workplace/merging/SRR6377924_QS.fasta
                                            

                                            //====== Errors encountered; not using this software for now ====

                                            wget https://github.91chi.fun//https://github.com//neufeld/pandaseq/archive/refs/tags/v2.11.tar.gz
                                            tar xvfz v2.11.tar.gz
                                            sudo apt-get install build-essential libtool automake zlib1g-dev libbz2-dev pkg-config
                                            ./autogen.sh && ./configure && make && sudo make install
                                            
                                            wget https://github.91chi.fun//https://github.com//neufeld/pandaseq-sam/archive/refs/tags/v1.4.tar.gz
                                            ./autogen.sh && ./configure && make && sudo make install
                                            

                                            //==========================
