暗能星系

    HBV analysis

      anneng:

      https://zhanglab.ccmb.med.umich.edu/I-TASSER/.
      Structure prediction

        anneng:

        https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7229894/
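        A paper provided by Teacher Xiao of the Fourth Military Medical University. It sequenced the full-length HBV genome using clone-based sequencing.
        5c60a5ce-e884-4e0e-bf79-7f23e6861b1b-image.png
        Assembly: Contig-Express and Codon Code Aligner
        Sequence alignment: MEGAX, Clustal X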

          anneng:

          Inference with viral quasispecies diversity indices: Clonal and NGS approaches

          This paper analyzes mutation frequency and Shannon entropy in detail.

            anneng:

            https://www.yacinemahdid.com/shannon-entropy-from-theory-to-python/
            A Python implementation of Shannon entropy.

              anneng:

              https://elifesciences.org/articles/61803
              The haplotypes for each sample were reconstructed for each gene segment using a previously published pipeline (Cacciabue et al., 2020). In brief, FastQC (Andrews, 2010) was used for quality assurance of the NGS paired-end raw reads followed by BBtools (Bushnell, 2014), for removing and filtering adapters and low-quality reads. Bowtie2 (Langmead and Salzberg, 2012), an aligner tool to align the trimmed reads to the selected reference of the influenza strain (i.e. the inoculum), was then used. Samtools suite (Li et al., 2009) was used to sort, index, and generate depth and coverage statistics for read alignment files. Next, CliqueSNV (Knyazev, 2020) was used to infer the haplotypes and frequencies for all eight gene segments for each sample.

                anneng:

                https://www.sciencedirect.com/science/article/pii/S004268221630037X
                2f740096-f9cc-4a91-a4c1-3feaffa2cd52-image.png

                15fb8be3-b242-4db0-a1b0-6fc3bfb38c67-image.png

                  anneng:

                  Walking through the QAP pipeline
                  1. fqc

                  docker run -v /home/bioinfo/hbv_pipeline/data:/data anneng01:8090/app/fqc fqc qc hbv s1 /data/SRR6378032_1.fastq.gz --r2 /data/SRR6378032_2.fastq.gz -o /data/qc/
                  

                  2. cutadapt
                  For details of the algorithm, see
                  https://cutadapt.readthedocs.io/en/stable/guide.html?highlight=max-n#dealing-with-n-bases

                  docker run -v /home/bioinfo/hbv_pipeline/:/workplace pegi3s/cutadapt -q 1 --max-n 0 --minimum-length 10 -o /workplace/data/SRR6378032_1.cleaned.fastq.gz -p /workplace/data/SRR6378032_2.cleaned.fastq.gz /workplace/data/SRR6378032_1.fastq.gz /workplace/data/SRR6378032_2.fastq.gz
                  

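                  -q trims low-quality bases from the read ends at the given quality cutoff
                  --max-n discards reads containing more N bases than the given count
                  --minimum-length discards reads shorter than the given length
                  -o output file for R1
                  -p output file for R2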
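The two read filters used in the cutadapt command above can be sketched roughly as follows. This is only an illustration of the filtering logic, not cutadapt's own code: the function names `passes_filters` and `keep_pair` are hypothetical, `max_n` is treated as an absolute count, and the pair is dropped when either mate fails, which matches cutadapt's default `--pair-filter=any` behavior.

```python
def passes_filters(seq, max_n=0, min_len=10):
    """Return True if a single read survives both filters.

    max_n   -- maximum number of N bases allowed (like --max-n with an integer)
    min_len -- minimum read length allowed (like --minimum-length)
    """
    n_count = seq.upper().count("N")
    return n_count <= max_n and len(seq) >= min_len


def keep_pair(r1, r2, max_n=0, min_len=10):
    """Keep a read pair only if both mates pass.

    Assumption: mirrors the default paired-end rule, where a pair is
    discarded as soon as one of the two reads fails a filter.
    """
    return passes_filters(r1, max_n, min_len) and passes_filters(r2, max_n, min_len)


if __name__ == "__main__":
    pairs = [
        ("ACGTACGTAC", "TTGCATGCATGC"),   # both fine
        ("ACGTNCGTAC", "TTGCATGCATGC"),   # R1 contains an N
        ("ACGT", "TTGCATGCATGC"),         # R1 too short
    ]
    for r1, r2 in pairs:
        print(r1, r2, keep_pair(r1, r2))
```

With `--max-n 0` any read containing an N removes the whole pair, which is a strict setting for quasispecies work where ambiguous bases would otherwise inflate diversity estimates.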

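                  3. QAP's circular reference genome repair step relies on several of its own Perl and R scripts, and the quality of their algorithms is unclear, so we use the following software instead:
                  https://github.com/apeltzer/CircularMapper
                  java -jar generator-1.93.5.jar CircularGenerator -e 20 -i ../data/demo.fasta -s "AB033556.1"
                  This software has a bug: sequence IDs in the FASTA file must not contain spaces. If an ID contains a space, the sequence is not found and is left unprocessed. The algorithm itself is simple: it takes a given number of bases from the head of the sequence and appends them to the tail.
                  This step should be an optional step in the CWL workflow, since only circular genomes need it.
                  Once the reference has been processed, the newly generated reference can be used for alignment (BWA, Bowtie2, etc.), but after alignment another module (RealignSAMFile.jar) must be run to realign the reads.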
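The elongation step described above (copy a number of bases from the head and append them to the tail) can be sketched like this. It is a minimal illustration under the description given in this post, not CircularGenerator's actual code; `elongate_circular` and `fasta_id` are hypothetical names, and `fasta_id` only shows one possible way to sidestep the space-in-ID problem by keeping the first whitespace-delimited token of the header.

```python
def elongate_circular(seq, e):
    """Append the first e bases of seq to its end.

    Reads that span the artificial break point of a circular genome
    can then align contiguously to the elongated reference.
    """
    if e < 0:
        raise ValueError("elongation length must be non-negative")
    return seq + seq[:e]


def fasta_id(header):
    """Return the sequence ID from a FASTA header line.

    Assumption: the ID is the first whitespace-delimited token after '>'.
    """
    return header.lstrip(">").split()[0]


if __name__ == "__main__":
    genome = "ACGTTGCA"
    print(elongate_circular(genome, 3))               # ACGTTGCAACG
    print(fasta_id(">AB033556.1 Hepatitis B virus"))  # AB033556.1
```

The elongation length should be at least the read length, otherwise a read starting near the end of the genome still cannot align across the junction.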

                  4. Sequence alignment
                  docker run -v /home/bioinfo/hbv_pipeline/data/:/data anneng01:8090/library/angs_bwa:1.0.0 bwa mem -M /data/HBV_C_AB033556_150.fasta /data/SRR6377924_1.fastq.gz /data/SRR6377924_2.fastq.gz -o /data/SRR6377924.sam

                  // The command below does not support bwa mem output, so skip it for now
                  java -jar RealignSAMFile.jar -e 500 -i SRR6377924.sam -r HBV_C_AB033556.fasta
                  

                  Filter out reads that did not align
                  docker run -v /home/bioinfo/hbv_pipeline/:/work jweinstk/samtools samtools view -bF 4 /work/mapping/SRR6377924.sam -o /work/mapping/SRR6377924.bam
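What `-F 4` does in the command above can be sketched in a few lines: a record is excluded when bit 0x4 (read unmapped) is set in its FLAG field. This is only an illustration on plain-text SAM lines; `is_mapped` and `filter_mapped` are hypothetical helper names, and the real pipeline should keep using samtools.

```python
UNMAPPED = 0x4  # SAM FLAG bit: the read itself is unmapped


def is_mapped(sam_line):
    """Return True if the alignment record does not have the 0x4 bit set."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # FLAG is the second SAM column
    return not (flag & UNMAPPED)


def filter_mapped(lines):
    """Yield header lines unchanged and only the mapped alignment records."""
    for line in lines:
        if line.startswith("@") or is_mapped(line):
            yield line


if __name__ == "__main__":
    sam = [
        "@SQ\tSN:AB033556.1\tLN:3215",
        "read1\t99\tAB033556.1\t10\t60\t4M\t=\t50\t44\tACGT\tIIII",
        "read2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    ]
    for line in filter_mapped(sam):
        print(line)
```

Flag 77 (binary 1001101) includes the 0x4 bit, so `read2` is dropped, while flag 99 does not and `read1` is kept.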

                  5. Remove PCR duplicates

                  docker run -v /home/bioinfo/hbv_pipeline/:/work quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools collate -o /work/mapping/SRR6377924.namecollate.bam /work/mapping/SRR6377924.bam 
                  
                  docker run -v /home/bioinfo/hbv_pipeline/:/work quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools fixmate -m /work/mapping/SRR6377924.namecollate.bam  /work/mapping/SRR6377924.fixmate.bam
                  
                  docker run -v /home/bioinfo/hbv_pipeline/:/work quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools sort -o /work/mapping/SRR6377924.sorted.bam /work/mapping/SRR6377924.fixmate.bam
                  
                  
                  docker run -v /home/bioinfo/hbv_pipeline/:/work quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools markdup -r  /work/mapping/SRR6377924.sorted.bam /work/mapping/SRR6377924.sorted.rmdup.bam
                  

                  6. Call SNPs

                  docker run -v /home/bioinfo/hbv_pipeline/:/workplace quay.io/biocontainers/lofreq:broken---2.5.1--py38h1bd3507_2 lofreq call -f /workplace/data/HBV_C_AB033556_150.fasta -o /workplace/calling/SRR6377924.vcf /workplace/mapping/SRR6377924.sorted.rmdup.bam
                  

                  7. Merge R1 and R2

                  docker run -v /home/bioinfo/hbv_pipeline/:/workhome quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools view -h -o /workhome/mapping/SRR6377924.sorted.rmdup.sam /workhome/mapping/SRR6377924.sorted.rmdup.bam
                  
                  docker run -v /home/bioinfo/hbv_pipeline/:/workhome quay.io/biocontainers/samtools:1.15.1--h1170115_0 samtools fasta -1 /workhome/merging/SRR6377924_R1.fasta -2 /workhome/merging/SRR6377924_R2.fasta  -0 /dev/null -s /dev/null -n  /workhome/mapping/SRR6377924.sorted.rmdup.sam
                  
                  docker run -v /home/bioinfo/hbv_pipeline/:/workplace staphb/bbtools bbmerge.sh in1=/workplace/merging/SRR6377924_R1.fasta in2=/workplace/merging/SRR6377924_R2.fasta out=/workplace/merging/SRR6377924_QS.fasta
                  

                  //====== This software raised errors, so do not use it for now ======

                  wget https://github.91chi.fun//https://github.com//neufeld/pandaseq/archive/refs/tags/v2.11.tar.gz
                  tar xvfz v2.11.tar.gz
                  sudo apt-get install build-essential libtool automake zlib1g-dev libbz2-dev pkg-config
                  ./autogen.sh && ./configure && make && sudo make install
                  
                  wget https://github.91chi.fun//https://github.com//neufeld/pandaseq-sam/archive/refs/tags/v1.4.tar.gz
                  ./autogen.sh && ./configure && make && sudo make install
                  

                  //==========================

                    anneng:

                    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8656693/
                    The Impact of HBV Quasispecies Features on Immune Status in HBsAg+/HBsAb+ Patients With HBV Genotype C Using Next-Generation Sequencing

                      anneng:

                      https://virologyj.biomedcentral.com/articles/10.1186/s12985-022-01836-9
                      Quality evaluation of raw reads was performed with the online tool fastqc (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and the reads having average base calling quality score under 20 were discarded. After quality filtration and adapter removal, paired-end reads were joined with FLASH, v1.2.10 [31]. Merged preS region sequence was genotyped with HBV STAR software as reported previously [32], and corresponding preS regions of 23 reference HBV genomes from the GenBank database were used for genotyping (Accession numbers: X02763, X51970, AF090842, D00329, AB073846, AB602818, X04615, AY123041, AB014381, X65259, M32138, X85254, X75657, AB032431, X69798, AB036910, AF223965, AF160501, AB064310, AF405706, AY090454, AY090457, AY090460). The genotype of each sample was defined as the most frequent one among all 8 types from A to H.

                      Data preprocessing and predictors
                      After sequencing the quasispecies, we collected the point mutation data for 457 positions including the positions from 1 to 61 and 2820 to 3215 in and close to the preS region. We counted the frequencies of the nucleotides in each position. To describe the mutation complexity in each position, we transformed the frequency data to Shannon entropy, which is defined as $H = -\sum_i p_i \log p_i$, $\sum_i p_i = 1$, where $i \in \{A, C, G, T\}$ and $p_i$ is its frequency, $x \log(x) = 0$ when $x = 0$. Entropy of all the 457 nucleotide positions of preS region were used as predictors for HCC diagnosis.
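The per-position entropy defined in the quoted passage can be sketched as below. This is a minimal illustration, not the paper's code: the function names are hypothetical, the natural logarithm is assumed because the passage does not state the base, and the input is assumed to be equal-length reads already aligned to the same coordinates.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy H = -sum(p_i * log(p_i)) over A, C, G, T in one column.

    Bases that do not occur contribute nothing, which is the
    x*log(x) = 0 convention for x = 0. Non-ACGT symbols are ignored.
    """
    counts = Counter(b for b in column.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log(p)  # natural log is an assumption
    return entropy


def entropy_per_position(aligned_reads):
    """Return one entropy value per alignment column."""
    return [shannon_entropy("".join(col)) for col in zip(*aligned_reads)]


if __name__ == "__main__":
    reads = ["ACGT", "ACGA", "ACCA", "ATCA"]
    for pos, h in enumerate(entropy_per_position(reads), start=1):
        print(pos, round(h, 4))
```

A fully conserved position gives 0, and a position where all four bases are equally frequent gives the maximum, log 4.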

                        anneng:

                        Amino acid occurrence frequency
                        https://sci-hub.st/10.1145/3386052.3386077
                        Identification of the Association between Hepatitis B Virus and Liver Cancer using Machine Learning Approaches based on Amino Acid

                        Align the reads with BLAST, then translate them into amino acids codon by codon.
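The translation step and the amino acid occurrence frequency mentioned in this post can be sketched as follows. This is a rough illustration, not the paper's method: `translate` and `amino_acid_frequencies` are hypothetical names, the standard genetic code is assumed, the sequence is assumed to be already in frame, and codons containing ambiguous bases are written as `X`.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate an in-frame nucleotide sequence, ignoring a trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def amino_acid_frequencies(seq):
    """Return the relative frequency of each translated residue."""
    protein = translate(seq)
    counts = Counter(protein)
    total = len(protein)
    return {aa: n / total for aa, n in counts.items()} if total else {}


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
    print(amino_acid_frequencies("ATGGCCGCC"))
```

In practice the reading frame would come from the BLAST alignment against the reference gene, so the input here stands for the read segment after it has been trimmed to start on a codon boundary.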

                          anneng:

                          https://www.intechopen.com/chapters/75997
                          Entropy Based Biological Sequence Study
