暗能星系


    Variant analysis

    Clinical bioinformatics
    • anneng

      https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7253413/
      Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes
      Multi-nucleotide variants (MNVs) are defined as clusters of two or more nearby variants existing on the same haplotype in an individual [1,2] (Fig. 1a). When variants in an MNV are found within the same codon, the overall impact may differ from the functional consequences of the individual variants [3].
      368dd461-ac62-429d-bf6d-4a658cb5a505-image.png
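To make the same-codon point concrete, here is a minimal Python sketch. The codon `AAC` and the two substitutions are made-up illustration values, not taken from the paper, and the consequence labels are simplified. It annotates each SNV on its own and then the two together as one MNV.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def apply_variants(codon, variants):
    """Apply (offset_in_codon, alt_base) substitutions to a reference codon."""
    bases = list(codon)
    for offset, alt in variants:
        bases[offset] = alt
    return "".join(bases)


def consequence(ref_codon, alt_codon):
    """Simplified consequence label for a codon change."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"


ref = "AAC"                       # Asn (hypothetical example codon)
snv1, snv2 = (0, "T"), (2, "A")   # two SNVs on the same haplotype

for label, variants in [("SNV 1 alone", [snv1]),
                        ("SNV 2 alone", [snv2]),
                        ("MNV (both)", [snv1, snv2])]:
    alt = apply_variants(ref, variants)
    print(label, ref, "->", alt, consequence(ref, alt))
```

Annotated separately, each SNV looks like a missense change (Asn to Tyr, Asn to Lys); annotated together, the codon becomes `TAA`, a stop codon.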

      Identification of MNVs requires the constituent variants to be properly phased, that is, to be identified accurately as either both occurring on the same haplotype (in cis) or on two different haplotypes (in trans). Phasing can be performed following three broad strategies: read-based phasing [18], which assesses whether nearby variants co-segregate on the same reads in DNA sequencing data; family-based phasing [19], which assesses whether pairs of variants are co-inherited within families; and population-based phasing [20], which leverages haplotype sharing between members of a large genotyped population to make a statistical inference of phase. Read-based phasing is particularly effective for pairs of nearby variants, making it suitable for the analysis of MNVs.
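The idea behind read-based phasing can be sketched with a toy Python example. This is not the algorithm used by GATK, WhatsHap, or the paper. Reads are represented here as plain dicts mapping position to observed base, which is an assumption made purely for illustration; a real implementation would walk BAM alignments and account for base and mapping quality.

```python
def phase_pair(reads, site_a, site_b):
    """Decide whether two heterozygous SNVs look cis or trans.

    reads: list of dicts {position: base} (toy representation, an assumption)
    site_a, site_b: (position, ref_base, alt_base)
    Returns (call, cis_support, trans_support).
    """
    pos_a, ref_a, alt_a = site_a
    pos_b, ref_b, alt_b = site_b
    cis = trans = 0
    for read in reads:
        base_a, base_b = read.get(pos_a), read.get(pos_b)
        a_alt, b_alt = base_a == alt_a, base_b == alt_b
        a_ref, b_ref = base_a == ref_a, base_b == ref_b
        if (a_alt and b_alt) or (a_ref and b_ref):
            cis += 1      # read carries alt+alt or ref+ref: same haplotype
        elif (a_alt and b_ref) or (a_ref and b_alt):
            trans += 1    # read carries one alt and one ref: opposite haplotypes
        # reads that miss a site or show another base are uninformative
    if cis > trans:
        return "cis", cis, trans
    if trans > cis:
        return "trans", cis, trans
    return "unphased", cis, trans


reads = [
    {100: "T", 101: "G"},   # alt + alt
    {100: "T", 101: "G"},   # alt + alt
    {100: "A", 101: "C"},   # ref + ref
    {100: "T"},             # covers only one site
]
print(phase_pair(reads, (100, "A", "T"), (101, "C", "G")))
```

Only reads spanning both sites are informative, which is why this approach works best for nearby variants such as the components of an MNV.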

      • anneng

        https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6957021/
        Misannotation of multiple-nucleotide variants risks misdiagnosis
        934ae468-4041-402c-9a68-79dbbd5d1ec5-image.png

        To investigate whether using alternative tools results in correct annotation of MNVs, we re-processed the VCF file of simulated MNVs using GATK 3.6.0 ReadBackedPhasing [10] (default parameters plus “-maxDistMNP 2 -enableMergeToMNP”) or MAC 1.2 [9], then annotated the resulting VCF files using Alamut batch version 1.5.2 (Interactive Biosoftware, Rouen, France). We also tested re-calling the variants using VarDict 1.4 [7] and Platypus 0.8.1 [12].

        Newer GATK versions no longer include the ReadBackedPhasing tool. From the GATK documentation:
        However, they do not emit MNPs. If you would like to combine contiguous SNPs into MNPs, you will need to use the legacy ReadBackedPhasing tool in GATK3 with the MNP merging function activated. See the GATK3 tool documentation for details.
        https://gatk.broadinstitute.org/hc/en-us/articles/360035530752-What-types-of-variants-can-GATK-tools-detect-or-handle-
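As a rough illustration of what "combining contiguous SNPs into MNPs" means, here is a small Python sketch. It is not the GATK ReadBackedPhasing implementation. The `Snv` record and its `haplotype` label are hypothetical, and only directly adjacent positions are merged, so no reference sequence is needed to fill gaps.

```python
from typing import List, NamedTuple


class Snv(NamedTuple):
    pos: int        # 1-based position
    ref: str
    alt: str
    haplotype: int  # hypothetical phase label: equal values mean "in cis"


def merge_adjacent(snvs: List[Snv]) -> List[Snv]:
    """Merge SNVs at consecutive positions on the same haplotype into one record."""
    merged: List[Snv] = []
    last_by_hap = {}  # haplotype label -> index of its latest record in `merged`
    for v in sorted(snvs, key=lambda s: s.pos):
        idx = last_by_hap.get(v.haplotype)
        if idx is not None and merged[idx].pos + len(merged[idx].ref) == v.pos:
            prev = merged[idx]
            # Extend the previous record by one base on both alleles.
            merged[idx] = Snv(prev.pos, prev.ref + v.ref, prev.alt + v.alt, prev.haplotype)
        else:
            last_by_hap[v.haplotype] = len(merged)
            merged.append(v)
    return merged


calls = [
    Snv(100, "A", "T", 1),
    Snv(101, "C", "G", 1),   # adjacent and in cis with position 100: merged
    Snv(102, "G", "A", 2),   # adjacent but on the other haplotype: kept separate
    Snv(105, "T", "C", 1),   # same haplotype but not adjacent: kept separate
]
for record in merge_adjacent(calls):
    print(record)
```

The merged record (`AC` to `TG` at position 100) is what an annotator needs to see in order to evaluate the codon change as a whole.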

        • anneng

          https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4521406/
          MAC: identifying and correcting annotation for multi-nucleotide variations
          https://github.com/leiwei-bioinfo/MAC

          • anneng

            https://gatk.broadinstitute.org/hc/en-us/articles/360035530752-What-types-of-variants-can-GATK-tools-detect-or-handle-

            GATK and Picard variant manipulation tools are currently able to recognize the following types of alleles:

            SNP (single nucleotide polymorphism)
            INDEL (insertion/deletion)
            MIXED (combination of SNPs and indels at a single position)
            MNP (multi-nucleotide polymorphism, e.g. a dinucleotide substitution)
            SYMBOLIC (such as the <NON-REF> allele used in GVCFs produced by HaplotypeCaller, the * allele used to signify the presence of a spanning deletion, or undefined events like a very large allele or one that's fuzzy and not fully modeled; i.e. there's some event going on here but we don't know what exactly)

            • anneng

              https://www.ensembl.org/info/genome/variation/prediction/classification.html
              d0526b6a-39c3-40f0-8b2a-a64e7bd7d46f-image.png

              • anneng

                journal.pone.0262574.pdf
                1996ecf9-1022-4342-93cf-a120cb908c2e-image.png

                • anneng

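                  MNPs (i.e. MNVs) are not emitted by the current GATK version (see the documentation quoted earlier in this thread). That leaves two options: correct the annotation with MAC, or switch to another caller. freebayes officially advertises MNP support.
                  https://github.com/freebayes/freebayes
                  MAC has not been maintained for a while, so using freebayes directly is recommended.

                  freebayes test. With multiple samples, every sample's BAM needs its own @RG header.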

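                  # Align each sample with its own read group (ID and SM) so freebayes can tell the samples apart
                  bwa mem -R '@RG\tID:SRR10000374\tSM:SRR10000374' ecoli.fasta SRR10000374_1.fastq.gz SRR10000374_2.fastq.gz | samtools sort -o SRR10000374.bam -
                  bwa mem -R '@RG\tID:SRR10000377\tSM:SRR10000377' ecoli.fasta SRR10000377_1.fastq.gz SRR10000377_2.fastq.gz | samtools sort -o SRR10000377.bam -
                  # "list" is a plain-text file with one BAM path per line (assumed here to hold the two BAMs above)
                  printf '%s\n' SRR10000374.bam SRR10000377.bam > list
                  ../freebayes-1.3.6-linux-amd64-static -L list -f ecoli.fasta -v demo.vcf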
                  

                  https://bioinformaticsworkbook.org/dataAnalysis/VariantCalling/freebayes-dnaseq-workflow.html#gsc.tab=0

                  • anneng

                    CNV

                    https://gatk.broadinstitute.org/hc/en-us/articles/360035531452-After-gCNV-calling-considerations
                    https://gatk.broadinstitute.org/hc/en-us/articles/360035531152
                    38539c97-e87a-4514-ab3a-c5905d7b9596-image.png

                    https://www.biorxiv.org/content/10.1101/2021.04.30.442110v1.full
                    A comparison of tools for copy-number variation detection in germline whole exome and whole genome sequencing data
                    715615c1-4cdc-40c0-8d74-32ad701e2314-image.png

                    cnvpytor test notes
                    Small server: /ceph_disk3/var_demo_data

                    sudo apt-get install python3-tk
                    pip3 install cnvpytor -i https://pypi.tuna.tsinghua.edu.cn/simple
                    If cnvpytor is installed under conda, copy these data files over from the source tree:
                    cp CNVpytor/cnvpytor/data/*.pytor /opt/miniconda3/lib/python3.7/site-packages/cnvpytor/data/
                    samtools index NA12877_S1.bam
                    cnvpytor -root 12877.pytor -rd NA12877_S1.bam
                    cnvpytor -root 12877.pytor -his 1000 10000 100000
                    cnvpytor -root 12877.pytor -partition 1000 10000 100000
                    cnvpytor -root 12877.pytor -call 1000 10000 100000
                    
                    The following steps are optional; run them only when SNP data is used (control this with a switch in CWL/WDL)
                    /opt/miniconda3/bin/cnvpytor -root 12877.pytor -snp NA12877_S1.genome.vcf -sample NA12877
                    /opt/miniconda3/bin/cnvpytor -root 12877.pytor -pileup NA12877_S1.bam
                    /opt/miniconda3/bin/cnvpytor -root 12877.pytor -mask_snps
                    /opt/miniconda3/bin/cnvpytor -root 12877.pytor -baf 1000 10000 100000
                    /opt/miniconda3/bin/cnvpytor -root 12877.pytor -call baf 1000 10000 100000
                    Note: the next command needs a file describing the regions
                    /opt/miniconda3/bin/cnvpytor -root 12877.pytor -genotype 1000 10000 100000 <regions
                    /opt/miniconda3/bin/cnvpytor -root 12877.pytor -call combined 1000 10000 100000
                    A few more commands produce plots; those will be handled together with the report
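The "switch" for the optional SNP steps could look roughly like the Python sketch below. It only builds the command lines; the flags are copied from the commands above, while the function name, its parameters, and the choice to leave out `-genotype` (which reads its regions from standard input) are assumptions for illustration.

```python
import shlex


def cnvpytor_commands(root, bam, bin_sizes, vcf=None, sample=None, exe="cnvpytor"):
    """Return the cnvpytor steps as argument lists; SNP steps only if `vcf` is given."""
    bins = [str(b) for b in bin_sizes]
    base = [exe, "-root", root]
    # Read-depth steps, always run.
    cmds = [
        base + ["-rd", bam],
        base + ["-his"] + bins,
        base + ["-partition"] + bins,
        base + ["-call"] + bins,
    ]
    if vcf is not None:
        if sample is None:
            raise ValueError("sample name is required when a VCF is given")
        # Optional SNP/BAF steps, enabled by passing a VCF.
        cmds += [
            base + ["-snp", vcf, "-sample", sample],
            base + ["-pileup", bam],
            base + ["-mask_snps"],
            base + ["-baf"] + bins,
            base + ["-call", "baf"] + bins,
            base + ["-call", "combined"] + bins,
        ]
    return cmds


steps = cnvpytor_commands(
    "12877.pytor", "NA12877_S1.bam", [1000, 10000, 100000],
    vcf="NA12877_S1.genome.vcf", sample="NA12877",
)
for cmd in steps:
    # Dry run: print a shell-safe command line instead of executing it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To execute rather than print, each list can be passed to `subprocess.run(cmd, check=True)`, so that a failing step stops the sequence.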
                    

                    Error:

                    Traceback (most recent call last):
                      File "/home/anneng/.local/bin/cnvpytor", line 11, in <module>
                        sys.exit(main())
                      File "/home/anneng/.local/lib/python2.7/site-packages/cnvpytor/__main__.py", line 437, in main
                        use_gc_corr=not args.no_gc_corr, use_mask=args.use_mask_with_rd)
                      File "/home/anneng/.local/lib/python2.7/site-packages/cnvpytor/root.py", line 1502, in call
                        distN = np.zeros_like(NN, dtype="long") - 1
                      File "/home/anneng/.local/lib/python2.7/site-packages/numpy/core/numeric.py", line 168, in zeros_like
                        res = empty_like(a, dtype=dtype, order=order, subok=subok)
                    TypeError: data type "long" not understood
                    The traceback shows cnvpytor running from a Python 2.7 install; it needs to be run with Python 3.
                    
                    • anneng

                      CEPH family 1463 pedigree data
                      https://catalog.coriell.org/0/Sections/Collections/NIGMS/CEPHFamiliesDetail.aspx?PgId=441&fam=1463&

                      https://www.illumina.com/platinumgenomes.html

                      https://console.cloud.google.com/storage/browser/genomics-public-data/platinum-genomes/vcf?pageState=("StorageObjectListTable":("f":"%255B%255D"))&prefix=&forceOnObjectsSortingFiltering=false

                      • anneng

                        0bc2cc59-a0da-4811-9299-de078db3a44e-image.png
                        https://seqone.com/science/
