Tobacco mixture identification: process notes
-
1. Build the bwa index
docker run -v /public/data/data_20220228/:/data anneng01:8090/library/angs_bwa:1.0.0 bwa index /data/Nitab-v4.5_genome_Chr_Edwards2017.fasta
2. Alignment
# Map paired-end reads; {/} is the sample name (directory and _R1.fq.gz suffix stripped)
find ../16_fastq/*_R1.fq.gz | sed 's/_R1\.fq\.gz$//' | nohup parallel "docker run -v /public/data/data_20220228/:/data anneng01:8090/library/angs_bwa:1.0.0 bwa mem -M -o /data/0-mapping/{/}.sam /data/Nitab-v4.5_genome_Chr_Edwards2017.fasta /data/16_fastq/{/}_R1.fq.gz /data/16_fastq/{/}_R2.fq.gz" &
# Convert SAM to BAM, dropping unmapped reads (-F 4); run in 0-mapping once the mapping jobs have finished
nohup parallel "docker run -v /public/data/data_20220228/0-mapping/:/data anneng01:8090/library/angs_bwa:1.0.0 samtools view -b -F 4 -o /data/{.}.bam /data/{}" ::: *.sam &
# Coordinate-sort each BAM; run once the conversion has finished
nohup parallel "docker run -v /public/data/data_20220228/0-mapping/:/data anneng01:8090/library/angs_bwa:1.0.0 samtools sort -o /data/{.}.sorted.bam /data/{}" ::: *.bam &
3. Call SNPs
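Every calling command in this step reads the *.sorted.bam files written by the alignment step, so it may be worth confirming that each sample has one before starting. A minimal sketch, assuming the file naming used in these notes (<sample>_R1.fq.gz in 16_fastq, <sample>.sorted.bam in 0-mapping); the function name missing_sorted_bams is hypothetical:

```shell
# Hypothetical helper: list samples that have an R1 FASTQ but no non-empty sorted BAM.
# Usage: missing_sorted_bams <fastq_dir> <mapping_dir>
# The body runs in a subshell, so its variables do not leak into the caller.
missing_sorted_bams() (
    fastq_dir=$1
    mapping_dir=$2
    for r1 in "$fastq_dir"/*_R1.fq.gz; do
        [ -e "$r1" ] || continue                # the glob matched nothing
        sample=$(basename "$r1" _R1.fq.gz)      # strip directory and suffix
        # -s: file exists and is not empty
        [ -s "$mapping_dir/$sample.sorted.bam" ] || echo "$sample"
    done
)
```

For example, missing_sorted_bams ../16_fastq ../0-mapping prints one sample name per line, and prints nothing when every sample has a sorted BAM.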
# Run the following from 1-calling, one step at a time (each step reads the previous step's output)
# Compute genotype likelihoods for each sorted BAM
nohup parallel "docker run -v /public/data/data_20220228/:/data anneng01:8090/library/angs_bcftools:1.0.0 bcftools mpileup -Ov -f /data/Nitab-v4.5_genome_Chr_Edwards2017.fasta -o /data/1-calling/{/.}.vcf /data/0-mapping/{/}" ::: ../0-mapping/*.sorted.bam &
# Call variants (multiallelic caller, variant sites only)
nohup parallel "docker run -v /public/data/data_20220228/:/data anneng01:8090/library/angs_bcftools:1.0.0 bcftools call -mv -Ov -o /data/1-calling/{/.}.variants.vcf /data/1-calling/{/}" ::: *.sorted.vcf &
# Normalize indels against the reference
nohup parallel "docker run -v /public/data/data_20220228/:/data anneng01:8090/library/angs_bcftools:1.0.0 bcftools norm -f /data/Nitab-v4.5_genome_Chr_Edwards2017.fasta -Ob -o /data/1-calling/{/.}.norm.bcf /data/1-calling/{/}" ::: *.sorted.variants.vcf &
# Filter clusters of indels separated by 5 bp or fewer
nohup parallel "docker run -v /public/data/data_20220228/:/data anneng01:8090/library/angs_bcftools:1.0.0 bcftools filter --IndelGap 5 -Ob -o /data/1-calling/{/.}.norm.filtered.bcf /data/1-calling/{/}" ::: *.norm.bcf &
# Index the filtered BCF files
nohup parallel "docker run -v /public/data/data_20220228/:/data anneng01:8090/library/angs_bcftools:1.0.0 bcftools index /data/1-calling/{/}" ::: *.filtered.bcf &
4. Consensus
-
https://samtools.github.io/bcftools/howtos/consensus-sequence.html
How the consensus sequence is built
https://github.com/samtools/bcftools/issues/1155
This thread says that for genomes with ploidy greater than 2, freebayes has to be used. The relevant reply:
Thank you. I don't know what to expect from the panthera pardus genome assembly in terms of quality, but the alignments are very very poor. It is no wonder that in some cases they give a conflicting evidence, I am afraid there is not much that can be done here.
Regarding the question about ploidy, you can get a list of predefined ploidies by running bcftools call --ploidy ?. Note that bcftools can model only haploid or diploid genomes, therefore mtDNA can violate this assumption. For calling in arbitrary ploidy, try freebayes. However, variant calling works on human/mouse mtDNA, so arguably the problem here is the quality of the genome assembly.
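The advice in the quoted reply could be followed roughly as sketched below. This is not the workflow used in these notes: it assumes freebayes and bcftools are available on PATH (no freebayes container image is named here), it assumes a ploidy of 4 because cultivated tobacco is an allotetraploid, and the function name call_polyploid and the output name <sample>.freebayes.vcf are hypothetical. With DRY_RUN=1 (the default in this sketch) the commands are only printed.

```shell
# Dry-run wrapper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

REF=Nitab-v4.5_genome_Chr_Edwards2017.fasta
PLOIDY=4    # assumption: allotetraploid tobacco

# call_polyploid <sample>: hypothetical helper around freebayes.
call_polyploid() {
    sample=$1
    # List the predefined ploidy models of bcftools call ('?' is quoted to avoid globbing)
    run bcftools call --ploidy '?'
    # Call variants at the assumed ploidy; -v writes the VCF to a file instead of stdout
    run freebayes -f "$REF" -p "$PLOIDY" \
        -v "1-calling/$sample.freebayes.vcf" \
        "0-mapping/$sample.sorted.bam"
}
```

Running call_polyploid with a sample name prints the two commands; setting DRY_RUN=0 executes them instead.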
-
https://www.biostars.org/p/354264/
The site chr1:724953 overlaps with another variant, skipping...
The site chr1:797380 overlaps with another variant, skipping...
The fasta sequence does not match the REF allele at chr1:1298836:
.vcf: [CAGAG]
.vcf: [CAG] <- (ALT)
.fa: [GAGAG]TTTTGTTCTTTTGCTCAGGATGGAGAGCAGTGGTGCAATC
Indels cause errors when building the consensus, so we use only SNPs for the consensus.
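Restricting the consensus to SNPs could look like the sketch below. It assumes bcftools is available directly on PATH (rather than through the angs_bcftools container), that the input is one of the *.filtered.bcf files from step 3, and that the output names *.snps.bcf and *.consensus.fa are acceptable; those names and the function snp_consensus are made up for illustration. With DRY_RUN=1 (the default in this sketch) the commands are only printed, not executed.

```shell
# Dry-run wrapper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

REF=Nitab-v4.5_genome_Chr_Edwards2017.fasta

# snp_consensus <in.bcf>: keep SNPs only, index the result, apply it to the reference.
snp_consensus() {
    in=$1
    base=${in%.bcf}                       # strip the .bcf extension
    # -v snps keeps SNP records and drops indels and other variant types
    run bcftools view -v snps -Ob -o "$base.snps.bcf" "$in"
    # bcftools consensus needs an indexed input file
    run bcftools index "$base.snps.bcf"
    # Write the reference with the SNP alleles substituted in
    run bcftools consensus -f "$REF" -o "$base.consensus.fa" "$base.snps.bcf"
}
```

For example, snp_consensus 1-calling/sample1.filtered.bcf (sample1 is a placeholder) prints the three commands in order; setting DRY_RUN=0 runs them.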