Faster NGS Duplicate Marking
-
https://medium.com/grail-eng/faster-ngs-duplicate-marking-d7a1fd287f46



Picard¹ and Sambamba use similar algorithms
SAMTools (rmdup) -
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1097-3#:~:text=PCR duplicates are sequence reads,duplicates appear proportionately more often
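This article reports that duplicate removal has little effect on downstream variant-calling results, but it does noticeably reduce the amount of data to analyze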
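
A rough sketch of the position-based duplicate marking that Picard-style tools are generally described as using: reads with the same chromosome, 5' position, and strand form one group, and all but the read with the highest summed base quality are flagged. This is a simplified single-end illustration, not the actual Picard or Sambamba implementation. The `Read` class, its fields (including a precomputed `pos5`), and `mark_duplicates` are hypothetical names made up for this note; real tools also handle read pairs, soft clipping, and libraries.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List


@dataclass
class Read:
    # Hypothetical minimal read record, for illustration only.
    name: str
    chrom: str
    pos5: int            # assumed: 5' position already adjusted for clipping
    reverse: bool        # True if the read maps to the reverse strand
    base_quals: List[int] = field(default_factory=list)
    is_duplicate: bool = False


def mark_duplicates(reads: List[Read]) -> List[Read]:
    """Flag all but the best-scoring read in each (chrom, pos5, strand) group."""
    groups = defaultdict(list)
    for read in reads:
        groups[(read.chrom, read.pos5, read.reverse)].append(read)

    for group in groups.values():
        # Keep the read with the highest summed base quality
        # (the first such read wins on ties).
        best = max(group, key=lambda r: sum(r.base_quals))
        for read in group:
            read.is_duplicate = read is not best
    return reads


reads = [
    Read("a", "chr1", 100, False, [30, 30]),
    Read("b", "chr1", 100, False, [20, 20]),   # same key as "a", lower quality
    Read("c", "chr1", 100, True, [10]),        # different strand, own group
]
mark_duplicates(reads)
flags = {r.name: r.is_duplicate for r in reads}
```

Here only read "b" ends up flagged, because it shares position and strand with "a" but has the lower quality sum. Marking (setting a flag) rather than deleting keeps the reads available, which matches the difference between duplicate marking and an rmdup-style removal.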