About the Mark as Duplicate Reads option

For some applications, duplicate reads arising from PCR cause problems in downstream analysis. Duplicates can make it appear that multiple independent reads support a particular interpretation, when some of those reads are in fact copies of one another and add no further evidence for that interpretation.

Torrent Suite™ Software uses an Ion-optimized approach that considers both the start and the end of each read: the 5′ alignment start site and the flow in which the 3′ adapter is detected. Duplicate reads are flagged in a dedicated field in the BAM file. This method is recommended over approaches that consider only the 5′ alignment start site.
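
The difference between the two keying strategies can be illustrated with a minimal Python sketch. This is not the Torrent Suite™ Software implementation: the `Read` record, its field names, and the rule of keeping the first read in each group are assumptions made only for illustration.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, NamedTuple, Set


class Read(NamedTuple):
    """Hypothetical, simplified read record (not a real BAM record)."""
    name: str
    chrom: str
    strand: str        # "+" or "-"
    start_5p: int      # 5' alignment start site
    adapter_flow: int  # flow in which the 3' adapter was detected


def start_only_key(read: Read) -> Hashable:
    # Key used by approaches that consider only the 5' alignment start site.
    return (read.chrom, read.strand, read.start_5p)


def start_and_flow_key(read: Read) -> Hashable:
    # Key that also includes the 3' adapter flow, as described above.
    return (read.chrom, read.strand, read.start_5p, read.adapter_flow)


def find_duplicates(reads: Iterable[Read],
                    key: Callable[[Read], Hashable]) -> Set[str]:
    """Return names of reads that share a key with an earlier read."""
    groups = defaultdict(list)
    for read in reads:
        groups[key(read)].append(read.name)
    duplicates = set()
    for names in groups.values():
        # Assumption: the first read in each group is kept as the original.
        duplicates.update(names[1:])
    return duplicates


reads = [
    Read("r1", "chr1", "+", 100, 250),
    Read("r2", "chr1", "+", 100, 250),  # same start and same adapter flow as r1
    Read("r3", "chr1", "+", 100, 310),  # same start, different adapter flow
    Read("r4", "chr1", "+", 200, 250),  # different start
]

print(sorted(find_duplicates(reads, start_only_key)))      # ['r2', 'r3']
print(sorted(find_duplicates(reads, start_and_flow_key)))  # ['r2']
```

In this toy data set, the start-only key flags `r3` even though it ends at a different flow, while the combined key flags only `r2`.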

Marking duplicate reads is not appropriate for Ion AmpliSeq™ data, because many independent reads are expected to share the same 5′ alignment position and 3′ adapter flow. Marking duplicates on an Ion AmpliSeq™ run risks flagging many reads that are in fact independent of one another.
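
The effect on amplicon data can be seen in a small Python sketch. All coordinates, flow numbers, and read counts below are invented for illustration, and the counting rule (one read kept per group) is an assumption, not a description of the software's internals.

```python
from collections import Counter


def count_flagged(keys):
    """Count reads that would be flagged if one read is kept per identical key."""
    counts = Counter(keys)
    return sum(n - 1 for n in counts.values())


# Hypothetical amplicon: every read begins and ends at the same positions,
# so all 1000 independent reads share one (chrom, strand, start, flow) key.
amplicon_key = ("chr7", "+", 55_241_600, 212)
amplicon_reads = [amplicon_key for _ in range(1000)]

# Hypothetical fragment library: read starts and adapter flows vary,
# so independent reads rarely share a key (here, never).
fragment_reads = [
    ("chr7", "+", 55_241_600 + 37 * i, 180 + (i % 90)) for i in range(1000)
]

flagged_amplicon = count_flagged(amplicon_reads)
flagged_fragment = count_flagged(fragment_reads)

print(flagged_amplicon)  # 999
print(flagged_fragment)  # 0
```

Under these assumptions, nearly every amplicon read is flagged even though each one is independent, which is the risk described above.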