Barcode classification parameters

The following table lists and describes the more commonly used BaseCaller module parameters for barcode classification.

Parameter

Default

Description

--barcode-cutoff

1.0

(float)

The maximum distance allowed in barcode matches. This threshold sets the stringency of barcode matching: lower values require more exact matches when assigning reads to barcodes, and higher values allow less exact matches.

Reads that have a distance greater than this value are counted as barcode no-matches.

--barcode-mode

2

(integer)

The barcode classification mode:

  • 0—Classification based on exact barcode base match.

  • 1—A barcode is scored by comparing each read sequence to each barcode sequence in a flow space alignment. Errors in each flow are summed over the length of the barcode flows. Then any barcode with a number of errors equal to or less than the --barcode-cutoff value can be considered, and the barcode with the fewest errors with respect to the input sequence is the matching barcode. (The default in 4.0, known as hard decision classification.)

  • 2—The barcode classification is based on signal information, specifically on the squared distance between the measured signal and the predicted barcode signal. (The default in 4.4, known as soft decision classification.)
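The difference between modes 1 and 2 can be illustrated with a minimal Python sketch. This is not the BaseCaller implementation: the function names, the toy barcode flow values, and the reduction of a flow-space alignment to a per-flow absolute difference are assumptions made for illustration only. Applying the same cutoff value to both distance measures is also an assumption.

```python
from typing import Dict, Optional, Sequence

# Hypothetical barcodes, given as expected homopolymer lengths per flow.
BARCODES: Dict[str, Sequence[int]] = {
    "BC1": [1, 0, 2, 0, 1],
    "BC2": [0, 1, 1, 1, 0],
}


def classify_hard(read_flows: Sequence[int],
                  barcodes: Dict[str, Sequence[int]],
                  cutoff: float) -> Optional[str]:
    """Mode-1-style sketch: sum the per-flow errors over the barcode flows
    and keep the barcode with the fewest errors, if it is within the cutoff."""
    best_name, best_errors = None, None
    for name, bc_flows in barcodes.items():
        errors = sum(abs(r - b) for r, b in zip(read_flows, bc_flows))
        if errors <= cutoff and (best_errors is None or errors < best_errors):
            best_name, best_errors = name, errors
    return best_name  # None means "barcode no-match"


def squared_distance(signal: Sequence[float],
                     predicted: Sequence[float]) -> float:
    """Squared distance between measured and predicted barcode signal."""
    return sum((s - p) ** 2 for s, p in zip(signal, predicted))


def classify_soft(signal: Sequence[float],
                  barcodes: Dict[str, Sequence[int]],
                  cutoff: float) -> Optional[str]:
    """Mode-2-style sketch: pick the barcode whose predicted signal is
    closest to the measured signal, if that distance is within the cutoff."""
    distances = {name: squared_distance(signal, pred)
                 for name, pred in barcodes.items()}
    best = min(distances, key=distances.get)
    return best if distances[best] <= cutoff else None
```

In this sketch, a read whose called flows differ from BC1 in one flow is still assigned to BC1 at a cutoff of 1.0, while a read that differs in two flows becomes a no-match.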

--barcode-separation

2.5

(float)

Controls how much ambiguity in barcode assignment is tolerated by comparing the distance to the closest barcode with the distance to the next closest barcode. A read is rejected if the difference between these two distances is less than the --barcode-separation setting.

--barcode-separation has no effect when --barcode-mode is set to 1.
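The separation rule can be sketched as follows. The function name and the example distances are hypothetical, and the handling of a barcode set with only one barcode is an assumption that the description above does not cover.

```python
from typing import Dict, Optional


def apply_separation(distances: Dict[str, float],
                     separation: float = 2.5) -> Optional[str]:
    """Return the closest barcode, or None if the read is rejected because
    the closest and next closest barcodes are too similar in distance."""
    ranked = sorted(distances.items(), key=lambda item: item[1])
    if not ranked:
        return None
    if len(ranked) == 1:
        # Assumption: with a single candidate there is nothing to compare.
        return ranked[0][0]
    (best_name, best_dist), (_, second_dist) = ranked[0], ranked[1]
    # Reject when the gap between the two distances is below the setting.
    if second_dist - best_dist < separation:
        return None
    return best_name
```

For example, distances of 0.2 and 3.0 differ by 2.8, so the read is kept at the default setting of 2.5, whereas distances of 0.2 and 1.0 differ by only 0.8 and the read is rejected.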

--barcode-filter-postpone

1

(integer)

  • 0—Keeps the 4.0 behavior: barcode filtering is done independently on each block. This is the default for all Ion PGM™ analyses, for Ion Proton™ thumbnail processing (a thumbnail consists of only a single block), and for processing in the base calibration training stage.

  • 1—The BaseCaller module does barcode pre-filtering at a 10x lower frequency threshold (10 times more lenient). Barcode filtering is done on the full information of the whole chip, after the 96 blocks are merged into one. This is the default for Ion Proton™ full-chip (not thumbnail) analyses.

  • 2—The BaseCaller module does not do any barcode prefiltering. All barcode classification happens after the 96 blocks are merged into one. The "2" setting is slower, creates more files, and involves more processing than the "1" setting.

Do not change this parameter. Instead accept the pipeline defaults, which are different for Ion PGM™ and Ion Proton™ analyses.

--barcode-filter

0.01

(float)

The relative frequency threshold that a barcode must reach to be reported in the user interface. The relative frequency of a barcode is the number of reads assigned to it divided by the number of reads assigned to the most frequent barcode.

0.0—Off. The setting 0.0 causes all barcodes in the barcode set to be reported in the user interface, including barcodes with no or very few reads, provided that the barcode group has at least --barcode-filter-minreads number of reads. Barcodes with no or very few reads typically are not relevant to your analysis and should be filtered out.

--barcode-filter-minreads

20

(integer)

The threshold for the minimum number of reads in a barcode group for that group to be reported in the user interface.
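The combined effect of --barcode-filter and --barcode-filter-minreads can be sketched as below. The function name and read counts are hypothetical, and treating both thresholds as inclusive (greater than or equal to) is an assumption, since the descriptions do not state how values exactly at the threshold are handled.

```python
from typing import Dict, List


def reported_barcodes(read_counts: Dict[str, int],
                      barcode_filter: float = 0.01,
                      min_reads: int = 20) -> List[str]:
    """Return the barcodes that would be reported under both thresholds."""
    if not read_counts:
        return []
    top = max(read_counts.values())
    reported = []
    for name, count in read_counts.items():
        # Relative frequency: reads for this barcode divided by the reads
        # assigned to the most frequent barcode.
        relative = count / top if top > 0 else 0.0
        if relative >= barcode_filter and count >= min_reads:
            reported.append(name)
    return reported


counts = {"BC1": 100000, "BC2": 5000, "BC3": 500, "BC4": 15}
print(reported_barcodes(counts))                      # default thresholds
print(reported_barcodes(counts, barcode_filter=0.0))  # frequency filter off
```

With the default thresholds, BC3 is dropped because its relative frequency is 0.005. With the frequency filter off, BC3 is reported, but BC4 is still dropped because it has fewer than 20 reads.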

--end-barcodes

on

(boolean)

For dual barcoding runs, specifying "off" disables end barcode classification and uses the start barcodes for read classification instead of both barcodes.

For example, dual barcodes are generally not needed for DNA germline or somatic variant calling. You can also run hybrid analyses with varying limits of detection that account for fluctuations in read coverage.

When this option is set to "off":

  • End barcodes are still searched for and trimmed off the read if found.

  • No sk tag is written in the BAM read group header.

  • End barcodes, end barcode adapters, and PCR handles are trimmed if they are found, and all of them are stored in the YK tag.

  • Reads that do not have a YK tag are those that are normally filtered out by a dual barcode analysis, such as reads in which no bead adapter was found.

  • With Ion AmpliSeq™ HD chemistry, the tag trimmer options must be adjusted to keep reads in which no end barcode was identified. Use --tag-filter-method=need-prefix.

  • With Ion AmpliSeq™ HD chemistry, the variant caller options must be adjusted to emulate a normal Ion AmpliSeq™ analysis. Specifically, the variant caller must be configured to ignore molecular tag information.

  • End barcode classification is not completely off. Reads in which an identified end barcode does not match read group expectations, that is, reads that are “certified contamination”, are still filtered and pushed to no-match.

--trim-barcodes

on

(boolean)

Trims the barcode and barcode adapter. Setting this parameter to off also disables all other 5’ trimming.

--barcode-adapter-check

0.15

(float)

Validate the barcode adapter sequence. The parameter given is the maximum allowed squared residual per flow. This feature reduces cross-contamination, for example, between the Ion Xpress™ Barcode Adapters and IonCode™ Barcode Adapters barcode sets.

0—Off.
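One plausible reading of the adapter check is sketched below. The function name, the example signals, and the choice to average the squared residual over the adapter flows are assumptions; the exact computation that BaseCaller performs is not described here.

```python
from typing import Sequence


def adapter_check_passes(measured: Sequence[float],
                         predicted: Sequence[float],
                         max_residual: float = 0.15) -> bool:
    """Return True if the barcode adapter signal is accepted."""
    if max_residual == 0:
        return True  # a setting of 0 turns the check off
    n_flows = min(len(measured), len(predicted))
    if n_flows == 0:
        return True  # assumption: nothing to validate without adapter flows
    # Assumed metric: squared residual averaged over the adapter flows.
    residual = sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n_flows
    return residual <= max_residual
```

Under this reading, a measured adapter signal that closely tracks the predicted signal passes at the default of 0.15, and a signal from a different adapter sequence produces a large residual and fails.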