Class ExtractIlluminaBarcodes


  • @DocumentedFeature
    public class ExtractIlluminaBarcodes
    extends ExtractBarcodesProgram
    Determine the barcode for each read in an Illumina lane. For each tile, a file is written to the basecalls directory of the form s_<lane>_<tile>_barcode.txt. The output file contains a line for each read in the tile, aligned with the regular basecall output.

    The output file contains the following tab-separated columns:
    - read subsequence at barcode position
    - Y or N indicating if there was a barcode match
    - matched barcode sequence (empty if the read did not match one of the barcodes). If there is no match but the read is close to the threshold for calling a match, the barcode that would have been matched is output in lower case
    - distance to best matching barcode, "mismatches" (*)
    - distance to second-best matching barcode, "mismatchesToSecondBest" (*)

    NOTE (*): Due to an optimization, the reported mismatches and mismatchesToSecondBest values may be inaccurate, as long as the conclusion (match vs. no-match) isn't affected. For example, reported mismatches and mismatchesToSecondBest may be smaller than their true values if mismatches is truly larger than MAX_MISMATCHES. Also, mismatchesToSecondBest might be smaller than its true value if its true value is greater than mismatches + MIN_MISMATCH_DELTA.
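The column layout described above can be read back with a few lines of Java. The following is a minimal sketch, not part of Picard: the class name BarcodeLine, its fields, and the isNearMiss helper are hypothetical, and it assumes every line carries exactly five tab-separated columns with integer values in the last two.

```java
import java.util.Locale;

/** Hypothetical holder for one line of a per-tile barcode file; not a Picard class. */
final class BarcodeLine {
    final String readSubsequence;       // read bases at the barcode position
    final boolean matched;              // true for "Y", false for "N"
    final String barcode;               // empty if no barcode; lower case for a near miss
    final int mismatches;               // distance to the best matching barcode
    final int mismatchesToSecondBest;   // distance to the second-best matching barcode

    private BarcodeLine(String readSubsequence, boolean matched, String barcode,
                        int mismatches, int mismatchesToSecondBest) {
        this.readSubsequence = readSubsequence;
        this.matched = matched;
        this.barcode = barcode;
        this.mismatches = mismatches;
        this.mismatchesToSecondBest = mismatchesToSecondBest;
    }

    /** Parses one line; assumes exactly five tab-separated columns. */
    static BarcodeLine parse(String line) {
        // A limit of -1 keeps empty fields, such as an empty matched-barcode column.
        String[] f = line.split("\t", -1);
        if (f.length != 5) {
            throw new IllegalArgumentException("Expected 5 columns, got " + f.length);
        }
        if (!f[1].equals("Y") && !f[1].equals("N")) {
            throw new IllegalArgumentException("Match flag must be Y or N: " + f[1]);
        }
        return new BarcodeLine(f[0], f[1].equals("Y"), f[2],
                Integer.parseInt(f[3]), Integer.parseInt(f[4]));
    }

    /** True when there was no match but a lower-case "would have matched" barcode is reported. */
    boolean isNearMiss() {
        return !matched && !barcode.isEmpty()
                && barcode.equals(barcode.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Illustrative line, not taken from real output.
        BarcodeLine line = BarcodeLine.parse("ACGTACGA\tN\tacgtacgt\t1\t2");
        System.out.println(line.barcode + " nearMiss=" + line.isNearMiss());
    }
}
```

Because of the optimization noted above, the two distance columns are best treated as lower bounds rather than exact values when the read did not match.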
    • Field Detail

      • BARCODE_FILE

        @Argument(doc="Tab-delimited file of barcode sequences, barcode name and, optionally, library name.  Barcodes must be unique and all the same length.  Column headers must be \'barcode_sequence\' (or \'barcode_sequence_1\'), \'barcode_sequence_2\' (optional), \'barcode_name\', and \'library_name\'.",
                  mutex="BARCODE")
        public File BARCODE_FILE
      • BARCODE

        @Argument(doc="Barcode sequence.  These must be unique, and all the same length.  This cannot be used with reads that have more than one barcode; use BARCODE_FILE in that case. ",
                  mutex="BARCODE_FILE")
        public List<String> BARCODE
      • NUM_PROCESSORS

        @Argument(doc="Run this many PerTileBarcodeExtractors in parallel.  If NUM_PROCESSORS = 0, number of cores is automatically set to the number of cores available on the machine. If NUM_PROCESSORS < 0 then the number of cores used will be the number available on the machine less NUM_PROCESSORS.")
        public int NUM_PROCESSORS
      • OUTPUT_DIR

        @Argument(doc="Where to write _barcode.txt files.  By default, these are written to BASECALLS_DIR.",
                  optional=true)
        public File OUTPUT_DIR
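The NUM_PROCESSORS rule above can be sketched as a small function. This is one reading of the documented behaviour, not Picard's actual code: the class ProcessorCount and method resolve are hypothetical, a negative value is interpreted as "available cores minus the absolute value of NUM_PROCESSORS", and the clamp to at least one thread is an added assumption.

```java
/** Hypothetical helper illustrating the NUM_PROCESSORS rule; not a Picard class. */
final class ProcessorCount {

    /**
     * @param numProcessors the NUM_PROCESSORS argument value
     * @param available     cores available on the machine
     * @return number of extractors to run in parallel
     */
    static int resolve(int numProcessors, int available) {
        if (numProcessors == 0) {
            // 0 means "use every available core".
            return available;
        }
        if (numProcessors < 0) {
            // Negative means "leave this many cores free".
            // Clamping to 1 is an assumption, so the result is never zero or negative.
            return Math.max(1, available + numProcessors);
        }
        // Positive values are used as given.
        return numProcessors;
    }

    public static void main(String[] args) {
        int available = Runtime.getRuntime().availableProcessors();
        System.out.println("NUM_PROCESSORS=-1 resolves to " + resolve(-1, available));
    }
}
```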
    • Constructor Detail

      • ExtractIlluminaBarcodes

        public ExtractIlluminaBarcodes()
    • Method Detail

      • doWork

        protected int doWork()
        Description copied from class: CommandLineProgram
        Do the work after the command line has been parsed. RuntimeExceptions may be thrown by this method and are reported appropriately.
        Specified by:
        doWork in class CommandLineProgram
        Returns:
        program exit status.
      • customCommandLineValidation

        protected String[] customCommandLineValidation()
        Description copied from class: ExtractBarcodesProgram
        Parses all barcodes from input files and validates that all barcodes are the same length and unique.
        Overrides:
        customCommandLineValidation in class ExtractBarcodesProgram
        Returns:
        null if the command line is valid. If the command line is invalid, returns an array of error messages to be written to the appropriate place.
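The validation contract above (null on success, an array of messages on failure) can be illustrated with a minimal sketch. It is not the Picard implementation: the class BarcodeValidation, the method validate, and the message wording are hypothetical, and uniqueness is assumed to mean exact, case-sensitive string equality.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical check that barcodes are unique and all the same length; not a Picard class. */
final class BarcodeValidation {

    /** Returns null if the barcodes are valid, otherwise an array of error messages. */
    static String[] validate(List<String> barcodes) {
        List<String> errors = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        // The first barcode sets the expected length for all the others.
        int expectedLength = barcodes.isEmpty() ? 0 : barcodes.get(0).length();

        for (String barcode : barcodes) {
            if (barcode.length() != expectedLength) {
                errors.add("Barcode " + barcode + " has length " + barcode.length()
                        + ", expected " + expectedLength);
            }
            // Set.add returns false when the value is already present.
            if (!seen.add(barcode)) {
                errors.add("Barcode " + barcode + " specified more than once");
            }
        }
        return errors.isEmpty() ? null : errors.toArray(new String[0]);
    }
}
```

Collecting every problem before returning, rather than stopping at the first one, lets a caller report all invalid barcodes in a single run.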