Class MarkDuplicatesForFlowHelper

  • All Implemented Interfaces:
    MarkDuplicatesHelper

    public class MarkDuplicatesForFlowHelper
    extends Object
    implements MarkDuplicatesHelper
    Helper class for MarkDuplicates calculations in flow-based mode. The class extends the behavior of MarkDuplicates, which contains the complete code for the non-flow-based mode. In flow mode, additional parameters may control how read ends (start/end) are established for the purpose of determining duplication status. In addition, the logic that gathers reads into duplicate buckets (chunks) is enhanced with an optional read-end uncertainty threshold. When it is active, reads are considered to belong to the same chunk if, for each read in the chunk, there exists at least one other read in the chunk whose end lies within the uncertainty distance.
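    The chunking rule described above can be illustrated with a minimal sketch. It assumes that reads are reduced to their end coordinates and that "within the uncertainty distance" means a gap of at most the threshold. The class EndUncertaintyChunkSketch and the method chunkByEnd are hypothetical names used for illustration only; they are not part of Picard, and the actual helper also takes other read properties into account.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: this class and its method are hypothetical, not Picard API.
final class EndUncertaintyChunkSketch {

    // Groups read-end coordinates so that every coordinate in a multi-read chunk
    // has a neighbor in the same chunk at most `uncertainty` positions away.
    // Assumption: the boundary is inclusive (gap == uncertainty stays in the chunk).
    static List<List<Integer>> chunkByEnd(int[] ends, int uncertainty) {
        int[] sorted = ends.clone();
        Arrays.sort(sorted);

        List<List<Integer>> chunks = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int i = 0; i < sorted.length; i++) {
            // A gap larger than the threshold closes the current chunk.
            if (i > 0 && sorted[i] - sorted[i - 1] > uncertainty) {
                chunks.add(current);
                current = new ArrayList<>();
            }
            current.add(sorted[i]);
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }

    public static void main(String[] args) {
        int[] ends = {100, 103, 101, 110, 111, 120};
        // prints [[100, 101, 103], [110, 111], [120]]
        System.out.println(chunkByEnd(ends, 2));
    }
}
```

    Note that 100 and 103 land in the same chunk even though they are 3 positions apart, because 101 links them. This follows from the "at least one other read" wording of the rule.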
    • Field Detail

      • CLIPPING_TAG_CONTAINS_A

        public static final char[] CLIPPING_TAG_CONTAINS_A
      • CLIPPING_TAG_CONTAINS_AQ

        public static final char[] CLIPPING_TAG_CONTAINS_AQ
      • CLIPPING_TAG_CONTAINS_QZ

        public static final char[] CLIPPING_TAG_CONTAINS_QZ
    • Constructor Detail

      • MarkDuplicatesForFlowHelper

        public MarkDuplicatesForFlowHelper​(MarkDuplicates md)
    • Method Detail

      • generateDuplicateIndexes

        public void generateDuplicateIndexes​(boolean useBarcodes,
                                             boolean indexOpticalDuplicates)
        This method is identical in function to generateDuplicateIndexes except that it accommodates the possible significance of the end side of the reads (with or without uncertainty). It is only applicable to flow mode invocation.
        Specified by:
        generateDuplicateIndexes in interface MarkDuplicatesHelper
      • buildReadEnds

        public ReadEndsForMarkDuplicates buildReadEnds​(htsjdk.samtools.SAMFileHeader header,
                                                       long index,
                                                       htsjdk.samtools.SAMRecord rec,
                                                       boolean useBarcodes)
        Builds a read ends object that represents a single flow-based read.
        Specified by:
        buildReadEnds in interface MarkDuplicatesHelper
      • getFlowSumOfBaseQualities

        protected static int getFlowSumOfBaseQualities​(htsjdk.samtools.SAMRecord rec,
                                                       int threshold)
        A quality-summing scoring strategy used for flow-based reads. The method walks the bases of the read in the synthesis direction. For each base, the effective quality value is defined as the quality of the first base of the hmer (homopolymer run) to which the base belongs. The score is the sum of all effective values above the given threshold.
        Parameters:
        rec - SAMRecord to get a score for
        threshold - threshold above which effective quality is included
        Returns:
        calculated score (see method description)
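        The scoring strategy described for getFlowSumOfBaseQualities can be sketched on plain arrays. This is a minimal illustration, not the Picard implementation: the class FlowQualitySumSketch, the method flowSumOfBaseQualities, and the reverseStrand parameter are hypothetical. It assumes that the synthesis direction runs from the last stored base to the first for reverse-strand reads, and that "above" means strictly greater than the threshold; the source does not state either point.

```java
// Illustrative only: this class and its method are hypothetical, not Picard API.
final class FlowQualitySumSketch {

    // Sums the effective quality of every base, where the effective quality is the
    // quality of the first base (in walking order) of the homopolymer run containing it.
    // Assumption: only effective qualities strictly greater than `threshold` are counted.
    static int flowSumOfBaseQualities(byte[] bases, byte[] quals,
                                      boolean reverseStrand, int threshold) {
        int step = reverseStrand ? -1 : 1;
        int start = reverseStrand ? bases.length - 1 : 0;

        int score = 0;
        byte lastBase = 0;      // no real base has value 0, so the first base opens a run
        int effectiveQual = 0;
        for (int i = start; i >= 0 && i < bases.length; i += step) {
            if (bases[i] != lastBase) {
                // A new homopolymer run starts here; its first quality applies to the whole run.
                effectiveQual = quals[i];
            }
            if (effectiveQual > threshold) {
                score += effectiveQual;
            }
            lastBase = bases[i];
        }
        return score;
    }

    public static void main(String[] args) {
        byte[] bases = {'A', 'A', 'A', 'C', 'C', 'G'};
        byte[] quals = {30, 10, 10, 5, 40, 20};
        // Forward walk: effective qualities are 30,30,30,5,5,20 -> prints 110
        System.out.println(flowSumOfBaseQualities(bases, quals, false, 10));
        // Reverse walk: effective qualities are 20,40,40,10,10,10 -> prints 100
        System.out.println(flowSumOfBaseQualities(bases, quals, true, 10));
    }
}
```

        The two calls differ only in walking direction, which changes which base counts as the "first" of each run and therefore changes the score.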
      • getReadEndCoordinate

        protected static int getReadEndCoordinate​(htsjdk.samtools.SAMRecord rec,
                                                  boolean startEnd,
                                                  boolean certain,
                                                  MarkDuplicatesForFlowArgumentCollection flowBasedArguments)
      • isAdapterClipped

        public static boolean isAdapterClipped​(htsjdk.samtools.SAMRecord rec)
      • isAdapterClippedWithQ

        public static boolean isAdapterClippedWithQ​(htsjdk.samtools.SAMRecord rec)
      • isQualityClipped

        public static boolean isQualityClipped​(htsjdk.samtools.SAMRecord rec)