Class RichStreamReader

  • All Implemented Interfaces:
    SequenceIterator, BioEntryIterator, RichSequenceIterator

    public class RichStreamReader
    extends java.lang.Object
    implements RichSequenceIterator
    Parses a stream into sequences. This object implements SequenceIterator, so you can loop over each sequence it produces. It consumes a stream and uses a SequenceFormat to extract each sequence from it. The stream is assumed to contain only sequences that a single format can handle, separated only by delimiters that the format recognizes. Sequences are instantiated when they are requested by nextSequence, not before, so it is safe to use this object to parse a gigabyte FASTA file and process it sequence by sequence: RichStreamReader does not require you to keep any of the sequences in memory.
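    The on-demand behaviour described above can be illustrated with a minimal, self-contained sketch. This is not the RichStreamReader implementation and uses no BioJava types: FastaRecord, LazyFastaReader and LazyFastaDemo are hypothetical names, and the parsing is a deliberately simplified FASTA reader. It only shows the pattern of building one sequence per nextSequence call while holding nothing but the next header line between calls.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a parsed sequence; not a BioJava type. */
final class FastaRecord {
    final String header;
    final String residues;

    FastaRecord(String header, String residues) {
        this.header = header;
        this.residues = residues;
    }
}

/** Hypothetical lazy reader: builds one record per call and retains none. */
final class LazyFastaReader {
    private final BufferedReader reader;
    // Header of the next unread record, or null once the input is exhausted.
    private String pendingHeader;

    LazyFastaReader(BufferedReader reader) throws IOException {
        this.reader = reader;
        // Skip forward to the first header line, if there is one.
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(">")) {
                pendingHeader = line.substring(1).trim();
                break;
            }
        }
    }

    boolean hasNext() {
        return pendingHeader != null;
    }

    FastaRecord nextSequence() throws IOException {
        if (pendingHeader == null) {
            throw new NoSuchElementException("no more sequences");
        }
        String header = pendingHeader;
        pendingHeader = null;
        StringBuilder residues = new StringBuilder();
        // Read only as far as the next header; later records stay unread.
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(">")) {
                pendingHeader = line.substring(1).trim();
                break;
            }
            residues.append(line.trim());
        }
        return new FastaRecord(header, residues.toString());
    }
}

class LazyFastaDemo {
    public static void main(String[] args) throws IOException {
        String fasta = ">seq1\nACGT\nTTGA\n>seq2\nGGCC\n";
        LazyFastaReader in =
                new LazyFastaReader(new BufferedReader(new StringReader(fasta)));
        while (in.hasNext()) {
            FastaRecord rec = in.nextSequence();
            // Each record can be processed and discarded before the next is read.
            System.out.println(rec.header + " " + rec.residues.length());
        }
    }
}
```

    Because each record becomes unreachable as soon as the loop body finishes with it, memory use is bounded by the largest single sequence rather than by the size of the file.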
    Since:
    1.5
    Author:
    Matthew Pocock, Thomas Down, Richard Holland
    • Constructor Detail

      • RichStreamReader

        public RichStreamReader(java.io.InputStream is,
                                RichSequenceFormat format,
                                SymbolTokenization symParser,
                                RichSequenceBuilderFactory sf,
                                Namespace ns)
        Creates a new stream reader on the given input stream, which will attempt to read sequences in the given format, having symbols from the given tokenization, and pass them to the given factory to be transformed into RichSequence objects in the given namespace.
        Parameters:
        is - the input stream to read from
        format - the input file format
        symParser - the tokenizer that understands the sequence symbols in the file
        sf - the factory that will build the sequences
        ns - the namespace the sequences will be loaded into
      • RichStreamReader

        public RichStreamReader(java.io.BufferedReader reader,
                                RichSequenceFormat format,
                                SymbolTokenization symParser,
                                RichSequenceBuilderFactory sf,
                                Namespace ns)
        Creates a new stream reader on the given reader, which will attempt to read sequences in the given format, having symbols from the given tokenization, and pass them to the given factory to be transformed into RichSequence objects in the given namespace.
        Parameters:
        reader - the reader to read from
        format - the input file format
        symParser - the tokenizer that understands the sequence symbols in the file
        sf - the factory that will build the sequences
        ns - the namespace the sequences will be loaded into
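        The two constructors differ only in the kind of source they accept. This page does not say how the InputStream constructor decodes bytes into characters, so a caller who needs a specific encoding can wrap the stream and use the BufferedReader constructor instead. The sketch below uses only JDK classes; ReaderAdapterDemo and toReader are hypothetical names, and US-ASCII is an assumption about the input, not something this page specifies.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class ReaderAdapterDemo {
    /** Wraps a byte stream in a buffered character reader with an explicit charset. */
    static BufferedReader toReader(InputStream is) {
        // Assumption: the sequence file is plain ASCII text.
        return new BufferedReader(new InputStreamReader(is, StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = ">seq1\nACGT\n".getBytes(StandardCharsets.US_ASCII);
        try (BufferedReader reader = toReader(new ByteArrayInputStream(bytes))) {
            // The resulting reader is what would be passed as the
            // first argument of the BufferedReader constructor.
            System.out.println(reader.readLine());
        }
    }
}
```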