Class RiffReader

java.lang.Object
com.tino1b2be.dtmf.io.wav.internal.RiffReader

public final class RiffReader extends Object
Low-level RIFF byte reader for the clean-room WAV parser in this package.

WAV is a RIFF container: a flat sequence of length-prefixed, ID-tagged chunks preceded by a 12-byte RIFF | size | WAVE header. Every multi-byte integer field inside a RIFF container is little-endian, four-byte chunk IDs are ASCII, and (critically for the provider's streaming model) a single zero pad byte follows any chunk whose declared size is odd, so that the next chunk always starts on an even byte boundary. This class exposes the byte-level primitives the higher-level parser needs to walk that structure: readAscii(int) for four-character chunk IDs and the WAVE form marker; readU16LE() for the 16-bit fields of the fmt chunk; readU32LE() and readU64LE() for the (unsigned) size fields; readBytes(byte[], int, int) for raw binary payloads; skip(long) for skipping chunks the parser does not understand; skipPaddingIfNeeded(long) for the odd-chunk-size pad byte; and position() for anchoring the byte offset that data-chunk payloads are later seeked against.
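The container layout described above can be illustrated with a minimal, self-contained sketch. It does not use RiffReader and is not the parser's implementation: the class RiffWalkSketch, its methods, and the hand-built sample bytes (including the made-up "abc " chunk ID) are hypothetical, and the walk runs over an in-memory byte array rather than a channel or stream.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of this package. */
final class RiffWalkSketch {

    /** Returns the IDs of the chunks that follow the 12-byte RIFF | size | WAVE header. */
    static List<String> chunkIds(byte[] file) {
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        String riff = ascii(buf);
        buf.getInt(); // outer RIFF size; not needed for this walk
        String form = ascii(buf);
        if (!riff.equals("RIFF") || !form.equals("WAVE")) {
            throw new IllegalArgumentException("not a RIFF/WAVE container");
        }
        List<String> ids = new ArrayList<>();
        while (buf.remaining() >= 8) {
            ids.add(ascii(buf));
            long size = Integer.toUnsignedLong(buf.getInt()); // size fields are unsigned
            long advance = size + (size & 1);                 // payload plus one pad byte when odd
            if (advance > buf.remaining()) {
                throw new IllegalArgumentException("chunk size exceeds remaining bytes");
            }
            buf.position(buf.position() + (int) advance);
        }
        return ids;
    }

    /** Reads a four-character ID as US-ASCII. */
    private static String ascii(ByteBuffer buf) {
        byte[] raw = new byte[4];
        buf.get(raw);
        return new String(raw, StandardCharsets.US_ASCII);
    }

    /** Hand-built 32-byte container: a 3-byte "abc " chunk (odd, so padded), then an empty "data" chunk. */
    static byte[] sample() {
        ByteBuffer b = ByteBuffer.allocate(32).order(ByteOrder.LITTLE_ENDIAN);
        b.put(id("RIFF")).putInt(24).put(id("WAVE"));
        b.put(id("abc ")).putInt(3).put(new byte[] {1, 2, 3, 0}); // trailing 0 is the pad byte
        b.put(id("data")).putInt(0);
        return b.array();
    }

    private static byte[] id(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }
}
```

Calling RiffWalkSketch.chunkIds(RiffWalkSketch.sample()) yields the two IDs in order. Without the pad-byte adjustment the second ID would be read one byte early and come out misaligned.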

Two byte-source modes. A RIFF reader can be built from either a FileChannel (random-access, backs the open(Path) branch and keeps canSeek() on the returned WavAudioSource true) or an InputStream (forward-only, backs the open(InputStream, String) branch and forces canSeek() to false). Both constructors expose the same API; the difference is hidden behind the package-private RiffReader.ByteSource strategy below. Position tracking for the channel mode reads straight from FileChannel.position() so that whatever offset the caller started at is preserved; the stream mode maintains an internal counter that starts at zero.

Non-negotiable invariants. Every primitive read method throws EOFException if the underlying source does not contain enough bytes to satisfy the request (this is the concrete mechanism behind Requirement 9.11's "chunk size exceeding remaining file size" clause — when the higher-level parser calls skip(long) with a chunk size that runs off the end of the file, the EOFException bubbles up as the parser's IOException). position() always reports the number of bytes successfully consumed so far; it never moves backward, and the reader offers no general seek operation (the parser's random access happens later on the FileChannel directly, after the headers have been consumed sequentially).

This class is not part of the published API. It lives in com.tino1b2be.dtmf.io.wav.internal, whose stability contract (see the package Javadoc) explicitly allows breakage between any two releases. It is public at the type level purely so classes in the parent com.tino1b2be.dtmf.io.wav package can reach it; external callers MUST NOT depend on it.

Not thread-safe. A single RiffReader mediates mutable byte-source state and must be used by one thread at a time. Concurrent calls are undefined behaviour.

Since:
2.1.0
  • Constructor Summary

    Constructors
    Constructor
    Description
    RiffReader(InputStream stream)
    Build a reader that pulls from a forward-only InputStream.
    RiffReader(FileChannel channel)
    Build a reader that pulls from a random-access FileChannel.
  • Method Summary

    Modifier and Type
    Method
    Description
    long
    position()
    Current byte position within the underlying source.
    String
    readAscii(int n)
    Read exactly n bytes and interpret them as US-ASCII.
    void
    readBytes(byte[] buf, int off, int len)
    Read exactly len raw bytes from the underlying byte source into buf[off .. off + len).
    int
    readU16LE()
    Read an unsigned 16-bit little-endian integer.
    long
    readU32LE()
    Read four bytes and interpret them as a little-endian unsigned 32-bit integer widened to long.
    long
    readU64LE()
    Read eight bytes and interpret them as a little-endian 64-bit integer.
    void
    skip(long n)
    Skip exactly n bytes.
    void
    skipPaddingIfNeeded(long chunkSize)
    Skip a single pad byte when chunkSize is odd.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • RiffReader

      public RiffReader(FileChannel channel)
      Build a reader that pulls from a random-access FileChannel.

      The channel's current position is taken as the reader's origin: immediately after construction, position() returns whatever position the channel already had. Every primitive read advances both the channel's position and the reader's view in lockstep, so the higher-level parser can later ask the FileChannel directly for the offset of the data payload without doing its own bookkeeping.

      The reader does not take ownership of the channel. RiffReader has no close method, so nothing it does ever closes the channel. The caller (the WAV provider's open method) owns the channel and is responsible for closing it via the WavAudioSource that eventually wraps it.

      Parameters:
      channel - the file channel to read from; must be non-null and open
      Throws:
      NullPointerException - if channel is null
    • RiffReader

      public RiffReader(InputStream stream)
      Build a reader that pulls from a forward-only InputStream.

      The reader starts reporting position() at zero and increments it by every byte successfully consumed. The stream itself is wrapped so that InputStream.skip(long)'s documented "may skip fewer bytes than requested" caveat is neutralised: skip(long) on this reader always consumes exactly the requested number of bytes or throws.

      The reader does not take ownership of the stream. RiffReader has no close method, so nothing it does ever closes the stream. The caller owns the stream, consistent with AudioSourceProvider's "never close caller-supplied streams" rule (Requirement 4.10).

      Parameters:
      stream - the input stream to read from; must be non-null
      Throws:
      NullPointerException - if stream is null
  • Method Details

    • readAscii

      public String readAscii(int n) throws IOException
      Read exactly n bytes and interpret them as US-ASCII.

      Used for four-character chunk IDs ("fmt ", "data", "LIST", "ds64"…), the twelve-byte outer form (broken into two readAscii(4) calls around a readU32LE() in the caller), and the "WAVE" form marker.

      The RIFF specification guarantees chunk IDs are drawn from printable US-ASCII, so the fixed StandardCharsets.US_ASCII decoding is intentional — a non-ASCII byte indicates a malformed file, not a charset-coverage gap.

      Parameters:
      n - the number of bytes to read; must be non-negative
      Returns:
      a String of length exactly n
      Throws:
      IllegalArgumentException - if n is negative
      EOFException - if fewer than n bytes remain
      IOException - if the underlying source throws
    • readU32LE

      public long readU32LE() throws IOException
      Read four bytes and interpret them as a little-endian unsigned 32-bit integer widened to long.

      Returning long (not int) is deliberate: RIFF size fields are unsigned, and a signed int would turn negative at the 2 GiB boundary, which is well within the range of legitimate WAV files at high bit depths and sample rates. The returned value is always in the range [0, 2^32 - 1] = [0, 4294967295L].

      Returns:
      the decoded value in [0, 4294967295L]
      Throws:
      EOFException - if fewer than four bytes remain
      IOException - if the underlying source throws
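      The unsigned widening described for readU32LE() can be sketched as follows. This is a minimal illustration over a byte array, not this class's implementation; the class U32Sketch and its method u32le are hypothetical names.

```java
/** Hypothetical illustration only; not part of this package. */
final class U32Sketch {

    /** Decodes four little-endian bytes starting at off into an unsigned value held in a long. */
    static long u32le(byte[] b, int off) {
        // Masking with 0xFFL strips sign extension and widens each byte to long before shifting.
        return (b[off] & 0xFFL)
                | (b[off + 1] & 0xFFL) << 8
                | (b[off + 2] & 0xFFL) << 16
                | (b[off + 3] & 0xFFL) << 24;
    }
}
```

      For the bytes 00 00 00 80 (a declared size of exactly 2 GiB) this returns 2147483648L, whereas narrowing the same value to int gives Integer.MIN_VALUE.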
    • readU64LE

      public long readU64LE() throws IOException
      Read eight bytes and interpret them as a little-endian 64-bit integer.

      Used for the three 64-bit fields of the ds64 chunk in RF64 files (riffSize64, dataSize64, sampleCount64). The spec treats riffSize64 and dataSize64 as unsigned 64-bit, but the value is returned as a signed long: no real file on any filesystem we care about exceeds 2^63 - 1 bytes (just under 8 EiB), and a plain long spares every call site from unsigned-arithmetic handling. A field with its top bit set would therefore come back negative.

      Returns:
      the decoded value as a signed long
      Throws:
      EOFException - if fewer than eight bytes remain
      IOException - if the underlying source throws
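      A minimal sketch of the signed-long trade-off described for readU64LE(). It is an illustration over a byte array, not this class's implementation; the class U64Sketch and its method u64le are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical illustration only; not part of this package. */
final class U64Sketch {

    /** Decodes eight little-endian bytes starting at off as a (signed) long. */
    static long u64le(byte[] b, int off) {
        return ByteBuffer.wrap(b, off, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }
}
```

      Values below 2^63 read back unchanged. Eight 0xFF bytes decode to -1L, and a caller that needed the unsigned interpretation would have to reach for helpers such as Long.toUnsignedString or Long.compareUnsigned.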
    • skip

      public void skip(long n) throws IOException
      Skip exactly n bytes.

      This is the mechanism behind the "chunk size exceeding remaining file size" clause of Requirement 9.11: the higher-level parser calls skip(chunk.size()) for every chunk it does not understand ("LIST", "bext", "junk", "PEAK"…), and if the declared size runs off the end of the underlying file or stream, the EOFException raised here bubbles up as the IOException the requirement mandates.

      For the InputStream mode this method is implemented as a loop over InputStream.skip(long) with a read-byte fallback, so partial skips from the underlying stream are transparently converted into a complete skip or a proper EOF. For the FileChannel mode it moves the channel position forward and then verifies against FileChannel.size() so that seeking past the end reports EOF instead of silently succeeding.

      Parameters:
      n - the number of bytes to skip; must be non-negative
      Throws:
      IllegalArgumentException - if n is negative
      EOFException - if fewer than n bytes remain
      IOException - if the underlying source throws
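      The InputStream strategy described above (loop over InputStream.skip(long), fall back to a single-byte read) can be sketched as follows. This is an illustration under that description, not the actual implementation; the class SkipFullySketch and its method skipFully are hypothetical names.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical illustration only; not part of this package. */
final class SkipFullySketch {

    /** Skips exactly n bytes or throws EOFException if the stream ends first. */
    static void skipFully(InputStream in, long n) throws IOException {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative: " + n);
        }
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
                continue;
            }
            // skip() made no progress: a one-byte read distinguishes a stalled skip from real EOF.
            if (in.read() < 0) {
                throw new EOFException(remaining + " byte(s) left to skip at end of stream");
            }
            remaining--;
        }
    }
}
```

      The read-byte fallback matters because InputStream.skip(long) may legitimately return 0 without the stream being exhausted.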
    • skipPaddingIfNeeded

      public void skipPaddingIfNeeded(long chunkSize) throws IOException
      Skip a single pad byte when chunkSize is odd.

      RIFF aligns every chunk to an even byte boundary: a chunk whose declared size field is odd is followed by a single zero-byte pad before the next chunk's ID starts. The parser calls this method after consuming (or skipping) each chunk's payload so the next readAscii(4) lands on a real chunk ID rather than the pad byte.

      This is a convenience wrapper over skip(long) that does nothing when chunkSize is even, so callers can invoke it unconditionally.

      Parameters:
      chunkSize - the chunk's declared size field; must be non-negative
      Throws:
      IllegalArgumentException - if chunkSize is negative
      EOFException - if chunkSize is odd and no bytes remain
      IOException - if the underlying source throws
    • readBytes

      public void readBytes(byte[] buf, int off, int len) throws IOException
      Read exactly len raw bytes from the underlying byte source into buf[off .. off + len).

      This is the general binary-read primitive complementing readAscii(int), which decodes its payload as US-ASCII and therefore replaces every non-ASCII byte with the Unicode replacement character (U+FFFD). Callers use readBytes for binary payloads that must round-trip byte-for-byte, for example the 16-byte SubFormat GUID inside a WAVEFORMATEXTENSIBLE fmt chunk, or the small u16 field pairs in the same chunk.

      Unlike InputStream.read(byte[], int, int), this method is short-read intolerant: it either fills the requested span completely or throws. The underlying byte source is already wired to loop on short pulls, so the exception semantics match readAscii(int) and the readU* primitives — EOFException when fewer than len bytes remain, IOException on any other failure.

      Parameters:
      buf - destination buffer; must be non-null
      off - starting index in buf; must be non-negative
      len - number of bytes to read; must be non-negative and satisfy off + len <= buf.length
      Throws:
      NullPointerException - if buf is null
      IllegalArgumentException - if off or len is negative, or if off + len > buf.length
      EOFException - if fewer than len bytes remain
      IOException - if the underlying source throws
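      Why binary payloads need a raw read rather than readAscii(int) can be shown with a small sketch of the standard US-ASCII decoding behaviour. The class AsciiVsRawSketch and its methods are hypothetical names used for illustration; the sketch does not call RiffReader.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration only; not part of this package. */
final class AsciiVsRawSketch {

    /** Decodes raw bytes the way a fixed US-ASCII decode does. */
    static String decodeAscii(byte[] raw) {
        return new String(raw, StandardCharsets.US_ASCII);
    }

    /** True if decoding to a String and encoding back reproduces the original bytes. */
    static boolean survivesAsciiRoundTrip(byte[] raw) {
        byte[] back = decodeAscii(raw).getBytes(StandardCharsets.US_ASCII);
        return Arrays.equals(raw, back);
    }
}
```

      A chunk ID such as "fmt " survives the round trip, but a payload containing a byte like 0x80 does not: it decodes to U+FFFD and cannot be recovered, which is the reason GUIDs and similar fields are read as raw bytes.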
    • readU16LE

      public int readU16LE() throws IOException
      Read an unsigned 16-bit little-endian integer.

      Complements readU32LE() and readU64LE() for the u16 fields that pepper the fmt chunk of a WAV file (wFormatTag, nChannels, nBlockAlign, wBitsPerSample, cbSize, wValidBitsPerSample).

      Returns:
      the decoded value in [0, 65535]
      Throws:
      EOFException - if fewer than two bytes remain
      IOException - if the underlying source throws
    • position

      public long position() throws IOException
      Current byte position within the underlying source.

      For the FileChannel mode this is FileChannel.position(): the absolute byte offset in the file. For the InputStream mode this is the number of bytes successfully consumed since the reader was built (the reader starts reporting zero and counts up from there).

      The parser anchors the data chunk's dataStartByteOffset to this value the instant it finishes reading the chunk's 8-byte header, so that WavAudioSource.seek(frameIndex) can later compute dataStartByteOffset + frameIndex * bytesPerFrame as an absolute channel position.

      Returns:
      the current byte position
      Throws:
      IOException - if querying the underlying source throws
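      The seek arithmetic mentioned above, dataStartByteOffset + frameIndex * bytesPerFrame, can be sketched as follows. The class SeekOffsetSketch and its method are hypothetical, and the use of overflow-checked arithmetic is an assumption of this sketch rather than a documented property of WavAudioSource.

```java
/** Hypothetical illustration only; not part of this package. */
final class SeekOffsetSketch {

    /** Absolute byte position of a frame, given where the data payload starts. */
    static long byteOffset(long dataStartByteOffset, long frameIndex, int bytesPerFrame) {
        if (dataStartByteOffset < 0 || frameIndex < 0 || bytesPerFrame <= 0) {
            throw new IllegalArgumentException("offsets must be non-negative and frame size positive");
        }
        // The exact variants throw ArithmeticException instead of silently wrapping on overflow.
        return Math.addExact(dataStartByteOffset, Math.multiplyExact(frameIndex, (long) bytesPerFrame));
    }
}
```

      For example, with the data payload starting at byte 44 and 4 bytes per frame, frame 1000 sits at byte 44 + 1000 * 4 = 4044.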