Class Mp3HeaderScanner

java.lang.Object
com.tino1b2be.dtmf.io.mp3.internal.Mp3HeaderScanner

public final class Mp3HeaderScanner extends Object
MPEG audio header detection helper for com.tino1b2be.dtmf.io.mp3.Mp3AudioSourceProvider.

Decides, in two steps and without decoding a single audio frame, whether the bytes flowing through a given InputStream plausibly belong to an MPEG Layer III stream — i.e. whether the provider should claim them in the SPI scoring round conducted by AudioSources:

  1. Skip any leading ID3v2 tag. An MP3 on disk is almost always preceded by an ID3v2 tag that carries the track title, artist, and so on. The tag sits before the first audio frame and is trivial to identify: its first three bytes are the ASCII sequence "ID3". The spec (ID3.org's id3v2.4.0-structure) lays the ten-byte tag header out as
    bytes 0..2: "ID3"
    byte 3 : major version
    byte 4 : revision
    byte 5 : flags (bit 4 = footer present)
    bytes 6..9: synchsafe tag size
    The synchsafe integer is the unusual part: each of the four size bytes has its top bit clear and carries only seven value bits, so size = (b6 << 21) | (b7 << 14) | (b8 << 7) | b9, giving 28 significant bits. When bit 4 of the flags byte (mask 0x10) is set, the tag repeats its ten-byte header as a trailing footer, so the total number of bytes to skip before audio starts is 10 (header) + size + (10 if footer else 0). Anything other than "ID3" in the first three bytes means no tag is present; those three bytes stay at the head of the scan, and the scanner starts looking for a sync word immediately.
  2. Scan up to maxBytes for an MPEG sync word. Every MPEG audio frame starts with an eleven-bit sync pattern of all ones, laid out across the first two bytes of the four-byte frame header as 0xFF followed by a byte whose top three bits are 111 (equivalently, the first two bytes masked with 0xFFE0 equal 0xFFE0). Finding prev == 0xFF && (cur & 0xE0) == 0xE0 is therefore necessary for MPEG but not sufficient: the second byte also carries the two-bit versionField at bits 4..3 and the two-bit layerField at bits 2..1, and only the combination versionField != 01 (00 = MPEG-2.5, 10 = MPEG-2, 11 = MPEG-1; 01 is reserved) with layerField == 01 (Layer III; 00 = reserved, 10 = Layer II, 11 = Layer I) indicates content this provider can actually decode. Any other combination is a spurious 0xFF byte inside an ID3v1 trailer, an unsupported layer, or a reserved field, and the scan keeps moving. The loop returns true on the first byte pair that passes both checks and false if it walks maxBytes past the tag without finding one.
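The arithmetic in the two steps above can be sketched as follows. This is an illustrative sketch only, not the class's actual source: the class name Mp3HeaderBitsSketch and its three method names are hypothetical, and the behaviour for a header shorter than ten bytes (treated here as "no tag") is an assumption.

```java
final class Mp3HeaderBitsSketch {

    /** Decodes the four synchsafe size bytes (header bytes 6..9): seven value bits each. */
    static int synchsafe(int b6, int b7, int b8, int b9) {
        return ((b6 & 0x7F) << 21) | ((b7 & 0x7F) << 14) | ((b8 & 0x7F) << 7) | (b9 & 0x7F);
    }

    /**
     * Returns how many bytes precede the first audio frame, given the first ten
     * bytes of the stream; 0 when they do not start with "ID3".
     * Assumption: fewer than ten bytes is treated as "no tag".
     */
    static long bytesToSkip(byte[] h) {
        if (h.length < 10 || h[0] != 'I' || h[1] != 'D' || h[2] != '3') {
            return 0;
        }
        int size = synchsafe(h[6], h[7], h[8], h[9]);
        boolean footer = (h[5] & 0x10) != 0;        // bit 4 of the flags byte
        return 10L + size + (footer ? 10 : 0);      // header + body + optional footer
    }

    /** Checks two consecutive bytes (as unsigned ints) for a Layer III frame header start. */
    static boolean isLayer3Sync(int prev, int cur) {
        boolean sync = prev == 0xFF && (cur & 0xE0) == 0xE0;  // eleven one-bits
        int versionField = (cur >> 3) & 0x3;                  // bits 4..3
        int layerField = (cur >> 1) & 0x3;                    // bits 2..1
        return sync && versionField != 0x1 && layerField == 0x1;
    }
}
```

For instance, size bytes 00 00 02 01 decode to (2 << 7) | 1 = 257, so a tag with the footer flag set occupies 10 + 257 + 10 = 277 bytes. The byte pair FF FB (version 11, layer 01) passes the check, while FF FD (Layer II) and FF EB (reserved version) do not.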

Why content-based detection, not extension-based? AudioSources picks a provider by asking each one to score the raw bytes, not the file name (see Requirements 4.4, 4.5, 5.6). A .mp3 file with a corrupted header must score low so the caller gets an UnsupportedAudioFormatException rather than a confusing decoder failure; a stream of MPEG Layer III bytes with no .mp3 extension (a URL ending in /audio, say) must still score high. This scanner is the mechanism that makes both happen.

What the scanner does not do. Finding a valid-looking Layer III sync header is enough for canOpen(...) to return 90 (Requirement 10.5) — it does not guarantee that mp3spi will be able to decode every subsequent frame. A malformed or truncated file can still fall over at open(...) time, at which point Mp3AudioSourceProvider translates the underlying failure into UnsupportedAudioFormatException (Requirement 10.8). The two-phase "score cheaply, decode carefully" contract is deliberate.

This class is not part of the published API. It lives in com.tino1b2be.dtmf.io.mp3.internal, whose stability contract (see the package Javadoc) explicitly allows breakage between any two releases. It is public at the type level purely so Mp3AudioSourceProvider in the parent package can reach it; external callers MUST NOT depend on it.

Thread safety. The class holds no mutable state. The single exposed method is a pure function of its InputStream argument, and threading concerns therefore reduce entirely to whether the caller-supplied stream is safe to read from concurrently — a question outside this scanner's scope.

Since:
2.1.0
  • Method Details

    • scanForSyncLayer3

      public static boolean scanForSyncLayer3(InputStream in, int maxBytes) throws IOException
      Decide whether in starts with content that looks like an MPEG Layer III stream.

      Consumes bytes from in up to the first Layer III sync word or up to 10 + tagSize + (10 if footer else 0) + maxBytes bytes total, whichever comes first. Returns true the moment a valid Layer III sync word is located; returns false if the byte budget runs out or the stream ends without finding one. Does not close or reset the stream; the caller owns the stream and is expected to have wrapped it in a mark/reset-capable buffer (see Mp3AudioSourceProvider.canOpen(InputStream, String)) when non-destructive probing is required.

      The scan resumes immediately after any leading ID3v2 tag, so the maxBytes budget applies to the post-tag portion of the stream — matching Requirement 10.5's wording and preventing a pathologically large ID3v2 block from starving the sync-word search.
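To make the byte budget concrete, the following is a minimal sketch of a scan loop with the behaviour described above. It is not the actual implementation: the class name Layer3ScanSketch is hypothetical, returning false on a truncated tag is an assumption, and InputStream.readNBytes requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;

final class Layer3ScanSketch {

    static boolean scan(InputStream in, int maxBytes) throws IOException {
        if (in == null) throw new NullPointerException("in");
        if (maxBytes < 0) throw new IllegalArgumentException("maxBytes must be >= 0");

        // Read the three bytes that decide whether an ID3v2 tag is present.
        byte[] head = new byte[3];
        int headLen = in.readNBytes(head, 0, 3);
        boolean tagged = headLen == 3 && head[0] == 'I' && head[1] == 'D' && head[2] == '3';

        if (tagged) {
            byte[] rest = new byte[7];               // header bytes 3..9
            if (in.readNBytes(rest, 0, 7) < 7) return false;  // assumption: truncated tag
            int size = ((rest[3] & 0x7F) << 21) | ((rest[4] & 0x7F) << 14)
                     | ((rest[5] & 0x7F) << 7) | (rest[6] & 0x7F);
            long toSkip = size + ((rest[2] & 0x10) != 0 ? 10 : 0);
            while (toSkip > 0) {
                long n = in.skip(toSkip);
                if (n <= 0) {                        // skip() may return 0; fall back to read()
                    if (in.read() < 0) return false; // stream ended inside the tag
                    n = 1;
                }
                toSkip -= n;
            }
            headLen = 0;                             // tag bytes are not part of the scan
        }

        // The budget counts only post-tag bytes; untagged head bytes are scanned first.
        int prev = -1;
        for (int i = 0; i < maxBytes; i++) {
            int cur = i < headLen ? (head[i] & 0xFF) : in.read();
            if (cur < 0) return false;               // end of stream
            if (prev == 0xFF && (cur & 0xE0) == 0xE0
                    && ((cur >> 3) & 0x3) != 0x1     // version not reserved
                    && ((cur >> 1) & 0x3) == 0x1) {  // Layer III
                return true;
            }
            prev = cur;
        }
        return false;
    }
}
```

In this sketch a tag of any size is skipped before the loop counter starts, so a large tag cannot eat into maxBytes.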

      Parameters:
      in - the stream to probe; must be non-null and positioned at the start of the candidate MP3 content
      maxBytes - the maximum number of post-tag bytes to scan for a sync word; must be non-negative. A value of zero always returns false without reading past the tag
      Returns:
      true iff a valid MPEG Layer III sync word (non-reserved version, Layer III) is found within the first maxBytes bytes after any leading ID3v2 tag; false otherwise
      Throws:
      NullPointerException - if in is null
      IllegalArgumentException - if maxBytes is negative
      IOException - if reading from in throws
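Because the method consumes bytes and never resets the stream, a caller that needs non-destructive probing has to rewind the stream itself. The sketch below shows one way to do that with mark/reset. The names NonDestructiveProbeSketch, Probe, and probeAndRewind are hypothetical, and how the real Mp3AudioSourceProvider.canOpen chooses its read limit is not stated here.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

final class NonDestructiveProbeSketch {

    /** Hypothetical stand-in for a consuming probe such as scanForSyncLayer3. */
    interface Probe {
        boolean test(InputStream in) throws IOException;
    }

    /**
     * Runs the probe and then rewinds the stream to where it started.
     * readLimit must cover every byte the probe may consume (tag plus scan
     * budget); otherwise reset() may fail with an IOException.
     */
    static boolean probeAndRewind(BufferedInputStream in, int readLimit, Probe probe)
            throws IOException {
        in.mark(readLimit);
        try {
            return probe.test(in);
        } finally {
            in.reset();
        }
    }
}
```

Inside the provider, the probe argument could be a lambda such as s -> Mp3HeaderScanner.scanForSyncLayer3(s, 4096), where 4096 is only an illustrative budget. Since the tag size is not known before the probe runs, the read limit has to be chosen generously.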