ISMV lacks any sort of edit list support, as well as tfxd is
effectively the PTS of the fragment for most intents and purposes.
Thus, if b-frames are requested without negative CTS offsets you
end up with N frames' worth of delay (tfxd PTS plus the CTS offset
of the first sample). Negative CTS offsets enable the first sample
to have CTS=DTS, and thus a/v desync due to b-frame reorder delay
is avoided.