decorrelate_ls, _rs and _ms are decorrelate[1], [2] and [3] respectively.
The code ended up testing indep ([0]) as twice, ms never, and misnaming
the other two.
Segmented loads are slow, so here we use unit-strided load and narrowing shifts.
c910:
fcmul_add_c: 2179
fcmul_add_rvv_f64: 1652
c908:
fcmul_add_c: 4891.2
fcmul_add_rvv_f64: 2399.5
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
This commit adds support for VP8 bitstream read methods to the cbs
codec. This enables the trace_headers bitstream filter to support VP8,
in addition to AV1, H.264, H.265, and VP9. This can be useful for
debugging VP8 stream issues.
The CBS VP8 implements a simple VP8 boolean decoder using GetBitContext
to read the bitstream.
Only the read methods `read_unit` and `split_fragment` are implemented.
The write methods `write_unit` and `assemble_fragment` return the error
code AVERROR_PATCHWELCOME. This is because CBS VP8 write is unlikely to
be used by any applications at the moment. The write methods can be
added later if there is a real need for them.
TESTS: ffmpeg -i fate-suite/vp8/frame_size_change.webm -vcodec copy
-bsf:v trace_headers -f null -
Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This commit exports the `vp8_token_update_probs` variable to internal
library scope to facilitate its reuse within the library.
Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Better performance can probably be achieved with a more intricate
unrolled loop, but this is a start:
add_hfyu_left_pred_bgr32_c: 15084.0
add_hfyu_left_pred_bgr32_rvv_i32: 10280.2
This would actually be cleaner with the RISC-V P extension, but that is
not ratified yet (I think?) and usually not supported if V is supported.
Current code tracks min/max pts for each stream separately; then when
the file ends it combines them with last frame's duration to compute the
total duration of each stream; finally it selects the longest of those
durations as the file duration.
This is incorrect - the total file duration is the largest timestamp
difference between any frames, regardless of the stream.
Also change the way the last frame information is reported from decoders
to the muxer - previously it would be just the last frame's duration,
now the end timestamp is sent, which is simpler.
Changes the result of the fate-ffmpeg-streamloop-transcode-av test,
where the timestamps are shifted slightly forward. Note that the
matroska demuxer does not return the first audio packet after seeking
(due to buggy interaction betwen the generic code and the demuxer), so
there is a gap in audio.
This ensures that tq_receive() will always return EOF after all streams
were receive-finished, even though the sending side might not have
closed them yet. This may allow the receiver to avoid manually tracking
which streams it has already closed.
It is common for subtitle streams to have large gaps between packets.
When the caller is interleaving packets from multiple files, it can
easily happen that two successive subtitle packets trigger this limit,
even though no excessive buffering is happening.
Should fix#7064
With explicit unrolling, we can skip half of the sign bit flips, and
the compiler is then better able to optimise the scalar loop:
predictor_c: 31376.0 (before)
predictor_c: 23703.0 (after)
This is restricted to 128-bit vectors as larger vector sizes could read
past the end of the noise array. Support for future hardware with larger
vector sizes is left for some other time.
hf_apply_noise_0_c: 2319.7
hf_apply_noise_0_rvv_f32: 1229.0
hf_apply_noise_1_c: 2539.0
hf_apply_noise_1_rvv_f32: 1244.7
hf_apply_noise_2_c: 2319.7
hf_apply_noise_2_rvv_f32: 1232.7
hf_apply_noise_3_c: 2541.2
hf_apply_noise_3_rvv_f32: 1244.2
The tested functions treat s_m[i] == 0 as a special case. Other than
that, the functions are slightly complicated vector additions.
This actually makes the zero case happen pseudorandomly.
In my personal opinion, we should not need to support unaligned YUY2
pixel maps. They should always be aligned to at least 32 bits, and the
current code assumes just 16 bits. However checkasm does test for
unaligned input bitmaps. QEMU accepts it, but real hardware dose not.
In this particular case, we can at the same time improve performance and
handle unaligned inputs, so do just that.
uyvytoyuv422_c: 104379.0
uyvytoyuv422_c: 104060.0
uyvytoyuv422_rvv_i32: 25284.0 (before)
uyvytoyuv422_rvv_i32: 19303.2 (after)
This saves three scratch registers and three instructions per line. The
performance gains are mostly negligible. The main point is to free up
registers for further rework.
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
This commit fixes issue with missing SPS/PPS headers in video
encoded by AMF AV1 encoder.
Missing headers leads to broken seek in MPV video player.
Default value for property AV1_HEADER_INSERTION_MODE shouldn't be setup
to NONE (no headers insertion). We need to skip definition of this property,
because default value depends on USAGE property.
Signed-off-by: Dmitrii Ovchinnikov <ovchinnikov.dmitrii@gmail.com>
With 5 accumulator vectors and 6 inputs, this can only use LMUL=2.
Also the number of vector loop iterations is small, just 5 on 128-bit
vector hardware.
The vector loop is somewhat unusual in that it processes data in
descending memory order, in order to save on vector slides:
in descending order, we can extract elements to carry over to the next
iteration from the bottom of the vectors directly. With ascending order
(see in the Opus postfilter function), there are no ways to get the top
elements directly. On the downside, this requires the use of separate
shift and sub (the would-be SH3SUB instruction does not exist), with
a small pipeline stall on the vector load address.
The edge cases in scalar are done in scalar as this saves on loads
and remains significantly faster than C.
autocorrelate_c: 669.2
autocorrelate_rvv_f32: 421.0
Given the size of the data set, strided memory accesses cannot be avoided.
We can still do better than the current code.
ps_hybrid_synthesis_deint_c: 12065.5
ps_hybrid_synthesis_deint_rvv_i32: 13650.2 (before)
ps_hybrid_synthesis_deint_rvv_i64: 8181.0 (after)
Segmented loads may be slower than not. So this advantageously uses a
unit-strided load and narrowing shifts instead.
Before:
ps_add_squares_c: 60757.7
ps_add_squares_rvv_f32: 22242.5
After:
ps_add_squares_c: 60516.0
ps_add_squares_rvv_i64: 17067.7
Fixes: Erroneous HTTP POST instead of HTTP PUT for WebVTT HLS variant playlists.
Signed-off-by: Léon Spaans <leons@gridpoint.nl>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
PGMYUV seems to be always limited range. This was a format originally
invented by FFmpeg at a time when YUVJ distinguished limited from full
range YUV, and this codec never appeared to output YUVJ in any
circumstance, so hard-coding limited range preserves the status quo.
The other formats are explicitly documented to be full range RGB/gray
formats. That said, don't tag them yet, due to outstanding bugs w.r.t
grayscale formats and color range handling.
This change in behavior updates a bunch of FATE tests in trivial ways
(added tagging being the only difference).
When using vf_scale to force a specific output color space, also tag
this on the AVFrame. (Mirroring existing logic for output range)
Move the sanity fix for RGB after the new assignment, to avoid leaking
bogus YUV colorspace metadata for RGB spaces.
No need to write a custom string parser when we can just use an integer
option with preset values. The various bits of fallback logic are wholly
redundant with equivalent logic already inside sws_getCoefficients.
Note: I disallowed setting 'out_color_matrix=auto', because this does
not do anything meaningful in the current code (just hard-codes
AVCOL_SPC_BT470BG fallback).
The documentation states that invalid entries default to SWS_CS_DEFAULT.
A value of 0 is not a valid SWS_CS_*, yet the code incorrectly
hard-codes it to BT.709 coefficients instead of SWS_CS_DEFAULT.
This was a complete hack seemingly designed to work around a different
bug, which was fixed in the previous commit. As such, there is no more
reason not to do this, as it simply breaks changing color range in
sws_setColorspaceDetails for no reason.
More commonly, this fixes the case of sws_setColorspaceDetails after
sws_getContext, since the latter implies sws_init_context.
The problem here is that sws_init_context sets up the range conversion
and fast path tables based on the values of srcRange/dstRange at init
time. This may result in locking in a "wrong" path (either using
unscaled fast path when range conversion later required, or using
scaled slow path when range conversion becomes no longer required).
There are two way outs:
1. Always initialize range conversion and unscaled converters, even if
they will be unused, and extend the runtime check.
2. Re-do initialization if the values change after
sws_setColorspaceDetails.
I opted for approach 1 because it was simpler and easier to reason
about.
Reword the av_log message to make it clear that this special converter
is not necessarily used, depending on whether or not there is range
conversion or YUV matrix conversion going on.
The check if drawing needs to be initialized and supported formats
should be drawable ones was flawed, as pad_stop/pad_start is only
populated from stop_duration/start_duration after these checks.
To fix that, check the _duration variants as well and for better
readability and maintainability break the check out into its own
helper.
Fixes a regression from 86b252ea9dFix#10621
Signed-off-by: Anton Khirnov <anton@khirnov.net>
We calculated offsets as pairs, but addressed them in the shader
as single float values, while reading them as ivec2s.
Also removes unused code (was provisionally added if cooperative matrices
could be used, but that turned out to be impossible).
VK_KHR_PORTABILITY_ENUMERATION_EXTENSION_NAME is required on macOS,
and VK_INSTANCE_CREATE_ENUMERATE_PORTABILITY_BIT_KHR flag should
be set.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Texinfo 7.0 produces quite different HTML to Texinfo 6.8. Without
this change, enumerated option flags (i.e. Possible values of x
are...) render as white text on a white background with Texinfo 7.0
and are unreadable. This change removes a style for the selector
`.table .table` which causes the background to turn white for these
elements. As far as I can tell, it is not actually used anywhere in
files generated by Texinfo 6.8.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Resolves trac ticket #10636 (http://trac.ffmpeg.org/ticket/10636).
Texinfo 7.0, released in November 2022, changed the names of various
functions. Compiling docs with Texinfo 7.0 resulted in warnings and
improperly formatted documentation. More old names appear to have
been removed in Texinfo 7.1, released October 2023, which causes docs
compilation to fail.
This commit addresses the issue by adding logic to switch between the old
and new function names depending on the Texinfo version. Texinfo 6.8
produces identical documentation before and after the patch.
CC
https://www.mail-archive.com/debian-bugs-dist@lists.debian.org/msg1938238.htmlhttps://bugs.gentoo.org/916104
Signed-off-by: Frank Plowman <post@frankplowman.com>
By intent, and in practice, the "active contributor" criterion applies
to the person authoring the commits, not the one pushing them into the
repository.
When operating on large blocks of data it's common to repeatedly use
an instruction on multiple registers. Using the REPX macro makes it
easy to quickly write dense code to achieve this without having to
explicitly duplicate the same instruction over and over.
For example,
REPX {paddw x, m4}, m0, m1, m2, m3
REPX {mova [r0+16*x], m5}, 0, 1, 2, 3
will expand to
paddw m0, m4
paddw m1, m4
paddw m2, m4
paddw m3, m4
mova [r0+16*0], m5
mova [r0+16*1], m5
mova [r0+16*2], m5
mova [r0+16*3], m5
Commit taken from x264:
6d10612ab0
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This uses a more traditional approach allowing up processing of up to
period minus two elements per iteration. This also allows the algorithm
to work for all and any vector length.
As the T-Head C908 device under test can load 16 elements loop, there is
unsurprisingly a little performance drop when the period is minimal and
the parallelism is capped at 13 elements:
Before:
postfilter_15_c: 21222.2
postfilter_15_rvv_f32: 22007.7
postfilter_512_c: 20189.7
postfilter_512_rvv_f32: 22004.2
postfilter_1022_c: 20189.7
postfilter_1022_rvv_f32: 22004.2
After:
postfilter_15_c: 20189.5
postfilter_15_rvv_f32: 7057.2
postfilter_512_c: 20189.5
postfilter_512_rvv_f32: 5667.2
postfilter_1022_c: 20192.7
postfilter_1022_rvv_f32: 5667.2
As in the aligned case, we can use VLSE64.V, though the way of doing so
gets more convoluted, so the performance gains are more modest:
get_pixels_unaligned_c: 126.7
get_pixels_unaligned_rvv_i32: 145.5 (before)
get_pixels_unaligned_rvv_i64: 62.2 (after)
For the reference, those are the aligned benchmarks (unchanged) on the
same T-Head C908 hardware:
get_pixels_c: 126.7
get_pixels_rvi: 85.7
get_pixels_rvv_i64: 33.2
Without this flag, timestamps were embedded into the final
binary if CUDA was enabled.
Signed-off-by: Rob Hall <robxnanocode@outlook.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
We currently mostly do not empty the MMX state in our MMX
DSP functions; instead we only do so before code that might
be using x87 code. This is a violation of the System V i386 ABI
(and maybe of other ABIs, too):
"The CPU shall be in x87 mode upon entry to a function. Therefore,
every function that uses the MMX registers is required to issue an
emms or femms instruction after using MMX registers, before returning
or calling another function." (See 2.2.1 in [1])
This patch does not intend to change all these functions to abide
by the ABI; it only does so for ff_pixelutils_sad_8x8_mmxext, as this
function can by called by external users, because it is exported
via the pixelutils API. Without this, the following fragment will
assert (on x86/x64):
uint8_t src1[8 * 8], src2[8 * 8];
av_pixelutils_sad_fn fn = av_pixelutils_get_sad_fn(3, 3, 0, NULL);
fn(src1, 8, src2, 8);
av_assert0_fpu();
[1]: https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/intel386-psABI-1.1.pdf
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is cleaner, but it is also a workaround for when
the header exists, but cannot be compiled.
This will happen when the compiler has no inline asm
support.
Possibly the configure check should be improved as well.
The test can currently pass when _Pragma is not supported, since
_Pragma might be treated as a implicitly declared function.
This happens e.g. with tinycc.
Extending the check to 2 pragmas both matches the actual use
better and avoids this misdetection.
The `-caption_encoding` option was reported as having a default value of
'ass', whereas it's actually 'auto'.
Signed-off-by: zheng qian <xqq@xqq.im>
Signed-off-by: Gyan Doshi <ffmpeg@gyani.pro>
With 128-bit vectors, this is mostly pointless but also harmless.
Performance gains should be more noticeable with larger vector sizes.
neg_odd_64_c: 76.2
neg_odd_64_rvv_i64: 74.7
Up until now each thread had its own buffer pool for extradata
buffers when using frame-threading. Each thread can have at most
three references to extradata and in the long run, each thread's
bufferpool seems to fill up with three entries. But given
that at any given time there can be at most 2 + number of threads
entries used (the oldest thread can have two references to preceding
frames that are not currently decoded and each thread has its own
current frame, but there can be no references to any other frames),
this is wasteful. This commit therefore uses a single buffer pool
that is synced across threads.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, the VAAPI encoder uses fake data with the
AVBuffer-API: The data pointer does not point to real memory,
but is instead just a VABufferID converted to a pointer.
This has probably been copied from the VAAPI-hwcontext-API
(which presumably does it to avoid allocations).
This commit changes this without causing additional allocations
by switching to the RefStruct-pool API. This also fixes an
unchecked av_buffer_ref().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It involves less allocations, in particular no allocations
after the entry has been created. Therefore creating a new
reference from an existing one can't fail and therefore
need not be checked. It also avoids indirections and casts.
Also note that nvdec_decoder_frame_init() (the callback
to initialize new entries from the pool) does not use
atomics to read and replace the number of entries
currently used by the pool. This relies on nvdec (like
most other hwaccels) not being run in a truely frame-threaded
way.
Tested-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It involves less allocations and therefore has the nice property
that deriving a reference from a reference can't fail,
simplifying hevc_ref_frame().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It involves less allocations and therefore has the nice property
that deriving a reference from a reference can't fail.
This allows for considerable simplifications in
ff_h264_(ref|replace)_picture().
Switching to the RefStruct API also allows to make H264Picture
smaller, because some AVBufferRef* pointers could be removed
without replacement.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Very similar to the AVBufferPool API, but with some differences:
1. Reusing an already existing entry does not incur an allocation
at all any more (the AVBufferPool API needs to allocate an AVBufferRef).
2. The tasks done while holding the lock are smaller; e.g.
allocating new entries is now performed without holding the lock.
The same goes for freeing.
3. The entries are freed as soon as possible (the AVBufferPool API
frees them in two batches: The first in av_buffer_pool_uninit() and
the second immediately before the pool is freed when the last
outstanding entry is returned to the pool).
4. The API is designed for objects and not naked buffers and
therefore has a reset callback. This is called whenever an object
is returned to the pool.
5. Just like with the RefStruct API, custom allocators are not
supported.
(If desired, the FFRefStructPool struct itself could be made
reference counted via the RefStruct API; an FFRefStructPool
would then be freed via ff_refstruct_unref().)
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This makes the code more testable as uninitialized fields are 0
and not random values from the last call
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
These entries do not correspond to VLC symbols that can be used
they do corrupt various variables like min/max bits
This also no longer assumes that there is a single non subtable
entry
Probably fixes some infinite loops too
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 1900031961 + 553590817 cannot be represented in type 'int'
Fixes: 63061/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_APE_fuzzer-5166188298371072
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The check is based on not infinite looping. It is likely
a more strict check can be done
Fixes: Infinite loop
Fixes: 62473/clusterfuzz-testcase-minimized-ffmpeg_BSF_EVC_FRAME_MERGE_fuzzer-5719883750703104
Fixes: 62765/clusterfuzz-testcase-minimized-ffmpeg_dem_EVC_fuzzer-6448531252314112
Fixes: 63378/clusterfuzz-testcase-minimized-ffmpeg_dem_MPEGPS_fuzzer-6504993844494336
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: "Dawid Kozinski/Multimedia (PLT) /SRPOL/Staff Engineer/Samsung Electronics" <d.kozinski@samsung.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is write-only,
because it is hardcoded at the call site. Therefore one can replace
these VLC structures with the only thing that is actually used:
The pointer to the VLCElem table. And in most cases one can even
avoid this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying tables directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For some VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows to avoid the relocations inherent in an array
to individual tables; it also reduces padding.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_ps_init() initializes some tables for AAC parametric stereo
and some of them are only valid for the fixed- or floating-point
decoder, whereas others (namely VLCs) are valid for both.
The latter are therefore initialized by ff_ps_init_common()
and because the two versions of ff_ps_init() can be run
concurrently, it is guarded by an AVOnce.
Yet now that there is ff_aacdec_common_init_once() there is
a better way to do this: Call ff_ps_init_common()
from ff_aacdec_common_init_once(). That way there is no need
to guard ff_ps_init_common() by an AVOnce any more.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows to avoid the relocations inherent in a table
to individual tables; it also reduces padding.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It allows to replace code tables of type uint32_t or uint16_t
by symbols of type uint8_t. It is also faster.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The VLCs, their init code and the tables used for initialization
are currently duplicated for the floating- and fixed-point decoders.
This commit stops doing so and moves this stuff to aacdec_common.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table. And in some cases one can even
avoid this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Due to making the decode frames context use the coded size, the
filter started to display those artifacts as it reused the input frame's size.
Change it to instead output the real image size for images, not the input.
h->long_ref isn't guaranteed to be contiguously filled. Use the approach
from both vaapi_h264 and vdpau_h264 which goes through the 16 frames in
h->long_ref to find the LTR entries.
Fixes MR2_MW_A.264 from JVT-AVC_V1.
The fixed-point decoder actually does not use the floating-point
tables initialized by ff_aac_tableinit() at all. So don't
initialize them for it; instead merge initializing these tables
into ff_aac_float_common_init() which is already the function
for the common static initializations of the floating-point
AAC decoder and the (also floating-point) AAC encoder.
Doing so saves also one AVOnce.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They (as well as their init code) are currently duplicated
for the floating- and fixed-point decoders.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is write-only,
because it is hardcoded at the call site. Therefore one can replace
these VLC structures with the only thing that is actually used:
The pointer to the VLCElem table. And in some cases one can even
avoid this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table. And in some cases one can even
avoid this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table. And in some cases one can even
avoid this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and only VLC.table needs to be retained.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Said object is only allowed to be modified during its
initialization and is immutable afterwards.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For most VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These VLCs are very big: The VP3 one have 164382 elements
but due to the overallocation enough memory for 313344 elements
are allocated (1.195 MiB with sizeof(VLCElem) == 4);
for VP4 the numbers are very similar, namely 311296 and 164392
elements. Since 1f4cf92cfb, each
frame thread has its own copy of these VLCs.
This commit fixes this by sharing these VLCs across threads.
The approach used here will also make it easier to support
stream reconfigurations in case of frame-multithreading
in the future.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Also combine the ff_msmp4_dc_(luma|chroma)_vlcs as well
as the tables used to generate them to simplify the code.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These are quite small and therefore force reloads
that can be avoided by modest increases in the number of bits used.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is especially important for frame-threaded decoders like
this one, because up until now each thread had an identical
copy of all these VLC tables.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For lots of static VLCs, the number of bits is not read from
VLC.bits, but rather a compile-constant that is hardcoded
at the callsite of get_vlc2(). Only VLC.table is ever used
and not using it directly is just an unnecessary indirection.
This commit adds helper functions and macros to avoid the VLC
structure when initializing VLC tables; there are 2x2 functions:
Two choices for init_sparse or from_lengths and two choices
for "overlong" initialization (as used when multiple VLCs are
initialized that share the same underlying table).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The code above this does a whitelist on desc->flags, which now includes
the (disallowed) AV_PIX_FMT_FLAG_XYZ for XYZ formats. So there is no
more need for a separate check, here.
There are already several places in the codebase that match desc->name
against "xyz", and many downstream clients replicate this behavior.
I have no idea why this is not just a flag.
Motivated by my desire to add yet another check for XYZ to the codebase,
and I'd rather not keep copy/pasting a string comparison hack.
These are not supported by the drawing functions at all, and were
incorrectly advertised as supported in the past.
Note: This check is added only to separate the logic change from the API
change in the following commit, and will be removed again after it
becomes redundant.
Clang versions before 17 (Xcode versions up to and including 15.0)
had a very annoying bug in its behaviour of the ".arch" directive
in assembly. If the directive only contained a level, such as
".arch armv8.2-a", it did validate the name of the level, but it
didn't apply the level to what instructions are allowed. The level
was applied if the directive contained an extra feature enabled,
such as ".arch armv8.2-a+crc" though. It was also applied on the
next ".arch_extension" directive.
This bug, combined with the fact that the same versions of Clang
didn't support the dotprod/i8mm extension names in either
".arch <level>+<feature>" or in ".arch_extension", could lead to
unexepcted build failures.
As the dotprod/i8mm extensions couldn't be enabled dynamically
via the ".arch_extension" directive, someone building ffmpeg could
try to enable them by configuring their build with
--extra-cflags="-march=armv8.6-a".
During configure, we test for support for the i8mm instructions
like this:
# Built with -march=armv8.6-a
.arch armv8.2-a # Has no visible effect here
#.arch_extension i8mm # Omitted as the extension name isn't known
usdot v0.4s, v0.16b, v0.16b
# Successfully assembled as armv8.6-a is the effective level,
# and i8mm is enabled implicitly in armv8.6-a.
Thus, we would enable assembling those instructions. However if
we later check for another extension, such as sve (which those
versions of Clang actually do support), we can later run into the
following situation when building actual code:
# Built with -march=armv8.6-a
.arch armv8.2-a # Has no visible effect here
#.arch_extension i8mm # Omitted as the extension name isn't known
.arch_extension sve # Included as "sve" is as supported extension name
# .arch_extension effectively activates the previous .arch directive,
# so the effective level is armv8.2-a+sve now.
usdot v0.4s, v0.16b, v0.16b
# Fails to build the instructions that require i8mm. Despite the
# configure check, the unrelated ".arch_extension sve" directive
# breaks the functionality of the i8mm feature.
This patch avoids this situation:
- By adding a dummy feature such as "+crc" on the .arch directive
(if supported), we make sure that it does get applied immediately,
avoiding it taking effect spuriously at a later unrelated
".arch_extension" directive.
- By checking for higher arch levels such as armv8.4-a and armv8.6-a,
we can assemble the dotprod and i8mm extensions without the user
needing to pass -march=armv8.6-a. This allows using the dotprod/i8mm
codepaths via runtime detection while keeping the binary runnable
on older versions. I.e. this enables the i8mm codepaths on Apple M2
machines while built with Xcode's Clang.
TL;DR: Enable the I8MM extensions for Apple M2 without the user needing
to do a custom configuration; avoid potential build breakage if a user
does such a custom configuration.
Once Xcode versions that have these issues fixed are prevalent, we
can consider reverting this change.
Signed-off-by: Martin Storsjö <martin@martin.st>
If the scan lines are aligned, we can load each row as a 64-bit value,
thus avoiding segmentation. And then we can factor the conversion or
subtraction.
In principle, the same optimisation should be possible for high depth,
but would require 128-bit elements, for which no FFmpeg CPU flag
exists.
This should hopefully clarify that the option only affects MSZ
full-width characters, and not all full-width ASCII. Additionally,
this matches the prefix with the upstream option.
Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>
This patch adds two MSZ (Middle Size; half width) character
related options, mapping against newly added upstream
functionality:
* `replace_msz_japanese`, which was introduced in version 1.0.1
of libaribcaption.
* `replace_msz_glyph`, which was introduced in version 1.1.0
of libaribcaption.
The latter option improves bitmap type rendering if specified
fonts contain half-width glyphs (e.g., BIZ UDGothic), even
if both ASCII and Japanese MSZ replacement options are set
to false.
As these options require newer versions of libaribcaption, the
configure requirement has been bumped accordingly.
Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>
On some environments, a `bool` variable is of smaller size than `int`.
As AV_OPT_TYPE_BOOL is internally handled as sizeof(int), if a `bool`
option was set on such an environment, the memory of following
variables would be filled. Additionally, set values may be destroyed
by av_opt_copy().
Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>
av_hwframe_transfer_data try with src_ctx first. If the operation
failed with AVERROR(ENOSYS), it will try again with dst_ctx. Return
AVERROR(EINVAL) makes the second step being skipped.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Fixes a validation issue.
The issue is that the function gets called before we've sumitted a frame
for decoding to that context. However, we cannot run queries before
they've been reset, which happens at submission time.
As we'd need to otherwise run a command queue at init-time, just check
if submissions have happened.
Fixes: out of array access
Fixes: rtpdec_h264.c149/poc
Found-by: Hardik Shah of Vehere
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -1028966111 + -1314089526 cannot be represented in type 'int'
Fixes: 63174/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_FIXED_fuzzer-5853273711837184
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Mapping to ITU-R BS.2051-3 "Sound System J" and ITU-R BS.1196-8 "Channel
Configuration 19".
Signed-off-by: Will Wolcott <wwolcott@netflix.com>
Signed-off-by: James Almer <jamrial@gmail.com>
This layout maps to ITU-R BS.2051-3 "Sound System C" and ITU-R BS.1196-8 "Channel
Configuration 14", and it being the first layout with top layer channels, it's
best to use a different scheme to properly convey the presence and amount of said
channels.
The new name will also be a better fit for the additions in the following commits.
Signed-off-by: James Almer <jamrial@gmail.com>
Only the collocated_ref of the current frame (i.e. HEVCContext.ref)
is ever used*, so move it to HEVCContext directly after ref.
*: This goes so far that collocated_ref was not even synced across
threads in case of frame-threading.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The vidstab library added support in Nov 2020 for writing/reading
the transforms data in binary in addition to ASCII. The library default
was changed to binary format but no changes were made to the AVfilters
resulting in data file for writing or reading being always opened as text.
This effectively broke the filters.
Option added to vidstabdetect to specify file format and open files in
both filters with the correct attributes.
"Validation Error: [ VUID-VkImageViewCreateInfo-imageViewType-04974 ] Object 0: handle = 0x9f9b41000000003c, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0xc120e150 | vkCreateImageView():
Using pCreateInfo->viewType VK_IMAGE_VIEW_TYPE_2D and the subresourceRange.layerCount VK_REMAINING_ARRAY_LAYERS=(17) and must 1 (try looking into VK_IMAGE_VIEW_TYPE_*_ARRAY).
The Vulkan spec states: If viewType is VK_IMAGE_VIEW_TYPE_1D, VK_IMAGE_VIEW_TYPE_2D, or VK_IMAGE_VIEW_TYPE_3D; and subresourceRange.layerCount is VK_REMAINING_ARRAY_LAYERS,
then the remaining number of layers must be 1"
Partially fixes https://streams.videolan.org/issues/19938/20000_20180305-15.04.59.ts
The is coded as 1920x1080, meant to be rendered at 1440x1080 with cropping,
or 1680x1080 before cropping. Currently, the created DPB is 1440x1080, which results
in the image being decoded incorrectly, as the decoder overwrites output memory.
This commit fixes this.
This eases actual development of the assembly functions, by only
allowing extension instructions within the sections that explicitly
enable them, instead of having all extensions enabled everywhere.
Signed-off-by: Martin Storsjö <martin@martin.st>
When users zero-init'd the struct, or left it as-is, the encode
queue family matched the graphics queue family, which led it to be
incorrectly logged as being used for encode.
This just improves the logging so this isn't printed anymore.
GetStdHandle is unavailable outside of the desktop API subset.
This didn't use to be a problem with earlier WinSDKs, as kbhit also
used to be available only for desktop apps, and this whole section is
wrapped in #if HAVE_KBHIT. With newer WinSDKs, kbhit() is available also
for non-desktop apps, while GetStdHandle still isn't.
Signed-off-by: Martin Storsjö <martin@martin.st>
It is unnecessary since the removal of non-thread-safe callbacks
in e0786a8eeb. Since then, the
AVCodecContext has only been used as logcontext.
Removing ff_thread_release_buffer() allowed to remove AVCodecContext*
parameters from several other functions (not only unref functions,
but also e.g. ff_h264_ref_picture() which calls ff_h264_unref_picture()
on error).
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
At various points through the function librsvg_decode_frame, errors are
returned from immediately without deallocating any allocated structs.
This patch both fixes those leaks, and also fixes the use of functions
that are deprecated since librsvg version 2.52.0. The older calls are
still used, guarded by #ifdefs while the newer replacements are used if
librsvg >= 2.52.0. One of the deprecated functions is used as a check
for the configure shell script, so it was replaced with a different
function.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
libavcodec/aarch64/vc1dsp_neon.S is skipped here, as it intentionally
uses a layered indentation style to visually show how different
unrolled/interleaved phases fit together.
Signed-off-by: Martin Storsjö <martin@martin.st>
Favour left aligned columns over right aligned columns.
In principle either style should be ok, but some of the cases
easily lead to incorrect indentation in the surrounding code (see
a couple of cases fixed up in the preceding patch), and show up in
automatic indentation correction attempts.
Signed-off-by: Martin Storsjö <martin@martin.st>
Some functions have slightly different indentation styles; try
to match the surrounding code.
libavcodec/aarch64/vc1dsp_neon.S is skipped here, as it intentionally
uses a layered indentation style to visually show how different
unrolled/interleaved phases fit together.
Signed-off-by: Martin Storsjö <martin@martin.st>
Fix rendering of int values within a side data element, which was
broken since commit d2d3a83ad9, where the side data element was
correctly marked as a variable fields element. Logic to render a
string variable was implemented already, but it was not implemented
for the int fields path, which was enabled by that commit.
Also, code and schema is changed in order to account for multiple
variable-fields elements - such as side data, contained within the
same parent. Previously it was assumed that a single variable-fields
element was contained within the parent, which was the case for tags,
but is not the case for side-data.
Previously data was rendered as:
<side_data_list>
<side_data side_data_type="CPB properties" max_bitrate="0" min_bitrate="0" avg_bitrate="0" buffer_size="327680" vbv_delay="-1"/>
</side_data_list>
Now as:
<side_data_list>
<side_data type="CPB properties">
<side_datum key="side_data_type" value="CPB properties"/>
<side_datum key="max_bitrate" value="0"/>
<side_datum key="min_bitrate" value="0"/>
<side_datum key="avg_bitrate" value="0"/>
<side_datum key="buffer_size" value="49152"/>
<side_datum key="vbv_delay" value="-1"/>
</side_data>
</side_data_list>
Variable-fields elements are rendered as a containing element wrapping
generic key/values elements, enabling use of strict XML schema.
Fix trac issue:
https://trac.ffmpeg.org/ticket/10613
This logic only covers the case of yuv420p. Extend this logic to cover
*all* vertically subsampled YUV formats, which require the same
interlaced scaling logic.
Fortunately, we can get away with re-using the same code for both JPEG
and MPEG range YUV, because the only difference here is the horizontal
alignment. (Which I omit touching for now, to avoid introducing possibly
unintended changes in default behavior)
1. If user don't specify the profile, set it to main10 when pixel
format is 10 bits. Before the patch, videotoolbox output main
profile bitstream with 10 bit input, which can be confusing.
2. Warning when user set profile to main explicitly with 10 bit
input. It works, but not the best choice.
3. Reject main 10 profile with 8 bit input, it doesn't work.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Will be used in the following patch. With hw_config we can get
avctx->hw_frames_ctx, and with avctx->hw_frames_ctx we get
sw_pix_fmt. Otherwise sw_pix_fmt is none. I need sw_pix_fmt
before get the first frame to set hevc encoder profile.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
In f7ac3512f5 the size of the dynamically
allocated buffer was shrunk, but it was made too small for very small
alphabet sizes. This patch restores the size to prevent an OOB read.
Reported-by: Cole Dilorenzo <coolkingcole@gmail.com>
Signed-off-by: Leo Izen <leo.izen@gmail.com>
EAGAIN causes an assertion failure when it is returned from the decoder
Fixes: Assertion consumed != (-(11)) failed at libavcodec/decode.c:462
Fixes: assertion_IOT_instruction_decode_c_462/poc
Found-by: Hardik Shah of Vehere (Dawn Treaders team)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 1496950099 + 728014168 cannot be represented in type 'int'
Fixes: 62667/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MJPEGB_fuzzer-6511785170305024
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -2146469728 - 1488954 cannot be represented in type 'int'
Fixes: 62490/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BONK_fuzzer-5612782399389696
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The libmfx deprecation warning tells you to build against libmfx 1.x,
but the actual solution is to use --enable-libvpl instead of using
--enable-libmfx. Update the warning message to reflect this.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
They are generally set in ff_mpv_init_context_frame()
(mostly called by ff_mpv_common_init()); setting them
somewhere else should be avoided.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unused by ff_mpeg4_decode_picture_header() (unsurprisingly given
that when decoding this function is called before the context has been
initialized).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
(The return value doesn't really matter: For video decoders
every return value >= 0 is treated as "consumed all of the input".)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This patch makes the libkvazaar encoder respect color settings that are
present on the codec context, including color range, primaries, transfer
function and colorspace.
0cd8769207 utilized the rc_algorithm member of the kvz_config struct, which
was introduced in Kvazaar 2.0.0. This patch bumps the minimum version of
Kvazaar to 2.0.0 so that FFmpeg compiles successfully.
Signed-off-by: John Mather <johnmather@sidefx.com>
An AVFormatContext leaks on errors that happen before it is attached
to its permanent place (an InputFile). Fix this by attaching
it earlier.
Given that it is not documented that avformat_close_input() is usable
with an AVFormatContext that has only been allocated with
avformat_alloc_context() and not opened with avformat_open_input(),
one error path before avformat_open_input() had to be treated
specially: It uses avformat_free_context().
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The av_opt_eval family of functions emits errors messages on error
and can therefore not be used with fake objects when the AVClass
has a custom item_name callback. The AVClass for AVCodecContext
has such a custom callback (it searches whether an AVCodec is set
to use its name). In practice it means that whatever is directly
after the "cc" pointer to the AVClass for AVCodec in the stack frame
of ist_add() will be treated as a pointer to an AVCodec with
unpredictable consequences.
Fix this by using an actual AVCodecContext instead of a fake object.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Its function is analogous to that of the fps filter, so filtering is a
more appropriate place for this.
The main practical reason for this move is that it places the encoding
sync queue right at the boundary between filters and encoders. This will
be important when switching to threaded scheduling, as the sync queue
involves multiple streams and will thus need to do nontrivial
inter-thread synchronization.
In addition to framerate conversion, the closely-related
* encoder timebase selection
* applying the start_time offset
are also moved to filtering.
That field is used by the framerate code to track whether any output has
been generated for the last input frame(*). Its use in the last
invocation of print_report() is meant to account for the very last
filtered frame being dropped in the number of dropped frames printed in
the log. However, that is a highly inappropriate place to do so, as it
makes assumptions about vsync logic in completely unrelated code. Move
the increment to encoder flush instead.
(*) the name is misleading, as the input frame has not yet been dropped
and may still be output in the future
Always use the functionality of the latter, which makes more sense as it
avoids losing keyframes due to vsync code dropping frames.
Deprecate the 'source_no_drop' value, as it is now redundant.
Unlike the 'source' mode, which preserves source keyframe-marking as-is,
the 'source_no_drop' mode attempts to keep track of keyframes dropped by
framerate conversion and mark the next output frame as key in such
cases. However,
* c75be06148 broke this functionality entirely, and made it equivalent
to 'source'
* even before it would only work when the frame immediately following
the dropped keyframe is preserved and not dropped as well
Also, drop a redundant check for 'frame' in setting dropped_keyframe, as
it is redundant with the check on the above line.
ifilter_send_eof() will fail if the input has no real or fallback
parameters, so there is no need to handle the case of some inputs being
in EOF state yet having no parameters.
Since at least commit c954cf1e1b
(adding ff_encode_alloc_frame()), a large part of ff_alloc_picture()
is completely separate for the two callers. Move the caller-specific
parts out to the callers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unnecessary in case of user-supplied frames, because
it happens directly after a av_frame_ref() with the same
src and dst.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_alloc_picture() performs two tasks: a) In most instances,
it allocates frame buffers and b) it allocates certain
auxiliary buffers.
The exception to a) is the case when the encoder can reuse
user-supplied frames. And for these frames the auxiliary buffers
are unused, because this frame will never be used as current_picture
(and therefore also not as next_picture or last_picture);
see select_input_picture().
This means that we can simply avoid calling ff_alloc_picture()
with user-supplied frames at all.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
None of the mpegvideo encoders support anything but coded frames;
and if this were to change, it is unclear whether they would need
the adjustment here. So remove it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In case "!direct" we are not reusing the input buffers
(due to e.g. insufficient alignment), but allocating
new ones. These of course do not alias with the ones
provided by the user, so these checks are always-false.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
mpegvideo_enc uses a fixed-size array of Pictures; a slot is
considered taken if the Picture's AVFrame is set.
When an error happens after a slot has been taken, this Picture
has typically not been reset and is therefore not usable for
future requests. The code aborts when one runs out of slots
and this can happen in case of allocation failures.
Fix this by always unreferencing a Picture in case of errors.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The custom-allocator logic has been completely defunct since a while,
since nothing depends on those targets, they never get used.
This updates jemalloc to pkg-config, adds the fallback option for
potential arbitrary allocators, and finally actually adds the libraries
to LDFLAGS.
The gather index vector is only used as double-length (due to register
pressure), so no need to initialise it for quad-length. Basically this
matches the multiplier in the prologue to the the multipler in the loop.
And stop setting picture_number which was only done to not parse
extradata multiple times.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: signed integer overflow: 0 - -9223372036854775808 cannot be represented in type 'long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-6112289464123392
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -9223372036315799520 - 3873890816 cannot be represented in type 'long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_DEMUXER_fuzzer-5009302746431488
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 9223372036630775808 + 1000000000 cannot be represented in type 'long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_WEBM_DASH_MANIFEST_fuzzer-5406131992526848
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 155 + 9223372036854775655 cannot be represented in type 'long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_W64_fuzzer-5364032278495232
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int32_t' (aka 'int')
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_RPL_fuzzer-6086131095830528
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -9223372036854775808 - 9222726413022000000 cannot be represented in type 'long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-5959420033761280
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 22014562800 * 934633746 cannot be represented in type 'long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_JACOSUB_fuzzer-5189603246866432
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 9230955872951340 - -9223372036854775808 cannot be represented in type 'long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_SBG_fuzzer-6330481893572608
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Avoids allocations and error checks as well as the boilerplate
code for creating an AVBuffer with a custom free callback.
Also increases type safety.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Tested-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and error checks and allows to remove
cleanup code for earlier allocations. Also avoids casts
and indirections.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Given that the RefStruct API relies on the user to know
the size of the objects and does not provide a way to get it,
we need to store the number of elements allocated ourselves;
but this is actually better than deriving it from the size
in bytes.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and therefore error checks: Syncing
hwaccel_picture_private across threads can't fail any more.
Also gets rid of an unnecessary pointer in structures and
in the parameter list of ff_hwaccel_frame_priv_alloc().
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Lynne <dev@lynne.ee>
Tested-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The SEI message code uses the AVBuffer API for its SEI messages
and contained buffers (like the extension buffer for HEVC
or the user data (un)registered payload buffers).
Contrary to the ordinary CBS code (where some of these
contained buffer references are actually references
to the provided AVPacket's data so that one can't replace
them with the RefStruct API), the CBS SEI code never uses
outside buffers at all and can therefore be switched entirely
to the RefStruct API. This avoids the overhead inherent
in the AVBuffer API (namely the separate allocations etc.).
Notice that the refcounting here is actually currently unused;
the refcounts are always one (or zero in case of no refcounting);
its only advantage is the flexibility provided by custom
free functions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This avoids allocations and error checks etc. as well
as duplicate pointer lists in the CodedBitstreamFooContexts.
It also avoids casting const away for use as opaque,
as the RefStruct API supports const opaques.
The fact that some of the units are not refcounted
(i.e. they are sometimes part of an encoding context
like VAAPIEncodeH264Context) meant that CodedBitstreamUnit
still contains two pointers, one to the content
and another ownership pointer, replacing the AVBufferRef* pointer.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It avoids allocations and the corresponding error checks.
Also avoids casts and indirections.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It avoids allocations and the corresponding error checks.
It also avoids indirections and casts.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and error checks when syncing the buffers.
Also avoids indirections.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and error checks for these allocations;
e.g. syncing buffers across threads can't fail any more
and needn't be checked. It also gets rid of casts and
indirections.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and error checks for these allocations;
e.g. syncing buffers across threads can't fail any more
and needn't be checked. It also avoids having to keep
H264ParamSets.pps and H264ParamSets.pps_ref and PPS.sps
and PPS.sps_ref in sync and gets rid of casts and indirections.
(The removal of these checks and the syncing code even more
than offset the additional code for RefStruct.)
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and frees and error checks for said allocations;
also avoids a few indirections and casts.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For now, this API is supposed to replace all the internal uses
of reference counted objects in libavcodec; "internal" here
means that the object is created in libavcodec and is never
put directly in the hands of anyone outside of it.
It is intended to be made public eventually, but for now
I enjoy the ability to modify it freely.
Several shortcomings of the AVBuffer API motivated this API:
a) The unnecessary allocations (and ensuing error checks)
when using the API. Besides the need for runtime checks it
imposes upon the developer the burden of thinking through
what happens in case an error happens. Furthermore, these
error paths are typically not covered by FATE.
b) The AVBuffer API is designed with buffers and not with
objects in mind: The type for the actual buffers used
is uint8_t*; it pretends to be able to make buffers
writable, but this is wrong in case the buffer is not a POD.
Another instance of this thinking is the lack of a reset
callback in the AVBufferPool API.
c) The AVBuffer API incurs unnecessary indirections by
going through the AVBufferRef.data pointer. In case the user
tries to avoid this indirection and stores a pointer to
AVBuffer.data separately (which also allows to use the correct
type), the user has to keep these two pointers in sync
in case they can change (and in any case has two pointers
occupying space in the containing context). See the following
commit using this API for H.264 parameter sets for an example
of the removal of such syncing code as well as the casts
involved in the parts where only the AVBufferRef* pointer
was stored.
d) Given that the AVBuffer API allows custom allocators,
creating refcounted objects with dedicated free functions
often involves a lot of boilerplate like this:
obj = av_mallocz(sizeof(*obj));
ref = av_buffer_create((uint8_t*)obj, sizeof(*obj), free_func, opaque, 0);
if (!ref) {
av_free(obj);
return AVERROR(ENOMEM);
}
(There is also a corresponding av_free() at the end of free_func().)
This is now just
obj = ff_refstruct_alloc_ext(sizeof(*obj), 0, opaque, free_func);
if (!obj)
return AVERROR(ENOMEM);
See the subsequent patch for the framepool (i.e. get_buffer.c)
for an example.
This API does things differently; it is designed to be lightweight*
as well as geared to the common case where the allocator of the
underlying object does not matter as long as it is big enough and
suitably aligned. This allows to allocate the user data together
with the API's bookkeeping data which avoids an allocation as well
as the need for separate pointers to the user data and the API's
bookkeeping data. This entails that the actual allocation of the
object is performed by RefStruct, not the user. This is responsible
for avoiding the boilerplate code mentioned in d).
As a downside, custom allocators are not supported, but it will
become apparent in subsequent commits that there are enough
usecases to make it worthwhile.
Another advantage of this API is that one only needs to include
the relevant header if one uses the API and not when one includes
the header or some other component that uses it. This is because there
is no RefStruct type analog of AVBufferRef. This brings with it
one further downside: It is not apparent from the pointer itself
whether the underlying object is managed by the RefStruct API
or whether this pointer is a reference to it (or merely a pointer
to it).
Finally, this API supports const-qualified opaque pointees;
this will allow to avoid casting const away by the CBS code.
*: Basically the only exception to the you-only-pay-for-what-you-use
rule is that it always uses atomics for the refcount.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is only used by encoders; this unfortunately necessitated
to add separate allocations to the SVQ1 encoder which uses
motion estimation without being a full member of mpegvideo.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is no longer needed as the side data is available for decoders in the
AVCodecContext.
The tests affected reflect the removal of useless CPB and Stereo 3D side
data in packets.
Signed-off-by: James Almer <jamrial@gmail.com>
The changed references for fate-hevc-dv-rpu fate-mov-zombie happen because,
unlike ffmpeg and ffplay, ffprobe never injected packet side data, so the
display matrix side data at the container level is now present in the output
frames.
Signed-off-by: James Almer <jamrial@gmail.com>
Deprecate AVStream.side_data and its helpers in favor of the AVStream's
codecpar.coded_side_data.
This will considerably simplify the propagation of global side data to decoders
and from encoders. Instead of having to do it inside packets, it will be
available during init().
Global and frame specific side data will therefore be distinct.
Signed-off-by: James Almer <jamrial@gmail.com>
This will simplify the propagation of side data to decoders and from encoders.
Global side data will now reside in the AVCodecContext, thus be available
during init(), removing the need to propagate it inside packets.
Global and frame specific side data will therefore be distinct.
Signed-off-by: James Almer <jamrial@gmail.com>
Handling AVPacketSideData directly, which can used on structs other than
AVPacket.
This will be useful in the following commits.
Signed-off-by: James Almer <jamrial@gmail.com>
Drop reference to constants removed in 94eed68ace.
In particular, rename me_method to me_quality and add description for
supported values.
Address trac issue:
http://trac.ffmpeg.org/ticket/10003
In particular, clarify that it can either be set as a field of
AVCodecContext or as an x264 option, using a different specification.
Should address trac issue:
http://trac.ffmpeg.org/ticket/3947
The spec caps the prefix alphabet size to 32768 (i.e. 1 << 15) so we
should check for that and reject alphabets that are too large, in order
to prevent over-allocating.
Additionally, there's no need to allocate buffers that are as large as
the maximum alphabet size as these aren't stack-allocated, they're heap
allocated and thus can be variable size.
Added an overflow check as well, which fixes leaking the buffer, and
capping the alphabet size fixes two potential overruns as well.
Fixes: out of array access
Fixes: 62089/clusterfuzz-testcase-minimized-ffmpeg_DEMUXER_fuzzer-
5437089094959104.fuzz
Found-by: continuous fuzzing process
https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Found-by: Hardik Shah of Vehere (Dawn Treaders team)
Co-authored-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Leo Izen <leo.izen@gmail.com>
This patch will cause the parser to abort if it detects an icc profile
with an invalid size. This is particularly important if the icc profile
is entropy-encoded with zero bits per symbol, as it can prevent a
seemingly infinite loop during parsing.
Fixes: infinite loop
Fixes: 62374/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer
-5551878085410816
Found-by: continuous fuzzing process
https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reported-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Also, avoid spurious end-of-line after side data entries, and improve
rendering of compact output, by adding an indication of the side data
type for each entry.
Also fixes issue:
http://trac.ffmpeg.org/ticket/9266
Do not use put_sbits() where only unsigned is stored.
Reduce size of data_check_present field.
Reduce size of table of codebook_extremes[].
Avoid anonymously typedeffed structs.
Use encoder private context to store parameters.
Fix wrapping when calculating offsets.
Restructure arrays in encoder private context so to keep
arrays belonging to same subblock into separate structure.
Disable matrix coefficients as they are sometimes
producing wrong results.
Fixes multiplane support on Nvidia.
Also, remove the ENCODE usage, even if the driver signals it as supported.
Currently, it's not used, and when it is used, it'll be gated behind
two extension checks.
Some clips (i.e. SLIST_B_Sony_9) will use PPS 0 and 8, before PPS 1-7.
vulkan_hevc expects {sps,pps,vps}_list to be filled in order, which
causes PPS 8 to not be added to the Vulkan session params when it is
being used by a picture.
This removes the expectation that these lists are filled in order. The
indicies into vps_list are saved since there are multiple usages of it.
This also fixes a bug with some clips (i.e. PPS_A_qualcomm_7) which use
all 64 available PPS slots, causing the old loop to think there are more
than 64 PPS-es.
SVT-AV1 does not support requesting keyframes at arbitrary points
by setting pic_type to EB_AV1_KEY_PICTURE. So set force_key_frames
to 1 only when gop_size == 1.
Please see the comments in
https://gitlab.com/AOMediaCodec/SVT-AV1/-/issues/2076 for a bit more
details.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
It has currently not been done for H263, H263P and MPEG4.
Doing so avoids having to initialize the IDCT permutation
lateron when decoding packets in order to be able to parse
a quant matrix; it means that every mpegvideo decoder always
has an initialized IDCTDSPContext after init.
Initializing is done generically in ff_mpv_decode_init().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, ff_mpeg_update_thread_context() zeroes
the context to initialize on initialization failure.
This has been added in e1d7d4bd13.
Just as now, ff_mpeg_update_thread_context() simply
copied the src MpegEncContext over the dst MpegEncContext
to initialize it, but clear_context() was only added in
b160fc290c, so that cleaning up
on init failure was a minefield if performed.
It was not always performed, namely not before the first
allocation needed to be freed. In the fuzzer sample that
led to e1d7d4bd13, the call
to av_image_check_size() failed and before said commit,
the context contained lots of pointers from the src context,
leading to assert violations lateron.
Of course, the proper fix for this is resetting the pointers
(or even better, not copying them in the first place), so
this zeroing is unnecessary since commit
b160fc290c. It is also harmful,
because it makes initializing something only once during init
more complicated; See the h264chroma handling in the diff
for an example. Therefore it is removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Before 998c9f15d1, the IDCTDSPContext
has only been initialized in ff_mpv_common_init() which is deferred
until immediately before decoding a picture; to nevertheless parse
the quant matrices in sequence headers or quant matrix extensions,
a dummy (identity) permutation has been stored in the codec's init
function; after ff_mpv_common_init() which could change the permutation
the matrices were repermutated.
Yet since said commit, the IDCTDSPContext is initialized during init
and does not change afterwards (unless the user forces different CPU
flags), so there is no need to reinitialize it; the repermutation code
can be removed as well.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: out of array access
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ESCAPE124_fuzzer-6035022714634240
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ESCAPE124_fuzzer-6422176201572352
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -2147483506 + -801380 cannot be represented in type 'int'
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_APE_fuzzer-6578985923117056
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 148676193 - -2006512262 cannot be represented in type 'int'
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WAVARC_fuzzer-5963163952349184
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -1364715454 + -1468954671 cannot be represented in type 'int'
Fixes: 62093/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_FIXED_fuzzer-5538774254485504
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: 62171/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_FIXED_fuzzer-5644657180409856
Fixes: signed integer overflow: 2 * 1079352273 cannot be represented in type 'int'
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 4 * 2307917133220067266 cannot be represented in type 'long'
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FLAC_fuzzer-6307690022043648
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 538976288 - -9223372036854775808 cannot be represented in type 'long'
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FLAC_fuzzer-6275845531238400
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
the type is also changed to int as it is interpreted as int in av_get_packet()
Fixes: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_WSVQA_fuzzer-6593408795279360
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_WSVQA_fuzzer-4613908817903616
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 4481246996173000000 - -4778576820000000000 cannot be represented in type 'long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_SBG_fuzzer-5063670588899328
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 91542414454000000 - -9154241494546000000 cannot be represented in type 'long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_CONCAT_fuzzer-4739147999084544
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This revectors the inner loop to reverse vectors element in vectors,
thus eliminating the negative register stride. Note that RVV does not
have a vector reverse instruction, so this uses a gather.
It is already done generically in update_context_from_thread()
before this function is called.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This has been done for the luma plane of missing FLV1 and H263
references.
Also remove code duplication by reusing gray_frame(), which
has been renamed to color_frame() for this purpose.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes segfaults with -debug +nomc -flags +gray (presuming
a build with --enable-gray).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In this case any timestamps are guessed by compute_pkt_fields() in
libavformat. Since we are decoding the stream, we have more accurate
information from the decoder and do not need any guesses.
Eliminates spurious PTS gaps in a number of FATE tests.
Also avoids dropping the majority of frames in fate-dirac*
* export AVCodecParserContext.picture_structure.
* when there are two field pictures in the packet, set
the interlacing parameters accordingly:
* repeat_pict=1 and picture_structure=FRAME to indicate 2 fields
* field_order to indicate the first field of the two
The demuxer does not set packet timestamps itself after
c6b6356635 and instead relies on the
parser to do it. However, this does not matter from the caller
perspective as it still happens inside the demuxer. The demuxer should
thus not be flagged as not having timestamps.
The parser does not have a timebase associated with it, so in general it
makes no sense for it to be exporting durations. Longer-term this
should be handled more cleanly with a new parser API.
It dates back to pre-2005 days, when people generally tended to commit
their work directly without going through the mailing list. Few
developers do it today, and never outside of their standalone modules.
This item is thus confusing and misleading and is better removed.
The section consistes of a single short paragraph linking to other
chapters. The enclosing chapter also has no other sections, all other
text is placed in the chapter directly.
Keeping a separate section for this paragraph just adds more clutter.
Instead, use forward declarations; and in order not to affect
any user include these headers for them, but not internally.
This has the advantage of removing implicit inclusions of these
headers from almost all files providing codecs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is of no value to the user, because every muxer can always
be flushed with a NULL packet. As its documentation shows
("If not set, the muxer will not receive a NULL packet in
the write_packet function") it is actually an internal flag
that has been publically exposed because there was no internal
flags field for output formats for a long time. But now there is
and so use it by replacing the public flag with a private one.
Reviewed-by: James Almer <jamrial@gmail.com>
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Buggy ICCv4 profiles are unfortunately used in the wild, and it's quite
easy to work around them by just forcing the white point to the correct
value. Display a warning just in case.
See-Also: https://trac.ffmpeg.org/ticket/9673
This is mathematically equivalent to what we were doing before, but
gives subtly different results due to rounding (rows first vs columns
first). Doing it this way makes our film grain database generation match
reference implementation and now produces bit-exact outputs in my
testing.
Rename the transposed variables to be a bit less confusing.
When linking the main tools, the object files to link are set up
via the variable OBJS-<name>, but for the tools, we've only
used the target's list of dependencies.
In most cases, this has been fine, but it has caused specifying
the libraries to link in a duplicate fashion; the linking command
has looked like this:
$CC -Llibavutil ... tools/tool.o libavutil/libavutil.a -lavutil
Normally, the libraries to link are handled with "-Llibavutil -lavutil";
when linking the main fftools, this is how they are linked.
In the case of the binaries under the "tools" directory (within the
make variable TOOLS), we've passed the full set of dependencies
to the linker, via $^, which does contain the names of the
dependency libraries as well.
When libraries are built as regular static libraries, or shared
unix libraries, this has all worked fine. When libraries are
built as DLLs for Windows, though, the norm is not to pass the
actual DLL to the linker, but an import library.
Mingw tools generally can handle linking directly against a DLL
as well, but MSVC tools don't support that, and error out with
a very cryptic error message:
libavdevice\avdevice.dll : fatal error LNK1107: invalid or corrupt file: cannot read at 0x2D8
By omitting these parts of the dependencies, linking of these tool
executables succeed in MSVC builds with shared libraries enabled.
Signed-off-by: Martin Storsjö <martin@martin.st>
This is slower than the Zbb version on real hardware due to register
strides. Proper support for vector byte-swap requires the Zvbb
extension, but it's much too early for me to worry about it.
Put it into an encoder-specific context with a SnowContext
at its front. This also avoids having to include mpegvideo.h
in snow.c and snowdec.c.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
We have to write an explicit BlockDuration element (and use
a BlockGroup instead of a SimpleBlock) in case the Track
has a DefaultDuration that is inconsistent with the duration
of the packet.
The matroska-h264-remux test uses a file with coded fields
where the duration of a Block is the duration of a field,
not of a frame, therefore this patch writes said BlockDuration
elements.
(When using a BlockGroup, one has to add ReferenceBlock elements
to distinguish keyframes from non-keyframes. Unfortunately,
the AV1 codec mapping [1] requires us to reference all references
and to really use the real references, which requires a lot of
effort for basically no gain. When BlockGroups are used with AV1,
the created files are most likely invalid, both before and after
this patch, but this patch makes this more likely to happen.)
[1]: https://github.com/ietf-wg-cellar/matroska-specification/blob/master/codec/av1.md
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They were replaced by TX from libavutil; the tremendous work
to get to this point (both creating TX as well as porting
the users of the components removed in this commit) was
completely performed by Lynne alone.
Removing the subsystems from configure may break some command lines,
because the --disable-fft etc. options are no longer recognized.
Co-authored-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It more directly shows that ff_flac_decode_frame_header() does not
modify the AVCodecContext given to it at all; and it would not be
allowed to do so, given that it is used by the parser when it is
still unknown whether said frame header is even valid.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In particular the encoder used only a small part of the context:
The new encoder context is only 128B here. It used to be 32992.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When using frame threading, the FFV1 decoder's update_thread_context()
function copies the whole context and afterwards restores some allocated
fields with backups made earlier. Among these fields are the
ThreadFrames and the source context's ThreadFrames can change
concurrently without any synchronization, leading to data races which
are undefined behaviour even if they don't lead to problems in
practice (as the destination's own ThreadFrames are restored directly
thereafter).
Fix this by only copying the actually needed fields.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
After the AVFrame has been wrapped into a buffer,
it is owned by the buffer and must not be freed manually
any more. Yet this happens on subsequent errors.
This bug was introduced in 6ca43a9675.
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Suggested-by: Tomas Härdin <git@haerdin.se>
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_MXF_fuzzer-5130394286817280
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The AVDCT API used by this filter does in no way depend
upon the FFT subsystem.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The spec specified indices in the order [x][y], but our code follows the
traditional C convention of [y][x]. This was not correctly account for
when calculating the base index of the grain database access.
The spec specifies x^31 + x^3 + 1 as the polynomial, but the diagram in
Figure 1-1 omits the +1 offset. The initial implementation was based on
the diagram, but this is wrong (produces subtly incorrect results).
It does not make much sense to me, but GCC somehow optimises the
inline assembler even though the output is very obviously used and
having observable side effects.
This reverts commit 09731fbfc3.
In this logical branch, fq->queued and fq->allocated must be equal.
Deleting this code will make it easier to understand the logic
(copy the data before tail to the newly requested space)
Signed-off-by: yangyalei <yangyalei@xiaomi.com>
ff_sbr_noise_table is only used by SBR DSP code
and sbrdsp.h is the header providing its declaration.
Furthermore, this ensures that checkasm does not pull
the whole AAC SBR and PS code in.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are not common.
Furthermore, this file is pulled in when linking checkasm and
up until now, the calls to ff_get_buffer() and av_codec_is_decoder()
caused all of libavcodec to be pulled in as well. Besides being
bad size-wise this also has the downside that it pulls in
avpriv_(cga|vga16)_font from libavutil which are marked as
being imported from another library when building libavcodec
as a DLL; this breaks checkasm because it links both lavc and lavu
statically.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The part of said function that is common to both encoder and decoder
is negligible since c954cf1e1b
and more than offset by the costs of "Am I an encoder?" checks.
So move allocating the frames to the encoder and decoder directly.
Also rename ff_snow_frame_start() to ff_snow_frames_prepare(),
because a frame without a buffer has not been properly started.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This function is used by the AARCH64 code to filter a few
remaining lines in case the dimensions are not suitably
aligned; it is furthermore also used by checkasm to actually
test the AARCH64 code against. But it is normally not used
and this patch therefore moves it in bwdifdsp.h as a static
inline function so that it can be avoided if possible
or inlined where necessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise checkasm/vf_bwdif.c pulls in vf_bwdif.c and
then all of libavfilter. Besides being bad size-wise
this also has the downside that it pulls in
avpriv_(cga|vga16)_font from libavutil which are marked
as being imported from another library when building
libavfilter as a DLL and this breaks checkasm because
it links both lavfi and lavu statically.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This already avoids unnecessary indirectly included headers
in the arch-specific vf_bwdif_init.c files; it is also in
preparation for splitting the actual functions out of vf_bwdif.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Dnn models has different data preprocess requirements. Scale and mean
parameters are added to preprocess input data.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Dnn models have different input layout (NCHW or NHWC), so a
"layout" option is added
Use openvino's API to do layout conversion for input data. Use swscale
to do layout conversion for output data as openvino doesn't have
similiar C API for output.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Applying film grain happens after ff_thread_finish_setup(),
so the parameters synced in hevc_update_thread_context() must not
be modified. But this is exactly what happens in case applying
film grain fails. (The likely result is that in case of frame threading
an uninitialized frame is output.)
Given that it is actually very easy to know in advance whether
ff_h274_apply_film_grain() supports a given set of parameters,
one can check for this before ff_thread_finish_setup()
and avoid allocating an unused buffer lateron.
Reviewed-by: Niklas Haas <ffmpeg@haasn.xyz>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
From the documentation of GNU awk [1]:
"In most awk implementations, including gawk, rand() starts generating
numbers from the same starting number, or seed, each time you run awk.45
Thus, a program generates the same results each time you run it. The
numbers are random within one awk run but predictable from run to run.
This is convenient for debugging, but if you want a program to do
different things each time it is used, you must change the seed to a
value that is different in each run. To do this, use srand()."
This commit does exactly this.
[1]: https://www.gnu.org/software/gawk/manual/html_node/Numeric-Functions.html#index-rand_0028_0029-function
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Instead use float one by default for sample rate conversions.
The s16p internal transfer format produces visible and hearable
quantization artifacts.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
for S8 we continue to use S16 as it should have enough precision
Fate is adjusted so bitexactness is maintained between mips/arm/x86
if more tests became bit-inexact on some platform, the same change
can be done to them
The use of higher precision and float intermediates inevitably
leads to more differences between platforms.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2147483181 + 1024 cannot be represented in type 'int'
Fixes: 61117/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VMIX_fuzzer-5387692433866752
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The PSNR filter uses the pixel values without considering
the color ranges. This is incorrect. Patch adds a warning
so at least the user knows it.
Let's see an example:
(1) Let's get a simple black pixel/white pixel image.
```
$ echo -n -e "\x00\x00\x00\xff\xff\xff" > /tmp/foo.rgb24
```
(2) From this image, let's distill full and limited range y4m
copies.
```
$ ffmpeg -y -f rawvideo -video_size 2x1 -pix_fmt rgb24 -i /tmp/foo.rgb24 -vf scale="out_range=full" -pix_fmt yuv420p /tmp/foo.full.y4m
$ xxd /tmp/foo.full.y4m
00000000: 5955 5634 4d50 4547 3220 5732 2048 3120 YUV4MPEG2 W2 H1
00000010: 4632 353a 3120 4970 2041 303a 3020 4334 F25:1 Ip A0:0 C4
00000020: 3230 6a70 6567 2058 5953 4353 533d 3432 20jpeg XYSCSS=42
00000030: 304a 5045 4720 5843 4f4c 4f52 5241 4e47 0JPEG XCOLORRANG
00000040: 453d 4655 4c4c 0a46 5241 4d45 0a00 ff80 E=FULL.FRAME....
00000050: 80 .
```
and
```
$ ffmpeg -y -f rawvideo -video_size 2x1 -pix_fmt rgb24 -i /tmp/foo.rgb24 -vf scale="out_range=limited" -pix_fmt yuv420p /tmp/foo.limited.y4m
$ xxd /tmp/foo.limited.y4m
00000000: 5955 5634 4d50 4547 3220 5732 2048 3120 YUV4MPEG2 W2 H1
00000010: 4632 353a 3120 4970 2041 303a 3020 4334 F25:1 Ip A0:0 C4
00000020: 3230 6a70 6567 2058 5953 4353 533d 3432 20jpeg XYSCSS=42
00000030: 304a 5045 4720 5843 4f4c 4f52 5241 4e47 0JPEG XCOLORRANG
00000040: 453d 4c49 4d49 5445 440a 4652 414d 450a E=LIMITED.FRAME.
00000050: 10eb 8080 ....
```
Note that the 2x images are the same (both have 1x pixel at the
darkest black, and one at the brightest white). Only difference
is the range.
(3) Let's calculate the PSNR score:
```
$ ./ffmpeg -filter_threads 1 -filter_complex_threads 1 -i /tmp/foo.full.y4m -i /tmp/foo.limited.y4m -lavfi "psnr" -f null -
...
[Parsed_psnr_0 @ 0x2f5dac0] PSNR y:22.972065 u:inf v:inf average:25.982365 min:25.982365 max:25.982365
```
As we are comparing an image with itself, we expect "y:inf" as the
luma PSNR. Issue here is that the PSNR filter just uses the pixel
values, ignoring the color ranges.
A possible solution would be to have the filter do the conversion.
Proposed solution is to add a warning.
```
$ ./ffmpeg -filter_threads 1 -filter_complex_threads 1 -i /tmp/foo.full.y4m -i /tmp/foo.limited.y4m -lavfi "psnr" -f null -
...
[Parsed_psnr_0 @ 0x2f5dac0] master and reference frames use different color ranges (pc != tv)
...
[Parsed_psnr_0 @ 0x2f5dac0] PSNR y:22.972065 u:inf v:inf average:25.982365 min:25.982365 max:25.982365
```
Tested:
Ran fate.
```
$ make fate -j
...
TEST seek-lavf-ppmpipe
TEST seek-lavf-pgmpipe
TEST seek-lavf-mxf_opatom
```
Fixes: signed integer overflow: 141394472 + 2038060365 cannot be represented in type 'int'
Fixes: 61787/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WAVARC_fuzzer-5882604925878272
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 1871429831 + 343006811 cannot be represented in type 'int'
Fixes: 61784/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AIC_fuzzer-5372151001120768
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
by making gain unsigned we have 1 bit more available
alternatively we can clip twice as in the g729 reference
Fixes: left shift of 23404 by 17 places cannot be represented in type 'int'
Fixes: 61728/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ACELP_KELVIN_fuzzer-6280412547383296
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The comments to the function say that it does not implement the spec and
instead follows VTM.
This patch is quite likely not the right solution and more intended to show
the issue to people knowing the specific part of VTM ...
Fixes: signed integer overflow: 2147483392 + 256 cannot be represented in type 'int'
Fixes: 60505/clusterfuzz-testcase-minimized-ffmpeg_DEMUXER_fuzzer-6216675924770816
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Turn tracing into callbacks for each syntax element, with default
callbacks to match current trace_headers behaviour for debug. Move
the construction of bit strings into the trace callback, which
simplifies all of the read and write functions.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
from the specification:
For each OLS, there shall be at least one layer that is an output layer. In other words, for any value of i in the range of 0
to TotalNumOlss − 1, inclusive, the value of NumOutputLayersInOls[ i ] shall be greater than or equal to 1
Fixes: index 257 out of bounds for type 'uint8_t [257]'
Fixes: 61160/clusterfuzz-testcase-minimized-ffmpeg_BSF_VVC_METADATA_fuzzer-6709397181825024
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The AVFrame of a decoder with the ordinary decode callback
is generically unreferenced on error.
Reviewed-by: James Zern <jzern@google.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Keeping an ever growing list of CPUs just to pass -march to the compiler and
enable fast_cmov is a waste of time. Every CPU we know has limitations is
already handled here, so just fallback to enabling everything when a passed in
argument is not one of those.
This will enable optimizations for CPU architectures released in the past 7 or
so years with supported GCC and clang compilers when used as argument in
configure, instead of silently ignoring them.
Signed-off-by: James Almer <jamrial@gmail.com>
ISO C++ forbids compound-literals. It's not available with MSVC.
This is a known issue from 10 years ago, and that's why there is a
av_get_time_base_q().
Since we have no plan to remove AV_TIME_BASE_Q, just make it
available in C++.
There are multiple choices:
1. Use C++11 syntax: AVRational{1, AV_TIME_BASE}
Users may still use C++98 to write new code. So no.
2. Use av_get_time_base_q().
It's for this purpose. But it's not compile time constants as
AV_TIME_BASE_Q in C.
So I choose av_make_q() as Anton's suggestion.
https://libav-devel.libav.narkive.com/ZQCWfTun/patch-0-2-fix-avutil-h-usage-from-c
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
It is badly named (should have been -top_field_first, or at least -tff),
underdocumented and underspecified, and (most importantly) entirely
redundant with the setfield filter.
Merge three blocks with slightly inconsistent checks into one, treating
encoder input as interlaced when either:
* at least one of ilme/ildct flags is set
* the first frame in the stream is marked as interlaced
* the user specified the -top option
Stop modifying the frame passed to enc_open().
It was added in cb114ed464 with the comment
"This will allow fixing several bugs with the -shortest option".
Since
* there is no explanation of what these bugs are
* libavformat is not the place to work around ffmpeg CLI bugs
* there is no indication that this feature is actually in use
deprecate it without replacement.
Avoid using the deprecated functions ssh_try_publickey_from_file among
others in favor of symbols introduced in libssh 0.6.0 (Jan 2014) and
update configure to require this version.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Since 432adca5fe no decoder
looks at the slice_count and slice_offset fields at all,
so there is no reason to synchronize them between the worker
and the user thread.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When decoding non-keyframes, the decoding process expects
there to be two reference frames, the last one and the golden
one. The existence of the golden one is checked and in case
it is there, it is presumed that the last one exists as well.
This assumption is wrong in case of memory allocation failure,
namely in case the call to ff_thread_ref_frame() that sets
the last frame fails.
Fix this by actually performing a shuffle without creating
new references. This can't fail and has the advantage of
fewer implicit allocations.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When decoding a keyframe, last_frame and golden_frame are
not used at all and (at least when starting decoding)
are not set at all. But due to code sharing pointer arithmetic
on the NULL data-pointers of these frames has nevertheless
been performed. This is undefined behaviour and causes e.g.
"runtime error: applying non-zero offset 173440 to null pointer"
from UBSan in the vp31, vp4, theora-coeff-level64 and theora-offset
FATE-tests.
Fix this by reusing the current frame for unavailable frames.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The '<' in ///< says that the comment pertains to the previous item
on the same line as the ///<.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The 1000 did result in the appearance of a never ending reload loop
The RFC mandates that "If the client reloads a Playlist file and finds that it has not
changed, then it MUST wait for a period of one-half the target
duration before retrying." and if it has changed
"the client MUST wait for at least the target duration before attempting to reload the
Playlist file again"
Trying to reload 3 times seems a better default than 1000 given these
durations
Issue found by: Сергей Колесников
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The issue is that we cannot rely on any context existing when we free
frames. The Vulkan functions are loaded in each context separately,
so until now, we've just been loading them on every frame's destruction.
Rather than do this, just save the function pointers we need in each
frame. The function pointers are guaranteed to not change and exist.
Since bfb28b5ce8 the permutation
type FF_IDCT_PERM_SIMPLE is ARCH_X86_32-only. So use this
knowledge to disable code for it when not on ARCH_X86_32.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When using multi-threaded decoding, every decoding thread
has its own DBP consisting of H264Pictures and each of these
points to its own AVFrames. They are synced during
update_thread_context via av_frame_ref() and therefore
the threads actually decoding (as well as all the others)
must not modify any field that is copied by av_frame_ref()
after ff_thread_finish_setup().
Yet this is exactly what happens when an error occurs
during decoding and the AVFrame's decode_error_flags are updated.
Given that these errors only become apparent during decoding,
this can't be set before ff_thread_finish_setup() without
defeating the point of frame-threading; in practice,
this meant that the decoder did not set these flags correctly
in case frame-threading was in use. (This means that e.g.
the ffmpeg cli tool fails to output its "corrupt decoded frame"
message in a nondeterministic fashion.)
This commit fixes this by adding a new H264Picture field
that is actually propagated across threads; the field
is an AVBufferRef* whose data is an atomic_int; it is
atomic in order to allow multiple threads to update it
concurrently and not to provide synchronization
between the threads setting the field and the thread
ultimately returning the AVFrame.
This unfortunately has the overhead of one allocation
per H264Picture (both the original one as well as
creating a reference to an existing one), even in case
of no errors. In order to mitigate this, an AVBufferPool
has been used and only if frame-threading is actually
in use. This expense will be removed as soon as
a proper API for refcounted objects (not based upon
AVBuffer) is in place.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Add a pointer parameter that if supplied will be used to return
the updated decode_error_flags. This will allow to fix several
races when using frame-threading; these resulted from AVFrame
that the earlier code updated concurrently being used as source
in an av_frame_ref() call in the decoder's update_thread_context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When ov_model_const_input_by_name/ov_model_const_output_by_name
failed, input_port/output_port can be wild pointer.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
All Vulkan HWAccels share the same boilerplate code for creating
session params and this includes a common bug: In case actually
creating the video session parameters fails, the buffer destined
to hold them leaks; in case of HEVC this is also true if
get_data_set_buf() fails.
This commit factors this code out and fixes the leak.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The alignment in vulkan_unmap_from_drm() (formerly the clone
of vulkan_frame_free()) is nicer than the in vulkan_frame_free(),
let's preserve it.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The AVBuffer API uses uint8_t as base type for buffers
and therefore its free callbacks need to abide by this.
Therefore vulkan_frame_free() used an inappropriate signature
which caused casts whenever this function has been called
manually.
This commit changes this by making vulkan_frame_free()
use the proper type and a vulkan_frame_free_cb() that
is used as free callback for the AVBuffer API.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: signed integer overflow: -2147483300 - 12285 cannot be represented in type 'int'
Fixes: 59462/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BONK_fuzzer-5714298807386112
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
jpeg2000 overrides the global lowres variable with a lowres field called reduction_factor
ffmpeg -lowres X causes the reduction_factor to be set
ffplay -lowres X causes both lowres and the reduction_factor to be set
ossfuss sets only lowres
only the ffmpeg variant works. This patch tries to make the other 2 work.
Alternative we could just error out if things are inconsistent.
More complex restructuring should be limited to the master branch
to keep this reasonably easy to backport
Fixes: out of array access
Fixes: 59672/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_JPEG2000
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access
Fixes: 61991/clusterfuzz-testcase-minimized-ffmpeg_dem_JPEGXL_ANIM_fuzzer-5524679648215040
Fixes: 62181/clusterfuzz-testcase-minimized-ffmpeg_dem_JPEGXL_ANIM_fuzzer-5504964305485824
Fixes: 62214/clusterfuzz-testcase-minimized-ffmpeg_dem_JPEGXL_ANIM_fuzzer-4782972823535616
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
An AVBPrint is initialized via av_bprint_init() (or
av_bprint_init_for_buffer()) which expects uninitialized
AVBPrints; it is therefore not necessary to zero them before
the actual initialization.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is not an error condition, but would be treated like one if the
program terminates on the next transcode loop iteration because of a
signal or keyboard input.
Fixes#10504
Tested-by: https://github.com/0Ky
For badly interleaved files, interleave packets from multiple tracks
at the demuxer level can trigger seeking back and forth, which can be
dramatically slow depending on the protocol. Demuxer level interleave
can be useless sometimes, e.g., reading mp4 via http and then
transcoding/remux to DASH. Disable this option when you don't need the
demuxer level interleave, and want to avoid the IO penalizes.
Co-authored-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Since fb548fba04,
this return -1 is in a function returning enum AVPixelFormat
whose caller checks for AV_PIX_FMT_NONE for failure.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Every code point in the BMP is representable with at most three bytes
in UTF-8 and every code point not in the BMP takes four bytes.
For each of the latter, the encoding of UTF-16 takes as many
bytes; for each of the former, it takes at most 3/2 as many.
Therefore one can decrease the size of the buffer allocated
here.
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Happens when length > INT_MAX / 2; use unsigned for the computation,
but restrict the value to INT_MAX, because avio_get_str16le()
accepts an int as buf_len argument. Notice that it can happen
that the string read by avio_get_str16le() is truncated in this case.
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
get_tag() is not designed with negative length in mind;
in this case, it will allocate a very small buffer
(LEN_PRETTY_GUID + 1) and might call avio_get_str16le()
with a negative maxlen (which relies on these parameters
to be signed).
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Each of the 16 bytes of a GUID is written as a two-character
hex value and three hyphens, leading to a length of 35.
GCC 13 emits a -Wformat-truncation= warning because of this.
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is undefined behaviour even in cases where it works
(it works because it is only a const uint8_t* vs. uint8_t* difference).
Instead add a cbuf parameter to pass a const buffer (for writing)
as well as a parameter indicating whether we are reading or writing;
retry_transfer_wrapper() itself then uses the correct function
based upon this information.
Reviewed-by: Marton Balint <cus@passwd.hu>
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
av_image_copy() accepts const uint8_t* const * as source;
lots of user have uint8_t* const * and therefore either
cast (the majority) or copy the array of pointers.
This commit changes this by adding a static inline wrapper
for av_image_copy() that casts between the two types
so that we do not need to add casts everywhere else.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also constify AVAudioFifo* in the peek functions
besides constifying intermediate pointers (void**->void * const *).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Broken in e8704a8f60
when ffurl_read() has been turned into a static inline function
different from the actually used function ffurl_read2().
Fixes ticket #10562.
Tested-by: Mitzsch01
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
mxf_match_uid() accepts two const UID and a len parameter.
UID is a typedef for an array of 16 uint8_t, so the const UID
parameter is actually a pointer to const uint8_t.
The point of mxf_match_uid() is to check whether the initial
part of two UIDs match; the length of said part is given
by the len parameter. Once an incomplete UID has been passed
to mxf_match_uid() (albeit with the correct len, so safe),
which makes GCC emit -Wstringop-overread warnings.
Fix this by using a const uint8_t[] as type; it is more
natural for incomplete UIDs.
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Therefore use a proper prefix for this API, e.g.
ff_init_vlc_sparse -> ff_vlc_init_sparse
ff_free_vlc -> ff_vlc_free
INIT_VLC_LE -> VLC_INIT_LE
INIT_VLC_USE_NEW_STATIC -> VLC_INIT_USE_STATIC
(The ancient INIT_VLC_USE_STATIC has been removed
in 595324e143, so that
the NEW has been dropped.)
Finally, reorder the flags and change their values
accordingly.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Mostly taken from the documentation for ff_init_vlc_from_lengths();
also remove the documentation in vlc.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Not every user of idctdsp.h wants to initialize an IDCTDSPContext;
e.g. the proresdsp only uses ff_init_scantable_permutation()
and the IDCT permutation enum; similarly for cavsdsp and wmv2dsp.
Using a forward declaration here avoids an avcodec.h dependency
in the relevant files.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Due to non-byte-alignment a read of 32 bits guarantees only
25 usable bits; therefore get_bits_long() (which is used to
potentially read more than 25 bits) performs two reads in case
it needs to read more than 25 bits and combines the result.
Yet this is not necessary: One can just read 64 bits at a time
to get 32 usable bits (57 would be possible). This commit does so.
This reduced the size of .text by 30144B for GCC 11.4 and 5648B
for Clang 14 (both with -O3).
(get_bits_long() is a building block of show_bits_long()
and get_ue_golomb_long(), so this patch affects these, too.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are currently non-const for reasons unknown, although
avio_write() accepts a const buffer.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is undefined behaviour even in cases where it works
(it works because both are pointers). Instead change
the functions involved to use the type expected by the AVIO-API
and add inline wrappers for our internal callers.
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is only included because two very rarely used functions
use pointers to URLContexts; use struct URLContext instead.
Also move ffio_geturlcontext() so that one can avoid
a forward declaration of struct URLContext (which would be
necessary as soon as FF_API_AVIODIRCONTEXT is no more).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Incompatibility of the flags and the protocol's capabilities
are checked generically (see url_alloc_for_protocol()).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
A switch is simpler than a lookup over a table with
three entries, only two of which can happen at all.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: fc429d785e
cpb_cnt used to be initialized to 1 before
fc429d785e so cpb_cnt_minus1 should be
initialized to 0.
Also add +1 to the decode_sublayer_hrd call to account for the change to
the offset
And remove the AVOID_PROBING flag, given it's the last av1 decoder to be tested
either way.
This fixes a regression introduced in 1652f2492f,
where even if forcing the native av1 decoder, if another decoder was present,
like libdav1d or libaom-av1, they'd be used for probing and some fate tests
would have different results.
Signed-off-by: James Almer <jamrial@gmail.com>
There is likely a better way to fix this, this is mainly to show the problem
Fixes: MC within same frame resulting in overlapping memcpy()
Fixes: 60189/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-4992746590175232
Fixes: 61753/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5022150806077440
Fixes: 58062/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-4717458841010176
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
These casts cast const away temporarily; they are safe, because
the pointers that are initialized point to const data. But this
is nevertheless not nice and leads to warnings when using
-Wcast-qual. blend_modes.c generates 546 (2*39*7) such warnings
which is the majority of such warnings for FFmpeg as a whole.
vf_blend.c and vf_blend_init.h also use this pattern;
they have also been changed.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
lzma_stream.next_in is const for more than 15 years now
and has been so in every release of liblzma.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is not necessary at all. So remove it.
This also breaks an inclusion cycle mem.h->avutil.h->common.h->mem.h.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, avutil.h includes common.h which includes mem.h which
includes avutil.h, so that all these headers are in fact equivalent.
Yet mem.h does not need to include avutil.h at all and when it no longer
does, including common.h will no longer include error.h (included by
avutil.h) as well; change this by moving error.h to avutil.h, as error.h
is clearly a commonly used header.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Most users of ffio_init_context() simply want to wrap
a buffer into an AVIOContext; they do not provide
function pointers at all.
Therefore this commit adds shortcuts for these two common
operations. This also allows to accept const data when reading
(i.e. the const is now cast away at a central place in
ffio_init_read_context() instead of at several callers).
This also allows to constify the data in ff_text_init_buf().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These defines are also used in other contexts than just AVCodecContext
ones, e.g. in libavformat. Furthermore, given that these defines are
public, the AV-prefix is the right one, so deprecate (and not just move)
the FF-macros.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
AVCodec is only ever used as an incomplete type (i.e. via a pointer
to an AVCodec) in avformat.h and it is not really part of the core
of avformat.h or libavformat; almost none of our internal users
make use of it (and none make use of hwcontext.h, which is implicitly
included). So switch to use struct AVCodec, but continue to include
codec.h for external users for compatibility.
Also, do the same for AVFrame and frame.h, which is implicitly included
by codec.h (via lavu/hwcontext.h).
Also, remove an unnecessary inclusion of <time.h>.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
and also perform the remainder of the subtitle parsing directly
after mkv_parse_subtitle_codec().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Namely to a place after the AVStream has already been created,
so that it can be directly attached to it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
More exactly, factor codec-specific video parsing out of
matroska_parse_tracks(). This is intended to improve readability.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If the size of the data of an EbmlBin is > 0, its data is always
present, as ebml_read_binary() always leaves the buffer
in a consistent state (even on error).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
More exactly, factor codec-specific audio parsing out of
matroska_parse_tracks(). This is intended to improve readability.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, matroska_parse_tracks() has two main ways
to set AVCodecParameters.extradata: A generic way via CodecPrivate
(possibly with an offset) and by allocating a buffer manually;
the pointer to this buffer is stored in a stack pointer.
In particular, the latter method is problematic, as the buffer
needs to be freed manually in case of error (currently there
are no error conditions between the place where it is set
to AVCodecParameters.extradata).
Most of these buffers are very small (<= 22B), so replace
the pointer to an allocated buffer with a stack buffer
and set the extradata directly for the one place where
the buffer may not be small.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Once upon a time, 413abbe164
added versions of some put_no_rnd_pixels functions for use
in VP3 and Theora (with an explicit check so that they are
only used for VP3 and Theora). When this was moved to hpeldsp
(from dsputil) in 3ced55d51c,
the check was replaced by a check for the bitexact flag
(and a CONFIG_VP3_DECODER compile-time check), so that
these functions were now used for other codecs as well.
Later commit 1dfc3cf89d
split off the "VP3-specific bits into a separate file",
yet these bits were not really VP3-specific bits at all
any more. (The error was repeated in commit
0a39c9ac0bfd7345fe676b4e2707d9cec3cbb553.) This commit
has not been reverted, because this would make future
changes from Libav (from where it originated) harder,
yet Libav is no more, so this commit effectively reverts
1dfc3cf89d.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Makes the output of the native decoder consistent with external decoders like
libdav1d with fate-enhanced-flv-av1.
Reviewed-by: Zhao Zhili <quinkblack@foxmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
This reverts commit eb88ccb92e.
AVCodecContext fields are the proper place for a decoder to export such values.
This change is in preparation for the following commits.
add option named rtmp_enhanced_codec,
it would support hvc1,av01,vp09 now,
the fourcc is using Array of strings.
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Fixes: signed integer overflow: -2147483648 + -1048576 cannot be represented in type 'int'
Fixes: 59365/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MPEG4_fuzzer-642654923954585
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2079654542 - -139267653 cannot be represented in type 'int'
Fixes: 60811/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TTA_fuzzer-5915858409750528
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes the test when the non-bitexact MMXEXT versions of
the hpeldsp functions are used.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In order to send VP9 tracks with FLV or RTMP, the enhanced RTMP
specification tells that VPCodecConfigurationRecord, a.k.a. vpcC
ISO-BMFF box, must be inserted into a metadata message. However, the
function responsible for generating vpcCs currently returns invalid
boxes, that are lacking the Version and Flag fields, inherited from
FullBox. For some reason, both flags were being added manually in
movenc. This patch fixes the issue.
Signed-off-by: Alessandro Ros <aler9.dev@gmail.com>
Reviewed-by: Steven Liu <lingjiujianke@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
This makes the test stricter because it is checked that the
MMX registers are not accidentally clobbered.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Only sub_media_pred has an MMXEXT version, so one can use
the version with the stricter check (that checks that
the MMX registers have not been clobbered) for sub_left_predict.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Only the idct_dc and add_residual functions have MMX versions,
so one can use the version with the stricter check (that checks
that the MMX registers have not been clobbered) for all the other
checks.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is not even clear what these emms_c() are supposed to achieve:
create_filtergraph() (where it may be called) does not call
ASM functions itself; it merely sets some function pointers.
Furthermore, there are no colorspacedsp functions using MMX
(checked by checkasm which does not use declare_new_emms()).
Finally, checking whether to issue emms_c() is overblown anyway.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
winbase.h defines IGNORE and is included via bzlib.h when compiling
for Windows. So rename this macro to NOTHING.
Also rename the muxer macro for consistency.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It includes various Windows-specific headers when compiling
for Windows and these sometimes cause issues: E.g. winbase.h
defines IGNORE, which clashes with a macro used in the Matroska
muxer (since 884653ee5b) and demuxer.
This header provides fallback defines for various stuff that is
mostly not used directly by (de)muxers at all:
mkdir, rename, rmdir, unlink, access, poll, pollfd, nfds_t,
closesocket, socklen_t, fstat, stat, lseek, SHUT_(RD|WR|RDWR)
and various POLL* constants.
Ergo fix this issue by not auto-including this header in lots
of places via an inclusion in internal.h and instead include
it everywhere where the above stuff is used (most of these
translation units already included os_support.h).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is the very last user of any lavc transform code.
This also *corrects* wmavoice decoding, as the previous DCT/DST
transforms were incorrect, bringing it closer to Microsoft's
own wmavoice decoder.
These are in-place transforms, required for DCT-I and DST-I.
Templated as the mod2 variant requires minor modifications, and is
required specifically for DCT-I/DST-I.
The word "monotonous" means "spoken in a monotone" which is not what we
mean here. We mean "monotonic" i.e. nondecreasing.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Adds a fate test for the jpegxl_anim demuxer, that should allow testing
for true positives and false positives for animated jpegxl files. Note
that two of the test cases are not animated, in order to help sort out
false positives.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
This simplification reduces codesize.
(It even reduces the size of .rodata here, because
the jump table used by the compiler is bigger than
the actual array.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When muxing, the AVStreams' side-data is typically set
by the caller before avformat_write_header();
it is not documented to be else. Yet the Matroska muxer
added an AVStereo3D side data if certain metadata
was present:
Since commit 4d686fb721
(adding support for AVStereo3D stream side-data),
the Matroska muxer checked certain stream tags that
contain Matroska's StereoMode and (if they are present)
converted this value into an AVStereo3D struct that
gets attached to the AVStream (reusing a function from
the demuxer). Afterwards the AVStereo3D side data struct
(whether it has just been added by the muxer or not) gets
parsed and converted back into a Matroska StereoMode.
Besides being an API violation this change broke
StereoMode values without a corresponding AVStereo3D
(namely the anaglyph ones).
This commit fixes this: A StereoMode given via tags
is now used-as-is; if no such tag exists and an AVStereo3D
side data exists, it is converted into the corresponding
StereoMode (if possible). This approach also fixes
handling of the anaglyph ones; the changes to the
matroska-stereo_mode are due to this.
The new STEREOMODE_STEREO3D_MAPPING has been put to
good use for this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It will allow to create tables for easy conversion from AVStereo3D
to stereomode and back again as well as derive the properties
of a given stereomode.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It has undefined behaviour in case the value does not fit into an int.
Also stop allowing to override a stream level "alpha_mode" tag
by an AVFormatContext one and properly check that the stereo_mode
number given via a tag is actually in the range 0..14: Negative
values would have been treated as zero before this patch.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When no output video framerate is specified by the user with -r or can
be inferred from the filtergraph, encoder setup will arbitrarily decide
that the framerate is 25fps. However, making up any framerate value for
VFR encoding is at best unnecessary.
Changes the results of the sub2video tests, where the input timebase is
now used instead of 1/25.
Mainly this fixes handling special values of -enc_time_base ('demux' or
'filter') for audio. It also prints a warning if -enc_time_base is
specified for subtitles, instead of ignoring it silently (current
subtitle encoding API only works with AV_TIME_BASE_Q).
This function converts packet timestamps from the input stream timebase
to OutputStream.mux_timebase, which may or may not be equal to the
actual output AVStream timebase (and even when it is, this may not
always be the optimal choice due to bitstream filtering).
Just keep the timestamps in input stream timebase, they will be rescaled
as needed before bitstream filtering and/or sending the packet to the
muxer.
Move the av_rescale_delta() call for audio (needed to preserve accuracy
with coarse demuxer timebases) to write_packet.
Drop now-unused OutputStream.mux_timebase.
Bitstream filtering input timebase is not always necessarily equal to
OutputStream.mux_timebase. Also, set AVPacket.time_base correctly for
packets output by bitstream filters
Do not rescale at all in of_output_packet() when not doing bitstream
filtering, as it's unnecessary - write_packet() will rescale to the
actual muxer timebase.
When the marker writing code was merged from libav to FFmpeg
in dc62016c, it failed to take into account that the meaning of
cluster_pos had changed in bda5b662; in particular, the special
value for “I'm not currently working on a cluster” had changed
from 0 to -1. This makes the avio_write_marker() call never
be called. Update the if statement to fix it.
Fixes: Ticket9843
Signed-off-by: Steinar H. Gunderson <steinar+ffmpeg@gunderson.no>
Signed-off-by: Martin Storsjö <martin@martin.st>
The loader ensures only that functions with tagged supported extensions
exist, rather than ensuring only those with supported extensions are
loaded.
As the init function uses Vulkan functions, whose loading requires them
to have the extension flags set, the extension flags are guaranteed
to also exist at this point.
Can be used to configure libplacebo's underlying raw options, which
sometimes includes new or advanced / in-depth settings not (yet) exposed
by vf_libplacebo.
This new upstream struct simplifies params struct management by allowing
them to all be contained in a single dynamically allocated struct. This
commit switches to the new API in a backwards-compatible way.
The only nontrivial change that was required was to handle
`sigmoid_params` in a way consistent with the rest of the params
structs, instead of setting it directly to the upstream default.
This prevents code duplication in the source form by calling the parse
code that was moved to avcodec last commit. The code will be duplicated
in binary form for shared builds (it's not that large), but for source
code it will only exist in one location now.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Add a full parser to libavcodec for AV_CODEC_ID_JPEGXL. It finds the
end of the stream in order to packetize the codec, and it looks at
the headers to set preliminary information like dimensions and pixel
format.
Note that much of this code is duplicated from avformat/jpegxl_probe.c,
but that code will be removed and call this instead in the next commit.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Before this commit, the decoder erroneously assumes that the AVFrame
passed to the receive_frame is the same one each time. Now it keeps an
internal AVFrame to write into, and copies it over when it's done.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Fixes an error that's caused by decoding a grayscale JXL image after an
RGB image is decoded, with the same decoder instance.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
OpenVINO API 2.0 was released in March 2022, which introduced new
features.
This commit implements current OpenVINO features with new 2.0 APIs. And
will add other features in API 2.0.
Please add installation path, which include openvino.pc, to
PKG_CONFIG_PATH mannually for new OpenVINO libs config.
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
check_func() might return NULL, in which case the function is not to be
benched. Introduced in cc679054c7.
Signed-off-by: Matthias Dressel <code@deadcode.eu>
Signed-off-by: Martin Storsjö <martin@martin.st>
So far, AV_READ_TIME would return the cycle counter. This posed two
problems:
1) On recent systems, it would just raise an illegal instruction
exception. Indeed RDCYCLE is blocked in user space to ward off some
side channel attacks. In particular, this would cause the random
number generator to crash.
2) It does not match the x86 behaviour and the apparent original intent
of AV_READ_TIME in the functional code base (outside test cases).
So this replaces the cycle counter with the time counter. The unit is
a platform-dependent constant fraction of time, and the value should be
stable across harts (RISC-V lingo for physical CPU thread).
It is currently probably not possible for it to be negative as
the needed 2Mb input buf size is not achievable. But it is more
robust to check for it too.
If it would become negative than code like
s->samples[0][n] = s->samples[0][s->nb_samples + n];
would crash
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -1403461578 + -843974775 cannot be represented in type 'int'
Fixes: 60868/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MPEG1VIDEO_fuzzer-4599793035378688
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The fate-run.sh shell script exports LC_ALL=C before invoking the
test executables; this is probably done for consistency.
When executing Windows binaries with Wine, it normally handles
UTF-8 command line parameters just fine - but with LC_ALL set to
C, it treats them as plain ASCII.
As the unicode command line parameters wasn't the main thing
being tested here, just convert them to plain ASCII, for
portability. This fixes the test for all test configurations that
use Wine.
Signed-off-by: Martin Storsjö <martin@martin.st>
ddc1cd5cdd defined WIN32_LEAN_AND_MEAN
globally, which makes for much fewer transitive includes from
windows.h. With that define, CoTaskMemFree no longer gets
implicitly declared by just including windows.h, but one has to
include the right header objbase.h too.
That commit caused ole32 to no longer get detected, which caused
dxva2 to no longer be enabled. This gets fixed by this patch.
Signed-off-by: Martin Storsjö <martin@martin.st>
This commit fixes bug #10495
The code had several bugs related to post-loop compensation code:
- test assembly instruction performs bitwise AND operation and
generate flags used by jz branch instruction. Wrong test condition
leads to incorrect branching
- Incorrect compensation code for some branches
Signed-off-by: Evgeny Pavlov <lucenticus@gmail.com>
This abstraction is similar to the existing one for pthread_mutex_t and
pthread_once_t functions, and should reduce the amount of ifdeffery used
in future code.
Signed-off-by: James Almer <jamrial@gmail.com>
C++ doesn't support designated initializers until C++20. We have
a bunch of pre-defined channel layouts, the gains to make them
usable in C++ exceed the losses.
Bump minor version so C++ project can check before use these defines.
Also initialize .opaque field explicitly to reduce warning in C++.
Signed-off-by: James Almer <jamrial@gmail.com>
if !ph_deblocking_params_present_flag is true, ph_deblocking_filter_disabled_flag infered from pps
if !sh_deblocking_params_present_flag is true, sh_deblocking_filter_disabled_flag infered from ph
Failed clips:
ENT444MAINTIER_C_Sony_3.bit
ENT444HIGHTIER_D_Sony_3.bit
Signed-off-by: James Almer <jamrial@gmail.com>
if pps_alf_info_in_ph_flag is true
sh_alf_enabled_flag infered from ph
Failed clip:
LTRP_A_ERICSSON_3.bit
Signed-off-by: James Almer <jamrial@gmail.com>
if sh_picture_header_in_slice_header_flag is true
sh_lmcs_used_flag and sh_explicit_scaling_list_used_flag are infered from ph
Failed clips:
LMCS: CLM_A_KDDI_2.bit STILL444_A_KDDI_1.bit
Scaling: SCALING_B_InterDigital_1.bit SCALING_A_InterDigital_1.bit
Signed-off-by: James Almer <jamrial@gmail.com>
By default the OpenEXR decoder outputs linear light pixel data by
applying a gamma=1.0 transfer (i.e. a no-op). When it does so, it
should tag the data as linear so color-managed filters or other tools
can work with it correctly.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
If user doesn't set framerate when he creates a filter, the filter uses
default framerate {0, 1}. This causes error when setting timebase to
1/framerate. Now change it to pass inlink->time_base to outlink when
framerate is not set.
This patch fixes ticket: #10476#10468
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Including winsock2.h or windows.h without WIN32_LEAN_AND_MEAN cause
bzlib.h to parse as nonsense, due to an instance of #define char small
in rpcndr.h.
See:
https://stackoverflow.com/a/27794577
Signed-off-by: L. E. Segovia <amy@amyspark.me>
Signed-off-by: Martin Storsjö <martin@martin.st>
because the flv specification said the video frametype
should use value range from 0x00 to 0x70,
so use 0xF0 have no problem before support enhanced flv,
but the 0xF0 will get incorrect result after support enhanced flv,
so should set the video frametype mask 0x70 to make it correct now.
Reported-By: flvAnalyser <hybase@qq.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Writing the duration SimpleTag is special: It's size is
reserved in advance via an EBML Void element (if seekable)
and this reserved space is overwritten when writing the trailer;
it does not use put_ebml_string().
The string to write is created via snprintf on a buffer
of size 20; this buffer is then written via put_ebml_binary()
with a size of 20.
EBML strings need not be zero-terminated; if not, they
are implicitly terminated by the element's length field.
snprintf() always zero-terminates the buffer, i.e.
the last byte can be discarded when using an EBML string.
This patch does this.
The FATE changes are as expected: One byte saved for every
track; the only exception is the matroska-qt-mode test:
An additional byte is saved because an additional byte
could be saved from the enclosing Tags length field.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Do it only for video (the only thing for type for which HDR10+
makes sense).
This effectively reverts changes to several FATE ref-files
made in bda44f0f39.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Pass it as void* instead. While just at it, also constify
the pointee of AVDOVIDecoderConfigurationRecord* in
ff_isom_put_dvcc_dvvc().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These two AVIOContexts currently coincide, but this is not
guaranteed to remain so (in fact, I have plans to write each
TrackEntry into its own AVIOContext).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is only WebVTT which is special in WebM; hypothetical future
subtitle codecs in WebM will presumably use the ordinary code.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Specifically, test copying a channel layout with custom order,
so that the allocation codepath of av_channel_layout_copy()
is executed.
Reviewed-by: Nicolas George <george@nsup.org>
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
av_channel_name(), av_channel_description() and
av_channel_layout_describe() are supposed to return the size
of the needed buffer to allow the user to check for truncation;
the documentation ("If the returned value is bigger than buf_size,
then the string was truncated.") confirms that size does not
mean strlen.
Yet the AVBPrint API, i.e. AVBPrint.len, does not account for
the terminating '\0'. Therefore the returned length is off by one.
Furthermore, also check for whether the returned value actually
fits in an int (which is the return value of these functions).
Reviewed-by: Nicolas George <george@nsup.org>
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The AVBPrint API guarantees that the string buffer is always
zero-terminated; in order to honour this guarantee, there
obviously must be a string buffer at all and it must have
a size >= 1. Therefore av_bprint_init_for_buffer() treats
passing a NULL buffer or size == 0 as invalid data that
leads to undefined behaviour, namely NPD in case NULL is provided
or a write to a buffer of size 0 in case size == 0.
But it would be easy to support this, namely by using the internal
buffer with AV_BPRINT_SIZE_COUNT_ONLY in case size == 0.
There is a reason to allow this: Several functions like
av_channel_(description|name) are actually wrappers
around corresponding AVBPrint functions. They accept user
provided buffers and are supposed to return the required
size of the buffer, which would allow the user to call
it once to get the required buffer size and call it once
more after having allocated the buffer.
If av_bprint_init_for_buffer() treats size == 0 as invalid,
all these users would need to check for this themselves
and basically add the same codeblock that this patch
adds to av_bprint_init_for_buffer().
This change is in line with e.g. snprintf() which also allows
the pointer to be NULL in case size is zero.
This fixes Coverity issues #1503074, #1503076 and #1503082;
all of these issues are about providing NULL to the channel-layout
functions that are wrappers around AVBPrint versions.
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The demuxer uses a extradata offset of 26, so we would need
to recreate the missing 26 bytes somehow in the muxer, but
we just don't. Remuxed files (like real/rv30.rm from the FATE-suite)
don't work due to missing extradata.
(The extradata offset also applies to RV40 and the extradata
is indeed lost upon remuxing, yet remuxing real/spygames-2MB.rmvb
works; our RV40 decoder does not use extradata at all.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
RV10 and RV20 are unsupported because creating the correct CodecPrivate
is unsupported (the demuxer uses a codecpriv_offset of 26, so one
would need to recreate the missing 26 bytes); COOK and SIPR are
unsupported, because Matroska uses a packetization mode that is
different from what FFmpeg uses in its packets (see
matroska_parse_rm_audio() in the demuxer).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Provides coverage for the code transforming the ALAC extradata.
Also set creation_time metadata to test this, too.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
bda44f0f39 added code that
potentially added another BlockMore master and BlockAdditional
data as well as BlockAddID number, yet it bumped the number
of EBML elements by four instead of only three.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Matroska supports orthogonal transformations (both pure rotations
as well as reflections) via its 3D-projection elements, namely
ProjectionPoseYaw (for a horizontal reflection) as well as
ProjectionPoseRoll (for rotations). This commit adds support
for this.
Support for this in the demuxer has been added in
937bb6bbc1 and
the sample used in the matroska-dovi-write-config8 FATE-test
includes a displaymatrix indicating a rotation which is now
properly written and read, thereby providing coverage for
the relevant code in the muxer as well as the demuxer.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Note: There is a slight difference in the handling of
the max_file_size option: The earlier code used it to mean
to limit the size of the buffer to allocate; the new code
treats it more literally as maximum size to read from
the input.
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is a bit cleaner as int need not be the underlying type
of an enum if a smaller type can hold all its values.
Also declare the children_ids array as const as it never changes.
Reviewed-by: Stefano Sabatini <stefasab@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also rename the contexts and the functions so their names will reflect their
intended size.
With the earlier patch this fixes the audio corruption regression caused by
6ba0aa1770.
Fixes ticket #10029.
Signed-off-by: Marton Balint <cus@passwd.hu>
Improves the audio corruption regression caused by
6ba0aa1770 reported in ticket #10029.
There is still however a noticable audio glitch, so the FFT conversion to AVTX
probably also needs some modifications.
Signed-off-by: Marton Balint <cus@passwd.hu>
This patch changes the return instruction in the tr_32x4 macro from
BR to RET.
Function returns should always use the RET instruction instead of BR,
to avoid interfering with branch prediction.
On devices that support BTI, this is observeable as a landing pad is
required when branching with BR. The change fixes
fate-hevc-hdr-vivid-metadata when on hardware with BTI support.
Signed-off-by: Casey Smalley <casey.smalley@arm.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
This commit is the AVHWAccel analogue of commit
20f9727018: It moves the private fields
of AVHWAccel to a new struct FFHWAccel extending AVHWAccel
in an internal header (namely hwaccel_internal.h).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
All usages of ff_hwaccel_frame_priv_alloc() have the same pattern:
Check for whether a hwaccel is in use; check whether it needs
private frame-specific data; allocate the AVBuffer and set
it.
This commit modifies ff_hwaccel_frame_priv_alloc() to perform
this task on its own.
(It also seems that the H.264 decoder did not perform proper
cleanup in case the buffer could not be allocated. This has been
changed.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
libavcodec/hwconfig.h currently contains HWACCEL_CAP_* flags
as well as the definition of AVCodecHWConfigInternal and some
macros to create them.
The users of these two are nearly disjoint: The flags are used
by files providing AVHWAccels whereas AVCodecHWConfigInternal
is used by files providing codecs (for FFCodec.hw_configs).
This patch therefore moves these flags to a new file hwaccel_internal.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
A filter needs formats.h iff it uses FILTER_QUERY_FUNC();
since lots of filters have been switched to use something
else than FILTER_QUERY_FUNC, they don't need it any more,
but removing this header has been forgotten.
This commit does this; files with formats.h inclusion went down
from 304 to 139 here (it were 449 before the preceding commit).
While just at it, also improve the other headers a bit.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
internal.h doesn't rely on it; instead include it directly
in every user that needs it (a filter needing it is basically
equivalent to it using FILTER_QUERY_FUNC, i.e. a majority of
filters doesn't need it).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This file is only used by the greyedge filter and therefore
only compiled if said filter is enabled. This also allows
to remove a config_components.h inclusion, avoiding unnecessary
rebuilds.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Lots of video filters use a very simple input or output:
An array with a single AVFilterPad whose name is "default"
and whose type is AVMEDIA_TYPE_VIDEO; everything else is unset.
Given that we never use pointer equality for inputs or outputs*,
we can simply use a single AVFilterPad instead of dozens; this
even saves .data.rel.ro (8312B here) as well as relocations.
*: In fact, several filters (like the filters in vf_lut.c)
already use the same outputs; furthermore, ff_filter_alloc()
duplicates the input and output pads so that we do not even
work with the pads directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
internal.h does not depend on video.h (and should not depend on it)
and therefore should not include video.h at all; instead all users
of video.h should include it directly.
Doing so also avoids unnecessary video.h inclusions in files that
don't need it, like most audio filters.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Lots of audio filters use very simple inputs or outputs:
An array with a single AVFilterPad whose name is "default"
and whose type is AVMEDIA_TYPE_AUDIO; everything else is unset.
Given that we never use pointer equality for inputs or outputs*,
we can simply use a single AVFilterPad instead of dozens; this
even saves .data.rel.ro (4784B here) as well as relocations.
*: In fact, several filters (like the filters in af_biquads.c)
already use the same inputs; furthermore, ff_filter_alloc()
duplicates the input and output pads so that we do not even
work with the pads directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
h264_sei.h is no longer used since the SEIs were moved to sei.h;
this also avoids inclusions of avcodec.h and bytestream.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Don't include them implicitly via avcodec.h. This avoids
indirect avcodec.h inclusions in lavc/dirac.c, lavf/oggparsedirac.c,
and lavf/rtp(dec|enc)_vc2hq.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This avoids including lavc/codec_desc.h everywhere and thereby
forces users to include it directly instead of lazily and potentially
unknowingly relying on indirect inclusions.
Also add the proper inclusion to libavformat/demux.c, one of the
two files that actually use the new field.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Most of the inline functions in h264dec.h are only used
by h264_cavlc.c and h264_cabac.c. Therefore move them
to the common header for these two, namely h264_mvpred.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This would only be necessary if this header declared a function
that takes a (pointer to) struct AVCodecContext as parameter.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In WinRT mode, we use CreateThread instead of _beginthreadex.
CreateThread takes a LPTHREAD_START_ROUTINE function pointer,
which has got the signature DWORD WINAPI ThreadProc(LPVOID).
_beginthreadex takes a function with the signature
unsigned __stdcall func(void *).
DWORD is defined as an unsigned long, which is different type
from unsigned int, even if they have the same size on Windows.
This fixes build failures with Clang 16 and newer, where function
pointer type mismatches are a fatal error by default.
Signed-off-by: Martin Storsjö <martin@martin.st>
This fixes the test when running in a cross test setup where the
samples are located at a different path between build host and
temote test target.
Signed-off-by: Martin Storsjö <martin@martin.st>
Use the GCC specific codepath for Clang in MSVC mode too.
This matches the condition used in a number of other places.
MSVC doesn't have a way to signal potential aliasing, while GCC
(and Clang) can use __attribute__((may_alias)) for this purpose.
When building with Clang in MSVC mode, __GNUC__ isn't defined but
_MSC_VER is as Clang primarily impersonates MSVC - but even then it
does support the GCC style attributes.
The GCC specific codepath uses av_alias, which expands to
the may_alias attribute if supported. The MSVC specific codepath
doesn't use av_alias so far (as MSVC doesn't support any
corresponding attribute).
This fixes a couple HEVC decoder tests when built with Clang 14 or
newer in MSVC mode (with issues observed on all of x86_64, armv7
and aarch64).
Signed-off-by: Martin Storsjö <martin@martin.st>
Having a decode_slice callback is mandatory for HWAccels;
the only exception to this (and the reason why these checks
were added) was XvMC, but it is no more since commit
be95df12bb.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise the var_names and the corresponding enum will be off
and e.g. the array holding the variable values will be too small.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
User may set color range / matrix coefficient set / primaries / transfer
characteristics for output.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Use the bytestream2-API instead.
Should also fix Coverity issue #1473536 (which is about an unchecked
init_get_bits8()).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
nvenc_store_frame_data() is always called with frame != NULL
(checked at the beginning of nvenc_send_frame());
in fact, frame is dereferenced unconditionally after the block
guarded by the check for frame. Therefore Coverity complains
about this in issue #1538295.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The code currently presumes that a return value of AVERROR(ENOMEM)
implies that ac3hdr could not be allocated, so it need not be freed.
Yet any avpriv_ac3_parse_header() might allocate more than the
AC3HeaderInfo internally (it doesn't currently), so simply free
it unconditionally.
Fixes Coverity issues #1492870 and #1492868.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The discrepancy between the actual definition and the declarations
in hwaccels.h is actually UB.
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
We've been fuzzing torchvision with [sydr-fuzz](https://github.com/ispras/oss-sydr-fuzz)
and found out of bounds error in ffmpeg project at audioconvert.c:151.
To prevent error we need to fix checks for in and out fmt in swr_init.
Signed-off-by: Eli Kobrin <kobrineli@ispras.ru>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
libavutil/hwcontext_qsv.c: In function ‘qsv_map_to’:
libavutil/hwcontext_qsv.c:1905:47: warning: cast from pointer to integer
of different size [-Wpointer-to-int-cast]
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
These functions allow not only to read and write unsigned values,
but also to check ranges and to emit trace output which can be
beautified when processing arrays (indices like "[i]" are replaced
by their actual numbers).
Yet lots of callers actually only need something simpler:
Their range is only implicitly restricted by the amount
of bits used and they are not part of arrays, hence don't
need this beautification.
This commit adds specializations for these callers;
this is very beneficial size-wise (it reduced the size
of .text by 23312 bytes here), as a call is now cheaper.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The H.264 decoder, the only codec with which this code
is ever called, does not set AVCodec.pix_fmts.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Change some internal APIs a bit to make it harder to make
such mistakes.
In particular, have the read chunk functions return an error
when the result is incomplete.
This might be less flexible, but since there has been no
use-case for that so far, avoiding coding mistakes seems better.
Add a function to queue a AVBPrint directly (ff_subtitles_queue_insert_bprint).
Also fixes a leak in lrcdec when ff_subtitles_queue_insert fails.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
ret can be given an argument instead.
This is also consistent with how other assembler code
in FFmpeg does it.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Pointers to these functions are used in comparisons.
Currently the compiler has to presume the worst for these,
namely that the functions are from another DSO and therefore
loads their addresses from the GOT (which also entails a
relocation entry that is processed at runtime, regardless
of whether the code using them is run or not). This changes
after these functions are declared as hidden.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is the more proper place for them given that this is
the only API using them.
Also use a forward-declaration of AVCodecContext in fdctdsp.h
to avoid including avcodec.h in jfdct(fst|int).c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
libavcodec/vulkan_video_codec_av1std.h currently does not pass
checkheaders: It is missing stdint.h and vulkan/vulkan_core.h.
The comment "This header is NOT YET generated from the Khronos Vulkan
XML API Registry." as well as the fact that it does not use our standard
inclusion guards makes the file appear as if it is to be treated
like a third-party header and not one of our own. This commit
therefore "fixes" the issue by unconditionally skipping said header.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The contents are full TTML XML documents. TTML writing tests'
results are updated as the streams are now properly identified
as TTML ones.
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
The commits eac4324bfb and
cd8211527e renamed the examples, but the
targets were not updated. Hence, the builds are missing -lm.
Signed-off-by: Sebastian Ramacher <sramacher@debian.org>
Signed-off-by: James Almer <jamrial@gmail.com>
The hevc parser parses the diagonal scan order in bitstream into raster
scan order. However, the Vulkan spec wants it as specified in H265 spec,
which is diagonal scan order.
Tested on RADV.
v2: fix copy-paste typo with PPS.
A POLLERR occurs when libavcodec attempts to dequeue output buffers
before enqueuing capture buffers. This could happen to an application
deciding to send the first coded packet. Suppress these POLLERRs when
the buffers are uninitialized and avoid crashing because of enumerating
uninitialized buffers.
See https://trac.ffmpeg.org/ticket/9957 for the original bug report.
Signed-off-by: Richard Acayan <mailingradian@gmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
This allows this common H.274 SEI to be parsed from both H.264
as well as HEVC, as well as probably from VVC in the future.
Generally attempts to keep the original code as similar as possible.
FATE test refererence changes only change the order of side data
export within a single frame. Nothing else seems to have changed.
This allows this common H.274 SEI to be parsed from both H.264
as well as HEVC, as well as probably from VVC in the future.
Generally attempts to keep the original code as similar as possible.
FATE test refererence changes only change the order of side data
export within a single frame. Nothing else seems to have changed.
The unchecked read caused the 2nd subsequent tell call to move backward resulting
in a negative length
Fixes: assertion failure
Fixes: 60276/clusterfuzz-testcase-minimized-ffmpeg_BSF_TRACE_HEADERS_fuzzer-5434126636023808
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: vmixdec.c:132:34: runtime error: signed integer overflow: -2147483648 * 1856 cannot be represented in type 'int'
Fixes: vmixdec.c:119:20: runtime error: signed integer overflow: -1256 + -2147483648 cannot be represented in type 'int'
Fixes: vmixdec.c:137:36: runtime error: signed integer overflow: 2147483416 * 16 cannot be represented in type 'int'
Fixes: 59843/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VMIX_fuzzer-4857434624360448
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The size offset was previously being accounted for in flv_set_video_codec
for h264 and mpeg4, instead of being directly accounted for in the spot
where its read, which desynced on HEVC streams.
For clarity, move the size offset directly to the parsing, similar to
how its done for all other header fields.
glcontext was added under CONFIG_SDL2
libavdevice/opengl_enc.c: In function ‘opengl_draw’:
libavdevice/opengl_enc.c:1204:15: error: ‘OpenGLContext’ has no member
named ‘glcontext’
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Fixed: signed integer overflow: -2 * -1085502286 cannot be represented in type 'int'
Fixed: 57986/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_FIXED_fuzzer-5123651145170944
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The implementation does not support k=32
Fixes: shift exponent 32 is too large for 32-bit type 'unsigned int'
Fixes: 57976/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WAVARC_fuzzer-5911925807775744
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2147443649 + 65535 cannot be represented in type 'int'
Fixes: 60054/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_RKA_fuzzer-5095674572832768
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
major/minor are in <sys/types.h> on BSDs and <sys/mkdev.h> on Solaris-like.
libavutil/hwcontext_vulkan.c:55:10: fatal error: 'sys/sysmacros.h' file not found
#include <sys/sysmacros.h>
^~~~~~~~~~~~~~~~~
Take vector reduction out of the loop and unroll.
Before:
audiodsp.scalarproduct_int16_c: 12321.0
audiodsp.scalarproduct_int16_rvv_i32: 4175.7
After:
audiodsp.scalarproduct_int16_c: 12320.5
audiodsp.scalarproduct_int16_rvv_i32: 1230.2
1) Take the reductive sum out of the loop,
leaving a regular vector addition in the loop.
2) Merge the addition and the multiplication.
3) Unroll.
Before:
scalarproduct_float_rvv_f32: 832.5
After:
scalarproduct_float_rvv_f32: 275.2
ffmpeg CLI pixel format selection for filtering currently special-cases
MJPEG encoding, where it will restrict the supported list of pixel
formats depending on the value of the -strict option. In order to get
that value it will apply it from the options dict into the encoder
context, which is a highly invasive action even now, and would become a
race once encoding is moved to its own thread.
The ugliness of this code can be much reduced by moving the special
handling of MJPEG into ofilter_bind_ost(), which is called from encoder
init and is thus synchronized with it. There is also no need to write
anything to the encoder context, we can evaluate the option into our
stack variable.
There is also no need to access AVCodec at all during pixel format
selection, as the pixel formats array is already stored in
OutputFilterPriv.
When -pix_fmt designates a BE/LE pixel format, it gets translated into
the native one by av_get_pix_fmt(). This may not always be the best
choice, as the encoder might only support one endianness. In such a
case, explicitly choose the endianness supported by the encoder.
While this is currently redundant with choose_pixel_fmt() in
ffmpeg_filter.c, the latter code will be deprecated in following commits.
AVISYNTH_INTERFACE_VERSION 10 fell in-between the releases of
3.7.2 and 3.7.3, and is required to be able to read the channel
layout information.
Signed-off-by: Stephen Hutchinson <qyot27@gmail.com>
This cannot beat the Zbb implementation, and it is unlikely that a real
meaningful CPU design would support V and not Zbb. The best loop rewrite
that I could come up with (4 shifts, 2 ands, 3 ors) is still ~40% slower
than Zbb.
A proper faster vector implementation should be feasible with the
cryptographic vector extensions, but that is a story for another time.
The code was blindly assuming that Zbb or V implied Zba. While the
earlier is practically always true, the later broke some QEMU setups,
as V was introduced earlier than Zba.
I cannot find the spec, but according to the original commit
d4fdba0df7, it's CAVS. e571305a71 changed it to AVS by
accident. Ten years on, nothing happened. We still have the
sample [1], however, since there is no cavs_mp4tofoobar bsf, the
cavs decoder doesn't work. I don't know if there is any use case.
[1] https://samples.ffmpeg.org/AVS/AVSFileFormat/AVSFileFormat.mp4
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Even though they have the same size, and typically the same alignment,
uint32_t and float are under no circumstances compatible types in C.
The casts from float * to uint32_t * are invalid here. Insofar as the
resulting pointers are dereferenced, this is undefined behaviour.
This patch uses av_float2int() / av_int2float() instead.
Except for add_squares, telling the compiler that the output vector(s)
cannot alias helps quite a bit (cycles on SiFive U74-MC):
ps_add_squares_c: 98277.7
ps_add_squares_r: 98320.2
ps_hybrid_analysis_c: 3731.2
ps_hybrid_analysis_r: 2495.7
ps_hybrid_analysis_ileave_c: 20478.0
ps_hybrid_analysis_ileave_r: 16092.2
ps_hybrid_synthesis_deint_c: 19051.5
ps_hybrid_synthesis_deint_r: 15420.0
ps_mul_pair_single_c: 122941.2
ps_mul_pair_single_r: 91035.0
Encoders (usually) have no business modifying frame->data
(which need not be writable), so they should use the appropriate
pointers.
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also allocate the AVFrame during init and use av_frame_replace()
to replace it later.
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
frame is always != NULL for audio and video here
(this is checked by an assert and the frame is already dereferenced
before it reaches this code here).
Fixes Coverity issue #1538858.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This increases the group multiplier as per T-Head C910 benchmarks:
inverse_coupling_c: 4597.0
inverse_coupling_rvv_i32: 1312.7 (m1)
inverse_coupling_rvv_i32: 1116.7 (m2)
inverse_coupling_rvv_i32: 732.2 (m4)
inverse_coupling_rvv_i32: 898.0 (m8)
av_random_bytes() can use OS provided strong random functions and does not
depend soley on openssl/gcrypt external libraries.
Fixes ticket #10441.
Signed-off-by: Marton Balint <cus@passwd.hu>
map_func is supposed to be an array of const pointer to function
returning int, not an array of pointer to function returning const int.
Reported-By: Martin Storsjö
This is not actually used for anything. The configure check causes the
CPU feature flag to be set, but nothing consumes it at all.
While AArch64 does have VFP, it is only used for the scalar C code.
Conversely, it is still possible to disable VFP, by changing the
C compiler flags as before (though that only makes sense for an
hypothetical non-standard Armv8 platform without VFP).
Note that this retains the "vfp" option flag, for backward
compatibility and on the very remote but theoretically possible chance
that FFmpeg actually makes use of it in the future.
AV_CPU_FLAG_VFP is retained as it is actually used by AArch32.
The IMDCT offset is only relevant for NEON optimisations. There are no
VFP optimisations here that would justify the HAVE_VFP flag. In
practice, this makes no difference because HAVE_NEON is practically
always true for standard Armv8 platforms.
Requires a new upstream function to test not for *import* support on a
given output pixel format, but also whether we can render to it.
Fixes: https://github.com/haasn/libplacebo/issues/173
Stop claiming the argument is always a floating point number, which
* confuses floating point and decimal numbers
* is not always true even accounting for the above point
Read the timebase from FrameData rather than the input stream. This
should fix#10393 and generally be more reliable.
Replace the use of '-1' to indicate demuxing timebase with the string
'demux'. Also allow to request filter timebase with
'-enc_time_base filter'.
It now contains data from multiple sources, so group those items that
always come from the decoder. Also, initialize them to invalid values,
so that frames that did not originate from a decoder can be
distinguished.
This is possible now that enc_open() is always called with a non-NULL
frame for audio/video.
Previously the code would directly reach into the buffersink, which is a
layering violation.
When no frames were passed from a filtergraph to an encoder, but the
filtergraph is configured (i.e. has output parameters), encoder flush
code will use those parameters to initialize the encoder in a last-ditch
effort to produce some useful output.
Rework this process so that it is triggered by the filtergraph, which
now sends a dummy frame with parameters, but no data, to the encoder,
rather than the encoder reaching backwards into the filter.
This approach is more in line with the natural data flow from filters to
encoders and will allow to reduce encoder-filter interactions in
following commits.
This code is tested by fate-adpcm-ima-cunning-trunc-t2-track1, which (as
confirmed by Zane) is supposed to produce empty output.
This line was added in c30a4489b4 along with
AVStream.sample_aspect_ratio. However, configuring SAR for video
encoding is now done after this code (specifically in enc_open(), which
is called when the first video frame to be encoded is obtained), so this
line cannot have any meaningful effect.
Replace duplicated(!) and broken* custom string parsing with
av_dict_parse_string(). Return error codes instead of aborting.
* e.g. it treats NULL returned from av_get_token() as "separator not
found", when in fact av_get_token() only returns NULL on memory
allocation failure
This decoding flag makes decoders drop all frames after a parameter
change, but what exactly constitutes a parameter change is not well
defined and will typically depend on the exact use case.
This functionality then does not belong in libavcodec, but rather in
user code
Copy packet side data to the output frame in ff_decode_frame_props_from_pkt()
instead of in discard_samples(), having the latter only applying the skip if
required.
This will be useful for the following commit.
Signed-off-by: James Almer <jamrial@gmail.com>
Prevent the fifo used in encoding VPx videos from filling up and
stopping encode when it reaches 21845 items, which happens when the
video has more than that number of frames.
Incorporated suggestion from James Zern to prevent calling
frame_data_submit() at all when performing the first pass of a 2-pass
encode so the fifo is not filled at all; replaces original patch which
drained the fifo after filling to prevent it from becoming full.
Fixes the regression originally introduced in
5bda4ec6c3
Co-authored-by: James Zern <jzern@google.com>
Signed-off-by: David Lemler <david@lemler.family>
Signed-off-by: James Zern <jzern@google.com>
In some versions of libsvtav1, setting intra_period_length to 0
does not produce the intended result (i.e.) all frames produced
are not keyframes.
Instead handle the gop_size == 1 as a special case by setting
the pic_type to EB_AV1_KEY_PICTURE when encoding each frame so
that all the output frames are keyframes.
SVT-AV1 Bug: https://gitlab.com/AOMediaCodec/SVT-AV1/-/issues/2076
Example command: ffmpeg -f lavfi -i testsrc=duration=1:size=64x64:rate=30 -c:v libsvtav1 -g 1 -y test.webm
Before: Only first frame is keyframe, rest are intraonly.
After: All frames are keyframes.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Add missing operand which clang complains about but GCC assumes it to be
'm1' if not specified.
Works around build failure with Clang:
| src/libswscale/riscv/rgb2rgb_rvv.S:88:25: error: operand must be e[8|16|32|64|128|256|512|1024],m[1|2|4|8|f2|f4|f8],[ta|tu],[ma|mu]
| vsetvli t4, t3, e8, ta, ma
| ^
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
EOF from sq_receive() means no packets will ever be output by the sync
queue. Since the muxing sync queue is always used by all interleaved
(i.e. non-attachment) streams, this means no further packets can make
it to the muxer and we can terminate muxing now.
Since “2d924b3a63 fftools/ffmpeg: move each muxer to a separate thread”,
opengl_write_packet() is called from a different thread than
opengl_write_header() and would nothing for lack of a selected context.
Mention encoder name in the message to emphasize that the value in
question is not supported by this specific encoder, not necessarily by
libavcodec in general.
Print a list of values supported by the encoder.
Useful for discovering bugs that depend on a specific thread count.
Use like THREADS=randomX for a random thread count from 1 to X, with
X=16 when not specified. Note that the thread count is different for
every test.
That feature is overkill for a constant pointer to AVFilterLink which
can be stored in AVCodecContext.opaque (indirectly, because the link is
not allocated yet at the time the codec is opened).
This also avoids leaking non-NULL AVFrame.opaque to callers.
Git master libjxl changed several function signatures, so this commit
adds some #ifdefs to handle the new signatures without breaking old
releases. Do note that old git master development versions of libjxl
will be broken, but no releases will be.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
From the spec: "It is a requirement of bitstream conformance that
the value of luma_bit_depth_entry_minus8 shall be equal to
the value of bit_depth_luma_minus8"; similarly for chroma.
Also fixes Coverity ticket #1529226 (complaining about the fact
that chroma_bit_depth_entry is checked twice).
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
I don't pretend to understand how we get into this situation, but there
are files out there where we can end up with the active PPS not being
identified when we call vk_hevc_end_frame. In these situations today,
we will segfault. So, before we give up, see if we can get the active
PPS id from the slice header, and use that if possible.
If that still doesn't work, return an error instead of segfaulting.
Also fix a couple of possible overflows while at it.
Fixes the negative initial timestamps in ticket #10358.
Signed-off-by: Marton Balint <cus@passwd.hu>
BMDTimeValue is defined as LONGLONG on Windows, but int64_t on Linux/Mac.
Fixes format string warnings:
libavdevice/decklink_enc.cpp: In function ‘void construct_cc(AVFormatContext*, decklink_ctx*, AVPacket*, klvanc_line_set_s*)’:
libavdevice/decklink_enc.cpp:424:48: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 4 has type ‘BMDTimeValue {aka long int}’ [-Wformat=]
ctx->bmd_tb_num, ctx->bmd_tb_den);
~~~~~~~~~~~~~~~ ^
libavdevice/decklink_enc.cpp:424:48: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 5 has type ‘BMDTimeValue {aka long int}’ [-Wformat=]
Signed-off-by: Marton Balint <cus@passwd.hu>
libavutil/random_seed.c calls arc4random_buf which is
not available on OSX 10.4 Tiger, but the configuration
script tests for arc4random which is available.
Fix the configuration test to match the actual API used.
Co-authored-by: James Almer <jamrial@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
When intra_refresh is enabled, gopLength is equal to
NVENC_INFINITE_GOPLENGTH. gopLength should be 1 at least.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Changes the result of fate-mxf-probe-dv25, where the bitrate is now
exported.
Also changes the result of fate-bsf-dv-error-marker, where the exported
bitrate is now different. Note that the codec layer bitrate does not
match the container bitrate, because container timing is 25fps, while
the DV profile is 50.
This way decoding errors will not be returned when the user starts
draining the decoder, avoiding confusion over whether draining did or
did not start.
Fixes failures of fate-h264-attachment-631 for certain numbers of frame
threads (e.g. 5).
Do it from ff_decode_get_packet() rather than from
avcodec_send_packet(). This way all nontrivial stages of the decoding
pipeline (i.e. other than just placing a packet at its entrance) are
pull-based rather than a mix of push an pull.
Decoding pipeline has multiple stages, some of which may have their own
delay (e.g. bitstream filters). The code currently uses
AVCodecInternal.draining to track all of them, but they do not have to
all be in sync.
The goal is to distinguish between APIs provided by the generic layer to
individual codecs and APIs internal to the generic layer.
Start by moving ff_{decode,encode}_receive_frame() and
ff_{decode,encode}_preinit() into this new header, as those functions
are called from generic code and should not be visible to individual
codecs.
For uint8_t buf[EVC_NALU_LENGTH_PREFIX_SIZE], &buf still points
to the beginning of buf, but it is of type "pointer to array of
EVC_NALU_LENGTH_PREFIX_SIZE uint8_t" (i.e. pointer arithmetic
would operate on blocks of size EVC_NALU_LENGTH_PREFIX_SIZE).
This is of course a different type than uint8_t*, which is why
there have been casts in evc_read_packet(). But these are unnecessary
if one justs removes the unnecessary address-of operator.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Not sure if the function naming frame_queue_destory is intended because
"destory" is not really a word. Changing it to "destroy" makes more sense.
Signed-off-by: QiTong Li <liqitong@163.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
example clips:
* 12b444vvc1_E_Sony_2
* 12b444Ietsrc_A_Kwai_2
* 10b444P16_D_Sony_2
* 12b444Iepp_A_Sharp_2
* 12b444SPetsrc_B_Kwai_2
Co-authored-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Add an optional filter_line3 to the available optimisations.
filter_line3 is equivalent to filter_line, memcpy, filter_line
filter_line shares quite a number of loads and some calculations in
common with its next iteration and testing shows that using aarch64
neon filter_line3s performance is 30% better than two filter_lines
and a memcpy.
Adds a test for vf_bwdif filter_line3 to checkasm
Rounds job start lines down to a multiple of 4. This means that if
filter_line3 exists then filter_line will not sometimes be called
once at the end of a slice depending on thread count. The final slice
may do up to 3 extra lines but filter_edge is faster than filter_line
so it is unlikely to create any noticable thread load variation.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
Exports C filter_line needed for tail fixup of neon code
Adds neon for filter_line
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
Adds clip and spatial macros for aarch64 neon
Exports C filter_edge needed for tail fixup of neon code
Adds neon for filter_edge
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
Adds an outline for aarch neon functions
Adds common macros and consts for aarch64 neon
Exports C filter_intra needed for tail fixup of neon code
Adds neon for filter_intra
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
Stop overwriting values from the bitstream arrays pps_tile_column_width_minus1
and pps_tile_row_height_minus1.
Signed-off-by: James Almer <jamrial@gmail.com>
Uses the existing code for av_get_random_seed() to return a buffer with
cryptographically secure random data, or an error if none could be generated.
Signed-off-by: James Almer <jamrial@gmail.com>
This ensures the requested amount of bytes is read.
Also remove /dev/random as it's no longer necessary.
Signed-off-by: James Almer <jamrial@gmail.com>
The HDR metadata should be removed after HDR to SDR conversion,
otherwise the output frame still has HDR side data.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
This makes the filter output match that of the C version.
It was left intentionally while we figured out if it was better
or not, and while it makes certain samples better, it makes static
samples jump around slightly.
The standard only defined configurationVersion 1.
configurationVersion 0 is for backward compatibility predates the
standard.
This patch reduces the chance that some malformated streams being
detected as hvcC.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
This is needed to ensure that AFD data continues to work when
capturing V210 video with the Decklink libavdevice input.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Support decoding and embedding VANC packets delivered via SMPTE 2038
into the SDI output. We leverage an intermediate queue because
data packets are announced separately from video but we need to embed
the data into the video frame when it is output.
Note that this patch has some additional abstraction for data
streams in general as opposed to just SMPTE 2038 packets. This is
because subsequent patches will introduce support for other
data codecs.
Thanks to Marton Balint for review/feedback.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
The existing queue initialization function would always sets it's
maximum queue size to ctx->queue_size. But because we are introducing
more queues we may want the sizes to differ between them.
Move the specification of the queue size into an argument, which can
be passed from the caller.
This patch makes no functional change to the behavior. It is being
made to accommodate Marton Balin's request to split out the queue
size for the new VANC queue being introduced in a later patch.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Use CFDictionarySetValue to enable low-latency encoding mode.
Since the key is a type of "EncoderSpecification", instead of
"CompressionProperty".
Signed-off-by: xufuji456 <839789740@qq.com>
Signed-off-by: Rick Kern <kernrj@gmail.com>
No need to generate intermediate files and probe them. We only care to know that the
output of the bsf excludes the frames in question, and a simple checksum is enough.
Signed-off-by: James Almer <jamrial@gmail.com>
VVCParserContext.au_info is only used once (and in a read-only manner);
but this happens immediately after au_info has been completely
overwritten. Therefore one can just the src structure used to overwrite
au_info directly and remove au_info.
This also means that the whole referencing and unreferncing of au_info
(which duplicates AVBufferRefs CodedBitstreamH266Context and is
therefore of dubious gain) can be removed, as can the AVBufferRef*
contained in PuInfo; this also removes a certain uglyness: Sometimes
these AVBufferRef* were ownership pointers and sometimes not.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Content-Type can include charset and boundary which is not a part of
mime type and shouldn't be copied as such.
Fixes HLS playback when the Content-Type includes additional fields.
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
The discrepancy between the definition and the declaration
in allformats.c is actually UB.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The discrepancy between the definition and the declaration
in allfilters.c is actually UB.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The discrepancy between the definition and the declaration
in parsers.c is actually UB.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Add demuxer to probe raw vvc and parse vvcc byte stream format.
Co-authored-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Add CodedBitstreamContext to parse VPS,SPS,PPS in VVC nal units.
Implement parsing and writing of SPS,PPS,VPS,PH,AUD,SEI and slices.
Add ff_cbs_type_h266 to cbs types tables and AV_CODEC_ID_H266
to cbs codec ids.
Co-authored-by: Thomas Siedel <thomas.ff@spin-digital.com>
Signed-off-by: James Almer <jamrial@gmail.com>
This combination is not working (it writes out of array)
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This partially reverts commit d0fc1b3507, which reintroduced a regression
originally fixed in 5e9986fd2d.
Signed-off-by: James Almer <jamrial@gmail.com>
When qsv device is created by device_derive, the ctx->free function is
not registered, causing potential memory leak because of not properly
closing the MFX session.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
The issue is that while decode_slice is guaranteed to never get
called without start_frame, end_frame is not. Moreover, it is
not guaranteed it won't be called twice.
On a badly-broken sample, this is what happens, which leads to
a segfault, as vp->slices_buf doesn't exist, as it has been handed
off for decoding already and isn't owned by the frame.
Return an error as it's indicative that it's a corrupt stream rather
than just missing any slices.
Prevents a segfault.
Should fix integer overflows, and improve encoding results.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
mov_try_read_block() allocates 1MB at least, which can be more than
enough. It was called when reading saiz box, which can appear
periodically inside fmp4. This consumes a lot of memory.
We can fix mov_try_read_block() by clamp 'block_size' with 'size'.
However, the function is harmful than helpful. It avoids allocating
large memory when the real data is small. Even in that case, if
allocating large memory directly failed, it's fine to return ENOMEM;
if allocating success and reading doesn't match the given size, it's
fine to free and return AVERROR_INVALIDDATA. In other cases, it's a
waste of CPU and memory.
So I decided to remove the function, and replace it by call
av_malloc() and avio_read() directly.
mov_read_saiz() and mov_read_pssh() need more check, but they don't
belong to this patch.
Fixes#7641 and #9243.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The old logic was trying to be excessively clever in "deducing" that the
user wanted to stretch/scale the image when ow/oh differed from iw/ih
aspect ratio. But this is almost surely unintended except in
pathological cases, and in those cases users should simply disable
normalize_sar and do all the stretching/scaling logic themselves. This
is especially important in multi-input mode, where the canvas may be
vastly different from the input dimensions of any stream. Also, passing
through input 0 SAR in multi-input mode is arbitrary and nearly useless,
so again force output SAR to 1:1 here.
Parse through all NALUs in a packet, pull new ones when a complete AU could not
be assembled, or keep them around if an AU was assembled while data remained in
them.
Signed-off-by: James Almer <jamrial@gmail.com>
Default GOP size is now set by preset and tuning info. This GOP size is only overwriten if -g value is provided.
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
As per the spec:
VUID-VkVideoDecodeH264PictureInfoKHR-sliceCount-arraylength
sliceCount must be greater than 0
VUID-VkVideoDecodeH265PictureInfoKHR-sliceSegmentCount-arraylength
sliceSegmentCount must be greater than 0
This particularly happens with seeking in field-coded H264.
This commit scraps a bool to signal to recreate the session parameters,
but instead destroys them, forcing them to be recreated.
As this can happen between start_frame and end_frame, do this
at both places.
Move the slice offsets buffer to the thread decode context.
It isn't part of the resources for frame decoding, the driver
has to process and finish with it at submission time.
That way, it doesn't need to be alloc'd + freed on every frame.
This requires using the new AVHWFramesContext.opaque field, as
otherwise, the profile attached to the decoder will be freed
before the frames context, rendering the frames context useless.
But ensure the value returned by evc_read_nal_unit_length() fits in an int.
Should prevent integer overflows later in the code.
Signed-off-by: James Almer <jamrial@gmail.com>
Fixed-point AAC decoder currently does not produce the same output on
all platforms. Until that is fixed, silence the audio stream using the
volume filter.
Also, actually use the aac_fixed decoder as was the original intent.
Before the introduction of AV_CODEC_ID_TIMED_ID3 for timed_id3 metadata streams
in mpegts (commit 4a4437c0fb), AV_CODEC_ID_SMPTE_KLV
was the only existing codec for metadata.
It seems that this codec has a 5-bytes metadata header[1] that, for some reason,
was always skipped when decoding data packets.
However, when working with a AV_CODEC_ID_TIMED_ID3 streams, this results in the
5 first bytes of the payload being cut-off, which includes essential informations
such as the ID3 tag version.
This patch fixes the issue by keeping the 5-bytes skip only for AV_CODEC_ID_SMPTE_KLV
streams.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Use the gcd of all input timebases to ensure PTS accuracy. For the
framerate, just pick the highest of all the inputs, under the assumption
that we will render frames with approximately this frequency. Of course,
this is not 100% accurate, in particular if the input frames are badly
misaligned. But this field is informational to begin with.
Importantly, it covers the "common" case of combining high FPS and low
FPS streams with aligned frames.
In the event that some frame mixes are OK while others are not, the
priority goes:
1. Errors in updating any frame -> return error
2. Any input incomplete -> request frames and return
3. Any inputs OK -> ignore EOF streams and render remaining inputs
4. No inputs OK -> set output to most recent status
This logic ensures that we can continue rendering the remaining streams,
no matter which streams reach their end of life, until we have no
streams left at which point we forward the last EOF.
When combining multiple inputs, the output PTS may be less than the PTS
of the input. In this case, the current's code assumption of always
draining one value from the FIFO is incorrect. Replace by a smarter
function which drains only those PTS values that were actually consumed.
When combining multiple inputs with different PTS and durations, in
input-timed mode, we emit one output frame for every input frame PTS,
from *any* input. So when combining a low FPS stream with a high FPS
stream, the output framerate would match the higher FPS, independent of
which order they are specified in.
Subsequent inputs require frame blending to be enabled, in order to not
overwrite the existing frame contents.
For output metadata, we implicitly copy the metadata of the *first*
available stream (falling back to the second stream if the first has
already reached EOF, and so on). This is done to resolve any conflicts
between inputs with differing metadata. So when e.g. input 1 is HDR and
output 2 is SDR, the output will be HDR, and vice versa. This logic
could probablly be improved by dynamically determining some "superior"
set of metadata, but I don't want to handle that complexity in this
series.
Instead of finding the ref frame in output_frame() and then passing its
signature to update_crops(), pull out the logic and invoke it a second
time inside update_crops().
This may seem wasteful at present, but will actually become required in
the future, since update_crops() runs on *every* input, and needs values
specific to that input (which the signature isn't), while output_frame()
is only interested in a single input. It's much easier to just split the
logic cleanly.
Including the queue status, because these will need to be re-queried
inside output_frame_mix when that function is refactored to handle
multiple inputs.
In anticipation of a refactor which will enable multiple input support.
Note: the renderer is also input-specific because it maintains a frame
cache, HDR peak detection state and mixing cache, all of which are tied
to a specific input stream.
If the input queue is EOF, then the s->status check should already have
covered it, and prevented the code from getting this far.
If we still hit this case for some reason, it's probably a bug. Better
to hit the AVERROR_BUG branch.
Incrementing a NULL pointer is undefined behaviour,
yet this is what would happen in case trailer were NULL
before the check.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
We will postpone the vpp session initialization to when input and output
frames are ready, this copy of the sequence parameters will be used to
initialize vpp session.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
In ff_rtp_send_jpeg, the type is defined based on PIX_FMT and
color-range parsed in. There is limitation on current design
where need to include support newly introduced PIX_FMT such as
AV_PIX_FMT_QSV and there might be more and more in future. Hence,
retrive the sampling factor from SOF0 in JPEG compressed header
directly. This introduces flexibility to handle different type of
new codec introduced in future.
Signed-off-by: Yeoh, Hoong Tee <hoong.tee.yeoh@intel.com>
- text is now shaped using libharfbuz
- glyphs position is now accurate to 1/4 pixel in both directions
- the default line height is now the one defined in the font
Adds libharfbuzz dependency.
This is only a preparatory step to a fully threaded architecture and
does not yet make decoding truly parallel - the main thread will
currently submit a packet and wait until it has been fully processed by
the decoding thread before moving on. Decoder behavior as observed by
the rest of the program should remain unchanged. That will change in
future commits after encoders and filters are moved to threads and a
thread-aware scheduler is added.
It is only used for flushing the subtitle decoder, so allocate a
dedicated packet for that.
Keep Decoder.pkt unused for now, it will be repurposed in future
commits.
Make the function process just one input stream at a time and save an
indentation level. Also rename it to ist_add() to be consistent with an
analogous function in ffmpeg_mux_init.
This can happen when user set the avctx->profile field directly
instead of specify profile via option.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Other than save a few bytes, it also has the benefit to show the
AV_OPT_TYPE_CONST value in help, e.g.,
-profile <int> E..V....... Profile (from 0 to INT_MAX) (default 0)
baseline 66 E..V....... Baseline Profile
...
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
H.264 high10/high422/high44 are unlikely supported by devices.
It's there for developers to do the experiment.
H.265 main10 works on my device with AV_PIX_FMT_MEDIACODEC.
OMX_COLOR_FormatYUV420Planar16 doesn't work.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
This switches the jpegxl_collect_codestream_header function to use
avcodec/bytestream2, which better enforces barriers, and should avoid
overrunning buffers with jxlp boxes if the size is zero or if the size
is so small the box is invalid.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Fixes: runtime error: pointer index expression with base 0x000000000000 overflowed to 0xfffffffffffffff8
Fixes: 58440/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5956015530311680
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: avcodec/takdsp.c:44:23: runtime error: signed integer overflow: -2097158 - 2147012608 cannot be represented in type 'int'
Fixes: 58417/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TAK_fuzzer-5268919664640000
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This makes the null pointer checks match mpv_motion_internal()
Fixes: NULL pointer dereference
Fixes: 59671/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MPEG1VIDEO_fuzzer-4993004566609920
Fixes: 59678/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MPEGVIDEO_fuzzer-4893168991338496
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -38912000 + -2109276160 cannot be represented in type 'int'
Fixes: 59670/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_RKA_fuzzer-4987563245699072
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The implementation is flawed in that the frame opaque data is not in
fact correctly reordered along with the packets, but is being output in
packet input order, just like the dts are.
This reverts commit 3553809703.
I've been sitting on this for 3 1/2 years now(!), and I finally got
around to fixing the loose ends and convincing myself that it was
correct. It follows the same basic structure as yadif_cuda, including
leaving out the edge handling, to avoid expensive branching.
As we are introducing two new formats and supporting conversions
between them, and also with the existing 0rgb32/0bgr32 formats, we get
a combinatorial explosion of kernels. I introduced a few new macros to
keep the things mostly managable.
The conversions are all simple, following existing patterns, with four
specific exceptions. When converting from 0rgb32/0bgr32 to rgb32/bgr32,
we need to ensure the alpha value is set to 1. In all other cases, it
can just be passed through, either to be used or ignored.
When an option could not be found, print its name and value. Note that
this is not done while applying the options in
avfilter_graph_segment_apply_opts() to give the caller the option of
handling the missing options in some other way.
The issue is that with a threadsafe hwaccel and multiple enabled
frame threads, hwaccel->uninit() is never called.
Previously, the function was guaranteed to never have any threads
with hwaccel contexts, so it never bothered to uninit any.
nvenc declares support for these formats, but if hwcontext_cuda doesn't
do that as well, then it's not possible to hwupload them for use in a
possible cuda pipeline before encoding.
I'm not sure why I originally did this, but there's no good reason to
put pointers to the cuda context and stream in the priv struct. They
are directly available in the device context that is already being
stored there.
- Changes in mov_write_video_tag function to handle EVC elementary stream
- Provided structure EVCDecoderConfigurationRecord that specifies the decoder configuration information for ISO/IEC 23094-1 video content
Signed-off-by: Dawid Kozinski <d.kozinski@samsung.com>
This reverts commit 9a245bdf5d.
This commit basically broke all samples with fractional framerates,
rather than fixing them.
I at this point do not understand the original issue anymore, and I'm
not sure how this slipped my initial testing.
All my test samples must have happened to have a simple timebase.
The actual dts values pretty much always are just a simple chain of
1,2,3,4,5,... Or maybe slightly bigger steps. Each increase by one means
an advance in time by one unit of the timebase.
So a fractional framerate/timebase is already not an issue.
So with this patch applied, the calculation might end up substracting
huge values (1001 is a common one) from the dts, which would be an
offset of that many frames, not of that many fractions of a second.
This broke at least muxing into mp4, if the sample happened to have a
fractional framerate.
I do not thing the original issue this patch tried to fix existed in the
first place, so it can be reverted without further consequences.
This flag is used to indicate to the hw frames in the gaps,
vaapi constructs it from a bunch of implicit API information
around surface ids. vulkan should just send it explicitly.
Reviewed-by: Lynne <dev@lynne.ee>
Enable the checked bitreader to avoid overread.
Also add a few checks in loops and between blocks so we exit instead of continued
execution.
Alternatively we could add manual checks so that no overread can happen. This would be
slightly faster but a bit more work and a bit more fragile
Fixes: Out of array accesses
Fixes: 59640/clusterfuzz-testcase-minimized-ffmpeg_dem_JPEGXL_ANIM_fuzzer-6584117345779712
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The tag comes from samples/ffmpeg/mov/unrecognized/bartjones.mov
really looks like some random data. Now the random tag matched
another file, which isn't a mov.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
This also fixed a warning: implicit conversion from enumeration
type 'TF_DataType' (aka 'enum TF_DataType') to different
enumeration type 'DNNDataType'.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Determined experimentally, on various videos and hardware.
On Intel, using less resources in-flight is around 15% faster,
with similar results on Nvidia hardware.
This reduces memory needed dramatically, as unneeded resources
can be immediately returned to the pool.
Although waitforfences is threadsafe, we add a mutex wait around
it, as the mutex fence in combination with waitforfences assures
us that no other thread will reset the fence in the meanwhile
whilst the mutex is locked. This allows is to call
ff_vk_exec_discard_deps.
It was introduced for Vulkan, but it is equivalent to
short_term_ref_pic_set_size when !short_term_ref_pic_set_sps_flag,
and when !!short_term_ref_pic_set_sps_flag, Vulkan hardcodes a zero
anyway.
The old code was not properly handling a bunch of edge-cases with
streams terminating earlier and also did not properly report back EOF
to the first input.
This fixes at least one case where the filter could stop doing
anything:
ffmpeg -f lavfi -i "color=blue:d=10" -f lavfi -i "color=aqua:d=0" -filter_complex "[0][1]xfade=duration=2:offset=0:transition=wiperight" -f null -
Wrapped frames contain pointers so they need specific code to
noise them, the generic code would lead to segfaults
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
For now, there's not much value in this since Clang don't support
enabling the dotprod or i8mm features with either .arch_extension
or .arch (it has to be enabled by the base arch flags passed to
the compiler). But it may be supported in the future.
Signed-off-by: Martin Storsjö <martin@martin.st>
These are available since ARMv8.4-a and ARMv8.6-a respectively,
but can also be available optionally since ARMv8.2-a.
Check if ".arch armv8.2-a" and ".arch_extension {dotprod,i8mm}" are
supported, and check if the instructions can be assembled.
Current clang versions fail to support the dotprod and i8mm
features in the .arch_extension directive, but do support them
if enabled with -march=armv8.4-a on the command line. (Curiously,
lowering the arch level with ".arch armv8.2-a" doesn't make the
extensions unavailable if they were enabled with -march; if that
changes, Clang should also learn to support these extensions via
.arch_extension for them to remain usable here.)
Signed-off-by: Martin Storsjö <martin@martin.st>
Animated JPEG XL files requires a separate demuxer than image2, because
the timebase information is set by the demuxer. Should the timebase of
an animated JPEG XL file be incompatible with the timebase set by the
image2pipe demuxer (usually 1/25 unless set otherwise), rescaling will
fail. Adding a separate demuxer for animated JPEG XL files allows the
timebase to be set correctly.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Migrate the libjxl decoder wrapper from the decode_frame method to the
receive_frame method, which allows sending more than one frame from a
single packet. This allows the libjxl decoder to decode JPEG XL files
that are animated, and emit every frame of the animation. Now, clients
that feed the libjxl decoder with an animated JPEG XL file will be able
to receieve the full animation.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Currently of_output_packet() reuses the input packet, which requires its
callers to submit blank packets even on EOF, which makes the code more
complex.
It is set by the muxing code, which will not be synchronized with
encoding code after upcoming threading changes. Use an encoder-private
variable instead.
Packets submitted to the muxer now have their timebase attached to them,
so the muxer can do conversion to muxing timebase and avoid exposing it
to callers.
There is no reason to postpone it until opening the encoder. Also, abort
when the input stream is unknown, rather than disregard an explicit
request from the user.
The code will currently add a small offset to avoid exact midpoints, but
this can cause inexact results when a float timestamp is exactly
representable as an integer.
Fixes off-by-one in the first frame duration in multiple FATE tests.
Fixes: signed integer overflow: 9079256848778919936 - -288230376151711746 cannot be represented in type 'long'
Fixes: 58248/clusterfuzz-testcase-minimized-ffmpeg_dem_OGG_fuzzer-6326851353313280
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
In certain use cases, controlling the maximum frame size is critical. An
example is when transmitting AAC packets over Bluetooth A2DP.
While the spec allows the packets to be fragmented (but UNRECOMMENDED),
in practice most headsets do not recognize nor reassemble such packets.
In this patch, we allow setting `bit_rate_tolerance` to 0 to indicate
that the specified bit rate should be treated as an upper bound up to
frame level.
Signed-off-by: Jeremy Wu <jrwu@chromium.org>
Today, cuda is not able to import multiplane images, and cuda requires
images to be imported whether you trying to import to cuda or export
from cuda (in the later case, the image is imported and then copied
into on the cuda side). So any interop between cuda and vulkan requires
that multiplane be disabled.
The existing option for this is not sufficient, because when deriving
devices it is not possible to specify any options.
And, it is necessary to derive the Vulkan device, because any pipeline
that involves uploading from cuda to vulkan and then back to cuda must
use the same cuda context on both sides, and the only way to propagate
the cuda context all the way through is to derive the device at each
stage.
ie:
-vf hwupload=derive_device=vulkan,<filters>,hwupload=derive_device=cuda
Added support for more VideoToolbox encoder options:
- qmin and qmax options are now used
- max_slice_bytes: Max number of bytes per H.264 slice
- max_ref_frames: Limit the number of reference frames
- Disable open GOP when the cgop flag is set
- power_efficient: Enable power-efficient mode
Signed-off-by: Rick Kern <kernrj@gmail.com>
The contents of this field are not defined for decoding. Use
pkt_timebase, which is the timebase of demuxed packets.
Drop a tautological av_packet_rescale_ts() call, as the stream and
decoder timebases are the same.
Current code marks the output stream as finished and waits for a flush
packet, but that is both unnecessary and suspect, as in theory nothing
should be sent to a finished stream - not even flush packets.
Make all relevant state per-filtergraph input, rather than per-input
stream. Refactor the code to make it work and avoid leaking memory when
a single subtitle stream is sent to multiple filters.
Set them in ifilter_parameters_from_dec(), similarly to audio/video
streams. This reduces the extent to which sub2video filters need to be
treated specially.
This function should not take an InputStream, as it only uses it to get
the InputFile and the timebase. Pass those directly instead and avoid
confusion over dealing with multiple InputStreams.
This queue should be associated with a specific filtergraph input - if
a subtitle stream is sent to multiple filters then each should have its
own queue.
This code is a sub2video analogue of ifilter_send_frame(), so it
properly belongs to the filtering code.
Note that using sub2video with more than one target for a given input
subtitle stream is currently broken and this commit does not change
that. It will be addressed in following commits.
When the filtergraph has no inputs, it can be configured immediately
when all its outputs are bound to output streams. This will simplify
treating some corner cases.
This way the list of filtergraph inputs/outputs is always known after
FilterGraph creation. This will allow treating simple and complex
filtergraphs in a more uniform manner.
Currently NULL would be passed for simple filtergraphs, which would
make the filter code extract the graph description from the output
stream when needed. This is unnecessarily convoluted.
The existing DecklinkQueue implementation was using the PacketList
structure but wasn't using the standard avpriv_packet_list_get and
avpriv_packet_list_put functions. Convert to using them so we
eliminate the duplicate logic, per Marton Balint's suggestion.
Updated to reflect feedback from Marton Balint provided on 05/11/23.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Check init_get_bits' result for NULL, to avoid dereferencing a NULL
pointer later (CWE-476).
Without this, a segfault happens when trying to decode a handcrafted
ogg-flac file with an absurdly long (e.g. 268435455 bytes) ogg header.
Co-authored-by: James Almer <jamrial@gmail.com>
Signed-off-by: Paul Arzelier <paul.arzelier@free.fr>
Importing Vulkan device on older versions no longer works due to the
lavu vulkan API changes (specifically, the switch to planar textures by
default). Additionally, importing on versions that don't suppirt
lock/unlock_queue is unsafe with the advent of the threaded vulkan
hwaccel. As a plus, saves us some annoying #ifdef boilerplate.
I will raise the minimum vf_libplacebo version globally on the next
stable release of libplacebo, and remove all of these checks.
Added prerequisites that must be met before providing support for the MPEG-5 EVC codec
- Added new entry to codec IDs list
- Added new entry to the codec descriptor list
- Bumped libavcodec minor version
- Added profiles for EVC codec
Signed-off-by: Dawid Kozinski <d.kozinski@samsung.com>
Signed-off-by: James Almer <jamrial@gmail.com>
According to Dave Airlie:
> <airlied> but I think ignoring it should be fine, I can't see any
> other way to get the imaeg extents correct for other usage
> <Lynne> what width/height should be used for the images?
> the final presentable dimensions, or the coded dimensions?
> <airlied> if you don't want noise I think the presentable dims
> <airlied> the driver should round up the allocations internally,
> but if you are going to sample from the images then w/h have to be
> the bounds of the image you want
> <airlied> since otherwise there's no way to stop the sampler from
> going outside the edges
Apparently, the alignment values are informative, rather than mandatory,
but the spec's wording makes it sound as if they're mandatory.
Previously floats where scaled up to 32bit int, but floats do not
have 32bits in their mantisse so a quarter of the bits where nonsense.
It seems no fate test is affected by this change, which is interresting
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The new function is much more precise
For default beta it is slightly slower, but its speed is already at the
worst case in that comparison
while the replaced function becomes much slower for larger beta
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
0th order modified bessel function of the first kind are used in multiple
places, lets avoid having 3+ different implementations
I picked this one as its accurate and quite fast, it can be replaced if
a better one is found
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -9223372036854775808 - 2082844800 cannot be represented in type 'long'
Fixes: 58384/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-6428383700713472
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When seeking through MBAFF-coded H264, this can happen. Decoding calls end_frame
without calling start_frame. We are unable to decode this, as no frame
state has been set.
Happens for both VAAPI and Vulkan. Could be an issue elsewhere, hence
the individual commit.
There are circumstances where the flag isn't set but the skip mode
frames are. So don't use the inferred bit which has other inputs
when deciding to pass the skip mode frames to the device.
This fixes some decoding bugs on intel av1
The temporary AVFrame on staack enables us to use the common
dependency/dispatch code in prepare_frame().
The prepare_frame() function is used for both frame initialization
and frame import/export queue family transfer operations.
In the former case, no AVFrame exists yet, so, as this is purely
libavutil code, we create a temporary frame on stack. Otherwise,
we'd need to allocate multiple frames somewhere, one for each
possible command buffer dispatch.
The idea was that it's faster to map linear images and copy them
via regular memcpy. This is a very niche use, plus very inconsistently
useful, as it would only really be faster on a few Intel GPUs.
Even then, using the non-cached memcpy would've been better.
Instead, scrap this code. Drivers are better at figuring out
what copy to use, and if we're host-mapping, it should actually be
just as fast, if not faster.
This commit adds proper handling of multiplane images throughout
all of the hwcontext code. To avoid breakage of individual
components, the change is performed as a single commit.
This commit rewrites the majority of vulkan.c to enable its use
as a general-purpose high-level utility code, usable for decoding,
encoding, and filtering of video frames.
The dependency system was rewritten to simplify management of
execution.
The image handling system was rewritten to accomodate multiplane
images.
Due to how related all the new features were, this is a single
commit.
This just disables the vulkan headers from defining any symbols
like vkCmdPipelineBarrier2(). Instead, all functions must be loaded
via the loader and used as function pointers as vk->CmdPipelineBarrier2.
Mostly just forces developers to write correct code, as using the
symbols can be undesirable in case API users define their own
function wrappers via the loader API.
The hack was added to enable exporting of vulkan images to DRM.
On Intel hardware, specifically for DRM images, all planes must be
allocated next to each other, due to hardware limitation, so the hack
used a single large allocation and suballocated all planes from it.
By natively supporting multiplane images, the driver is what decides
the layout, so exporting just works.
It's a hack because it conflicted heavily with image allocation, and
with the whole ecosystem in general, before multiplane images were
supported, which just made it redundant.
This is also the commit which broke the hwcontext hardest and prompted
the entire rewrite in the first place.
This just bumps the required loader library version (libvulkan).
All device-related features, such as video decoding, atomics, etc.
are still optional and the code deals with their loss on a local level
(e.g. the decoder or filter checks for the features it needs, not
the hwcontext).
Bumping the required version essentially packs all maintenance
extensions which correct the spec rather than requiring to enable
them individually.
Vulkan requires it.
It technically also requires use_default_scaling_matrix_mask,
but we can just be explicit and give it the matrix we fill in as-non
default.
The only thing besides the hwaccel that this function uses from
AVCodecHWConfigInternal is the pixel format, which should always match
the hwaccel one.
Will be useful in following commits.
A non-limiting stream could mistakenly end up being the queue head,
which would then produce incorrect synchronization, seen e.g. in
fate-matroska-flac-extradata-update for certain number of frame threads
(e.g. 5).
Found-By: James Almer
Checking whether the user requested an unsupported conversion between
text and bitmap subtitles can be done immediately when creating the
output stream.
This function is entangled with encoder setup, so it is more encoding
code rather than ffmpeg_hw code. This will allow making more encoder
state private in the future.
This function is entangled with decoder setup, so it is more decoding
code rather than ffmpeg_hw code. This will allow making more decoder
state private in the future.
As the comment in the code mentions, EAGAIN is not an expected value here
because we call avcodec_receive_frame() until all frames have been returned.
avcodec_send_packet() returning EAGAIN means a packet is still buffered, which
hints that the underlying decoder is buggy and not fetching packets as it
should.
An example of this behavior was in the libdav1d wrapper before f209614290,
where feeding it split frames (or individual OBUs) would result in the CLI
eventually printing the confusing "Error submitting packet to decoder: Resource
temporarily unavailable" error message, and just keep going until EOF without
returning new frames.
Signed-off-by: James Almer <jamrial@gmail.com>
This patch supports the use of the "checkasm --bench" testing feature
on loongarch platform.
Change-Id: I42790388d057c9ade0dfa38a19d9c1fd44ca0bc3
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Intel MediaSDK and oneVPL expect continuous allocation for data[i],
however there are mandatory padding bytes between data[i] and data[i+1].
when calling av_frame_get_buffer. This patch removes all extra padding
bytes.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The data copy is unnecessary for packed formats when frame width and
height are aligned
For example:
$ ffmpeg -f lavfi -i testsrc=size=1920x1088 -vf "format=yuyv422" -c:v hevc_qsv -f null -
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The mfx implementation based on D3D11 is expected for an internal
session on Windows, however sometimes this implemntation is not
supported [1]. A fallback to the default mfx implementation is added in
this patch.
[1] https://github.com/intel/cartwheel-ffmpeg/issues/246
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
mix->timestamps is expressed relative to the source timebase, which is
possibly a different timescale from `base_pts`. We can't mix-and-match
here. The only reason this worked in my previous testing was because I
was testing on a source file which had an exactly matching timebase.
Fix it by always using the exact PTS as tagged on the AVFrame.
Fixes decoding packets containing split temporal units, as generated for example
by the av1_frame_split bsf.
Signed-off-by: James Almer <jamrial@gmail.com>
When no drawing is to be performed, tpad can work fine with
hardware frames, so advertise this in the query_formats
callback and ensure the drawing context is never initialised
when just cloning frames.
Reviewed-by: Thilo Borgmann <thilo.borgmann@mail.de>
Reviewed-by: Niklas Haas <git@haasn.dev>
It currently emulates the long-removed
avcodec_decode_audio4/avcodec_decode_video2 APIs, which obfuscates the
actual decoding flow. Restructure the decoding calls so that they
naturally follow the new avcodec_send_packet()/avcodec_receive_frame()
design.
This is not only significantly easier to read, but also shorter.
It is currently handled in the same loop as audio and video, but this
obscures the actual flow, because only one iteration is ever performed
for subtitles.
Also, avoid a pointless packet reference.
process_input_packet() contains two non-interacting pieces of nontrivial
size and complexity - decoding and streamcopy. Separating them makes the
code easier to read.
New placement requires fewer explicit conditions and is easier to
understand.
The logic should be exactly equivalent, since this is the only place
where eof_reached is set for decoding.
Passing ist=NULL is currently used to identify stream types that do not
decode into AVFrames, i.e. subtitles. That is highly non-obvious -
always pass a non-NULL InputStream and just check the type explicitly.
It tracks whether the decoder for this stream ever produced any frames
and its only use is for checking whether a filter input ever received a
frame - those that did not are prioritized by the scheduler.
This is awkward and unnecessarily complicated - checking whether the
filtergraph input format is valid works just as well and does not
require maintaining an extra variable.
In case no decoder is available, dec_open() called from ist_use() will
fail with 'Decoding requested, but no decoder found', so this check is
redundant.
When a filtergraph input receives EOF but never saw any input frames, we
use the fallback parameters. Currently an attempt to actually configure
the filtergraph will happen elsewhere, but there is no reason to
postpone this.
With complex filtergraphs it can happen that the filtergraph is
unconfigured because some other filter than the one we just got EOF on
is missing parameters.
Make sure that the fallback parametes for a given input are only used
when that input is unconfigured.
This algorithm has once again been refactored, this time leading to a
dropping of the old `tone_mapping_mode` field, to be replaced by a
single tunable hybrid mode with configurable strength.
We can approximately map the old modes onto the new API for backwards
compatibility. Replace deprecated enums by their integer equivalents to
safely preserve this API until the next bump.
Upstream deprecated the old ad-hoc, enum/intent-based gamut mapping API
and added a new API based on colorimetrically accurate gamut mapping
functions.
The relevant change for us is the addition of several new modes, as well
as deprecation of the old options. Update the documentation accordingly.
The current code relied on pl_queue eventually returning EOF back to the
caller, which didn't work in all situations (e.g. single frame input).
Also, the current code assumed that ff_inlink_acknowledge_status only
fired once, which was patently not true, as the above edge cases
demonstrated.
Solve both issues by keeping track of the acknowledged link status and
forwarding it (instead of trying to probe the pl_queue again) in the
event that we run out of queued input frames, as well as (in CFR mode)
when we pass the indicated status PTS.
Fixes: signed integer overflow: -159584 * 5105950 cannot be represented in type 'int'
Fixes: 55165/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BONK_fuzzer-5796023719297024
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This should be a few nano seconds faster (not measureable)
But Collectively the whole humankind watching hls will safe a minute
Found-by: Leo Izen
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
As this is an AV_CODEC_CAP_OTHER_THREADS decoder, threading is handled by the
underlying library. In this case, the frame delay is calculated by libdav1d
based on the values from avctx->thread_count and the private max_frame_delay
option.
Export said delay reported by the library in AVCodecContext.delay
Reviewed-by: Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
When using low-latency mode, it eliminates frame reordering
and follows a one-in-one-out encoding mode
Signed-off-by: xufuji456 <839789740@qq.com>
Signed-off-by: Rick Kern <kernrj@gmail.com>
Use the next I/P/B or start code as the end of current frame.
Before the patch, extension start code, user data start code,
sequence end code and so on are treated as the start of next
frame.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The demuxer exports rawvideo, so there's no reason for the muxer to only
work with wrapped_avframe.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Not only this is information that relies on the concept of a sequence of
frames, which is completely out of place as a field in AVFrame, but there are
no known or intended uses of this field.
Signed-off-by: James Almer <jamrial@gmail.com>
Accept it and pass it through unchanged.
The standard requires that decoders ignore unknown metadata, and indeed
this is tested by some of the Argon coverage streams.
* take num_ticks_per_picture_minus_1 into account, since that is a part
of the framerate computation
* stop exporting num_ticks_per_picture_minus_1 into
AVCodecContext.ticks_per_frame, as that field is used for other
purposes (in conjunction with repeat_pict, which is not used at all by
av1)
It does no initialization anymore, except for setting
transcode_init_done - the bulk of the function is printing the
input/output maps. It also cannot fail anymore, so remove the useless
return value.
Export the corresponding flag in InputFile instead. This will allow
making the demuxer AVFormatContext private in future commits, similarly
to what was previously done for muxers.
There is no point in having a per-stream wallclock start time, since
they are all computed at the same instant. Keep a per-file start time
instead, initialized when the demuxer thread starts.
That is a more appropriate place for this code and will allow hiding
more of InputStream.
The value of repeat_pict extracted from libavformat internal parser no
longer needs to be trasmitted outside of the demuxing thread.
Move readrate handling to the demuxer thread. This has to be done in the
same commit, since it reads InputStream.dts,nb_packets, which are now
set in the demuxer thread.
This way computing it and using it for streamcopy does not need to
happen in sync. Will be useful in following commits, where updating
InputStream.dts will be moved to the demuxing thread.
This code runs post-demuxing and is not synchronized with the decoder
output (which may be delayed with respect to its input by arbitrary and
unknowable amounts), so accessing any decoder properties is incorrect.
Move them to a separate function called right after timestamp
discontinuity processing. This is now possible, since these values have
no interaction with decoding anymore.
For encoding, this field is entirely redundant with
AVCodecContext.framerate.
For decoding, this field is entirely redundant with
AV_CODEC_PROP_FIELDS.
Since this is an external encoder not under our control, we cannot test
the encoded output exactly as is done for internal encoders. We can
still test however that the output is decodable and produces the
expected number of frames with expected dimensions, pixel formats, and
timestamps.
H.264 and mpeg12 parsers need to be adjusted at the same time to stop
using the value of AVCodecContext.ticks_per_frame, because it is not set
correctly unless the codec has been opened. Previously this would result
in both the parser and lavf seeing the same incorrect value, which would
cancel out.
Updating lavf and not the parsers would result in correct value in lavf,
but the wrong one in parsers, which would break some tests.
Decoders will currently warn if an audio decoder not marked with
AV_CODEC_CAP_SUBFRAMES consumes less than the whole packet, but
* this happens for regular files
* this has no negative consequences
* there is no meeaningful action that can or should be taken in response
The warning is thus useless noise.
This should fix the regression since 6b1f68ccb0
Should fix Ticket10353 (please test and report cases that still fail)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This exposes libplacebo's frame mixing functionality to vf_libplacebo,
by allowing users to specify a desired target fps to output at. Incoming
frames will be smoothly resampled (in a manner determined by the
`frame_mixer` option, to be added in the next commit).
To generate a consistently timed output stream, we directly use the
desired framerate as the timebase, and simply output frames in
sequential order (tracked by the number of frames output so far).
To present compatibility with the current behavior, we keep track of a
FIFO of exact frame timestamps that we want to output to the user. In
practice, this is essentially equivalent to the current filter_frame()
code, but this design allows us to scale to more complicated use cases
in the future - for example, insertion of intermediate frames
(deinterlacing, frame doubling, conversion to fixed fps, ...)
This does not leverage any immediate benefits, but refactors and
prepares the codebase for upcoming changes, which will include the
ability to do deinterlacing and resampling (frame mixing).
This commit contains no functional change. The goal is merely to
separate the highly intertwined `filter_frame` and `process_frames`
functions into their separate concerns, specifically to separate frame
uploading (which is now done directly in `filter_frame`) from emitting a
frame (which is now done by a dedicated function `output_frame`).
The overall idea here is to be able to ultimately call `output_frame`
multiple times, to e.g. emit several output frames for a single input
frame.
For example
ffmpeg -f lavfi -i sine -af "aformat=cl=stereo|5.1|7.1,lowpass,aformat=cl=7.1|5.1|stereo" -f null -
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Before this commit if allocation would fail in ff_add_channel_layout()
function, function would return negative error code and this would
cause wrong format pick up later. If allocation would not fail return
code would be 0 and then format negotiation would simply fail as code
would break from the loop but with wrong return code.
Error was introduced in 6aaac24d72 commit.
Fixes#6638
This is not public API, no it has no need for an alloc() and free()
functions. The struct can reside on stack.
Signed-off-by: James Almer <jamrial@gmail.com>
Move the AVPacketQueue functionality that is currently only used
for the decklink decode module into decklink_common, so it can
be shared by the decklink encoder (i.e. for VANC insertion when
we receive data packets separate from video).
The threadsafe queue used within the decklink module was named
"AVPacketQueue" which implies that it is part of the public API,
which it is not.
Rename the functions and the name of the queue struct to make
clear it is used exclusively by decklink, per Marton Balint's
suggestion.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
This patch simply recognizes the AAC audio track during
decode -- it does not add functionality to encode AAC in
MXF.
Signed-off-by: Ammon Riley <ammon.riley@gmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Unlike other cases where the closed captions are embedded in the
video stream as MPEG-2 userdata or H.264 SEI data, with MOV files
the captions are often found on a separate "e608" subtitle track.
Add support for playout of such files, leveraging the new ccfifo
mechanism to ensure that they are embedded into VANC at the correct
rate (since e608 packets often contain batches of multiple 608 pairs).
Note this patch includes a new file named libavdevice/ccfifo.c, which
allows the ccfifo functionality in libavfilter to be reused even if
doing shared builds. This is the same approach used for log2_tab.c.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
THis filter can correct certain issues seen from upstream sources
where the cc_count is not properly set or the CEA-608 tuples are
not at the start of the payload as expected.
Make use of the ccfifo to extract and immediately repack the CEA-708
side data, thereby removing any extra padding and ensuring the 608
tuples are at the front of the payload.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
Because the interlacing filter halves the effective framerate, we
need to ensure that no CEA-708 data is lost as frames are merged.
Make use of the new ccfifo mechanism to ensure that caption data
is properly preserved as frames pass through the filter.
Thanks to Thomas Mundt for review and noticing a couple of
missed codepaths for injection on output. Thanks to Lance Wang
for pointing out a memory leak.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
Various deinterlacing modes have the effect of doubling the
framerate, and we need to ensure that the caption data isn't
duplicated (or else you get double captions on-screen).
Use the new ccfifo mechanism for yadif (and yadif_cuda and bwdif
since they use the same yadif core) so that CEA-708 data is
properly preserved through this filter.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
The existing implementation made an attempt to remove duplicate
captions if increasing the framerate, but made no attempt to
handle reducing the framerate, nor did it rewrite the caption
payloads to have the appropriate cc_count (e.g. the cc_count needs
to change from 20 to 10 when going from 1080i59 to 720p59 and
vice-versa).
Make use of the new ccfifo mechanism to ensure that caption data
is properly preserved.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
When transcoding video that contains 708 closed captions, the
caption data is tied to the frames as side data. Simply dropping
or adding frames to change the framerate will result in loss of
data, so the caption data needs to be preserved and reformatted.
For example, without this patch converting 720p59 to 1080i59
would result in loss of 50% of the caption bytes, resulting in
garbled 608 captions and 708 probably wouldn't render at all.
Further, the frames that are there will have an illegal
cc_count for the target framerate, so some decoders may ignore
the packets entirely.
Extract the 608 and 708 tuples and insert them onto queues. Then
after dropping/adding frames, re-write the tuples back into the
resulting frames at the appropriate rate given the target
framerate. This includes both having the correct cc_count as
well as clocking out the 608 pairs at the appropriate rate.
Thanks to Lance Wang <lance.lmwang@gmail.com>, Anton
Khirnov <anton@khirnov.net>, and Michael Niedermayer <michael@niedermayer.cc>
for providing review/feedback.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
Currently, the -1 (MR) preset is disallowed as it's taken as the preset
option not set, and the only way to access it was through svtav1-params.
Signed-off-by: Christopher Degawa <ccom@randomderp.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Commit 55b81528a9 ("doc/filters: itemize crop examples") dropped the
quotation marks from these examples, but this one remained. Quotes are
actually needed to put the example into a command line or a program, but
removing it here makes the example consistent with the document.
The two checks using eof_reached are testing whether more input can
possibly appear on this filtergraph input. InputFilterPriv.eof is the
more authoritative source for this information.
When an input stream terminates and no frames were successfully decoded,
filtering code will currently configure the filtergraph using demuxer
stream parameters. Use decoder parameters instead, which should be more
reliable. Also, initialize them immediately when an input stream is
bound to a filtergraph input, so that these parameters are always
available (if at all) and filtering code does not need to reach into the
decoder at some arbitrary later point.
When no frames are ever seen by an encoder, encoder flush will do a
last-ditch attempt to configure its source filtergraph in order to at
least get the stream parameters. This involves extracting demuxer
parameters from filtergraph source inputs, which is
* a bad layering violation
* probably unreachable, because decoders are flushed before encoders,
which should call ifilter_send_eof(), which will also set these
parameters; however due to complex control flow it is hard to be
entirely sure this code can never be triggered
Even if this code can actually be reached, it is probably better to
return an error as the comment above it says.
These two functions are a part of a single logical action - determining
which, if any, output stream needs to be processed next. Keeping them
separate is a historical artifact that obscures what is actually being
done.
Fixes: runtime error: signed integer overflow: 2140143616 + 254665816 cannot be represented in type 'int'
Fixes: 45982/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ADPCM_XMD_fuzzer-6690181676924928
As a sideeffect this simplifies the equation, the high bits are different after this but only
the low 16bits are stored and used in later steps.
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -2147375930 + -133875 cannot be represented in type 'int'
Fixes: 45982/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WADY_DPCM_fuzzer-6703727013920768
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: index 4294967295 out of bounds for type 'uint16_t [65536]'
Fixes: 45982/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TIFF_fuzzer-5950405086674944
Fixes: 45982/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TIFF_fuzzer-6666195176914944
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2147483372 - -148624 cannot be represented in type 'int'
Fixes: 45982/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SONIC_fuzzer-5477177805373440
Fixes: 45982/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SONIC_fuzzer-6681622236233728
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access
Fixes: 45982/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_IFF_ILBM_fuzzer-5124452659888128
Fixes: 45982/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_IFF_ILBM_fuzzer-6362836707442688
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 3011809745540902265 + 6323452730883571725 cannot be represented in type 'long'
Fixes: 45982/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FLAC_fuzzer-6687553022722048
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The global header should not contain a frame, and decoding it
would result in leaks
Fixes: memleak
Fixes: 45982/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_APNG_fuzzer-6603443149340672
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Its unexpected that a .avi or other "standard" file turns into a playlist.
The goal of this patch is to avoid this unexpected behavior and possible
privacy or security differences.
Reviewed-by: Steven Liu <lingjiujianke@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When tak_get_nb_samples() fails, it will currently write
AVERROR_INVALIDDATA as TAKStreamInfo.frame_samples. The parser will then
use this negative value as a frame duration, which leads to various
breakage.
Avoid this by returning the error code from tak_parse_streaminfo()
directly; never store negative values in the parsed header.
Motivated by a desire to use vf_libplacebo as a GPU-accelerated
cropping/padding/zooming filter. This commit adds support for setting
the `input/target.crop` fields as dynamic expressions.
Re-use the same generic variables available to other scale and crop type
filters, and also add some more that we can afford as a result of being
able to set these properties dynamically.
It's worth pointing out that `out_t/ot` is currently redundant with
`in_t/t` since it will always contain the same PTS values, but I plan on
changing this in the near future.
I decided to also expose `crop_w/crop_h` and `pos_w/pos_h` as variables
in the expression parser itself, since this enables the fairly common
use case of determining dimensions first and then placing the image
appropriately, such as is done in the default behavior (which centers
the cropped/placed region by default).
In some circumstances, libplacebo will clear the background as a result
of cropping/padding. Currently, this uses the hard-coded default fill
color of black. This option makes this behavior configurable.
As with the earlier bswap change, all versions of GCC and Clang that
support RISC-V support the popcount built-ins, so we can just use them
instead of inline assembler.
av_bswapXX() are used in context that expect exact size types, notably
variable arguments to av_log(). On Linux RV64, uint_fast32_t is an
unsigned long, so the current inline assembler does not work properly.
Since GCC and Clang gained their byte-swap built-ins before they
supported RISC-V, we can simply defer to them. As an added bonus, the
compiler can do instruction scheduling, which it couldn't with the Zbb
inline assembler.
Currently those are set in different ways depending on whether the
stream is decoded or not, using some values from the decoder if it is.
This is wrong, because there may be arbitrary amount of delay between
input packets and output frames (depending e.g. on the thread count when
frame threading is used).
Always use the path that was previously used only for streamcopy. This
should not cause any issues, because these values are now used only for
streamcopy and discontinuity handling.
This change will allow to decouple discontinuity processing from
decoding and move it to ffmpeg_demux. It also makes the code simpler.
Changes output in fate-cover-art-aiff-id3v2-remux and
fate-cover-art-mp3-id3v2-remux, where attached pictures are now written
in the correct order. This happens because InputStream.dts is no longer
reset to AV_NOPTS_VALUE after decoding, so streamcopy actually sees
valid dts values.
This was added in 380db56928 as a
temporary crutch that is not needed anymore. The only case where this
code can be triggered is the very first frame, for which InputStream.pts
is always equal to 0.
Stop using InputStream.dts for generating missing timestamps for decoded
frames, because it contains pre-decoding timestamps and there may be
arbitrary amount of delay between input packets and output frames (e.g.
dependent on the thread count when frame threading is used). It is also
in AV_TIME_BASE (i.e. microseconds), which may introduce unnecessary
rounding issues.
New code maintains a timebase that is the inverse of the LCM of all the
samplerates seen so far, and thus can accurately represent every audio
sample. This timebase is used to generate missing timestamps after
decoding.
Changes the result of the following FATE tests
* pcm_dvd-16-5.1-96000
* lavf-smjpeg
* adpcm-ima-smjpeg
In all of these the timestamps now better correspond to actual frame
durations.
If input packets have timestamps, they will be propagated to output
frames by the decoder, so at best this block does not do anything.
There can also be an arbitrary amount of delay between packets sent to
the decoder and decoded frames (e.g. due to decoder's intrinsic delay or
frame threading), so deriving any timestamps from packet properties is
wrong.
One that is fine enough to represent all DV audio sample rates. Audio
packet durations are now sample-accurate.
This largely undoes commit 76fbb0052d. To
avoid breaking the issue fixed by that commit, resync audio timestamps
against video if they get more than one frame apart. The sample from
issue #8762 still works correctly after this commit.
Slightly changes the results of the lavf-dv seektest, due to the audio
timebase being more granular.
Current code will call avpriv_set_pts_info() for each video frame,
possibly setting a different timebase if the stream framerate changes.
This violates API conventions, as the timebase is supposed to stay
constant after stream creation.
Change the demuxer to set a single timebase that is fine enough to
handle all supported DV framerates.
The seek tests change slightly because the new timebase is more
granular.
This makes the worst case much faster
Fixes: Timeout
Fixes: 51363/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BONK_fuzzer-5660734784143360
Fixes: 57957/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BONK_fuzzer-5874095467397120
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This allows the usage of codecs in builds that have a parser but no decoders
for remuxing scenarios with raw sources.
Signed-off-by: James Almer <jamrial@gmail.com>
Also remove the _minus8 part of the name to be in line with the rest of the
decoder, and fix the storage type for pps_palette_predictor_initializer,
to support hbd values.
Signed-off-by: James Almer <jamrial@gmail.com>
Will remove native backend, so change the default backend in filters,
and also remove the python scripts which generate native model file.
Signed-off-by: Ting Fu <ting.fu@intel.com>
Native backend will be removed in following commits, so change the
dnn interface and modify the error message in it first.
Signed-off-by: Ting Fu <ting.fu@intel.com>
Not doing so is an obvious oversight - the ICC profile is tied to the
original colorspace, so if we change it, we should definitely strip this
information.
We should probably also have an extra option to control whether the ICC
profile should be stripped, ignored, or applied, but for now this fixes
an existing bug.
Some testing revealed this to be a very efficient and reliable method of
ingesting GPU frames into libplacebo, so it's a good idea to give as an
example.
This example being first is now misleading because round-tripping
through hwdownload/hwupload is neither required nor recommended. Also,
the comment about avoiding format conversion is unnecessary because
`libplacebo` will now inherit the input frame format by default.
This was a bug/mistake in dae3679a9b.
use_mfra_for by defintion only has an effect on fragmented MP4 files,
making the check not only redundant, but also broken if a user used
the option globally (i.e. set on non-fragmented MP4s).
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Also make duration check for mvhd more consistent with the others, write
version 1 of mvhd if duration is at least INT32_MAX instead of UINT32_MAX.
Signed-off-by: Marton Balint <cus@passwd.hu>
Commit 23eeffcd48 added a hack to support invalid
files where the creation date was encoded as a classic unix timestamp. Let's
reduce the scope of the hack by only applying it to version 0 mdhd/mvhd atoms.
Also warn the user of such possibly broken files.
Signed-off-by: Marton Balint <cus@passwd.hu>
Support pixel formats 0x11412100, 0x11311100, and 0x41211100, and add
logic to perform 4x horizontal upsampling. This should fix various JPEG
files found in Ticket #8930.
Co-authored-by: <leo.izen@gmail.com>
The previous name is misleading, because the function does not actually
initialize any filters - it creates a new output stream and binds a
filtergraph output to it.
Their only function is checking that encoding was not specified for
data/unknown-type streams, but the check is broken because enc_ctx will
not be created in ost_add() unless a valid encoder can be found.
Add an actually working check for all types for which encoding is not
supported in choose_encoder().
ts_discontinuity_detect() is applied right after demuxing, while
InputStream.pts is a post-decoding timestamp, which may be delayed with
respect to demuxing by an arbitrary amount (e.g. depending on the thread
count when frame threading is used).
The name is misleading, because it is not a picture in the sense of MPEG
terminology (which define "picture" as "frame or field"), but always a
full frame. 'next' is also redundant and/or misleading, because it is
the _current_ frame to be encoded.
Previously they would only be used with trivial filtergraphs, because
filters did not handle frame durations. That is no longer true - most
filters process frame durations properly (there may still be some that
don't - this change will help finding and fixing them).
Improves output video frame durations in a number of FATE tests.
When an encoder exports sum-of-squared-differences information in
encoded packets, print_report() will print PSNR information in the
status line. However,
* the code computing PSNR assumes 8bit 420 video and prints incorrect
values otherwise; there are no issues on trac about this
* only a few encoders (namely aom, vpx, mpegvideo, snow) export this
information; other often-used encoders such as libx26[45] do not
export this, even though they could
This suggests that this feature is not useful and it is better to remove
it rather than spend effort on fixing it.
Otherwise main and overlay videos share the same input region. Note NULL
pointer imples the whole overlay video will be processed.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
- qsv_internal.h: Remove unnecessary include va_drm.h
- qsv_internal.h: Enable AVCODEC_QSV_LINUX_SESSION_HANDLE on Linux/VA only
- hwcontext_qsv.c: Do not allow child_device_type VAAPI for Windows until
support is added, keep D3D11/DXVA2 as more prioritary defaults.
Initial review at https://github.com/intel-media-ci/ffmpeg/pull/619/
Signed-off-by: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Reviewed-by: Wu, Tong1 <tong1.wu@intel.com>
When write multi-trun box, the MOV_TRUN_FIRST_SAMPLE_FLAGS flag
need judge by first param, not 0.
If the original video contains consecutive I frames,
this will cause the packets of fmp4 have error sample_flags ,
and then incorrect keyframes were generated,
and then error packet will be seeked.
Signed-off-by: Wang Yaqiang <wangyaqiang03@kuaishou.com>
Signed-off-by: Steven Liu <liuqi05@chinaffmpeg.org>
Adds JPEG 2000 decoder tests using materials from the conformance suite specified in
Rec. ITU-T T.803 | ISO/IEC 15444-4.
The test materials are available at https://gitlab.com/wg1/htj2k-codestreams
Signed-off-by: Pierre-Anthony Lemieux <pal@palemieux.com>
framebuf is only allocated when the new width/height are larger than the old
but nothing sets the old so its always allocated.
Use av_fast_mallocz() instead.
Fixes: Timeout
Fixes: 55094/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_G2M_fuzzer-5116909932904448
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 100183269 - -2132769113 cannot be represented in type 'int'
Fixes: 55063/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_FIXED_fuzzer-5039294027005952
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
num_ref_idx_l0_active_minus1, num_ref_idx_l1_active_minus1,
num_ref_idx_l0_default_active_minus1, and num_ref_idx_l1_default_active_minus1
are all in the range 0 to 14, inclusive.
Signed-off-by: James Almer <jamrial@gmail.com>
Support full range videos when transcoding, enabled the
YUVJ420P to avoid auto scale from YUVJ420P to YUV420P
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
The change introduced in b18a9c2971
created a regression for non-subsampled progressive RGB jpegs. This
should fix that.
Additionally, this should fix other RGB JPEGs broken before the recent
patches, such as those in Trac issue #10190.
When enable lcms2, the fate-png-icc-parse will get error bellow.
TEST png-icc-parse
This because updated how PNG handles colors (no
longer using mastering display metadata for that).
Reviewed-by: Leo Izen <leo.izen@gmail.com>
Signed-off-by: Steven Liu <liuqi05@kuaishou.com>
The decoder is tagged as being FF_CODEC_CAP_SKIP_FRAME_FILL_PARAM, so might as
well make use of it.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Also remove the _plus* and _minus* parts of some of these to be in line with
other similar fields in the decoder.
Signed-off-by: James Almer <jamrial@gmail.com>
The spec says: "The value of num_ref_loc_offsets shall be in the range of 0 to
vps_max_layers_minus1, inclusive".
Signed-off-by: James Almer <jamrial@gmail.com>
Remove now-obsolete code setting packet durations pre-muxing for CFR
encoded video.
Changes output in the following FATE tests:
* numerous adpcm tests
* ffmpeg-filter_complex_audio
* lavf-asf
* lavf-mkv
* lavf-mkv_attachment
* matroska-encoding-delay
All of these change due to the fact that the output duration is now
the actual input data duration and does not include padding added by
the encoder.
* apng-osample: less wrong packet durations are now passed to the muxer.
They are not entirely correct, because the first frame duration should
be 3 rather than 2. This is caused by the vsync code and should be
addressed later, but this change is a step in the right direction.
* tscc2-mov: last output frame has a duration of 11 rather than 1 - this
corresponds to the duration actually returned by the demuxer.
* film-cvid: video frame durations are now 2 rather than 1 - this
corresponds to durations actually returned by the demuxer and matches
the timestamps.
* mpeg2-ticket6677: durations of some video frames are now 2 rather than
1 - this matches the timestamps.
A single smvjpeg packet decodes into one large mjpeg frame, slices of
which are then returned as output frames. Packet duration covers all of
these slices.
Current code prefers deprecated AVFrame.pkt_duration over its
replacement AVFrame.duration whenever the former is set and not equal to
the latter. However, duration will only be actually used when the
caller sets the AV_CODEC_FLAG_FRAME_DURATION flag, which was added
_after_ AVFrame.duration.
This implies that any caller aware of AV_CODEC_FLAG_FRAME_DURATION is
also aware of AVFrame.duration. pkt_duration should then never be used.
Commit b18a9c2971 introduced a regression
that broke some baseline RGB jpegs. (See Trac issue #4045). This fixes
that.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
That field was added to store timestamp conversion state for audio
decoding code. Later it started being used by streamcopy as well, but
that use is wrong because, for a given input stream, both decoding and
an arbitrary number of streamcopies may be performed simultaneously.
They would then all overwrite the same state variable.
Store this state in MuxStream instead.
This is the last use of InputStream in of_streamcopy(), so the ist
parameter can now be removed.
It stores codec parameters of the stream submitted to the muxer, which
may be different from the codec parameters in AVStream due to bitstream
filtering.
This avoids the confusing back and forth synchronisation between the
encoder, bitstream filters, and the muxer, now information flows only in
one direction. It also reduces the need for non-muxing code to access
AVStream.
Reduces access to a deeply nested muxer property
OutputStream.st->codecpar->codec_type for this fundamental and immutable
stream property.
Besides making the code shorter, this will allow making the AVStream
(OutputStream.st) private to the muxer in the future.
Set InputStream.decoding_needed/discard/etc. only from
ist_{filter,output},add() functions. Reduces the knowledge of
InputStream internals in muxing/filtering code.
init_input_stream() can print log messages directly, there is no need to
ship them to the caller.
Also, log errors to the InputStream and avoid duplicate information in
the message.
Changing AVCodecContext.sample_aspect_ratio after the encoder was opened
is by itself questionable, but if anywhere it belongs in encoding rather
than filtering code.
Creating a new output stream of a given type is currently done by
calling new_<type>_stream(), which all start by calling
new_output_stream() to allocate the stream and do common init, followed
by type-specific init.
Reverse this structure - the caller now calls the common function
ost_add() with the type as a parameter, which then calls the
type-specific function internally. This will allow adding common code
that runs after type-specific code in future commits.
Add the check for the return value of the av_malloc in order to avoid
NULL pointer deference.
Fixes: e4be3485af ("MS Video 1 encoder")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Add the check for the return value of the av_malloc in order to avoid
NULL pointer deference.
Fixes: b86ab38137 ("Add weighted motion compensation for RV40 B-frames")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Fixes: out of array write on x86-32
Fixes: 57825/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MPEG2VIDEO_fuzzer-6094366187061248
Fixes: 57829/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MPEG2VIDEO_fuzzer-4526419991724032
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
This reverts commit f7abe92bd7.
timer.h has been removed from internal.h, and then added back with
3e6088f for convenience. This patch removed it again for the
following reasons:
1. Only includes what's necessary is a common and safe strategy.
2. It fixed some build errors on Android:
a. libavutil/timer.h includes sys/ioctl.h, and ioctl.h includes
termios.h on Android.
b. termios.h reserves names prefixed with ‘c_’, ‘V’, ‘I’, ‘O’, and
‘TC’; and names prefixed with ‘B’ followed by a digit.
c. libavcodec uses B0 B1 and so on as variable names a lot. So
the code failed to build with --enable-linux-perf, or
--target-os=Linux.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
In most cases this should only occur once or so per stream in an
input, and in case the logic ends up in an eternal loop, it should
be visible to the end user instead of being completely invisible.
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
When no packet dts values are available from the container, video
decoding code will currently use its own guessed values, which will then
be propagated to pkt_dts on decoded frames and used as pts in certain
cases. This is inaccurate, fragile, and unnecessarily convoluted.
Simply removing this allows the extrapolation code introduced in the
previous commit to do a better job.
Changes the results of numerous h264 and hevc FATE tests, where no
spurious timestamp gaps are generated anymore. Several tests no longer
need -vsync passthrough.
When no timestamps are available from the container, the video decoding
code will currently use fake dts values - generated in
process_input_packet() based on a combination of information from the
decoder and the parser (obtained via the demuxer) - to generate
timestamps during decoder flushing. This is fragile, hard to follow, and
unnecessarily convoluted, since more reliable information can be
obtained directly from post-decoding values.
The new code keeps track of the last decoded frame pts and estimates its
duration based on a number of heuristics. Timestamps generated when both
pts and pkt_dts are missing are then simple pts+duration of the last frame.
The heuristics are somewhat complicated by the fact that lavf insists on
making up packet timestamps based on its highly incomplete information.
That should be removed in the future, allowing to further simplify this
code.
The results of the following tests change:
* h264-3386 now requires -fps_mode passthrough to avoid dropping frames
at the end; this is a pathology of the interaction of the new and old
code, and the fact that the sample switches from field to frame coding
in the last packet, and will be fixed in following commits
* hevc-conformance-DELTAQP_A_BRCM_4 stops inventing an arbitrary
timestamp gap at the end
* hevc-small422chroma - the single frame output by this test now has a
timestamp of 0, rather than an arbitrary 7
Changes the result of the h264_redundant_pps-mov test, where the output
timebase is now 1001/24000 instead of 1/24. This is more correct, as the
source file actually is 23.98fps.
Timestamps in two FATE H.264 conformance tests now start at 1 instead
of 0, which also happens in some other H.264 tests before this commit
and so is not a big issue.
Conversely, timestamps in some HEVC conformance tests start from a
smaller value now.
Ideally this should be addressed later in a more general way.
h264-conformance-frext-frext2_panasonic_b no longer requires -vsync
passthrough.
This field contains different values depending on whether the stream is
being decoded or not. When it is, InputStream.pts is set to the
timestamp of the last decoded frame. Otherwise, it is made equal to
InputStream.dts.
Since a given InputStream can be at the same time decoded and
streamcopied to any number of output streams, this use is incorrect, as
decoded frame timestamps can be delayed with respect to input packets by
an arbitrary amount (e.g. depending on the thread count when frame
threading is used).
Replace all uses of InputStream.pts for streamcopy with InputStream.dts,
which is its value when decoding is not performed. Stop setting
InputStream.pts for pure streamcopy.
Also, pass InputStream.dts as a parameter to do_streamcopy(), which
will allow that function to be decoupled from InputStream completely in
the future.
Which is subtitle encoding. Also, check for AVSubtitle.pts rather than
InputStream.pts, since that is the more authoritative value and is
guaranteed to be valid.
That function only contains two checks now - whether the muxer returned
EOF and whether the packet timestamp is before requested output start
time.
The first check is unnecessary, since the packet will just be rejected
by the muxer. The second check is better combined with a related check
directly in do_streamcopy().
Currently, output streams where an input stream is sent directly (i.e.
not through lavfi) are determined by iterating over ALL the output
streams and skipping the irrelevant ones. This is awkward and
inefficient.
The channel layout is set before opening the encoder, in enc_open().
Messing with it in configure_output_audio_filter() cannot accomplish
anything meaningful.
This option adds a long string of numbers to the progress line, where
i-th number contains the base-2 logarithm of the number of times a frame
with this QP value was seen by print_report().
There are multiple problems with this feature:
* despite this existing since 2005, web search shows no indication
that it was ever useful for any meaningful purpose;
* the format of what is printed is entirely undocumented, one has to
find it out from the source code;
* QP values above 31 are silently ignored;
* it only works with one video stream;
* as it relies on global state, it is in conflict with ongoing
architectural changes.
It then seems that the nontrivial cost of maintaining this option is not
worth its negligible (or possibly negative - since it pollutes the
already large option space) value.
Users who really need similar functionality can also implement it
themselves using -vstats.
Current code in print_final_stats(), printing the final summary such as
video:8851kB audio:548kB subtitle:0kB other streams:0kB global headers:20kB muxing overhead: 0.559521%
was written with a single output file in mind and makes very little
sense otherwise.
Print this information in mux_final_stats() instead, one line per output
file. Use the correct filesize, if available.
This is currently done in two places:
* at the end of print_final_stats(), which merely prints a warning if
the total size of all written packets is zero
* at the end of transcode() (under a misleading historical 'close each
encoder' comment), which instead checks the packet count to implement
-abort_on empty_output[_stream]
Consolidate both of these blocks into a single function called from
of_write_trailer(), which is a more appropriate place for this. Also,
return an error code rather than exit immediately, which ensures all
output files are properly closed.
Properly pass muxing return codes through the call stack instead.
Slightly changes behavior in case of errors:
* the output IO stream is closed even if writing the trailer returns an
error, which should be more correct
* all files get properly closed with -xerror, even if one of them fails
It is video encoding-only and does not need to be visible outside of
ffmpeg_enc.c
Also, rename the variable to frames_prev_hist to be consistent with
the naming in do_video_out().
This allows weird subsampling with progressive JPEGs to be decoded,
such as full-RG and only B subsampled.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Drop unneeded ctype.h and math.h.
Group all system headers together.
Sort unconditional includes alphabetically.
Group local includes by the library, sort alphabetically.
Several places in the code currently call init_output_stream_wrapper(),
which in turn calls init_output_stream(), which then calls either
enc_open() or init_output_stream_streamcopy(), followed by
of_stream_init(), which tells the muxer the stream is ready for muxing.
All except one of these callers are in the encoding code, which will be
moved to ffmpeg_enc.c. Keeping this structure would then necessitate
ffmpeg_enc.c calling back into the common code in ffmpeg.c, which would
then just call ffmpeg_mux, thus making the already convoluted call chain
even more so.
Simplify the situation by using separate paths for filter-fed output
streams (audio and video encoders) and others (subtitles, streamcopy,
data, attachments).
Encoder initialization is currently split rather arbitrarily between
init_output_stream_encode() and init_output_stream(). Move all of it to
init_output_stream_encode().
The code currently uses lavfi for this, which creates a sort of
configuration dependency loop - the encoder should be ideally
initialized with information from the first audio frame, but to get this
frame one needs to first open the encoder to know the frame size. This
necessitates an awkward workaround, which causes audio handling to be
different from video.
With this change, audio encoder initialization is congruent with video.
For audio AVFrames, nb_samples is typically more trustworthy than
duration. Since sync queues look at durations, make sure they match the
sample count.
The last audio frame in the fate-shortest test is now gone. This is more
correct, since it outlasts the last video frame.
stride value is not relevant with unpadded content and the total count
of pixels (width x height) must be used instead of the rounding based on
width only then multiplied by height
unpadded_10bit value computing is moved sooner in the code in order to
be able to use it during computing of minimal content size. Also make sure to
only set it for 10bit.
Fix 'Overread buffer' error when the content is not lucky enough to have
(enough) padding bytes at the end for not being rejected by the formula
based on the stride value
Fixes ticket #10259.
Signed-off-by: Jerome Martinez <jerome@mediaarea.net>
Signed-off-by: Marton Balint <cus@passwd.hu>
Extend the decklink output to include support for compressed AC-3,
encapsulated using the SMPTE ST 377:2015 standard.
This functionality can be exercised by using the "copy" codec when
the input audio stream is AC-3. For example:
./ffmpeg -i ~/foo.ts -codec:a copy -f decklink 'UltraStudio Mini Monitor'
Note that the default behavior continues to be to do PCM output,
which means without specifying the copy codec a stream containing
AC-3 will be decoded and downmixed to stereo audio before output.
Thanks to Marton Balint for providing feedback.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
The Matroska spec requires it to be equal to the highest BlockAddID value in a
BlockAdditions, but being purely an informative element, only abort if strict
compliance is requested, as demuxing is otherwise unaffected.
Signed-off-by: James Almer <jamrial@gmail.com>
Implement support for including AFD in decklink output when putting
out 10-bit VANC data.
Updated to reflect feedback in 2018 from Marton Balint <cus@passwd.hu>,
Carl Eugen Hoyos <ceffmpeg@gmail.com> and Aaron Levinson
<alevinsn_dev@levland.net>. Also includes fixes to set the AR field
based on the SAR, as well as now sending the AFD info in both fields
for interlaced formats.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
RIP, if exists is the last KLV item in the MXF files therefore we can stop
parsing the file if it is encountered. This allows us to support files created by
broken muxers such as OpenCube MXFTk Advanced 2.8.0.0.1. which dumps some extra
garbage after the RIP.
Signed-off-by: Marton Balint <cus@passwd.hu>
The function now accepts an existing buffer to avoid unnecessary allocations,
as well as only reporting the needed amount of bytes if you pass a NULL pointer
as input for data.
For this, both parameters become input and output, as well as making data
optional. This is backwards compatible, and as such not breaking any existing
use of the function in external code (if there's any).
Signed-off-by: James Almer <jamrial@gmail.com>
Current log messages talk about 'suitable' output formats. This is
confusing, because it suggests that some formats are suitable, while
others are not, which is not the case.
Also, suggest possible user actions when format cannot be guessed from a
filename.
An uninitialized AVFormatContext instance with neither iformat nor
oformat set will currently log as 'NULL', which is confusing and
unhelpful. Print 'AVFormatContext' instead, which provides more
information.
This happens e.g. if choosing an output format fails in
avformat_alloc_output_context2().
E.g. with the following commandline:
ffmpeg -i <input> -f foobar -y /dev/null
before: [NULL @ 0x5580377834c0] Requested output format 'foobar' is not a suitable output format
after: [AVFormatContext @ 0x55fa15bb34c0] Requested output format 'foobar' is not a suitable output format
When using fractional framerates (or any fraction with a numerator != 1),
DTS values for packets would be calculated incorrectly.
Signed-off-by: Kyle Manning <tt2468@irltoolkit.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
3GPP TS 26.244 Table 8.10 specifies that longitude is written before
latitude. The MOV demuxer already expects the correct order. So, write
them in that order.
However, the user supplied string may also be used in MOV mode which
requires ISO 6709 format which specifies latitude first. The demuxer
also exports the loci value in that format. So parser adjusted as well.
Fixes: signed integer overflow: -532410125 + -1759642300 cannot be represented in type 'int'
Fixes: 57045/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WAVARC_fuzzer-637023665297817
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When writing a subtitle SSA/ASS subtitle file, the AVCodecParameters::extradata
buffer is written directly to the output. In the case where the buffer is
filled from a matroska source file produced by some older versions of
Handbrake, this buffer ends with a null terminating character, which is then
erroneously copied into the middle of the output file. The change here avoids
this problem by treating it as a string rather than a raw buffer. This way it
is agnostic as to whether the source buffer was null terminated or not.
Fixes ticket #10203.
Reported-by: Tim Angus <tim at ngus.net>
Signed-off-by: Marton Balint <cus@passwd.hu>
When including the header in decklink_enc.cpp it would be fed
through the C++ compiler rather than the C compiler, which has
more strict warnings when comparing signed/unsigned values.
Make the local variables unsigned to match the arguments they are
being passed for those functions.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Clarify failure in case of x/y building a too big matrix.
Example:
$ ffmpeg -hide_banner -f lavfi -i color=c=white:size=640x360,unsharp=lx=5:ly=23 -f null -t 1 -
[Parsed_unsharp_1 @ 0x5650e1c30240] luma matrix size (lx/2+ly/2)*2=26 greater than maximum value 25
color=c=white:size=640x360,unsharp=lx=5:ly=23: Invalid argument
Fix trac issue:
http://trac.ffmpeg.org/ticket/6033
In particular, add a sentence to introduce the example, and add a
simpler starting example with no options.
Also use different format for input.avi and output.mp4, to convey
that the conversion also works on the container format.
Address issue:
http://trac.ffmpeg.org/ticket/8730
Document metadata entries set by the filter, extend and clarify
options, add additional example showing how to extract the generated
data.
Address issue:
http://trac.ffmpeg.org/ticket/8766
The only encoders avaliable using mediacodec were h264 and hevc. This
patch adds the vp9 encoder.
Signed-off-by: Samuel Mira <samuel.mira@qt.io<mailto:samuel.mira@qt.io>>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Previously, the ff_configure_buffers_for_index function had
upper sanity limits of 16 MB (1<<24) for buffer_size and
8 MB (1<<23) for short_seek_threshold.
However, if the index contained entries with a much larger
delta, setting pos_delta to a value larger than the sanity
limit, we would end up not increasing the buffer size at all.
Instead, ignore the individual deltas that are excessive, but
increase the buffer size based on the deltas that are below the
sanity limit.
Only count deltas that are below 1<<23, 8 MB; pos_delta gets doubled
before setting the buffer size - this matches the previous maximum
buffer size of 1<<24, 16 MB.
This can happen e.g. with a mov file with some tracks containing
some samples that belong in the start of the file, at the end of
the mdat, while the rest of the file is mostly reasonably interleaved;
previously those samples caused the maximum pos_delta to skyrocket,
skipping any buffer size enlargement.
Signed-off-by: Martin Storsjö <martin@martin.st>
When scanning through the index, account for the fact that the
compared samples may be located in an unexpected order in the file;
this function is mainly interested in the absolute difference between
file locations.
Signed-off-by: Martin Storsjö <martin@martin.st>
While the inline cabac assembly has worked correctly in i386 builds
historically, modern compiler updates has started showing issues
with it, when the function gets inlined into larger contexts that
fail to provide the amount of free registers as this function
requires.
This was an issue with Clang on Windows on i386, which was fixed
in c6d284b945324a7bc70ea8b9056040c8148aa835. However, recently
the same issues also have started showing up with GCC (both for
Windows and Linux). Whether the issue appears seems dependent on
a lot of optimizer tuning (e.g. the issue appears or goes away
depenent on the combinaton of -march= and -mtune= options),
potentially due to the compiler making different decisions on
how much to inline.
Fixes: https://trac.ffmpeg.org/ticket/8903
Signed-off-by: Martin Storsjö <martin@martin.st>
Add support for writing sBIT chunks, which mark the significant
bit depth of the PNG file. This obtains the metadata using the field
bits_per_raw_sample of AVCodecContext.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Add support for reading sBIT chunks, which mark the significant
bit depth of the PNG file. This passes the metadata using the field
bits_per_raw_sample of AVCodecContext.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Some specifications require the size of ld/eld frames to be 480 samples
instead of the default 512. libfdk_aac provides an option to set an alternative
frame size, but it's not exposed via the ffmpeg interface.
This patch adds a frame_length option to solve this problem.
Signed-off-by: Raphael Schlarb <info@raphael.schlarb.one>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: James Almer <jamrial@gmail.com>
Otherwise the output format is not changed when output is in system
memory. For example, the output format is still p010le in the following
case:
$ ffmpeg -qsv_device /dev/dri/renderD128 -f lavfi -i testsrc -vf
"format=p010le,vpp_qsv=extra_hw_frames=8:format=nv12" -f null -
...
Output #0, null, to 'pipe:':
Metadata:
encoder : Lavf60.4.100
Stream #0:0: Video: wrapped_avframe, p010le(tv, progressive), 320x240
[SAR 1:1 DAR 4:3], q=2-31, 200 kb/s, 25 fps, 25 tbn
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Drop mention of unsupported N:M syntax, dropped since 0ed61546c4.
Also, drop reference of common expression constants, and fix
description of a expression parameter.
Fix issue:
http://trac.ffmpeg.org/ticket/9974
Reference drawtext textfile option and ffmpeg -filter_complex_script
and -filter_script as possible solutions to avoid shell escaping.
Address issue:
http://trac.ffmpeg.org/ticket/9008
This patch add another ARIB caption decoder using libaribcaption
external library.
Unlike libaribb24, it supports 3 types of subtitle outputs:
* text: plain text
* ass: ASS formatted text
* bitmap: bitmap image
Default subtitle type is ass as same as libaribb24.
Advantages compared with libaribb24 on ASS subtitle are:
* Subtitle positioning.
* Multi-rect subtitle: some captions are displayed at different
position at a time.
* More stability and reproducibility.
To compile with this feature:
* libaribcaption external library has to be pre-installed.
https://github.com/xqq/libaribcaption
* configure with `--enable-libaribcaption` option.
`--enable-libaribb24` and `--enable-libaribcaption` options are
not exclusive. If both enabled, libaribcaption precedes as
order listed in `libavcodec/allcodecs.c`.
Signed-off-by: rcombs <rcombs@rcombs.me>
Some additional properties are set for ARIB caption.
* need_parsing = 0
ARIB caption doesn't require any parser.
This avoids "parser not found" warning message.
* need_context_update = 1
When any profiles are changed, set this flag to notify.
Signed-off-by: rcombs <rcombs@rcombs.me>
To support bitmap subtitle output, remove AV_CODEC_PROP_TEXT_SUB
property from codec descriptor for AV_CODEC_ID_ARIB_CAPTION.
This is similar to `libavcodec/libzvbi-teletextdec.c`
(AV_CODEC_ID_DVB_TELETEXT).
Instead, each subtitle decoder has to specify a subtitile format.
`libavcodec/libaribb24.c` uses same AV_CODEC_ID_ARIB_CAPTION and
expects AV_CODEC_PROP_TEXT_SUB to be set, so this adds a line to
specify a format there.
Signed-off-by: rcombs <rcombs@rcombs.me>
Currently this would not be done if max_frames is triggered.
Makes no difference in either of the tools currently using
decode_simple, but may be important in future tools.
The previous commit allowed turning on error correction for interlaced
samples. Turning it off amounts to a no-op for FATE. These samples
should be tested with EC1-3 (guess_mvs/deblock/favor_inter)
additionally.
Signed-off-by: J. Dekker <jdek@itanimul.li>
This change on its own is almost certainly not correct; however, in
testing, many samples show notable improvement.
Signed-off-by: J. Dekker <jdek@itanimul.li>
This allows users to specify an argument such as './configure
--enable-lto=thin' to use ThinLTO with Clang. The old functionality with
'./configure --enable-lto' is preserved.
Signed-off-by: J. Dekker <jdek@itanimul.li>
According to MXF specs the Stored Rectangle corresponds to the data which is
passed to the compressor and received from the decompressor, so they should
contain the width / height extended to the macroblock boundary.
In practice however width and height values rounded to the upper 16 multiples
are only seen when muxing MPEG formats. Therefore this patch changes stored
width and height values to unrounded for all non-MPEG formats, even macroblock
based ones.
For DNXHD the specs (ST 2019-4) explicitly indicates to use 1080 for 1088p.
For ProRes the specs (RDD 44) only refer to to ST 377-1 without precision but
no known commercial implementations are using rounded values.
DV is not using 16x16 macroblocks, so 16 rounding makes no sense.
The patch also fixes Sampled Width / Display Width to use unrounded values.
Signed-off-by: Marton Balint <cus@passwd.hu>
Add the appropriate descriptors to the MPEG-TS demux and mux to
ensure that SMPTE 2038 VANC streams are properly preserved
when using codec copy (including adding the appropriate PMT
descriptors).
The focus of this patch is TS input to TS output. A separate
patch adds support for output of 2038 VANC over decklink SDI.
Thanks to Marton Balint for feedback to preserve ABI compatibility.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Add stack_internal.h to the list of skipped headers to fix
make checkheaders fail, it's introduced by commit 742dfa281
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
This way we can clean up separate definitions in functions with
just a single loop, as well as have no reuse between different
loops' counters in functions with multiple.
This is more correct, but was not possible before the recently-added
filtergraph parsing API.
Also, only pass hw devices to filters that are flagged as capable of
using them.
Tested-by: Niklas Haas
This reverts commit 9413bdc381.
That commit broke the fate HEVC tests - unfortunately I only
tested checkasm for that patch, and that function is still
lacking checkasm coverage.
Signed-off-by: Martin Storsjö <martin@martin.st>
pts may not be set on input packets, which could result in the entire stream
being discarded.
This reverts commit 8fc2dedfe6, reintroducing the
behavior it replaced but now allowing the caller to manually drop the preroll
samples by looking at the skip_samples side data at the start while ignoring it
on seek, by setting the skip_manual avctx flag.
Fixes ticket #10251.
Suggested-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
These fields are supposed to store information about the packet the
frame was decoded from, specifically the byte offset it was stored at
and its size.
However,
- the fields are highly ad-hoc - there is no strong reason why
specifically those (and not any other) packet properties should have a
dedicated field in AVFrame; unlike e.g. the timestamps, there is no
fundamental link between coded packet offset/size and decoded frames
- they only make sense for frames produced by decoding demuxed packets,
and even then it is not always the case that the encoded data was
stored in the file as a contiguous sequence of bytes (in order for pos
to be well-defined)
- pkt_pos was added without much explanation, apparently to allow
passthrough of this information through lavfi in order to handle byte
seeking in ffplay. That is now implemented using arbitrary user data
passthrough in AVFrame.opaque_ref.
- several filters use pkt_pos as a variable available to user-supplied
expressions, but there seems to be no established motivation for using them.
- pkt_size was added for use in ffprobe, but that too is now handled
without using this field. Additonally, the values of this field
produced by libavcodec are flawed, as described in the previous
ffprobe conversion commit.
In summary - these fields are ill-defined and insufficiently motivated,
so deprecate them.
These fields are ad-hoc and will be deprecated. Use the recently-added
AV_CODEC_FLAG_COPY_OPAQUE to pass arbitrary user data from packets to
frames.
Changes the result of the flcl1905 test, which uses ffprobe to decode
wmav2 with multiple frames per packet. Such packets are handled
internally by calling the decoder's decode callback multiple times,
offsetting the internal packet's data pointer and decreasing its size
after each call. The output pkt_size value before this commit is then
the remaining internal packet size at the time of each internal decode
call.
After this commit, output pkt_size is simply the size of the full packet
submitted by the caller to the decoder. This is more correct, since
internal packets are never seen by the caller and should have no
observable outside effects.
The AVPacket given to an encoder's encode callback
is unreferenced generically on error.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The AVPacket given to an encoder's encode callback
is unreferenced generically on error.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The AVPacket given to an encoder's encode callback
is unreferenced generically on error.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
FFmpeg's assembly code currently does not abide by the
plattform-specific ABIs wrt its handling of the X86 MMX flag:
Resetting the MMX state is deferred to avoid doing it multiple times
instead of ensuring that the CPU is in floating point state
upon return from any function.
Furthermore, resetting said state is sometimes done generically,
namely for all the decoders using the ordinary decode callback;
yet this is not done for the decoders using the receive_frame API.
This led to problems when MJPEG (and the MJPEG-based decoders)
were switched to the receive_frame API in commit
e9a2a87773, because ff_mjpeg_decode_sos()
only resets the MMX state on success, not on failure.
Such issues are probably still possible with SMVJPEG, which still
uses the receive_frame API. See issue #10210.
This commit therefore also resets the MMX state for
the receive_frame API to avoid any more surprises of this sort.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The base_enable_flag is parallel to three_Spline_enable_flag. The
typesetting of the specification is very misleading.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
It conflicts the comments. The operation based on Delta_enable_mode
can be applied by user during tone mapping.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The original code would strip off the PTS/DTS of any packets
which had a stream ID of STREAM_ID_PRIVATE_STREAM_1. While the
intent was to apply this to asynchronous KLV packets, it was
being applied to any codec that had that same stream ID (including
types such as SMPTE 2038).
Add a clause to the if() statement to ensure it only gets applied
if the codec actually is KLV.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
ipcm is defined by ISO/IEC 23003-5, not defined by quicktime. After
adding ISO/IEC 23003-5 support, we don't need it for ticket #9219.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
This also fixes the number of planes for the NV formats
(this seems to not have caused any problems).
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Jan Ekström <jeebjp@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
SSIM360Context.ssim360_hist is an array of four pointers to double;
so sizeof(*ssim360_hist[0]) (=sizeof(double)) is the correct size
to use to calculate the amount of memory to allocate, not
sizeof(*ssim360_hist) (which is sizeof(double*)).
Use FF_ALLOCZ_TYPED_ARRAY to avoid this issue altogether.
Fixes Coverity issue #1520671.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Jan Ekström <jeebjp@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Generally if maxrate is set, the calculation should be maxrate over
bufsize. This additionally enables CRF + maxrate & bufsize usage.
In order to keep negative values from enabling zero to be treated
as larger and causing a division by zero, check that one of the
variables is larger than zero.
This has not been functional since a year ago, including in our current
minimum dependency of libplacebo (v4.192.0). It also causes build errors
against libplacebo v6, so it needs to be removed from the code. We can
keep the option around for now, but it should also be removed soon.
Signed-off-by: Niklas Haas <git@haasn.dev>
Signed-off-by: James Almer <jamrial@gmail.com>
This reverts commit 205117d87f.
This didn't fix make checkheaders after all, and also broke compilation in some
scenarios.
Signed-off-by: James Almer <jamrial@gmail.com>
It is currently abused to store packet size, which breaks
AV_CODEC_FLAG_COPY_OPAQUE.
Use stream_index instead, which is unused in libavcodec and has the
same type as size.
Found-by: Martin Storsjö
This includes Mastering Display, Content light level, and some ITU-T T35
metadata like closed captions and HDR10+.
Signed-off-by: James Almer <jamrial@gmail.com>
If the packets returned by libvpx and our internal frame properties
queue get desynchronized for some reason (should not happen, but it is
not clear libvpx API guarantees this), we will keep adding to the queue
indefinitely and never remove anything.
Change the code to drain the queue even if timestamps do not match.
This AVFifo is used to propagate HDR metadata from input frames to
output packets, since libvpx does not allow passing through arbitrary
user data.
It will be extended to pass through other kinds of data in future
commits, so give it a more generic name.
It is not used, except to check whether the packet is valid before
writing HDR metadata to the packet in storeframe(). However, that check
serves no purpose, as the encoded packet is already treated as valid
higher up in this function.
svg is xml, but <?xml is not required,
it can start with <svg and can have multiple empty lines,
or start with <!-- include some comments,
but must first line if start with <?xml.
Signed-off-by: Wang Yaqiang <wangyaqiang03@kuaishou.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Due to refactoring, the ctx/cctx variables are never actually used
in ff_decklink_write_packet(), so just remove them.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
The ff_decklink_write_packet() was always caching the last pts
received, to be used when calling StopScheduledPlayback(). However
because audio and video are on different timebases and the call to
StopScheduledPlayback() expects the video timebase, we'll end up
sending a weird value to the stop routine if the last packet
received contained audio.
Move the setting of last_pts to just be for the video stream.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
The existing code assumed that the first frame received by the decklink
output would always be PTS zero. However if running in other timing
modes than the default of CBR, items such as frame dropping at the
beginning may result in starting at a non-zero PTS.
For example, in our setup because we discard probing data and run
with "-vsync 2" the first video frame scheduled to the decklink
output will have a PTS around 170. Scheduling frames too far into
the future will either fail or cause a backlog of frames scheduled
far enough into the future that the entire pipeline will stall.
Issue can be reproduced with the following command-line:
./ffmpeg -copyts -i foo.ts -f decklink -vcodec v210 -ac 2 'DeckLink Duo (4)'
Keep track of the PTS of the first frame received, so that when
we enable start playback we can provide that value to the decklink
driver.
Thanks to Marton Balint for review and suggestion to use
AV_NOPTS_VALUE rather than zero for the initial value.
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
The path attribute in the Set-Cookie header is optional but treated by ffmpeg as being compulsory.
Signed-off-by: Michael J. Walsh <mjfwalsh@gmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Fixes compilation with clang which errors out on Wint-conversion.
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
This fixes the following error when compiling with a modern
version of Clang for Windows/i386:
src/libavutil/hwcontext_vulkan.c:738:32: error: incompatible function pointer types initializing 'PFN_vkDebugUtilsMessengerCallbackEXT' (aka 'unsigned int (*)(enum VkDebugUtilsMessageSeverityFlagBitsEXT, unsigned int, const struct VkDebugUtilsMessengerCallbackDataEXT *, void *) __attribute__((stdcall))') with an expression of type 'VkBool32 (VkDebugUtilsMessageSeverityFlagBitsEXT, VkDebugUtilsMessageTypeFlagsEXT, const VkDebugUtilsMessengerCallbackDataEXT *, void *)' (aka 'unsigned int (enum VkDebugUtilsMessageSeverityFlagBitsEXT, unsigned int, const struct VkDebugUtilsMessengerCallbackDataEXT *, void *)') [-Wincompatible-function-pointer-types]
.pfnUserCallback = vk_dbg_callback,
^~~~~~~~~~~~~~~
Signed-off-by: Martin Storsjö <martin@martin.st>
Restores the behavior of naming the instance filter@id, which was accidentally changed
to simpy id in commit f17051eaae.
Fixes ticket #10226.
Signed-off-by: James Almer <jamrial@gmail.com>
As per 23003-5:2020, the rest of the bits are reserved, and thus
in the future they may be utilized for something else.
Quote:
format_flags is a field of flags that modify the default PCM sample format.
Undefined flags are reserved and shall be zero. The following flag is defined:
0x01 indicates little-endian format. If not present, big-endian format is used.
In texi source, @ characters need to be escaped.
This fixes the following build errors:
community.texi:59: unknown command `ffmpeg'
community.texi:143: unknown command `ffmpeg'
Signed-off-by: Martin Storsjö <martin@martin.st>
Remove doc/dev_communit markup files completely as they are at the wrong place.
Create a new community page, merging all of doc/dev_community and subsection Code of Conduct into a common place.
The corresponding patch to ffmpeg-web puts the Organisation & Code of Conduct into a seperate community chapter on the FFmpeg website.
When hls_init_time should available when hls_list_size > 0.
Because the list will not refresh new top line segment when hls_list_size is 0
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
According to description of vaExportSurfaceHandle in libva, vaSyncSurface
must be called if the contents of the surface will be read.
Fixes ticket #9967.
Reviewed-by: Zhao Zhili <zhilizhao@tencent.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
MSDK/VPL uses 420 chroma format as default to encode RGB, and this is
not a proper usage. Now enable 444 encoding for RGB input by default.
When main profile is used, RGB input is still encoded in 420 format.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Note that Screen-Extended Main 4:4:4 and 4:4:4 10 supports
chroma_format_idc from 0, 1 or 3, hence both 420 and 444 are
supported.
Signed-off-by: Linjie Fu <linjie.justin.fu@gmail.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Screen Content Coding allows non-intra slice in an IRAP frame which can
reference the frame itself, and would mark the current decoded picture
as "used for long-term reference", no matter TwoVersionsOfCurrDecPicFlag(8.1.3),
hence some previous restricts are not suitable any more.
Constructe RefPicListTemp and RefPicList according to 8-8/9/10. Disable
slice decoding for SCC profile to avoid unexpected error in hevc native
decoder and patch welcome.
Signed-off-by: Linjie Fu <linjie.justin.fu@gmail.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
According to 7.3.6.1, use_integer_mv_flag should be parsed if
motion_vector_resolution_control_idc equals to 2. If not present, it
equals to motion_vector_resolution_control_idc.
Signed-off-by: Linjie Fu <linjie.justin.fu@gmail.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
1. Add extension syntax according to 7.3.2.2.3/7.3.2.3.3 in T-REC-H.265-201911.
2. Keep using parsed PPS when bitstream overread for compatibility. For
example, the clip PS_A_VIDYO_3.bit in FATE test has incomplete extension
syntax which will be overread and un-decodable if without this change.
3. Format brace in pps_range_extensions().
Signed-off-by: Linjie Fu <linjie.justin.fu@gmail.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Described in HEVC spec A.3.7. Bump minor version and add APIchanges
entry for new added profile.
Signed-off-by: Linjie Fu <linjie.justin.fu@gmail.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Replace cpucfg with getauxval to avoid crash in case of
some processor capabilities are not supportted by kernel used.
Reviewed-by: Steven Liu <liuqi05@kuaishou.com>
Fixes: out of array access
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BINK_fuzzer-6657932926517248
Alterantivly to this it is possibly to allocate a bigger array
Note: oss-fuzz assigned this issue to a unrelated theora bug so the bug number matches that
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array read on 32bit
Fixes: 54857/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VC1_fuzzer-5840588224462848
The chroma MC code reads over the currently allocated frame.
Alternative fixes would be allocating a few bytes more at the end instead of a whole
line extra or to adjust the threshold where the edge emu code is activated
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access:
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PNG_fuzzer-6716193709096960
Alternatively it should be possible to limit this to 3 plane RGB 8 /16bit to ensure the size is what it should be
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Such filters will not advance and be stuck in the current implementation
Fixes: Infinite loop
Fixes: 56052/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_RKA_fuzzer-5236218750435328
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: left shift of negative value -3201
Fixes: integer overflow: -76470276 * -25608 cannot be represented in type 'int'
Fixes: 56052/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_RKA_fuzzer-5236218750435328
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: division by zero
Fixes: 55940/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-6333107679920128
The decoder does not support bps=1 and i have no such sample so it is not
known if this duration is correct. Alternatively we could error out on all
bps we currently do not support on the decoder side or not set duration.
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 9223372036854775584 + 536870912 cannot be represented in type 'long'
Fixes: 55844/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-510613920664780
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Remove CONFIG_VAAPI for VUYX, YUYV422, Y210, XV30, Y212, XV36.
Make 8-bit, 10-bit, 12-bit YUV 4:2:2 video sources as well as YUV 4:4:4
video sources supported by d3d11va and dxva2 just like what VAAPI does.
Sign-off-by: Tong Wu <tong1.wu@intel.com>
Add support for VUYX, YUYV422, Y210, XV30, P012, Y212, XV36.
The added formats work with qsv acceleration and will not have
impact on dxva2 acceleration(-hwaccel dxva2) since so far
these formats are still not supported by using dxva2 acceleration.
Hwupload and hwdownload can work with the added formats.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Add support for VUYX, YUYV422, Y210, XV30, P012, Y212, XV36.
The added formats work with qsv acceleration and will not have
impact on d3d11va acceleration(-hwaccel d3d11va) since so far
these formats are still not supported by using d3d11va acceleration.
Hwupload and hwdownload can work with the added formats.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Current mpegtsenc code only inserts SPS/PPS from extradata before IDR frames if
AUD is also inserted.
Unfortunately some encoders may preface a key frame with an AUD, but no
SPS/PPS. In that case current code does not repeat the "extradata" and the
resulting HLS stream may become noncompliant and unjoinable.
Fix this by always inserting SPS/PPS and moving AUD to the beginning of the
packet if it is already present.
Fixes ticket #10148.
Signed-off-by: Marton Balint <cus@passwd.hu>
FLV spec only has AVC end of sequence tag, and the EOS tag has a
CodecID as other video data packet. MPEG4 doesn't conformance to
the spec, but it's there for a decade. So only 'fix' the EOS tag
rather than remove it completely.
Reviewed-by: Steven Liu <lq@chinaffmpeg.org>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
HLS segments may be MPEG-TS or fragmented MP4, so those (de)muxers are
required for reading/writing HLS media segments.
Fixes functionality with --disable-everything --enable-demuxer=hls
--enable-muxer=hls
When encode RGB frame, Intel driver convert RGB to YUV, so we cannot
set RGB colorspace to VPL/MSDK.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
CC libavcodec/libx265.o
libavcodec/libx265.c: In function ‘libx265_encode_frame’:
libavcodec/libx265.c:781:5: warning: this ‘else’ clause does not guard... [-Wmisleading-indentation]
else
^~~~
libavcodec/libx265.c:782:1: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘else’
FF_DISABLE_DEPRECATION_WARNINGS
^~~
Signed-off-by: Marton Balint <cus@passwd.hu>
ISOBMFF (14496-12) made this field ('channelcount') in the
AudioSampleEntry structure non-template¹ somewhere before the
release of the 2022 edition. As for ETSI TS 126 244 AKA 3GPP
file format (V16.1.0, 2020-10), it does not seem contain any
references limiting the channelcount entry in AudioSampleEntry
or in its own definition of EVSSampleEntry.
fate-mov-mp4-chapters test had to be adjusted as it output a
mono vorbis stream, which would now be properly marked as such
in the container.
1: As per 14496-12:
Fields shown as “template” in the box descriptions are fields
which are coded with a default value unless a derived
specification defines their use and permits writers to use
other values than the default.
When building FFMPEG in the MSYS environment under Windows, one
must not use forward slashes ('/') for command-line options. It
appears that the MSYS shell interprets these as absolute paths and
then automatically rewrites them into equivalent Windows paths. For
example, the '/nologo' switch below gets rewritten to something like
'C:/Program Files/Git/nologo', and this obviously breaks the build.
Thankfully, most M$ tools accept dashes ('-') as well.
Signed-off-by: Ziemowit Łąski <15880281+zlaski@users.noreply.github.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Their usefulness is questionable, very few decoders set them, and their type
should have been int64_t. A replacement field can be added later if a valid use
case is found.
Signed-off-by: Marton Balint <cus@passwd.hu>
Frame counters can overflow relatively easily (INT_MAX number of frames is
slightly more than 1 year for 60 fps content), so make sure we use 64 bit
values for them.
Also deprecate the old 32 bit frame_number attribute.
Signed-off-by: Marton Balint <cus@passwd.hu>
Many filters accept user-provided data that is cumbersome to provide as
text strings - e.g. binary files or very long text. For that reason such
filters typically provide a option whose value is the path from which
the filter loads the actual data.
However, filters doing their own IO internally is a layering violation
that the callers may not expect, and is thus best avoided. With the
recently introduced graph segment parsing API, loading option values
from files can now be handled by the caller.
This commit makes use of the new API in ffmpeg CLI. Any option name in
the filtergraph syntax can now be prefixed with a slash '/'. This will
cause ffmpeg to interpret the value as the path to load the actual value
from.
Callers currently have two ways of adding filters to a graph - they can
either
- create, initialize, and link them manually
- use one of the avfilter_graph_parse*() functions, which take a
(typically end-user-written) string, split it into individual filter
definitions+options, then create filters, apply options, initialize
filters, and finally link them - all based on information from this
string.
A major problem with the second approach is that it performs many
actions as a single atomic unit, leaving the caller no space to
intervene in between. Such intervention would be useful e.g. to
- modify filter options;
- supply hardware device contexts;
both of which typically must be done before the filter is initialized.
Callers who need such intervention are then forced to invent their own
filtergraph parsing, which is clearly suboptimal.
This commit aims to address this problem by adding a new modular
filtergraph parsing API. It adds a new avfilter_graph_segment_parse()
function to parse a string filtergraph description into an intermediate
tree-like representation (AVFilterGraphSegment and its children).
This intermediate form may then be applied step by step using further
new avfilter_graph_segment*() functions, with user intervention possible
between each step.
Also, replace an AVFilterContext argument with a logging context+private
class, as those are the only things needed in this function.
Will be useful in future commits.
This script generates the current general assembly voters according to
the criteria of '20 commits in the last 36 months'.
Signed-off-by: J. Dekker <jdek@itanimul.li>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
libavutil/color_utils contains some avpriv_ symbols that map
enum AVTransferCharacteristic values to gamma-curve approximations and
to the actual transfer functions to invert them (i.e. -> linear).
There's two issues with this:
(1) avpriv is evil and should be avoided whenever possible
(2) libavutil/csp.h exposes a public API for handling color that
already handles primaries and matricies
I don't see any reason this API has to be private, so this commit takes
the functionality from avutil/color_utils and merges it into avutil/csp
with an exposed av_ API rather than the previous avpriv_ API.
Every reference to the previous API has been updated to point to the
new one. color_utils.h has been deleted as well. This should not break
any applications as it only contained avpriv_ symbols in the first
place, so nothing in that header could be referenced by other
applications.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
It has been deprecated in 94d68a41fa
and can't be set via AVOptions. The only codecs that use it
(the MPEG-1/2 encoders) have private options for this.
So remove it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This commit does for AVOutputFormat what commit
20f9727018 did for AVCodec:
It adds a new type FFOutputFormat, moves all the internals
of AVOutputFormat to it and adds a now reduced AVOutputFormat
as first member.
This does not affect/improve extensibility of both public
or private fields for muxers (it is still a mess due to lavd).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
It is the most commonly used field and moving it to the start
e.g. allows to encode the offset in a pointer+offset addressing
mode on one byte on x86.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Analogous to -enc_stats*, but happens right before muxing. Useful
because bitstream filters and the sync queue can modify packets after
encoding and before muxing. Also has access to the muxing timebase.
Since at least 4.4.3, -ab/-b:a help text was in the video section
of ffmpeg -h, but these are audio options.
Signed-off-by: Marth64 <marth64@proxyid.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Current HLS implementation simply skip a failed segment to catch up
the stream, but this is not optimal for some use cases like livestream
recording.
Add an option to retry a failed segment to ensure the output file is
a complete stream.
Signed-off-by: gnattu <gnattuoc@me.com>
Reviewed-by: Steven Liu <liuqi05@kuaishou.com>
We parse the fallback cHRM on decode and correctly determine that we
have BT.709 primaries, but unknown TRC. This causes us to write cICP
where we shouldn't. Primaries without transfer can be handled entirely
by cHRM, so we should only write cICP if we actually know the transfer
function.
Additionally, we should avoid writing cICP if there's an ICC profile
because the spec says decoders must prioritize cICP over the ICC
profile.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
We need to construct the output format list separatedly from the input
format list, because we need to adhere to two extra requirements:
1. Big-endian output formats are always unsupported (runtime error)
2. Combining 'vulkan' with an explicit out_format that is not supported
by the vulkan frame allocation code is illegal and will crash (abort)
As a free side benefit, this rewrite fixes a possible memory leak in the
`fail` path that was present in the old code.
Signed-off-by: Niklas Haas <git@haasn.dev>
It only works on Linux
$ ffmpeg -loglevel verbose -init_hw_device qsv=intel -f lavfi -i \
yuvtestsrc -vf "format=uyvy422,vpp_qsv=format=nv12" -f null -
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Since many distributions ship libjxl 0.7.0 still, we'd still prefer to
compile against that, but don't want to lose the features that require
libjxl 0.8.0 or greater. For this reason I've added preprocessor #ifdef
guards around the features that aren't necessarily in libjxl 0.7.0.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Splits the currently handled subtitle at random access point
packets that can be configured to follow a specific output stream.
Currently only subtitle streams which are directly mapped into the
same output in which the heartbeat stream resides are affected.
This way the subtitle - which is known to be shown at this time
can be split and passed to muxer before its full duration is
yet known. This is also a drawback, as this essentially outputs
multiple subtitles from a single input subtitle that continues
over multiple random access points. Thus this feature should not
be utilized in cases where subtitle output latency does not matter.
Co-authored-by: Andrzej Nadachowski <andrzej.nadachowski@24i.com>
Co-authored-by: Bernard Boulay <bernard.boulay@24i.com>
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
This way we can call process_subtitles without causing the decoded
frame counter to get bumped.
Additionally, this now takes into mention all of the decoded
subtitle frames without fix_sub_duration latency/buffering, or filtering
out decoded reset/end subtitles without any rendered rectangles, which
matches the original intent in 4754345027
.
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
This enables us to later call this when generating additional
subtitles for splitting purposes.
Co-authored-by: Andrzej Nadachowski <andrzej.nadachowski@24i.com>
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
QSVDeintContext and VPPContext have the same base context, and all
features in deinterlace_qsv are implemented in vpp_qsv filter, so
deinterlace_qsv can be taken as a special case of vpp_qsv filter, and we
may use VPPContext with a different option array, preinit callback and
support pixel formats to implement deinterlace_qsv, then remove
QSVDeintContext.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Like what we did for scale_qsv filter, we use QSVVPPContext as a base
context to manage MFX session for deinterlace_qsv filter.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
This is used to control the output at frame rate or field rate when
deinterlace is expected and framerate is not specified.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)
x86inc can automatically determine whether to use REP_RET rather than
REP in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessary, despite the return being nowhere near a
branch.
The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.
In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
libjxl only accepts 16-bit buffers with its API, but it can
accept 9-bit to 15-bit input via a 16-bit buffer, provided the flag
is set declaring the buffer to be of the respective significant depth.
Likewise, it can only provide pixel data on decode as a 16-bit buffer
(if higher than 8) but does provide the metadata tagging the actual bit
depth.
This commit causes libjxlenc.c and libjxldec.c to respect this metadata
and tag/read it accordingly from AVCodecContext->bits_per_raw_sample.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
mfenc sets FF_CODEC_CAP_INIT_CLEANUP, so calling mf_close() on
failure inside mf_init() results in a double-free.
Signed-off-by: Cameron Gutman <aicommander@gmail.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Only warn if the advanced_editlist option is enabled (it is enabled
by default though) so we don't print one warning for each track, and
demote the warning to AV_LOG_LEVEL_VERBOSE; this message does get
generated whenever parsing a fragmented MP4 file, regardless of
whether the file actually uses multiple edits or not.
Later when parsing the mov structures, the demuxer does warn if
the file did contain multiple edits which would require the
advanced_editlist option enabled for decoding correctly.
Adjust the warning message for the case when the file seemed like it
actually would have needed handling of advanced edit lists, to
reflect the fact that it doesn't help to try set the option as
it has been automatically disabled.
Signed-off-by: Martin Storsjö <martin@martin.st>
The construct of using offsetof on a (potentially anonymous) struct
defined within the offsetof expression, while supported by all
current compilers, has been declared explicitly undefined by the
C standards committee [1].
Clang recently got a change to identify this as an issue [2];
initially it was treated as a hard error, but it was soon after
softened into a warning under the -Wgnu-offsetof-extensions option
(not enabled automatically as part of -Wall though).
Nevertheless - in this particular case, it's trivial to fix the
code not to rely on the construct that the standards committee has
explicitly called out as undefined.
[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2350.htm
[2] https://reviews.llvm.org/D133574
Signed-off-by: Martin Storsjö <martin@martin.st>
If the dictionary provided on input contains multiple entries for an
option (relevant for flags modifying the previous value with '+' or
'-') and the option is not found in the target object, only the last
entry would be returned to the caller.
Pass AV_DICT_MULTIKEY to av_dict_set() to make sure all such entries are
returned.
QSVScaleContext and VPPContext have the same base context, and all
features in scale_qsv are implemented in vpp_qsv filter, so scale_qsv
can be taken as a special case of vpp_qsv filter, and we may use
VPPContext with a different option array, preinit callback and supported
pixel formats to implement scale_qsv then remove QSVScaleContext
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Use QSVVPPContext as a base context of QSVScaleContext, hence we may
re-use functions defined for QSVVPPContext to manage MFX session for
scale_qsv filter.
In addition, system memory has been taken into account in
QSVVVPPContext, we may add support for non-QSV pixel formats in the
future.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Hyper Encode uses Intel integrated and discrete graphics on one system
to accelerate encoding of a single video stream.
Depending on the selected parameters and codecs, performance gain on AlderLake iGPU + ARC Gfx up to 1.6x.
More information: https://www.intel.co.uk/content/www/uk/en/architecture-and-technology/adaptix/deep-link.html
Developer guide: https://github.com/oneapi-src/oneVPL-intel-gpu/blob/main/doc/HyperEncode_FeatureDeveloperGuide.md
Hyper Encode is supported only on Windows and requires D3D11 and oneVPL.
To enable Hyper Encode need to specify:
-Hyper Encode mode (-dual_gfx on or dual_gfx adaptive)
-Encoder: h264_qsv or hevc_qsv
-BRC: VBR, CQP or ICQ
-Lowpower mode (-low_power 1)
-Closed GOP for AVC or strict GOP for HEVC -idr_interval = 0 used by default
Depending on the encoding parameters, the following parameters may need
to be adjusted:
-g recommended >= 30 for better performance
-async_depth recommended >= 30 for better performance
-extra_hw_frames recommended equal to async_depth value
-bf recommended = 0 for better performance
In the cases with fast encoding (-preset veryfast) there may be no
performance gain due to the fact that the decode is slower than the encode.
Command line examples:
ffmpeg.exe -init_hw_device qsv:hw,child_device_type=d3d11va,child_device=0 -v verbose -y -hwaccel qsv -extra_hw_frames 60 -async_depth 60 -c:v h264_qsv -i bbb_sunflower_2160p_60fps_normal.mp4
-async_depth 60 -c:v h264_qsv -preset medium -g 60 -low_power 1 -bf 0 -dual_gfx on output.h265
Signed-off-by: galinart <artem.galin@intel.com>
Let's ignore the index table if the number of index entries does not match the
index duration (or the special AVID index entry counts).
Fixes: OOM
Fixes: 50551/clusterfuzz-testcase-minimized-ffmpeg_dem_MXF_fuzzer-6607795234930688
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Marton Balint <cus@passwd.hu>
This is intended to be a more convenient replacement for
reordered_opaque.
Add support for it in the two encoders that offer
AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE: libx264 and libx265. Other
encoders will be supported in future commits.
Some encoders (ffv1, flac, adx) are marked with AV_CODEC_CAP_DELAY onky
in order to be flushed at the end, otherwise they behave as no-delay
encoders.
Add a capability to mark these encoders. Use it for setting pts
generically.
bink supports 16x16 blocks in chroma planes thus we need to allocate enough.
Fixes: out of array access
Fixes: 55026/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BINK_fuzzer-6013915371012096
Reviewed-by: Peter Ross <pross@xvid.org>
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This brings ff_vf_ssim360 in line with its declaration in allfilters.c;
this discrepancy is actually undefined behaviour.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The FILTER_INPUTS and FILTER_OUTPUTS macros already set
AVFilter.(inputs|outputs); Clang therefore emits a warning for
this: "initializer overrides prior initialization of this subobject
[-Winitializer-overrides]"
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Don't use "static const" for compile time float constants, but use
defines. This fixes the following error:
src/libavfilter/vf_ssim360.c(549): error C2099: initializer is not a constant
Signed-off-by: Martin Storsjö <martin@martin.st>
Customized SSIM for various projections (and stereo formats) of 360 images and videos.
Further contributions by:
Ashok Mathew Kuruvilla
Matthieu Patou
Yu-Hui Wu
Anton Khirnov
Suggested-By: ffmpeg@fb.com
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Since those fields will be overridden by videotoolbox_start(), they
should never be set by user, it can trigger memory leaks otherwise.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
In the code path of av_videotoolbox_default_init/init2(),
avctx->internal->hwaccel_priv_data is NULL and passed to
decoder_cb.decompressionOutputRefCon. Then it will be dereferenced
inside videotoolbox_decoder_callback().
Delay videotoolbox_star() until ff_videotoolbox_common_init() to
fix the bug.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
It's the minimum of all child protocols max_packet_size. Can be used
like this:
ffmpeg -re -i cctv.mp4 -c copy -f mpegts \
-protocol_whitelist 'tee,file,udp' \
'tee:out.ts|udp://127.0.0.1:6666?pkt_size=1316'
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
AVMediaCodecDeviceContext without surface or native_window is
useless, it shouldn't be created at all. Such dummy AVHWDeviceContext
is allowed before, and it's used by mpv player. Creating a ANativeWindow
automatically breaks such usecases.
So disable creating a ANativeWindow by default. It can be enabled
via the create_window flag, or by set the AVDictionary of
av_hwdevice_ctx_create(). The downside is that
ffmpeg -hwaccel mediacodec -i input.mp4 \
-c:a copy -c:v hevc_mediacodec output.mp4
use ByteBuffer mode which isn't as efficient as before. The upside
is libavfilter works now, which should be less surprise.
To enable create_window on ffmpeg command line, use
ffmpeg -hwaccel mediacodec \
-init_hw_device mediacodec=mediacodec,create_window=1 \
-i input.mp4 -c:a copy -c:v hevc_mediacodec output.mp4
Users should know what it is to enable create_window. It should
be OK to take sometime to figure out the option. And there are comments
inside hwcontext_mediacodec.h to help user figure it out.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
This commit adds both decode and encode support for cICP chunks, which
allow a PNG image's pixel data to be tagged by any of the enum values in
H.273, without an ICC profile.
Upon decode, if a cICP chunk is present, the PNG decoder will tag output
AVFrames with the resulting enum color, and ignore iCCP, sRGB, gAMA, and
cHRM chunks, as per the spec.
Upon encode, if the color space is known and specified, and it is not sRGB,
the PNG encoder will output a cICP chunk containing the color space. If the
color space is sRGB, then it will output an sRGB chunk instead of a cICP
chunk. If the color space of the input is not unspecified, it will not output
a cICP chunk tagging the PNG as unspecified.
In either the sRGB case or the non-SRGB case, gAMA and cHRM are still written
as fallbacks provided the info is known.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
If an sRGB chunk is present in the PNG file, this commit will cause the
png decoder to ignore the cHRM and gAMA chunks and tag the resulting AVFrames
with BT.709 primaries, and ISO/IEC 61966-2-1 transfer. If these tags are
present in the AVFrame, pngenc.c already writes this chunk, so no change was
needed on the encode-side.
The PNG spec does not define what happens if sRGB and iCCP are present at
the same time, it just recommends that this not happen. As of this patch,
the decoder will have the ICC profile take precedence, and it will not tag
the pixel data as sRGB.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
The cHRM chunk is descriptive. That is, it describes the primaries that should
be used to interpret the pixel data in the PNG file. This is notably different
from Mastering Display Metadata, which describes which subset of the presented
gamut is relevant. MDM describes a gamut and says colors outside the gamut are
not required to be preserved, but it does not actually describe the gamut that
the pixel data from the frame resides in. Thus, to decode a cHRM chunk present
in a PNG file to Mastering Display Metadata is incorrect.
This commit changes this behavior so the cHRM chunk, if present, is decoded to
color metadata. For example, if the cHRM chunk describes BT.709 primaries, the
resulting AVFrame will be tagged with AVCOL_PRI_BT709, as a description of its
pixel data. To do this, it utilizes libavutil/csp.h, which exposes a funcction
av_csp_primaries_id_from_desc, to detect which enum value accurately describes
the white point and primaries represented by the cHRM chunk.
This commit also changes pngenc.c to utilize the libavuitl/csp.h API, since it
previously duplicated code contained in that API. Instead, taking advantage of
the API that exists makes more sense. pngenc.c does properly utilize the color
tags rather than incorrectly using MDM, so that required no change.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Gamma 2.2 and Gamma 2.8 are tagged in the file as 0.45455 and 0.35714,
respectively (i.e. 1/2.2 and 1/2.8). Trying to identify them as 2.2 and
2.8 instead of these values will cause the transfer function to not
properly be recognized. This patch fixes this.
bits_*_signed(0) will currently invoke an undefined shift by
8 * sizeof(int).
Add bits_*_signed_nz() that only works for n>0, analogous to
bits_read_nz(). Add an explicit check for n=0 in bits_*_signed().
Found-by: James Almer
This change improves the performance and multicore scalability of the vp9
codec for streaming single-pass encoded videos by taking advantage of up
to 64 cores in the system. The current thread limit for ffmpeg codecs is 16
(MAX_AUTO_THREADS in pthread_internal.h) due to a limitation in H.264 codec
that prevents more than 16 threads being used.
Experiments show that increasing the thread limit to 64 for vp9 improves
the performance for encoding 4K raw videos for streaming by up to 47%
compared to 16 threads, and from 20-30% for 32 threads, with the same quality
as measured by the VMAF score.
Rationale for this change:
Vp9 uses tiling to split the video frame into multiple columns; tiles must
be at least 256 pixels wide, so there is a limit to how many tiles can be
used. The tiles can be processed in parallel, and more tiles mean more CPU
threads can be used. 4K videos can make use of 16 threads, and 8K videos
can use 32. Row-mt can double the number of threads so 64 threads can be used.
Signed-off-by: James Zern <jzern@google.com>
Fixes: out of array access
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ZERO12V_fuzzer-6714182078955520.fuzz
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ZERO12V_fuzzer-6698145212137472.fuzz
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The same members between QSVVPPContext and VPPContext are removed from
VPPContext, and async_depth is moved from QSVVPPParam to QSVVPPContext
so that all QSV filters using QSVVPPContext may support async depth.
In addition, we may use QSVVPPContext as base context in other QSV
filters in the future so that we may re-use functions defined in
qsvvpp.c for other QSV filters.
This commit shouldn't change the functionality of vpp_qsv / overlay_qsv.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
This is in preparation for reusing the code for other QSV filters. E.g.
deinterlacing_qsv may have an option array without format option
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
QSV filters may set this flag in preinit callback to turn on / off pass
through mode
This is in preparation for reusing the code for other QSV filters. E.g.
scale_qsv filter doesn't support pass through mode.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Set the expected default value for options in this callback, hence we
have the right values even if these options are not included in the
option arrray.
This is in preparation for reusing the code for other QSV filters.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Special values are:
0 = original width/height
-1 = keep original aspect
This is in preparation for reusing the code for other QSV filters.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
This patch provides default value if the expression is NULL.
This is in preparation for reusing the code for other QSV filters.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Also fix the naming style in enum var_name.
This is in preparation for reusing the code for other QSV filters.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
segment_time and segment_times are defined as duration specifications, not
timestamps, so calculation of segment duration must account for initial
timestamp. Fixed.
FATE ref for segment-mp4-to-ts changed on account of avoiding premature
segment cut at the end of the first segment.
Fixes: out of array access
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WBMP_fuzzer-6652634692190208
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WBMP_fuzzer-6653703453278208
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WBMP_fuzzer-6668020758216704
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WBMP_fuzzer-6684749875249152
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This ensures all defined types are supported, even if only to report
their presence and print their size if a custom implementation is
not added.
Signed-off-by: James Almer <jamrial@gmail.com>
Defined by H.274, this SEI message is utilized by iPhones to save
the nominal ambient viewing environment for the display of recorded
HDR content. The contents of the message are exposed to API users
as AVFrame side data containing AVAmbientViewingEnvironment.
As the DV RPU test sample is from an iPhone and includes Ambient
Viewing Environment SEI messages, its test result gets updated.
Parsing should probably be enabled for all codecs, at least for headers,
but e.g. the AAC parser produces 1-byte packets of zero padding with it,
so I'm just enabling it for EAC3 for the moment.
Removed the unnecessary calls to ff_format_io_close
this patch introduced in dashenc_delete_file.
dashenc_delete_file functions open a
new HTTP connection regardless of the http_persistent value,
So change their behaviour to keep http connections open
if http_persistent is set.
Signed-off-by: Basel Sayeh <basel.sayeh@hotmail.com>
Removed the unnecessary calls to ff_format_io_close
this patch introduced in hls_delete_file.
hls_delete_file functions open a new HTTP connection
regardless of the http_persistent value,
So change their behaviour to keep http connections open
if http_persistent is set
Signed-off-by: Basel Sayeh <basel.sayeh@hotmail.com>
The HEIF specification permits specifying the looping behavior of
animated sequences by using the EditList (elst) box. The track
duration will be set to the total duration of all the loops (or
infinite) and the duration of a single loop will be set in the edit
list box.
The default behavior is to loop infinitely.
Compliance verification:
* This was added in libavif recently [1] and the files produced by
ffmpeg after this change have EditList boxes similar to the ones
produced by libavif (and avifdec is able to parse the loop count as
intended).
* ComplianceWarden is ok with the produced files.
* Chrome is able to play back the produced files.
[1] 4d2776a3af
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Allow specifying the movie_timescale options to AVIF ouptut.
This also makes sure that when movie_timescale is not specified,
the default value of 1000 is used instead of 0. Animated AVIF
files which don't specify the movie_timescale will have the
correct duration written in the track and movie headers after this
change (instead of writing 0).
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Fixes: signed integer overflow: -2889074 * 2048 cannot be represented in type 'int'
Fixes: 51363/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BONK_fuzzer-5660734784143360
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BONK_fuzzer-6617680050520064
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BONK_fuzzer-6743951854141440
No check is done for the overflow as this was rejected in last review, see the ML
Note: the 2nd and 3rd testcase was assigned by ossfuzz to a unrelated theora issue (48567)
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SGI_fuzzer-6704753329700864
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SGI_fuzzer-6683986844057600
Fixes: 48567/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SGI_fuzzer-6697387691474944
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2147481600 + 13408 cannot be represented in type 'int'
Fixes: 53963/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-4650467311616000
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
android_get_device_api_level() is a static inline before API level
29. It was implemented via __system_property_get(). We can do the
same thing, but I don't want to mess up with __system_property_get.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
There is no reason to enforce a high minimum. In the context
of streaming only a few output buffers and capture buffers
are even needed for continuous playback. This also helps
alleviate memory pressure when decoding 4K media.
Signed-off-by: Aman Karmani <aman@tmm1.net>
The current code will apply them if the options string does not contain
a 'flags' substring, and will do so by appending the graph-level option
string to the filter option string (with the standard ':' separator).
This is flawed in at least the following ways:
- naive substring matching without actually parsing the options string
may lead to false positives (e.g. flags are specified by shorthand)
and false negatives (e.g. the 'flags' substring is not actually the
option name)
- graph-level sws options are not limited to flags, but may set
arbitrary sws options
This commit simply applies the graph-level options with
av_set_options_string() and lets them be overridden as desired by the
user-specified filter options (if any). This is also shorter and avoids
extra string handling.
This function currently treats AVFilterContext options and
filter-private options differently: the former are immediately applied,
while the latter are stored in a dictionary to be applied later.
There is no good reason for having two branches - storing all options in
the dictionary is simpler and achieves the same effect (since it is
later applied with av_opt_set_dict()).
This will also be useful in future commits.
Current code first sets AVFilterContext-level options, then aplies the
leftover on the filter's private data. This is unnecessary, applying the
options to AVFilterContext with the AV_OPT_SEARCH_CHILDREN flag
accomplishes the same effect.
Current code may, depending on the muxer, decide to use VSYNC_VFR tagged
with the specified framerate, without actually performing framerate
conversion. This is clearly wrong and against the documentation, which
states unambiguously that -r should produce CFR output for video
encoding.
FATE test changes:
* nuv-rtjpeg: replace -r with '-enc_time_base -1', which keeps the
original timebase. Output frames are now produced with proper
durations.
* filter-mpdecimate: just drop the -r option, it is unnecessary
* filter-fps-r: remove, this test makes no sense and actually
produces broken VFR output (with incorrect frame durations).
It is a URL rewriter for IPFS gateways, not an actual implementation of
IPFS, and naming it as such was both incorrect and misleading.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Advanced edit list support is entirely broken for fragmented MP4s,
currently. mov_fix_index is never run in mov_build_index, since
in fragmented MP4s the stco, stsz, stts, and stsc boxes have zero
entries, with the index being filled in as each fragment's trun
box is seen.
The result of this is that the skip samples is never set properly,
since half the code thinks it doesn't need to, as advanced_editlist
is enabled, but as mov_fix_index is never called, it doesnt get set.
This means that any edits for e.g. priming are not properly applied
as skip samples side data.
This also means remuxing to fragmented MP4 from progressive MP4 with
lavf will quietly drop the edit list, currently.
Example:
$ ffmpeg -loglevel quiet -advanced_editlist 1 -i non_fragmented.mp4 -f md5 -
MD5=d02d929f8eb4edef624758a298d5f7c6
$ ffmpeg -loglevel quiet -advanced_editlist 0 -i non_fragmented.mp4 -f md5 -
MD5=d02d929f8eb4edef624758a298d5f7c6
$ ffmpeg -loglevel quiet -advanced_editlist 1 -i fragmented.mp4 -f md5 -
MD5=e38b110f586fa886ff94e0ca98a95d59 <-- wrong, extra samples are output instead of being skipped
$ ffmpeg -loglevel quiet -advanced_editlist 0 -i fragmented.mp4 -f md5 -
MD5=d02d929f8eb4edef624758a298d5f7c6
We cannot call mov_fix_index after reading a trun box
since mov_fix_index seems to assume it is only called once, on a
fully complete index, an multiple calls to it don't seem like
they'd work, so the "best" option seems to be disabling advanced
edit list support entirely for the time being, as it is broken
for these types of files.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
The cached bitstream reader was originally written by Alexandra Hájková
for Libav, with significant input from Kostya Shishkov and Luca Barbato.
It was then committed to FFmpeg in ca079b0954, by merging it with the
implementation of the current bitstream reader.
This merge makes the code of get_bits.h significantly harder to read,
since it now contains two different bitstream readers interleaved with
#ifdefs. Additionally, the code was committed without proper authorship
attribution.
This commit re-adds the cached bitstream reader as a standalone header,
as it was originally developed. It will be made useful in following
commits.
Integration by Anton Khirnov.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Current code stores a pointer to allocated data in libx265 and frees it
when the encoded packet is retrieved. This will leak if the packet is
never retrieved, e.g. if the encoder is closed without being flushed.
Restructure the code such that only indices to an array stored in our
private data are given to libx265. This ensures no allocated memory can
be lost.
Commit 18f24527eb accidentally made side data only packets be handled like a
flush request. Fix this regression by effectively ignoring them as was the
original intention.
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes: signed integer overflow: 0 - -9223372036854775808 cannot be represented in type 'long int'
Fixes: fate-cover-art-aiff-id3v2-remux, fate-cover-art-mp3-id3v2-remux and fate-mov-cover-image
under ubsan.
Signed-off-by: James Almer <jamrial@gmail.com>
It appears faster than the iterative method on my machine (1.06x
faster), so I'm guessing compilers improved over time (the iterative
version was slightly faster in the past).
Currently, in case of equality on the first color channel, the order of
the ref colors is defined by the hashing function. This commit makes the
sorting deterministic and improve the hierarchical ordering.
This variable is used only for the running weight (used to reach the
target median). The places where we actually need the box weight are
changed to use box->weight.
The variance computation is simple enough now (since we can use the axis
squared errors) that it doesn't need to have a complex lazy computation
logic.
The equalities in the w{r,g,b} range checks make sure longest is never
0. Even if the alpha ended up being selected in get_next_color() it
would cause underread memory accesses in its caller (colormap_insert).
This reverts commit dea673d0d5.
This change cannot work for several reasons, the most obvious ones are:
- the alpha is being part of the scoring of the color difference, even
though we can not interpret the alpha as part of the perception of the
color (we don't even know if it's premultiplied or postmultiplied)
- the colors are averaged with their alpha value which simply cannot
work
The command proposed in the original thread of the patch actually
produces a completely broken file:
ffmpeg -y -loglevel verbose -i fate-suite/apng/o_sample.png -filter_complex "split[split1][split2];[split1]palettegen=max_colors=254:use_alpha=1[pal1];[split2][pal1]paletteuse=use_alpha=1" -frames:v 1 out.png
We can see that many color pixels are off, but more importantly some
colors have a random alpha value: https://imgur.com/eFQ2UK7
I don't see any easy fix for this unfortunately, the approach appears to
be flawed by design.
New option can be used to avoid creating very short segments with inputs
whose GOP size is variable or unharmonic with segment_time.
Only effective with segment_time.
Similar to how the encoder looks at frame color information to write the frame
header bitstream.
Should workaround ticket #10091, where container level color parameters passed
to the decoder context were being overwritten.
Signed-off-by: James Almer <jamrial@gmail.com>
Some encoders, like flac, can send side data only packets at the end.
Eventually, said extradata update should ideally be used to update the header
when writting to seekable output, but for now, ignore them.
Should fix the undefined behavior of passing NULL to memcpy().
Signed-off-by: James Almer <jamrial@gmail.com>
PFM (aka Portable FloatMap) encodes its scanlines from bottom-to-top,
not from top-to-bottom, unlike other NetPBM formats. Without this
patch, FFmpeg ignores this exception and decodes/encodes PFM images
mirrored vertically from their proper orientation.
For reference, see the NetPBM tool pfmtopam, which encodes a .pam
from a .pfm, using the correct orientation (and which FFmpeg reads
correctly). Also compare ffplay to magick display, which shows the
correct orientation as well.
See: http://www.pauldebevec.com/Research/HDR/PFM/ and see:
https://netpbm.sourceforge.net/doc/pfm.html for descriptions of this
image format.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
EAGAIN is not correct in this scenario. FFCodec.cb.decode() callback decoders
always return the amount of bytes consumed from the input packet (if any), and
report if a frame was generated by setting got_frame.
Signed-off-by: James Almer <jamrial@gmail.com>
Add encoding of 32 bit-per-sample PCM to FLAC files to libavcodec.
Coding to this format is at this point considered experimental and
-strict experimental is needed to get ffmpeg to encode such files.
Fixes: signed integer overflow: 574590586 - -1875616554 cannot be represented in type 'int'
Fixes: 53914/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_FIXED_fuzzer-5037125846564864
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
They're not currently used, so they don't need to be there.
Vulkan stabilized the decode extensions less than a week ago, and their
name prefixes were changed from EXT to KHR. It's a bit too soon to be
depending on it, so rather than bumping, just remove these for now.
Fixes: OOM testcase
Fixes: 51527/clusterfuzz-testcase-minimized-ffmpeg_dem_LAF_fuzzer-5453663505612800
OOM can still happen after this as an arbitrary sized block is allocated and read
this would require a redesign or some limit on the sample rate.
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This filter, when used in the "pad" mode, currently makes the
distinction between limited and full range solely by testing for YUVJ
pixel formats at link setup time. This is deprecated and should be
improved to perform the detection based on the per-frame metadata.
In order to make this distinction based on color range metadata, which
is only known at the time of filtering frames, for simplicity, we simply
allocate two copies of the "black" frame - one for limited range and the
other for full range metadata. This could be done more dynamically (e.g.
as-needed or simply by blitting the appropriate pixel value directly),
but this change is relatively simple and preserves the structure of the
existing code.
This commit actually fixes a bug in FATE - the new output is correct for
the first time. The previous md5 ref was of a frame that incorrectly
combined full-range pixel data with limited-range black fields. The
corresponding result has been updated.
Signed-off-by: Niklas Haas <git@haasn.dev>
This filter currently makes the distinction between limited and full
range by testing for the deprecated YUVJ pixel formats at link setup
time. This is deprecated and should be improved to perform the detection
based on the per-frame metadata.
Rewrite it to calculate the black pixel threshold at the time of
filtering a frame, when metadata about the frame's color range is known.
Doing it this way has the small side benefit of being able to handle
streams with variable metadata, and is not a meaningful performance
cost.
Signed-off-by: Niklas Haas <git@haasn.dev>
Bugfix: The OpenVino DNN backend in the 'async' mode sets
'task->inference_done' to 'complete' prior to data copy from
OpenVino output buffer to task's output frame.
This order causes task destroy in ff_dnn_get_result_common()
prior to model output processing.
Signed-off-by: Rafik Saliev <rafik.f.saliev@intel.com>
It works since most of Android devices don't output B frames by
default. The behavior is documented by Android now, although there
is some exception in history, which should have been fixed now.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Use input PTS as DTS has multiple problems:
1. If there is no reordering, it's better to just use the output
PTS as DTS, since encoder may change the timestamp value (do it
on purpose or rounding error).
2. If there is reordering, input PTS should be shift a few frames
as DTS to satisfy the requirement of PTS >= DTS. I can't find a
reliable way to determine how many frames to be shift. For example,
we don't known if the encoder use hierarchical B frames. The
max_num_reorder_frames can be get from VUI, but VUI is optional.
3. Encoder dropping frames makes the case worse. Android has an
BITRATE_MODE_CBR_FD option to allow it explicitly.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
simple_idct.asm is 32 bit-only since
bfb28b5ce8,
whereas simple_idct10.asm is x64-only. So don't build
the ultimately unneeded and empty files, as some linkers
complain about this: "ranlib: file:
libavcodec/libavcodec.a(simple_idct.o) has no symbols"
(this is from an Xcode toolchain as reported by Ronald S. Bultje).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
We shouldn't be providing links to unverified and non-FFmpeg-controlled
content in our documentation.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
We do not endorse or recommend specific third party servers or companies
that users' data will be funneled through.
It is also incorrectly describing how FFmpeg currently works.
Should have been part of 412922cc6f but was
missed.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Attach the AVPacket instead of only a few values to the corresponding Dav1dData
struct and use it to set all output frame props.
Signed-off-by: James Almer <jamrial@gmail.com>
Only one codec using mjpegdec.c actually creates multiple
frames from a single packet, namely SMVJPEG. The other can
use the ordinary decode callback just fine. This e.g. has
the advantage of confining the special SP5X/AMV code to sp5xdec.c.
This reverts most of commit e9a2a8777317d91af658f774c68442ac4aa726ec;
of course it is not a simple revert: Way too much has changed;
furthermore, outright reverting the sp5xdec.c changes would readd
a stack packet to sp5x_decode_frame() which is not desired.
In order to avoid this without modifying the given AVPacket,
a variant of ff_mjpeg_decode_frame() with explicit buf and size
parameters has been added.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
AVID content is not supposed to be SMVJPEG; given that
both these codecs involve manipulating image dimensions
and cropping dimensions, it makes sense to restrict
the AVID codepaths to non-SMVJPEG codecs in order not
to have to think about what if SMVJPEG happens to
have a codec tag indicating AVID.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
xv30be is an obnoxious format that I shouldn't have included in the
first place. xv30 packs 3 10bit channels into 32bits and while our
byte-oriented logic can handle Little Endian correctly, it cannot
handle Big Endian. To avoid that, I marked xv30be as a bitstream
format, but while that didn't produce FATE errors, it turns out that
the existing read/write code silently produces incorrect results, which
can be revealed via ubsan.
In all likelyhood, the correct fix here is to remove the format. As
this format is only used by Intel vaapi, it's only going to show up
in LE form, so we could just drop the BE version. But I don't want to
deal with creating a hole in the pixfmt list and all the weirdness that
comes from that. Instead, I decided to write the correct read/write
code for it.
And that code isn't too bad, as long as it's specialised for this
format, as the channels are all bit-aligned inside a 32bit word.
Various parts of the code assume that width can be divided by various powers of 2
without rounding
Fixes: out of array access
Fixes: 53623/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VQC_fuzzer-6209269924233216
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
There can be more than one available render node, and it's not
guaranteed the first node we come across is the correct one. In
particular, 'vgem' devices are common, and are
never VAAPI-enabled and thus not valid here.
We have a 'kernel_driver' arg already for specifying a single driver we
*do* want, but it doesn't support a negation, nor a list. It's easier
just to automatically skip 'vgem' anyway, to avoid foisting this burden
on users.
This has precedent in libva-utils already:
bfb6b98ed62a exclude vgem node and invalid drm node in vainfo
bfb6b98ed6
Signed-off-by: Brian Norris <briannorris@chromium.org>
PI, PHI and E are defined in libavutil/eval.c, and user may use these
constants for scale_qsv filter, so we needn't re-define these variables
in vf_scale_qsv.c
Reviewed-by: Gyan Doshi <ffmpeg@gyani.pro>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
When process yuv420 frames, FFmpeg uses same alignment on Y/U/V
planes. VPL and MSDK use Y plane's pitch / 2 as U/V planes's
pitch, which makes U/V planes 16-bytes aligned. We need to set
a separate alignment to meet runtime's behaviour.
Now alignment is changed to 16 so that the linesizes of U/V planes
meet the requirment of VPL/MSDK. Add get_buffer.video callback to
qsv filters to change the default get_buffer behaviour.
Now the commandline works fine:
ffmpeg -f rawvideo -pix_fmt yuv420p -s:v 3082x1884 \
-i ./3082x1884.yuv -vf 'vpp_qsv=w=2466:h=1508' -f rawvideo \
-pix_fmt yuv420p 2466_1508.yuv
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Making it point to the input packet results in different behavior during flush,
where its contents will be that of an empty packet instead of containing the
props from the last input packet fed to the decoder.
After this change, decoding with more than one thread will shield the same
results as using a single thread.
Signed-off-by: James Almer <jamrial@gmail.com>
Use the opaque field instead to keep this value.
No functional change, but removes the hack that made the packet technically
invalid, allowing the safe usage of functions like av_packet_ref() on it
if required.
Signed-off-by: James Almer <jamrial@gmail.com>
The idea behind last_pkt_props was to store the properties of the last packet
fed to the decoder. Any sort of queueing required by CODEC_CAP_DELAY decoders
that consume several packets before they start outputting frames should be done
by the decoders in question. An example of this is libdav1d.
This is required for the following commits that will fix last_pkt_props in
frame threading scenarios, as well as maintain its contents during flush.
This revers commit 022a12b306.
Signed-off-by: James Almer <jamrial@gmail.com>
This will be needed for the following commit, after which ff_get_buffer() will
stop setting frame->pts to AV_NOPTS_VALUE.
Signed-off-by: James Almer <jamrial@gmail.com>
This will be needed for a following commit, after which ff_get_buffer() will
stop setting frame->pts to AV_NOPTS_VALUE.
Signed-off-by: James Almer <jamrial@gmail.com>
This commit tests it in a way that automatically checks
that using av_dict_iterate() is equivalent to using
av_dict_get() with key "" and AV_DICT_IGNORE_SUFFIX.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Untested with "non fuzzed" samples as i have no such file
The reference 5.6.0 decoder appears to also have undefined behavior in the lossless codepath for this
Fixes: shift exponent 32 is too large for 32-bit type 'int'
Fixes: 50930/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WAVPACK_fuzzer-6319201949712384
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The Encoding field (and the \fe tag) allows to limit font selection to
only those fonts declaring support for the specified codepage in their
OS/2's table "Code Page Character Range" field.
Particularly, Encoding=0 means only font's declaring support for "ANSI",
or rather "Latin (Western European)", are allowed to be selected.
Specifying Encoding=1 allows all fonts to be considered.
We do not want to limit font selection, so specify Encoding=1.
NB: at the time of writing libass only partially supports this field,
thus hiding the issue in any libass-based renderer. A VSFilter-based
DirectShow filter or XySubFilter will reveal the issue when a font not
declaring support for latin characters is specified in a style.
Colour values used in ASS files without a "YCbCr Matrix" header set to
"None" are subject to colour mangling, due to how ASS was historically
conceived. A more in-depth description can be found in the documetation
inside libass' public ass_types.h header. The important part is, if this
header is not set to "None", the final output colours can deviate from
the literal value specified in the file. When converting from non-ASS
formats we do not want any colour shift to happen, so let's set the
appropiate header.
NB: ffmpeg's subtitle filter, does not follow libass' documentation
regarding colour mangling, thus hiding the bug. Anything based on
VSFilter, XySubFilter or e.g. mpv do and might show the issue.
(Of course native ASS subs, which _do_ rely on colour mangling won't
work properly with the subtitle filter, but this can be fixed another
time)
There is no v4 ASS. There are versions 1 to 4 of SSA (with only v4
being supported by softsub renderers), ASS which declares itself as v4+
or v4.00+, and the rarely used v4++/v4.00++, which isn't fully supported
by libass.
As reflected by the [V4+ Styles] section header, FFmpeg uses ASS, not
SSA v4, so adjust the comment accordingly.
avx512 on Skylake-X (Xeon D-2123IT):
1.19x faster (970±91.2 vs. 817±104.4 decicycles) compared with avx2
avx512icl on Ice Lake (Xeon Silver 4316):
2.52x faster (1350±5.3 vs. 535±9.5 decicycles) compared with avx2
Instead of manually assembling the string, use av_dict_get_string
which handles things like proper escaping too (even though it is
not yet needed here).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This unfortunately involved adding some parameters
to ff_h2645_sei_to_frame() that will be mostly unused.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is valid for HEVC; in fact, the ATSC-HEVC spec [1] simply
refers to the relevant H.264 spec.
It is also trivial to implement now: Just move applying AFD
to ff_h2645_sei_to_frame() and stop ignoring AFD when parsing
a HEVC SEI containing it.
A FATE-test for this has been added.
[1]: https://www.atsc.org/atsc-documents/a3412017-video-hevc/
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There are only slight differences between H.264 and HEVC
for this side data, so it makes sense to share the code.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This commit only factors out freeing the common SEI parts,
not whether the fields indicating whether an SEI is present
shall be reset. H.264 and HEVC differ in this regard
(ff_h264_sei_uninit() really resets, whereas ff_hevc_reset_sei()
only uninits.) and neither actually honours the persistency
as prescribed by the relevant specs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Currently, several components select atsc_a53, despite
not using anything from it themselves. They only select
it because parsing SEI messages adds an indirect dependency.
But using direct dependencies is more natural, so add
dedicated subsystems for them.
It already allows to remove a superfluous dependency of
the HEVC QSV encoder on hevc_sei and atsc_a53.
Adding new subsystems only becomes effective after a reconfiguration.
In order to force this, some needed headers (which are only included
implicitly before this commit) were included explicitly in
libavformat/allformats.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
SEI NALUs and several SEI messages are naturally byte-aligned,
so reading them via the bytestream-API is more natural.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
VPP in the SDK requires the frame rate to be set to a valid value,
otherwise init will fail, so always set a default framerate when the
input link doesn't have a valid framerate.
Reviewed-by: Soft Works <softworkz@hotmail.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The SDK will return error if feeding h264_qsv encoder p010 input.
$ ffmpeg -f lavfi -i testsrc -vf "format=p010" -c:v h264_qsv -f null -
[...]
[h264_qsv @ 0x55584888b080] Current pixel format is unsupported
[h264_qsv @ 0x55584888b080] some encoding parameters are not supported
by the QSV runtime. Please double check the input parameters.
Error initializing output stream 0:0 -- Error while opening encoder for
output stream #0:0 - maybe incorrect parameters such as bit_rate, rate,
width or height
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
adaptive_i and adaptive_b cannot work with MFX_GOP_STRICT,
so only enable MFX_GOP_STRICT when these features are disabled.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
The SDK may provides HDR metadata for HDR streams via mfxExtBuffer
attached on output mfxFrameSurface1
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Fixes some MP4F files which have duration in mdhd set to UINT_MAX instead of zero.
Signed-off-by: Sasi Inguva <isasi@google.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes: Timeout (read mostly the same data repeatly)
Fixes: 52457/clusterfuzz-testcase-minimized-ffmpeg_dem_ALP_fuzzer-6610706313379840
Fixes: 53098/clusterfuzz-testcase-minimized-ffmpeg_dem_SOL_fuzzer-6481382981632000
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -2146670226 + -2227242 cannot be represented in type 'int'
Fixes: 51943/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_APAC_fuzzer-5779018251370496
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This causes the RLE decoder to exit before applying the last RLE run
All images i tested with are unchanged, this makes the special case
for handling the last run unused for non truncated images.
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -1094995528 * 8224 cannot be represented in type 'int'
Fixes: 53508/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FFV1_fuzzer-474551033462784
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself
Fixes: 53364/clusterfuzz-testcase-minimized-ffmpeg_BSF_DTS2PTS_fuzzer-4693772269387776
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is a regression since: adaa06581c
Before this, max_channel and max_matrix_channel where compared for equality
Fixes: out of array access
Fixes: 53340/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TRUEHD_fuzzer-514959011885875
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -1284837070 - 982101618 cannot be represented in type 'int'
Fixes: 53105/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AC3_FIXED_fuzzer-4848015827664896
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Rather than hard-coding AV_PIX_FMT_VULKAN, expand this to the full list
of formats supported by <libplacebo/utils/libav.h>. We re-use the
existing `format` option to allow selecting specific software formats in
addition to specific vulkan hwframe formats.
Some minor changes are necessary to account for the fact that
`ff_vk_filter_config_output` is now only called optionally, the fact
that the output format must now be parsed before `query_format` gets
called, and the fact that we need to call a different function to
retrieve data from the `pl_frame` in the non-hwaccel case.
Signed-off-by: Niklas Haas <git@haasn.dev>
Rather than the encoder timebase. Since the times are parsed as
microseconds, this will not reduce precision, except possibly when
chapter times are used and the chapter timebase happens to be better
aligned with the encoder timebase, which is unlikely.
This will allow parsing the keyframe times earlier (before encoder
timebase is known) in future commits.
There are 8 of them and they are typically used together. Allows to pass
just this struct to forced_kf_apply(), which makes it clear that the
rest of the OutputStream is not accessed there.
Currently, it is done once per slice-thread, leading to
one warning per slice-thread in case a YUVJ pixel format
has been originally used.
This also fixes the anomaly that said parameter are only
updated for the user-facing context (whose values are retrievable
via av_opt_get()) if slice-threading is not in use.
Fixes ticket #9860.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Initializing slice threads currently uses the function
(sws_init_context()) that is also used for initializing
user-facing contexts with the only difference being that
nb_threads is set to one before initializing the slice contexts.
Yet sws_init_context() also initializes lots of stuff
that is not slice-dependent, i.e. (src|dst)Range. This
currently only works because the code sets these fields
to the same values for all slice contexts. This is not
nice; even worse, it entails that log messages are printed
once per slice context (and therefore fill the screen).
This commit lays the groundwork to fix this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The encoder is sensitive to changes in precision, and its test target
was a compromise. It was already close to failing on x87 FPUs.
ff_mdct_init used double precision entirely from the scale to computing
the MDCT exp tables. av_tx_init uses single-precision for the scale,
with a small input change which was enough to tip the test into failing on
x87 FPUs.
Increase the fuzz factor in line with other AAC encoder tests to fix.
AVCodecContext.frame_number is actually only incremented
in case encoding was successfull; if e.g. the ff_alloc_packet()
below fails, it won't be incremented and therefore it is possible
for the previous_frame buffer to be allocated for multiple
first frames, leaking every one except the last.
So check for whether there already is a previous frame instead.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The earlier code did not account for the frame header as well
as the block headers; furthermore, in case a large part of
a block is unused (due to padding), the output size may
exceed 3 * width * height (where the dimensions correspond
to the visible pixels) due to the overhead of the zlib header,
so use the padded dimensions to calculate the maximum packet size
(which is also what the actual call to compress2() uses).
Fixes ticket #10053.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes the crash in ticket #10050.
Also ensure that we don't overflow before ff_get_encode_buffer().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Bring it up to date with current practice, as the current text does not
cover everything. Drop the reference to unprefixed exported symbols,
which do not exist anymore.
It describes a general development policy, not code formatting. It is
also not true, as these days we tend to prioritize correctness, safety,
and completeness over code size.
It is currently very out of touch with reality.
* declare we are using C99 fully, rather than C90 plus extensions
* mention our use of stdatomic.h
* mention forbidden C99 features, like VLAs and complex numbers
Do it in set_dispositions() rather than during stream creation.
Since at this point all other stream information is known, this allows
setting disposition based on metadata, which implements #10015. This
also avoids an extra allocated string in OutputStream that was unused
after of_open().
Replace it with an array of streams in each InputFile. This is a more
accurate reflection of the actual relationship between InputStream and
InputFile.
Analogous to what was previously done to output streams in
7ef7a22251.
Encoding init code will currently fall back to a 25fps default when no
framerate is known or specified, but only if there is a known source
input stream. There is no good reason for this condition, so drop it.
VP6 alpha in EA format is a second VP6 encoded video stream where only the Y
component is used and is interpreted as the alpha channel of the first VP6
stream. The alpha VP6 stream is muxed separately from the main VP6 stream, has
its own stream headers and packet headers. In theory the two streams might not
even have the same resolution (although most likely that is not something that
is seen or supported in the wild), but the format is capable of doing it.
Merged VP6 alpha (also known as the VP6A codec) means that a packet of the
video stream contains the corresponding packet of both VP6 substreams like
this:
{OffsetOfAlpha, DataPacket, AlphaDataPacket}
So data and alpha data of a frame is merged to a single packet, this is how VP6
video with alpha is muxed in FLV and SWF.
The first approach is more like how the demuxer sees data in the EA format,
unfortunately it is different to what the FLV or SWF format expects, so -
having no better place for it in the framework - I decided to do an optional
format conversion in the EA demuxer.
Signed-off-by: Marton Balint <cus@passwd.hu>
Add qsv_transcode example which shows how to use qsv to do hardware
accelerated transcoding, also show how to dynamically set encoding
parameters.
examples:
Normal usage:
qsv_transcode input.mp4 h264_qsv output.mp4 "g 60"
Dynamic setting usage:
qsv_transcode input.mp4 hevc_qsv output.mp4 "g 60 asyne_depth 1"
100 "g 120"
This command initializes codec with gop_size 60 and change it to
120 after 100 frames
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Since frame->buf[0] is always NULL in this case, av_buffer_unref
has no effect. If it's not NULL, double-free will happen.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
At the beginning of decoding, if we feed mediacodec too fast, the
input port will return try again. It takes some time for mediacodec
to consume bitstream and output frame. So the output port also return
try again. It possible that mediacodec_receive_frame doesn't consume
any AVPacket and no AVFrame is output. Then both avcodec_send_packet()
and avcodec_receive_frame() return EAGAIN, which shouldn't happen.
This bug can be produced with decoding benchmark on Pixel 3.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Same principle as previous commit, with sufficiently huge rgb2yuv table
values this produces wrong results and undefined behavior.
The unsigned produces the same incorrect results. That is probably
ok as these cases with huge values seem not to occur in any real
use case.
Fixes: signed integer overflow
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Large rgb2yuv tables and high pixel values cause the intermediate
int32_t of ru*r + gu*g + bu*b to exceed INT_MAX, which is undefined
behavior. This causes libswscale built with LLVM -fsanitize=undefined to
assert. Using unsigned integers instead has defined behavior and
produces identical results, and makes rgb64ToUV_c_template match
rgb64ToY_c_template.
Fixes: signed integer overflow
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
maximum_buffer_size_ms should only be set if both are specified or if
the user sets it through -svtav1-params buf-sz=val
Signed-off-by: Christopher Degawa <ccom@randomderp.com>
The check of vpx_rac_is_end check(s) are added originally from
1afd246960. It causes a regression
of some vp8 stream. b6b9ac5698 fixes
the regression by a sort of band-aid way. This fixes the wrongness
of the original commit. vpx_rac_is_end() should be called against
the bool decoder for the vp8 headr context, not one for each
coefficient. Reference is vp8_dixie_tokens_process_row() in token.c
in spec 20.16.
Fixes: Ticket 8069
Fixes: regression of 1afd246960.
Fixes: b6b9ac5698
Co-authored-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Hirokazu Honda <hiroh@chromium.org>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This previous expression multiplied a constant (outlink->h) that was
guaranteed to be 0 at this point, thus making it always a no-op.
Fix the calculation, and also properly reset the SAR to 1:1 as is now
necessary (the failure to do so previously hid this bug's existence).
As a result of a typo in the source code, this option was completely
non-functional. In order to fix it, without breaking the current default
behavior, explicitly change this default to 0.
This behavior is also consistent with how other scale filters behave by
default, so it's probably best to enshrine it anyways.
After commit c0b93, it's possible that `ff_vk_filter_config_input` never
gets called, leading to `s->vkctx.input_format` being left unset. This
broke the format auto-selection logic in `libplacebo_config_output`,
resulting in a default to yuv420p, instead of defaulting to the input
format as intended.
Fixes: c0b93c4f8b
x265_sei is available since X265_BUILD 88. Bump required version
to 89 to fix the regression from commit 1f58503013, and remove a
conditional compilation.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
svt-av1 v1.2.0 has deprecated vbv_bufsize in favor of using
- maximum_buffer_size_ms (--buf-sz)
- starting_buffer_level_ms (--buf-initial-sz)
- optimal_buffer_level_ms (--buf-optimal-sz)
and vbv_bufsize has not been in use since svt-av1 v0.8.6
Signed-off-by: Christopher Degawa <christopher.degawa@intel.com>
compressed_ten_bit_format has been deprecated upstream and has no effect
and can be removed. Plus, technically it was never used in the first place
since it would require the app (ffmpeg) to set it and do additional
processing of the input frames.
Also simplify alloc_buffer by removing calculations relating to the
non-existant processing.
Signed-off-by: Christopher Degawa <christopher.degawa@intel.com>
Profile can be derived from values codecpar pixel format only with software
formats. For hardware formats, we're forced to parse a frame header to get
the required information.
Signed-off-by: James Almer <jamrial@gmail.com>
Drop the reference to directly committing code, because
- it is highly discouraged and very rarely done these days
- there is no good reason NOT to submit patches for review
Add a link to the development policy chapter.
Frame limiting is now handled using sync queues. This code prevents the
sync queue from triggering EOF, resulting in unnecessarily many frames
being decoded, filtered, and then discarded.
Found-by: U. Artie Eoff <ullysses.a.eoff@intel.com>
Specificaly, the of_add_attachments() call (which can add attachment
streams to the output) and the check whether the output file contains
any streams. They both logically belong in create_streams().
The DVD subtitle parser handles two types of packets: "normal"
packets with a 16-bit length, and HD-DVD packets that set the
16-bit length to 0 and encode a 32-bit length in the next four
bytes. This implies that HD-DVD packets are at least six bytes
long, but the code didn't actually verify this.
The faulty length check results in an out of bounds read for
zero-length "normal" packets that occur in the input, which are
only 2 bytes long, but get misinterpreted as an HD-DVD packet.
When this happens the parser reads packet_len from beyond the
end of the input buffer. The subtitle stream is not correctly
decoded after this point due to the garbage packet_len.
Fixing this is pretty simple: fix the length check so packets
less than 6 bytes long will not be mistakenly parsed as HD-DVD
packets.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This is done only to the inputs, not the outputs, because we always
output vulkan hwframes.
Doing so also requires keeping track of backing textures for non-hwdec
formats, but is otherwise a mostly straightforward change to the format
query function. Special care needs to be taken to avoid crashing on
older libplacebo due to AV_PIX_FMT_RGBF32LE et al.
Instead of doing it ad-hoc in `filter_frame`. This is not a huge change
on its own, but paves the way for adding support for more formats in the
future, in particular formats other than AV_PIX_FMT_VULKAN.
This commit enabled assembly code with intel AVX512 VNNI and added unit test for sobel filter
sobel_c: 4537
sobel_avx512icl 2136
Signed-off-by: bwang30 <bin.wang@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
block_index is write-only for the H.261 decoder, so
don't update it by calling ff_update_block_index().
Instead use a function of our own to set/update dest.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also remove some internal.h inclusions which have been
unnecessarily added recently.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The max == 0 case can be removed too but i left it as 50% of the cases use it
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Due to the lack of an active AMF maintainer at the moment, as well
as plans to add the av1 encoder and other improvements of AMF,
I added myself to the maintainers. Timely review and merging
patches targeting AMF integration should improve support
of AMD GPUs and APUs in FFmpeg.
For the last couple of years I have been working on AMF related
patches to ffmpeg and other open source projects.
added a new option 'a53cc' (on by default, as in libx264) for rendering
AV_FRAME_DATA_A53_CC as hevc sei payloads.
the code is a blend of the libx265.c code for writing
AV_FRAME_DATA_SEI_UNREGISTERED with the libx264.c code for writing atsc
a/53 payloads.
Up until now, the ClearVideo decoder separates parsing tiles
and actually using the parsed information: The information is
instead stored in structures which are constantly allocated
and freed. This commit changes this to use the information
immediately, avoiding said allocations. This e.g. reduced
the amount of allocations for [1] from 2,866,462 to 24,720.
For said sample decoding speed improved by 143%.
[1]: https://samples.ffmpeg.org/V-codecs/UCOD/AccordianDance-300.avi
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
postProcess in postprocess_template.c copies a PPContext
to the stack, works with this copy and then copies
it back again. Said local copy uses a hardcoded alignment
of eight, although PPContext has alignment 32 since
cbe27006ce
(this commit was in anticipation of AVX2 code that never landed).
This leads to misalignment in the filter-(pp|pp1|pp2|pp3|qp)
FATE-tests which UBSan complains about. So avoid the local copy.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
postprocess.c currently has C, MMX, MMXEXT, 3DNow as well as
SSE2 versions of its internal functions. But given that only
ancient 32-bit x86 CPUs don't support SSE2, the MMX, MMXEXT
and 3DNow versions are obsolete and are therefore removed by
this commit. This saves about 56KB here.
(The SSE2 version in particular is not really complete,
so that it often falls back to MMXEXT (which means that
there were some identical (apart from the name) MMXEXT
and SSE2 functions; this duplication no longer exists
with this commit.)
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The input data is multiplied by `s->offset` to get normalized output.
`s->target_tp` and `true_peak` is not in dB,
so `s->offset` should be calculated by division instead of subtraction.
Signed-off-by: Rui Zhu <real.zhurui@gmail.com>
Should fix fate failures on Windowx x86 targets, where long is 32 bits.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: James Almer <jamrial@gmail.com>
The amount of lines printed is too high for the verbose level, and the debug
level is a better fit for their content.
Signed-off-by: James Almer <jamrial@gmail.com>
Some formats like FLV can dynamically add streams during packet reading.
FFprobe does check for this and reallocates the global stream info, but does
not reallocate InputFrame's streams and decoders when this happens, which,
as a result, could have caused flushing to occur on an out of bounds stream
index, since the flush loop iterates over fmt_ctx's nb_streams, and not
ifile's, despite using ifile's streams.
This fixes an out of bounds read and segfult.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Since e6afa61be9, no components in
libavcodec enable CONFIG_MDCT. This fixes building "make testprogs".
Signed-off-by: Martin Storsjö <martin@martin.st>
Add skip_frame support to qsvenc. Use per-frame metadata
"qsv_skip_frame" to control it. skip_frame option defines the behavior
of qsv_skip_frame.
no_skip: Frame skipping is disabled.
insert_dummy: Encoder inserts into bitstream frame where all macroblocks
are encoded as skipped.
insert_nothing: Similar to insert_dummy, but encoder inserts nothing.
The skipped frames are still used in brc. For example, gop still include
skipped frames, and the frames after skipped frames will be larger in
size.
brc_only: skip_frame metadata indicates the number of missed frames
before the current frame.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Without this change, MSDK/VPL always defaults the HEVC tier to MAIN if the -level is specified.
Also, according to the HEVC specs, only level >= 4 can support High Tier.
Signed-off-by: nyanmisaka <nst799610810@gmail.com>
This e.g. allows compilers to bake the offset implied
by using ff_vc1_b_field_mvpred_scales[3] into the
general offset; for certain arches this is also necessary
in order to avoid building suboptimal code.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is the only encoder supporting quarter samples.
This also allows to remove the qpeldsp dependency from
mpegvideo_enc.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The MPEG-4 decoder is the only decoder based upon H.263 that
supports quarterpel motion vectors.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The only thing from the H.263 decoder that is reachable
by the VC-1 decoder is ff_h263_decode_init(); but it does
not even use all of it; e.g. h263dsp is unused and so are
the VLCs initialized in ff_h263_decode_init() (they amount
to about 77KB which are now no longer touched).
Notice that one could also call ff_idctdsp_init()
directly instead of ff_mpv_idct_init(); one could even
do so in ff_vc1_init_common().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Decoders that might use quarter pixel motion estimation
(namely MPEG-4 as well as the VC-1 family) currently
use MpegEncContext.me.qpel_(put|avg) as scratch space
for pointers to arrays of function pointers.
(MotionEstContext contains such pointers as it supports
quarter pixel motion estimation.) The MotionEstContext
is unused apart from this for the decoding part of
mpegvideo.
Using the context at all is for decoding is actually
unnecessary and easily avoided: All codecs with
quarter pixels set me.qpel_avg to qdsp.avg_qpel_pixels_tab,
so one can just unconditionally use this in ff_mpv_reconstruct_mb().
MPEG-4 sets qpel_put to qdsp.put_qpel_pixels_tab
or to qdsp.put_no_rnd_qpel_pixels_tab based upon
whether the current frame is a b-frame with no_rounding
or not, while the VC-1-based decoders set it to
qdsp.put_qpel_pixels_tab unconditionally. Given
that no_rounding is always zero for VC-1, using
the same check for VC-1 as well as for MPEG-4 would work.
Since ff_mpv_reconstruct_mb() already has exactly
the right check (for hpeldsp), it can simply be reused.
(This change will result in ff_mpv_motion() receiving
a pointer to an array of NULL function pointers instead
of a NULL pointer for codecs without qpeldsp (like MPEG-1/2).
It doesn't matter.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
vc1_decode_skip_blocks() is only called if the current picture
is a P frame. So setting pict_type to AV_PICTURE_TYPE_P
is redundant; removing it makes pict_type read-only in vc1_block.c
(as it should be).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The only msmpeg4 code that is ever executed by the VC-1 based
decoders is ff_msmpeg4_decode_init() and what is directly
reachable from it. This is:
a) A call to av_image_check_size(), then ff_h263_decode_init(),
b) followed by setting [yc]_dc_scale_table and initializing
scantable/permutations.
c) Afterwards, some static tables are initialized.
d) Finally, slice_height is set.
The replacement for ff_msmpeg4_decode_init() performs a)
just like now; it also sets [yc]_dc_scale_table,
but it only initializes inter_scantable and intra_scantable
and not permutated_intra_[hv]_scantable: The latter are only
used inside decode_mb callbacks which are only called
in ff_h263_decode_frame() which is unused for VC-1.*
The static tables initialized in c) are not used at all by
VC-1 (the ones that are used have been factored out in
previous commits); this avoids touching 327KiB of .bss.
slice_height is also not used by the VC-1 decoder (setting
it in ff_msmpeg4_decode_init() is probably redundant after
b34397b4cd).
*: It follows from this that the VC-1 decoder is not really
based upon the H.263 decoder either; changing this will
be done in a future commit.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is in preparation for splitting VC-1 from msmpeg4.
(msmpeg4data.c was originally intended to be just this;
9488b966c7 changed it).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is done since 16af29a7a6
(and is actually unnecessary, because the tables initialized
in ff_msmpeg4_decode_init() are only ever used in vc1_block.c
which is only entered after a call to ff_msmpeg4_decode_init())
in a very ugly manner; said manner had the byproduct of
involving lots of unnecessary allocations and even opening
and closing a hwaccel in case one is used.
This commit achieves the aim of 16af29a7a6
by initializing the VLCs used by VC-1 in ff_vc1_init_common().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
VC1 shares some VLCs with MSMPEG-4, but vc1_block.c
simply duplicates the defines instead of including
the appropriate headers; furthermore, use a proper
prefix for these defines: DC_VLC_BITS is also used
by other codecs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
and include said header at the place where the VLCs are created.
This allows to make said tables static.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_vc1_norm6_spec has been removed in commit
356be9307c (and it seems that it
has never been used); the declarations of the 8x8_zz arrays meanwhile
have been added in f0c02e1cbc
without having ever been defined.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The defines in vc1data.c are duplicates of the ones in vc1data.h;
they are also pointless, because they are not used anywhere.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unnecessary to initialize the VLCs: The only VLC
that was only ever used by the code reachable from the parser
was ff_vc1_bfraction_vlc; and this VLC has been removed.
Yet vc1dsp is still needed for startcode_find_candidate.
Maybe this should be factored out of vc1dsp in a later
commit.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Added in b50be4e38dc83389925dc14f24fa11e660d32197;
this check was racy back then (as the VLC could be initialized
concurrently) and it is redundant (always-false)
since commit c742ab4e81.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This check has been added in c617bed34f,
merging ee769c6a7c to fix
a possible segfault if AVCodecContext.codec is not set
as it may be during parsing. While this fixes the segfault,
it has the unfortunate side effect that it makes the output
of the parser dependent on whether a decoder is set (and
ultimately available). The fix later applied in
5d2be71b9e does not have this
downside and makes checking AVCodecContext.codec superfluous.
So remove this check.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There are sill many users of these APIs within libav*, so this commit
introduced too many deprecation warnings, making compilation too noisy and
potentially hiding legit warnings.
Once the remaining users are ported, this can be reapplied.
This reverts commit 76d0038579.
The encoder is fixed point, and uses an MDCT only for analysis. Due
to the slightly different rounding, the encoder makes a different
decision, so the tests have to be adjusted as well.
This patch replaces the transform used in AAC with lavu/tx and removes
the limitation on only being able to decode 960-sample files
with the float decoder.
This commit also removes a whole bunch of unnecessary and slow
lifting steps the decoder did to compensate for the poor accuracy
of the old integer transformation code.
Overall float decoder speedup on Zen 3 for 64kbps: 32%
Mostly consistent formatting and consistently ordering of
warnings/notes to be next to the description.
Additionally group the AV_DICT_* macros.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This is a more explicit iteration API rather than using the "magic"
av_dict_get(d, "", t, AV_DICT_IGNORE_SUFFIX) which is not really
trivial to grasp what it does when casually reading through code.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
dts != pts is actually a spec violation for AV1, given it has no
reordering in the classical sense.
We don't really need the whole timestamp queue in this case and can just
pass through the timestamp as is for both dts and pts.
The encoder seems to be trading blows with hevc_nvenc.
In terms of quality at low bitrate cbr settings, it seems to
outperform it even. It produces fewer artifacts and the ones it
does produce are less jarring to my perception.
At higher bitrates I had a hard time finding differences between
the two encoders in terms of subjective visual quality.
Using the 'slow' preset, av1_nvenc outperformed hevc_nvenc in terms
of encoding speed by 75% to 100% while performing above tests.
Needless to say, it always massively outperformed h264_nvenc in terms
of quality for a given bitrate, while also being slightly faster.
Makes it possible to use deinterlacers which output one frame for each field as fallback if field
matching fails (combmatch=full).
Currently, the documented example with fallback on a post-deinterlacer will only work in case the
deinterlacer outputs one frame per first field (as yadif=mode=0). The reason for that is that
fieldmatch will attempt to match the second field regardless of whether it recognizes the end
result is still interlaced. This produces garbled output with for example mixed telecined 24fps and
60i content combined with a field-based deinterlaced such as yadif=mode=1.
This patch orders fieldmatch to revert to using the second field of the current frame in case the
end result is still interlaced and a post-deinterlacer is assumed to be used.
Signed-off-by: lovesyk <lovesyk@users.noreply.github.com>
Negligible speed difference for avx2 on Zen 2 (Ryzen 5700X) and
Broadwell (Xeon E5-2620 v4):
1690±4.3 decicycles vs. 1693±78.4
1439±31.1 decicycles vs 1429±16.7
Moderate speedup with avx512 on Skylake-X (Xeon D-2123IT):
1.22x faster (793±0.8 vs. 649±5.5 decicycles) compared with avx2
Better speedup with avx512icl on Ice Lake (Xeon Silver 4316):
1.77x faster (784±1.8 vs. 442±11.6 decicycles) compared with avx2
Co-authors:
Henrik Gramner <henrik@gramner.com>
Kieran Kunhya <kierank@obe.tv>
In av1_spec.pdf page 38/669, there is a sentence below:
if ( frame_type == KEY_FRAME && show_frame ) {
for ( i = 0; i < NUM_REF_FRAMES; i++) {
RefValid[ i ] = 0
......
}
......
}
This shows that the condition of invalidating current
DPB frames should be the coming frame_type is KEY_FRAME plus
show_frame is equal to 1. Otherwise, some of the frames
in sequence after KEY_FRAME still refer to the reference frames
before KEY_FRAME, and if these before KEY_FRAME reference
frames were invalidated, these frames could not find their
reference frames, and it could cause image corruption.
Mesa fix is in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19386
Reviewed-by: Fei Wang <fei.w.wang@intel.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
The order in which the channels are coded in the bitstream do not always follow
the native, bitmask-based order of channels both signaled by the WAV container
and forced by this same decoder. This is the case with layouts containing an
LFE channel, as it's always coded last.
Fixes ticket #9964.
Signed-off-by: James Almer <jamrial@gmail.com>
Generalize the checks for channels in all positions, and properly support
the three height groups (normal, top, bottom) instead of manually setting
the relevant channels for the latter two after the normal height tags were
parsed.
Signed-off-by: James Almer <jamrial@gmail.com>
If PCE defines channels not covered by those in the standard configurations
then don't try to come up with some made up layout and just return them in the
coded order.
Fixes al08_44.mp4 from the conformance suite, now reporting and decoding all 48
channels instead of 10.
Signed-off-by: James Almer <jamrial@gmail.com>
Set the correct amount of tags in tags_per_config[].
Also, there are no channels that correspond to a side element in this
configuration, so reflect this in the list of known/supported channel layouts.
Signed-off-by: James Almer <jamrial@gmail.com>
The IMF CPL contains an optional timecode start address. This patch reads the
latter, if present, into the context's timecode metadata parameter.
This addresses https://trac.ffmpeg.org/ticket/9842.
The current adjustment of input start times just adjusts the tsoffset.
And it does so, by resetting the tsoffset to nullify the new start time.
This leads to breakage of -copyts, ignoring of input_ts_offset, breaking
of -isync as well as breaking wrap correction.
Fixed by taking cognizance of these parameters, and by correcting start times
just before sync offsets are applied.
Provide arm64 neon optimized implementations for hscale16To19 with
filter sizes 4, 8 and X4.
The tests and benchmarks run on AWS Graviton 2 instances.
The results from a checkasm tool are shown below.
hscale_16_to_19__fs_4_dstW_512_c: 6216.0
hscale_16_to_19__fs_4_dstW_512_neon: 2257.0
hscale_16_to_19__fs_8_dstW_512_c: 10417.7
hscale_16_to_19__fs_8_dstW_512_neon: 3112.5
hscale_16_to_19__fs_12_dstW_512_c: 14890.5
hscale_16_to_19__fs_12_dstW_512_neon: 3899.0
hscale_16_to_19__fs_16_dstW_512_c: 19006.5
hscale_16_to_19__fs_16_dstW_512_neon: 5341.2
hscale_16_to_19__fs_32_dstW_512_c: 36629.5
hscale_16_to_19__fs_32_dstW_512_neon: 9502.7
hscale_16_to_19__fs_40_dstW_512_c: 45477.5
hscale_16_to_19__fs_40_dstW_512_neon: 11552.0
(Note, the checkasm tests for these functions haven't been
merged since they fail on x86.)
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Add arm64 neon implementations for hscale 16 to 15 with filter
sizes 4, 8 and X4.
The tests and benchmarks run on AWS Graviton 2 instances.
The results from a checkasm tool are shown below.
hscale_16_to_15__fs_4_dstW_512_c: 6703.5
hscale_16_to_15__fs_4_dstW_512_neon: 2298.0
hscale_16_to_15__fs_8_dstW_512_c: 10983.0
hscale_16_to_15__fs_8_dstW_512_neon: 3216.5
hscale_16_to_15__fs_12_dstW_512_c: 15526.0
hscale_16_to_15__fs_12_dstW_512_neon: 3993.0
hscale_16_to_15__fs_16_dstW_512_c: 20183.5
hscale_16_to_15__fs_16_dstW_512_neon: 5369.7
hscale_16_to_15__fs_32_dstW_512_c: 39315.2
hscale_16_to_15__fs_32_dstW_512_neon: 9511.2
hscale_16_to_15__fs_40_dstW_512_c: 48995.7
hscale_16_to_15__fs_40_dstW_512_neon: 11570.0
(Note, the checkasm tests for these functions haven't been
merged since they fail on x86.)
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Add arm64 neon implementations for hscale 8 to 19 with filter
sizes 4, 4X and 8. Both implementations are based on very similar ones
dedicated to hscale 8 to 15. The major changes refer to saving
the data - instead of writing the result as int16_t it is done
with int32_t.
These functions are heavily inspired on patches provided by J. Swinney
and M. Storsjö for hscale8to15 which were slightly adapted for
hscale8to19.
The tests and benchmarks run on AWS Graviton 2 instances. The results
from a checkasm tool shown below.
hscale_8_to_19__fs_4_dstW_512_c: 5663.2
hscale_8_to_19__fs_4_dstW_512_neon: 1259.7
hscale_8_to_19__fs_8_dstW_512_c: 9306.0
hscale_8_to_19__fs_8_dstW_512_neon: 2020.2
hscale_8_to_19__fs_12_dstW_512_c: 12932.7
hscale_8_to_19__fs_12_dstW_512_neon: 2462.5
hscale_8_to_19__fs_16_dstW_512_c: 16844.2
hscale_8_to_19__fs_16_dstW_512_neon: 4671.2
hscale_8_to_19__fs_32_dstW_512_c: 32803.7
hscale_8_to_19__fs_32_dstW_512_neon: 5474.2
hscale_8_to_19__fs_40_dstW_512_c: 40948.0
hscale_8_to_19__fs_40_dstW_512_neon: 6669.7
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
The FFmpeg encoder does not increment the temporal reference field, so use
that knowledge plus the extradata field to detect buggy versions of the encoder.
Fixes ticket #128.
The SVQ1 interframe mean VLC symbols -128 and 128 are incorrectly swapped
in our SVQ1 implementation, resulting in visible artifacts for some videos.
This patch unswaps the order of these two symbols.
The most noticable example of the artiacts caused by this error can be observed in
https://trac.ffmpeg.org/attachment/ticket/128/svq1_set.7z '352_288_k_50.mov'.
The artifacts are not observed when using the reference decoder
(QuickTime 7.7.9 x86 binary).
As a result of this patch, the reference data for the fate-svq1 test
($SAMPLES/svq1/marymary-shackles.mov) must be modified. For this file, our
decoder output is now bitwise identical to the reference decoder. I have
tested patch with various other samples and they are all now bitwise identical.
ff_mpeg1_dc_scale_table is the default value for
[yc]_dc_scale_table (as set by ff_mpv_common_defaults()).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This e.g. allows compilers to bake the offset implied
by using ff_mpeg12_dc_scale_table[3] (as the SpeedHQ encoder
does) into the general offset; for certain arches this is
also necessary in order to avoid building suboptimal code.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These tables are only accessed in ff_set_qscale()
which only accesses values 1..31 as well as in
encode_picture() in mpegvideo_enc.c, accessing
the value with index 8. So make these tables smaller.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For encoders, mpeg_quant is an option of the MPEG-4 encoder
and therefore constant. This implies that one can set
the dct_unquantize_(intra|inter) function pointers during init.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The frei0r API requires linesize to be width * 4.
Since the align property of ff_default_get_video_buffer2
specifies line alignment, not buffer alignment, the
previous value of 16 breaks this requirement for frames
whose width is a multiple of 8, but not a multiple of 16
as it adds extra padding to ensure line aligment.
Fix Trac ticket #9873
This basically reverts cc0380222a.
At the time of said commit, cleanup on init failure was very
buggy. For codecs with the INIT_CLEANUP cap, the close function
could be called on error even before the private data has been
allocated; and when using frame threading the same could also
happen even without said flag. Some mpegvideo decoders were
affected by the latter.
Yet both of these issues have been fixed long ago, the latter
in commit e9b6617579. Therefore
the workaround in ff_mpv_common_end() can be removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It avoids checks and allows to make ff_wmv2_decode_mb() static;
furthermore, it allows to avoid a config_components.h inclusion.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Only encoders need two sets of int16_t [12][64]
(one to save the current best state and one for the current
working state); decoders need only one. This saves 1.5KiB
per slice context for a decoder.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is only used by metasound.c, so this allows to make
the data static.
To do this, remove metasound_data.h, rename metasound_data.c
into metasound_data.h (and add inclusion guards, remove ff_
prefixes etc.).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Namely into a header metasound_twinvq_data.h included
in twinvq.c (the common file of MetaSound and TwinVQ).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This can be achieved by moving the AVOnce out of the structure
containing the function pointers; the latter can then be made
const.
This also has the advantage of eliminating padding in the structure
(sizeof(AVOnce) is four here) and allowing the AVOnces to be put
into .bss (dependening upon the implementation).
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is possible to avoid the factors array for the power-of-two
tables for which said array is unused by using a different
structure for initialization for power-of-two tables than for
non-power-of-two-tables. This saves 3*15*16B from .data.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Treat the 32 bit stride registers as signed.
Alternatively, we could make the stride arguments ptrdiff_t instead
of int, and changing all of the assembly to operate on these
registers with their full 64 bit width, but that's a much larger
and more intrusive change (and risks missing some operation, which
would clamp the intermediates to 32 bit still).
Fixes: https://trac.ffmpeg.org/ticket/9985
Signed-off-by: Martin Storsjö <martin@martin.st>
Support for building with older versions of MSVC (with the
c99wrap/c99conv frontend) was removed in
ce943dd6ac.
Signed-off-by: Martin Storsjö <martin@martin.st>
ff_rl_init() initializes RLTable.(max_level|max_run|index_run);
max_run is unused by the SpeedHQ encoder (as well as MPEG-1/2).
Furthermore, it initializes these things twice (for two passes),
but the second half of this is never used.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_rl_init() initializes RLTable.(max_level|max_run|index_run);
max_run is unused by the MPEG-1/2 encoders (as well as SpeedHQ).
Furthermore, it initializes these things twice (for two passes),
but the second half of this is never used.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows to exploit that ff_rl_mpeg1 and ff_rl_mpeg2
only differ in their VLC table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_rl_mpeg1 and ff_rl_mpeg2 differ only in RLTable.table_vlc,
which ff_rl_init() does not use to initialize RLTable.max_level,
RLTable.max_run and RLTable.index_run. This implies
that these tables agree for ff_rl_mpeg1 and ff_rl_mpeg2;
hence one can just use one of them and avoid calling ff_rl_init()
ff_rl_mpeg2 alltogether.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This will allow to remove ff_rl_mpeg2 soon and remove
all uses of RLTable in MPEG-1/2/SpeedHQ later.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
A two byte sync word is not enough to ensure we got a real syncframe, nor are
all the range checks we do in the first seven bytes. Do therefore an integrity
check for the sync frame in order to prevent the parser from filling avctx with
bogus information.
Signed-off-by: James Almer <jamrial@gmail.com>
Have it only find frame boundaries. The stream props will then be filled once
we have an assembled frame.
Signed-off-by: James Almer <jamrial@gmail.com>
Code freeing the struct on failure is kept for backwards compatibility, but
should be removed in the next major bump, and the existing lavf user adapted.
Signed-off-by: James Almer <jamrial@gmail.com>
for videos with wmv9 rectangles, the region drawn by ff_mss12_decode_rect
may be less than the entire video area. the wmv9 rectangles are used to
calculate the ff_mss12_decode_rect draw region.
Fixes tickets #3255 and #4043
Duplicates of the standard JPEG quantization tables were found in the
AGM, MSS34(dsp), NUV and VP31 codecs. This patch elimates those duplicates,
placing a single copy in jpegquanttables.c.
GCC 11 has a bug: When it creates clones of recursive functions
(to inline some parameters), it clones a recursive function
eight times by default, even when this exceeds the recursion
depth. This happens with encode_block() in libavcodec/svq1enc.c
where a parameter level is always in the range 0..5;
but GCC 11 also creates functions corresponding to level UINT_MAX
and UINT_MAX - 1 (on -O3; -O2 is fine).
Using such levels would produce undefined behaviour and because
of this GCC emits bogus -Warray-bounds warnings for these clones.
Since commit d08b2900a9, certain
symbols that are accessed like ff_svq1_inter_multistage_vlc[level]
are declared with hidden visibility, which allows compilers
to bake the offset implied by level into the instructions
if level is a compile-time constant as it is in the clones.
Yet this leads to insane offsets for level == UINT_MAX which
can be incompatible with the supported offset ranges of relocations.
This happens in the small code model (the default code model for
AArch64).
This commit therefore works around this bug by disabling cloning
recursive functions for GCC 10 and 11. GCC 10 is affected by the
underlying bug (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102513), so the workaround
also targets it, although it only produces three versions of
encode_block(), so it does not seem to trigger the actual issue here.
The issue has been mitigated in GCC 12.1 (it no longer creates clones
for impossible values; see also commit
1cb7fd317c), so the workaround
does not target it.
Reported-by: J. Dekker <jdek@itanimul.li>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: J. Dekker <jdek@itanimul.li>
The current code will override the *_disable fields (set by -vn/-an
options) when creating output streams for unlabeled complex filtergraph
outputs, in order to disable automatic mapping for the corresponding
media type.
However, this will apply not only to automatic mappings, but to manual
ones as well, which should not happen. Avoid this by adding local
variables that are used only for automatic mappings.
Specifically recording_time and stop_time - use local variables instead.
OptionsContext should be input-only to this code. Will allow making it
const in future commits.
This is similar to what was done before for output files and will allow
introducing demuxer-private state in future commits
Unlike for muxing, the code is moved to existing ffmpeg_demux.c rather
than to a new file. The reason is just file size - the demuxing code is
much smaller than muxing.
The next commit and other commits in future will use more ExtParam
buffers.
And combine 2 free functions into single one
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Up until now, av_aes_init() uses a->round_key[0].u8 + t
as dst of memcpy where it is intended for t to greater
than 16 (u8 is an uint8_t[16]); given that round_key itself
is an array, it is actually intended for the dst to be
in a latter round_key member. To do this properly,
just cast a->round_key to unsigned char*.
This fixes the srtp, aes, aes_ctr, mov-3elist-encrypted,
mov-frag-encrypted and mov-tenc-only-encrypted
FATE-tests with (Clang-)UBSan.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The AES code uses av_aes_block, a union consisting of
uint64_t[2], uint32_t[4], uint8_t[4][4] and uint8_t[16].
subshift() performs byte-wise manipulations of two av_aes_blocks,
but when encrypting, it does so with a shift of two bytes;
more precisely, it uses
"av_aes_block *s1 = (av_aes_block *) (s0[0].u8 - s)"
and lateron uses the uint8_t[16] member to access s0.
Yet av_aes_block requires to be suitably aligned for
the uint64_t[2] member, which s0[0].u8 - 2 is certainly
not. This is in violation of 6.3.2.3 (7) of C11. UBSan
reports this in the aes_ctr, mov-3elist-encrypted,
mov-frag-encrypted, mov-tenc-only-encrypted and srtp
tests.
Furthermore, there is another issue here: The pointer points
outside of s0; this works, because all the accesses lateron
use an index >= 3. (Clang-)UBSan reports this as
"runtime error: index -2 out of bounds for type 'uint8_t[16]'".
This commit fixes both of these issues: The latter issue
is fixed by applying an offset of "+ 3" during the cast
and subtracting this from the indices used lateron.
The former issue is solved by not casting to av_aes_block*
at all; instead simply cast to unsigned char*.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For an int array[8][2] using &array[8][0] (which is an int*
pointing to the element beyond the last element of array)
triggers a "runtime error: index 8 out of bounds for type 'int[8][2]'"
from (Clang-)UBSan in the fate-vsynth(1|2|_lena)-snow tests.
I don't know whether this is really undefined behaviour or does not
actually fall under the "pointer arithmetic with the element beyond
the last element of the array is allowed as long as it is not
accessed" exception". All I know is that the code itself does not
read from beyond the last element of the array.
Nevertheless rewrite the code to a form that UBSan does not complain
about.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Namely ScanTable.permutated. The rest of the IDCTDSP-API
is unused as cavs has its own idct.
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is the part of ff_init_scantable() that is used
by all users of said function.
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also rename the scantable variable to idct_permutation
to better reflect what it actually is.
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The eatqi decoder uses a custom IDCT and actually does not
use the IDCTDSP API at all. Somehow it was nevertheless
used to simply apply the identity permutation on ff_zigzag_direct.
This commit stops doing so.
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The eatgq decoder uses a custom IDCT and actually does not
use the IDCTDSP API at all. Somehow it was nevertheless
used to simply apply the identity permutation on ff_zigzag_direct.
This commit stops doing so. It also renames perm to scantable,
because it is only the scantable as given by the spec without
any further permutation performed by us.
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The eamad decoder uses a custom IDCT and actually does not
use the IDCTDSP API at all. Somehow it was nevertheless
used to simply apply the identity permutation on ff_zigzag_direct.
This commit stops doing so.
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
summary: This patch modifies the `curves` filter with new `interp` option
to let user pick the existing natural cubic spline interpolation
and the new PCHIP interapolation.
reason: The natural cubic spline does not impose monotonicity between
the keypoints. As such, the fitted curve may vary wildly against
user's intension. The PCHIP interpolation is not as smooth as
the natural spline but guarantees the monotonicity. Providing
both options enhances users experience (e.g., reduces the number
of keypoints to realize the desired curve). See the related bug
report for the example of an ill-interpolated curve.
alternate solution:
Both Photoshop and GIMP appear to use monotonic interpolation in
their curve tools, which were the models for this filter. As
such, an alternate solution is to drop the natural spline and
go without the `interp` option.
related bug report: https://trac.ffmpeg.org/ticket/9947 (filed by myself)
Signed-off-by: Takeshi (Kesh) Ikuma <tikuma@hotmail.com>
Renaming the decoder to speedhqdec.c makes sense since
we have an encoder in speedhqenc.c. Given that ff_rl_speedhq
is also used by the encoder, it is kept in speedhq.c
and a proper header for it is created to replace the ad-hoc
declaration in speedhqenc.c. This also allows to remove
the check for CONFIG_SPEEDHQ_DECODER, as it is always true
when speedhqdec.c is compiled.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The number of channels that is checked here is automatically
valid because it has just been set by us (based upon an entry
in codec_props).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It just contains duplicates of values from AVCodecContext
as well as an write-only pointer to said AVCodecContext.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: signed integer overflow: 9223372036854550860 + 530259564 cannot be represented in type 'long'
Fixes: 49093/clusterfuzz-testcase-minimized-ffmpeg_dem_ASF_O_fuzzer-4697179192688640
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
codec_id is always AV_CODEC_ID_MPEG4 for mpeg4_decode_mb(),
as the MPEG-4 decoder is the only decoder for which
ff_mpeg4_decode_picture_header() as well as decode_init()
are ever called and these are the only places where
the decode_mb function pointer is ever set to mpeg4_decode_mb().
ff_mpeg4_workaround_bugs() is also only called for the MPEG-4
decoder (the caller checks the codec id).
(ff_mpeg4_decode_picture_header() is also called for the MPEG-4
parser, but it never uses the decode_mb function pointer.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is set in the vol header and is therefore a sequence and
not only a picture property.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is only used by gmc/gmc1 which is only used by the MPEG-4
decoder, so move it to Mpeg4DecContext and rename it
to Mpeg4VideoDSP. Also compile it iff the MPEG-4 decoder is compiled.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
None of the MPEG-1/2 codecs support frame threading,
so one can optimize the check for it away.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This has the advantage of not having to check for whether
a given MpegEncContext is actually a decoder or an encoder
context at runtime.
To do so, mpv_reconstruct_mb_internal() is moved into a new
template file that is included by both mpegvideo_enc.c
and mpegvideo_dec.c; the decoder-only code (mainly lowres)
are also moved to mpegvideo_dec.c. The is_encoder checks are
changed to #if IS_ENCODER in order to avoid having to include
headers for decoder-only functions in mpegvideo_enc.c.
This approach also has the advantage that it is easy to adapt
mpv_reconstruct_mb_internal() to using different structures
for decoders and encoders (e.g. the check for whether
a macroblock should be processed for the encoder or not
uses MpegEncContext elements that make no sense for decoders
and should not be part of their context).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, we inlined lowres_flag as well as is_mpeg12
independently (unless CONFIG_SMALL was true); this commit
changes this to instead inline mpv_reconstruct_mb_internal()
(at most) four times, namely once for encoders, once for decoders
using lowres and once for non-lowres mpeg-1/2 decoders and once
for non-lowres non-mpeg-1/2 decoders (mpeg-1/2 is not inlined
in case of CONFIG_SMALL). This is neutral performance-wise,
but proved beneficial size-wise: It saved 1776B of .text
for GCC 11 or 1344B for Clang 14 (both -O3 x64).
Notice that inlining is_mpeg12 for is_encoder would not really
be beneficial, as the encoder codepath does mostly not depend
on is_mpeg12 at all.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There are two types of checks for whether the current codec
is MPEG-1/2 in mpv_reconstruct_mb_internal(): Those that are
required for correctness and those that are not; an example
of the latter is "is_mpeg12 || (s->codec_id != AV_CODEC_ID_WMV2)".
The reason for the existence of such checks is that
mpv_reconstruct_mb_internal() has the av_always_inline attribute
and is_mpeg12 is usually inlined, so that in case we are dealing
with MPEG-1/2 the above check can be completely optimized away.
But is_mpeg12 is not always inlined: it is not in case
CONFIG_SMALL is true in which case is_mpeg12 is always zero,
so that the checks required for correctness need to check
out_format explicitly. This is currently done via a macro
in mpv_reconstruct_mb_internal(), so that the fact that
it is CONFIG_SMALL that determines this is encoded at two places.
This commit changes this by making is_mpeg12 a three-state:
DEFINITELY_MPEG12, MAY_BE_MPEG12 and NOT_MPEG12. In the second
case, one has to resort to check out_format, in the other cases
is_mpeg12 can be taken at face-value. This will allow to make
inlining is_mpeg12 more flexible in a future commit.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Both the FFV1 decoder and encoder use a template of their own
to generate code multiple times. They also use a common template,
used by both decoder and encoder templates which is currently
instantiated in ffv1.h (and therefore also in ffv1.c, which
doesn't need it at all).
All these templates have the prerequisite that two macros
are defined, namely RENAME() and TYPE. The codec-specific
templates call the functions generated via the common template
via the RENAME() macro and therefore the macros used for
the common template must coincide with the macros used for
the codec-specific templates. But then it is better to not
instantiate the common template in ffv1.h, but in the codec
specific templates.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Some parts of mpegvideo.c behave differently depending
upon whether AVCodecContext.draw_horiz_band is set or not.
This differing behaviour makes lots of FATE tests fail
and leads to garbage output, although setting this callback
is not supposed to change the output at all.
These checks have been added in commits
3994623df2 and
b68ab2609c. The commit messages
do not contain a real reason for adding the checks and it is
indeed a mystery to me. But removing these checks fixes
the FATE tests when one adds an (empty) draw_horiz_band
when using a codec that claims to support it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
librav1e provides a function to create extradata, so use it instead of
extracting the sequence header OBU from packets.
Signed-off-by: James Almer <jamrial@gmail.com>
Up until now, ff_startcode_find_candidate_c() simply casts
an uint8_t* to uint64_t*/uint32_t* to read 64/32 bits at a time
in case HAVE_FAST_UNALIGNED is true. Yet this ignores the
alignment requirement of these types as well as effective type
rules of the C standard. This commit therefore replaces these
direct accesses with AV_RN64/32; this also improves
readability.
UBSan reported these unaligned accesses which happened in 233
FATE-tests involving H.264 and VC-1 (this has also been reported
in tickets #8138 and #8485); these tests are fixed by this commit.
The output of GCC with -O3 is unchanged for aarch64, loongarch,
ppc and x64 (as well as for arches like alpha for which
HAVE_FAST_UNALIGNED is never true in the first place).
There was only a slight difference for mips and arm.
I don't know about the speed impact of them.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Now that we have proper options for defining display matrix
overrides, this should no longer be required.
fftools does not have its own versioning, so for now the define is
just set to 1 and disables the functionality if set to zero.
This enables overriding the rotation as well as horizontal/vertical
flip state of a specific video stream on the input side.
Additionally, switch the singular test that was utilizing the rotation
metadata to instead override the input display rotation, thus leading
to the same result.
This e.g. allows compilers to bake the "+ 256" offset
used to access ff_v2_dc_(lum|chroma)_table into
the general offset; for certain arches this is also necessary
in order to avoid building suboptimal code.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These tables are not exported as avpriv symbols, but instead
included into every library using them. Therefore they
can be mark with the hidden elf visibility. For certain arches
this is necessary in order to avoid building suboptimal code;
for other arches it just allows the compiler to simplify accesses
like ff_mjpeg_bits_dc_luminance + 1 because the "+ 1" can be baked
into the offset.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is now possible since OutputStream is a child of OutputFile and the
code allocating it can access MuxStream. Avoids the overhead and extra
complexity of allocating two objects instead of one.
Similar to what was previously done for OutputFile/Muxer.
Replace it with an array of streams in each OutputFile. This is a more
accurate reflection of the actual relationship between OutputStream and
OutputFile. This is easier to handle and will allow further
simplifications in future commits.
This is now possible since the code allocating OutputFile can see
sizeof(Muxer). Avoids the overhead and extra complexity of allocating
two objects instead of one.
Similar to what is done e.g. for AVStream/FFStream in lavf.
ffmpeg_opt.c currently contains code for
- parsing the options provided on the command line
- opening and initializing input files based on these options
- opening and initializing output files based on these options
The code dealing with each of these is for the most part disjoint, so it
makes sense to move them to separate files. Beyond reducing the quite
considerable size of ffmpeg_opt.c, this will also allow exposing muxer
internals (currently private to ffmpeg_mux.c) to the initialization
code, thus removing the awkward separation currently in place.
qsvenc makes a copy when the input in system memory is not padded as the
SDK requires, however the padding area is not filled with right data
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Doxygen does not properly form references that span multiple levels,
so instead reword it a bit and manually add the references to what
they should point to.
The FF_PAD_STRUCTURE is too complex for doxygen to be able to
properly handle, resulting in completely broken AVBPrint documentation.
To fix that, tell Doxygen what to expand that macro to.
Avoids doxy to interpret these as internal references forced by the #
character, fixing the warnings:
warning: explicit link request to 'nanoTime()' could not be resolved
warning: explicit link request to 'releaseOutputBuffer(int,long)'
could not be resolved
The intention here was probably to document this as use of
conditionals does not make sense in a comment.
Fixes doxy warning:
warning: explicit link request to 'if' could not be resolved
Separate the blocks to make the grouping easier to grasp,
add the file properly to the group and fix the file description
incorrectly used as filename, fixing the following doxy warning:
warning: the name 'Colorspace' supplied as the argument in
the \file statement is not an input file
Make it a bit easier to grasp the grouping when not
unnecessarily splitting comment blocks.
Additionally do not try to add lavu_video_stereo3d to itself, resolving
the following doxy warning:
warning: Refusing to add group lavu_video_stereo3d to itself
Make it a bit easier to grasp the grouping when not
unnecessarily splitting comment blocks.
Additionally do not try to add lavu_video_spherical to itself, resolving
the following doxy warning:
warning: Refusing to add group lavu_video_spherical to itself
Make it a bit easier to grasp the grouping when not
unnecessarily splitting comment blocks.
Additionally do not try to add lavu_video_display to itself, resolving
the following doxy warning:
warning: Refusing to add group lavu_video_display to itself
This is more natural, because said context is only used
for the duration of one call to svq1_encode_frame().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The encoder uses ff_svq1_inter_mean_vlc + 256 and setting
hidden visibility allows to bake this "+ 256" into the
general offset of ff_svq1_inter_mean_vlc and the code
accessing it.
For certain arches, this is also required for the compiler
to not produce overtly pessimistic code that can't be fixed
up by the linker lateron.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Currently, SVQ1EncContext is defined in a header that is also
included by the arch-specific code that initializes the one
and only dsp function that this encoder uses directly.
But the arch-specific functions to set this dsp function
do not need anything from SVQ1EncContext. This commit therefore
adds a small SVQ1EncDSPContext whose only member is said
function pointer and renames svq1enc.h to svq1encdsp.h
to avoid exposing unnecessary internals to these init
functions (and the whole mpegvideo with it).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Bayer sources are read in groups of 2 lines (e.g. for a
BGGR flavor, the first row contains only B and G samples,
while the second row contains only G and R samples). They
need to be read as a whole.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This fixes a regression from commit 36117968ad.
wrapped_url_read() used to be able to return positive number from
ffurl_read(). It relies on the result to check if EOF is reached in
async_buffer_task().
But FIFO callbacks must return 0 on success. This should be handled
in ring_write() instead.
Test case:
ffmpeg -f lavfi -i testsrc -t 1 test.mp4
ffmpeg -i async:test.mp4
Signed-off-by: Guangyu Sun <gsun@roblox.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This simplifies the code as there is no other place the error buffer
is needed, so the av_err2str helper macro can be used.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
av_err2str which is a wrapper for av_strerror already calls
strerror_r if available and if not has a fallback for the other
error codes that would be handled by that, so manually calling
strerror again if it fails is not necessary.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
av_err2str which is a wrapper for av_strerror already calls
strerror_r if available and if not has a fallback for the other
error codes that would be handled by that, so manually calling
strerror again if it fails is not necessary.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This filter currently keeps the input timebase, but produces CFR output.
It is thus simpler to use 1/output fps as the output timebase.
Also, set output frame durations.
Currently it would essentially change the find_stream_info setting for
the file it was specified for and all following files, which is unusual
and somewhat unexpected behaviour for a per-file option and not even
documented to behave like this.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
VSETVLI xd, x0, ...' has rather nonobvious semantics:
- If xd is x0, then it preserves the current vector length.
- If xd is not x0, it sets the vector length to the supported maximum.
Also somewhat confusingly, while VMV.X.S always does its thing
regardless of the selected vector length, VMV.S.X does _nothing_ if the
selected vector length is zero.
So the current code breaks fails to initialise the accumulator if we
are unlucky to have a selected vector length of zero on entry. Fix it
by forcing the vector length to one.
The binkaudio decoders don't need mdct or sinewin at all;
and binkaudio_dct doesn't need rdft directly (but nevertheless
uses it indirectly via dct).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Add an AV_PIX_FMT_NE macro for RGB32FBE/RGB32FLE and also one for
RGBA32FBE/RGBA32FLE for packed 32-bit float RGB samples, and also
packed 32-bit float RGBA samples, respectively.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Leo Izen <leo.izen@gmail.com>
There is no MMX code for (add|put|put_signed)_pixels_clamped
since commit bfb28b5ce8, so use
declare_func instead of declare_func_emms() to also test that
we are not in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for diff_bytes since commit
230ea38de1, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for add_int16 since commit
4b6ffc2880, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for llviddsp after commit
fed07efcde, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for pixblockdsp after commit
92b5800277, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for audiodsp after commit
3d716d38ab, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for blockdsp after commit
ee551a21dd, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no MMX code for vc1_inv_trans_8x8 or
vc1_unescape_buffer, so use declare_func instead of
declare_func_emms() to also test that we are not in MMX
mode after return.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Currently it is done in several different ways, which
might cause needless dependencies or in case of
tx_float_neon.S is incorrect.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Currently, it is accessed via a pointer (ff_celt_window)
exported from opustab.h which points inside a static array
(ff_celt_window_padded) in opustab.h. Instead export
ff_celt_window_padded directly and make opustab.h
a static const pointer pointing inside ff_celt_window_padded.
Also mark all the declarations in opustab.h as hidden,
so that the compiler knows that ff_celt_window has a fixed
offset from the code even when compiling position-independent
code.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
HEVC spec has CRA frame which allows random access with open GOP, hence
it can achieve higher compression efficiency.
Removing the entry was suggested by Andreas
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
AV_PIX_FMT_P012, AV_PIX_FMT_Y212 and AV_PIX_FMT_XV36 are used in
FFmpeg and MFX_FOURCC_P016, MFX_FOURCC_Y216, and MFX_FOURCC_Y416 are used
in the SDK
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
P012, Y212 and XV36 are used for 12bit content in FFmpeg VAAPI, so
these formats should be used in FFmpeg QSV too, however the SDK only
declares support for P016, Y216 and Y416. So this commit fudged mappings
between AV_PIX_FMT_P012 and MFX_FOURCC_P016, AV_PIX_FMT_Y212 and
MFX_FOURCC_Y216, AV_PIX_FMT_XV36 and MFX_FOURCC_Y416.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
XV30 is used for 10bit 4:4:4 content in FFmpeg VAAPI, so XV30 should be
used for 10bit 4:4:4 content in FFmpeg QSV too because QSV is based on
VAAPI on Linux. However the SDK only declares support for Y410 but does
nothing with the alpha in Y410, so this commit fudged a mapping between
AV_PIX_FMT_XV30 and MFX_FOURCC_Y410.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
We can't get Shift from bit depth for some formats in the SDK. For
example, bit depth is 10, however Shift is 0 for Y410 (XV30 in FFmpeg).
In order to support these formats in the next commits, this patch
specified Shift for each format
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Although the DSP function only uses single precision from RISC-V F, the
caller may leave double precision values in the spilled registers if the
calling convention supports double precision hardware floats. Then, we
need to save and restore FS registers as double precision.
Conversely, we do not need to save anything at all if an integer calling
convention is in use. However we can assume that single precision floats
are supported, since the Zve32f extension implies the F extension.
So for the sake of simplicity, we always save at least single precision
values.
In theory, we should even save quadruple precision values if the LP64Q
ABI is in use. I have yet to see a compiler that supports it though.
This adds a variant of the postfilter for use with 512-bit vectors.
Half a vector is enough to perform the scalar product. Normally a whole
vector would be used anyhow. Indeed fractional multiplers are no faster
than the unit multipler.
But in this particular function, a full vector makes up 16 samples,
which would be loaded at each iteration of the outer loop. The minimum
guaranteed CELT postfilter period is only 15. Accounting for the edges,
we can only safely preload up to 13 samples.
The fractional multipler is thus used to cap the selected vector length
to a safe value of 8 elements or 256 bits.
Likewise, we have the 1024-bit variant with the quarter multipler. In
theory, a 2048-bit one would be possible with the eigth multipler, but
that length is not even defined in the specifications as of yet, nor is
it supported by any emulator - forget actual hardware.
This adds a variant of the postfilter for use with 256-bit vectors.
As a single vector is then large enough to perform the scalar product,
the group multipler is reduced to just one at run-time.
The different vector type is passed via register. Unfortunately,
there is no VSETIVL instruction, so the constant vector size (5) also
needs to be passed via a register.
On most cases, the vector type (VTYPE) for the RISC-V Vector extension
is supplied as an immediate value, with either of the VSETVLI or
VSETIVLI instructions. There is however a third instruction VSETVL
which takes the vector type from a general purpose register. That is so
the type can be selected at run-time.
This introduces a macro to load a (valid) vector type into a register.
The syntax follows that of VSETVLI and VSETIVLI, with element size,
group multiplier, then tail and mask policies.
This is implemented for a vector size of 128-bit. Since the scalar
product in the inner loop covers 5 samples or 160 bits, we need a group
multipler of 2.
To avoid reconfiguring the vector type, the outer loop, which loads
multiple input samples sticks to the same multipler. Consequently, the
outer loop loads 8 samples per iteration. This is safe since the minimum
period of the CELT codec is 15 samples.
The same code would also work, albeit needlessly inefficiently with a
vector length of 256 bits. A proper implementation will follow instead.
Fixes the following integer overflows:
libavfilter/vf_rotate.c:273:13: runtime error: signed integer overflow: 92951468 + 2058533568 cannot be represented in type 'int'
libavfilter/vf_rotate.c:273:37: runtime error: signed integer overflow: 39684 * 54149 cannot be represented in type 'int'
libavfilter/vf_rotate.c:272:13: runtime error: signed integer overflow: 247587320 + 1900985032 cannot be represented in type 'int'
libavfilter/vf_rotate.c:272:37: runtime error: signed integer overflow: 42584 * 50430 cannot be represented in type 'int'
libavfilter/vf_rotate.c:272:50: runtime error: signed integer overflow: 65083 * 52912 cannot be represented in type 'int'
libavfilter/vf_rotate.c:273:50: runtime error: signed integer overflow: 65286 * 38044 cannot be represented in type 'int'
Fixes ticket #9799, different output with different compilers.
Since commit 4fc2531fff opus.c
contains only the celt stuff shared between decoder and encoder.
meanwhile, opus_celt.c is decoder-only. So the new names
reflect the actual content better than the current ones.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The PutBitContext has already been flushed a few lines above
and nothing has been written to it in the meantime.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
(There is a small issue that is now being treated differently:
The earlier code would record a position in a buffer that
is being written to via put_bits(), then write data,
then overwrite the byte at the position recorded earlier
and only then flush the PutBitContext. In case there was
no writeout in the meantime, said flush would overwrite
what one has just written. This never happened in my tests,
but maybe it can happen. In this case this commit fixes
this issue by flushing before overwriting the old data.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_square_tab is always used with an offset; if this table
is marked as hidden, the compiler can infer that it and
therefore also ff_square_tab + 256 have a fixed offset
from the code. This allows to avoid performing "+ 256"
at runtime by baking it into the offset from the code to
the table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This codec uses BswapDSP, BlockDSP and IDCTDSP.
The former never used MMX, the latter does not use it
for idct_put since bfb28b5ce8
and BlockDSP does not use it since commit
ee551a21dd.
Therefore this emms_c() is can be removed.
(It was actually always redundant, because its caller
(decode_simple_internal()) calls emms_c() itself afterwards.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These matrices are only used for MJPEG, not for LJPEG.
So only check them for the former. This is in preparation
for removing said matrices from LJPEG altogether
(i.e. sending NULL matrices).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The codes here have the property that the long codes
are to the left of the tree (each zero bit child node
is by definition to the left of its one bit sibling);
they also have the property that among codes of the same length,
the symbol is ascending from left to right.
These properties can be used to create the codes from
the lengths in only two passes over the array of lengths
(the current code uses one pass for each length, i.e. 32):
First one counts how many nodes of each length there are.
Then one calculates the range of codes of each length
(possible because the codes are ordered by length in the tree).
This enables one to calculate the actual codes with only
one further traversal of the length array.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
While the share of elements used by both is quite big, the amount
of code shared between the decoders and encoders is negligible.
Therefore one can easily split the context if one wants to.
The reasons for doing so are that the non-shared elements
are non-negligible: The stats array which is only used by
the encoder takes 524288B of 868904B (on x64); similarly,
pix_bgr_map which is only used by the decoder takes 16KiB.
Furthermore, using a shared context also entails inclusions
of unneeded headers like put_bits.h for the decoder and get_bits.h
for the encoder (and all of these and much more for huffyuv.c).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These parameters are easily accessible whereever they
are accessed, so using copies from HYuvContext is
unnecessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
av_pix_fmt_get_chroma_sub_sample() is superfluous if one
already has an AVPixFmtDescriptor.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is nearly unused anyway, so stop use the field altogether.
This is in preparation for splitting HYuvContext into
decoder and encoder contexts.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
All codecs here have the FF_CODEC_CAP_INIT_CLEANUP set,
so ff_huffyuv_common_end() will be called automatically
in encode_end() on error.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The ffvhuff encoder has AVCodec.pix_fmts set and therefore
encode_preinit_video() checks that the used pixel format
is permissible.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
avio_seek() is called inside check(). Seeking to 'off' then seeking
to 'off + i' is unefficient, and it can loop 64 * 1024 times in the
worst case. When probe a malformed file over HTTP, it looks like
stucked forvever. ffio_ensure_seekback() doesn't solve the issue
when the stream is seekable but slow.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Add PCR at keyframe can be undesirable when -pcr_period is
specified. Add an flag to disable this behavior.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
There is no MMX code for loop filters since commit
6a551f1405, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
PixblockDSP does not use MMX functions any more since
92b5800277 and FDCTDSP
since d402ec6be9.
BswapDSP never used MMX, so that the emms_c() here
is unnecessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This also enables the compiler to optimize the implicit
checks performed by the PutBit-API away (Clang does so).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The current pbc might be small for an obu frame, so a new pbc is
required then parse this obu frame again. Because
CodedBitstreamAV1Context has already been updated for this obu frame, we
need to restore CodedBitstreamAV1Context, otherwise
CodedBitstreamAV1Context doesn't match this obu frame when parsing obu
frame again, e.g. CodedBitstreamAV1Context.order_hint.
$ ffmpeg -i input.ivf -c:v copy -f null -
[...]
[av1_frame_merge @ 0x558bc3d6f880] ref_order_hint[i] does not match
inferred value: 20, but should be 22.
[av1_frame_merge @ 0x558bc3d6f880] Failed to write unit 1 (type 6).
[av1_frame_merge @ 0x558bc3d6f880] Failed to write packet.
[obu @ 0x558bc3d6e040] av1_frame_merge filter failed to send output
packet
Reviewed-by: James Almer <jamrial@gmail.com>
Reviewed-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
It does not require anything that is being set between
the new position where it is called and the old position
where it used to be called; and nothing that it sets
gets overwritten between these two positions.
Doing so allows to remove a check lateron.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It does not require anything that is being set between
the new position where it is called and the old position
where it used to be called; and nothing that it sets
gets overwritten between these two positions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Thanks, Pierre-Anthony. I've updated the patch to remove the unnecessary UL and it's now using mxf_match_uid() to detect the EKLV packet.
Signed-off-by: Richard Ayres <richard.ayres@bydeluxe.com>
Using unsigned and negative linesizes doesn't really work.
Use ptrdiff_t instead. This fixes the fraps-v0 and fraps-v1
FATE tests with negative linesizes.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use ptrdiff_t instead of unsigned for linesizes.
Fixes the armovie-escape124 FATE test with negative linesizes.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
c93.c used an int for the stride and an unsigned for the current
linenumber. This does not work when using negative linesizes.
So use ptrdiff_t for stride and int for linenumber.
This fixes the cyberia-c93 FATE test when using negative linesizes.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The data in SGI images is stored planar, so exporting
it via planar pixel formats is natural.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
SGI is intra-frame only; the decoder therefore does not
maintain any state between frames, so remove
the private context.
Also rename depth to nb_components.
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The earlier code used "p->data[0] + p->linesize[0] * s->height" with
the latter being unsigned, which gives the wrong value for negative
linesizes. There is also a not so obvious problem with this:
In case of negative linesizes, the last line is the start of
the allocated buffer, so using the line after the last line
would involve undefined pointer arithmetic. So don't do it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes segfaults with negative linesizes; in particular,
this affected the sunraster-(1|8|24)bit-(raw|rle) and
sunraster-8bit_gray-raw FATE tests.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
A lot of the stuff in ASV1Context is actually only used
by decoders or encoders, but not both: Of the seven contexts
in ASV1Context, only the BswapDSPContext is used by both.
So splitting makes sense.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Simply taking the Zbb REV8 instruction into use in a simple loop gives
some significant savings:
bswap_buf_c: 1081.0
bswap_buf_rvb_b: 771.0
But we can also use the 64-bit REV8 as a pseudo-SIMD instruction with
just one additional shift, and one fewer load, effectively doubling the
bandwidth. Consequently, this patch is useful even if the compile-time
target has Zbb enabled for C code:
bswap_buf_c: 1081.0
bswap_buf_rvb_b: 341.0 (this patch)
On the other hand, this approach fails miserably for bswap16_buf as the
ratio of shifts and stores becomes unfavorable compared to naïve C:
bswap16_buf_c: 1542.0
bswap16_buf_rvb_b: 1803.7
Unrolling to process 128 bits (4 samples) at a time actually worsens
performance ever so slightly:
bswap_buf_c: 1081.0
bswap_buf_rvb_b: 408.5
Unfortunately, it is common, and will remain so, that the Bit
manipulations are not enabled at compilation time. This is an official
policy for Debian ports in general (though they do not support RISC-V
officially as of yet) to stick to the minimal target baseline, which
does not include the B extension or even its Zbb subset.
For inline helpers (CPOP, REV8), compiler builtins (CTZ, CLZ) or
even plain C code (MIN, MAX, MINU, MAXU), run-time detection seems
impractical. But at least it can work for the byte-swap DSP functions.
To avoid data dependencies, this does the following unroll, which
requires one extra but probably free addition:
coeff = (b * left_weight) >> decorr_shift;
b += a;
a -= coeff;
b -= coeff;
swap(a, b);
We never guard against a user freeing/stealing the private context;
and returning AVERROR(EIO) is inappropriate.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Move ROUND_MUL* macros to their only users and the Celt
macros to opus_celt.h. Also improve the other headers
a bit while at it.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Simply don't include opus_pvq.h in opus_celt.h: The latter only
uses pointers to CeltPVQ.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
opus.h (which is used by all the Opus code) currently includes
several structures only used by the parser and the decoder;
several elements of OpusContext are even only used by the decoder.
This commit therefore moves the part of OpusContext that is shared
between these two components (and used by ff_opus_parse_extradata())
out into a new structure and moves all the other accompanying
structures and functions to a new header, opus_parse.h; the
functions itself are also moved to a new file, opus_parse.c.
(This also allows to remove several spurious dependencies
of the Opus parser and encoder.)
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
More than 2 channels seems unsupported, the code seems to just output empty extra channels
Fixes: Timeout
Fixes: 51569/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SPEEX_fuzzer-5511509165342720
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 119760682 - -2084600173 cannot be represented in type 'int'
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_VIVIDAS_fuzzer-6745781167587328
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Otherwise p->linesize[0] * y will be evaluated as an unsigned
which leads to segfaults in case linesize is negative.
This happens in the apng-dispose-previous FATE-test in case
one makes get_buffer return pictures with negative linesizes.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This might happen in avio_write() if size == 0
when the direct codepath is taken. It is undefined behaviour
according to the spec although it happens to work in practice.
Fixes the webm-webvtt-remux FATE-test under UBSan.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
partial_corr is an int16_t and so the av_clipl_int32()
never clips and can be removed. This also avoids
undefined left-shifts of negative numbers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Return immediately if not enough leftover bits are available
when flushing. This is simpler and also avoids an
init_get_bits(gb, NULL, 0) (which currently leads to NULL + 0,
which is UB; this affects the lossless-wma(|-1|-2|-rawtile)
FATE tests).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Happens when flushing. This triggers NULL + 0 (which is UB) in
init_get_bits_xe (which previously errored out, but the return value
has not been checked) and in copy_bits().
This fixes the wmavoice-(7|11|19)k FATE-tests with UBSan.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
src2 is used in CAVS_SUBPIX_HV iff FULL is true (it is exactly
for the egpr functions); otherwise it might be NULL. So check
for FULL before doing pointer arithmetic.
Fixes a "src/libavcodec/cavsdsp.c:524:1: runtime error: applying
non-zero offset 8 to null pointer" from UBSan.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
As long as ff_mpeg12_common_init() existed in mpeg12.c,
it added a dependency of mpeg12.o on mpegvideodata.o
(which provides ff_mpeg2_dc_scale_table, which is used
in ff_mpeg12_common_init()). mpegvideodata.o is normally
provided by the mpegvideo subsystem and therefore several
codecs and the MPEG-1/2 parser added a configure dependency
on said subsystem (additionally, the eatqi decoder just
added a Makefile dependency on mpegvideodata.o).
Given that ff_mpeg12_common_init() is no more, these dependencies
can be removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It only sets [yc]_dc_scale_table and these tables are only
read in ff_set_qscale(); but the MPEG-1/2 decoders don't call
ff_set_qscale() at all.
(Furthermore, given that intra_dc_precision is always zero
for a decoder at this point, ff_mpeg12_common_init()
actually set these pointers to what ff_mpv_common_defaults()
already set them.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is better place for these declarations than
mpeg12data.h as RL VLC are just a variant of VLCs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Starting with an h264 implementation. Can be extended to support other codecs.
A few caveats:
- OpenGOP streams are currently not supported. The firt packet must be an IDR
frame.
- In some streams, a few frames at the end may not get a reordered PTS when
they reference frames past EOS. The code added to derive timestamps from
previous frames needs to extended.
Addresses ticket #502.
Signed-off-by: James Almer <jamrial@gmail.com>
Provide optimized implementation for vsse_intra8 for arm64.
Performance tests are shown below.
- vsse_5_c: 87.7
- vsse_5_neon: 26.2
Benchmarks and tests are run with checkasm tool on AWS Graviton 3.
Co-authored-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation of vsse8 for arm64.
Performance comparison tests are shown below.
- vsse_1_c: 141.5
- vsse_1_neon: 32.5
Benchmarks and tests are run with checkasm tool on AWS Graviton 3.
Signed-off-by: Grzegorz Bernacki <gjb@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Add vectorized implementation of nsse8 function.
Performance comparison tests are shown below.
- nsse_1_c: 256.0
- nsse_1_neon: 82.7
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Grzegorz Bernacki <gjb@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation of pix_abs8 function for arm64.
Performance comparison tests are shown below:
pix_abs_1_1_c: 162.5
pix_abs_1_1_neon: 27.0
pix_abs_1_2_c: 174.0
pix_abs_1_2_neon: 23.5
pix_abs_1_3_c: 203.2
pix_abs_1_3_neon: 34.7
Benchmarks and tests are run with checkasm tool on AWS Graviton 3.
Co-authored-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Grzegorz Bernacki <gjb@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Apparently this option was intended (the context contains a
currently-unused frame_rate field), but was never added. This results in
the output timebase being unset after config_output(), so the input
audio timebase ends up being used for video output, which is clearly
wrong.
Add an option for setting output video framerate. Also set output frame
durations.
It has been deprecated in favor of the aresample filter for almost 10
years.
Another thing this option can do is drop audio timestamps and have them
generated by the encoding code or the muxer, but
- for encoding, this can already be done with the setpts filter
- for muxing this should almost never be done as timestamp generation by
the muxer is deprecated, but people who really want to do this can use
the setts bitstream filter
The butterflies_fixed function pointer declaration specifies av_restrict
for the first two pointer arguments. So the corresponding function
definitions should honor this declaration.
MSVC emits warning C4113 for this.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Users can't make anything with its content.
Making it opaque might allow us to avoid one level of indirection.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
vorbis.h currently contains stuff only used by the native
Vorbis codecs and some Vorbis tables, which are also used by
Opus and libvorbis. Therefore split the data out into a header
of its own.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Don't increment back_frame if it does not correspond
to a real buffer. To do this, handle copying from
the back frame separately from the "use coded value"
codepath; also use memcpy for the former, as the
chunks here are typically worth it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This check is intended to be avoid buffer overflows,
yet there are four problems with it:
1. It has an in-built off-by-one error: len == out_end - out
is perfectly fine and nothing to worry about.
This off-by-one error led to the pixel in the lower-right corner
not being set properly for the back frame of the sample from
the rl2 FATE-test. This pixel is copied to every frame which
is the reason for the update to the reference file of said test.
With this patch, the output of the decoder matches the output
as captured from the reference decoder* (apart from the fact
that said reference somehow lacks the top part of the frame
(copied over from the background frame)).
2. Given that the stride of the buffer may be different
from the width of the video (despite one pixel taking one byte),
there is a second check lateron making the first check redundant
(if one returns immediately; a simple break at the second check
is not sufficient, because it only exits the inner loop).
3. The check is based around the assumption of the stride being
positive (it has this in common with the other check which
will be fixed in a future commit).
4. Even after fixing the off-by-one error, the check in
question is still triggered by all the non-background frames
in the FATE sample as well as by A1100100.RL2. In all these
cases, they use len == 255 and val == 128. For videos with
background frame this just means "copy from the background
frame", which would be done anyway lateron.* Yet for videos
without it copying it is necessary to avoid leaving
uninitialized parts in the video.
*: Available in https://samples.mplayerhq.hu/game-formats/voyeur-rl2/
**: Due to this, the code that copies the rest from the
back frame is no longer executed for any of the samples
available on the sample server. Given that these are only
the files from the demo version of this game, I don't know
whether this code is executed for any file in existence or not.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is only used by three of the thirty files that (potentially
indirectly) include mpeg4audio.h. Twenty of these files won't
have a put_bits.h inclusion any more after this patch.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows av_mediacodec_release_buffer to be called safely after
the decoder is closed, this was already the case with delay_flush=1.
Note that this causes holding onto frames to keep the decoding context
alive which is generally considered to be the intended behavior.
Signed-off-by: sfan5 <sfan5@live.de>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Namely to lavu/tests/pixelutils.c. This way, this function will
not be included into actual binaries any more.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Instead use av_pix_fmt_desc_next(). It is still possible
to check its return values by comparing it with the
(currently) expected values and the code does so.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_check_pixfmt_descriptors() was added in commit
20e99a9c10. At this time,
the values of enum AVPixelFormat were not contiguous;
instead there was a jump from 111 to 291 (or from 115
to 295 depending upon AV_PIX_FMT_ABI_GIT_MASTER).
ff_check_pixfmt_descriptors() accounts for this
by skipping empty descriptors. Yet this issue no longer
exists: There are no holes.
The check for said holes makes GCC believe that the name
can be NULL; because it is used as argument corresponding to
%s in a log statement, it therefore emits a warning
(since d75c4693fe). Therefore
this commit simply removes these checks.
Also move the checks for name before the log statement.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is currently 64-bit only because the stack spilling code would not
assemble on RV32I (and it would corrupt s0 and s1 on RV128I, in theory).
This could be added later in the unlikely that someone wants it.
Also remove a variable that is only used in this commented-out
codeblock. This fixes a -Wunused-variable warning.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Set preset default value to MFX_TARGETUSAGE_UNKNOWN. Let runtime to
decide the targetUsage, so that ffmpeg-qsv can keep up with runtime's
update.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Unset qsv_h264 and qsv_hevc's default settings. Let runtime to decide
these parameters, so that it can choose the best parameter and ffmpeg-qsv
can keep up with runtime's update.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
This avoids one redundant load per row; pix3 from the previous
iteration can be used as pix2 in the next one.
Before: Cortex A53 A72 A73
pix_abs_0_2_neon: 138.0 59.7 48.0
After:
pix_abs_0_2_neon: 109.7 50.2 39.5
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes the j2k-dwt FATE-test; also fixes#9945.
(I don't know whether the multiplication can overflow.)
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Possible since 48ac225db2.
Furthermore, the current code would not work on mips
in case ff_lsp2polyf() were used outside of lsp.c,
because it is not compiled on mips since commit
3827a86eac at all;
instead it is overridden with a static av_always_inline
function which only works for the callers in lsp.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Will avoid a forward declaration lateron.
Also adapt the function to modern style while at it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They will be mistaken for the sentinel of the arrays
they are in, thereby hiding the 6.1, 7.1 and 22.2 layouts.
(This doesn't really matter, as these arrays are informational
only for decoders.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
AVCodec.channel_layouts is deprecated and Clang (unlike GCC)
warns when setting this field in a codec definition.
Fortunately, Clang (unlike GCC) allows to use
FF_DISABLE_DEPRECATION_WARNINGS inside a definition (of an FFCodec),
so that one can create simple macros to set AVCodec.channel_layouts
that also suppress deprecation warnings for Clang.
(Notice that some of the codec definitions were already
inside FF_DISABLE/ENABLE_DEPRECATION_WARNINGS (that were not
guarded by FF_API_OLD_CHANNEL_LAYOUT); these have been removed.
Also notice that setting AVCodec.channel_layouts was not guarded
by FF_API_OLD_CHANNEL_LAYOUT either, so testing disabling it
it without removing all the codeblocks would not have worked.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Pointers to void can be converted to any pointer to incomplete or
object type and back; but they are nevertheless not completely generic
pointers: There is no provision in the C standard that guarantees their
convertibility with function pointers. C90 lacks a generic function
pointer, C99 made every function pointer a generic function pointer and
still disallows the convertibility with void *. Both GCC as well as
Clang warn about this when using -pedantic.
Therefore use unions to avoid these conversions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Don't place it as doxy specific for the order field, and generalize it both to
also cover already defined orders and to not make it seem like the user is
required to handle a layout they don't fully support or understand.
Signed-off-by: James Almer <jamrial@gmail.com>
av_display_rotation_get will return NAN when the display matrix is invalid,
which would end up printing NAN as an integer in the rotation field. This
is poor for multiple reasons:
* Users of ffprobe have no way of discerning "valid but ugly rotation from
display matrix" from "invalid display matrix".
* It can have unintended consequences on some platforms, such as Linux x86_64,
where NAN is equal to INT64_MIN, which, for example, when printed as JSON,
which uses floating point for all numbers, can end up as invalid JSON or wit
a number that cannot be reserialized as an integer at all.
Since NAN is av_display_rotation_get's error case, just print 0 (no rotation)
when that happens.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
This starts with one-time initialisation of the 26 constant factors
like 08edacc248. That is done with
the scalar instruction set. While the formula can readily be vectored,
the gains would (probably) be more than lost in transfering the results
back to FP registers (or suitably reshuffling them into vector
registers).
Note that the main loop could likely be scheduled sligthly better by
expanding the filter macro and interleaving loads with arithmetic.
It is not clear yet if that would be relevant for vector processing (as
opposed to traditional SIMD).
We could also use fewer vectors, but there is not much point in sparing
them (they are *all* callee-clobbered).
This uses the following vectorisation:
for (i = 0; i < blocksize; i++) {
ang[i] = mag[i] - copysignf(fmaxf(ang[i], 0.f), mag[i]);
mag[i] = mag[i] - copysignf(fminf(ang[i], 0.f), mag[i]);
}
RVV defines a total of 12 different extensions, including:
- 5 different instruction subsets:
- Zve32x: 8-, 16- and 32-bit integers,
- Zve32f: Zve32x plus single precision floats,
- Zve64x: Zve32x plus 64-bit integers,
- Zve64f: Zve32f plus Zve64x,
- Zve64d: Zve64f plus double precision floats.
- 6 different vector lengths:
- Zvl32b (embedded only),
- Zvl64b (embedded only),
- Zvl128b,
- Zvl256b,
- Zvl512b,
- Zvl1024b,
- and the V extension proper: equivalent to Zve64f and Zvl128b.
In total, there are 6 different possible sets of supported instructions
(including the empty set), but for convenience we allocate one bit for
each type sets: up-to-32-bit ints (RVV_I32), floats (RVV_F32),
64-bit ints (RVV_I64) and doubles (RVV_F64).
Whence the vector size is needed, it can be retrieved by reading the
unprivileged read-only vlenb CSR. This should probably be a separate
helper macro if needed at a later point.
RV64G supports MIN & MAX instructions natively only on floating point
registers, not general purpose ones. The later would require the Zbb
extension. Due to that, it is actually faster to perform the clipping
"properly" in FPU.
Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech):
audiodsp.vector_clipf_c: 29551.5
audiodsp.vector_clipf_rvf: 17871.0
Also tried unrolling with 2 or 8 elements but it gets worse either way.
This introduces compile-time and run-time CPU detection on RISC-V. In
practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of
I, F and D extensions, and if it does, it probably won't have run-time
detection. So the flags are essentially always set.
But as things stand, checkasm wants them that way. Compare the ARMV8
flag on AArch64. We are nowhere near running short on CPU flag bits.
When force_original_aspect_ratio and force_divisible_by are both
used, dimensions are now rounded to the nearest allowed multiple of
force_divisible_by rather than first rounding to the nearest integer and
then rounding in a static direction. This results in less distortion of
the aspect ratio.
Reviewed-by: Thierry Foucu <tfoucu@google.com>
Signed-off-by: Tristan Schmelcher <tschmelcher@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The general demuxing API uses parsers and decoders. Therefore
FFStream contains pointers to AVCodecContexts and
AVCodecParserContext and lavf/internal.h includes lavc/avcodec.h.
Yet actually only a few files files really use these; and it is best
when this number stays small. Therefore this commit uses opaque
structs in lavf/internal.h for these contexts and stops including
avcodec.h.
This also avoids including lavc/codec_desc.h implicitly. All other
headers are implicitly included as now (mostly through codec.h).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
avcodec_enum_to_chroma_pos() and avcodec_chroma_pos_to_enum()
deal with enum AVChromaLocation which is defined in lavu.
These functions are therefore replaced by
av_chroma_location_enum_to_pos() and av_chroma_location_pos_to_enum().
This commit provides the necessary deprecations. Also already make
these functions wrappers around the corresponding lavu functions
as not doing so would force one to disable deprecation warnings.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are intended as replacements for avcodec_enum_to_chroma_pos()
and avcodec_chroma_pos_to_enum().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are also frequently used in libavformat.
This change does not cause any breakage as avcodec.h
includes defs.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This also tests writing slice data in the unaligned mode
(some of these files use CAVLC) as well as updating
side data as well as parsing ISOBMFF avcc extradata.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unnecessary because an av_shrink_packet() a few lines below
will set the size; furthermore, it is actually harmful, because
av_shrink_packet() does nothing in case the size already matches,
so that the packet's padding is not correctly zeroed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is e.g. legal for an ISOBMFF avcc to contain zero parameter sets.
In this case the annex B that we produce would be empty and therefore
useless. This happens e.g. with mov/frag_overlap.mp4 from the
FATE-suite.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There is no check for whether these supposedly redundant PPS
are actually redundant. One could check via memcmp which would
work in practice* (because all content buffers are initially
zero-allocated), but this is not portable as compilers may
trash padding inside structures as they wish.
In case the PPS is not really redundant the output is garbage.
This happens with several files from the FATE-suite. E.g.
h264-conformance/CVCANLMA2_Sony_C.jsv doesn't decode correctly
any more, whereas h264-conformance/CABA3_TOSHIBA_E.264 even
fails in ff_cbs_write_packet(), because the inferred value
of num_ref_idx_l0_active_minus1 mismatches with the value set
in the slice (this happens when num_ref_idx_l0_default_active_minus1
changes in the PPS; the value in the slice header is inferred from
the original PPS's num_ref_idx_l0_default_active_minus1).
*: Unless slice_group_id is used, i.e. unless slice_group_map_type
is six.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: signed integer overflow: 1917019860 + 265558963 cannot be represented in type 'int'
Fixes: 48798/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DST_fuzzer-4833165046317056
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Maybe timestamp / duration validity should be checked earlier
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_WEBM_DASH_MANIFEST_fuzzer-6586894739177472
Fixes: signed integer overflow: 0 - -9223372036854775808 cannot be represented in type 'long'
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
There is probably a better place to check for this, but better
here than nowhere
Fixes: signed integer overflow: -9223372036824775808 - 86400000000 cannot be represented in type 'long'
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_SBG_fuzzer-6601162580688896
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 9223372036851135042 + 15666854 cannot be represented in type 'long'
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_SBG_fuzzer-6573717339111424
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2138820085 + 16130322 cannot be represented in type 'int'
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_LIVE_FLV_fuzzer-6704728165187584
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The check could be made more strict
Fixes: signed integer overflow: 36 * 538976288 cannot be represented in type 'int'
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_GENH_fuzzer-6539389873815552
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_DHAV_fuzzer-6604736532447232
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 1099511693312 * 538976288 cannot be represented in type 'long'
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_CAF_fuzzer-6565048815845376
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
avoids overflows with it
Fixes: signed integer overflow: 9223372036846866010 + 4294967047 cannot be represented in type 'long'
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_ASF_O_fuzzer-6538296768987136
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_ASF_O_fuzzer-657169555665715
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Use the stream duration as last resort, as an off-by-one result of the
"st->duration / (caf->packets - 1)" calculation can break playback on some
devices.
Also, don't write the sample_rate value propagated by encoders like libopus.
The sample rate of the audio fed to it is irrelevant after being encoded.
Fixes ticket #9930.
Signed-off-by: James Almer <jamrial@gmail.com>
MJPEG-B is an intra-codec, so it makes no sense to keep the reference.
It is unused lateron anyway.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This could be improved further by not allocating the buffers
that won't be needed lateron in the first place.
Reviewed-by: Tomas Härdin <tjoppen@acc.umu.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes assertion failures after avcodec_flush_buffers(), where
stashed hwaccel state is present, but prev_thread is NULL.
Found-by: Wang Bin <wbsecg1@gmail.com>
To support non-aligned buffers during the post-transform step, just iterate
backwards over the array.
This allows using the 15xN-point FFT, with which the speed is 2.1 times
faster than our old libavcodec implementation.
~4x faster than the C version.
The shuffles in the 15pt dim1 are seriously expensive. Not happy with it,
but I'm contempt.
Can be easily converted to pure AVX by removing all vpermpd/vpermps
instructions.
There are two issues here. Firstly, the floating-point comparison
is always true. Seconly, the code depends on the default value of
min_hard_comp implicitly, which can be dangerous.
Partially fixes ticket 9859.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
According to the HEIF specification (ISO/IEC 23008-12) Section
7.5.3.1, tracks with handler_type 'auxv' must contain a 'auxi' box
in its SampleEntry to notify the nature of the auxiliary track to the
decoder.
The content is the same as the 'auxC' box. So parameterize and re-use
the existing function.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: James Zern <jzern@google.com>
this maps to the vpxenc argument with the same name and the
VP9E_SET_MIN_GF_INTERVAL codec control
Signed-off-by: James Zern <jzern@google.com>
Reviewed-by: Vignesh Venkatasubramanian <vigneshv@google.com>
The input complex factors are constant for each iterations. This
substitudes 4 loads, 2 additions and 2 subtractions per iteration of
the inner-loop with another 4 loads. Thus effectively 4 arithmetic
operations per iteration of the inner loop are avoided, i.e. 24
operations per iteration of the outer loop, or 24 * (n - 1) operations
in total.
If the inner loop is not unrolled by the compiler, this also might
also save some pointer arithmetic as most instruction sets do not
have addressing modes with negated register offsets (12 - j). Unless
the compiler is optimising for code size, this is unlikely though.
Fixes: signed integer overflow: 9223372036854775807 - -2146905566 cannot be represented in type 'long'
Fixes: 50993/clusterfuzz-testcase-minimized-ffmpeg_dem_MXF_fuzzer-6570996594769920
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
In case a SupplementalProperty node exists in an adaptationset,
it is searched for a "schemeIdUri" property via xmlGetProp().
Whatever xmlGetProp() returns is then compared via av_strcasecmp()
to a string literal. xmlGetProp() can return NULL, namely in case
no "schemeIdUri" exists and (given that this string is allocated)
presumably also on allocation failure. No check for NULL is done,
so this may crash.
Furthermore, the string returned by xmlGetProp() needs to be freed
with xmlFree(), but this is not done either.
This commit fixes both of these issues; they existed since this code
has been added in 10d008f0fd.
This has been found while investigating ticket #9697. The continuous
leaks might very well be the reason behind the observed slowdown.
Reviewed-by: Steven Liu <lingjiujianke@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In case new orders are added in the future, existing library users can still
use the layout simply by ignoring everything but the channel count in it, so
make this explicit.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
ff_encode_preinit() ensures that the channel layout is equivalent
to one of the channel layouts in AVCodec.ch_layout; given that
all of these channel layouts have distinct numbers of channels,
one can therefore uniquely determine the channel layout by
the number of channels.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This encoder has AVCodec.ch_layouts set, so ff_encode_preinit()
ensures that the used channel layout is equivalent to one of
these.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_encode_preinit() has already checked that the channel layout
is equivalent to one of the layouts in AVCodec.ch_layouts.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These encoders have AVCodec.ch_layouts set, so ff_encode_preinit()
has already checked that the used channel layout is equivalent
to one of these native layouts. Therefore one can simply
compare the channel masks (with the added complication
that one has to use av_channel_layout_subset() to get it,
because the channel layout is not guaranteed to have
AV_CHANNEL_ORDER_NATIVE).
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The encoder actually creates files with side channels, not back
channels. See thd_layout in mlp_parse.h.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The encoders using this have AVCodec.ch_layouts set, so that
this is checked generically.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
None of the decoders here have the AV_CODEC_CAP_CHANNEL_CONF set,
so that it is already checked generically that the number of channels
is positive.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The pcm_bluray encoder has AVCodec.ch_layouts set, so that
ff_encode_preinit() checks that the channel layout in use
is equivalent to one of the layouts from AVCodec.ch_layouts.
Yet equivalent is not the same as identical; in particular,
custom layouts equivalent to native layouts are possible
(and necessary if one wants to use the name/opaque fields
with an ordinary channel layout), so one must not simply
use AVChannelLayout.u.mask. Use av_channel_layout_subset()
instead.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This decoder does not have the AV_CODEC_CAP_CHANNEL_CONF set,
so that number of channels has to be set by the user before
avcodec_open2().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Now that it is ensured that the old and new channel count/layout
values coincide if the old ones are set, the consistency of the
AVChannelLayout (which is checked before we reach this point)
implies the consistency of the old values, making these checks
here dead code. So remove them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This ensures that if AVCodecContext.channels or
AVCodecContext.channel_layout are set, AVCodecContext.ch_layout
has the equivalent values after this block.
(In case these values are set inconsistently, the consistency check
for ch_layout below will error out.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In particular, check the provided channel layout for encoders
without AVCodec.ch_layouts set. This fixes an infinite loop
in the WavPack encoder (and maybe other issues in other encoders
as well) in case the channel count is zero.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The wrapper for the legacy channel layout API already sets
AVCodecContext.channels based upon AVCodecContext.channel_layout
if the latter is set while the former is unset.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Provide optimized implementation for pix_median_abs8 function.
Performance comparison tests are shown below.
- median_sad_1_c: 277.0
- median_sad_1_neon: 82.0
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation for vsad8_intra function.
Performance comparison tests are shown below.
- vsad_5_c: 94.7
- vsad_5_neon: 20.7
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation for pix_median_abs16 function.
Performance comparison tests are shown below.
- median_sad_0_c: 720.5
- median_sad_0_neon: 127.2
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
When determining whether a packet should be decrypted,
should use the stsd_id of the fragment where the current packet is located.
Reviewed-by: Zhao Zhili <zhilizhao@tencent.com>
Signed-off-by: Wang Yaqiang <wangyaqiang03@kuaishou.com>
Old one was written with the assumption only even inputs would be given.
This very messy replacement supports even and odd inputs, and supports
AVX2 for extra speed. The buffers given are usually quite big (4k samples),
so the speedup is worth it.
The new SSE version is still faster than the old inline asm version by 33%.
Also checkasm is provided to make sure this monstrosity works.
This fixes some FATE tests.
Clang's static analyzer complains that leaving the variable
uninitialized could lead to a code path where the uninitialized value is
written to at the end of this function.
This patch simply zero-initializes that variable to avoid that.
Signed-off-by: Will Cassella <cassew@google.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The aim of this test is to show the interleavement
of the file generated in the first pass; so make the
interleavement queue in the framecrc muxer in the second
pass as small as possible so that the framecrc muxer does not
fix wrong interleavement of the input file behind our backs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
enc_dec is designed for raw input and output and computes
the PSNR between these two. The input of the shortest-sub
test is the idx file of a vobsub sub+idx combination
and the output is the output of framecrc of said vobsub
subtitle muxed into Matroska together with a synthesized
video. Calculating the PSNR between these two files makes
no sense, therefore switch to a transcode test, where
the ref file file contains the output of framecrc directly,
making the interleavement better visible in the ref file
at the cost of a larger ref file (>400 lines).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The MXF demuxer does not currently set AVStream::avg_frame_rate and ::r_frame_rate
when J2K essence is wrapped according to SMPTE ST 422.
Reviewed-by: Tomas Härdin <tjoppen@acc.umu.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Basically reverts af15c17daa.
Flipping a picture by modifying the pointers is so common
that even users of direct rendering should take it into account.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, libswscale/output.c used a macro to write
an output pixel which involved a call to av_pix_fmt_desc_get()
to find out whether the input pixel format is BE or LE
despite this being known at compile-time (there are templates
per pixfmt). Even worse, these calls are made in a loop,
so that e.g. there are eight calls to av_pix_fmt_desc_get()
for every pixel processed in yuv2rgba64_X_c_template()
for 64bit RGB formats.
This commit modifies these macros to ensure that isBE()
is evaluated at compile-time. This saved 41184B of .text
for me (GCC 11.2, -O3). Of course, it also improved performance.
E.g. ffmpeg_g -f lavfi -i testsrc2,format=yuva420p -pix_fmt rgba64le \
-threads 1 -t 1:00 -f null - (which uses yuv2rgba64le_X_c,
which is an invocation of yuv2rgba64_X_c_template() mentioned above),
performance improved from 95589 to 41387 decicycles for one call
to yuv2packedX; for the be variant the numbers went down from
76087 to 43024 decicycles.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, libswscale/input.c used a macro to read
an input pixel which involved a call to av_pix_fmt_desc_get()
to find out whether the input pixel format is BE or LE
despite this being known at compile-time (there are templates
per pixfmt). Even worse, these calls are made in a loop,
so that e.g. there are six calls to av_pix_fmt_desc_get()
for every pair of UV pixel processed in
rgb64ToUV_half_c_template().
This commit modifies these macros to ensure that isBE()
is evaluated at compile-time. This saved 9743B of .text
for me (GCC 11.2, -O3). For a simple RGB64LE->YUV420P
transformation like
ffmpeg -f lavfi -i haldclutsrc,format=rgba64le -pix_fmt yuv420p \
-threads 1 -t 1:00 -f null -
the amount of decicycles spent in rgb64LEToUV_half_c
(which is created via the template mentioned above)
decreases from 19751 to 5341; for RGBA64BE the number
went down from 11945 to 5393. For shared builds (where
the call to av_pix_fmt_desc_get() is indirect) the old numbers
are 15230 for RGBA64BE and 27502 for RGBA64LE, whereas
the numbers with this patch are indistinguishable from
the numbers from a static build.
Also make the macros that are touched conform to the
usual convention of using uppercase names while just at it.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, using NULL as key in av_dict_get() on a non-empty
AVDictionary would crash; using NULL as key in av_dict_set()
would also crash for a non-empty AVDictionary unless AV_DICT_MULTIKEY
was set; in case the dictionary was initially empty or AV_DICT_MULTIKEY
was set, it was even possible for av_dict_set() to succeed when
adding a NULL key, namely when one uses a value != NULL and
the AV_DICT_DONT_STRDUP_VAL flag. Using av_dict_get() on such
an AVDictionary will usually lead to crashes, though.
Fix this by actually checking for key in both functions; error out
if they are NULL.
While just at it, also stop relying on av_strdup(NULL) to return NULL
in av_dict_set().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The compiler cannot infer that the two float vectors do not alias,
causing unnecessary extra loads and serialisation. This patch caches
the two input values in local variables so that compiler can optimise
individual loop iterations.
While this probably never overflows, we are better safe than sorry.
The callback prototype should probably also use ptrdiff_t or size_t,
but I diggress (this would affect the DSP callback prototype).
Do this by setting AVCodecInternal.pad_samples.
This prevents reading into the frame's padding and writing
into the packet's padding.
This actually happened in our FATE tests (where the number of samples
is 2 mod 4), which therefore needed to be updated.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Some audio codecs work with atomic units that decode to a fixed
number of audio samples with this number being so small that it is
common to put multiple of these atoms into one packet. In these
cases it makes no sense to pad the last frame to the big frame_size,
so allow encoders to set the number of samples that they want
the last frame to be padded to instead.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In particular, check that there is only one small last frame
in case the encoder has the AV_CODEC_CAP_SMALL_LAST_FRAME set.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The APTX (HD) decoder decodes blocks of four (six) bytes to four
output samples. It makes no sense to handle incomplete blocks:
They would just lead to synchronization errors, in which case
the complete frame is discarded. So only handle complete blocks.
This also avoids reading from the packet's padding and writing
into the frame's padding.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Just because we try to put multiple units of block_align bytes
(the atomic units for APTX and APTX HD) into one packet
does not mean that packets with fewer units than the
one we wanted are corrupt; only those packets that are not
a multiple of block_align are.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This field was misunderstood: It gives the number of samples
in a packet, not the number of bytes. Its usage was wrong for APTX HD.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Currently the APTX (HD) codecs set frame_size if unset and check
whether it is divisible by block_size (corresponding to block_align
as used by other codecs). But this is based upon a misunderstanding
of the API: frame_size is not in bytes, but in samples.
Said value is also not intended to be set by the user at all,
but set by encoders and (possibly) decoders if the number of channels
in a frame is constant. The latter condition is not fulfilled here,
so only set it for encoders. Given that the encoder can handle any
number of samples as long as it is divisible by four and given that
it worked to set a custom frame size before, the encoders accept
any multiple of four; otherwise the value is set to the value
that it already had for APTX: 1024 samples (per channel).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
APTX decodes four bytes of input to four stereo samples; APTX HD
does the same with six bytes of input. So it can be easily supported
in av_get_audio_frame_duration().
This fixes invalid durations and (derived) timestamps of demuxed
APTX HD packets and therefore fixed the timestamp in the aptx-hd
FATE test.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
We have de- and encoders for APTX and APTX HD, yet not FATE tests.
This commit therefore adds a transcoding test to utilize them.
Furthermore, during creating these tests it turned out that
the duration is set incorrectly for APTX HD. This will be fixed
in a future commit.
(Thanks to Andriy Gelman for finding an issue in an earlier version
that used a 192kHz input sample which does not work reliably accross
platforms.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This block was scheduled for removal, which means that no validation would have
taken place after the old API was removed.
It was algo going to mistakenly remove an unrelated bits_per_coded_sample
check.
Signed-off-by: James Almer <jamrial@gmail.com>
The opaque parameter for the callback is set in videotoolbox_start(),
called when the hwaccel is initialized. When frame threading is used,
avctx will be the context corresponding to the frame thread currently
doing the decoding. Using this same codec context in all subsequent
invocations of the decoder callback (even those triggered by a different
frame thread) is unsafe, and broken after
cc867f2c09, since each frame thread now
cleans up its hwaccel state after decoding each frame.
Fix this by passing hwaccel_priv_data as the opaque parameter, which
exists in a single instance forwarded between all frame threads.
The only other use of AVCodecContext in the decoder output callback is
as a logging context. For this purpose, store a logging context in
hwaccel_priv_data.
This is no longer used since 4608996772.
It also has no implementations other than the plain C one.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Since introducing the various packed formats used by VAAPI (and p012),
we've noticed that there's actually a gap in how
av_find_best_pix_fmt_of_2 works. It doesn't actually assign any value
to having the same bit depth as the source format, when comparing
against formats with a higher bit depth. This usually doesn't matter,
because av_get_padded_bits_per_pixel() will account for it.
However, as many of these formats use padding internally, we find that
av_get_padded_bits_per_pixel() actually returns the same value for the
10 bit, 12 bit, 16 bit flavours, etc. In these tied situations, we end
up just picking the first of the two provided formats, even if the
second one should be preferred because it matches the actual bit depth.
This bug already existed if you tried to compare yuv420p10 against p016
and p010, for example, but it simply hadn't come up before so we never
noticed.
But now, we actually got a situation in the VAAPI VP9 decoder where it
offers both p010 and p012 because Profile 3 could be either depth and
ends up picking p012 for 10 bit content due to the ordering of the
testing.
In addition, in the process of testing the fix, I realised we have the
same gap when it comes to chroma subsampling - we do not favour a
format that has exactly the same subsampling vs one with less
subsampling when all else is equal.
To fix this, I'm introducing a small score penalty if the bit depth or
subsampling doesn't exactly match the source format. This will break
the tie in favour of the format with the exact match, but not offset
any of the other scoring penalties we already have.
I have added a set of tests around these formats which will fail
without this fix.
Update description for ssim and ms_ssim libvmaf options to specify
feature=float_ssim and feature=float_ms_ssim which are used to request
ssim and ms_ssim values in the latest versions of libvmaf.
Signed-off-by: Yondon Fu <yondon.fu@gmail.com>
Fixes: signed integer overflow: -2147448926 + -198321 cannot be represented in type 'int'
Fixes: 48798/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_APE_fuzzer-5739619273015296
Fixes: 48798/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_APE_fuzzer-6744428485672960
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 127 + 2147483536 cannot be represented in type 'int'
Fixes: 48798/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MOBICLIP_fuzzer-6014034970804224
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 17121181824 * 538976288 cannot be represented in type 'long long'
Fixes: 48798/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_EXR_fuzzer-5915330316206080
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes regression with tickets/4364/L1004220.DNG
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The patch fixes the bugs that occurred when running
fate-checkasm-hevc_pel on MIPS paltform.
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This patch fixes a bug where the fate-checkasm-motion fails when
h is not a multiple of 8.
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The code to initialize it takes more space (in .text) than
the table to be initialized (namely 86B vs 64B for GCC 11.2
with -O3 in an av_cold function), so hardcode the table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Most of the VLCs used here have a max_depth of two;
some have a max_depth of one. Therefore one can just use two
and avoid the runtime check for whether one should
perform another round of LUT lookup in case the first read
did not read a complete codeword.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, initializing the dca VLC tables uses ff_init_vlc_sparse()
with length tables of type uint8_t and code tables of type uint16_t
(except for the LBR tables, which uses length and symbols of type
uint8_t; these tables are interleaved). In case of the quant index
codebooks these arrays were accessed via tables of pointers to the
individual tables.
This commit changes this: First, we switch to ff_init_vlc_from_lengths()
to replace the uint16_t code tables by uint8_t symbol tables
(this necessitates ordering the tables from left-to-right in the tree
first). These symbol tables are interleaved with the length tables.
Furthermore, these tables are combined in order to remove the table of
pointers to individual tables, thereby avoiding relocations (for x64
elf systems this amounts to 96*24B = 2304B saved in .rela.dyn) and
saving 1280B from .data.rel.ro (for 64bit systems). Meanwhile the
savings in .rodata amount to 2709 + 2 * 334 = 3377B. Due to padding
the actual savings are higher: The ELF x64 ABI requires objects >= 16B
to be padded to 16B and lots of the tables have 2^n + 1 elements
of these were from replacing uint16_t codes with uint8_t symbols;
the rest was due to the fact that combining the tables eliminated
padding (the ELF x64 ABI requires objects >= 16B to be padded to 16B
and lots of the tables have 2^n + 1 elements)). Taking this into
account gives savings of 4548B. (GCC by default uses an even higher
alignment (controlled by -malign-data); for it the savings are 5748B.)
These changes also necessitated to modify the init code for
the encoder tables.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, the encoder used the same tables that the decoder
uses to create its VLCs. These have the downside of requiring
the encoder to offset the tables at runtime as well as having
to read from separate tables for the length as well as the code
of the symbol to encode. The former are uint8_t, the latter uint16_t,
so using a joint table would require padding, but this doesn't
matter when these tables are generated at runtime, because they
live in the .bss segment.
Also move these init functions as well as the functions that
actually use them to dcaenc.c, because they are encoder-specific.
This also allows to remove an inclusion of PutBitContext from
dcahuff.h (and indirectly from all dca-decoder files).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It increases the size of one VLC from two to three bits, thereby
requiring four more VLCEntries (16 bytes .bss), but it allows to
inline the number of bits used when reading them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The ff_dca_vlc_transition_mode VLCs don't use an offset at all,
so just use ordinary VLCs for them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
GCC 12 apparently believes that negative palette sizes are
possible (they are not, as this has already been checked during
init) and therefore emits a -Wstringop-overflow= for the memcpy.
Using unsigned avoids this.
(To be honest, there might be a compiler bug involved.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This might be useful in case this decoder were changed to support
new extradata passed via side-data.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This code is only called once during init, so none of the buffers
here have been allocated already.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
183132872a made the iff demuxer
output extradata and made the decoder parse said extradata.
To make this extradata extensible, it came with its own internal
length field (containing the offset of the palette at the end
of the extradata). Furthermore, in order to support mid-stream
extradata changes, the packets returned by the demuxer also have
such a length field (containing the offset of the actual packet
data). Therefore the packet parsing the extradata accepted its
input from both AVPackets as well as from ordinary extradata.
Yet the demuxer never made use of this "feature": The packet's
length field always indicated that the packet data starts
immediately after the length field.
Later, commit cb928fc448 stopped
appending the length field to the packets' data; of course,
it also stopped searching for extradata in this data.
Instead it added code to parse the packet's header to the function
that parses extradata. This made this function consist of two disjoint
parts, one of which is only reachable if this function is called
from init (when parsing extradata) and one of which is reachable
when parsing packet headers.
Therefore this commit splits this function into two.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise the buffer might be too small. Fixes assert violations
when encoding mono audio with exactly one sample.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For example, if the jpeg contains exif information
and the rotation direction is included in the exif,
the displaymatrix will be set on the side_data of the frame when decoding.
However, when ffplay is used to play the image,
only the side data in the stream will be determined.
It does not check whether the frame also contains rotation information,
causing it to play in the wrong direction
Reviewed-by: Zhao Zhili <zhilizhao@tencent.com>
Signed-off-by: Wang Yaqiang <wangyaqiang03@kuaishou.com>
It is currently calling av_channel_layout_describe()
unnecessarily.
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If a developer using FFmpeg libraries seeks into an earlier position and calls
avcodec_flush_buffers() afterwards as recommended, the Vorbis decoder will drop
the next frame, since buffer flushing clears the first_frame flag. As a result,
the audio samples the calling code receives may be ahead of the requested seek
position, which is unacceptable in some use cases such as playing a looping
sound effect.
This commit records the presentation timestamp of the first frame and
determines after that if the new frame is the first frame (possible after
seeking to the start) by comparing its pts to the stored pts.
When appending two values (due to AV_DICT_APPEND), the earlier code
would first zero-allocate a buffer of the required size and then
copy both parts into it via av_strlcat(). This is problematic,
as it leads to quadratic performance in case of frequent enlargements.
Fix this by using av_realloc() (which is hopefully designed to handle
such cases in a better way than simply throwing the buffer we already
have away) and by copying the string via memcpy() (after all, we already
calculated the strlen of both strings).
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If a key already exists in an AVDictionary and the AV_DICT_APPEND flag
is set, the old entry is at first discarded from the dictionary, but
a pointer to the value is kept. Lateron enough memory to store the
appended string is allocated; should this allocation fail, the old string
is not freed and hence leaks. This commit changes this by moving
creating the combined value to an earlier point in the function,
which also ensures that the AVDictionary is unchanged in case of errors.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
We know that an AVDictionary is not empty if we have just added
an entry to it, so only check for it being empty on the branch
that does not do so.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
av_strlcpy() returns the length of the src string to enable
the caller to check for truncation. It is currently used in
the following way in dump_metadata(): Every metadata value
is searched for \b, \n, \v, \f, \r and then the data up to
the first of these characters found is copied to a small
temporary buffer via av_strlcpy() (but of course not more
than fits into said buffer) and then printed; all characters up
to the character found earlier are then treated as consumed.
But this is bad performance-wise if the while string is big
and contains many of these characters, because av_strlcpy()
will unnecessarily calculate the length of the whole remaining string.
(dump_metadata() actually ignored the return value of av_strlcpy().)
Fix this by not copying the data to a temporary buffer at all.
Instead just use %.*s to bound the number of characters output.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It may be NULL, as is the case for D3D11VA_VLD.
Running "ffmpeg -h decoder=h264" on a Windows build
Before:
Decoder h264 [H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10]:
Supported hardware devices: dxva2 (null) d3d11va cuda
After:
Decoder h264 [H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10]:
Supported hardware devices: dxva2 d3d11va cuda
Signed-off-by: James Almer <jamrial@gmail.com>
If it's unsupported or invalid, then there's no point trying to rebuild it
using a value that may have been derived from the same layout to begin with.
Move the checks before the attempts at copying the layout while at it.
Fixes ticket #9908.
Signed-off-by: James Almer <jamrial@gmail.com>
This reverts commit 2c8dc7e953.
The loongarch headers have been fixed, so that this wrapper
is no longer necessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This reverts commit 6c9a60ada4.
The loongarch headers have been fixed, so that this workaround
is no longer necessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If the target supports the Basic bit-manipulation (Zbb) extension, then
the REV8 instruction is available to reverse byte order.
Note that this instruction only exists at the "XLEN" register size,
so we need to right shift the result down to the data width.
If Zbb is not supported, then this patchset does nothing. Support for
run-time detection is left for the future. Currently, there are no
bits in auxv/ELF HWCAP for Z-extensions, so there are no clean ways to
do this.
RISC-V defines the CLZ instruction as part of the ratified Zbb subset
of the (not yet ratified) bit mapulation extension (B). We can detect
it from the __riscv_zbb predefined constant. At least GCC 12 already
supports this correctly.
Note that the macro will be non-zero if supported, zero if enabled
in the compiler flags (e.g. -march=rv64gzbb) but not known to the
compiler, and undefined otherwise.
This uses the architected RISC-V 64-bit cycle counter from the
RISC-V unprivileged instruction set.
In 64-bit and 128-bit, this is a straightforward CSR read.
In 32-bit mode, the 64-bit value is exposed as two CSRs, which
cannot be read atomically, so a loop is necessary to detect and fix up
the race condition where the bottom half wraps exactly between the two
reads.
These tests test both the demuxer as well as the muxer
wherever possible. It is not always possible due to the fact
that the muxer supports more codecs than the demuxer.
The spdif demuxer does currently not set the need_parsing flag.
If one were to set this to AVSTREAM_PARSE_FULL, the test results
would change as follows:
- For spdif-aac-remux, the packets are currently padded to 16bits,
i.e. if the actual packet size is odd, there is a padding byte.
The parser splits this byte away into a one byte packet of its own.
Insanely, these one byte packets get the same duration as normal
packets, i.e. timing is ruined.
- The DCA-remux tests get proper duration/timestamps.
- In the spdif-mp2-remux test the demuxer marks the stream as
being MP2; the parser sets it to MP3 and this triggers
the "Codec change in IEC 61937" codepath; this test therefore
returns only two packets with the parser.
- For spdif-mp3-remux some bytes end up in different packets:
Some input packets of this file have an odd length (417B instead
of 418B like all the other packets) and are padded to 418B.
Without a parser, all returned packets from the spdif-demuxer
are 418B. With a parser, the packets that were originally 417B
are 417B again, but the padding byte has not been discarded,
but added to the next packet which is now 419B.
This fixes "Multiple frames in a packet" warning and avoids
an "Invalid data found when processing input" error when decoding.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When compiling FFmpeg with GCC-9, some very random segfaults were
observed in code which had previously called down into the SBC encoder
NEON assembly routines. This was caused by these functions clobbering
some of the vfp callee saved registers (d8 - d15 aka q4 - q7). GCC was
using these registers to save local variables, but after these
functions returned, they would contain garbage.
Fix by reallocating the registers in the two affected functions in
the following way:
ff_sbc_analyze_4_neon: q2-q5 => q8-q11, then q1-q4 => q8-q11
ff_sbc_analyze_8_neon: q2-q9 => q8-q15
The reason for using these replacements is to keep closely related
sets of registers consecutively numbered which hopefully makes the
code more easy to follow. Since this commit only reallocates
registers, it should have no performance impact.
Signed-off-by: James Cowgill <jcowgill@debian.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
It is not needed on x64, because the AV_COPY* and AV_ZERO*
macros never use MMX on x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Provide optimized implementation for vsse_intra16 for arm64.
Performance tests are shown below.
- vsse_4_c: 155.2
- vsse_4_neon: 36.2
Benchmarks and tests are run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation for vsad_intra16 function for arm64.
Performance comparison tests are shown below.
- vsad_4_c: 177.5
- vsad_4_neon: 23.5
Benchmarks and tests are run with checkasm tool on AWS Gravtion 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation of vsse16 for arm64.
Performance comparison tests are shown below.
- vsse_0_c: 257.7
- vsse_0_neon: 59.2
Benchmarks and tests are run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation of vsad16 function for arm64.
Performance comparison tests are shown below.
- vsad_0_c: 285.2
- vsad_0_neon: 39.5
Benchmarks and tests are run with checkasm tool on AWS Graviton 3.
Co-authored-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Add "slice" intra refresh type to h264_qsv and hevc_qsv. This type means
horizontal refresh by slices without overlapping. Also update the doc.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
This duration is equal to the longest duration in all track's tkhd atoms, which
may be comprised of the sum of all edit lists in each track. Empty edit lists
in tracks represent start_time, and the actual media duration is stored in the
mdhd atom.
This change lets the generic demux code derive the longest track duration taken
from mdhd atoms, so the correct duration and start_time combination will be
reported.
Should fix ticket #9775.
Reviewed-by: zhilizhao(赵志立) <quinkblack@foxmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
These macros are definitions, not only declarations and therefore
should not contain a semicolon. Such a semicolon is actually
spec-incompliant, but compilers happen to accept them.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The latest commit of Loongson MMI macro replaces were incorrect.
It makes a mass of green tints on HEVC videos when playing. I've
compared it with the older MMI implementation, and found out that
several lines have been replaced by wrong macros.
Signed-off-by: Qi Tiezheng <qitiezheng@360.cn>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
AV_PIX_FMT_VUYX is used in FFmpeg and MFX_FOURCC_AYUV is used in the SDK
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
AV_PIX_FMT_VUYX is used for 8bit 4:4:4 content in FFmpeg VAAPI, so
AV_PIX_FMT_VUYX should be used for 8bit 4:4:4 content in FFmpeg QSV too
because QSV is based on VAAPI on Linux. However the SDK only declares
support for AYUV and does nothing with the alpha, so this commit fudged
a mapping between AV_PIX_FMT_VUYX and MFX_FOURCC_AYUV.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Currently AVBR is disabled and VBR is the default method if maxrate is
not specified on Linux, but AVBR is the default one if maxrate is not
specified on Windows. In order to make user experience better accross
Linux and Windows, use VBR by default on Windows if maxrate is not
specified. User need to set both avbr_accuracy and avbr_convergence to
non-zero explicitly and not to specify maxrate if AVBR is expected.
In addition, AVBR works for H264 and HEVC only in the SDK.
$ ffmpeg.exe -v verbose -f lavfi -i yuvtestsrc -vf "format=nv12" -c:v
vp9_qsv -f null -
The FFV1 decoder only uses the last frame's data to conceal
errors. The encoder does not have this problem and therefore
only uses the current frame and none of the ThreadFrames.
So only allocate them for the decoder.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
By using a symbol table one can already bake in applying
a LUT on the return value of get_vlc2(). So change the
symbol table for the vec2 and vec4 tables to avoid
using the symbol_to_vec2/4 LUTs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It allows to replace tables of big codes (uint16_t and uint32_t)
by tables of smaller symbols (mostly uint8_t).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These codes are already ordered from left-to-right in the tree,
so one can just use ff_init_vlc_static_from_lengths().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Instead reuse the destination RL VLC as scratch space.
This is possible, because the (implicit) codes here are already
ordered from left-to-right in the tree and because the codelengths
are increasing, which implies that mapping from VLC entries to the
corresponding entries used to initialize the VLC is monotonically
increasing. This means that one can reuse the right end of the
destination RL VLC to store the tables used to initialize the VLC
with.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the codes are already ordered
from left to right in the tree. It avoids having to create
the codes ourselves and will enable the codes table
to be removed altogether once the encoder stops using it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Right now, it is nearly ordered by "left codes in the tree first";
the only exception is the escape value which has been put at the
end. This commit moves it to the place it should have according
to the above order. This is in preparation for further commits.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This matches a similar cap on the number of automatic threads
in libavcodec/pthread_slice.c.
On systems with lots of cores, this fixes a couple fate failures
in 32 bit mode on such machines (where spawning a huge number of
threads runs out of address space).
Signed-off-by: Martin Storsjö <martin@martin.st>
Instead of the potentially adjusted ones. Otherwise, if config_props() is
called again and if using force_original_aspect_ratio, the already adjusted
values could be altered again.
Example command line
scale=size=1920x1000:force_original_aspect_ratio=decrease:force_divisible_by=2
user value 1920x1000 -> 1920x798 on init_dict() -> 1918x798 on frame
change when eval_mode == EVAL_MODE_INIT, which after e645a1ddb9 could be at the
very first frame.
Signed-off-by: James Almer <jamrial@gmail.com>
This state is not refcounted, so make sure it always has a well-defined
owner.
Remove the block added in 091341f2ab, as
this commit also solves that issue in a more general way.
Mention:
- that it is legacy and optional (every hwaccel that uses it can also
work with hwcontext, though some optional information can only be
signalled throught hwaccel_context)
- that it can be used for encoders (only qsvenc currently)
- ownership and lifetime
This commit implements an iMDCT in pure assembly.
This is capable of processing any mod-8 transforms, rather than just
power of two, but since power of two is all we have assembly for
currently, that's what's supported.
It would really benefit if we could somehow use the C code to decide
which function to jump into, but exposing function labels from assebly
into C is anything but easy.
The post-transform loop could probably be improved.
This was somewhat annoying to write, as we must support arbitrary
strides during runtime. There's a fast branch for stride == 4 bytes
and a slower one which uses vgatherdps.
Zen 3 benchmarks for stride == 4 for old (av_imdct_half) vs new (av_tx):
128pt:
2811 decicycles in av_tx (imdct),16775916 runs, 1300 skips
3082 decicycles in av_imdct_half,16776751 runs, 465 skips
256pt:
4920 decicycles in av_tx (imdct),16775820 runs, 1396 skips
5378 decicycles in av_imdct_half,16776411 runs, 805 skips
512pt:
9668 decicycles in av_tx (imdct),16775774 runs, 1442 skips
10626 decicycles in av_imdct_half,16775647 runs, 1569 skips
1024pt:
19812 decicycles in av_tx (imdct),16777144 runs, 72 skips
23036 decicycles in av_imdct_half,16777167 runs, 49 skips
Inside a function, the second ';' in ";;" is just a null statement,
but it is actually illegal outside of functions. Compilers
nevertheless accept it without warning, except when in -pedantic
mode when e.g. Clang emits a -Wextra-semi warning. Therefore
remove the unnecessary ';'.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The field is not specific to Opus.
The mp2fixed encoder signals initial_padding and is used
by both the matroska-encoding-delay test as well as
the lavf-mkv tests which necessitated several FATE ref changes.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Matroska requires pts to be >= 0 with a slight exception:
It has a mechanism to deal with codec delay, i.e. with
the data added at the beginning that does not correspond
to actual input data and should be discarded by the player.
Only the audio actually intended to be output needs to have
a timestamp >= 0.
In order to avoid unnecessary timestamp shifting, this patch
allows muxers to inform the shifting code about this so that
it can take it into account.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Matroska generally requires timestamps to be nonnegative, but
there is an exception: Data that corresponds to encoder delay
and is not supposed to be output anyway can have a negative
timestamp. This is achieved by using the CodecDelay header
field: The demuxer has to subtract this value from the raw
(nonnegative) timestamps of the corresponding track.
Therefore the muxer has to add this value first to write
this raw timestamp.
Support for writing CodecDelay has been added in FFmpeg commit
d92b1b1bab and in Libav commit
a1aa37dd0b. The former simply
wrote the header field and did not apply any timestamp offsets,
leading to desynchronisation (if one uses multiple tracks).
The latter applied it at two places, but not at the one where
it actually matters, namely in mkv_write_block(), leading to
the same desynchronisation as with the former commit. It furthermore
used the wrong stream timebase to convert the delay to the
stream's timebase, as the conversion used the timebase from
before avpriv_set_pts_info().
When the latter was merged in 82e4f39883,
it was only done in a deactivated state that still did not
offset the timestamps when muxing due to "assertion failures
and av sync errors". a1aa37dd0b
made it definitely more likely to run into assertion failures
(namely if the relative block timestamp doesn't fit into an int16_t).
Yet all of the above issues have been fixed (in commits
962d631573,
5d3953a5dc and
4ebeab15b0. This commit therefore
enables applying CodecDelay, fixing ticket #7182.
There is just one slight regression from this: If one has input
with encoder delay where the first timestamp is negative, but
the pts of the part of the data that is actually intended to be
output is nonnegative, then the timestamps will currently by default
be shifted to make them nonnegative before they reach the muxer;
the muxer will then ensure that the shifted timestamps are retained.
Before this commit, the muxer did not ensure this; instead the
timestamps that the demuxer will output were shifted and
if the first timestamp of the actually intended output was zero
before shifting, then this unintentional shift just cancels
the shift performed before the packet reached the muxer.
(But notice that this only applies if all the tracks use the same
CodecDelay, or the relative sync between tracks will be impaired.)
This happens in the matroska-opus-remux and matroska-ogg-opus-remux
FATE tests. Future commits will forward the information that
the Matroska muxer has a limited capability to handle negative
timestamps so that the shifting in libavformat can take advantage
of it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Opus can be decoded to multiple samplerates (namely 48kHz, 24KHz,
16Khz, 12 KHz and 8Khz); libopus as well as our encoder wrapper
support these sample rates. The OpusHead contains a field for
this original samplerate. Yet the pre-skip (and the granule-position
in the Ogg-Opus mapping in general) are always in the 48KHz clock,
irrespective of the original sample rate.
Before commit c3c22bee63, our libopus
encoder was buggy: It did not account for the fact that the pre-skip
field is always according to a 48kHz clock and wrote a too small
value in case one uses the encoder with a sample rate other than 48kHz;
this discrepancy between CodecDelay and OpusHead led to Firefox
rejecting such streams.
In order to account for that, said commit made the muxer always use
48kHz instead of the actual sample rate to convert the initial_padding
(in samples in the stream's sample rate) to ns. This meant that both
fields are now off by the same factor, so Firefox was happy.
Then commit f4bdeddc3c fixed the issue
in libopusenc; so the OpusHead is correct, but the CodecDelay is
still off*. This commit fixes this by effectively reverting
c3c22bee63.
*: Firefox seems to no longer abort when CodecDelay and OpusHead
are off.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is possible for the trailing padding to be zero, namely
e.g. if the AV_PKT_DATA_SKIP_SAMPLES side data is used
for leading padding. Matroska supports this (use a negative
DiscardPadding), but players do not; at least Firefox refuses
to play such a file. So for now only write DiscardPadding
if it is trailing padding and nonzero.
The fate-matroska-ogg-opus-remux was affected by this.
(I wish CodecDelay would not exist and DiscardPadding would
be used to instead trim the codec delay away (with the Block
timestamp corresponding to the time at which the actually
output audio is output).)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Creating CFHD RL VLC tables works by first extending
the codes by the sign, followed by creating a VLC,
followed by deriving the RL VLC from this VLC (which
is then discarded). Extending the codes uses stack arrays.
The tables used to initialize the VLC are already sorted
from left-to-right in the tree. This means that the
corresponding VLC entries are generally also ascending,
but not always: Entries from subtables always follow
the corresponding main table although it is possible
for the right-most node to fit into the main table.
This suggests that one can try to use the final destination
buffer as scratch buffer for the tables with sign included.
Unfortunately it works for neither of the tables if one
uses the right-most part of the RL VLC buffer as scratch buffer;
using the left-most part of the RL VLC buffer as scratch buffer
might work if one traverses the VLC entries from end to start.
But it works only for the little RL VLC (table 9), not for table 18.
Therefore this patch uses the RL VLC buffer for table 9
as scratch buffer for creating the bigger table 18.
Afterwards the left part of the buffer for table 9 is
used as scratch buffer to create table 9.
This fixes the cfhd part of ticket #9399 (if it is not already fixed).
Notice that I do not consider the previous stack usage excessive.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
cfhddata.c initializes a RL VLC table via code tables and
corresponding tables for length, run and level. code and length
tables are used to initialize a VLC, no symbol table is used.
Afterwards the symbols of said VLC are just the indices of
the corresponding entries in the code and length table that
were used for initialization; they can therefore be used
to get the matching level and run entry and they are not used
for anything else. Therefore one can just permute these tables
without changing the resulting RL VLC tables.
This commit does just this. It permutes these tables so that
the code tables are ordered from left to right in the resulting
tree and then switches to ff_init_vlc_from_lengths(), which
allows to remove the codes table altogether.
Given that these tables are constructed on the stack, this
also reduces stack usage, potentially fixing part of #9399.
(The size of the tables on the stack decreases from 4752 to
2640.)
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
cfhd.c checked for level being equal to a certain codebook-
dependent constant and to run being two. The first check is
actually redundant, as all codebooks contain only one (real)
entry with run == 2 (as is usual with VLCs, this one real entry
has several corresponding entries in the table). But given
that no entry has a run of zero (except incomplete entries
which just signal that one needs to do another round of parsing),
one can actually use that as sentinel. This patch does so.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Demuxers are not allowed to do this and few callers, if any, will handle
this correctly. Send the AV_SIDE_DATA_PARAM_CHANGE_SAMPLE_RATE side data
instead.
Demuxers are not supposed to update AVCodecParameters after the stream
was seen by the caller. This value is not important enough to support
dynamic updates for.
The mov demuxer only returns DV audio, video packets are discarded.
It first reads the data to be parsed into a packet. Then both this
packet and the pointer to its data are passed together to
avpriv_dv_produce_packet(), which parses the data and partially
overwrites the packet. This is confusing and potentially dangerous, so
just pass NULL and avoid pointless packet modification.
The struct is quite small and the decoder and the encoder use different
fields from it, so benefits from reusing it are small.
This allows making the buf field const.
The function contains only two assignments, setting DVVideoContext.avctx
and AVCodecContext.chroma_sample_location. However, the decoder does not
use the former, and the encoder should not be setting the latter.
Therefore move the first assignment to dvenc and the second to dvdec.
Make the encoder warn if the user-signalled chroma sample location does
not match the supported one, and return an error on higher compliance
levels.
Initialized to 1:1, but if the script sets these properties, it
will be set to those instead (0:0 disables it, apparently).
Signed-off-by: Stephen Hutchinson <qyot27@gmail.com>
If we want to be able to map between VAAPI and Vulkan (to do Vulkan
filtering), we need to have matching formats on each side.
The mappings here are not exact. In the same way that P010 is still
mapped to full 16 bit formats, P012 has to be mapped that way as well.
Similarly, VUYX has to be mapped to an alpha-equipped format, and XV36
has to be mapped to a fully 16bit alpha-equipped format.
While Vulkan seems to fundamentally lack formats with an undefined,
but physically present, alpha channel, it has have 10X6 and 12X4
formats that you could imagine using for P010, P012 and XV36, but these
formats don't support the STORAGE usage flag. Today, hwcontext_vulkan
requires all formats to be storable because it wants to be able to use
them to create writable images. Until that changes, which might happen,
we have to restrict the set of formats we use.
Finally, when mapping a Vulkan image back to vaapi, I observed that
the VK_FORMAT_R16G16B16A16_UNORM format we have to use for XV36 going
to Vulkan is mapped to Y416 when going to vaapi (which makes sense as
it's the exact matching format) so I had to add an entry for it even
though we don't use it directly.
With the necessary pixel formats defined, we can now expose support for
the remaining 10/12bit combinations that VAAPI can handle.
Specifically, we are adding support for:
* HEVC
** 12bit 420
** 10bit 422
** 12bit 422
** 10bit 444
** 12bit 444
* VP9
** 10bit 444
** 12bit 444
These obviously require actual hardware support to be usable, but where
that exists, it is now enabled.
Note that unlike YUVA/YUVX, the Intel driver does not formally expose
support for the alphaless formats XV30 and XV360, and so we are
implicitly discarding the alpha from the decoder and passing undefined
values for the alpha to the encoder. If a future encoder iteration was
to actually do something with the alpha bits, we would need to use a
formal alpha capable format or the encoder would need to explicitly
accept the alphaless format.
These are the formats we want/need to use when dealing with the Intel
VAAPI decoder for 12bit 4:2:0, 12bit 4:2:2, 10bit 4:4:4 and 12bit 4:4:4
respectively.
As with the already supported Y210 and YUVX (XVUY) formats, they are
based on formats Microsoft picked as their preferred 4:2:2 and 4:4:4
video formats, and Intel ran with it.
P12 and Y212 are simply an extension of 10 bit formats to say 12 bits
will be used, with 4 unused bits instead of 6.
XV30, and XV36, as exotic as they sound, are variants of Y410 and Y412
where the alpha channel is left formally undefined. We prefer these
over the alpha versions because the hardware cannot actually do
anything with the alpha channel and respecting it is just overhead.
Y412/XV46 is a normal looking packed 4 channel format where each
channel is 16bits wide but only the 12msb are used (like P012).
Y410/XV30 packs three 10bit channels in 32bits with 2bits of alpha,
like A/X2RGB10 style formats. This annoying layout forced me to define
the BE version as a bitstream format. It seems like our pixdesc
infrastructure can handle the LE version being byte-defined, but not
when it's reversed. If there's a better way to handle this, please
let me know. Our existing X2 formats all have the 2 bits at the MSB
end, but this format places them at the LSB end and that seems to be
the root of the problem.
There are no particular reasons to force the compiler to use the same
register as output and input operand. This forces an extra MOV
instruction if the input value needs to be reused after the swap.
In most cases, this makes no differences, as the compiler will seleect
the same register for both operands either way.
Signed-off-by: Martin Storsjö <martin@martin.st>
There are no particular reasons to force the compiler to use the same
register as output and input operand. This forces an extra MOV
instruction if the input value needs to be reused after the swap.
In most cases, this makes no differences, as the compiler will seleect
the same register for both operands either way.
Signed-off-by: Martin Storsjö <martin@martin.st>
It is advantageous for ff_crop_tab, as the base pointer used to
access this table is not the first element of it. But the real
base pointer is still at a constant offset from the code/the GOT
and can therefore be accessed relative to the instruction pointer
(if supported by the arch) or relative to the GOT; without this,
one has to first load address of ff_crop_tab (potentially via
the GOT) and then offset manually (which is what the earlier code
did).
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It reduces typing: Before this patch, there were 11 callbacks
that exceeded the 80 char line length limit; now there are zero.
It also allows to remove ONLY_IF_THREADS_ENABLED() in
libavutil/internal.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It reduces typing: Before this patch, there were 105 codecs
whose long_name-definition exceeded the 80 char line length
limit. Now there are only nine of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It has been deprecated in b4f59beeb4,
but the attribute_deprecated was not set and there was no entry
in APIchanges. This commit adds these and schedules it for removal.
Given that the reason behind the deprecation is exactly the same
as in av_fopen_utf8(), reuse its FF_API_AV_FOPEN_UTF8.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are unused since d63443b968.
Furthermore, they were always in the wrong header:
libavutil/internal.h is auto-included almost everywhere, but
FF_SYMVER would only ever be used at a few places, so a proper
header of its own would be appropriate for it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Currently, the frame-threaded decoding API still supports thread-unsafe
callbacks. If one uses a thread-unsafe get_buffer2() callback,
calls to av_frame_unref() by the decoder are serialized, because
it is presumed that the underlying deallocator is thread-unsafe.
The frame-threaded encoder seems to have been written with this
restriction in mind: It always serializes unreferencing its AVFrames,
although no documentation forces it to do so.
This commit schedules to change this behaviour as soon as thread-unsafe
callbacks are removed. For this reason, the FF_API_THREAD_SAFE_CALLBACKS
define is reused.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The AArch64 assembly accesses those symbols directly, without
indirection via e.g. the GOT on ELF. In order for this not to
require text relocations, those symbols need to be resolved fully
at link time, i.e. those symbols can't be interposable.
Normally, so far, this is achieved when linking shared libraries
in two ways; we have a version script (libavcodec/libavcodec.v) which
marks all symbols that don't start with av* as local. Additionally,
we try to add -Wl,-Bsymbolic to the linker options if supported,
making sure that such symbol references are resolved fully at link
time, instead of making them interposable.
When the libavcodec static library is linked into another shared
library, there's no guarantee that it uses similar options (even though
that would be favourable), which would end up requiring text relocations
in the AArch64 assembly.
Explicitly mark the symbols that are accessed from AArch64 assembly
as hidden, so that they are resolved fully at link time even without
the version script and -Wl,-Bsymbolic.
Signed-off-by: Martin Storsjö <martin@martin.st>
It makes no sense here, as flac_parse_block_header()
is not even supposed to advance the caller's pointer.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
(The FLAC parser currently ignores the streaminfo block;
therefore some of this is decoder-only. Given that the FLAC
parser should probably use the streaminfo block, this stuff
is moved to flac_parse.h.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This field is not really used by the decoder at all:
It is only output in some debug log message, but this debug
log message should better use the value read from the streaminfo
instead of a second-guessed value from the decoder.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The HEVC code currently uses an array of arrays of NALUs; one such array
contains all the SPS NALUs, one all PPS NALUs etc. The array of arrays
is grown dynamically via av_reallocp_array(), but given that the latter
function automatically frees its buffer upon reallocation error,
it may only be used with PODs, which this case is not. Even worse:
While the pointer to the arrays is reset, the counter for the number
of arrays is not, leading to a segfault in hvcc_close().
Fix this by avoiding the allocations of the array of arrays altogether.
This is easily possible because their number is bounded (by five).
Furthermore, as a byproduct we can ensure that the code always
produces the recommended ordering of VPS-SPS-PPS-SEI (which was
not guaranteed before).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is designed to improve and unify error handling for
allocation failures for the many (often small) allocations that we have
in the fftools. These typically either don't return an error message
or an error message that is not really helpful to the user
and can be replaced by a generic error message without loss of
information.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The threshold of 5 is arbitrary, both smaller and larger should work fine
Fixes: Stack overflow
Fixes: 50603/clusterfuzz-testcase-minimized-ffmpeg_dem_ASF_O_fuzzer-6049302564175872
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
update_video_stats() currently uses OutputStream.data_size to print the
total size of the encoded stream so far and the average bitrate.
However, that field is updated in the muxer thread, right before the
packet is sent to the muxer. Not only is this racy, but the numbers may
not match even if muxing was in the main thread due to bitstream
filters, filesize limiting, etc.
Introduce a new counter, data_size_enc, for total size of the packets
received from the encoder and use that in update_video_stats(). Rename
data_size to data_size_mux to indicate its semantics more clearly.
No synchronization is needed for data_size_mux, because it is only read
in the main thread in print_final_stats(), which runs after the muxer
threads are terminated.
It is either equal to OutputStream.enc_ctx->codec, or NULL when enc_ctx
is NULL. Replace the use of enc with enc_ctx->codec, or the equivalent
enc_ctx->codec_* fields where more convenient.
ost->enc is always non-NULL here, since
- this code is never called for streamcopy
- opening the output file will fail if an encoder cannot be found, so
filters are never initialized
This code cannot be triggered, since after 90944ee3ab opening the
output file will abort if an encoder cannot be found and streamcopy was
not explicitly requested.
It races with the demuxing thread. Instead, send the information along
with the demuxed packets.
Ideally, the code should stop using the stream-internal parsing
completely, but that requires considerably more effort.
Fixes races, e.g. in:
- fate-h264-brokensps-2580
- fate-h264-extradata-reload
- fate-iv8-demux
- fate-m4v-cfr
- fate-m4v
Adds an option to use constant bitrate instead of average bitrate to the
videotoolbox encoders. This is enabled via -constant_bit_rate true.
macOS 13 is required for this option to work.
Signed-off-by: Sebastian Beckmann <beckmann.sebastian@outlook.de>
Signed-off-by: Rick Kern <kernrj@gmail.com>
This has been broken since the start, and it was only discovered
when I started testing my replacement for the FFT.
Disable it, since there's no point in fixing slower code that's about
to be removed anyway.
The vfp version is not affected.
Fixes: signed integer overflow: -6322983228386819992 - 5557477266266529857 cannot be represented in type 'long'
Fixes: 50112/clusterfuzz-testcase-minimized-ffmpeg_dem_IFF_fuzzer-6329186221948928
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: Timeout
Fixes no testcase, this is the same idea as similar attacks against XML parsers
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
It is only used by encoders; in fact, AVCodecContext.time_base
is only used by encoders, so it is only useful for encoders.
Also constify the AVCodecContext parameter in it.
Also fixup the other headers a bit while removing now unnecessary
internal.h inclusions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Decoder-only, as the dimensions are set by the user when encoding.
Also fixup the other headers a bit while removing unnecessary internal.h
inclusions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Only used by decoders (encoders have ff_encode_alloc_frame()).
Also clean up the other headers a bit while removing now redundant
internal.h inclusions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Only used by decoders.
Also clean up the headers a bit while removing now unnecessary
internal.h inclusions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, these encoders received non-refcounted packets
(whose data was owned by the corresponding AVCodecContext)
from ff_alloc_packet(); these packets were made refcounted lateron
by av_packet_make_refcounted() generically.
This commit makes these encoders accept user-supplied buffers by
replacing av_packet_make_refcounted() with an equivalent function
that is based upon get_encode_buffer().
(I am pretty certain that one can also set the flag for mpegvideo-
based encoders, but I want to double-check this later. What is certain
is that it reallocates the buffer owned by the AVCodecContext
which should maybe be moved to encode.c, so that proresenc_kostya.c
and ttaenc.c can make use of it, too.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The encode-callback (the callback used by the FF_CODEC_CB_TYPE_ENCODE
encoders) is currently called in two places: encode_simple_internal()
and by the worker threads of frame-threaded encoders.
After the call, some packet properties are set based upon
the corresponding AVFrame properties and the packet is made
refcounted if it isn't already. So there is some code duplication.
There was also non-duplicated code in encode_simple_internal()
which is executed even when using frame-threading. This included
an emms_c() (which is needed for frame-threading, too, if it is
needed for the single-threaded case, because there are allocations
(via av_packet_make_refcounted()) immediately after returning
from the encode-callback).
Furthermore, some further properties are only set in
encode_simple_internal(): For audio, pts and duration are derived
from the corresponding fields of the frame if the encoder does not
have the AV_CODEC_CAP_DELAY set. Yet this is wrong for frame-threaded
encoders, because frame-threading always introduces delay regardless
of whether the underlying codec has said cap. This only worked because
there are no frame-threaded audio encoders.
This commit fixes the code duplication and the above issue by factoring
this code out and reusing it in both places. It would work in case
of audio codecs with frame-threading, because now the values are
derived from the correct AVFrame.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Instead of indicating whether we got a packet by setting
pkt->data and pkt->size to zero.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
AVCodecInternal.frame_thread_encoder is only set iff
active_thread_type is FF_THREAD_FRAME.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
As vaapi doesn't actually do anything useful with the alpha channel,
and we have an alphaless format available, let's use that instead.
The changes here are mostly 1:1 switching, but do note the explicit
change in the number of declared channels from 4 to 3 to reflect that
the alpha is being ignored.
As we already have support for VUYA, I figured I should do the small
amount of work to support VUYX as well. That means a little refactoring
to share code.
This is the alphaless version of VUYA that I introduced recently. After
further discussion and noting that the Intel vaapi driver explicitly
lists XYUV as a support format for encoding and decoding 8bit 444
content, we decided to switch our usage and avoid the overhead of
having a declared alpha channel around.
Note that I am not removing VUYA, as this turned out to have another
use, which was to replace the need for v408enc/dec when dealing with
the format.
The vaapi switching will happen in the next change
The fastest fast Fourier transform in not just the west, but the world,
now for the most popular toy ISA.
On a high level, it follows the design of the AVX2 version closely,
with the exception that the input is slightly less permuted as we don't have
to do lane switching with the input on double 4pt and 8pt.
On a low level, the lack of subadd/addsub instructions REALLY penalizes
any attempt at writing an FFT. That single register matters a lot,
and reloading it simply takes unacceptably long.
In x86 land, vendors would've noticed developers need this.
In ARM land, you get a badly designed complex multiplication instruction
we cannot use, that's not present on 95% of devices. Because only
compilers matter, right?
Future optimization options are very few, perhaps better register
management to use more ld1/st1s.
All timings below are in cycles:
A53:
Length | C | New (lavu) | Old (lavc) | FFTW
------ |-------------|-------------|-------------|-----
4 | 842 | 420 | 1210 | 1460
8 | 1538 | 1020 | 1850 | 2520
16 | 3717 | 1900 | 3700 | 3990
32 | 9156 | 4070 | 8289 | 8860
64 | 21160 | 9931 | 18600 | 19625
128 | 49180 | 23278 | 41922 | 41922
256 | 112073 | 53876 | 93202 | 101092
512 | 252864 | 122884 | 205897 | 207868
1024 | 560512 | 278322 | 458071 | 453053
2048 | 1295402 | 775835 | 1038205 | 1020265
4096 | 3281263 | 2021221 | 2409718 | 2577554
8192 | 8577845 | 4780526 | 5673041 | 6802722
Apple M1
New - Total for len 512 reps 2097152 = 1.459141 s
Old - Total for len 512 reps 2097152 = 2.251344 s
FFTW - Total for len 512 reps 2097152 = 1.868429 s
New - Total for len 1024 reps 4194304 = 6.490080 s
Old - Total for len 1024 reps 4194304 = 9.604949 s
FFTW - Total for len 1024 reps 4194304 = 7.889281 s
New - Total for len 16384 reps 262144 = 10.374001 s
Old - Total for len 16384 reps 262144 = 15.266713 s
FFTW - Total for len 16384 reps 262144 = 12.341745 s
New - Total for len 65536 reps 8192 = 1.769812 s
Old - Total for len 65536 reps 8192 = 4.209413 s
FFTW - Total for len 65536 reps 8192 = 3.012365 s
New - Total for len 131072 reps 4096 = 1.942836 s
Old - Segfaults
FFTW - Total for len 131072 reps 4096 = 3.713713 s
Thanks to wbs for some simplifications, assembler fixes and a review
and to jannau for giving it a look.
It is unnecessary since ee551a21ddcbf81afe183d9489c534ee80f263a0;
but it was redundant even before that, because decode_simple_internal()
calls emms_c().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There's no warranty that vpx_codec_encode() will generate a list with the same
amount of packets for both the yuv planes encoder and the alpha plane encoder,
so queueing packets based on what the main encoder returns will fail when the
amount of packets in both lists differ.
Queue all data packets for every vpx_codec_encode() call from both encoders
before attempting to assemble output AVPackets out of them.
Fixes ticket #9884
Reviewed-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: James Almer <jamrial@gmail.com>
This is somewhat redundant with the is_decoded check. Maybe
there is a nicer solution
Fixes: Null pointer dereference
Fixes: 49584/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5297367351427072
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -1948269928 * 10 cannot be represented in type 'int'
Fixes: 49451/clusterfuzz-testcase-minimized-ffmpeg_dem_SUBVIEWER_fuzzer-6344614822412288
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Don't silently replace it with the default layout for the amount of channels
from the requested layout.
Should fix ticket #9869
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes FATE-failures with the the filter-2xbr filter-3xbr filter-4xbr
filter-ep2x filter-ep3x filter-hq2x filter-hq3x filter-hq4x
filter-paletteuse-bayer filter-paletteuse-bayer0
filter-paletteuse-nodither and filter-paletteuse-sierra2_4a tests
when using 32bit x86 with CPUFLAGS ranging from "mmx+mmxext" to
"mmx+mmxext+sse+sse2+sse3" (the relevant function is only overwritten
when using SSSE3).
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
By checking immediately whether the first allocation was successfull
one can simplify the cleanup code in case of errors.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
APNG works with a single reference frame and an output frame.
According to the spec, decoding APNG works by decoding
the current IDAT/fdAT chunks (which decodes to a rectangular
subregion of the whole image region), followed by either
overwriting the region of the output frame with the newly
decoded data or by blending the newly decoded data with
the data from the reference frame onto the current subregion
of the output frame. The remainder of the output frame
is just copied from the reference frame.
Then the reference frame might be left untouched
(APNG_DISPOSE_OP_PREVIOUS), it might be replaced by the output
frame (APNG_DISPOSE_OP_NONE) or the rectangular subregion
corresponding to the just decoded frame has to be reset
to black (APNG_DISPOSE_OP_BACKGROUND).
The latter case is not handled correctly by our decoder:
It only performs resetting the rectangle in the reference frame
when decoding the next frame; and since commit
b593abda6c it does not reset
the reference frame permanently, but only temporarily (i.e.
it only affects decoding the frame after the frame with
APNG_DISPOSE_OP_BACKGROUND). This is a problem if the
frame after the APNG_DISPOSE_OP_BACKGROUND frame uses
APNG_DISPOSE_OP_PREVIOUS, because then the frame after
the APNG_DISPOSE_OP_PREVIOUS frame has an incorrect reference
frame. (If it is not followed by an APNG_DISPOSE_OP_PREVIOUS
frame, the decoder only keeps a reference to the output frame,
which is ok.)
This commit fixes this by being much closer to the spec
than the earlier code: Resetting the background is no longer
postponed until the next frame; instead it is applied to
the reference frame.
Fixes ticket #9602.
(For multithreaded decoding it was actually already broken
since commit 5663301560d77486c7f7c03c1aa5f542fab23c24.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Some code in alimiter assumes that there are 2 channels, resulting in
clipping if the loudest channel is 3 or above and an out-of-bounds read if
the input is monophonic. Fix that in 2 places.
Signed-off-by: David Flater <dave@flaterco.com>
Add adaptive_i/b feature to hevc_qsv. Adaptive_i allows changing of
frame type from P and B to I. Adaptive_b allows changing of frame type
frome B to P.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
According to its documentation it returns "pts of the last muxed packet
+ its duration", but the value it actually returns right now is
(possibly guessed) dts after muxer-internal bitstream filtering (if
any).
This function was added for ffmpeg.c, but it is not used there anymore.
Since the value it returns is ill-defined and so inappropriate for any
serious use, deprecate it.
Just return AVERROR_EXTERNAL immediately when encode error.
The other logic should keep the old behavior before commit 7c05b7951.
Suggested-By: Zhao Zhili <zhilizhao@tencent.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
This commit moves the encoder-only allocations of the tables
owned solely by the main encoder context to mpegvideo_enc.c.
This avoids checks in mpegvideo.c for whether we are dealing
with an encoder; it also improves modularity (if encoders are
disabled, this code will no longer be included in the binary).
And it also avoids having to reset these pointers at the beginning
of ff_mpv_common_init() (in case the dst context is uninitialized,
ff_mpeg_update_thread_context() simply copies the src context
into dst which therefore may contain pointers not owned by it,
but this does not happen for encoders at all).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes frame-threaded decoding of samples created by concatting
a video with data partitioning and a video not using it.
(Only the MPEG-4 decoder sets this, so it is synced in
mpeg4_update_thread_context() despite being a MpegEncContext-field.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is by no means perfect, since at least ddagrab will return scRGB
data with values outside of 0.0f to 1.0f for HDR values.
Its primary purpose is to be able to work with the format at all.
_Float16 support was available on arm/aarch64 for a while, and with gcc
12 was enabled on x86 as long as SSE2 is supported.
If the target arch supports f16c, gcc emits fairly efficient assembly,
taking advantage of it. This is the case on x86-64-v3 or higher.
Same goes on arm, which has native float16 support.
On x86, without f16c, it emulates it in software using sse2 instructions.
This has shown to perform rather poorly:
_Float16 full SSE2 emulation:
frame=50074 fps=848 q=-0.0 size=N/A time=00:33:22.96 bitrate=N/A speed=33.9x
_Float16 f16c accelerated (Zen2, --cpu=znver2):
frame=50636 fps=1965 q=-0.0 Lsize=N/A time=00:33:45.40 bitrate=N/A speed=78.6x
classic half2float full software implementation:
frame=49926 fps=1605 q=-0.0 Lsize=N/A time=00:33:17.00 bitrate=N/A speed=64.2x
Hence an additional check was introduced, that only enables use of
_Float16 on x86 if f16c is being utilized.
On aarch64, a similar uplift in performance is seen:
RPi4 half2float full software implementation:
frame= 6088 fps=126 q=-0.0 Lsize=N/A time=00:04:03.48 bitrate=N/A speed=5.06x
RPi4 _Float16:
frame= 6103 fps=158 q=-0.0 Lsize=N/A time=00:04:04.08 bitrate=N/A speed=6.32x
Since arm/aarch64 always natively support 16 bit floats, it can always
be considered fast there.
I'm not aware of any additional platforms that currently support
_Float16. And if there are, they should be considered non-fast until
proven fast.
IEEE-754 differentiates two different kind of NaNs.
Quiet and Signaling ones. They are differentiated by the MSB of the
mantissa.
For whatever reason, actual hardware conversion of half to single always
sets the signaling bit to 1 if the mantissa is != 0, and to 0 if it's 0.
So our code has to follow suite or fate-testing hardware float16 will be
impossible.
This avoids triggering overflows in the filters, and avoids stray
test failures in the approximate functions on x86; due to rounding
differences, one implementation might overflow while another one
doesn't.
Signed-off-by: Martin Storsjö <martin@martin.st>
Some muxers, such as GPAC, create files with only one sidx, but two streams
muxed into the same fragments pointed to by this sidx.
Prevously, in such a case, when we seeked in such files, we fell back
to, for example, using the sidx associated with the video stream, to
seek the audio stream, leaving the seekhead in the wrong place.
We can still do this, but we need to take care to compare timestamps
in the same time base.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
They are already synced generically in update_context_from_thread()
in pthread_frame.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unused since 3575a495f6
(and the error message is dangerous: av_get_pix_fmt_name(format)
returns NULL iff av_pix_fmt_desc_get(format) returns NULL
and using a NULL string for %s would be UB).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Improves the grepability of the code.
(Furthermore, I hope that no compiler will really call memset
for 28 bytes.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is done later in ff_mpv_frame_start() (and nobody uses
current_picture_ptr between setting it in ff_mpv_frame_start()).
(The reason the vsynth*-h263-obmc ref files change is because
the call to ff_find_unused_picture() now happens after the older
pictures have been unreferenced in ff_mpv_frame_start(),
so that their slots in the picture array can be immediately
reused; the obmc code is somehow buggy and changes its output
depending on the earlier contents of the motion_val buffer.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Provide optimized implementation of pix_abs8 function for arm64.
Performance comparison tests are shown below.
- pix_abs_1_0_c: 101.2
- pix_abs_1_0_neon: 22.5
- sad_1_c: 101.2
- sad_1_neon: 22.5
Benchmarks and tests are run with checkasm tool on AWS Graviton 3.
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation of sse8 function for arm64.
Performance comparison tests are shown below.
- sse_1_c: 130.7
- sse_1_neon: 29.7
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation of pix_abs16_y2 function for arm64.
Performance comparison tests are shown below.
pix_abs_0_2_c: 317.2
pix_abs_0_2_neon: 37.5
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide neon implementation for sse4 function.
Performance comparison tests are shown below.
- sse_2_c: 80.7
- sse_2_neon: 31.0
Benchmarks and tests are run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide neon implementation for sse16 function.
Performance comparison tests are shown below.
- sse_0_c: 268.2
- sse_0_neon: 43.5
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
c11fb46731 led to a regression whereby the return code for missing
input or input probe is overridden by writer close return code and
hence not conveyed in the exit code.
Since d69d12a5b9 these av_assert2()
(or more exactly, the ones in hadamard8_diff8x8_c() and
hadamard8_intra8x8_c()) are hit. So just remove all of these asserts.
(If the test were improved to know which functions expect h == 8
and which support any value, the asserts could be readded
at the appropriate places.)
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This directory dependency is normally added implicitly by rules
in ffbuild/common.mak; for tools it's created by a rule for TOOLOBJS.
TOOLOBJS is populated implicitly from TOOLS, and decode_simple.o
doesn't end up there because it's an odd occurrance of a lone
object file in the tools subdirectory, not belonging to any other
tool.
Signed-off-by: Martin Storsjö <martin@martin.st>
Previously, the checkasm test always passed h=8, so no other cases
were tested.
Out of the me_cmp functions, in practice, some functions are hardcoded
to always assume a 8x8 block (ignoring the h parameter), while others
do use the parameter. For those with hardcoded height, both the
reference C function and the assembly implementations ignore the
parameter similarly.
The documentation for the functions indicate that heights between
w/2 and 2*w, within the range of 4 to 16, should be supported. This
patch just tests random heights in that range, without knowing what
width the current function actually uses.
Signed-off-by: Martin Storsjö <martin@martin.st>
The height is hardcoded in some of the me_cmp functions, but not
in all of them. But in the case of all other functions, it's hardcoded
in the same place in SIMD functions as in the C reference functions,
while this one function differs from the behaviour of the C code.
(Before 542765ce3e, there were a
couple other sad8_*_mmx functions with similar hardcoded height.)
Signed-off-by: Martin Storsjö <martin@martin.st>
This commit adds new code paths for vscale when filterSize is 2, 4, or
8. By using specialized code with unrolling to match the filterSize we
can improve performance.
On AWS c7g (Graviton 3, Neoverse V1) instances:
before after
yuv2yuvX_2_0_512_accurate_neon: 558.8 268.9
yuv2yuvX_4_0_512_accurate_neon: 637.5 434.9
yuv2yuvX_8_0_512_accurate_neon: 1144.8 806.2
yuv2yuvX_16_0_512_accurate_neon: 2080.5 1853.7
Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Use scalar times vector multiply accumlate instructions instead of
vector times vector to remove the need for replicating load instructions
which are slightly slower.
On AWS c7g (Graviton 3, Neoverse V1) instances:
yuv2yuvX_8_0_512_accurate_neon: 1144.8 987.4
yuv2yuvX_16_0_512_accurate_neon: 2080.5 1869.4
Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Change the reference to exactly match the C reference in swscale,
instead of exactly matching the x86 SIMD implementations (which
differs slightly). Test with and without SWS_ACCURATE_RND - if this
flag isn't set, the output must match the C reference exactly,
otherwise it is allowed to be off by 2.
Mark a couple x86 functions as unavailable when SWS_ACCURATE_RND
is set - apparently this discrepancy hasn't been noticed in other
exact tests before.
Add a test for yuv2plane1.
Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Use it instead of AVStream.codecpar in the main thread. While
AVStream.codecpar is documented to only be updated when the stream is
added or avformat_find_stream_info(), it is actually updated during
demuxing. Accessing it from a different thread then constitutes a race.
Ideally, some mechanism should eventually be provided for signalling
parameter updates to the user. Then the demuxing thread could pick up
the changes and propagate them to the decoder.
This specialization handles the case where filtersize is 4 mod 8, e.g.
12, 20, etc. Aarch64 was previously using the c function for this case.
This implementation speeds up that case significantly.
hscale_8_to_15__fs_12_dstW_512_c: 6234.1
hscale_8_to_15__fs_12_dstW_512_neon: 1505.6
Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
frag_stream_info->index_entry isn't the first sample/trun index.
cenc.frag_index_entry_base failed to catch the case since
current_index > 0.
Fix ticket #9807.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
frag_index.current is used by cenc_filter, and is updated inside
mov_read_moof. It can out of sync regarding to mov_read_packet.
Partly fix ticket #9807.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Convert the input from a scatter to a gather instead,
which is faster and better for SIMD.
Also, add a pre-shuffled exptab version to avoid
gathering there at all. This doubles the exptab size,
but the speedup makes it worth it. In SIMD, the
exptab will likely be purged to a higher cache
anyway because of the FFT in the middle, and
the amount of loads stays identical.
For a 960-point inverse MDCT, the speedup is 10%.
This makes it possible to write sane and fast SIMD
versions of inverse MDCTs.
A gateway can see everything, and we should not be shipping a hardcoded
default from a third party company; it's a security risk.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
mpegvideo uses an array of Pictures and when it is done with using
them, it only unreferences them incompletely: Some buffers are kept
so that they can be reused lateron if the same slot in the Picture
array is reused, making this a sort of a bufferpool.
(Basically, a Picture is considered used if the AVFrame's buf is set.)
Yet given that other pieces of the decoder may have a reference to
these buffers, they need not be writable and are made writable using
av_buffer_make_writable() when preparing a new Picture. This involves
reading the buffer's data, although the old content of the buffer
need not be retained.
Worse, this read can be racy, because the buffer can be used by another
thread at the same time. This happens for Real Video 3 and 4.
This commit fixes this race by no longer copying the data;
instead the old buffer is replaced by a new, zero-allocated buffer.
(Here are the details of what happens with three or more decoding threads
when decoding rv30.rm from the FATE-suite as happens in the rv30 test:
The first decoding thread uses the first slot of its picture array
to store its current pic; update_thread_context copies this for the
second thread that decodes a P-frame. It uses the second slot in its
Picture array to store its P-frame. This arrangement is then copied
to the third decode thread, which decodes a B-frame. It uses the third
slot in its Picture array for its current frame.
update_thread_context copies this to the next thread. It unreferences
the third slot containing the other B-frame and then it reuses this
slot for its current frame. Because the pic array slots are only
incompletely unreferenced, the buffers of the previous B-frame are
still in there and they are not writable; in fact the previous
thread is concurrently writing to them, causing races when making
the buffer writable.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
At this point active_thread_type is set iff active_thread_type
is set to FF_THREAD_FRAME iff AVCodecInternal.frame_thread_encoder
is set.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Discontinuity detection/correction is left in the main thread, as it is
entangled with InputStream.next_dts and related variables, which may be
set by decoding code.
Fixes races e.g. in fate-ffmpeg-streamloop after
aae9de0cb2.
This will allow to move normal offset handling to demuxer thread, since
discontinuities currently have to be processed in the main thread, as
the code uses some decoder-produced values.
InputFile.ts_offset can change during transcoding, due to discontinuity
correction. This should not affect the streamcopy starting timestamp.
Cf. bf2590aed3
The AviSynth C API requires using avs_release_video_frame
whenever avs_get_frame has been used, but the recent addition
of frameprop reading to the demuxer was missing this in
avisynth_create_stream_video.
Signed-off-by: Stephen Hutchinson <qyot27@gmail.com>
This allows user to build FFmpeg against Intel oneVPL. oneVPL 2.6
is the required minimum version when building Intel oneVPL code.
It will fail to run configure script if both libmfx and libvpl are
enabled.
It is recommended to use oneVPL for new work, even for currently available
hardwares [1]
Note the preferred child device type is d3d11va for libvpl on Windows.
The commands below will use d3d11va if d3d11va is available on Windows.
$ ffmpeg -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -qsv_device 0 -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -init_hw_device qsv=qsv:hw_any -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -init_hw_device qsv=qsv:hw_any,child_device=0 -hwaccel qsv -c:v h264_qsv ...
User may use child_device_type option to specify child device type to
dxva2 or derive a qsv device from a dxva2 device
$ ffmpeg -init_hw_device qsv=qsv:hw_any,child_device=0,child_device_type=dxva2 -hwaccel qsv -c:v h264_qsv ...
$ ffmpeg -init_hw_device dxva2=d3d9:0 -init_hw_device qsv=qsv@d3d9 -hwaccel qsv -c:v h264_qsv ...
[1] https://www.intel.com/content/www/us/en/develop/documentation/upgrading-from-msdk-to-onevpl/top.html
If qsv hwdevice is available, use the mfxLoader handle in qsv hwdevice
to create mfx session. Otherwise create mfx session with a new mfxLoader
handle.
This is in preparation for oneVPL support
The following Cflags has been added to libmfx.pc, so mfx/ prefix is no
longer needed when including mfx headers in FFmpeg.
Cflags: -I${includedir} -I${includedir}/mfx
Some old versions of libmfx have the following Cflags in libmfx.pc
Cflags: -I${includedir}
We may add -I${includedir}/mfx to CFLAGS when running 'configure
--enable-libmfx' for old versions of libmfx, if so, mfx headers without
mfx/ prefix can be included too.
If libmfx comes without pkg-config support, we may do a small change to
the settings of the environment(e.g. set -I/opt/intel/mediasdk/include/mfx
instead of -I/opt/intel/mediasdk/include to CFLAGS), then the build can
find the mfx headers without mfx/ prefix
After applying this change, we won't need to change #include for mfx
headers when mfx headers are installed under a new directory.
This is in preparation for oneVPL support (mfx headers in oneVPL are
installed under vpl directory)
The data structures for VP9 in mfxvp9.h is wrapped by
MFX_VERSION_NEXT, which means those data structures have never been used
in a public release. Actually MFX_CODEC_VP9 and other VP9 stuffs are
added in mfxstructures.h. In addition, mfxdefs.h is included in
mfxvp9.h, so we may use the check in this patch for MFX_CODEC_VP9
This is in preparation for oneVPL support because mfxvp9.h is removed
from oneVPL [1]
[1]: https://github.com/oneapi-src/oneVPL
Intel's oneVPL is a successor to MediaSDK, but removed some obsolete
features of MediaSDK[1], some early versions of oneVPL still use libmfx
as library name[2]. However some of obsolete features, including OPAQUE
memory, multi-frame encode, user plugins and LA_EXT rate control mode
etc, have been enabled in QSV, so user can not use --enable-libmfx to
enable QSV if using an early version of oneVPL SDK. In order to ensure
user builds FFmpeg against a right version of libmfx, this patch added a
check for version < 2.0 and warning message about the used obsolete
features.
[1] https://spec.oneapi.io/versions/latest/elements/oneVPL/source/VPL_intel_media_sdk.html
[2] https://github.com/oneapi-src/oneVPL
It is the proper place to set it, directly besides mb_width and
mb_stride. The reason for doing it the way it is done now seems
to be that the code does not create more slice contexts than necessary
(i.e. not more than one per row), so that this number needs to be
known before setting the number of slices. But this can always be
arranged by just moving the code that sets the number of slices.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These fields are only ever set by the encoder for the current picture
and for no other picture. So only one set of these values needs to
exist, so move them to MpegEncContext.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Poisoning returned buffers is based around the implicit assumption
that the contents of said buffers are transient. Yet this is not true
for the buffer pools used by the various hardware contexts which store
important state in there that needs to be preserved.
Furthermore, the current code is also based on the assumption
that the complete buffer pointed to by AVBuffer->data coincides with
AVBufferRef->data; yet an implementation might store some data of its
own before the actual user-visible data (accessible via AVBufferRef)
which would be broken by the current code.
(This is of course yet more proof that the AVBuffer API is not the right
tool for the hardware contexts.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all the buffers that are made writable, three are always allocated
and the other four are allocated iff any one of them is allocated;
so one can replace the seven checks for existence with one.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, ff_wmv2_decode_secondary_picture_header() only
set the mb_type array for non I-pictures, so that the decoding
process uses the earlier values of this array; this affects
the output of the wmv8-x8intra FATE-test (which this patch
therefore updates). These earlier values were set when decoding
earlier frames or when the buffer was initially zero-allocated.
A consequence of this is that the output of this test would be
random if ff_find_unused_picture() would select the unused picture
to return at random. Furthermore decoding from a keyframe onwards
depends upon the earlier state of the decoder.
This patch therefore zeroes said array when decoding an I picture.
(It is not claimed that zero is the right value to fill the array with.
I just don't know.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Include two values for it, a default one that sets/keeps the current behavior,
where the frame event generated by the primary input will have a timestamp
equal or higher than frames in secondary input, plus a new one where the
secondary input frame will be that with the absolute closest timestamp to that
of the frame event one.
Addresses ticket #9689, where the new optional behavior produces better frame
syncronization.
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: James Almer <jamrial@gmail.com>
Stores the item ids of all the items found in the file and
processes the primary item at the end of the meta box. This patch
does not change any behavior. It sets up the code for parsing
alpha channel (and possibly images with 'grid') in follow up
patches.
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: James Zern <jzern@google.com>
These tables are only used by encoders and only for the current picture;
ergo they need not be put into the picture at all, but rather into
the encoder's context. They also don't need to be refcounted,
because there is only one owner.
In contrast to this, the earlier code refcounts them which
incurs unnecessary overhead. These references are not unreferenced
in ff_mpeg_unref_picture() (they are kept in order to have something
like a buffer pool), so that several buffers are kept at the same
time, although only one is needed, thereby wasting memory.
The code also propagates references to other pictures not part of
the pictures array (namely the copy of the current/next/last picture
in the MpegEncContext which get references of their own). These
references are not unreferenced in ff_mpeg_unref_picture() (the
buffers are probably kept in order to have something like a pool),
yet if the current picture is a B-frame, it gets unreferenced
at the end of ff_mpv_encode_picture() and its slot in the picture
array will therefore be reused the next time; but the copy of the
current picture also still has its references and therefore
these buffers will be made duplicated in order to make them writable
in the next call to ff_mpv_encode_picture(). This is of course
unnecessary.
Finally, ff_find_unused_picture() is supposed to just return
any unused picture and the code is supposed to work with it;
yet for the vsynth*-mpeg4-adap tests the result depends upon
the content of these buffers; given that this patchset
changes the content of these buffers (the initial content is now
the state of these buffers after encoding the last frame;
before this patch the buffers used came from the last picture
that occupied the same slot in the picture array) their ref-files
needed to be changed. This points to a bug somewhere (if one removes
the initialization, one gets uninitialized reads in
adaptive_quantization in ratecontrol.c).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Sufficiently recent Intel hardware is able to do encoding of 8bit 4:4:4
content in HEVC and VP9. The main requirement here is that the frames
must be provided in the AYUV format.
Enabling support is done by adding the appropriate encoding profiles
and noting that AYUV is officially a four channel format with alpha so
we must state that we expect all four channels.
vaapi_decode_find_best_format currently does not set the
VA_SURFACE_ATTRIB_SETTABLE flag on the pixel format attribute that it
returns.
Without this flag, the attribute will be ignored by vaCreateSurfaces,
meaning that the driver's default logic for picking a pixel format will
kick in.
So far, this hasn't produced visible problems, but when trying to
decode 4:4:4 content, at least on Intel, the driver will pick the
444P planar format, even though the decoder can only return the AYUV
packed format.
The hwcontext_vaapi code that sets surface attributes when picking
formats does not have this bug.
Applications may use their own logic for finding the best format, and
so may not hit this bug. eg: mpv is unaffected.
The present default value of 0 will render the overlay video invisible.
A default of 1.0 is consistent with most common use cases.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Directly branch into the special 64-point deinterleave
subroutine rather than going through the general deinterleave.
64-point transform timings on Zen 3:
Before:
1974 decicycles in av_tx (fft),16776864 runs, 352 skips
After:
1956 decicycles in av_tx (fft),16775378 runs, 1838 skips
This codepath is enabled by default on arm, if the linux perf API
is available, unless disabled with --disable-linux-perf.
Signed-off-by: Martin Storsjö <martin@martin.st>
-stream_loop is currently handled by destroying the demuxer thread,
seeking, then recreating it anew. This is very messy and conflicts with
the future goal of moving each major ffmpeg component into its own
thread.
Handle -stream_loop directly in the demuxer thread. Looping requires the
demuxer to know the duration of the file, which takes into account the
duration of the last decoded audio frame (if any). Use a thread message
queue to communicate this information from the main thread to the
demuxer thread.
This avoids a potential race with the demuxer adding new streams. It is
also more efficient, since we no longer do inter-thread transfers of
packets that will be just discarded.
This undocumented feature runtime-enables dumping input packets. I can
think of no reasonable real-world use case that cannot also be
accomplished in a different way. Keeping this functionality would
interfere with the following commit moving it to the input thread (then
setting the variable would require locking or atomics, which would be
unnecessarily complicated for a feature that probably nobody uses).
There are currently three possible modes for an output stream:
1) The stream is produced by encoding output from some filtergraph. This
is true when ost->enc_ctx != NULL, or equivalently when
ost->encoding_needed != 0.
2) The stream is produced by copying some input stream's packets. This
is true when ost->enc_ctx == NULL && ost->source_index >= 0.
3) The stream is produced by attaching some file directly. This is true
when ost->enc_ctx == NULL && ost->source_index < 0.
OutputStream.stream_copy is currently used to identify case 2), and
sometimes to confusingly (or even incorrectly) identify case 1). Remove
it, replacing its usage with checking enc_ctx/source_index values.
The fLaC and dfLa box IDs have been registered with the MP4 RA
(they are now listed at https://mp4ra.org/#/codecs) and support
for muxing FLAC in MP4 has been experimental in ffmpeg for
6 years now, since Nov 21, 2016
This patch removes the experimental status and removes the MP4
object type, as none has been registered for FLAC as it was not
deemed necessary.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
In addition to .eac3, .ec3 is also commonly used by people to name raw
E-AC-3 streams. Enables automatic recognition of the eac3 format for
the .ac3 extension.
For instance Dolby Digital Plus software only support files with
.ec3. Files with .eac3 are not supported. Check issue #18 in the
public dlb_mp4base repository from DolbyLaboratories.
Signed-off-by: Ruben Gonzalez <rgonzalez@fluendo.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The functions to replace parameter sets are only called
after the respective parameter set has just been read or
has just been written; all of these functions check
that the id field is within the appropriate range.
So the checks in the replace-functions can be removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is no longer used.
Also rename ff_cbs_alloc_unit_content2 to ff_cbs_alloc_unit_content.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
cbs_jpeg was the last user of CBS that didn't use
CodedBitstreamUnitTypeDescriptors.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The code just creates new references without allocating
new buffers for the subobjects; therefore the actual data pointer
stays valid and need not be updated.
Also remove an assert that ensured that the calculation
for updating the pointer makes sense.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This updates the list closer to reality.
Iam not a professional server admin, iam happy to help maintain the box as i have
done in the past. But iam not qualified nor volunteering to fix sudden problems
nor do i do major upgrades (i lack the experience to recover the box remotely if
something goes wrong) and also iam not maintaining backups ATM (our backup system
had a RAID-5 failure, raz is working on setting a new one up)
Maybe this should be signaled in a different way than spliting the lines but ATM
people ping me if something is wrong and what i do is mainly mail/ping raz
and try to find another root admin so raz is not the only active & professional
admin on the team. It would be more efficient if people contact raz and others
directly instead of depending on my waking up and forwarding a "ffmpeg.org" is down note
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
It's not modified, so we can simply use a const pointer to it.
Also check the return value of the other copy in the function while at it.
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: James Almer <jamrial@gmail.com>
nvenc does not appear to use these values as inputs for its built in RGB
to YUV conversion, but instead sets them on the output as-is.
Testing indicates the input is expected to be sRGB, with the output
ending up as limited range bt.470.
Up until now, there were four copies of
quantize_and_encode_band_cost_(ZERO|[SU]QUAD|[SU]PAIR|ESC|NONE|NOISE|STEREO)
(namely in aaccoder.o, aacenc_is.o, aacenc_ltp.o, aacenc_pred.o).
As 43b378a0d3 says, this is meant to
enable more optimizations.
Yet neither GCC nor Clang performed such optimizations: The functions
in the aforementioned files are not optimized according to
the specifics of the calls in the respective file. Therefore
duplicating them is a waste of space; this commit therefore stops doing
so. The remaining copy is placed into aaccoder.c (which is the only
place where the "round to zero" variant of quantize_and_encode_band()
is used, so that this can be completely internal to aaccoder.c).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also constify the corresponding code in mpegvideo.c that handles
lowres.
(Unfortunately, not everything that is const could be constified:
ref_picture could be made const uint8_t* const* if C allowed the
safe automatic conversion from uint8_t**; and pix_op, qpix_op
could be made to point to const function pointers, but C's handling
of const in pointers to arrays is broken.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible now that the HEVC DSP API treats them as const.
Also constify the AVFrames whose data is used as source in such
cases.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
__lasx_xvldx does not accept a pointer to const (in fact,
no function in lasxintrin.h does so), although it is not allowed
to modify the pointed-to buffer. Therefore this commit adds a wrapper
for it in order to constify the H264Chroma API in a later commit.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
__lsx_vldx does not accept a pointer to const (in fact,
no function in lsxintrin.h does so), although it is not allowed
to modify the pointed-to buffer. Therefore this commit adds a wrapper
for it in order to constify the HEVC DSP functions in a later commit.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Now that we have a combination of capable hardware (new enough Intel)
and a mutually understood format ("AYUV"), we can declare support for
decoding 8bit 4:4:4 content via VAAPI.
This requires listing AYUV as a supported format, and then adding VAAPI
as a supported hwaccel for the relevant codecs (HEVC and VP9). I also
had to add VP9Profile1 to the set of supported profiles for VAAPI as it
was never relevant before.
The "AYUV" format is defined by Microsoft as their preferred format for
4:4:4 content, and so it is the format used by Intel VAAPI and QSV.
As Microsoft like to define their byte ordering in little-endian
fashion, the memory order is reversed, and so our pix_fmt, which
follows memory order, has a reversed name (VUYA).
To do so, store the pointer to the VLC table and not to the VLC.
This is possible, because all the VLCs of the same type use
the same number of bits.
Also use a const VLCElem*, because the target is static and must
therefore not be modified after its initialization.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The msmpeg4 decoders/encoders share a common set of prerequisites,
ergo it makes sense to use common subsystems for them. This also
allows to remove the CONFIG_MSMPEG4_DECODER/ENCODER ad-hoc defines
(which violated the CONFIG_ namespace).
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
IntraX8 uses WMV2DSP directly, so it should have a direct dependency
on it. Also remove the indirect Makefile dependency of the VC-1 decoder
on wmv2dsp.o. Notice that since the addition of the MIPS WMV2DSP
implementation building only the VC-1 decoder would fail, because
no Makefile dependency VC1->wmv2dsp_init_mips.o has been added.
This is of course fixed by this commit.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Firstly, the timestamps generated from framerate are inaccurate for
variable framerate mode.
Secondly, the timestamps always start from zero, while pts/dts can
start from nonzero. FLV demuxer rejects such index with message:
"Found invalid index entries, clearing the index".
Usually a HW decoder is expected when user specifies a HW acceleration
method via -hwaccel option, however the current implementation doesn't
take HW acceleration method into account, it is possible to select a SW
decoder.
For example:
$ ffmpeg -hwaccel vaapi -i av1.mp4 -f null -
$ ffmpeg -hwaccel nvdec -i av1.mp4 -f null -
$ ffmpeg -hwaccel vdpau -i av1.mp4 -f null -
[...]
Stream #0:0 -> #0:0 (av1 (libdav1d) -> wrapped_avframe (native))
libdav1d is selected in this case even if vaapi, nvdec or vdpau is
specified.
After applying this patch, the native av1 decoder (with vaapi, nvdec or
vdpau support) is selected for decoding(libdav1d is still used for
probing format).
$ ffmpeg -hwaccel vaapi -i av1.mp4 -f null -
$ ffmpeg -hwaccel nvdec -i av1.mp4 -f null -
$ ffmpeg -hwaccel vdpau -i av1.mp4 -f null -
[...]
Stream #0:0 -> #0:0 (av1 (native) -> wrapped_avframe (native))
Tested-by: Mario Roy <marioeroy@gmail.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
After applying this patch, the desired HW acceleration method is known
before selecting decoder, so we may take HW acceleration method into
account when selecting decoder for input stream in the next commit
There should be no functional changes in this patch
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This function is bitrotten: It uses different parameters
than the corresponding ASM functions which replaced it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ea41e6d637 forgot
the int->ptrdiff_t switch for the stride. Libav didn't
do it because Libav had already dropped support for Alpha
at that point.
Only compilation has been tested for this commit.
(It might be that the ASM-versions of me_cmp_func functions
need to be updated as well.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The generated files are endian-dependent, so no checksums
may be part of the ref files.
Fixes ticket #9854.
Tested-by: Sebastian Ramacher <sramacher@debian.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The output file (even the filesize) of the recently added
EXR tests depends on the endianness; therefore checksums
of these files must not be part of the ref file. Therefore
this commit adds an option (unused for now) to disable these
checksums on a per-test basis.
In order to avoid having to check twice, the checksum and
the filesize info are moved to immediately follow one another;
this results into updates to the ref files of all lavf-image tests.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes outputting silence on the second channel when decoding Parametic Stereo
HE-AAC.
Closes ticekt #3361.
Signed-off-by: James Almer <jamrial@gmail.com>
The DXGI_OUTDUPL_FRAME_INFO type isn't available in Windows API
subsets other than "desktop", while the IDXGIOutput1 interface is
available for all API subsets.
This fixes compilation for UWP/"Windows Store" configurations (and
older API subsets like Windows Phone).
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes -Werror=format-security build failures when building with
disabled optimizations and (according to fate.ffmpeg.org also with
several other old GCC versions).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Modifying the main context from a slice thread is (usually)
a data race, so it must not happen. So only use a pointer to const
to access the main context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Modifying the main context from a slice thread is (usually)
a data race, so it must not happen. So only use a pointer to const
to access the main context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Modifying the main context from a slice thread is (usually)
a data race, so it must not happen. So only use a pointer to const
to access the main context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Modifying the main context from a slice thread is (usually)
a data race, so it must not happen. So only use a pointer to const
to access the main context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Modifying the main context from a slice thread is (usually)
a data race, so it must not happen. So only use a pointer to const
to access the main context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Modifying the main context from a slice thread is (usually)
a data race, so it must not happen. So only use a pointer to const
to access the main context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Modifying the main context from a slice thread is (usually)
a data race, so it must not happen. So only use a pointer to const
to access the main context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Modifying the main context from a slice thread is (usually)
a data race, so it must not happen. So only use a pointer to const
to access the main context.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Modifying the main context from a slice thread is (usually) a data race,
so it must not happen. So only use a pointer to const to access
the main context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible now that ff_thread_await_progress() accepts
a const ThreadFrame*.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible for most of the callers, because e.g. only
the MPEG-4 decoder can have bits_per_raw_sample > 8.
Also most mpegvideo-based codecs are 420 only.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These modifications were actually meant to be applied to
the coded_frame, yet 08b31a72db
changed this and so this code has not been removed when coded_frame
has been removed in 11bc790893.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
String literals are allowed to be deduplicated (and toolchains
are already capable of doing so), yet the same is not allowed
for named arrays (even when they contain strings). Therefore
use a const char *const pointing to an unnamed string literal
for ttml_default_namespacing.
Reviewed-by: Jan Ekström <jeebjp@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The decoder is meant to use it as a fallback if the value in extradata is
invalid.
Regression since d199099be.
Signed-off-by: James Almer <jamrial@gmail.com>
Since this behavior is intentional, use the VERBOSE level instead of WARNING as
it's nothing the user should worry about.
Signed-off-by: James Almer <jamrial@gmail.com>
This tests the new "-flags2 icc_profiles" option by making sure the
embedded ICC profile gets correctly detected as sRGB.
Signed-off-by: Niklas Haas <git@haasn.dev>
Only if requested, and only if the codec signals support for ICC
profiles. Implementation roughly matches the functionality of the
existing vf_iccgen filter, albeit with some reduced flexibility and no
caching.
Ideally, we'd also only do this on the first frame (e.g. mjpeg, apng),
but there's no meaningful way for us to distinguish between this case
and e.g. somebody using the image2 muxer, in which case we'd want to
attach ICC profiles to every frame in the stream.
Closes: #9672
Signed-off-by: Niklas Haas <git@haasn.dev>
Implementation for the decode side of the ICC profile API, roughly
matching the behavior of the existing vf_iccdetect filter.
Closes: #9673
Signed-off-by: Niklas Haas <git@haasn.dev>
Handling this in general code makes more sense than handling it in
individual codec files, because it would be a lot of unnecessary code
duplication for the plenty of formats that support exporting ICC
profiles (jpg, png, tiff, webp, jxl, ...).
encode.c and decode.c will be in charge of initializing this state as
needed, so we merely need to make sure to uninit it afterwards from the
common destructor path.
Signed-off-by: Niklas Haas <git@haasn.dev>
This functionally already exists, but as pointed out in #9672 and #9673,
requiring users to manually include filters is clumsy, error-prone and
hard to use together with tools like ffplay.
To streamline ICC profile support, add a new AVCodecContext flag to
globally enable reading and writing ICC profiles, automatically, for all
appropriate media types.
Note that this commit only includes the new API. The implementation is
split off to separate commits for readability.
Signed-off-by: Niklas Haas <git@haasn.dev>
Codecs that can read/write ICC profiles deserve a special capability so
the common logic in encode.c/decode.c can decide whether or not there
needs to be any special handling for ICC profiles. The motivation here
is to be able to use it to decide whether or not an ICC profile needs to
be generated in the encode path, but it might as well get added to
decoders as well for purely informative reasons.
It's not entirely clear to me whether the "thp" and "smvjpeg" variants
of "mjpeg" should have this capability set or not, given that the code
technically supports it but I somehow doubt these files may contain
them. In either case, this cap is purely informative for decoders so it
doesn't matter too much either way.
It's also not entirely clear whether the "amv" encoder should signal ICC
profile support, but again erring on the side of caution, we probably
*shouldn't* be generating (and encoding!) ICC profiles for this type of
media file.
Signed-off-by: Niklas Haas <git@haasn.dev>
We will need this helper inside libavcodec in the future, so move it
there, leaving behind an #include to the raw source file in its old
location in libvfilter. This approach is inspired by the handling of
vulkan.c, and avoids us needing to expose any of it publicly (or
semi-publicly) in e.g. libavutil, thus avoiding any ABI headaches.
It's debatable whether the actual code belongs in libavcodec or
libavfilter, but I decided to put it into libavcodec because it
conceptually deals with encoding and decoding ICC profiles, and will be
used to decode embedded ICC profiles in image files.
Signed-off-by: Niklas Haas <git@haasn.dev>
GPU hang is one of the most typical errors on Intel GPUs in
case something goes wrong. It's important to recognize it
explicitly for easier bugs triage. Also, this error code
can be used to trigger GPU recovery path in self-written
applications.
There were 2 other statuses which MediaSDK can ppotentially return,
MFX_ERR_NONE_PARTIAL_OUTPUT and MFX_ERR_REALLOC_SURFACE. Adding
them as well.
v2: move MFX_ERR_NONE_PARTIAL_OUTPUT next to MFX_WRN_* (Haihao)
Signed-off-by: Hon Wai Chow <hon.wai.chow@intel.com>
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The streamcopy initialization code briefly needs an AVCodecContext to
apply AVOptions to. Allocate a temporary codec context, do not use the
encoding one.
Instead replace VP56mv by new and identical structures VP8mv and VP9mv.
Also replace VP56Frame by VP8FrameType in vp8.h and use that
in VP8 code. Also remove VP56_FRAME_GOLDEN2, as this has only
been used by VP8, and use VP8_FRAME_ALTREF as replacement for
its usage in VP8 as this is more in line with VP8 verbiage.
This allows to remove all inclusions of vp56.h from everything
that is not VP5/6. This also removes implicit inclusions
of hpeldsp.h, h264chroma.h, vp3dsp.h and vp56dsp.h from all VP8/9
files.
(This also fixes a build issue: If one compiles with -O0 and disables
everything except the VP8-VAAPI encoder, the file containing
ff_vpx_norm_shift is not compiled, yet this is used implicitly
by vp56_rac_gets_nn() which is defined in vp56.h; it is unused
by the VP8-VAAPI encoder and declared as av_unused, yet with -O0
unused noninline functions are not optimized away, leading to
linking failures. With this patch, said function is not included
in vaapi_encode_vp8.c any more.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Even resolution or number of picture stores changes, we still need
follow no_output_of_prior_pics_flag in next IDR.
Tested-by: Fei Wang <fei.w.wang@intel.com>
Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
suppose
a. You have 3 frames, 0, 1, 4096.
b. The ltMask is 0xfff and use_msb is 0.
c. The 0, 1 are lt refs for 4096.
d. you are decoding frame 4096, and get the 0 frame.
Since 4096 & ltMask is 0 too, even you want get 0, find_ref_idx may give you 4096.
add_candidate_ref will report an error for this
Tested-by: Fei Wang <fei.w.wang@intel.com>
Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
We will generate a new frame for a missed reference. The frame can only
be used for reference. We assign an invalid decode sequence to it, so
it will not be involved in any dpb process.
Tested-by: Fei Wang <fei.w.wang@intel.com>
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
According to C.5.2.2, item 2. When we got an IRAP, and the
NoOutputOfPriorPicsFlag = 0, we need bump all outputable frames.
Tested-by: Fei Wang <fei.w.wang@intel.com>
Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
It conflicts with the name of the test using the testtool
in libavformat.mak.
Fixes ticket #9841.
Reviewed-by: Pierre-Anthony Lemieux <pal@sandflow.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It used to be allocated separately, so that the pointer to it
is copied to all HEVCContexts, so that all slice-threads
use the same. This is completely unnecessary now that there
is only one HEVCContext any more. There is just one minor
complication left: The slice-threads only get a pointer to
const HEVCContext, but they need to modify the common CABAC
state. Fix this by adding a pointer to the common CABAC state
to HEVCLocalContext and document why it exists.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
While just at it, also use av_calloc() instead of zeroing
the array ourselves in a loop.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Actually, ff_slice_thread_allocz_entries() always already
allocates zeroed entries, so ff_reset_entries() was already
unnecessary. Make this more clear by renaming it to
ff_slice_thread_allocz_entries().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The HEVC decoder has both HEVCContext and HEVCLocalContext
structures. The latter is supposed to be the structure
containing the per-slicethread state.
Yet up until now that is not how it is handled in practice:
Each HEVCLocalContext has a unique HEVCContext allocated for it
and each of these coincides except in exactly one field: The
corresponding HEVCLocalContext. This makes it possible to pass
the HEVCContext everywhere where logically a HEVCLocalContext
should be used. And up until recently, this is how it has been done.
Yet the preceding patches changed this, making it possible
to avoid allocating redundant HEVCContexts.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Right now the code passes a list of ints whose entry #i
is just i as opaque parameter to hls_decode_entry_wpp
via execute2; said list is even constantly allocated and freed.
This commit stops doing so and instead passes the list of
HEVCLocalContext* instead, so that the main HEVCContext
can be avoided in accessing the HEVCLocalContext.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The HEVC decoder has both HEVCContext and HEVCLocalContext
structures. The latter is supposed to be the structure
containing the per-slicethread state.
Yet that is not how it is handled in practice: Each HEVCLocalContext
has a unique HEVCContext allocated for it and each of these
coincides except in exactly one field: The corresponding
HEVCLocalContext. This makes it possible to pass the HEVCContext
everywhere where logically a HEVCLocalContext should be used.
This commit stops doing this for lavc/hevcdec.c itself.
It also constifies what can be constified in order to make
the nonconst stuff stand out more.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The HEVC decoder has both HEVCContext and HEVCLocalContext
structures. The latter is supposed to be the structure
containing the per-slicethread state.
Yet that is not how it is handled in practice: Each HEVCLocalContext
has a unique HEVCContext allocated for it and each of these
coincides except in exactly one field: The corresponding
HEVCLocalContext. This makes it possible to pass the HEVCContext
everywhere where logically a HEVCLocalContext should be used.
This commit stops doing this for lavc/hevcpred as well as
the corresponding mips code; the latter is untested.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The HEVC decoder has both HEVCContext and HEVCLocalContext
structures. The latter is supposed to be the structure
containing the per-slicethread state.
Yet that is not how it is handled in practice: Each HEVCLocalContext
has a unique HEVCContext allocated for it and each of these
coincides except in exactly one field: The corresponding
HEVCLocalContext. This makes it possible to pass the HEVCContext
everywhere where logically a HEVCLocalContext should be used.
This commit stops doing this for lavc/hevc_cabac.c; it also constifies
everything that is possible in order to ensure that no slice thread
accidentally modifies the main HEVCContext state.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The HEVC decoder has both HEVCContext and HEVCLocalContext
structures. The latter is supposed to be the structure
containing the per-slicethread state.
Yet that is not how it is handled in practice: Each HEVCLocalContext
has a unique HEVCContext allocated for it and each of these
coincides with the main HEVCContext except in exactly one field:
The corresponding HEVCLocalContext.
This makes it possible to pass the HEVCContext everywhere where
logically a HEVCLocalContext should be used.
This led to confusion in the first version of what eventually became
commit c8bc0f66a8:
Before said commit, the initialization of the Rice parameter derivation
state was incorrect; the fix for single-threaded as well as
frame-threaded decoding was to add backup stats to HEVCContext
that are used when the cabac state is updated*, see
https://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268861.html
Yet due to what has been said above, this does not work for
slice-threading, because the each HEVCLocalContext has its own
HEVCContext, so the Rice parameter state would not be transferred
between threads.
This is fixed in c8bc0f66a8
by a hack: It rederives what the previous thread was and accesses
the corresponding HEVCContext.
Fix this by treating the Rice parameter state the same way
the ordinary CABAC parameters are shared between threads:
Make them part of the same struct that is shared between
slice threads. This does not cause races, because
the parts of the code that access these Rice parameters
are a subset of the parts of code that access the CABAC parameters.
*: And if the persistent_rice_adaptation_enabled_flag is set.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The HEVC decoder has both HEVCContext and HEVCLocalContext
structures. The latter is supposed to be the structure
containing the per-slicethread state.
Yet that is not how it is handled in practice: Each HEVCLocalContext
has a unique HEVCContext allocated for it and each of these
coincides with the main HEVCContext except in exactly one field:
The corresponding HEVCLocalContext.
This makes it possible to pass the HEVCContext everywhere where
logically a HEVCLocalContext should be used.
This commit stops doing this for lavc/hevc_filter.c; it also constifies
everything that is possible in order to ensure that no slice thread
accidentally modifies the main HEVCContext state.
There are places where this was not possible, namely with the SAOParams
in sao_filter_CTB() or with sao_pixels_buffer_h in copy_CTB_to_hv().
Both of these instances lead to data races, see
https://fate.ffmpeg.org/report.cgi?time=20220629145651&slot=x86_64-archlinux-gcc-tsan-slices
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The HEVC decoder has both HEVCContext and HEVCLocalContext
structures. The latter is supposed to be the structure
containing the per-slicethread state.
Yet that is not how it is handled in practice: Each HEVCLocalContext
has a unique HEVCContext allocated for it and each of these
coincides except in exactly one field: The corresponding
HEVCLocalContext. This makes it possible to pass the HEVCContext
everywhere where logically a HEVCLocalContext should be used.
This commit stops doing this for lavc/hevc_mvs.c; it also constifies
everything that is possible in order to ensure that no slice thread
accidentally modifies the main HEVCContext state.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is safe for a slice thread to read the main context
and therefore it is safe to add a pointer to const HEVCContext
(namely the parent context) to each HEVCLocalContext.
It is also safe (and actually redundant) to add a pointer
to a logcontext to HEVCLocalContext.
Doing so allows to pass the HEVCLocalContext as context in
the parts of the code that is run slice-threaded when slice-threading
is in use (currently these parts of the code use ordinary
HEVCContext*). This way one is not tempted to modify
the main context from the slice contexts.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The slicethread contexts need to be initialized for
every frame, not only the first one, so one can
remove the initialization when allocating these contexts,
because the ordinary per-frame initialization will
initialize them again just a few lines below.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise, there is no guarantee that the various av_log-messages
are not interrupted by another log statement. The latter may originate
from anywhere else, even the HEVC decoder itself, as happens when
one uses frame-threading to decode the BUMPING_A_ericsson_1.bit
sample from the FATE-suite.
Furthermore, the earlier approach suffered from the fact that
various parts of the logmsg were output with different loglevels
and that checking stopped after having encountered the first
plane with MD5 mismatch, although it is probably interesting to
know whether other planes are incorrect, too.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Dump all input/output names to OVModel struct. In case other funcs use
them for reporting errors or locating issues.
Signed-off-by: Ting Fu <ting.fu@intel.com>
encode_block() in svq1enc.c looks like the following:
static int encode_block(int block[7][256], int level)
{
int best_score = 0;
for (unsigned x = 0; x < level; x++) {
int v = block[1][x];
block[level][x] = 0;
best_score += v * v;
}
if (level > 0 && best_score > 64) {
int score = 0;
score += encode_block(block, level - 1);
score += encode_block(block, level - 1);
if (score < best_score) {
best_score = score;
}
}
return best_score;
}
When called from outside of encode_block(), it is always called with
level == 5.
This triggers a bug [1] in GCC: On -O3, it creates eight clones of
encode_block with different values of level inlined into it. The clones
with negative values are of course useless*, but they also lead to
-Warray-bounds warnings, because they access block[-1].
This has been mitigated in GCC 12: It no longer creates clones
for parameters that it knows are impossible. Somehow switching levels
to unsigned makes GCC know this. Therefore this commit does this.
(For GCC 11, this changes the warning to "array subscript 4294967295 is
above array bounds" from "array subscript -1 is below array bounds".)
[1]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102513
*: These clones can actually be discarded when compiling with
-ffunction-sections.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
filter_mb_mbaff_edgev() and filter_mb_mbaff_edgecv()
have a function parameter whose expected size depends upon
another parameter: It is 2 * bsi + 1 (with bsi always being 1 or 2).
This array is declared as const int16_t[7], yet some of the callers
with bsi == 1 call it with only an const int16_t[4] available.
This leads to -Wstringop-overread warnings from GCC 12.1.
This commit fixes these by replacing [7] with [/* 2 * bsi + 1 */],
so that the expected range and its dependence on bsi is immediately
visible.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
check_block_inter() currently does this when calling check_block().
This leads to a -Wstringop-overflow= warning when compiling with
GCC 12.1.
Given that the main part of the body of check_block() consists
of an "if (intra) { ... } else { ... }" which is true iff
check_block() is not called from check_block_inter(),
it makes sense to fix this by just inlining check_block()
check_block_inter() and turning check_block() into a new
check_block_intra() (with the inter parts removed, of course).
This should also not make much of a difference for the generated code
given that both check_block() as well as check_block_inter()
are already marked as av_always_inline, so this commit follows
this route to fix the issue.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
multiswap_step() and multiswap_inv_step() both only require
six keys; in all current callers, these keys are part of
an array of twelve keys, yet in some of these callers the keys
given to these functions point to the second half of these
twelve keys, so that only six keys are available to these functions.
This led to -Wstringop-overread warnings when compiling with GCC 12.1.
Fix these by adapting the declaration of these functions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Using tail calls with functions returning void is forbidden
(C99/C11 6.8.6.4: "A return statement with an expression shall not appear
in a function whose return type is void.") GCC emits a warning
because of this when using -pedantic: "ISO C forbids ‘return’ with
expression, in function returning void"
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It retrieves the muxer's internal timestamp with under-defined
semantics. Continuing to use this value would also require
synchronization once the muxer is moved to a separate thread.
Replace the value with last_mux_dts.
This field means different things when the video is encoded (number of
frames emitted to the encoding sync queue/encoder by the video sync
code) or copied (number of packets sent to the muxer sync queue).
Print the value of packets_written instead, which means the same thing
in both cases. It is also more accurate, since packets may be dropped by
the sync queue or bitstream filters.
Same issues apply to it as to -shortest.
Changes the results of the following tests:
- matroska-flac-extradata-update
The test reencodes two input FLAC streams into three output FLAC
streams. The last output stream is limited to 8 frames. The current
code results in the first two output streams having 12 frames, after
this commit all three streams have 8 frames and are the same length.
This new result is better, since it is predictable.
- mkv-1242
The test streamcopies one video and one audio stream, video is limited
to 11 frames. The new result shortens the audio stream so that it is
not longer than the video.
The -shortest option (which finishes the output file at the time the
shortest stream ends) is currently implemented by faking the -t option
when an output stream ends. This approach is fragile, since it depends
on the frames/packets being processed in a specific order. E.g. there
are currently some situations in which the output file length will
depend unpredictably on unrelated factors like encoder delay. More
importantly, the present work aiming at splitting various ffmpeg
components into different threads will make this approach completely
unworkable, since the frames/packets will arrive in effectively random
order.
This commit introduces a "sync queue", which is essentially a collection
of FIFOs, one per stream. Frames/packets are submitted to these FIFOs
and are then released for further processing (encoding or muxing) when
it is ensured that the frame in question will not cause its stream to
get ahead of the other streams (the logic is similar to libavformat's
interleaving queue).
These sync queues are then used for encoding and/or muxing when the
-shortest option is specified.
A new option – -shortest_buf_duration – controls the maximum number of
queued packets, to avoid runaway memory usage.
This commit changes the results of the following tests:
- copy-shortest[12]: the last audio frame is now gone. This is
correct, since it actually outlasts the last video frame.
- shortest-sub: the video packets following the last subtitle packet are
now gone. This is also correct.
The following commits will add a new buffering stage after bitstream
filters, which should not be taken into account for choosing next
output.
OutputStream.last_mux_dts is also used by the muxing code to make up
missing DTS values - that field is now moved to the muxer-private
MuxStream object.
The current placement of this free is historical - it used to be
followed by avcodec_close(), since removed.
The proper place for freeing the stats is currently right before the
encoder context itself is freed.
It is currently called from two places:
- output_packet() in ffmpeg.c, which submits the newly available output
packet to the muxer
- from of_check_init() in ffmpeg_mux.c after the header has been
written, to flush the muxing queue
Some packets will thus be processed by this function twice, so it
requires an extra parameter to indicate the place it is called from and
avoid modifying some state twice.
This is fragile and hard to follow, so split this function into two.
Also rename of_write_packet() to of_submit_packet() to better reflect
its new purpose.
The muxing queue currently lives in OutputStream, which is a very large
struct storing the state for both encoding and muxing. The muxing queue
is only used by the code in ffmpeg_mux, so it makes sense to restrict it
to that file.
This makes the first step towards reducing the scope of OutputStream.
Figure out earlier whether the output stream/file should be bitexact and
store this information in a flag in OutputFile/OutputStream.
Stop accessing the muxer in set_encoder_id(), which will become
forbidden in future commits.
The current code postpones closing the files until after printing the
final report, which accesses the output file size. Deal with this by
storing the final file size before closing the file.
Move the file size checking code to ffmpeg_mux. Use the recently
introduced of_filesize(), making this code consistent with the size
shown by print_report().
Move header_written into it, which is not (and should not be) used by
any code outside of ffmpeg_mux.
In the future this context will contain more muxer-private state that
should not be visible to other code.
The amount of padding samples reported by containers take into account the
extended samplerate in HE-AAC.
Fixes ticket #9671.
Signed-off-by: James Almer <jamrial@gmail.com>
This issue would cause segmetaion fault when running srcnn model with
sr filter by TensorFlow backend. This filter would free the frame incorectly.
Signed-off-by: Ting Fu <ting.fu@intel.com>
Using parameter from AVCodecContext to reset qsv codec is more suitable
for MFXVideoENCODE_Reset()'s usage. Per-frame metadata is more suitable
for the usage of mfxEncodeCtrl being passed to
MFXVideoENCODE_EncodeFrameAsync(). Now change it to use the value
from AVCodecContext.
Because q->param is passed to both "in" and "out" parameters when call
MFXVideoENCODE_Query(), the value in q->param may be changed. New
variables are added to store old configuration, so that we can detect
real parameter change.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Dividing one line log into several av_log() call is not thread safe. Now
merge these strings into one av_log() call.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The only duration field currently present in AVFrame is pkt_duration,
which is semantically restricted to those frames that are output by
decoders.
Add a new field that stores the frame's duration without regard for how
that frame was produced. Deprecate pkt_duration.
It need not be writable; in fact, it is often not writable even if
the packet sent to the decoder was writable, because the generic code
calls av_packet_ref() on it. It is never writable if a user
drains the decoder after every packet, because in this case the decode
callback is called from avcodec_send_packet().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
wrapped_avframe_decode() uses an AVFrame as dst in av_frame_move_ref()
after having called ff_decode_frame_props() to attach side-date
to this very frame. This leaks all the side-data and metadata
that ff_decode_frame_props() has attached.
This happens in various fate-filter-metadata tests since
6ca43a9675.
These particular leaks (which affect metadata-only)
could be fixed by not adding metadata side-data to AVPackets
in libavdevice if they are also available from the AVFrames.
Yet this would break users that extract the metadata from
AVPackets.
The changes to FATE happen because of the way av_dict_set()
works when it overwrites an already existing entry:
It overwrites the entry to be overwritten with the last entry
and adds the new entry at the end. The end result is that
the first entry of the dict is the second-to-last-entry of
the original dict, the last entry of the dict is the last
entry of the old dict and the first count - 2 entries
of the original dict are at positions 1..count - 2 in their
original order.
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
and remove FF_CODEC_CAP_INIT_THREADSAFE
All our native codecs are already init-threadsafe
(only wrappers for external libraries and hwaccels
are typically not marked as init-threadsafe yet),
so it is only natural for this to also be the default state.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is in preparation of switching the default init-thread-safety
to a codec being init-thread-safe.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Streams with all zero sample_delta in 'stts' have all zero dts.
They have higher chance be chose by mov_find_next_sample(), which
leads to seek again and again.
For example, GoPro created a 'GoPro SOS' stream:
Stream #0:4[0x5](eng): Data: none (fdsc / 0x63736466), 13 kb/s (default)
Metadata:
creation_time : 2022-06-21T08:49:19.000000Z
handler_name : GoPro SOS
With 'ffprobe -show_frames http://example.com/gopro.mp4', ffprobe
blocks until all samples in 'GoPro SOS' stream are consumed first.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
This avoids an extra copy of potentially quite big video frames.
Instead of copying the entire frames data into a rawvideo packet it
packs the frame into a wrapped avframe packet and passes it through
as-is.
Unfortunately, wrapped avframes are set up to be video frames, so the
audio frames continue to be copied.
Additionally, this enabled passing through video frames that previously
were impossible to process, like hardware frames or other special
formats that couldn't be packed into a rawvideo packet.
According to API docs avdevice_list_devices(), avdevice_list_input_sources()
and avdevice_list_input_sinks() should return the number of autodetected
devices on success. This is redundant with AVDeviceInfoList->nb_devices so it
was not noticed earlier that none of the underlying device list functions work
like that.
Let's fix it in generic code to make it in line with the API docs.
Fixes ticket #9820.
Signed-off-by: Marton Balint <cus@passwd.hu>
This patch prevents the libjxl encoder wrapper from failing to
encode images when the input video has untagged primaries. It will
instead assume BT.709/sRGB primaries and print a warning.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
The max height is currently documented as 16; the max difference per
pixel is 255, and a .8h element can easily contain 16*255, thus keep
accumulating in two .8h vectors, and just do the final accumulationat the
end. This should work for heights up to 256.
This requires a minor register renumbering in ff_pix_abs16_xy2_neon.
Before: Cortex A53 A72 A73 Graviton 3
pix_abs_0_0_neon: 97.7 47.0 37.5 22.7
pix_abs_0_1_neon: 154.0 59.0 52.0 25.0
pix_abs_0_3_neon: 179.7 96.7 87.5 41.2
After:
pix_abs_0_0_neon: 96.0 39.2 31.2 22.0
pix_abs_0_1_neon: 150.7 59.7 46.2 23.7
pix_abs_0_3_neon: 175.7 83.7 81.7 38.2
Signed-off-by: Martin Storsjö <martin@martin.st>
Using absolute-difference-accumulate does use twice the amount of
absolute-difference instructions, but avoids the need for the
uaddl and add instructions, reducing the total number of instructions
by 3.
These can be interleaved in the rest of the calculation, to avoid
tight dependencies at the end. Unfortunately, this is marginally
slower on Cortex A53, but faster on A72 and A73.
Before: Cortex A53 A72 A73 Graviton 3
pix_abs_0_3_neon: 175.7 109.2 92.0 41.2
After:
pix_abs_0_3_neon: 179.7 96.7 87.5 41.2
Signed-off-by: Martin Storsjö <martin@martin.st>
This is a per-file input option that adjusts an input's timestamps
with reference to another input, so that emitted packet timestamps
account for the difference between the start times of the two inputs.
Typical use case is to sync two or more live inputs such as from capture
devices. Both the target and reference input source timestamps should be
based on the same clock source.
If either input lacks starting timestamps, then no sync adjustment is made.
Provide neon implementation for pix_abs16_x2 function.
Performance tests of implementation are below.
- pix_abs_0_1_c: 283.5
- pix_abs_0_1_neon: 39.0
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
The FFmpeg project is organized through a community working on global consensus.
Decisions are taken by the ensemble of active members, through voting and are aided by two committees.
@anchor{General Assembly}
@chapter General Assembly
The ensemble of active members is called the General Assembly (GA).
The General Assembly is sovereign and legitimate for all its decisions regarding the FFmpeg project.
The General Assembly is made up of active contributors.
Contributors are considered "active contributors" if they have authored more than 20 patches in the last 36 months in the main FFmpeg repository, or if they have been voted in by the GA.
The list of active contributors is updated twice each year, on 1st January and 1st July, 0:00 UTC.
Additional members are added to the General Assembly through a vote after proposal by a member of the General Assembly. They are part of the GA for two years, after which they need a confirmation by the GA.
A script to generate the current members of the general assembly (minus members voted in) can be found in `tools/general_assembly.pl`.
@anchor{Voting}
@chapter Voting
Voting is done using a ranked voting system, currently running on https://vote.ffmpeg.org/ .
Majority vote means more than 50% of the expressed ballots.
@anchor{Technical Committee}
@chapter Technical Committee
The Technical Committee (TC) is here to arbitrate and make decisions when technical conflicts occur in the project. They will consider the merits of all the positions, judge them and make a decision.
The TC resolves technical conflicts but is not a technical steering committee.
Decisions by the TC are binding for all the contributors.
Decisions made by the TC can be re-opened after 1 year or by a majority vote of the General Assembly, requested by one of the member of the GA.
The TC is elected by the General Assembly for a duration of 1 year, and is composed of 5 members. Members can be re-elected if they wish. A majority vote in the General Assembly can trigger a new election of the TC.
The members of the TC can be elected from outside of the GA. Candidates for election can either be suggested or self-nominated.
The conflict resolution process is detailed in the resolution process document.
The TC can be contacted at <tc@@ffmpeg>.
@anchor{Resolution Process}
@section Resolution Process
The Technical Committee (TC) is here to arbitrate and make decisions when technical conflicts occur in the project.
The TC main role is to resolve technical conflicts. It is therefore not a technical steering committee, but it is understood that some decisions might impact the future of the project.
@subsection Seizing
The TC can take possession of any technical matter that it sees fit.
To involve the TC in a matter, email tc@ or CC them on an ongoing discussion.
As members of TC are developers, they also can email tc@ to raise an issue.
@subsection Announcement
The TC, once seized, must announce itself on the main mailing list, with a [TC] tag.
The TC has 2 modes of operation: a RFC one and an internal one.
If the TC thinks it needs the input from the larger community, the TC can call for a RFC. Else, it can decide by itself.
If the disagreement involves a member of the TC, that member should recuse themselves from the decision.
The decision to use a RFC process or an internal discussion is a discretionary decision of the TC.
The TC can also reject a seizure for a few reasons such as: the matter was not discussed enough previously; it lacks expertise to reach a beneficial decision on the matter; or the matter is too trivial.
@subsection RFC call
In the RFC mode, one person from the TC posts on the mailing list the technical question and will request input from the community.
The mail will have the following specification:
a precise title
a specific tag [TC RFC]
a top-level email
contain a precise question that does not exceed 100 words and that is answerable by developers
may have an extra description, or a link to a previous discussion, if deemed necessary,
contain a precise end date for the answers.
The answers from the community must be on the main mailing list and must have the following specification:
keep the tag and the title unchanged
limited to 400 words
a first-level, answering directly to the main email
answering to the question.
Further replies to answers are permitted, as long as they conform to the community standards of politeness, they are limited to 100 words, and are not nested more than once. (max-depth=2)
After the end-date, mails on the thread will be ignored.
Violations of those rules will be escalated through the Community Committee.
After all the emails are in, the TC has 96 hours to give its final decision. Exceptionally, the TC can request an extra delay, that will be notified on the mailing list.
@subsection Within TC
In the internal case, the TC has 96 hours to give its final decision. Exceptionally, the TC can request an extra delay.
@subsection Decisions
The decisions from the TC will be sent on the mailing list, with the [TC] tag.
Internally, the TC should take decisions with a majority, or using ranked-choice voting.
The decision from the TC should be published with a summary of the reasons that lead to this decision.
The decisions from the TC are final, until the matters are reopened after no less than one year.
@anchor{Community Committee}
@chapter Community Committee
The Community Committee (CC) is here to arbitrage and make decisions when inter-personal conflicts occur in the project. It will decide quickly and take actions, for the sake of the project.
The CC can remove privileges of offending members, including removal of commit access and temporary ban from the community.
Decisions made by the CC can be re-opened after 1 year or by a majority vote of the General Assembly. Indefinite bans from the community must be confirmed by the General Assembly, in a majority vote.
The CC is elected by the General Assembly for a duration of 1 year, and is composed of 5 members. Members can be re-elected if they wish. A majority vote in the General Assembly can trigger a new election of the CC.
The members of the CC can be elected from outside of the GA. Candidates for election can either be suggested or self-nominated.
The CC is governed by and responsible for enforcing the Code of Conduct.
The CC can be contacted at <cc@@ffmpeg>.
@anchor{Code of Conduct}
@chapter Code of Conduct
Be friendly and respectful towards others and third parties.
Treat others the way you yourself want to be treated.
Be considerate. Not everyone shares the same viewpoint and priorities as you do.
Different opinions and interpretations help the project.
Looking at issues from a different perspective assists development.
Do not assume malice for things that can be attributed to incompetence. Even if
it is malice, it's rarely good to start with that as initial assumption.
Stay friendly even if someone acts contrarily. Everyone has a bad day
once in a while.
If you yourself have a bad day or are angry then try to take a break and reply
once you are calm and without anger if you have to.
Try to help other team members and cooperate if you can.
The goal of software development is to create technical excellence, not for any
individual to be better and "win" against the others. Large software projects
are only possible and successful through teamwork.
If someone struggles do not put them down. Give them a helping hand
instead and point them in the right direction.
Finally, keep in mind the immortal words of Bill and Ted,
av_log(NULL, AV_LOG_WARNING, "Multiple %s options specified for stream %d, only the last option '-%s%s%s "SPECIFIER_OPT_FMT_##type"' will be used.\n",\
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.