Separates macro arguments with commas and passes .4H/.8H as macro
arguments instead of 4H/8H (the later form being interpreted as an
hexadecimal value).
Fixes ticket #6324.
Suggested-by: Martin Storsjö <martin@martin.st>
ASC frames smaller than AAC_ADTS_HEADER_SIZE were being discarded.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
(cherry picked from commit 0f05f2c7e6)
The code was skipping the entire reported SEI message size regardless of
the amount of bits read.
While in theory safe for NALU where the picture timing SEI message is alone
or at the end as we're using the checked bitstream reader, it isn't in any
other situation, where every SEI message in the NALU after the picture
timing one would potentially fail to parse.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
(cherry picked from commit f738140807)
Conflicts:
libavcodec/hevc_sei.c
It was never meant to do otherwise, as av_packet_get_side_data() returns the first
entry it finds of a given type.
Based on code from libavformat's av_stream_add_side_data().
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
(cherry picked from commit 28f60eeabb)
This avoids intermediates from overflowing (the final values are checked)
Fixes: runtime error: signed integer overflow: -167712 + -2147352576 cannot be represented in type 'int'
Fixes: 1298/clusterfuzz-testcase-minimized-5955580877340672
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit c1c3a14073)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The code does use 16bit sized arrays later so larger deltas would not work
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 48b3117844)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
compilers doing DCE at -O0 do not necessarily understand "complex" boolean expressions
Build succeeds with this change, this was the only failure
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit fa8fd0808f)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
It should not be a value larger than the number of streams we have,
or it will cause invalid reads and/or SIGSEGV.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit ec07efa700)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This seems to be non-optional, and if the muxer is run without it,
strlen() is run on NULL, causing a segfault.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit cbd3a68f3e)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Because write_packet() fakely writes packets to muxer by queueing
them when muxer hasn't been initialized, it should also increment
frame_number fakely.
This is required because code in do_streamcopy() rely on
frame_number.
Should fix Ticket6227
Reviewed-by: James Almer <jamrial@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
(cherry picked from commit c4be288fdb)
There appears to be no need to treat interlaced videos differently,
also that code is flawed, as for at least one input cur_field would
be always 0.
Fixes ticket #6344.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
(cherry picked from commit ac30754a14)
The av_log() is done outside the lock, but this way the accesses to the
field (reads and writes) are always protected by a mutex. The av_log()
is not run inside the lock context because it may involve user callbacks
and doing that in performance-sensitive code is probably not a good idea.
This should fix occasional tsan warnings when running fate-h264, like:
WARNING: ThreadSanitizer: data race (pid=10916)
Write of size 4 at 0x7d64000174fc by main thread (mutexes: write M2313):
#0 update_context_from_user src/libavcodec/pthread_frame.c:335 (ffmpeg+0x000000df7b06)
[..]
Previous read of size 4 at 0x7d64000174fc by thread T1 (mutexes: write M2311):
#0 ff_thread_await_progress src/libavcodec/pthread_frame.c:592 (ffmpeg+0x000000df8b3e)
(cherry picked from commit 2e664b9c1e)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This tries to handle cases where separate invocations of decode_frame()
(each running in separate threads) write to respective fields in the
same AVFrame->data[]. Having per-field owners makes interaction between
readers (the referencing thread) and writers (the decoding thread)
slightly more optimal if both accesses are field-based, since they will
use the respective producer's thread objects (mutex/cond) instead of
sharing the thread objects of the first field's producer.
In practice, this fixes the following tsan-warning in fate-h264:
WARNING: ThreadSanitizer: data race (pid=21615)
Read of size 4 at 0x7d640000d9fc by thread T2 (mutexes: write M1006):
#0 ff_thread_report_progress pthread_frame.c:569 (ffmpeg:x86_64+0x100f7cf54)
[..]
Previous write of size 4 at 0x7d640000d9fc by main thread (mutexes: write M1004):
#0 update_context_from_user pthread_frame.c:335 (ffmpeg:x86_64+0x100f81abb)
(cherry picked from commit 083300bea9)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes tsan warnings like this in fate-vp8-test-vector-007:
WARNING: ThreadSanitizer: data race (pid=65909)
Write of size 4 at 0x7d8c0000e088 by thread T1:
#0 vp8_decode_mb_row_sliced vp8.c:2519 (ffmpeg:x86_64+0x100995ede)
[..]
Previous write of size 4 at 0x7d8c0000e088 by thread T2:
#0 vp8_decode_mb_row_sliced vp8.c:2519 (ffmpeg:x86_64+0x100995ede)
(cherry picked from commit fed92adbb3)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes tsan warnings like this in fate-vp8-test-vector-007:
WARNING: ThreadSanitizer: data race (pid=3590)
Write of size 4 at 0x7d8c0000e07c by thread T2:
#0 decode_mb_row_no_filter src/libavcodec/vp8.c:2330 (ffmpeg+0x000000ffb59e)
[..]
Previous write of size 4 at 0x7d8c0000e07c by thread T1:
#0 decode_mb_row_no_filter src/libavcodec/vp8.c:2330 (ffmpeg+0x000000ffb59e)
(cherry picked from commit 9a54c6f243)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes the following tsan warning when running fate-vsynth_lena-ffvhuff:
WARNING: ThreadSanitizer: data race (pid=6484)
Write of size 8 at 0x7d64000154b8 by main thread (mutexes: write M1331):
#0 update_context_from_user src/libavcodec/pthread_frame.c:331 (ffmpeg+0x000000dca887)
[..]
Previous read of size 8 at 0x7d64000154b8 by thread T2 (mutexes: write M1334):
#0 draw_slice src/libavcodec/huffyuvdec.c:857 (ffmpeg+0x000000bcc86f)
(cherry picked from commit 7c7e7c44a6)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes tsan warnings in fate-apng:
WARNING: ThreadSanitizer: data race (pid=51230)
Read of size 4 at 0x7d50000042fc by main thread (mutexes: write M1000):
#0 frame_copy_props frame.c:302 (ffmpeg:x86_64+0x1019a35d6)
[..]
Previous write of size 4 at 0x7d50000042fc by thread T1 (mutexes: write M997):
#0 decode_idat_chunk pngdec.c:708 (ffmpeg:x86_64+0x100f5562a)
(cherry picked from commit eff2861a75)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes a reported (but false) race condition in tsan for fate-apng:
WARNING: ThreadSanitizer: data race (pid=6274)
Read of size 4 at 0x7d680001ec78 by main thread (mutexes: write M1338):
#0 update_thread_context src/libavcodec/pngdec.c:1456 (ffmpeg+0x000000dacf0c)
[..]
Previous write of size 4 at 0x7d680001ec78 by thread T1 (mutexes: write M1335):
#0 decode_idat_chunk src/libavcodec/pngdec.c:737 (ffmpeg+0x000000dae951)
(cherry picked from commit 478f1c3d5e)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Values from subsequent values are guaranteed to be identical (since
poc and nal_unit_type are checked to be the same between slices), so
this doesn't affect output in any way, but does resolve the remaining
reported race conditions (by tsan) in fate-hevc.
In practice, this fixes tsan warnings like this:
WARNING: ThreadSanitizer: data race (pid=25334)
Read of size 4 at 0x7d9c0001adcc by main thread (mutexes: write M1386):
#0 hevc_update_thread_context src/libavcodec/hevcdec.c:3310 (ffmpeg+0x000000b41c7c)
[..]
Previous write of size 4 at 0x7d9c0001adcc by thread T1 (mutexes: write M1383):
#0 hls_slice_header src/libavcodec/hevcdec.c:596 (ffmpeg+0x000000b43a22)
(cherry picked from commit 1f50baa2b2)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Otherwise the thread may still be in the middle of decoding a previous
frame, which would effectively trigger a race condition on any field
concurrently read and written.
In practice, this fixes tsan warnings like the following:
WARNING: ThreadSanitizer: data race (pid=17380)
Write of size 4 at 0x7d64000160fc by main thread:
#0 update_context_from_user src/libavcodec/pthread_frame.c:335 (ffmpeg+0x000000dca515)
[..]
Previous read of size 4 at 0x7d64000160fc by thread T2 (mutexes: write M1821):
#0 ff_thread_report_progress src/libavcodec/pthread_frame.c:565 (ffmpeg+0x000000dcb08a)
(cherry picked from commit 1269cd5b6f)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Should fix tsan warnings in fate-fifo-muxer-h264/wav:
WARNING: ThreadSanitizer: data race (pid=26552)
Write of size 4 at 0x000001e0d7c0 by main thread:
#0 transcode_init src/ffmpeg.c:3761 (ffmpeg+0x00000050ca1c)
[..]
Previous read of size 4 at 0x000001e0d7c0 by thread T1:
#0 decode_interrupt_cb src/ffmpeg.c:460 (ffmpeg+0x0000004fde19)
(cherry picked from commit 76d8c77430)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is how the ref list manager links bitstream IDs to H264Picture/Ref
objects, and is local to the producer thread. There is no need for the
consumer thread to know the bitstream IDs of its references in their
respective producer threads.
In practice, this fixes tsan warnings when running fate-h264:
WARNING: ThreadSanitizer: data race (pid=19295)
Read of size 4 at 0x7dbc0000e614 by main thread (mutexes: write M1914):
#0 ff_h264_ref_picture src/libavcodec/h264_picture.c:112 (ffmpeg+0x0000013b3709)
[..]
Previous write of size 4 at 0x7dbc0000e614 by thread T2 (mutexes: write M1917):
#0 build_def_list src/libavcodec/h264_refs.c:91 (ffmpeg+0x0000013b46cf)
(cherry picked from commit e72690b18d)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
I'm hoping that this will address the remaining tsan fate-h264 issues:
WARNING: ThreadSanitizer: data race (pid=24478)
Read of size 8 at 0x7dbc0001c828 by main thread (mutexes: write M3243):
#0 ff_h264_ref_picture src/libavcodec/h264_picture.c:107 (ffmpeg+0x0000013b78d8)
[..]
Previous write of size 1 at 0x7dbc0001c82e by thread T2 (mutexes: write M3245):
#0 ff_h264_direct_ref_list_init src/libavcodec/h264_direct.c:137 (ffmpeg+0x000001382c93)
But I'm not sure because I haven't been able to reproduce locally.
(cherry picked from commit 7f05c5cea0)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Experimental VP9 support was added to the muxer recently.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
(cherry picked from commit d36a3f5a78)
The custom callback can cause significant CPU usage on Windows for some large
files with many index entries for some reason.
v2: Move check after parsing options.
Signed-off-by: Marton Balint <cus@passwd.hu>
This changes nothing but is nicer looking as this checks rlen
Maybe this helps coverity remove CID1397743
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit c94d551ea7)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This should help coverity see that the issues this leads to cannot occur
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 8dd0c12648)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Doesn't work yet with slice threading and won't work with AMV.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 03eb0515c1)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When coding lossless jpeg the priv context will be pointing to LJpegEncContext
rather than MpegEncContext, which the function expects.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
(cherry picked from commit 2c9be3882a)
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* commit 'e18ba2dfd2d19aedc8afccf011d5fd0833352423':
hwcontext_dxva2: make sure the sw frame format is the right one during transfer
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '9d7026574bbbe67d004a1c32911da75375692967':
hwcontext_dxva2: fix handling of the mapping flags
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '0d3176e32f351d18d6174d8b05796829a75a4c6b':
hwcontext_dxva2: do not assume the destination format during mapping is always the right one
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '0a4b9d0ccd10b3c39105f99bd320f696f69a75a2':
hlsenc: Add encryption support
This commit is a noop, see 907ac20aa2
Note that this commit differs from our encryption support in various
ways so it may need some adjustments in the future.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'bb265b764a055f2dc576b9aec62460d9580868f4':
examples/transcode_aac: Drop pointless return value const qualifier
This commit is a noop, the function doesn't exist in FFmpeg anymore
since e181e2909b.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'bfe92dfe60f601b3f20a918ffcc0acdf40a5955c':
Ignore all generated example binaries
This commit is a noop, the .gitignore was updated during the merges of
these examples.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '3cc3463f306f425f76bd962755df1132eeac6dfa':
avisynth: Support pix_fmts added to AviSynth+
This commit is mostly a noop, see
92916e8542.
Cosmetics and a small fix are merged.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'aaae59700f7fc10fd80cb93b38c5d109900872d9':
avisynth: Simplify the pix_fmt check for the newer AviSynth API
This commit is a noop, see 0ed5c3ce81
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'bcefafa226dcda23d4d9af9601d19389cb918a5b':
avisynth: Fix setting stream timebase
This commit is a noop, see 8009a1f1fd
Merged-by: Clément Bœsch <u@pkh.me>
* commit '481ff3cf018811ba3235f1c236e970f32a6300b9':
fate: Add h264 and hevc extradata reload tests
Only the HEVC part is merged, see 00c8079816
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'd5d62ce6d643de704e7bd62a2375e6391c0ffb9a':
mov: Fix identity matrix boolean logic
This commit is a noop, see 7010ebdf1f
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'e7ae8f7a715843a5089d18e033afb3ee19ab3057':
aarch64: vp9: loop filter: replace 'orr; cbn?z' with 'adds; b.{eq,ne};
This commit is a noop, see e7ae8f7a71
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'd7595de0b25e7064fd9e06dea5d0425536cef6dc':
aarch64: vp9: use alternative returns in the core loop filter function
This commit is a noop, see 62ea07d797
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'e17567a831dede1f24e3a1a4c305a93012d7a8ce':
libilbc: support for latest git of libilbc
This commit is a noop, see 59af5383c1
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'ffe89e1edb0281ff65d1bda88253784e9283b717':
configure: Move mjpeg_vaapi_decoder dependency declarations to the right place
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'fbd1f7639d0142c391bec85d1d840c835210843f':
af_asyncts: Use llabs instead of labs for 64-bit variable
This commit is a noop, see a8fe8d6b4a
Merged-by: Clément Bœsch <u@pkh.me>
* commit '182cf170a544bce069c8690c90b49381150a1f10':
vp8: Return stream format information from parser
Return codes are adjusted to consume the whole packet in case of error
as the API does not allow returning AVERROR codes (a negative return
value is valid).
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'b6582b29277e00e5d49f400e58beefa5a21d83b8':
qsv: Add VC-1 decoder
See fb57bc6c34.
Merged for cosmetic purposes to reduce differences with libav.
Merged-by: James Almer <jamrial@gmail.com>
* commit 'fea4dc05b41f5465bedc786b67966f204ec6150c':
vc1: Return stream format information from parser
This commit is a noop, see 4df6605da7
Merged-by: James Almer <jamrial@gmail.com>
* commit '0940b748bdba36c4894fc8ea6be631d821fdf578':
qsvdec: Only warn about unconsumed data if it happens more than once
Merged-by: James Almer <jamrial@gmail.com>
* commit '030d84fa2e35af0e77516735de35bf1a52371c86':
qsvdec: Pass field order information to libmfx
qsvdec: Pass the correct profile to libmfx
These commits are a noop, see 1f26a231bb
Merged-by: James Almer <jamrial@gmail.com>
* commit '3297577f3eac1c87d48dedd527942de2bd28e7a5':
mpegvideo: Return correct coded frame sizes from parser
This commit is a noop, see 309fe16a12
Merged-by: James Almer <jamrial@gmail.com>
* commit '3c9546dfafcdfe8e7860aff9ebbf609318220f29':
aarch64: vp9: Add NEON itxfm routines
This commit is a noop, see f43079e11c
Merged-by: James Almer <jamrial@gmail.com>
* commit '01348e411f962f5e4605d649fc9a47a54587ba8e':
avconv_opt: Consistently iterate through hwaccels array in all cases
Merged-by: James Almer <jamrial@gmail.com>
* commit '8ddfa5ae5ef64a25dd087d74954ebdb9081f0d67':
vf_drawtext: Drop wrong void* cast
This commit is a noop, see 4c96985af1
Merged-by: James Almer <jamrial@gmail.com>
* commit 'fcbdd605b5409103c3f4bfa063ea270f2229b125':
nut: Use correct function pointer casts instead of void*
This commit is a noop. Casts are not needed.
Merged-by: James Almer <jamrial@gmail.com>
* commit '3b50dbc51fb0978d09c1a5b83d4bf5a59d170e1e':
ratecontrol: Use correct function pointer casts instead of void*
Merged-by: James Almer <jamrial@gmail.com>
* commit 'dd299a2d6d4d1af9528ed35a8131c35946be5973':
arm: vp9: Add NEON loop filters
This commit is a noop, see 6bec60a683
Merged-by: James Almer <jamrial@gmail.com>
* commit 'f7d183f08472e566a2e6b62a80e200a12670ed0e':
libxvid: Check return value of write() call
This commit is a noop, see 25f35df115
Merged-by: James Almer <jamrial@gmail.com>
* commit '12db2832e41aa71b5903ef7fa5c59c5473ded2c5':
libxvid: Require availability of mkstemp()
This commit is a noop. Our libxvid wrapper doesn't use mkstemp().
Merged-by: James Almer <jamrial@gmail.com>
* commit 'a67ae67083151f2f9595a1f2d17b601da19b939e':
arm: vp9: Add NEON itxfm routines
This commit is a noop, see b4dc7c341e
Merged-by: James Almer <jamrial@gmail.com>
* commit '0b37cd09a67c3ba4db044404b99c65a32b4ad932':
checkasm: add vp9dsp.itxfm_add tests.
This commit is a noop, see 0b227c6d47
Merged-by: James Almer <jamrial@gmail.com>
* commit 'fd0fae60372cddbe0bec8830d07e760195f80bad':
pthread_frame: Unreference hw_frames_ctx on per-thread codec contexts
This commit is a noop, see fb69a8e1f1
Merged-by: James Almer <jamrial@gmail.com>
* commit '11623217e3c9b859daee544e31acdd0821b61039':
arm: vp9mc: Use a different helper register for PIC loads
This commit is a noop, see 68caef9d48
Merged-by: James Almer <jamrial@gmail.com>
* commit '824e8c284054f323f854892d1b4739239ed1fdc7':
arm: Clear the gp register alias at the end of functions
This commit is a noop, see 86c5a23ee5
Merged-by: James Almer <jamrial@gmail.com>
* commit '6a62795d4051f435a9a2c59395d96913693922f8':
aarch64: h264idct: Use the offset parameter to movrel
This commit is a noop, see da5c8284c0
Merged-by: James Almer <jamrial@gmail.com>
* commit '557c1675cf0e803b2fee43b4c8b58433842c84d0':
arm: vp9mc: Minor adjustments from review of the aarch64 version
This commit is a noop, see 68caef9d48
Merged-by: James Almer <jamrial@gmail.com>
* commit '383d96aa2229f644d9bd77b821ed3a309da5e9fc':
aarch64: vp9: Add NEON optimizations of VP9 MC functions
This commit is a noop, see 1f7801c2bc
Merged-by: James Almer <jamrial@gmail.com>
* commit 'c44a8a3eabcd6acd2ba79f32ec8a432e6ebe552c':
aarch64: Add an offset parameter to the movrel macro
This commit is a noop, see 7fe898dbb9
Merged-by: James Almer <jamrial@gmail.com>
* commit 'a4cfcddcb0f76e837d5abc06840c2b26c0e8aefc':
vp9: Make the subpel filters non-static
This commit is a noop.
Merged-by: James Almer <jamrial@gmail.com>
* commit '98cae966c77875e26c5958206a6cfe7eba6269e8':
matroskaenc: write updated STREAMINFO metadata for FLAC streams if available
This commit is a noop, see 8c1342e631
Merged-by: James Almer <jamrial@gmail.com>
* commit 'f4bf236338f6001736a4784b9c23de863057a583':
matroskaenc: fix muxing AAC streams when using aac_adtstoasc bsf
This commit is a noop. aac_adtstoasc bsf sends its extradata update
straight to codecpar->extradata.
This behavior violates the bsf API and should be fixed so this change
may then be applied.
Merged-by: James Almer <jamrial@gmail.com>
* commit '84f225684cd389747907381122c073aa1c8b6bf1':
pthread_frame: properly propagate the hw frame context across frame threads
This commit is a noop, see 98f89d615b.
Merged-by: James Almer <jamrial@gmail.com>
* commit '72a19f4013ec2c7f8581416f8ad4bf81df163fb6':
mpegaudiodsp: aarch64: Adjust function prototype after 2caa93b813
Merged-by: James Almer <jamrial@gmail.com>
* commit 'c78495d1cdac6dd13786a7e5571b606604a360bd':
configure: Log name and parameters of all helper functions where it makes sense
Merged-by: James Almer <jamrial@gmail.com>
* commit '831005b2302cbeb377e3f00fd18c78928bcec185':
configure: Log correct test name and use correct filter when testing objective C flags
Merged-by: James Almer <jamrial@gmail.com>
* commit 'fe7bc1f16abaefe66d8a20f734ca3eb8a4ce4d43':
configure: Do not unconditionally check for (and enable) xlib
Merged-by: James Almer <jamrial@gmail.com>
The value must be identical between slices, since mbaff depends on
picture_structure and sps, both of which are checked to be identical
to the first slice before this point.
In practice, this silences some tsan warnings in fate-h264.
This fixes race conditions reported by tsan in fate-lagarith. The races
were because each thread's LagarithContext::avctx was set to the first
thread's AVCodecContext.
Otherwise all thread's private contexts have the avctx pointer set to
the AVCodecContext of the first thread, which means all writes to
ctx->avctx->* (in e.g. read_header) are effectively race conditions.
Fixes fate-dnxhd under tsan.
Adding an MOV format option to turn on/off the editlist supporting code, introduced in ca6cae73db
Signed-off-by: Sasi Inguva <isasi@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit 'd1a91ebe4990001e0800ee9ac54ed2207e4f56ff':
configure: Print list of enabled programs
This commit is mostly a noop, see 832b4a4a43
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'a3483f79933e8f1fd99d524e3218688e14c59150':
avconv: Drop stray leftover debug output
This commit is a noop, see a283665693
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '59d2b00d201935c16408a2917957d89a170fe58f':
configure: Add --quiet command line parameter to suppress informative output
The license assignment is moved out of the quiet condition to make sure
it ends up in config.h
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'de6e2ff3ddf506d5b487c2f226cea73e095ad6d1':
mov: Read multiple stsd from DV
This commit is a noop, see a765ba647d
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '47a795727f5433f5238a8a244cf181f61ea5af2c':
hevc: Support extradata changes from multiple stsd
This commit is a noop, see 25fcbf7a84
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '2fe30b4743c0f4c3bdf37b91ae534cafa85e4036':
hevc: Allow parsing external extradata buffers
This commit is a noop as it matches FFmpeg state.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '5be21531119d7a97ebc706800d1608272ee5a507':
hevc: Move hevc_decode_extradata before frame decoding
This commit is a noop, hevc_decode_extradata() is already above
hevc_decode_frame().
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'bed2c4b2652b1412b584e5545d6dd2ef8c613be0':
lavc: Add hevc main10 profile to avconv cli
This commit is a noop, see 271afd632f
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '0361e4dcb4d394c88c33364415a3b8fe315b67d1':
h264_qpel: x86: Move function with only one instance out of template macro
Note: warning is present with clang.
Merged-by: Clément Bœsch <cboesch@gopro.com>
This is more robust in case some change or corner case causes them to be
dereferenced before being set
Fixes CID1396274, CID1396275
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '88f0cf8cd30c8ea283430e6710a7bd98bb9c0301':
avplay: Correct function pointer assignments in options array
This commit is a noop, see a9a1bc56ab
Merged-by: Clément Bœsch <u@pkh.me>
* commit '943533d64c7fa7a1b2fc9559e67652c349d21d51':
avconv: Correct function pointer assignments in options array
This commit is a noop, see 4c96985af1
Merged-by: Clément Bœsch <u@pkh.me>
* commit '43de8b328b62cf21ec176c3989065168da471a5f':
lzf: update pointer p after realloc
This commit is a noop, see bb6a7b6f75
Merged-by: Clément Bœsch <u@pkh.me>
* commit '00aeedd84105a17f414185bd33ecadebeddb3a27':
qsv{dec,enc}: use a struct as a memory id with internal memory allocator
Merged-by: Mark Thompson <sw@jkqxz.net>
* commit '404e51478ecad060249d5b9bee6ab39a8a9d8c1c':
qsv{dec,enc}: always use an internal mfxFrameSurface1
Minor fixups for differences in the QSV encoder because of a53cc.
Merged-by: Mark Thompson <sw@jkqxz.net>
* commit '8ea15afbf2c1ec89b5d4bac1f0b8345e4b906a5d':
hwcontext_qsv: transfer data through the child context when VPP fails
Merged-by: Mark Thompson <sw@jkqxz.net>
* commit 'b91ce4860054430d3712deb0d9487cac2fcb7d68':
hwcontext_qsv: do not fail when download/upload VPP session creation fails
Merged-by: Mark Thompson <sw@jkqxz.net>
This is an example, people will copy and use this. The maximum supported is quite
unreasonable as a default choice
Reviewed-by: Steven Liu <lingjiujianke@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit 'fabfbfe5710050812147f93a351a53fdda56ff8c':
dxva2: fix surface selection when compiled with both d3d11va and dxva2
This commit is a noop, see 153b36fc62
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'db0b3dccb3842de134721e8d5c275f56d384340d':
libx265: Add option to force IDR frames
This commit is a noop, see 8a8902f221
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '3cba09e5228c889d63814dc43bc68f15c9dbac77':
x86: Drop stray semicolons after function definitions
This commit is a noop, they are already fixed in FFmpeg.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'd1ef1b9eaa45043ea5df5a004fb37243e05da61d':
configure: Silence lld-link when getting the version number
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '392caa65df3efa8b2d48a80f08a6af4892c61c08':
arm: vp9mc: Insert a literal pool at the middle of the file
This commit is a noop, see 68caef9d48
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '99434f4df81b6801b2b535d5b9143305595784f6':
float_dsp: Have implementation match function pointer prototype
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '6354957a95022864746180525680cca872ab0e0a':
dnxhdenc: Have function pointer prototype match implementation
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'c778eb15b89d875cb246b18f65b3b4321cb1e7d6':
pixblockdsp: Have function pointer prototype match implementation
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '99ddeddc7fc996c0c1e842112928490e78542bd5':
ituh263dec: Have function signature match across declaration and definition
This commit is a noop, see 2d2b363c65
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '67c65e461cb073d61ffbc78845d4a3d8f14bf481':
vf_hwupload_cuda: Fix build error
This commit is a noop, see 78e871ebbc
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '13fcdfb976038f63b9f753e2ebcc8e04d7c7abc2':
svq3: Drop unused function dctcoef_get()
This commit is a noop, see 1e298e7724
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'ee59f0540875ab42496af2aacddd942757707683':
intrax8: Have function signature match across declaration and definition
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '1a469a5e423bdad779b8534247dea8cc86169b88':
options_table: Remove a now unnecessary include of config.h
This commit is a noop, see 76f43cbe26
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'ffbd1d2b0002576ef0d976a41ff959c635373fdc':
arm: vp9: Add NEON optimizations of VP9 MC functions
This commit is a noop, see 68caef9d48
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '2e55e26b40e269816bba54da7d0e03955731b8fe':
vp9: Flip the order of arguments in MC functions
This commit is a noop, it was made to match our prototypes.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '7e2561fa8313982aa21f7657953eedeeb33b210d':
lavfi: Use ff_get_video_buffer in all filters using hwframes
vf_hwupload_cuda: Fix build error
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
* commit '7433feb82f75827884d909de34d341a1c4401d4a':
lavfi: Make default get_video_buffer work with hardware frames
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
This conflict with the DJGPP libc math.h which includes a PI macro (to
M_PI).
We cannot make DJGPP POSIX only (using -D_POSIX_SOURCE) to avoid this
kind of symbols conflicts due to the lack of both posix_memalign and
memalign (DJGPP non standard function) in that POSIX mode. We currently
rely on memalign for aligned heap allocation.
This conflict with the DJGPP libc which includes a pow2 function¹
We cannot make DJGPP POSIX only (using -D_POSIX_SOURCE) to avoid this
kind of symbols conflicts due to the lack of both posix_memalign and
memalign (DJGPP non standard function) in that POSIX mode. We currently
rely on memalign for aligned heap allocation.
[1]: http://www.delorie.com/djgpp/doc/libc-2.02/libc_536.html
This conflict with the DJGPP libc which includes a pow2 function¹
We cannot make DJGPP POSIX only (using -D_POSIX_SOURCE) to avoid this
kind of symbols conflicts due to the lack of both posix_memalign and
memalign (DJGPP non standard function) in that POSIX mode. We currently
rely on memalign for aligned heap allocation.
[1]: http://www.delorie.com/djgpp/doc/libc-2.02/libc_536.html
* commit '39cea6570c11a49b64b2ec8d71e218db03b4c742':
aactab: Move extern keyword to the front of array declarations
Merged-by: Clément Bœsch <u@pkh.me>
* commit '85baef4ff1512bcc2544928bfa5f42072903a691':
vf_drawtext: Move static keyword to beginning of variable declaration
This commit is mostly a noop, see:
d9e2aceb7f6d7aa437e1
Merged-by: Clément Bœsch <u@pkh.me>
* commit '636515c324facaa14ccd8ab0732740a240a31ba9':
examples/decode_video: remove a stray unrelated comment
This commit is a noop, see 8c4753f7f5
Merged-by: Clément Bœsch <u@pkh.me>
* commit '5b4d7ac7ae5d821cfa6ab89f8eab4d31851ef32c':
examples/encode_video: use the AVFrame API for allocating the frame
Merged-by: Clément Bœsch <u@pkh.me>
* commit '7b1f03477f1a43d2261fbd83e50a4ad90c7f806d':
examples/avcodec: split the remaining two examples into separate files
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'f5df897c4b61985e3afc89ba1290649712ff438e':
examples/avcodec: split audio decoding into a separate example
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'f76698e759a08e8d3b629c06edb0439f474e7fee':
examples/encode_audio: use the AVFrame API for allocating the data
Merged-by: Clément Bœsch <u@pkh.me>
* commit '40aaa8dadfd1c69ff4460d04750e1403b5535a6d':
examples/avcodec: split audio encoding into a separate example
Merged-by: Clément Bœsch <u@pkh.me>
Get rid of the "ret" variable, and always use err. Report the packet as
consumed if err is unset. This should be equivalent to the old code,
which obviously required err=0 for p->result>=0 (and otherwise,
p->result must have had the value err was last set to). The code block
added by commit 32a5b63126 is also not needed anymore, because the new
code strictly returns err if it's >=0.
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Intra-only codecs should either be able to read these items from the
bitstream, or they should be set upon codec initialization. In both
cases, syncing these items at runtime is unnecessary.
In practice, this fixes race conditions for decoders that read these
values from the bitstream.
If ret is NULL, a dummy common holder is created to hold *all* the
parallel function returns, which gets written concurrently. This commit
simplify the whole logic by simply not writing to that holder when not
set.
Needed for the C+11 atomics. Also change add_cxxflags to check_cxxflags.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
refer to SPEC:
Annex E. The FLV File Format said:
E.3 TheFLVFileBody have a table:
Field Type Comment
PreviousTagSize0 UI32 Always 0
Reviewed-by: Bela Bodecs <bodecsb@vivanet.hu>
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
adding demuxer and other logs should be easy
This forces single threaded decoding for simplicity
It also requires pthreads, this could be avoided either with
some lockless tricks or simply by assuming av_log would never be called from
another thread.
Fixes Ticket5521
Previous version reviewed by Stefano
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '064f19f39e2f17927278c6ad8fe884a5b92310d6':
avconv: support parsing bitstream filter options
This commit is a noop, see 5ef1959080
Merged-by: James Almer <jamrial@gmail.com>
* commit 'ecd2ec69ce10e13f6ede353d2def7ce9f45c1a7d':
mov: Evaluate the movie display matrix
This commit is a noop, see 7010ebdf1f
Merged-by: James Almer <jamrial@gmail.com>
* commit 'b90c8a3d08e3f9ad4de1253376d2d1d93abb8b8c':
fate: Add tests for mov display matrix
Adapted to use ffprobe -show_entries
Merged-by: James Almer <jamrial@gmail.com>
* commit '7d308bf84bda78d47c01439ff625bb06624991a7':
avprobe: Add -show_stream_entry to get a single stream property
This commit is a noop, we have a generic -show_entry option.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '218ed7250c103a975e874fb16e8e5941f4cbe223':
openssl: Allow newer TLS versions than TLSv1
This commit is a noop, see e8634fb92e
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'dad7514f9ec8a8c5e44d70fcfbbcedeff16f7e13':
xcb: Add all the libraries to the link line explicitly
This commit is a noop. It appears we already link against the xcb shape
library since 54170a33c2.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'c541a44e029e8a4f21db028c34fee3ad1c10a409':
Revert "rtmpproto: Don't include a client version in the unencrypted C1 handshake"
Merged-by: Clément Bœsch <u@pkh.me>
* commit '801ac7156d3efb8e088fb6024f568eb36a293887':
qsv: Be informative when reporting that no data has been consumed
Merged-by: Clément Bœsch <u@pkh.me>
* commit '30015305f3b523ed7640f2c3c58b017140533c58':
Use avpriv_request_sample() where appropriate
Only the roqvideo chunk is merged because we actually support 24bpp
flic, see 5781c983d8.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '07cac07c0c0360d67e73a7472214c79d6c520a4b':
dash: Use correct ISO C scanf conversion specifier
This commit is a noop: the use of SCN (scanf) format is wrong here.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '3ec6f855d0f21d90a0494fb798c4cf203fdb3db0':
srt: Adjust signedness of sscanf format strings
This commit is a noop, a different fix is included in the big -Wformat
patch under review
(http://ffmpeg.org/pipermail/ffmpeg-devel/2017-March/209239.html)
Merged-by: Clément Bœsch <u@pkh.me>
* commit '7a2b2b6a92c4b528ecb640790eca0aa790d858f4':
dxtory: Drop nonsense ISO C printf conversion specifiers for standard types
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'c454dfcff90f0ed39c7b0d4e85664986a8b4476c':
Use ISO C printf conversion specifiers where appropriate
This commit is a noop, an equivalent patch is currently under review on
the mailing-list: http://ffmpeg.org/pipermail/ffmpeg-devel/2017-March/209239.html
Merged-by: Clément Bœsch <u@pkh.me>
Could lead to random behavior. This possibly happened due to commit
32a5b63126. This should/could probably be simplified, but for no apply
a minimal fix to quell the errors.
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
async_mutex has is used in a very strange but intentional way: it is
locked by default, and unlocked only in regions that can be run
concurrently.
If the user was calling API functions to the same context from different
threads (in a safe way), this could unintentionally unlock the mutex on
a different thread than the previous lock operation. It's not allowed by
the pthread API.
Fix this by emulating a binary semaphore using a mutex and condition
variable. (Posix semaphores are not available on all platforms.)
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
The old "API" that signaled rotation as a metadata value has been
replaced by DISPLAYMATRIX side data quite a while ago.
There is no reason to make muxers/demuxers/API users support both. In
addition, the metadata API is dangerous, as user tags could "leak" into
it, creating unintended features or bugs.
ffmpeg CLI has to be updated to use the new API. In particular, we must
not allow to leak the "rotate" tag into the muxer. Some muxers will
catch this properly (like mov), but others (like mkv) can add it as
generic tag. Note applications, which use libavformat and assume the
old rotate API, will interpret such "rotate" user tags as rotate
metadata (which it is not), and incorrectly rotate the video.
The ffmpeg/ffplay tools drop the use of the old API for muxing and
demuxing, as all muxers/demuxers support the new API. This will mean
that the tools will not mistakenly interpret per-track "rotate" user
tags as rotate metadata. It will _not_ be treated as regression.
Unfortunately, hacks have been added, that allow the user to override
rotation by setting metadata explicitly, e.g. via
-metadata:s:v:0 rotate=0
See references to trac #4560. fate-filter-meta-4560-rotate0 tests this.
It's easier to adjust the hack for supporting it than arguing for its
removal, so ffmpeg CLI now explicitly catches this case, and essentially
replaces the "rotate" value with a display matrix side data. (It would
be easier for both user and implementation to create an explicit option
for rotation.)
When the code under FF_API_OLD_ROTATE_API is disabled, one FATE
reference file has to be updated (because "rotate" is not exported
anymore).
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Squelches the following compiler warnings:
libavcodec/opusenc.c:1051:16: warning: format specifies type 'long' but
the argument has type 'long long' [-Wformat]
avctx->bit_rate/1000, clipped_rate/1000);
^~~~~~~~~~~~~~~~~~~~
libavcodec/opusenc.c:1051:38: warning: format specifies type 'long' but
the argument has type 'long long' [-Wformat]
avctx->bit_rate/1000, clipped_rate/1000);
^~~~~~~~~~~~~~~~~
This was skipped in c17563c5d3 because
it depended on the filter setup merge, but was forgotten after that
actually happened.
Fixes hwaccel fate for stream size change tests.
Changes to the parsing code originally committed to mpegvideo_parser.c
in 73fb23dc5a.
Required by some samples, like PVA_test-partial.pva
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
* commit 'fbe425c8d29e473a8f69ae2dc52b1a10b77f3b44':
hap: Adjust printf length modifiers to match variable types
This commit is a noop, see 5a51ca2da7
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'f22363c72968f1a1fc4881d8695ec7068b0aa03c':
openssl: Avoid double semicolons after the GET_BIO_DATA macro
This commit is a noop, see fc83de7e1d
Merged-by: Clément Bœsch <u@pkh.me>
* commit '99aeae20de4d09ea313fdc619d4e2df825155e62':
scale_npp: fix passthrough mode
This commit is a noop, see f524275ef9
Merged-by: Clément Bœsch <u@pkh.me>
* commit '754b20d7ebccbe8d316b12128c8cb433d5a516ac':
vaapi_h264: fix RefPicList[] field flags.
This commit is a noop, see 88325c2e0b
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'ee050797664c7c74cae262ffab05006b55d47a11':
openssl: Support version 1.1.0.
This commit is mostly a noop, see 798c6ecce5
Included the simplifications by Martin Storsjö and fixed the
GET_BIO_DATA() macro to prevent a warning after the simplifications.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '2f806622e1270d3ed1d41a53049a19673dafbe70':
bktr: Use memset(0) instead of zero initialization for struct sigaction
Merged-by: Clément Bœsch <u@pkh.me>
* commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b':
hevc: x86: Add add_residual() SIMD optimizations
See a6af4bf64d
This merge is only cosmetics (renames, space shuffling, etc).
The functionnal changes in the ASM are *not* merged:
- unrolling with %rep is kept
- ADD_RES_MMX_4_8 is left untouched: this needs investigation
Merged-by: Clément Bœsch <u@pkh.me>
* commit '043b0b9fb1481053b712d06d2c5b772f1845b72b':
Replace leftover uses of -aframes|-dframes|-vframes with -frames:a|d|v
The merge also includes all our own occurences.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '4b07ebf1eb13561492f7e3c30a67f34415016b3e':
mov: Update colr values
Mostly noop, see a3cab3d433
Only the use of av_color_{primaries,transfer,space}_name() is merged.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '59c90097a0eff0dc81fbec15b8900c929859d1e7':
hevc: factor out a repeated condition
This commit is a noop. It doesn't apply as our codebase has diverged
too much.
Merged-by: James Almer <jamrial@gmail.com>
* commit '096a8effa3f8f3455292c958c3ed07e798def7bd':
lavf: check that the codec is supported by extract_extradata
This commit is a noop. The code it changes was reverted.
See 40fa9d416a
Merged-by: James Almer <jamrial@gmail.com>
This reverts commit 1c193ac1f9, reversing
changes made to 7ebc9f8df4.
Several FATE tests started failing after this merge, so it's reverted
until it can be properly fixed.
* commit '788544ff0ed6fe67fda80ad6d3a0796ace035584':
audiodsp: x86: Remove pointless header file
This commit is a noop, see 6ec3dc97fc
Merged-by: James Almer <jamrial@gmail.com>
* commit 'b89804da9bad2d94dd95bf20ac6187447e9c17e9':
x86: videodsp: Add parentheses to expression to work around warning
Merged-by: James Almer <jamrial@gmail.com>
* commit 'da4f8c8e35a867f2d9fed0fb75e16c81ab968637':
fate: Update filter-pixfmts-scale gbrap12le hash missing from be9dba5c8a
This commit is a noop.
Merged-by: James Almer <jamrial@gmail.com>
* commit 'be9dba5c8abc6ecf0b8ee4ccb11c7850327fcf8d':
swscale: Properly load alpha for planar rgb
This commit is a noop, see
4170a44bbcdf36257a53
Merged-by: James Almer <jamrial@gmail.com>
* commit '58224dc5f3d4fea40a8d55cca87291a960c11622':
ppc: avcodec: Drop silly "_ppc" suffixes from files in ppc subdirectories
Merged-by: James Almer <jamrial@gmail.com>
* commit '0cf86fabfa5820596cca2cfead63c6f8df76c3f2':
vaapi_encode: Write sequence header as extradata
This commit is a noop. It has already been cherry-picked in
51020adcec
Merged-by: James Almer <jamrial@gmail.com>
* commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f':
checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack
Merged-by: James Almer <jamrial@gmail.com>
* commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6':
checkasm: aarch64: Clobber the stack before calling functions
Merged-by: James Almer <jamrial@gmail.com>
* commit 'a05cc56124b4f1237f6355784de821e3290ddb44':
checkasm: arm/aarch64: Fix the amount of space reserved for stack parameters
Merged-by: James Almer <jamrial@gmail.com>
* commit '8e2ea691351c5079cdab245ff7bfa5c0f3e3bfe4':
lavf: use the new bitstream filter for extracting extradata
Merged-by: James Almer <jamrial@gmail.com>
* commit '89b35a139e838deeb32ec20d8d034c81014401d0':
lavc: add a bitstream filter for extracting extradata from packets
Merged-by: James Almer <jamrial@gmail.com>
* commit 'f6e2f8a9ffda2247bffba991450990d075ea68e3':
hevcdec: move parameter set parsing into a separate header
Merged-by: James Almer <jamrial@gmail.com>
* commit '150c896a9e46b23b97debb0a5f66fbaeaa32f153':
hevcdec: split ff_hevc_diag_scan* declarations into a separate header
Merged-by: James Almer <jamrial@gmail.com>
* commit '645c6ff4231a75a71db58c8e6d06346068d2f949':
hevcdec: drop the prototype of a non-existing function
This commit is a noop. The prototype in question is not in our tree.
Merged-by: James Almer <jamrial@gmail.com>
* commit 'c359d624d3efc3fd1d83210d78c4152bd329b765':
hevcdec: move decoder-independent declarations into a separate header
Merged-by: James Almer <jamrial@gmail.com>
* commit '6c31ba226968f12f898120dbb928dab34e03782b':
avformat/matroska: fix MatroskaVideoFieldOrder enum values
This commit is a noop, see dc781459cc
Merged-by: Clément Bœsch <u@pkh.me>
* commit '20b75970e43a030f959b17ff2dfd561174b6f24e':
file protocol: handle the file: protocol string in file_check
This commit is a noop, see 77015443a8
Merged-by: Clément Bœsch <u@pkh.me>
* commit '7d8d726be7dc46343ab1c98c339c1ed44bcb07c1':
rtmpproto: Don't include a client version in the unencrypted C1 handshake
Merged-by: Clément Bœsch <u@pkh.me>
* commit '9f23f77a532ca9c2b7dc4b5328bc413e4f6f5b56':
rtmpproto: Don't include the libavformat version as "clientid"
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'c9527bf3444c5332fa04931d32997308784fc862':
Make the RELEASE file match with the most recent tag
This commit is noop.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '881477c77bb10c3c62fda111b0f1f3554968bc78':
swscale: Add the GBRAP12 output
Add GBRAP12 pixel format support
swscale: Enable GBRP12 output
swscale: x86: Add some forgotten 12-bit planar YUV cases
swscale: Add input support for 12-bit formats
This merge is noop, these commits are recrafted cherry-picks from
FFmpeg.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '1e93aa69a60815d1407a6c34d8da3f83ab193ad5':
Add GBRP12 pixel format support
This commit is a noop, see e9757066e1
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'e7e5be8635c1cf0588d2a07e59374135de6da55a':
APIchanges: Expand the name of recently added pixel formats
This commit is a noop, we don't have this entry.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'eb542106029a9b28b4f76ff7c181eb4f542da9c4':
swscale: Add missing yuv444p12 swapping
This commit is a noop, these pixel formats were introduced long ago and
present in the switch case.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'cbd84b8a51aa656d71b7d6ed44bd89041ff081a8':
nvenc: Fix error log
This commit is a noop, the error message is correct in FFmpeg.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'da2848375a2e2121dad9f1e8cbd0ead4e3bf77d6':
nvenc: Force high_444 profile for 444 input
This commit is a noop, see 20abda6b62
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'e4128c08d786eb5513578e8c6063671ba03226ab':
Revert "hevc: x86: Refactor IDCT macro declarations"
So apparently this was technically correct be reverted due to
authorship. Reverted as well in FFmpeg for now...
See http://lists.libav.org/pipermail/libav-devel/2016-October/079560.html
Merged-by: Clément Bœsch <u@pkh.me>
* commit '20abcaa273a6e77d0a2e1a98c643c73562c6f8f2':
configure: #include stdint.h as part of libxavs test
This commit is a noop, see 20c4fb2e01
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'be630b1e08ebe8f766b1798accd6b8e5e096f5aa':
d3d11va: Use the proper decoding slice index
This commit is a noop, see 9b462a0b9d
Merged-by: Clément Bœsch <u@pkh.me>
* commit '715f139c9bd407ef7f4d1f564ad683140ec61e6d': (23 commits)
vp9lpf/x86: make filter_16_h work on 32-bit.
vp9lpf/x86: make filter_48/84/88_h work on 32-bit.
vp9lpf/x86: make filter_44_h work on 32-bit.
vp9lpf/x86: make filter_16_v work on 32-bit.
vp9lpf/x86: make filter_48/84_v work on 32-bit.
vp9lpf/x86: make filter_88_v work on 32-bit.
vp9lpf/x86: make filter_44_v work on 32-bit.
vp9lpf/x86: save one register in SIGN_ADD/SUB.
vp9lpf/x86: store unpacked intermediates for filter6/14 on stack.
vp9lpf/x86: move variable assigned inside macro branch.
vp9lpf/x86: simplify ABSSUM_CMP by inverting the comparison meaning.
vp9lpf/x86: remove unused register from ABSSUB_CMP macro.
vp9lpf/x86: slightly simplify 44/48/84/88 h stores.
vp9lpf/x86: make cglobal statement more conservative in register allocation.
vp9lpf/x86: save one register in loopfilter surface coverage.
vp9lpf/x86: add ff_vp9_loop_filter_[vh]_44_16_{sse2,ssse3,avx}.
vp9lpf/x86: add ff_vp9_loop_filter_h_{48,84}_16_{sse2,ssse3,avx}().
vp9lpf/x86: add an SSE2 version of vp9_loop_filter_[vh]_88_16
vp9lpf/x86: add ff_vp9_loop_filter_[vh]_88_16_{ssse3,avx}.
vp9lpf/x86: add ff_vp9_loop_filter_[vh]_16_16_sse2().
...
All these commits are cherry-picks from FFmpeg. Maybe some slight
differences sneaked in but the Libav codebase still differs too much
with our own to make a proper diff. This merge is a noop.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '497c087939e32b26b792515d2dbc7e22561203f7':
avidec: Set palette alpha as fully opaque
This commit is a noop, see 64cafe340b
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'bad4aad4037f59ba0ad656164be9ab8f7a0fa2d4':
avidec: Do not special case palette on big-endian
This commit is a noop, see 64cafe340b
Merged-by: Clément Bœsch <u@pkh.me>
* commit '5a5df90d9c05d86d9b0564b8b40b6d64a324df5e':
vaapi_h265: Add main 10 encode support
This commit is a noop, see b9514756ba
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'eaaaabf6c93321cdb78bf61dc383cf515ec12e07':
hwcontext_vaapi: Enable P010 support
This commit is a noop, see 7e0623b70b
Merged-by: Clément Bœsch <u@pkh.me>
* commit '5cc0057f4910c8c72421b812c8f337ef6c43696c':
lavu: remove the custom atomic API
This commit is a noop. The removal is postponed until all usages in
FFmpeg are dropped as well. A patchset is on discussion on the
mailing-list:
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-March/209003.html
Merged-by: Clément Bœsch <u@pkh.me>
This supports retrieving the device from a provided hw_frames_ctx, and
automatically creating a hw_frames_ctx if hw_device_ctx is set.
The old API is not deprecated yet. The user can still use
av_vdpau_bind_context() (with or without setting hw_frames_ctx), or use
the API before that by allocating and setting hwaccel_context manually.
Cherry-picked from Libav commit 1a7ddba5.
(Adds missing APIchanges entry to the Libav version.)
Reviewed-by: Mark Thompson <sw@jkqxz.net>
This "reuses" the flags introduced for the av_vdpau_bind_context() API
function, and makes them available to all hwaccels. This does not affect
the current vdpau API, as av_vdpau_bind_context() should obviously
override the AVCodecContext.hwaccel_flags flags for the sake of
compatibility.
Cherry-picked from Libav commit 16a163b5.
Reviewed-by: Mark Thompson <sw@jkqxz.net>
libavcodec/vaapi.h:58:1: warning: attribute 'deprecated' is ignored, place it after "struct" to apply attribute to type declaration [-Wignored-attributes]
(cherry picked from commit ed6a891c36)
Signed-off-by: Mark Thompson <sw@jkqxz.net>
* commit '59c70227405c214b29971e6272f3a3ff6fcce3d0':
pthread_frame: use atomics for frame progress
This commit is a noop, see b6587421c7
Merged-by: Clément Bœsch <u@pkh.me>
* commit '64a31b2854c589e4f27cd68ebe3bcceb915704e5':
pthread_frame: use atomics for PerThreadContext.state
This commit is a noop, see 7492626932
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'db2733256db323e4b88a34b135320f33274148e2':
pthread_frame: use a thread-safe way for signalling threads to die
This commit is a noop, see 4845f0720e
Merged-by: Clément Bœsch <u@pkh.me>
USE_ATOMICS is only set if there is no thread implementation enabled, in
which case you can't expect any lock mechanism from FFmpeg.
This is also conflicting with the incoming use of stdatomic.
* commit 'eb34d40354e2474517c9b9bd787e0dadc89c2a81':
Add a compat dummy stdatomic.h used when threading is disabled
Add a compat stdatomic.h implementation based on pthreads
Add a compat stdatomic.h implementation based on suncc atomics
Add a compat stdatomic.h implementation based on windows atomics
Add a compat stdatomic.h implementation based on GCC atomics
This merge is a noop, see:
41e891e89e Add a compat dummy stdatomic.h used when threading is disabled
74b5f10862 Add a compat stdatomic.h implementation based on pthreads
70faadc826 Add a compat stdatomic.h implementation based on suncc atomics
c91e72ed52 Add a compat stdatomic.h implementation based on windows atomics
3359eede8f Add a compat stdatomic.h implementation based on GCC atomics
Merged-by: Clément Bœsch <u@pkh.me>
* commit '13f5d2bf75b95a0bfdb9940a5e359a719e242bed':
configure: check for stdatomic.h
This commit is a noop, see 6a4e24280d
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'b015872c0d0823e70776e98b865509ec1287e2f6':
huffyuvdsp: Enable the altivec code for PPC little-endian as well
This commit is a noop, see 902ce2a6c4 and
libavcodec/ppc/lossless_videodsp_altivec.c
Merged-by: Clément Bœsch <u@pkh.me>
* commit '1d25a86902946dbc80bb3a38e61755181ca3af7b':
huffyuvdsp: Reenable PPC optimizations
This commit is a noop, see 6596b34954
Merged-by: Clément Bœsch <u@pkh.me>
* commit '22c3ab18646924ce24dc6017a9e882ff69689e40':
checkasm: Add test for huffyuvdsp add_bytes
huffyuvdsp is renamed to llviddsp to be consistent with our codebase.
Note: af607b7e07 wasn't actually required for this test since this
commit is not actually testing huffyuvdsp.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'f6772e9bf8251d3943f52f6f34d97d2ce6c4b8af':
avconv: make sure the filtergraph is freed on init failure
This commit is a noop, see 16abc10b09
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'd10102d23c9467d4eb84f58e0cd12be284b982f6':
avconv: set the encoding framerate when the output is CFR
This commit is a noop, see 8db301dead
Merged-by: Clément Bœsch <u@pkh.me>
* commit '5bf2454e7cb03609b3ec1a3cf4c22427fe5f8e36':
h264dec: support broken files with mp4 extradata/annex b data
This commit is a noop, see 93b89868e1
The sample pointed out on
https://github.com/HandBrake/HandBrake/issues/339 decodes fine in
FFmpeg.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '2124711b950b03c582a119c75f52a87acc32d6ec':
hwcontext_vaapi: add a quirk for the missing MemoryType attribute
This commit is a noop, see 775a8477b7
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'a9ba59591ed509fb7e6decfde8da4cbfd4ddf4b8':
ac3dsp: Add some special-case handling for the C downmix function
Merged-by: Clément Bœsch <u@pkh.me>
* commit '8ea35af7620e4f73f9e8c072e1c0fac9a04ec161':
avio: add a new flag for marking streams seekable by timestamp
Merged-by: James Almer <jamrial@gmail.com>
* commit '8d1267932ca9c2e343ef303349101bab6681d02e':
x86/h264_weight: use appropriate register size for weight parameters
This commit is a noop, see 5ae0ad001a
Merged-by: James Almer <jamrial@gmail.com>
* commit '2caa93b813adc5dbb7771dfe615da826a2947d18':
mpegaudiodsp: Change type of array stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
* commit '15b4f494fc6bddb8178fdb5aed18b420efc75e22':
mss*: Change type of array stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
* commit 'a339e919cad1ab0125948f0dd9d49f6cb590db89':
ea: Change type of array stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
* commit 'ba479f3daafc7e4359ec1212164569ebe59f0bb7':
hevc: Change type of array stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
* commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c':
h264chroma: Change type of stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
* commit '2ec9fa5ec60dcd10e1cb10d8b4e4437e634ea428':
idct: Change type of array stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
Aliased compressed AAC bytes are almost certainly not meaningful SBR
data. In the wild this causes harsh artifacts switching HE-AAC streams
that don't have SBR headers aligned with segment boundaries.
Turning off SBR falls back to a default set of upsampling parameters
that can function as a sort of error concealment. This is consistent
with how the decoder handles other sorts of errors.
* commit '956a54129db522998a5abae869568dae2c9774cb':
vaapi_h264: Set max_num_ref_frames to 1 when not using B frames
vaapi_encode: Sync to input surface rather than output
vaapi_encode: Check packed header capabilities
vaapi_encode: Refactor initialisation
This merge is a noop, see:
ee1d04f970 vaapi_h264: Set max_num_ref_frames to 1 when not using B frames
94f446c628 vaapi_encode: Sync to input surface rather than output
478a4b7e6d vaapi_encode: Check packed header capabilities
c8241e730f vaapi_encode: Refactor initialisation
Merged-by: Clément Bœsch <u@pkh.me>
* commit '67d28f4a0fbb52d0734ca3682b85035e96d294fb':
examples/output: switch to the new encoding API
This commit is a noop, our examples are different. Still, we need to
update them to the new API, so doc/libav-merge.txt is updated.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'de2ae3c1fae5a2eb539b9abd7bc2a9ca8c286ff0':
lavc: add clobber tests for the new encoding/decoding API
The merge only re-order what we already have.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '6c09af7e46a5a1ada67ffe832f7895cf2749130b':
APIchanges: fix a typo in the version number
This commit is a noop (typo is not present in FFmpeg).
Merged-by: Clément Bœsch <u@pkh.me>
* commit '0e8d1fc1f013eb805a7b66656d9452bcbca36d22':
lavu: Bump version for the 12bit Planar YUV support
pixfmt: Add yuv444p12 pixel format
pixfmt: Add yuv422p12 pixel format
pixfmt: Add yuv420p12 pixel format
This merge is a noop, we already have all these pixel formats.
Merged-by: Clément Bœsch <u@pkh.me>
It was done on a whim because of the FATE header check and was actually
meant to be removed before pushing.
Also, nobody in review spotted it.
Reviewed-by: wm4
* commit '2b5b1e1e9b89063d352e2efed014f9d761b85032':
swscale: Rename is9_OR_10 to match what it does
This commit is a noop. We use isNBPS() in these places instead since
d736b52a04. is9_15BPS() wouldn't be a good name in our codebase due to
supporting only up to 14 (see 2ea585b8e3).
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'e87a501e7d03ac68b58520108fe24ad9d0b36765':
swscale: Update bitdepth range check
This commit is a noop.
Up to 14 bits is supported since fa36f33422. This commits pushes the
limit to 15 bits but we don't seem to have pixel formats that enters in
that category.
12:03 <ubitux> so what's your opinion? should we move to 15 even if unused currently to make it consistent with libav and the function names, or keep our 14 suggesting there might be an issue with 15?
12:05 <ubitux> (functions are called hScale8To15_c, hScale16To15_c, ff_hscale8to15, ...)
12:06 <michaelni> I prefer to keep 14 until theres a case that allows us to test this and i suspect it will not work with 15 at least not all the code
Merged-by: Clément Bœsch <u@pkh.me>
libavcodec now automatically serializes decoding for hwaccels which
are not thread-safe. This means API users, which rely on the libavcodec
native software fallback mechanism, can now simply enable threading
without running into problems.
Certain hardware decoding APIs are not guaranteed to be thread-safe, so
having the user access decoded hardware surfaces while the decoder is
running in another thread can cause failures (this is mainly known to
happen with DXVA2).
For such hwaccels, only allow the decoding thread to run while the user
is inside a lavc decode call (avcodec_send_packet/receive_frame).
Merges Libav commit d4a91e65.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
This improves commit 59c7022740.
In ff_thread_report_progress(), the fast code path can load
progress[field] with the relaxed memory order, and the slow code path
can store progress[field] with the release memory order. These changes
are mainly intended to avoid confusion when one inspects the source code.
They are unlikely to have measurable performance improvement.
ff_thread_report_progress() and ff_thread_await_progress() form a pair.
ff_thread_await_progress() reads progress[field] with the acquire memory
order (in the fast code path). Therefore, one expects to see
ff_thread_report_progress() write progress[field] with the matching
release memory order.
In the fast code path in ff_thread_report_progress(), the atomic load of
progress[field] doesn't need the acquire memory order because the
calling thread is trying to make the data it just decoded visible to the
other threads, rather than trying to read the data decoded by other
threads.
In ff_thread_get_buffer(), initialize progress[0] and progress[1] using
atomic_init().
Signed-off-by: Wan-Teh Chang <wtc@google.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Merges Libav commit 343e2833.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
When decoding with threads enabled, the get_format callback will be
called with one of the per-thread codec contexts rather than with the
outer context. If a hwaccel is in use too, this will add a reference
to the hardware frames context on that codec context, which will then
propagate to all of the other per-thread contexts for decoding. Once
the decoder finishes, however, the per-thread contexts are not freed
normally, so these references leak.
Merges Libav commit fd0fae60.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
This patch deprecates anything that has to do with merging/splitting
side data. Automatic side data merging (and splitting), as well as all
API symbols involved in it, are removed completely.
Two FF_API_ defines are dedicated to deprecating API symbols related to
this: FF_API_MERGE_SD_API removes av_packet_split/merge_side_data in
libavcodec, and FF_API_LAVF_KEEPSIDE_FLAG deprecates
AVFMT_FLAG_KEEP_SIDE_DATA in libavformat.
Since it was claimed that changing the default from merging side data to
not doing it is an ABI change, there are two additional FF_API_ defines,
which stop using the side data merging/splitting by default (and remove
any code in avformat/avcodec doing this): FF_API_MERGE_SD in libavcodec,
and FF_API_LAVF_MERGE_SD in libavformat.
It is very much intended that FF_API_MERGE_SD and FF_API_LAVF_MERGE_SD
are quickly defined to 0 in the next ABI bump, while the API symbols are
retained for a longer time for the sake of compatibility.
AVFMT_FLAG_KEEP_SIDE_DATA will (very much intentionally) do nothing for
most of the time it will still be defined. Keep in mind that no code
exists that actually tries to unset this flag for any reason, nor does
such code need to exist. Code setting this flag explicitly will work as
before. Thus it's ok for AVFMT_FLAG_KEEP_SIDE_DATA to do nothing once
side data merging has been removed from libavformat.
In order to avoid that anyone in the future does this incorrectly, here
is a small guide how to update the internal code on bumps:
- next ABI bump (probably soon):
- define FF_API_LAVF_MERGE_SD to 0, and remove all code covered by it
- define FF_API_MERGE_SD to 0, and remove all code covered by it
- next API bump (typically two years in the future or so):
- define FF_API_LAVF_KEEPSIDE_FLAG to 0, and remove all code covered
by it
- define FF_API_MERGE_SD_API to 0, and remove all code covered by it
This forces anyone who actually wants packet side data to temporarily
use deprecated API to get it all. If you ask me, this is batshit fucked
up crazy, but it's how we roll. Making AVFMT_FLAG_KEEP_SIDE_DATA to be
set by default was rejected as an ABI change, so I'm going all the way
to get rid of this once and for all.
Reviewed-by: James Almer <jamrial@gmail.com>
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
This filter does not implement all features of MPEG7. Missing features:
- compression of signature files
- work only on (cropped) parts of the video
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '70de2ea4261f860457a04e3d0c58c5543f403325':
nvenc: Extended rate-control support as provided by SDK 7
This commit is a noop, see facc19ef06
Merged-by: Clément Bœsch <u@pkh.me>
* commit '358c887a9fa0fb2e7ce089eaea71ab924a3e47a7':
nvenc: Add support for high bitdepth
This commit is a noop, see d1bf8a3aa8
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'e02e2515b24bfc37ede6ca1744696230be55e50b':
nvenc: Add some easier to understand presets that match x264 terminology
This commit is a noop, see a81b000a39 and
faffff88c2.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '352741b5ead1543d775ccf6040f33023e4491186':
nvenc: Make sure that enum and array index match
This commit is a noop, see a81b000a39
Merged-by: Clément Bœsch <u@pkh.me>
* commit '12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5':
audiodsp/x86: yasmify vector_clipf_sse
audiodsp: reorder arguments for vector_clipf
Merged the version from Libav after a discussion with James Almer on
IRC:
19:22 <ubitux> jamrial: opinion on 12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5?
19:23 <ubitux> it was apparently yasmified differently
19:23 <ubitux> (it depends on the previous commit arg shuffle)
19:24 <ubitux> i don't see the magic movsxdifnidn in your port btw
19:24 <ubitux> it's a port from 1d36defe94
19:25 <jamrial> seems better thanks to said arg shuffle
19:25 <jamrial> the loop is the same, but init is simpler
19:25 <jamrial> probably worth merging
19:25 <ubitux> OK
19:25 <ubitux> thanks
19:26 <jamrial> curious they didn't make len ptrdiff_t after the previous bunch of commits, heh
19:26 <ubitux> yeah indeed
Both commits are merged at the same time to prevent a conflict with our
existing yasmified ff_vector_clipf_sse.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'eea9857bfd6925d0c34382c00b971ee6df12ad44':
blockdsp: drop the high_bit_depth parameter
This commit is a noop, see 562ba4a827
Merged-by: Clément Bœsch <u@pkh.me>
* commit '340f12f71207513672b5165d810cb6c8622c6b21':
hwcontext_cuda: Add P010 and YUV444P16 pixel format
This commit is a noop, we already have P010 and P016.
18:52 <@BtbN> Adding AV_PIX_FMT_YUV444P16 won't hurt, but doesn't gain anything.
18:53 <@BtbN> I'd say just noop it. If we'll ever need it, it will be added in turn.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '75d98e30afab61542faab3c0f11880834653bd6b':
audiodsp/x86: clear the high bits of the order parameter on 64bit
Merged-by: Clément Bœsch <u@pkh.me>
* commit '1d6c76e11febb58738c9647c47079d02b5e10094':
audiodsp/x86: fix ff_vector_clip_int32_sse2
No functionnal changes, only cosmetics. This issue was fixed in
9a9e2f1c8a.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'de64dd13cbd47fd54334b6aa2a2cd3c7c36daae2':
avcodec: Add the extended pixel format profile for HEVC
This commit is a noop, see 5a41999d81
Merged-by: Clément Bœsch <u@pkh.me>
* commit '136f55207521f0b03194ef5b55ba70f1635d6aee':
mpegvideo_motion: Handle edge emulation even without unrestricted_mv
This commit is a noop, see 7b1e0beb2d
Merged-by: Clément Bœsch <u@pkh.me>
* commit '15fcf6292ed79be274c824fedb099c2665f4cc15':
build: remove hardcoded name of version header
This commit is noop, our version.sh is completely different.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '8c201dde0ab62e5cd581d958e78d7609e0ba710d':
build: doc: more fine-grained dependencies for generated texi files
This commit is a noop, we have a different system for handling the
documentation.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'bc7399934def210c2a84ea51375d50f79c676c96':
libdc1394: Distinguish between enumeration errors and no cameras found
This commit is a noop, see 384251daff
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'df3795025337479a639cb3cd26c93a4e82ccd4db':
rtsp: Fix a crash with the RTSP muxer
This commit is a noop, see f8a13c7213
Merged-by: Clément Bœsch <u@pkh.me>
* commit '3a9662af6c741f8354b1ca97642f78f5c02e2e8f':
vaapi_h264: Fix HRD bit_rate/cpb_size scaling
This commit is a noop, see 06d73d002e
Merged-by: Clément Bœsch <u@pkh.me>
* commit '7081620aca36e616ea96f71fd71d2703e3abae09':
hwcontext_vdpau: Fix missing subscripts
This commit is a noop, see f7e9275f83
Merged-by: Clément Bœsch <u@pkh.me>
Regression from 4563a86f01.
Both need stdint.h included before the respective x264.h and xavs.h.
Old require() used different, separate checks that didn't actually
need stdint.h to work. require2()'s (now require) check_func_headers()
does include stdint.h but only after the custom headers.
For libxavs this would also be consequently fixed by libav's
commit 20abcaa273 which wasn't merged yet.
* commit 'ab3554e1a7c04a5ea30f9c905de92348478ef7c8':
configure: Drop check_lib()/require() in favor of check_lib2()/require2()
Merged-by: Clément Bœsch <u@pkh.me>
* commit '6ce93757ee6b81fe727bfdc9f546fd0ddf9139c3':
ppc: Update #endif comments
This commit is mostly a noop as we seem to support PPC LE (see
902ce2a6c4). Only the h264 chunks are
updated.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '75d642a944d5579e4ef20ff3701422a64692afcf':
vaapi_vp8: Explicitly include libva vp8 decode header
vaapi_decode: Ignore the profile when not useful
lavc/vaapi: Add VP8 decode hwaccel
vp8: Add hwaccel hooks
This merge is a noop as these commits are already under review on the
mailing list. doc/libav-merge.txt is updated to track its progress.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '52730e0f867fe77b7d2353d8b44e92edb7079ca5':
iir_filter: Change type of array stride parameters to ptrdiff_t
The merge also updates the MIPS code and drop the extra log.h include.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '3aa9d37d03da3c9b482d19b3988659287815280e':
build: Fix directory dependencies of tests/pixfmts.mak target
This might not be necessary given our mkdirs in the configure, but it
probably doesn't hurt.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '0e5dde739943168d6f61d3fb40b3f622e7abfeff':
configure: Fix --disable-pod2man / --disable-texi2html
This commit is a noop, we have dedicated documentation option for this
purpose.
Merged-by: Clément Bœsch <u@pkh.me>
The configure has the --disable-manpages option for this purpose, and
--disable-pod2man is currently ignored due to that. This is also
consistent with the other documentation options.
* commit '2610c9528f86286e4c6e174411a26ff5b4815cde':
configure: Move initial VAAPI check to a more sensible place
This commit is a noop, see 17989dcf54
Merged-by: Clément Bœsch <u@pkh.me>
* commit '4fb311c804098d78e5ce5f527f9a9c37536d3a08':
Drop memalign hack
Merged, as this may indeed be uneeded since
46e3936fb0.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'f01f7a7846529b7c3ef343f117eaa2c0a1457af0':
hwcontext_dxva2: use the special UC copy for downloading frames
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'd7bc52bf456deba0f32d9fe5c288ec441f1ebef5':
imgutils: add a function for copying image data from GPU mapped memory
Merged-by: Clément Bœsch <u@pkh.me>
* commit '851960f6f8cf1f946fe42fa36cf6598fac68072c':
lavc: Remove old vaapi decode infrastructure
avconv_vaapi: Convert to use hw_frames_ctx only
vaapi_mpeg4: Convert to use the new VAAPI hwaccel code
vaapi_vc1: Convert to use the new VAAPI hwaccel code
vaapi_mpeg2: Convert to use the new VAAPI hwaccel code
vaapi_h264: Convert to use the new VAAPI hwaccel code
lavc: Rewrite VAAPI decode infrastructure
This merge is a noop, these commits have already been cherry-picked.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '72eba6558ee4f10239ba3f472c0b033ec70082a7':
wmavoice: Simplify GetBitContext initialization
This commit is a noop. We don't have that code anymore since
3deb4b54a2.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '728e80cd2e1d4b7c3e26489efcd77bd7a9e84a99':
High Definition Compatible Digital (HDCD) decoder filter, using libhdcd
This commit is a noop, we have that code natively.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '95f80293456d9d4b1b096621260c38bc90325ec0':
avprobe: Fix memory leak
This commit is a noop, ffprobe is not affected.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '8db804e8f549d5b86a1edf62736e0ef80f160da9':
mov: Remove old b-frame/video delay heuristic
This commit is a noop, see 425be3c810
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'eb96505b761eb02b6a3efc76d854afa6a41941ff':
mov: Remove ancient heuristic hack
This commit is a noop, see 04f8d31287
Merged-by: Clément Bœsch <u@pkh.me>
Fixes: 864/clusterfuzz-testcase-4774385942528000
See: [FFmpeg-devel] [PATCH 1/2] avcodec/h264_direct: Fix runtime error: signed integer overflow: 2147483647 - -14133 cannot be represented in type 'int'
See: [FFmpeg-devel] [PATCH 2/2] avcodec/h264_direct: Fix runtime error: signed integer overflow: -9 - 2147483647 cannot be represented in type 'int'
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This work is sponsored by, and copyright, Google.
This avoids loading and calculating coefficients that we know will
be zero, and avoids filling the temp buffer with zeros in places
where we know the second pass won't read.
This gives a pretty substantial speedup for the smaller subpartitions.
The code size increases from 21512 bytes to 31400 bytes.
The idct16/32_end macros are moved above the individual functions; the
instructions themselves are unchanged, but since new functions are added
at the same place where the code is moved from, the diff looks rather
messy.
Before:
vp9_inv_dct_dct_16x16_sub1_add_10_neon: 284.6
vp9_inv_dct_dct_16x16_sub2_add_10_neon: 1902.7
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 1903.0
vp9_inv_dct_dct_16x16_sub8_add_10_neon: 2201.1
vp9_inv_dct_dct_16x16_sub12_add_10_neon: 2510.0
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 2821.3
vp9_inv_dct_dct_32x32_sub1_add_10_neon: 1011.6
vp9_inv_dct_dct_32x32_sub2_add_10_neon: 9716.5
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 9704.9
vp9_inv_dct_dct_32x32_sub8_add_10_neon: 10641.7
vp9_inv_dct_dct_32x32_sub12_add_10_neon: 11555.7
vp9_inv_dct_dct_32x32_sub16_add_10_neon: 12499.8
vp9_inv_dct_dct_32x32_sub20_add_10_neon: 13403.7
vp9_inv_dct_dct_32x32_sub24_add_10_neon: 14335.8
vp9_inv_dct_dct_32x32_sub28_add_10_neon: 15253.6
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 16179.5
After:
vp9_inv_dct_dct_16x16_sub1_add_10_neon: 282.8
vp9_inv_dct_dct_16x16_sub2_add_10_neon: 1142.4
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 1139.0
vp9_inv_dct_dct_16x16_sub8_add_10_neon: 1772.9
vp9_inv_dct_dct_16x16_sub12_add_10_neon: 2515.2
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 2823.5
vp9_inv_dct_dct_32x32_sub1_add_10_neon: 1012.7
vp9_inv_dct_dct_32x32_sub2_add_10_neon: 6944.4
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 6944.2
vp9_inv_dct_dct_32x32_sub8_add_10_neon: 7609.8
vp9_inv_dct_dct_32x32_sub12_add_10_neon: 9953.4
vp9_inv_dct_dct_32x32_sub16_add_10_neon: 10770.1
vp9_inv_dct_dct_32x32_sub20_add_10_neon: 13418.8
vp9_inv_dct_dct_32x32_sub24_add_10_neon: 14330.7
vp9_inv_dct_dct_32x32_sub28_add_10_neon: 15257.1
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 16190.6
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/aarch64/vp9itxfm_16bpp_neon.o from
26288 to 21512 bytes.
This gives a small slowdown of a couple of tens of cycles, but makes
it more feasible to add more optimized versions of these transforms.
Before:
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 1887.4
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 2801.5
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 9691.4
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 16154.9
After:
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 1899.5
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 2827.2
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 9714.7
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 16175.9
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/arm/vp9itxfm_16bpp_neon.o from
17500 to 14516 bytes.
This gives a small slowdown of a couple tens of cycles, up to around
150 cycles for the full case of the largest transform, but makes
it more feasible to add more optimized versions of these transforms.
Before: Cortex A7 A8 A9 A53
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 4237.4 3561.5 3971.8 2525.3
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 6371.9 5452.0 5779.3 3910.5
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 22068.8 17867.5 19555.2 13871.6
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 37268.9 38684.2 32314.2 23969.0
After:
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 4375.1 3571.9 4283.8 2567.2
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 6415.6 5578.9 5844.6 3948.3
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 22653.7 18079.7 19603.7 13905.3
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 37593.2 38862.2 32235.8 24070.9
Signed-off-by: Martin Storsjö <martin@martin.st>
Keep the idct32 coefficients in narrow form in q6-q7, and idct16
coefficients in lengthened 32 bit form in q0-q3. Avoid clobbering
q0-q3 in the pass1 function, and squeeze the idct16 coefficients
into q0-q1 in the pass2 function to avoid reloading them.
The idct16 coefficients are clobbered and reloaded within idct32_odd
though, since that turns out to be faster than narrowing them and
swapping them into q6-q7.
Before: Cortex A7 A8 A9 A53
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 22653.8 18268.4 19598.0 14079.0
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 37699.0 38665.2 32542.3 24472.2
After:
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 22270.8 18159.3 19531.0 13865.0
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 37523.3 37731.6 32181.7 24071.2
Signed-off-by: Martin Storsjö <martin@martin.st>
Align the second/third operands as they usually are.
Due to the wildly varying sizes of the written out operands
in aarch64 assembly, the column alignment is usually not as clear
as in arm assembly.
This is cherrypicked from libav commit
7995ebfad1.
Signed-off-by: Martin Storsjö <martin@martin.st>
In the half/quarter cases where we don't use the min_eob array, defer
loading the pointer until we know it will be needed.
This is cherrypicked from libav commit
3a0d5e206d.
Signed-off-by: Martin Storsjö <martin@martin.st>
This reduces the number of lines and reduces the duplication.
Also simplify the eob check for the half case.
If we are in the half case, we know we at least will need to do the
first three slices, we only need to check eob for the fourth one,
so we can hardcode the value to check against instead of loading
from the min_eob array.
Since at most one slice can be skipped in the first pass, we can
unroll the loop for filling zeros completely, as it was done for
the quarter case before.
This allows skipping loading the min_eob pointer when using the
quarter/half cases.
This is cherrypicked from libav commit
98ee855ae0.
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit '4ab496261b12e20ef293b7adca4fcaef1a67c538':
libvpx: Cast a pointer to const to squelch a warning
This commit is a noop, see 09b3bbe605
Merged-by: James Almer <jamrial@gmail.com>
* commit '802727b538b484e3f9d1345bfcc4ab24cfea8898':
vp8: Update some assembly comments left unchanged in bd66f073fe
Merged-by: James Almer <jamrial@gmail.com>
* commit '6755eb5b212384e0599f7f2c5de42df49fff57de':
mss12: validate display dimensions
This commit is a noop, see ee9151b616
Merged-by: Clément Bœsch <u@pkh.me>
* commit '33f10546ec012ad4e1054b57317885cded7e953e':
vc1: check that slices have a positive height
This commit is a noop, see e985cfd18b
Merged-by: Clément Bœsch <u@pkh.me>
* commit '09b23786b3986502ee88d4907356979127169bdd':
pcx: use the bytestream2 API for reading from input
This commit is a noop, see 8cd1c0febe
Merged-by: Clément Bœsch <u@pkh.me>
* commit '221402c1c88b9d12130c6f5834029b535ee0e0c5':
pcx: check that the packet is large enough before reading the header
See 8cd1c0febe
Merged-by: Clément Bœsch <u@pkh.me>
* commit '15ee419b7abaf17f8c662c145fe93d3dbf43282b':
pcx: properly pad the scanline
This commit is a noop, see d24de4596c
Merged-by: Clément Bœsch <u@pkh.me>
* commit '796dca027be09334d7bbf4f2ac1200e06bb054cb':
alac: do not return success if nothing was decoded
See e11983bda0
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'f5d46d332258dcd8ca623019ece1d5e5bb74142b':
vmnc: check that subrectangles fit into their containing rectangles
See 6ba02602aa
This merge keeps our condition against w-i and h-j instead of bw and bh.
One may be more correct than the other, but I'm keeping our behaviour
here for safety reasons.
The style and formatting is merged.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'b53d8c3ccfeff77874f5ca7c68136b6d87a0a69c':
mjpegdec: Drop disabled code
The last chunk is replaced with a comment describing the structure.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'aa37d2bf4505afc106e2a23c44afc722bb204a8e':
swscale: Kill non-compiling disabled cruft
The isGray() chunk is not merged as an alternative patch actually fixing
the dead code is currently under review on the mailing-list.
The SWS_X chunk is merged, with an additional cosmetic.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '00a0419c7f7ebce9010cba93b7ff67c9f1165815':
mathematics: Kill non-compiling disabled cruft
This commit is a noop, see 1e1513d01a
Merged-by: Clément Bœsch <u@pkh.me>
* commit '5a667322f5cb0e77c15891fc06725c19d8f3314f':
vaapi_vc1: Remove redundant version check
This commit is a noop, see d07d01bcce
Merged-by: Clément Bœsch <u@pkh.me>
* commit '01d6f84f49a55fd591aa120960fce2b9dba92d0d':
vaapi_vc1: Constify pointers
This commit is a noop, see 845c2c140b
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'ee9061293e925916fe2e0b7c08fbbd1f981b1d29':
vaapi_mpeg2: Constify pointers
This commit is a noop, see 6bc2808c41
Merged-by: Clément Bœsch <u@pkh.me>
* commit '03adfe913062c6995136eb1ca51152b6d596c0f4':
vaapi_h264: Constify pointers
This commit is a noop, see d0897da924
Merged-by: Clément Bœsch <u@pkh.me>
* commit '121f34d5f0c8d7d376829a467590fbbe4c228f4f':
hwcontext_vaapi: Try the first render node as the default DRM device
This commit is a noop, see 8d47d84075
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'f6d2fed811dea36c4ebaf991927e44c78eb0aca5':
avconv: Make sure that inputless filtergraphs are configured
This commit is a noop. Related code is pretty different in ffmpeg, and
-filter_complex testsrc works.
See also af1761f7b5
Merged-by: Clément Bœsch <u@pkh.me>
* commit '602abe77b02f9702c18c2787d208fcfc9d94b70f':
avconv: Check the fifo allocation
This commit is a noop, see af1761f7b5
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'f2143c57b6a61fef382f3128138d8558a9bdecee':
vp9: reindent after last commit
vp9: add frame threading
vp9: allocate 'b', 'block/uvblock' and 'eob/uveob' dynamically.
vp9: split last/cur_frame from the reference buffers.
This commit is a noop, we already have all these changes. Again, we will
need in the future to analyse the tiny differences between the two
repository on the vp9 files. But in the current state, it's a real pain
to do at every commit due to the huge differences (such as files split
and cosmetics).
Merged-by: Clément Bœsch <u@pkh.me>
* commit '04763c6f87690b31cfcd0d324cf36a451531dcd0':
h264_direct: use the reference mask from the actual reference
This commit is a noop, see d8151a7e94
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'e9bfff1cc66c85b91b262c41e8aa5e8685606225':
lavc: free buffer_frame/pkt on avcodec_open2() failure
This commit is a noop, see 27adf9f9cd.
Only reordered to reduce diff.
Merged-by: Clément Bœsch <u@pkh.me>
The typeof keyword is apparently not available when using the -std=c99 option.
Fixes the use of C11 atomic functions with old GCC.
Reviewed-by: Muhammad Faiz <mfcc64@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Not used by anything at all since we don't auto insert lavr filters.
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
use fltp when doing s32 -> s32 resampling
because s32p has no simd optimization
benchmark:
old 17.913s
new 7.584s (use fma3)
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
These values are defined to be 32bit in the specification,
so it makes more sense to store them as fixed width.
Based on a patch by Micahel Niedermayer <michael@niedermayer.cc>.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
This field is of little value, and interferes with testing side data,
since sizes can be different on multiple architectures.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
* commit '0638b99cdba52554691fc668d9e477bc184c7a33':
aiff: Skip padding byte for odd-sized chunks
Also removes to odd-size checks from get_aiff_header and get_meta to use
the generic path introduced by the original commit.
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
Allows to get a more realistic total bitrate (and estimated file size)
in avi_write_header. Previously a static default value of 200k was
assumed.
Adds an internal helper function for bitrate guessing.
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
when set_compensation is called with zero sample_delta,
compensation does not happen (because dst_incr == ideal_dst_incr)
but compensation_distance is set
regression since 01ebb57c03
Found-by: wm4 <nfxjfg@googlemail.com>
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
Reflects the actual code and silences a gcc warning:
ffprobe.c:1797:42: warning: passing argument 1 of 'av_spherical_tile_bounds' discards 'const' qualifier from pointer target type
* commit '0df4801105d84883071b0978cb3afc7cd5184ce8':
vp9: make mv bounds 32bit.
This commit is a noop, see 024fac5cd4
Merged-by: Clément Bœsch <u@pkh.me>
* commit '24a362569bff1d4161742fffaca80a4a4428be8a':
buffer: fix av_buffer_realloc() when the data is offset wrt buffer start
Merged-by: Clément Bœsch <u@pkh.me>
Reflects the actual code and silences a gcc warning:
libavcodec/utils.c:2102:36: warning: passing argument 1 of 'av_packet_get_side_data' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
* commit 'e99ecda55082cb9dde8fd349361e169dc383943a':
checkasm: add vp9 MC tests.
vp9mc/x86: sse2 MC assembly.
vp9mc/x86: add AVX and AVX2 MC
vp9mc/x86: rename ff_* to ff_vp9_*
vp9mc/x86: rename ff_avg[48]_sse to ff_avg[48]_mmxext
vp9mc/x86: simplify a few inits.
vp9mc/x86: add 16px functions (64bit only).
Noop (aside from a formatting comment in vp9mc.asm). We already have all
of this. We should consider making a final diff between the two projects
when the dust comes down.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '89466de4aeaf5e359489b81b8a9920a2bc7936d6':
vp9/x86: rename vp9dsp to vp9mc
File was already renamed, only the top description is updated.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '63ac8e2d93080b74f6be32c7c3c1a1e44aacf34e':
lavu: add LOCAL_ALIGNED_32
This commit is a noop, see 25d5ea6d5a
Merged-by: James Almer <jamrial@gmail.com>
* commit 'd3e4d406b020b0464486318aceda08bd8f69ca41':
h264dec: reset nb_slice_ctx_queued for hwaccel decoding
This commit is a noop, see 7448019890
Merged-by: James Almer <jamrial@gmail.com>
* commit 'e5b019725f53b79159931d3a7317107cbbfd0860':
m4vdec: Check for non-startcode 00 00 00 sequences in probe
This commit is a noop, see 7c1835c52a
Merged-by: James Almer <jamrial@gmail.com>
* commit '3ccec334b8502701e72ef13bed25913c3578022e':
sbrdsp: Move a misplaced #endif directive to the right spot
Merged-by: James Almer <jamrial@gmail.com>
* commit 'e723dce6f8ba1e8260433b6ecfe5a3262f4c7a99':
dvbsubdec: Use NULL instead of 0 as pointer value
This commit is a noop. The affected code isn't in our tree.
Merged-by: James Almer <jamrial@gmail.com>
* commit 'fc94a1acc27ab7296edce3fa81ef36691af5c134':
Revert "libavutil: Use an intermediate variable in AV_COPY*U"
This commit is a noop.
Merged-by: James Almer <jamrial@gmail.com>
* commit '9806b9ab5c7fb2ac5efd8ffa8713fea0c5fd218d':
Revert "Don't use expressions with side effects in macro parameters"
This commit is a noop.
Merged-by: James Almer <jamrial@gmail.com>
* commit 'f79d847400d218cfd0b95f10358fe6e65ec3c9c4':
intreadwrite: Use the __unaligned keyword on MSVC for ARM and x86_64
Merged-by: James Almer <jamrial@gmail.com>
* commit '230b1c070baa3b6d4bd590426a365b843d60ff50':
intreadwrite: Add intermediate variables in the byteswise AV_W*() macros
Mostly a noop. Merged for cosmetic purposes.
See d83ff76ca0
Merged-by: James Almer <jamrial@gmail.com>
* commit '014773b66bdff4de24f384066d1a85d2a5bb6774':
libavutil: Use an intermediate variable in AV_COPY*U
This commit is a noop. It would be reverted in a future merge either
way.
Merged-by: James Almer <jamrial@gmail.com>
* commit '25bacd0a0c32ae682e6f411b1ac9020aeaabca72':
Don't use expressions with side effects in macro parameters
This commit is a noop. It would be reverted in a future merge either
way.
Merged-by: James Almer <jamrial@gmail.com>
Benchmarks with START_TIMER indicate that the code is faster with unsigned, (that is
with the patch), there was quite some fluctuation in the numbers so this may be just
random
Fixes: 811/clusterfuzz-testcase-6465493076541440
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
libnpp was erroneously grouped up with libfdk-aac and openssl to check
if --enable-nonfree wasn't passed only with --enable-gpl in
9f28db47ac. The latter two are compatible
with LGPL, libnpp is not.
Signed-off-by: James Almer <jamrial@gmail.com>
* commit '7ebdffc353f3f0827864e8e3461fdc00cc243b14':
dxv: Check to make sure we don't overrun buffers on corrupt inputs
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'e328178da90f44690e0076f4dbfd16da9175f441':
qsvdec: only access hwaccel_context is the pixel format is QSV
Merged-by: Clément Bœsch <u@pkh.me>
* commit '5ebef79abecc3ffcc4ab0d46e203d13b068107c9':
Fix instances of broken indentation found by gcc 6
Noop, see 21d3f0c02, 6089c44a2
Merged-by: Clément Bœsch <u@pkh.me>
* commit '2ac00d2d1d51047c6ce69d5fbe1a08392d142658':
mov: Validate the ID number
This commit is a noop as the modified check is not present in FFmpeg.
See d30870cc73.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'a115eb9e750543f1d8bf951414d291069bf396c2':
mimic: do not release the newly obsolete reference at the end of decoding
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'ae90119c6701fa09ff747cca35238e36b2d2ab2f':
configure: Simplify license incompatibility check
An extra GPLV3 list is added for libsmbclient as having it in both GPL
and VERSION3 lists would cause a duplicate in the final config list.
Also, for consistency, libnpp is treated the same as the other nonfree
component (libfdk_aac and openssl).
Merged-by: Clément Bœsch <u@pkh.me>
add kVTCompressionPropertyKey_DataRateLimits support by rc_max_bitrate
Reviewed-by: Rick Kern <kernrj@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
* commit 'e46a6fb7732a7caef97a916a4f765ec0f779d195':
avconv: Check that muxing_queue exists before reading from it
Mostly noop. This was fixed in FFmpeg in 7f7c494a3.
The merge makes the cosmetics match but does not include the weird
av_log().
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '100fb0ddfda958da70f98feac81f924c02483789':
configure: Allow detecting and using LLVM lld-link as linker for windows
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '56af0bc10f49654b5b5f3efe82c69a13bf15fc8b':
configure: Check for strtoll and redirect to _strtoi64 in the msvcrt block
Also includes _strtoui64 in the check.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'b183abfb5b6366b177cf44f244c66156257a6fd6':
vpx: Support color range
Decoder chunk not merged as the framework automatically copies avctx
color range to the frame color range. And we already set the avctx field
since cbcc88c039.
Merged-by: Clément Bœsch <cboesch@gopro.com>
Preparation for potentially disabling merged side data by default in the
libs. Do this in particular because it affects fate tests.
The changed tests either reflect added packet side data, or the changed
packet size due to merged side data removal reducing the packet size.
The current form of the messages indicating matches in the white
or black lists seems to be a bit too much relying on context.
Make the messages more explicit.
Signed-off-by: Alexander Strasser <eclipse7@gmx.net>
Fixes: 755/clusterfuzz-testcase-5369072516595712
See: [FFmpeg-devel] [PATCH 1/2] avcodec/h264_direct: Fix runtime error: signed integer overflow: 2147483647 - -14133 cannot be represented in type 'int'
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
the SECOND_LEVEL* flags process and name is too long
extract all of them output to funtions, make code clear
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
so tsf option in aresample will have effect
previously tsf/internal_sample_format had no effect
fate is updated
s32p previously used fltp internally
dblp previously used fltp/dblp internally
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
Add missing return value checks to suppress build warning and
remove noop ff_formats_unref() calling.
Note: most filters using ff_formats_ref() didn't have a suitable
error handling, it's a potential memory leak issue.
Signed-off-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
* commit 'ad71d3276fef0ee7e791e62bbfe9c4e540047417':
lavfi: add a QSV deinterlacing filter
Minor fixup for lavfi differences.
Merged-by: Mark Thompson <sw@jkqxz.net>
* commit '0956fd460681e8ccbdae19f135f0d3970bf95c2f':
qsvenc: do not re-execute encoding on all positive status codes
Noop, see fb240a6276.
Merged-by: Mark Thompson <sw@jkqxz.net>
* commit 'd9ec3c60143babe1bb77c268e1d5547d15acd69b':
qsvenc: take only the allocated dimensions from the frames context
Merged-by: Mark Thompson <sw@jkqxz.net>
* commit '8b7a9729aa162e2bbd571933f1aa40767f1ff47b':
avconv_qsv: use the actual pixel format provided by lavc
This commit is a noop, see 03cef34aa6
Merged-by: Clément Bœsch <u@pkh.me>
* commit '6f40181cad8ac04adff7bd10e1e1ab65f22bc1f0':
avconv_qsv: align the surface size to 32
This commit is a noop, see 03cef34aa6
Merged-by: Clément Bœsch <u@pkh.me>
Fixes: 732/clusterfuzz-testcase-4872990070145024
See: [FFmpeg-devel] [PATCH 2/6] avcodec/dca_xll: Fix runtime error: signed integer overflow: 2147286116 + 6298923 cannot be represented in type 'int'
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Provides a way to change bandwidth parameter inside DASH manifest after a non-CBR H.264 encoding.
Caller now is able to compute the bitrate by itself, after all packets have been written, and then set that value in AVFormatContext->streams->codecpar->bit_rate before calling av_write_trailer. As a result that value will be set in DASH manifest.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This matches the order they are in the 16 bpp version.
There they are in this order, to make sure we access them in the
same order they are declared, easing loading only half of the
coefficients at a time.
This makes the 8 bpp version match the 16 bpp version better.
This is cherrypicked from libav commit
b8f66c0838.
Signed-off-by: Martin Storsjö <martin@martin.st>
This matches the order they are in the 16 bpp version.
There they are in this order, to make sure we access them in the
same order they are declared, easing loading only half of the
coefficients at a time.
This makes the 8 bpp version match the 16 bpp version better.
This is cherrypicked from libav commit
08074c092d.
Signed-off-by: Martin Storsjö <martin@martin.st>
All elements are used pairwise, except for the first one.
Previously, the 16th element was unused. Move the unused element
to the second slot, to make the later element pairs not split
across registers.
This simplifies loading only parts of the coefficients,
reducing the difference to the 16 bpp version.
This is cherrypicked from libav commit
09eb88a12e.
Signed-off-by: Martin Storsjö <martin@martin.st>
All elements are used pairwise, except for the first one.
Previously, the 16th element was unused. Move the unused element
to the second slot, to make the later element pairs not split
across registers.
This simplifies loading only parts of the coefficients,
reducing the difference to the 16 bpp version.
This is cherrypicked from libav commit
de06bdfe6c.
Signed-off-by: Martin Storsjö <martin@martin.st>
The idct32x32 function actually pushed d8-d15 onto the stack even
though it didn't clobber them; there are plenty of registers that
can be used to allow keeping all the idct coefficients in registers
without having to reload different subsets of them at different
stages in the transform.
After this, we still can skip pushing d12-d15.
Before:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8128.3
After:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8053.3
This is cherrypicked from libav commit
65aa002d54.
Signed-off-by: Martin Storsjö <martin@martin.st>
The idct32x32 function actually pushed q4-q7 onto the stack even
though it didn't clobber them; there are plenty of registers that
can be used to allow keeping all the idct coefficients in registers
without having to reload different subsets of them at different
stages in the transform.
Since the idct16 core transform avoids clobbering q4-q7 (but clobbers
q2-q3 instead, to avoid needing to back up and restore q4-q7 at all
in the idct16 function), and the lanewise vmul needs a register in
the q0-q3 range, we move the stored coefficients from q2-q3 into q4-q5
while doing idct16.
While keeping these coefficients in registers, we still can skip pushing
q7.
Before: Cortex A7 A8 A9 A53
vp9_inv_dct_dct_32x32_sub32_add_neon: 18553.8 17182.7 14303.3 12089.7
After:
vp9_inv_dct_dct_32x32_sub32_add_neon: 18470.3 16717.7 14173.6 11860.8
This is cherrypicked from libav commit
402546a172.
Signed-off-by: Martin Storsjö <martin@martin.st>
For this case, with 8 inputs but only changing 4 of them, we can fit
all 16 input pixels into a q register, and still have enough temporary
registers for doing the loop filter.
The wd=8 filters would require too many temporary registers for
processing all 16 pixels at once though.
Before: Cortex A7 A8 A9 A53
vp9_loop_filter_mix2_v_44_16_neon: 289.7 256.2 237.5 181.2
After:
vp9_loop_filter_mix2_v_44_16_neon: 221.2 150.5 177.7 138.0
This is cherrypicked from libav commit
575e31e931.
Signed-off-by: Martin Storsjö <martin@martin.st>
This is one cycle faster in total, and three instructions fewer.
Before:
vp9_loop_filter_mix2_v_44_16_neon: 123.2
After:
vp9_loop_filter_mix2_v_44_16_neon: 122.2
This is cherrypicked from libav commit
3bf9c48320.
Signed-off-by: Martin Storsjö <martin@martin.st>
This fixes building with clang for linux with PIC enabled.
This is cherrypicked from libav commit
8847eeaa14.
Signed-off-by: Martin Storsjö <martin@martin.st>
This adds lots of extra .ifs, but speeds it up by a couple cycles,
by avoiding stalls.
This is cherrypicked from libav commit
b0806088d3.
Signed-off-by: Martin Storsjö <martin@martin.st>
This adds lots of extra .ifs, but speeds it up by a couple cycles,
by avoiding stalls.
This is cherrypicked from libav commit
e18c39005a.
Signed-off-by: Martin Storsjö <martin@martin.st>
Previously we first calculated hev, and then negated it.
Since we were able to schedule the negation in the middle
of another calculation, we don't see any gain in all cases.
Before: Cortex A7 A8 A9 A53 A53/AArch64
vp9_loop_filter_v_4_8_neon: 147.0 129.0 115.8 89.0 88.7
vp9_loop_filter_v_8_8_neon: 242.0 198.5 174.7 140.0 136.7
vp9_loop_filter_v_16_8_neon: 500.0 419.5 382.7 293.0 275.7
vp9_loop_filter_v_16_16_neon: 971.2 825.5 731.5 579.0 453.0
After:
vp9_loop_filter_v_4_8_neon: 143.0 127.7 114.8 88.0 87.7
vp9_loop_filter_v_8_8_neon: 241.0 197.2 173.7 140.0 136.7
vp9_loop_filter_v_16_8_neon: 497.0 419.5 379.7 293.0 275.7
vp9_loop_filter_v_16_16_neon: 965.2 818.7 731.4 579.0 452.0
This is cherrypicked from libav commit
e1f9de86f4.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
Before: Cortex A53
vp9_inv_dct_dct_16x16_sub1_add_neon: 235.3
vp9_inv_dct_dct_32x32_sub1_add_neon: 555.1
After:
vp9_inv_dct_dct_16x16_sub1_add_neon: 180.2
vp9_inv_dct_dct_32x32_sub1_add_neon: 475.3
This is cherrypicked from libav commit
3fcf788fbb.
Signed-off-by: Martin Storsjö <martin@martin.st>
No measured speedup on a Cortex A53, but other cores might benefit.
This is cherrypicked from libav commit
388e0d2515.
Signed-off-by: Martin Storsjö <martin@martin.st>
Fold the field lengths into the macro.
This makes the macro invocations much more readable, when the
lines are shorter.
This also makes it easier to use only half the registers within
the macro.
This is cherrypicked from libav commit
5e0c2158fb.
Signed-off-by: Martin Storsjö <martin@martin.st>
The ld1r is a leftover from the arm version, where this trick is
beneficial on some cores.
Use a single-lane load where we don't need the semantics of ld1r.
This is cherrypicked from libav commit
ed8d293306.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This avoids loading and calculating coefficients that we know will
be zero, and avoids filling the temp buffer with zeros in places
where we know the second pass won't read.
This gives a pretty substantial speedup for the smaller subpartitions.
The code size increases from 14740 bytes to 24292 bytes.
The idct16/32_end macros are moved above the individual functions; the
instructions themselves are unchanged, but since new functions are added
at the same place where the code is moved from, the diff looks rather
messy.
Before:
vp9_inv_dct_dct_16x16_sub1_add_neon: 236.7
vp9_inv_dct_dct_16x16_sub2_add_neon: 1051.0
vp9_inv_dct_dct_16x16_sub4_add_neon: 1051.0
vp9_inv_dct_dct_16x16_sub8_add_neon: 1051.0
vp9_inv_dct_dct_16x16_sub12_add_neon: 1387.4
vp9_inv_dct_dct_16x16_sub16_add_neon: 1387.6
vp9_inv_dct_dct_32x32_sub1_add_neon: 554.1
vp9_inv_dct_dct_32x32_sub2_add_neon: 5198.5
vp9_inv_dct_dct_32x32_sub4_add_neon: 5198.6
vp9_inv_dct_dct_32x32_sub8_add_neon: 5196.3
vp9_inv_dct_dct_32x32_sub12_add_neon: 6183.4
vp9_inv_dct_dct_32x32_sub16_add_neon: 6174.3
vp9_inv_dct_dct_32x32_sub20_add_neon: 7151.4
vp9_inv_dct_dct_32x32_sub24_add_neon: 7145.3
vp9_inv_dct_dct_32x32_sub28_add_neon: 8119.3
vp9_inv_dct_dct_32x32_sub32_add_neon: 8118.7
After:
vp9_inv_dct_dct_16x16_sub1_add_neon: 236.7
vp9_inv_dct_dct_16x16_sub2_add_neon: 640.8
vp9_inv_dct_dct_16x16_sub4_add_neon: 639.0
vp9_inv_dct_dct_16x16_sub8_add_neon: 842.0
vp9_inv_dct_dct_16x16_sub12_add_neon: 1388.3
vp9_inv_dct_dct_16x16_sub16_add_neon: 1389.3
vp9_inv_dct_dct_32x32_sub1_add_neon: 554.1
vp9_inv_dct_dct_32x32_sub2_add_neon: 3685.5
vp9_inv_dct_dct_32x32_sub4_add_neon: 3685.1
vp9_inv_dct_dct_32x32_sub8_add_neon: 3684.4
vp9_inv_dct_dct_32x32_sub12_add_neon: 5312.2
vp9_inv_dct_dct_32x32_sub16_add_neon: 5315.4
vp9_inv_dct_dct_32x32_sub20_add_neon: 7154.9
vp9_inv_dct_dct_32x32_sub24_add_neon: 7154.5
vp9_inv_dct_dct_32x32_sub28_add_neon: 8126.6
vp9_inv_dct_dct_32x32_sub32_add_neon: 8127.2
This is cherrypicked from libav commit
a63da4511d.
Signed-off-by: Martin Storsjö <martin@martin.st>
This allows reusing the macro for a separate implementation of the
pass2 function.
This is cherrypicked from libav commit
79d332ebbd.
Signed-off-by: Martin Storsjö <martin@martin.st>
This allows reusing the macro for a separate implementation of the
pass2 function.
This is cherrypicked from libav commit
47b3c2c18d.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/aarch64/vp9itxfm_neon.o from
19496 to 14740 bytes.
This gives a small slowdown of a couple of tens of cycles, but makes
it more feasible to add more optimized versions of these transforms.
Before:
vp9_inv_dct_dct_16x16_sub4_add_neon: 1036.7
vp9_inv_dct_dct_16x16_sub16_add_neon: 1372.2
vp9_inv_dct_dct_32x32_sub4_add_neon: 5180.0
vp9_inv_dct_dct_32x32_sub32_add_neon: 8095.7
After:
vp9_inv_dct_dct_16x16_sub4_add_neon: 1051.0
vp9_inv_dct_dct_16x16_sub16_add_neon: 1390.1
vp9_inv_dct_dct_32x32_sub4_add_neon: 5199.9
vp9_inv_dct_dct_32x32_sub32_add_neon: 8125.8
This is cherrypicked from libav commit
115476018d.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/arm/vp9itxfm_neon.o from
15324 to 12388 bytes.
This gives a small slowdown of a couple tens of cycles, up to around
150 cycles for the full case of the largest transform, but makes
it more feasible to add more optimized versions of these transforms.
Before: Cortex A7 A8 A9 A53
vp9_inv_dct_dct_16x16_sub4_add_neon: 2063.4 1516.0 1719.5 1245.1
vp9_inv_dct_dct_16x16_sub16_add_neon: 3279.3 2454.5 2525.2 1982.3
vp9_inv_dct_dct_32x32_sub4_add_neon: 10750.0 7955.4 8525.6 6754.2
vp9_inv_dct_dct_32x32_sub32_add_neon: 18574.0 17108.4 14216.7 12010.2
After:
vp9_inv_dct_dct_16x16_sub4_add_neon: 2060.8 1608.5 1735.7 1262.0
vp9_inv_dct_dct_16x16_sub16_add_neon: 3211.2 2443.5 2546.1 1999.5
vp9_inv_dct_dct_32x32_sub4_add_neon: 10682.0 8043.8 8581.3 6810.1
vp9_inv_dct_dct_32x32_sub32_add_neon: 18522.4 17277.4 14286.7 12087.9
This is cherrypicked from libav commit
0331c3f5e8.
Signed-off-by: Martin Storsjö <martin@martin.st>
This avoids concatenation, which can't be used if the whole macro
is wrapped within another macro.
This is also arguably more readable.
This is cherrypicked from libav commit
58d87e0f49.
Signed-off-by: Martin Storsjö <martin@martin.st>
The 'sqrt' and 'cbrt' scalers were added in commit
80262d8c86, but their symbolic option values
only made available to the showwaves filter, not showwavespic, despite
the scalers working properly by their numerical option values.
Signed-off-by: Moritz Barsnick <barsnick@gmx.net>
1. limit to single layer, as there is no current support for setting distortion/quality of multiple layers
2. encoder mode should be kept at default setting (0)
3. remove fixed_alloc parameter from context : seldom if ever used, and no way of properly configuring at the moment
4. add irreversible setting, to allow for lossless encoding. Set to OpenJPEG default (enabled)
5. set numresolution max to 33, which is the maximum number of allowed resolutions according the J2K spec
Signed-off-by: Michael Bradshaw <mjbshaw@google.com>
Make it clear that there is no timing-dependent behavior. In particular,
there is no state in which both input and output are denied, and where
you have to wait for a while yourself to make progress (apparently some
hardware decoders like to do this).
Avoid wording that makes references to time. It shouldn't be mistaken
for some kind of asynchronous API (like POSIX read() can return EAGAIN
if there is no new input yet). It's a state machine, so try to use
appropriate terms.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Merges Libav commit 8a60bba0ae.
Apparently the demuxer outputs the wrong padding for HE-AAC (based on
the raw sample rate, or so). aacdec contains a hack to adjust the muxer
padding accordingly before it's used to trim the decoder output. This
modified the packet side data, which in combination with the old
decoding API would change the packet the user passed to the decoder.
This is clearly not allowed, and it breaks running some gapless fate
tests with "-fflags +keepside" applied (without keepside, the packet
metadata is typically newly allocated, essentially making a copy and not
modifying the user's input packet).
This should probably be fixed in the demuxer (and consequently also the
muxer), but for now only fix the immediate problem.
Regression since 946ed78f5f (2012).
except filter_length == 1
odd filter_length gives worse frequency response,
even when compared with shorter filter_length
also makes build_filter simpler
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
This reverts commit faa9d29829.
This change became superfluous when support for C11 atomics was introduced.
Reverting it will make the removal of this implementation in an upcoming
merge conflict free.
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The constants used in the decoder used floating point precision,
and this caused different values to be generated on different
architectures.
So, eradicate floating point numbers and use fixed point (32.32)
arithmetics everywhere, replacing constants with precomputed integer
values.
Signed-off-by: Vittorio Giovara <vittorio.giovara at gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
The way videotoolbox hooks in as a hwaccel is pretty hacky. The VT decode
API is not invoked until end_frame(), so alloc_frame() returns a dummy
frame with a 1-byte buffer. When end_frame() is eventually called, the
dummy buffer is replaced with the actual decoded data from
VTDecompressionSessionDecodeFrame().
When the VT decoder fails, the frame returned to the h264 decoder from
alloc_frame() remains invalid and should not be used. Before
9747219958, it was accidentally being
returned all the way up to the API user. After that commit, the dummy
frame was unref'd so the user received an error.
However, since that commit, VT hwaccel failures started causing random
segfaults in the h264 decoder. This happened more often on iOS where the
VT implementation is more likely to throw errors on bitstream anomolies.
A recent report of this issue can be see in
http://ffmpeg.org/pipermail/libav-user/2016-November/009831.html
The issue here is that the dummy frame is still referenced internally by the
h264 decoder, as part of the reflist and cur_pic_ptr. Deallocating the
frame causes assertions like this one to trip later on during decoding:
Assertion h->cur_pic_ptr->f->buf[0] failed at src/libavcodec/h264_slice.c:1340
With this commit, we leave the dummy 1-byte frame intact, but avoid returning it
to the user.
This reverts commit 9747219958.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
Without this the FPU state becomes trashed and causes mysterious
fate failures with cpuflags=0
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Ever since the codecpar changes, this has been always printed when
opening a flv file. This is because the codecpar changes made all
streams to be added lazily as read_packet is called.
There is no reason that draining couldn't return an error or two. But
some decoders don't handle this very well, and might always return an
error. This can lead to API users getting into an infinite loop and
burning CPU, because no progress is made and EOF is never returned.
In fact, ffmpeg.c contains a hack against such a case. It is made
unnecessary with this commit, and removed with the next one. (This
particular error case seems to have been fixed since the hack was
added, though.)
This might lose frames if decoding returns errors during draining.
The code modifying the buffer on big endian systems was removed.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
The filter field is often used to check whether a filter is
configured. If configuring the filter actually fails somewhere in
the middle of it, these fields could still be set to non-NULL, which
lead to other code accessing the half-configured filter graph, which
in turn could lead to crashes within libavfilter.
Solve this by properly resetting all fields.
This was triggered by a fuzzed sample after the recent changes. It's
unknown whether this behavior could be triggered before that.
If a subtitle packet came before the first video frame could be fully
decoded, the subtitle packet would get discarded. This puts the subtitle
into a queue instead, and processes it once the attached filter graph is
initialized.
Be more careful when an input stream encounters EOF when its filtergraph
has not been configured yet. The current code would immediately mark the
corresponding output streams as finished, while there may still be
buffered frames waiting for frames to appear on other filtergraph
inputs.
This should fix the random FATE failures for complex filtergraph tests
after a3a0230a98
This merges Libav commit 94ebf55. It was previously skipped.
This is the last filter init related Libav commit that was skipped, so
this also removes the commits from doc/libav-merge.txt.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
This is a more appropriate place for it, and will also be useful in the
following commit.
This merges Libav commit d2e56cf. It was previously skipped.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
This makes sure the actual stream parameters are used, which is
important mainly for hardware decoding+filtering cases, which would
previously require various weird workarounds to handle the fact that a
fake software graph has to be constructed, but never used.
This should also improve behaviour in rare cases where
avformat_find_stream_info() does not provide accurate information.
This merges Libav commit a3a0230. It was previously skipped.
The code in flush_encoders() which sets up a "fake" format wasn't in
Libav. I'm not sure if it's a good idea, but it tends to give
behavior closer to the old one in certain corner cases.
The vp8-size-change gives different result, because now the size of
the first frame is used. libavformat reported the size of the largest
frame for some reason.
The exr tests now use the sample aspect ratio of the first frame. For
some reason libavformat determines 0/1 as aspect ratio, while the
decoder returns the correct one.
The ffm and mxf tests change the field_order values. I'm assuming
another libavformat/decoding mismatch.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
This will be useful in the following commit, after which the muxer
timebase is not always available when encoding.
This merges Libav commit 3e265ca. It was previously skipped.
There are some changes with how/when the mux_timebase field is set,
because the Libav approach often causes a too imprecise time base
to be set. This is hard, because the muxer's write_header function
can readjust the timebase, at which point we might already have
encoded packets buffered. (It might be better to buffer them after
the encoder, instead of after all the timestamp handling logic
before muxing.)
The two FATE tests change because the output time base is raised
for subtitles. (Needed to avoid certain rounding issues in other
cases.)
Includes a minor merge fix by Mark Thompson, and
avconv: Move rescale to stream timebase before monotonisation
also by Mark Thompson <sw@jkqxz.net>.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
Some callers (like do_subtitle_out()) call this with an AVPacket that is
not refcounted. This can cause undefined behavior.
Calling av_packet_move_ref() does not make a packet refcounted if it
isn't yet. (And it can't be made to, because it always succeeds,
and can't return ENOMEM.)
Call av_packet_ref() instead to make sure it's refcounted.
I couldn't find a case that is fixed by this with the current code. But
it will fix the fate-pva-demux test with the later patches applied.
Signed-off-by: wm4 <nfxjfg@googlemail.com>
Found a case where we use size==0, the other related commits
remain needed, and should be sufficient to fix the original issue
This reverts commit 7e4f32f4e4.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If AVVideotoolboxContext.cv_pix_fmt_type is set to 0, don't set the
kCVPixelBufferPixelFormatTypeKey value on the VT decoder.
This makes VT output its native format, which can be much faster on
some hardware iterations (if the native format does not match with
the requested format, it will be converted, which is slow).
The default is still forcing nv12.
Public fields were added after the private fields (negating the entire
point of this). New private fields go into AVStreamInternal anyway.
The new marker was set by guessing which fields are supposed to be
private and wshich not. recommended_encoder_configuration is accessed by
ffserver_config.c directly, and is supposed to use the public API.
ffmpeg.c accesses AVStream.cur_dts, even though it's a private field,
but that seems to be an older error.
Allow all struct fields to be accessed directly, as long as they're
public.
Before this change, many fields were "public", but could be accessed via
AVOption only. This meant they were effectively not public, but were
present for documentation purposes, which was incredibly confusing at
best.
qmin and qmax are not necessary for nvenc vbr.
Enforcing this constraint, doesn't allow user to use vbr 2 pass mode without explicity setting the qmin and qmax options
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
Rodger Combs will be added to the ffmpeg-security alias when this patch is applied
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
wm4 will be added to the ffmpeg-security alias when this patch is applied
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The size 0 special case causes side data to be created which is
different and a special case if for any reasons size = 0 is passed
Fixes: multiple runtime error: null pointer passed as argument 1, which is declared to never be null
Fixes: 653/clusterfuzz-testcase-5773837415219200
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: runtime error: shift exponent 34 is too large for 32-bit type 'int'
Fixes: 653/clusterfuzz-testcase-5773837415219200
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This should fix the fate failure due to a truncated last frame.
Alternatively the frame could be dropped.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
MSVC doesn't support the %s time format, and instead of returning an
error the invalid parameter handler is invoked which (by default)
terminates the process.
Reviewed-by:Steven Liu <lq@chinaffmpeg.org>
Signed-off-by: Hendrik Leppkes <h.leppkes@gmail.com>
refer to ticket id: #6170
rename file from temp to origin name after complete current segment
Reviewed-by: Aman Gupta <ffmpeg@tmm1.net>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Since the PVQ search has been well fuzzed and is guaranteed to never
break SUM(abs(y[])) == K, the assert is no longer needed.
Also the assert only prevented coding the wrong vector index but didn't
prevent crashes during searching for it, which made the assert rather
informational than practical.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since the probelm mentioned only happened when the phase was negative
(e.g. the sum had to be decreased), only discarding dimensions with a
zero pulse in that case restored the search's previously low distortion
at low Ks when the phase is never negative.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This is not necessarily specific to fuzzed files
Fixes: Multiple integer overflows
Fixes: 656/clusterfuzz-testcase-6463814516080640
Fixes: 658/clusterfuzz-testcase-6691260146384896
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes#6081. Some dictionary keys are not present on OS X 10.8.
This loads the symbols and uses a default value if not present.
Signed-off-by: Rick Kern <kernrj@gmail.com>
This ensures that the wrapped avframe will not get reallocated later, which
would invalidate internal references such as extended data.
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
When setting the channel layout directly using AVBufferSrcParameters
the channel layout was correctly set however the init function still
expected the old string format to set the number of channels (when it
hadn't already been specified).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The quantization table is stored in the natural order, but when we
access it, we use an index that's in zigzag order, causing us to read
the wrong value. This causes artifacts, especially in areas with
horizontal or vertical edges. The artifacts look a lot like the
DCT ringing artifacts you'd expect to see from a low-bitrate file,
but when comparing to NewTek's own decoder, it's obvious they're not
supposed to be there.
Fix by simply storing the scaled quantization table in zigzag order.
Performance is unchanged.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If the PVQ search picked a place to increment/decrement on the y[]
vector which had no pulse then it would cause a desync since it would
change the sum in the wrong direction. Fix this by not considering
places without pulses as viable.
This makes the PVQ search slightly worse at K < 5 which isn't all that
common. Still, this is a workaround to prevent making broken files until
I can think of a better way of fixing it.
Also add an assertion, which can be removed or moved to assert1/2 once
the PVQ search is stable.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This dts value can end up in the list in the absence of durations and is in that
case semantically identical to AV_NOPTS_VALUE. We can alternatively prevent
storing RELATIVE_TS_BASE if there is no duration.
Fixes Ticket3640
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Originally committed to x264 in 1637239a by Henrik Gramner who has
agreed to re-license it as LGPL. Original commit message follows.
x86: Avoid some bypass delays and false dependencies
A bypass delay of 1-3 clock cycles may occur on some CPUs when transitioning
between int and float domains, so try to avoid that if possible.
If there is progressive input it will disable deinterlacing in cuvid for
all future frames even those interlaced.
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
Fixes Ticket 6018
This fixes a regression, and allows playback of files containing mpeg4video that are otherwise
not supported
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When the http method is not set, the method will use POST for ts,
PUT for m3u8, it is not unify, now set it unify.
This ticket id: #5315
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Moritz Barsnick <barsnick@gmx.net>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Channel mapping 2 additionally supports a non-diegetic stereo track
appended to the end of a full-order ambisonics signal, such that the
total channel count is either
(n + 1) ^ 2, or
(n + 1) ^ 2 + 2
where n is the ambisonics order
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Skips using temporary files when outputting to a protocol other than
"file", which enables dash to output content over network
protocols. The logic has been copied from the HLS format.
Reviewed-by: Steven Liu <lingjiujianke@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This commit optimizes HTTP performance by reducing forward seeks, instead
favoring a read-ahead and discard on the current connection (referred to
as a short seek) for seeks that are within a TCP window's worth of data.
This improves performance because with TCP flow control, a window's worth
of data will be in the local socket buffer already or in-flight from the
sender once congestion control on the sender is fully utilizing the window.
Note: this approach doesn't attempt to differentiate from a newly opened
connection which may not be fully utilizing the window due to congestion
control vs one that is. The receiver can't get at this information, so we
assume worst case; that full window is in use (we did advertise it after all)
and that data could be in-flight
The previous behavior of closing the connection, then opening a new
with a new HTTP range value results in a massive amounts of discarded
and re-sent data when large TCP windows are used. This has been observed
on MacOS/iOS which starts with an initial window of 256KB and grows up to
1MB depending on the bandwidth-product delay.
When seeking within a window's worth of data and we close the connection,
then open a new one within the same window's worth of data, we discard
from the current offset till the end of the window. Then on the new
connection the server ends up re-sending the previous data from new
offset till the end of old window.
Example (assumes full window utilization):
TCP window size: 64KB
Position: 32KB
Forward seek position: 40KB
* (Next window)
32KB |--------------| 96KB |---------------| 160KB
*
40KB |---------------| 104KB
Re-sent amount: 96KB - 40KB = 56KB
For a real world test example, I have MP4 file of ~25MB, which ffplay
only reads ~16MB and performs 177 seeks. With current ffmpeg, this results
in 177 HTTP GETs and ~73MB worth of TCP data communication. With this
patch, ffmpeg issues 4 HTTP GETs and 3 seeks for a total of ~22MB of TCP data
communication.
To support this feature, the short seek logic in avio_seek() has been
extended to call a function to get the short seek threshold value. This
callback has been plumbed to the URLProtocol structure, which now has
infrastructure in HTTP and TCP to get the underlying receiver window size
via SO_RCVBUF. If the underlying URL and protocol don't support returning
a short seek threshold, the default s->short_seek_threshold is used
This feature has been tested on Windows 7 and MacOS/iOS. Windows support
is slightly complicated by the fact that when TCP window auto-tuning is
enabled, SO_RCVBUF doesn't report the real window size, but it does if
SO_RCVBUF was manually set (disabling auto-tuning). So we can only use
this optimization on Windows in the later case
Signed-off-by: Joel Cunningham <joel.cunningham@me.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This marks the first time anyone has written an Opus encoder without
using any libopus code. The aim of the encoder is to prove how far
the format can go by writing the craziest encoder for it.
Right now the encoder's basic, it only supports CBR encoding, however
internally every single feature the CELT layer has is implemented
(except the pitch pre-filter which needs to work well with the rest of
whatever gets implemented). Psychoacoustic and rate control systems are
under development.
The encoder takes in frames of 120 samples and depending on the value of
opus_delay the plan is to use the extra buffered frames as lookahead.
Right now the encoder will pick the nearest largest legal frame size and
won't use the lookahead, but that'll change once there's a
psychoacoustic system.
Even though its a pretty basic encoder its already outperforming
any other native encoder FFmpeg has by a huge amount.
The PVQ search algorithm is faster and more accurate than libopus's
algorithm so the encoder's performance is close to that of libopus
at zero complexity (libopus has more SIMD).
The algorithm might be ported to libopus or other codecs using PVQ in
the future.
The encoder still has a few minor bugs, like desyncs at ultra low
bitrates (below 9kbps with 20ms frames).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This is meant to be applied on top of my previous patch which
split PVQ into celt_pvq.c and made opus_celt.h
Essentially nothing has been changed other than renaming CeltFrame
to CeltBlock (CeltFrame had absolutely nothing at all to do with
a frame) and CeltContext to CeltFrame.
3 variables have been put in CeltFrame as they make more sense
there rather than being passed around as arguments.
The coefficients have been moved to the CeltBlock structure
(why the hell were they in CeltContext and not in CeltFrame??).
Now the encoder would be able to use the exact context the decoder
uses (plus a couple of extra fields in there).
FATE passes, no slowdowns, etc.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
A huge amount can be reused by the encoder, as the only thing
which needs to be done would be to add a 10 line celt_icwrsi,
a wrapper around it (celt_alg_quant) and templating the
ff_celt_decode_band to replace entropy decoding functions
with entropy encoding.
There is no performance loss but in fact a performance gain of
around 6% which is caused by the compiler being able to optimize
the decoding more efficiently.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Handles strides (needed for Opus transients), does pre-reindexing and folding
without needing a copy.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Mostly used the RFC document, the decoding functions and
the reference encoder's implmenentation as a reference.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
A strict reading of the spec seems to imply that it should be aligned to
the start of the element instance tag, but that would break all of the
samples with PCEs.
It seems like a well formed LATM stream should have its PCE in the ASC
rather than inband.
Fixes ticket 4544
D3D9Ex uses different driver paths. This helps with "headless"
configurations when no user logs in. Plain D3D9 device creation will
fail if no user is logged in, while it works with D3D9Ex.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Merges Libav commit c2f97f0508.
This is an extended version of the AVFrame.opaque field, which can be
used to attach arbitrary user information to an AVFrame.
The usefulness of the opaque field is rather limited, because it can
store only up to 32 bits of information (or 64 bit on 64 bit systems).
It's not possible to set this field to a memory allocation, because
there is no way to deallocate it correctly.
The opaque_ref field circumvents this by letting the user set an
AVBuffer, which makes the user data refcounted.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Merges Libav commit 04f3bd3496.
hls-encoder currenlty does not provide stream level metadata to mpegts
muxer. This patch fixes track #3848 bug.
Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
This enables having multiple tracks of the same type which would
be treated as different things by the media server (as opposed to
different bit rate versions of the same track). According to the
smooth streaming specification, just setting the systemLanguage
tag is not enough to note that a track with the same attributes
differs from another one.
Reviewed-by: Martin
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When user use the hls_wrap, there have many problem:
1. some platform refersh the old but usefull segment
2. CDN(Content Delivery Network) Deliver HLS not friendly
The hls_wrap is used to wrap segments for use little space,
now user can use hls_list_size and hls_flags delete_segments
instead it.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Carl Eugen Hoyos <ceffmpeg@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
This way it's clear the size field accounts for the footer length plus every
tag entry, but not the header.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The size field in the header/footer accounts for the entire APE tag
structure except the 32 bytes from header, for compatibility with
APEv1.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
According to the spec[1], a value of 0 means the footer is present and a value
of 1 means it's absent, the exact opposite of header presence flag where 1
means present and 0 absent.
The reason for this is compatibility with APEv1 tags, where there's no header,
footer presence was mandatory for all files, and the flags field was a zeroed
reserved field.
[1] http://wiki.hydrogenaud.io/index.php?title=Ape_Tags_Flags
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
This limits the bugs, speedloss and extra memory allocation to the case when
optimal tables are needed.
Fixes regressions with slice multi-threading
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If this is wanted iam not against it but it must be designed to work with all cases
like slice threads, and a single growing buffer does not work very well with slices.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Only do this when building for a recent VAAPI version - initial
driver implementations were confused about the interpretation of the
framerate field, but hopefully this will be consistent everywhere
once 0.40.0 is released.
(cherry picked from commit ff35aa8ca4)
Default to using VBR when a target bitrate is set, unless the max rate
is also set and matches the target. Changes to the Intel driver mean
that min_qp is also respected in this case, so set a codec default to
unset the value rather than using the current default inherited from
the MPEG-4 part 2 encoder.
(cherry picked from commit eddfb57210)
This includes a backward-compatibility hack to choose CBR anyway on
old drivers which have no CBR support, so that existing programs will
continue to work their options now map to VBR.
(cherry picked from commit f033ba470f)
Before this change, it was possible to overflow pic_order_cnt_lsb and
generate a stream with invalid POC numbering. This makes sure that
the field is large enough that a single IDR B* P sequence uses fewer
than half the available POC lsb values.
(cherry picked from commit 89725a8512)
This change makes the configured GOP size be respected exactly -
previously the value could be exceeded slightly due to flaws in the
frame type selection logic.
(cherry picked from commit 37fab0661a)
Core of patch is from paul@paulmehta.com
Reference https://crbug.com/643951
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Check value reduced as the code does not support values beyond INT_MAX
Also the check is moved to a more common place and before integer truncation
Adds a `-hls_flags +temp_file` which will write segment data to
filename.tmp, and then rename to filename when the segment is complete.
This patch is similar in spirit to one used in Plex's ffmpeg fork, and
allows a transcoding webserver to ensure incomplete segment files are
never served up accidentally.
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Reviewed-by: Bodecs Bela <bodecsb@vivanet.hu>
Signed-off-by: Aman Gupta <aman@tmm1.net>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Blocks are marked as key frames whenever the "reference" field is
zero. This breaks for non-keyframe Blocks with a reference timestamp
of zero.
The likelihood of reference timestamp being zero is increased by a
longstanding bug in muxing that encodes reference timestamp as the
absolute time of the referenced frame (rather than relative to the
current Block timestamp, as described in MKV spec).
Now using INT64_MIN to denote "no reference".
Reported to chromium at http://crbug.com/497889 (contains sample)
The original code is correctly following the API - vaTerminate() must
be called to free the resources of a VADisplay after it is created by
any of the vaGetDisplay*() calls; it is not necessary to have
successfully called vaInitialize() on it. The segfaults which
prompted this change must therefore be bugs in libva or the driver it
loads.
This reverts commit 3606602f11.
Detecting a leap second depends on a lot of things, segment time, segment
offset, system leap second implementation, the removed part is a huge
simplification which can be misleading, so it is best to remove it.
Signed-off-by: Marton Balint <cus@passwd.hu>
Not starting a new segment if the elapsed microsecs since the start of the day
equals the the elapsed microsecs since the start of the day at the time of the
last cut seems plain wrong to me, Deti do you remember the original reason
behind this check?
Signed-off-by: Marton Balint <cus@passwd.hu>
Without the /UTF-8 switch, the MSVC compiler treats all files as in the
system codepage, instead of in UTF-8, which causes UTF-8 string literals
to be interpreted wrong.
This switch was only introduced in VS2015 Update 2, and any earlier
versions do not have an equivalent solution.
Fixes fate-sub-scc on MSVC 2015+
Fixes out of array read
Fixes: 544/clusterfuzz-testcase-5936536407244800.f8bd9b24_8ba77916_70c2c7be_3df6a2ea_96cd9f14
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This fixes ubsan warnings in non debug builds by using unsigned operations
in debug builds the correct signed operations are retained so that overflows
(which should not occur in valid files and may indicate problems in the DSP code
or decoder) can be detected.
Alternatively they can be changed to unsigned unconditionally, then its
not possible though to detect overflows easily if someone wants to test
the DSP code for overflows.
The 2nd alternative would be to leave the code as it is and accept that
there are undefined operations in the DSP code and that ubsan output is
full of them in some cases.
Similar changes would be needed in some other DSP routines
Suggested-by: Matt Wolenetz <wolenetz@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Clarify that setting loop=0 is required to make the stream loop infinitely, rather than saying that a value "less than 1" is needed.
Signed-off-by: Lou Logan <lou@lrcd.com>
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000aff8a4 in vaTerminate ()
#1 0x0000000000ae50ce in vaapi_device_free (ctx=<optimized out>) at libavutil/hwcontext_vaapi.c:882
#2 0x0000000000ae1f9e in hwdevice_ctx_free (opaque=<optimized out>, data=<optimized out>) at libavutil/hwcontext.c:66
#3 0x0000000000ad856f in buffer_replace (src=0x0, dst=0x7fffa26ef1b8) at libavutil/buffer.c:119
#4 av_buffer_unref (buf=buf@entry=0x7fffa26ef1f8) at libavutil/buffer.c:129
#5 0x0000000000ae299f in av_hwdevice_ctx_create (pdevice_ref=0x170ac50 <hw_device_ctx>, type=type@entry=AV_HWDEVICE_TYPE_VAAPI, device=<optimized out>,
opts=opts@entry=0x0, flags=flags@entry=0) at libavutil/hwcontext.c:494
#6 0x0000000000400968 in vaapi_device_init (device=<optimized out>) at ffmpeg_vaapi.c:223
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Implements support for height/width expressions in vf_scale_vaapi,
by refactoring common code into a new libavfilter/scale.c
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Change the encoding of the original developer name from ISO-8859-1 to UTF-8.
Remove the stale/completed TODO list.
Fix two small typos.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '7f549b8338ed3775fec4bf10421ff5744e5866dd':
riff: don't overwrite bps from WAVEFORMATEX if EXTENSIBLE doesn't contain that data.
Only cosmetics, the change was already present.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '6135c3b61e084be93c0876cecd06f4e764f961c0':
Revert "avprobe: Zero the allocated avio buffer memory"
This commit is a noop, see 591cf8aa0e
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'ed9b2a5178d7a7c5a95694da3a808af327f36aff':
mov: Rework the check for invalid indexes in stsc
This commit is a noop, see 3c058f5701.
The proposed fix breaks seeking in multiple_stsd.mp4 (ticket #3962) and
playback of wwwq_cut.mp4 (ticket #2991).
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
Certain alpha run lengths (for SHQ1/SHQ3/SHQ5) could be stored in
both long and short versions, and we would only accept the short version,
returning -1 (invalid code) for the others. This could cause an
out-of-bounds write on malicious input, as discovered by
Andreas Cadhalpun during fuzzing.
Fix by simply allowing both versions, leaving no invalid codes
in the alpha VLC.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Multichannel joint stereo simply interleaves stereo pairs (6ch: 2ch + 2ch + 2ch), so each pair is decoded separatedly.
***
To test my changes, I converted examples to wav with ffmpeg.exe (old and new), and compared them to see they are byte-exact.
Regular 2ch files (JS and normal) were straightforward to test.
For multichannel, to check each JS pair is correctly decoded separatedly I did:
- manually demux 6ch.msf into 3 pairs and convert them (2ch_1.wav + 2ch_2.wav + 2ch_3.wav)
- convert the 6ch.msf file to wav (with my changes)
- manually demux the 6ch.wav into 3 pairs (6ch_d1.wav + 6ch_d2.wav + 6ch_d3.wav)
- compare each pair (ex. 2ch_3.wav vs 6ch_d3.wav): all pairs are byte-exact.
The new code just processes each JS pair separatedly, there are no algorithm changes.
It could be improved a bit but I'm not sure about typical styles.
I've only seen 6ch .MSF (probably the AT3 spec only supports 2ch audio).
Signed-off-by: bnnm <bananaman255@gmail.com>
Fixes: u263_b-frames_1.avi
Fixes part of Ticket1536
return -1 is used here as it is used in similar code in this function, I intend
to replace it by proper error codes in the whole function.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '90bc423212396e96a02edc1118982ab7f7766a63':
mov: Wrap stsc index and count compare in a separate function
The mov_stsc_index_valid() function is replaced with a macro to prevent
signdness issues (index is not always signed, and count is always
unsigned currently).
The comparison is also adjusted to reduce the risk of overflows.
Merged-by: Clément Bœsch <u@pkh.me>
Retain the ranges of frame indexes when applying edit list in
mov_fix_index. The index ranges are then used to keep track of the frame
index of the current sample. In case of a discontinuity in frame indexes
due to edit, update the auxiliary info position accordingly.
Reviewed-by: Sasi Inguva <isasi@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '209ee680ce99035202520b900326a57f7fa0aceb':
mov: Fix stsc_count comparison
This commit is a noop, see 3c058f5701
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'a1f6a2dfdaf9beb42ca66e49d10bfaf5905a0128':
ratecontrol: Reorder functions to avoid forward declarations
Merged, but this seems to break the clear separation of 1-pass vs
2-pass.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'd639dcdae022130078c9c84b7b691c5e9694786c':
ratecontrol: Move Xvid-related functions to the place they are actually used
Merged-by: Clément Bœsch <u@pkh.me>
* commit '44972e227df0f7ad5aa9004d971fb54e9dc5c849':
ratecontrol: Move mpegenc-only function where it is used
This commit is a noop. ff_write_pass1_stats() is used in snowenc as
well.
Merged-by: Clément Bœsch <u@pkh.me>
The code relies on their validity and otherwise can try to access a NULL
object->rle pointer, causing segmentation faults.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
The assumption this is based on is wrong, the code is not always run with bitexact flags
This reverts commit a956164e1e, reversing
changes made to f6005907fd.
Approved-by: James Almer <jamrial@gmail.com>
* commit 'd06dfaa5cbdd20acfd2364b16c0f4ae4ddb30a65':
x86: huffyuv: Use EXTERNAL_SSSE3_FAST convenience macro where appropriate
Merged-by: James Almer <jamrial@gmail.com>
* commit '8e9cd81d291b1010c625b2766058aadf4affb537':
x86: cpu: Detect Conroe CPUs and their slow shuffle unit
Merged-by: James Almer <jamrial@gmail.com>
* commit '7d7355aa92bb36ca0765c49a569a999bcb96f332':
x86: Add SSSE3_SLOW CPU flag and related convenience macros
Merged-by: James Almer <jamrial@gmail.com>
* commit '4efab89332ea39a77145e8b15562b981d9dbde68':
x86: Use *_FAST/*_SLOW CPU feature detection macros where appropriate
Merged-by: James Almer <jamrial@gmail.com>
* commit '0a39c9ac0bfd7345fe676b4e2707d9cec3cbb553':
x86: hpeldsp: Don't check for bitexact flag when initializing VP3-specific code
Merged-by: James Almer <jamrial@gmail.com>
* commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5':
x86: hpeldsp: Split off VP3-specific bits into a separate file
Merged-by: James Almer <jamrial@gmail.com>
* commit '0e0538aefc75958ded49f5d075c99a81cf6b2bbb':
avprobe: Zero the allocated avio buffer memory
This commit is a noop, no such thing exists in ffprobe.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'e344e65109f1a75ca82aff4cecec44e79197753c':
h264dec: do not call finish_setup() if we have not started a frame
This commit is a noop, see bdbbb8f11e
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '1f7b4f9abc6bae94e576e710b8d10117ca3c8238':
h264dec: make sure not to call finish_setup() more than once per frame
This commit is a noop, see bdbbb8f11e
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'da917fcf5183ed249ad1285b8edd330f421376c4':
avconv_dxva2: add a profile check for hevc
This commit is a noop, see a655bc8344
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '1ecb63cd1c1a4ddc5efed4abbc3158b969d8c5e4':
hevc: set profile based on the profile compatibility flags if needed
This commit is a noop, see f85cc3bf12
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'fca3c3b61952aacc45e9ca54d86a762946c21942':
hevc: Add AVX2 DC IDCT
Mostly noop as we already have that code.
In the ASM, code is merged with the exception of SECTION which is kept
uppercase for consistency with the rest of the codebase.
Still in the ASM, the prototype comment is fixed to honor the '_' added
from the original commit.
idct_dc_proto() is dropped as it's not used anymore here.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit 'cc16da75c2f99d92f7a6461100f041352deb6d88':
hevc: Add coefficient limiting to speed up IDCT
Noop again as we have these changes already, only random spacing
changes.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '4f247de3b797cdc9d243d26534412f81c306e5b5':
hevcdsp_template: Templatize IDCT
This commit is a noop as we already have that code from a previous
commits (see 92cccb7bcd).
Spacing is adjusted to reduce the diff.
Merged-by: Clément Bœsch <cboesch@gopro.com>
* commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d':
hevc: Separate adding residual to prediction from IDCT
This commit should be a noop but isn't because of the following renames:
- transform_add → add_residual
- transform_skip → dequant
- idct_4x4_luma → transform_4x4_luma
Merged-by: Clément Bœsch <cboesch@gopro.com>
Allows the user to reserve space for the ODML master index. A sufficient
sized master index in the AVI header avoids storing follow-up master
indexes within the 'movi' data later. If the option is omitted or zero
the index size is estimated from output duration and bitrate.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
Codec 4 (frame size 98) uses joint stereo per spec and examples.
Also removed an incorrect "align" var which wasn't used anyway (it was overwrittern).
Probably all/only .AT3 of frame size 98 are JS, too.
Signed-off-by: bnnm <bananaman255@gmail.com>
Instead of just updating statistics and leaving the work to the
call site, have it actually do the work.
Also: skip the samples by updating the frame data pointers
instead of moving the samples. More efficient and avoid writing
into shared frames.
Found-By: Muhammad Faiz <mfcc64@gmail.com>
Name and purpose are more appropriate there since the code isn't
an ideal example.
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This allows testing EC and non EC. Avoids spending most time in EC on
high res samples and reduces the likelyhood of hitting timeouts
Fixes: Timeout in 467/fuzz-2-ffmpeg_VIDEO_AV_CODEC_ID_H263_fuzzer
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When detecting a swapped AC3 marker the data of the frame is swapped. However, in subsequent frames the data swapped is taken from the first frame rather than the current frame.
Signed-off-by: Marijn Meijles <marijn@bitpit.net>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
From e24d95c0e06a878d401ee34fd6742fcaddeeb95f Mon Sep 17 00:00:00 2001
From: Joel Cunningham <joel.cunningham@me.com>
Date: Mon, 9 Jan 2017 13:37:51 -0600
Subject: [PATCH] tcp: set socket buffer sizes before listen/connect/accept
Attempting to set SO_RCVBUF and SO_SNDBUF on TCP sockets after connection
establishment is incorrect and some stacks ignore the set call on the socket at
this point. This has been observed on MacOS/iOS. Windows 7 has some peculiar
behavior where setting SO_RCVBUF after applies only if the buffer is increasing
from the default while decreases are ignored. This is possibly how the incorrect
usage has gone unnoticed
Unix Network Programming Vol. 1: The Sockets Networking API (3rd edition, seciton 7.5):
"When setting the size of the TCP socket receive buffer, the ordering of the
function calls is important. This is because of TCP's window scale option,
which is exchanged with the peer on SYN segments when the connection is
established. For a client, this means the SO_RCVBUF socket option must be
set before calling connect. For a server, this means the socket option must
be set for the listening socket before calling listen. Setting this option
for the connected socket will have no effect whatsoever on the possible window
scale option because accept does not return with the connected socket until
TCP's three-way handshake is complete. This is why the option must be set on
the listening socket. (The sizes of the socket buffers are always inherited from
the listening socket by the newly created connected socket)"
Signed-off-by: Joel Cunningham <joel.cunningham@me.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When bytes_read overflowed, last_bytes_read did not yet overflow
and no bytes-read report was created leading to a timeout.
Analyzed-by: Thomas Bernhard
Fixes ticket #5836.
Current code returned the number of channels as channel layout in that case,
and if nret is not set then unknown layouts are typically not supported.
Also use the common parsing code. Use a temporary workaround to parse an
unknown channel layout such as '13c', after a 1 year grace period only '13C'
will work.
Signed-off-by: Marton Balint <cus@passwd.hu>
Return a channel layout and the number of channels based on the specified name.
This function is similar to av_get_channel_layout(), but can also parse unknown
channel layout specifications.
Unknown channel layout specifications are a decimal number and a capital 'C'
suffix, in order to not break compatibility with the lowercase 'c' suffix,
which is used for a guessed channel layout with the specified number of
channels.
Signed-off-by: Marton Balint <cus@passwd.hu>
This work is sponsored by, and copyright, Google.
This is similar to the arm version, but due to the larger registers
on aarch64, we can do 8 pixels at a time for all filter sizes.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
ARM AArch64
vp9_loop_filter_h_4_8_10bpp_neon: 213.2 172.6
vp9_loop_filter_h_8_8_10bpp_neon: 281.2 244.2
vp9_loop_filter_h_16_8_10bpp_neon: 657.0 444.5
vp9_loop_filter_h_16_16_10bpp_neon: 1280.4 877.7
vp9_loop_filter_mix2_h_44_16_10bpp_neon: 397.7 358.0
vp9_loop_filter_mix2_h_48_16_10bpp_neon: 465.7 429.0
vp9_loop_filter_mix2_h_84_16_10bpp_neon: 465.7 428.0
vp9_loop_filter_mix2_h_88_16_10bpp_neon: 533.7 499.0
vp9_loop_filter_mix2_v_44_16_10bpp_neon: 271.5 244.0
vp9_loop_filter_mix2_v_48_16_10bpp_neon: 330.0 305.0
vp9_loop_filter_mix2_v_84_16_10bpp_neon: 329.0 306.0
vp9_loop_filter_mix2_v_88_16_10bpp_neon: 386.0 365.0
vp9_loop_filter_v_4_8_10bpp_neon: 150.0 115.2
vp9_loop_filter_v_8_8_10bpp_neon: 209.0 175.5
vp9_loop_filter_v_16_8_10bpp_neon: 492.7 345.2
vp9_loop_filter_v_16_16_10bpp_neon: 951.0 682.7
This is significantly faster than the ARM version in almost
all cases except for the mix2 functions.
Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 2-3x.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
Compared to the arm version, on aarch64 we can keep the full 8x8
transform in registers, and for 16x16 and 32x32, we can process
it in slices of 4 pixels instead of 2.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
ARM AArch64
vp9_inv_adst_adst_4x4_sub4_add_10_neon: 111.0 109.7
vp9_inv_adst_adst_8x8_sub8_add_10_neon: 914.0 733.5
vp9_inv_adst_adst_16x16_sub16_add_10_neon: 5184.0 3745.7
vp9_inv_dct_dct_4x4_sub1_add_10_neon: 65.0 65.7
vp9_inv_dct_dct_4x4_sub4_add_10_neon: 100.0 96.7
vp9_inv_dct_dct_8x8_sub1_add_10_neon: 111.0 119.7
vp9_inv_dct_dct_8x8_sub8_add_10_neon: 618.0 494.7
vp9_inv_dct_dct_16x16_sub1_add_10_neon: 295.1 284.6
vp9_inv_dct_dct_16x16_sub2_add_10_neon: 2303.2 1883.9
vp9_inv_dct_dct_16x16_sub8_add_10_neon: 2984.8 2189.3
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 3890.0 2799.4
vp9_inv_dct_dct_32x32_sub1_add_10_neon: 1044.4 1012.7
vp9_inv_dct_dct_32x32_sub2_add_10_neon: 13333.7 9695.1
vp9_inv_dct_dct_32x32_sub16_add_10_neon: 18531.3 12459.8
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 24470.7 16160.2
vp9_inv_wht_wht_4x4_sub4_add_10_neon: 83.0 79.7
The larger transforms are significantly faster than the corresponding
ARM versions.
The speedup vs C code is smaller than in 32 bit mode, probably
because the 64 bit intermediates in the C code can be expressed
more efficiently in aarch64.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This has mostly got the same differences to the 8 bit version as
in the arm version. For the horizontal filters, we do 16 pixels
in parallel as well. For the 8 pixel wide vertical filters, we can
accumulate 4 rows before storing, just as in the 8 bit version.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
ARM AArch64
vp9_avg4_10bpp_neon: 35.7 30.7
vp9_avg8_10bpp_neon: 93.5 84.7
vp9_avg16_10bpp_neon: 324.4 296.6
vp9_avg32_10bpp_neon: 1236.5 1148.2
vp9_avg64_10bpp_neon: 4639.6 4571.1
vp9_avg_8tap_smooth_4h_10bpp_neon: 130.0 128.0
vp9_avg_8tap_smooth_4hv_10bpp_neon: 440.0 440.5
vp9_avg_8tap_smooth_4v_10bpp_neon: 114.0 105.5
vp9_avg_8tap_smooth_8h_10bpp_neon: 327.0 314.0
vp9_avg_8tap_smooth_8hv_10bpp_neon: 918.7 865.4
vp9_avg_8tap_smooth_8v_10bpp_neon: 330.0 300.2
vp9_avg_8tap_smooth_16h_10bpp_neon: 1187.5 1155.5
vp9_avg_8tap_smooth_16hv_10bpp_neon: 2663.1 2591.0
vp9_avg_8tap_smooth_16v_10bpp_neon: 1107.4 1078.3
vp9_avg_8tap_smooth_64h_10bpp_neon: 17754.6 17454.7
vp9_avg_8tap_smooth_64hv_10bpp_neon: 33285.2 33001.5
vp9_avg_8tap_smooth_64v_10bpp_neon: 16066.9 16048.6
vp9_put4_10bpp_neon: 25.5 21.7
vp9_put8_10bpp_neon: 56.0 52.0
vp9_put16_10bpp_neon/armv8: 183.0 163.1
vp9_put32_10bpp_neon/armv8: 678.6 563.1
vp9_put64_10bpp_neon/armv8: 2679.9 2195.8
vp9_put_8tap_smooth_4h_10bpp_neon: 120.0 118.0
vp9_put_8tap_smooth_4hv_10bpp_neon: 435.2 435.0
vp9_put_8tap_smooth_4v_10bpp_neon: 107.0 98.2
vp9_put_8tap_smooth_8h_10bpp_neon: 303.0 290.0
vp9_put_8tap_smooth_8hv_10bpp_neon: 893.7 828.7
vp9_put_8tap_smooth_8v_10bpp_neon: 305.5 263.5
vp9_put_8tap_smooth_16h_10bpp_neon: 1089.1 1059.2
vp9_put_8tap_smooth_16hv_10bpp_neon: 2578.8 2452.4
vp9_put_8tap_smooth_16v_10bpp_neon: 1009.5 933.5
vp9_put_8tap_smooth_64h_10bpp_neon: 16223.4 15918.6
vp9_put_8tap_smooth_64hv_10bpp_neon: 32153.0 31016.2
vp9_put_8tap_smooth_64v_10bpp_neon: 14516.5 13748.1
These are generally about as fast as the corresponding ARM
routines on the same CPU (at least on the A53), in most cases
marginally faster.
The speedup vs C code is around 4-9x.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This is more in line with how it will be extended for more bitdepths.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This is pretty much similar to the 8 bpp version, but in some senses
simpler. All input pixels are 16 bits, and all intermediates also fit
in 16 bits, so there's no lengthening/narrowing in the filter at all.
For the full 16 pixel wide filter, we can only process 4 pixels at a time
(using an implementation very much similar to the one for 8 bpp),
but we can do 8 pixels at a time for the 4 and 8 pixel wide filters with
a different implementation of the core filter.
Examples of relative speedup compared to the C version, from checkasm:
Cortex A7 A8 A9 A53
vp9_loop_filter_h_4_8_10bpp_neon: 1.83 2.16 1.40 2.09
vp9_loop_filter_h_8_8_10bpp_neon: 1.39 1.67 1.24 1.70
vp9_loop_filter_h_16_8_10bpp_neon: 1.56 1.47 1.10 1.81
vp9_loop_filter_h_16_16_10bpp_neon: 1.94 1.69 1.33 2.24
vp9_loop_filter_mix2_h_44_16_10bpp_neon: 2.01 2.27 1.67 2.39
vp9_loop_filter_mix2_h_48_16_10bpp_neon: 1.84 2.06 1.45 2.19
vp9_loop_filter_mix2_h_84_16_10bpp_neon: 1.89 2.20 1.47 2.29
vp9_loop_filter_mix2_h_88_16_10bpp_neon: 1.69 2.12 1.47 2.08
vp9_loop_filter_mix2_v_44_16_10bpp_neon: 3.16 3.98 2.50 4.05
vp9_loop_filter_mix2_v_48_16_10bpp_neon: 2.84 3.64 2.25 3.77
vp9_loop_filter_mix2_v_84_16_10bpp_neon: 2.65 3.45 2.16 3.54
vp9_loop_filter_mix2_v_88_16_10bpp_neon: 2.55 3.30 2.16 3.55
vp9_loop_filter_v_4_8_10bpp_neon: 2.85 3.97 2.24 3.68
vp9_loop_filter_v_8_8_10bpp_neon: 2.27 3.19 1.96 3.08
vp9_loop_filter_v_16_8_10bpp_neon: 3.42 2.74 2.26 4.40
vp9_loop_filter_v_16_16_10bpp_neon: 2.86 2.44 1.93 3.88
The speedup vs C code measured in checkasm is around 1.1-4x.
These numbers are quite inconclusive though, since the checkasm test
runs multiple filterings on top of each other, so later rounds might
end up with different codepaths (different decisions on which filter
to apply, based on input pixel differences).
Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 2-4x.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This is structured similarly to the 8 bit version. In the 8 bit
version, the coefficients are 16 bits, and intermediates are 32 bits.
Here, the coefficients are 32 bit. For the 4x4 transforms for 10 bit
content, the intermediates also fit in 32 bits, but for all other
transforms (4x4 for 12 bit content, and 8x8 and larger for both 10
and 12 bit) the intermediates are 64 bit.
For the existing 8 bit case, the 8x8 transform fit all coefficients in
registers; for 10/12 bit, when the coefficients are 32 bit, the 8x8
transform also has to be done in slices of 4 pixels (just as 16x16 and
32x32 for 8 bit).
The slice width also shrinks from 4 elements to 2 elements in parallel
for the 16x16 and 32x32 cases.
The 16 bit coefficients from idct_coeffs and similar tables also need
to be lenghtened to 32 bit in order to be used in multiplication with
vectors with 32 bit elements. This leads to the fixed coefficient
vectors needing more space, leading to more cases where they have to
be reloaded within the transform (in iadst16).
This technically would need testing in checkasm for subpartitions
in increments of 2, but that slows down normal checkasm runs
excessively.
Examples of relative speedup compared to the C version, from checkasm:
Cortex A7 A8 A9 A53
vp9_inv_adst_adst_4x4_sub4_add_10_neon: 4.83 11.36 5.22 6.77
vp9_inv_adst_adst_8x8_sub8_add_10_neon: 4.12 7.60 4.06 4.84
vp9_inv_adst_adst_16x16_sub16_add_10_neon: 3.93 8.16 4.52 5.35
vp9_inv_dct_dct_4x4_sub1_add_10_neon: 1.36 2.57 1.41 1.61
vp9_inv_dct_dct_4x4_sub4_add_10_neon: 4.24 8.66 5.06 5.81
vp9_inv_dct_dct_8x8_sub1_add_10_neon: 2.63 4.18 1.68 2.87
vp9_inv_dct_dct_8x8_sub4_add_10_neon: 4.52 9.47 4.24 5.39
vp9_inv_dct_dct_8x8_sub8_add_10_neon: 3.45 7.34 3.45 4.30
vp9_inv_dct_dct_16x16_sub1_add_10_neon: 3.56 6.21 2.47 4.32
vp9_inv_dct_dct_16x16_sub2_add_10_neon: 5.68 12.73 5.28 7.07
vp9_inv_dct_dct_16x16_sub8_add_10_neon: 4.42 9.28 4.24 5.45
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 3.41 7.29 3.35 4.19
vp9_inv_dct_dct_32x32_sub1_add_10_neon: 4.52 8.35 3.83 6.40
vp9_inv_dct_dct_32x32_sub2_add_10_neon: 5.86 13.19 6.14 7.04
vp9_inv_dct_dct_32x32_sub16_add_10_neon: 4.29 8.11 4.59 5.06
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 3.31 5.70 3.56 3.84
vp9_inv_wht_wht_4x4_sub4_add_10_neon: 1.89 2.80 1.82 1.97
The speedup compared to the C functions is around 1.3 to 7x for the
full transforms, even higher for the smaller subpartitions.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
The plain pixel put/copy functions are used from the 8 bit version,
for the double size (e.g. put16 uses ff_vp9_copy32_neon), and a new
copy128 is added.
Compared with the 8 bit version, the filters can no longer use the
trick to accumulate in 16 bit with only saturation at the end, but now
the accumulators need to be 32 bit. This avoids the need to keep track
of which filter index is the largest though, reducing the size of the
executable code for these filters.
For the horizontal filters, we only do 4 or 8 pixels wide in parallel
(while doing two rows at a time), since we don't have enough register
space to filter 16 pixels wide.
For the vertical filters, we still do 4 and 8 pixels in parallel just
as in the 8 bit case, but we need to store the output after every 2
rows instead of after every 4 rows.
Examples of relative speedup compared to the C version, from checkasm:
Cortex A7 A8 A9 A53
vp9_avg4_10bpp_neon: 2.25 2.44 3.05 2.16
vp9_avg8_10bpp_neon: 3.66 8.48 3.86 3.50
vp9_avg16_10bpp_neon: 3.39 8.26 3.37 2.72
vp9_avg32_10bpp_neon: 4.03 10.20 4.07 3.42
vp9_avg64_10bpp_neon: 4.15 10.01 4.13 3.70
vp9_avg_8tap_smooth_4h_10bpp_neon: 3.38 6.22 3.41 4.75
vp9_avg_8tap_smooth_4hv_10bpp_neon: 3.89 6.39 4.30 5.32
vp9_avg_8tap_smooth_4v_10bpp_neon: 5.32 9.73 6.34 7.31
vp9_avg_8tap_smooth_8h_10bpp_neon: 4.45 9.40 4.68 6.87
vp9_avg_8tap_smooth_8hv_10bpp_neon: 4.64 8.91 5.44 6.47
vp9_avg_8tap_smooth_8v_10bpp_neon: 6.44 13.42 8.68 8.79
vp9_avg_8tap_smooth_64h_10bpp_neon: 4.66 9.02 4.84 7.71
vp9_avg_8tap_smooth_64hv_10bpp_neon: 4.61 9.14 4.92 7.10
vp9_avg_8tap_smooth_64v_10bpp_neon: 6.90 14.13 9.57 10.41
vp9_put4_10bpp_neon: 1.33 1.46 2.09 1.33
vp9_put8_10bpp_neon: 1.57 3.42 1.83 1.84
vp9_put16_10bpp_neon: 1.55 4.78 2.17 1.89
vp9_put32_10bpp_neon: 2.06 5.35 2.14 2.30
vp9_put64_10bpp_neon: 3.00 2.41 1.95 1.66
vp9_put_8tap_smooth_4h_10bpp_neon: 3.19 5.81 3.31 4.63
vp9_put_8tap_smooth_4hv_10bpp_neon: 3.86 6.22 4.32 5.21
vp9_put_8tap_smooth_4v_10bpp_neon: 5.40 9.77 6.08 7.21
vp9_put_8tap_smooth_8h_10bpp_neon: 4.22 8.41 4.46 6.63
vp9_put_8tap_smooth_8hv_10bpp_neon: 4.56 8.51 5.39 6.25
vp9_put_8tap_smooth_8v_10bpp_neon: 6.60 12.43 8.17 8.89
vp9_put_8tap_smooth_64h_10bpp_neon: 4.41 8.59 4.54 7.49
vp9_put_8tap_smooth_64hv_10bpp_neon: 4.43 8.58 5.34 6.63
vp9_put_8tap_smooth_64v_10bpp_neon: 7.26 13.92 9.27 10.92
For the larger 8tap filters, the speedup vs C code is around 4-14x.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
This is more in line with how it will be extended for more bitdepths.
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit 'fd5e6a095f69495c558069315d6b36ea410c31fa':
x86util: Extend SPLATW for avx2
This commit is a noop, see 1ace9573dc
(only libavutil/x86/x86util.asm chunk).
Merged-by: Clément Bœsch <u@pkh.me>
* commit '37961044c6':
checkasm: arm: Ignore changes to bits 0-4 and 7 of FPSCR
cheackasm/arm: remove NEON instructions from checkasm_checked_call_vfp
checkasm: arm: Don't start new const blocks for each string
This merge is a noop: the changes were included in 9f1c81e5ec.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '5ece6911010b3464d2fdacfa8031c15b5bd83418':
apichanges: Fill in missing hashes and dates
This commit is a noop as we need to fill with our own hashes.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'facdfe40805559963b5875931af9406ed5ddcd5c':
swscale: Add proper ff_ prefix to init functions
This commit is a noop, see e8c3716064
I'm keeping our ff_sws_ vs ff_ since we use ff_sws_ in other places in
swscale.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'c0fd2fb27bebd1d5ab028e6df6bca9119d269122':
swscale: Rename sws_context_class to ff_sws_context_class
This commit is a noop, see 8bfbc8c5e5
Merged-by: Clément Bœsch <u@pkh.me>
* commit '71a0472114574993df7035f4de9aa007e03817b8':
checkasm: arm: report the first clobbered register in checkasm_checked_call
Also includes 446353ea18, 59aeed93e4, and 37961044c6 to avoid breaking
too much stuff.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'a8fce24b9c5a87187f5bd864b18f5b3e575f8c3d':
avconv_dxva2: support HEVC Main10 decoding
This commit is a noop, see 1ec14612a5
Merged-by: Clément Bœsch <u@pkh.me>
* commit '33f6690eb4e21acc4b581688eecfc4cc5ea9515e':
hevc: offer DXVA2 for 10bit 420
This commit is a noop, see ccb94789e2
Merged-by: Clément Bœsch <u@pkh.me>
* commit '38efff92f1ef81f3de20ff0460ec7b70c253d714':
FATE: add a test for H.264 with two fields per packet
h264: fix decoding multiple fields per packet with slice threads
This merge includes two commits because the FATE test was useful in
order to make proper testing.
The merge gets rid of the now unused:
- SLICE_SINGLETHREAD and SLICE_SKIPED macros
- max_contexts
- "again" label in decode_nal_units()
This commit also includes the fix from d3e4d406b.
Thanks to wm4 and Michael Niedermayer for their testing.
Merged-by: Clément Bœsch <u@pkh.me>
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
This treats the case of no slices like no frames which it basically is.
The field is added to the context as other nal related fields are also there
and passing the has_slices field per *arguments is ugly and not consistent
Found-by: ubitux
Approved-by: ubitux
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If fifo is enabled on tee muxer, ffmpeg exits because of an unknown option passed to fifo muxer.
Option name "format_options" was replaced by "format_opts" on tee muxer.
Signed-off-by: Felipe Astroza <felipe@astroza.cl>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
CUVID on GeForce GT 730 and GeForce GTX 1060 does not report any error when
decoding 8K h264 packets. However, it does return an error during
cuvidCreateDecoder call if the indicated video resolution is not
supported.
Given that stream resolution is typically known as a result of probing
it is better to use this information during avcodec_open2 call to fail
immediately, rather than proceeding to decode and never receiving any
frames from the decoder nor receiving any indication of decode failure.
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
This happens because segment_end() returns an error, so seg_write_packet
never proceeds to segment_start(), and seg->avf->pb is never re-set,
so we crash with a null pb when av_write_trailer flushes the packet
queue.
This doesn't seem to be clearly recoverable, so I'm just failing more
gracefully.
Repro:
ffmpeg -i input.ts -f segment -c copy -segment_list /noaxx.m3u8 test-%05d.ts
(assuming you don't have write access to /)
This makes the code 7 times faster with the testcase from libfuzzer
and should reduce the amount of timeouts we hit in automated fuzzing.
(for example 438/fuzz-2-ffmpeg_VIDEO_AV_CODEC_ID_RV40_fuzzer)
The code is also faster with more realistic input though the difference
is small here as that is far from the worst cases the fuzzers pick out
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
use av_lfg_init_from_data() to seed AC-3 dithering from the AC-3 frame
data to make it consistent given the same AC-3 frame, if option is set.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Raises max channels to 6 (for non joint-stereo only),
there is no difference decoding 1 or N discrete channels.
Fixes trac issue #5840
Signed-off-by: bnnm <bananaman255@gmail.com>
When use http method to delete the old segments,
there is only io_open, hove not io_close yet,
this patch is used to fix it
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
when push hls to http server, the old segemnts can not delete by hls formats.
so add the http option into hls_delete_old_segments
Reported-by: Yin Jiaoyuan <yinjiaoyuan@163.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Moves much of the setup logic for VAAPI decoding into lavc; the user
now need only provide the hw_frames_ctx.
(cherry picked from commit 123ccd07c5)
(cherry picked from commit 5e879b54a3)
(cherry picked from commit 0aec37e625)
(cherry picked from commit cfa4eb4fba)
* commit 'f450cc7bc595155bacdb9f5d2414a076ccf81b4a':
h264: eliminate decode_postinit()
Also includes fixes from 1f7b4f9abc and e344e65109.
Original patch replace H264Context.next_output_pic (H264Picture *) by
H264Context.output_frame (AVFrame *). This change is discarded as it
is incompatible with the frame reconstruction and motion vectors
display code which needs the extra information from the H264Picture.
Merged-by: Clément Bœsch <u@pkh.me>
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
We can pick the correct slice index directly from the ID3D11VideoDecoderOutputView
casted from data[3].
Also added myself as maintainer for DXVA2 and D3D11VA.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
No need to loop through the known surfaces, we'll use the requested surface
anyway.
The loop is only done for DXVA2.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When support for this was added the details weren't yet finalized.
This is no longer the case.
Fixes writing of mkv/webm files with HDR.
Reported-by: Kagami Hiiragi <kagami@genshiken.org>
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Reviewed-by: James Almer <jamrial@gmail.com>
This work is sponsored by, and copyright, Google.
Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:
vp9_inv_dct_dct_16x16_sub16_add_neon: 1373.2
vp9_inv_dct_dct_32x32_sub32_add_neon: 8089.0
By skipping individual 8x16 or 8x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:
vp9_inv_dct_dct_16x16_sub1_add_neon: 235.3
vp9_inv_dct_dct_16x16_sub2_add_neon: 1036.7
vp9_inv_dct_dct_16x16_sub4_add_neon: 1036.7
vp9_inv_dct_dct_16x16_sub8_add_neon: 1036.7
vp9_inv_dct_dct_16x16_sub12_add_neon: 1372.1
vp9_inv_dct_dct_16x16_sub16_add_neon: 1372.1
vp9_inv_dct_dct_32x32_sub1_add_neon: 555.1
vp9_inv_dct_dct_32x32_sub2_add_neon: 5190.2
vp9_inv_dct_dct_32x32_sub4_add_neon: 5180.0
vp9_inv_dct_dct_32x32_sub8_add_neon: 5183.1
vp9_inv_dct_dct_32x32_sub12_add_neon: 6161.5
vp9_inv_dct_dct_32x32_sub16_add_neon: 6155.5
vp9_inv_dct_dct_32x32_sub20_add_neon: 7136.3
vp9_inv_dct_dct_32x32_sub24_add_neon: 7128.4
vp9_inv_dct_dct_32x32_sub28_add_neon: 8098.9
vp9_inv_dct_dct_32x32_sub32_add_neon: 8098.8
I.e. in general a very minor overhead for the full subpartition case due
to the additional cmps, but a significant speedup for the cases when we
only need to process a small part of the actual input data.
This is cherrypicked from libav commits
cad42fadcd and
a0c443a398.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This work is sponsored by, and copyright, Google.
Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:
Cortex A7 A8 A9 A53
vp9_inv_dct_dct_16x16_sub16_add_neon: 3188.1 2435.4 2499.0 1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon: 18531.7 16582.3 14207.6 12000.3
By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:
vp9_inv_dct_dct_16x16_sub1_add_neon: 274.6 189.5 211.7 235.8
vp9_inv_dct_dct_16x16_sub2_add_neon: 2064.0 1534.8 1719.4 1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon: 2135.0 1477.2 1736.3 1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon: 2446.7 1828.7 1993.6 1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon: 2832.4 2118.3 2266.5 1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon: 3211.7 2475.3 2523.5 1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon: 756.2 456.7 862.0 553.9
vp9_inv_dct_dct_32x32_sub2_add_neon: 10682.2 8190.4 8539.2 6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon: 10813.5 8014.9 8518.3 6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon: 11859.6 9313.0 9347.4 7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon: 12946.6 10752.4 10192.2 8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon: 14074.6 11946.5 11001.4 9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon: 15269.9 13662.7 11816.1 9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon: 16327.9 14940.1 12626.7 10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon: 17462.7 15776.1 13446.2 11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon: 18575.5 17157.0 14249.3 12015.1
I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.
In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.
This is cherrypicked from libav commit
9c8bc74c2b.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This avoids reloading them if they haven't been clobbered, if the
first pass also was idct.
This is similar to what was done in the aarch64 version.
This is cherrypicked from libav commit
3c87039a40.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Since the same parameter is used for both input and output,
the name inout is more fitting.
This matches the naming used below in the dmbutterfly macro.
This is cherrypicked from libav commit
79566ec8c7.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The clobbering tests in checkasm are only invoked when testing
correctness, so this bug didn't show up when benchmarking the
dc-only version.
This is cherrypicked from libav commit
4d960a1185.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is one instruction less for thumb, and only have got
1/2 arm/thumb specific instructions.
This is cherrypicked from libav commit
e5b0fc170f.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The latter is 1 cycle faster on a cortex-53 and since the operands are
bytewise (or larger) bitmask (impossible to overflow to zero) both are
equivalent.
This is cherrypicked from libav commit
e7ae8f7a71.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Since aarch64 has enough free general purpose registers use them to
branch to the appropiate storage code. 1-2 cycles faster for the
functions using loop_filter 8/16, ... on a cortex-a53. Mixed results
(up to 2 cycles faster/slower) on a cortex-a57.
This is cherrypicked from libav commit
d7595de0b2.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Seemingly ff_clear_block_sse assumed that the block array is aligned,
so make sure it is.
Fixes ticket #6079
Signed-off-by: James Almer <jamrial@gmail.com>
when hlsenc use flag second_level_segment_index,
second_level_segment_size and second_level_segment_duration,
the rename is ok but the output filename always use the old filename
so move the rename operation after the close the ts file and
before open new segment
Reported-by: Christian Johannesen <chrisjohannesen@gmail.com>
Reviewed-by: Bodecs Bela <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
CID: 1396852
check the devices_list alloc status,
and release the devices_list when alloc devices error
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
avfilter_graph_request_oldest() does work that should be done by
either the filter or the application.
The principle of this function, calling ff_request_frame() from
outside the filter was always shaky. This version is less elegant
since it requires making special cases for each filter, but it
is more robust since it no longer calls ff_request_frame()
directly without notifying the filter.
Eventually, avfilter_graph_request_oldest() will be deprecated
for a function to just run the graph.
Unlike av_frame_is_writable(), it uses the link's alloc callback,
making direct rendering possible.
The code comes from ff_filter_frame_framed(), moved with mostly
trivial changes.
start_number option starts the playlist sequence number
(#EXT-X-MEDIA-SEQUENCE) from the specified number. Unless hls_flags
single_file is set, it also specifies starting sequence numbers of
segment and subtitle filenames. Sometimes it is usefull to have unique
starting numbers at each run, but currently it is only achiveable by
setting this parameter manually.
This patch enables to specify start_number source parameter by
introducing hls_start_number_source with 3 possible values:
generic/epoch/datetime. This ensures to set start sequence number
automatically for practically unique numbers. Generic option is the
default and this is the curent behaviour: start_number option value
specifies the start sequence number. (start_number default value is 0)
If hls_start_number_source is set to epoch, then the start number will
be the seconds since epoch (1970-01-01 00:00:00). If set to datetime,
then the start sequence number will be based on the current date/time
value as YYYYmmddHHMMSS. e.g. 20161231235659.
Hls speficication allows 64 bit integers as sequence numbers. This patch
also changes some code where only 32 bit integer values were handled
correctly.
Reviewed-by: Moritz Barsnick <barsnick@gmx.net>
Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Reason: For some cases, such as 2 or more graphics cards existing, the
default command line may fail because ffmpeg does not open the correct
device node:
ffmpeg -hwaccel qsv -c:v h264_qsv -i test.264 -c:v h264_qsv out.264
Let user choose the proper one by running like below:
ffmpeg -hwaccel qsv -qsv_device /dev/dri/renderD128 -c:v h264_qsv \
-i test.264 -c:v h264_qsv out.264
Signed-off-by: ChaoX A Liu <chaox.a.liu@gmail.com>
Signed-off-by: Huang, Zhengxu <zhengxu.maxwell@gmail.com>
Signed-off-by: Andrew, Zhang <huazh407@gmail.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
This decoder can decode all existing SpeedHQ formats (SHQ0–5, 7, and 9),
including correct decoding of the alpha channel.
1080p is decoded in 142 fps on one core of my i7-4600U (2.1 GHz Haswell),
about evenly split between bitstream reader and IDCT. There is currently
no attempt at slice or frame threading, even though the format trivially
supports both.
NewTek very helpfully provided a full set of SHQ samples, as well as
source code for an SHQ2 encoder (not included) and assistance with
understanding some details of the format.
This is what gimp, ImageMagick and FreeImage do and what the
Adobe Photoshop file format specification suggests.
Fixes a sample from ticket #6045.
Reviewed-by: Martin Vignali
when the segments largest duration value is look like 4.000000, the
EXT-X-TARGETDURATION value should equ 4.
it's wrong when hlsenc use ceil, so fix it.
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
cid: 1396268
when av_strdup(str) error, the lst need release
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
This should make no difference as the value should not be able to be that large
but its more correct this way
Fixes CID1348138
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When allocating stack space with an alignment requirement that is larger
than the current stack alignment we need to store a copy of the original
stack pointer in order to be able to restore it later.
If we chose to use another register for this purpose we should not pick
eax/rax since it can be overwritten as a return value.
Disable B frames when using baseline/constrained baseline profile,
following H.264 spec Annex A.2.1.
Signed-off-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Yi A Wang <yi.a.wang@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
When the command line for children is created, it is assumed that
my_program_name always ends with "ffserver", which doesn't have to
be true if ffserver is called through a symbolic link.
In such a case, it could be that not enough space for "ffmpeg" is
available at the end, leading to a buffer overflow.
One example would be:
$ ln -s /usr/bin/ffserver ~/f; ~/f
As this is only a local buffer overflow, i.e. is based on a weird
program call, this has NO security impact.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The following three commits created a regression by writing initially
invalid mkv headers:
650e17d88b avformat/matroskaenc: write a
CRC32 element on Tags
3bcadf8227 avformat/matroskaenc: write a
CRC32 element on Info
ee888cfbe7 avformat/matroskaenc: postpone
writing the Tracks master
Symptoms:
- You can no longer playback a file that is still processed by ffmpeg,
e.g. VLC fails playback
- You can no longer stream a file to a client while if is still being
processed
- Various diagnosing tools show header errors or incomplete headers
(e.g. ffprobe, mediainfo, mkvalidator)
Note: The symptoms do not apply to completed files or ffmpeg runs that
were interrupted with 'q'
Cause:
The mentioned commits made changes in a way that some header elements
are only partially written in
mkv_write_header, leaving the header in an invalid state. Only in
mkv_write_trailer, these elements
are finished correctly, but that does only occur at the end of the
process.
Regression:
Before these commits were applied, mkv headers have always been valid,
even before completion of ffmpeg.
This has worked reliably over many versions of ffmpeg, to it was an
obvious regression.
Bugtracker:
This issue has been recorded as #5977 which is resolved by this patch
Patch:
The patch adds a new function 'end_ebml_master_crc32_preliminary' that
preliminarily finishes the ebml
element without destroying the buffer. The buffer can be used to update
the ebml element later during
mkv_write_trailer. But most important: mkv_write_header finishes with a
valid mkv header again.
Signed-off-by: James Almer <jamrial@gmail.com>
This commit adds the avio_get_dyn_buf function which allows accessing
the
content of a DynBuffer without destroying it.
This is required in matroskaenc for preliminary writing (correct) mkv
headers.
Context for this change is fixing regression bug #5977.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
looks like there is a bug in commit
1a08758e7c relating to the handling of
ptr in decode_frame after decode_block is called, before this commit
ptr would have been incremented for each line in the data window, now
after the commit it is left at the start of the first included line
rather than the line after the data window then the code sets the
remaining lines to 0 and thus the whole image is over written.
Fix by adjusting ptr to the correct line after decode_block returns
Signed-off-by: Kevin Wheatley <kevin.j.wheatley@gmail.com>
This commit replaces the current inefficient non-power-of-two FFT with a
much faster FFT based on the Prime Factor Algorithm.
Although it is already much faster than the old algorithm without SIMD,
the new algorithm makes use of the already very throughouly SIMD'd power
of two FFT, which improves performance even more across all platforms
which we have SIMD support for.
Most of the work was done by Peter Barfuss, who passed the code to me to
implement into the iMDCT and the current codebase. The code for a
5-point and 15-point FFT was derived from the previous implementation,
although it was optimized and simplified, which will make its future
SIMD easier. The 15-point FFT is currently using 6% of the current
overall decoder overhead.
The FFT can now easily be used as a forward transform by simply not
multiplying the 5-point FFT's imaginary component by -1 (which comes
from the fact that changing the complex exponential's angle by -1 also
changes the output by that) and by multiplying the "theta" angle of the
main exptab by -1. Hence the deliberately left multiplication by -1 at
the end.
FATE passes, and performance reports on other platforms/CPUs are
welcome.
Performance comparisons:
iMDCT, PFA:
101127 decicycles in speed, 32765 runs, 3 skips
iMDCT, Old:
211022 decicycles in speed, 32768 runs, 0 skips
Standalone FFT, 300000 transforms of size 960:
PFA Old FFT kiss_fft libfftw3f
3.659695s, 15.726912s, 13.300789s, 1.182222s
Being only 3x slower than libfftw3f is a big achievement by itself.
There appears to be something capping the performance in the iMDCT side
of things, possibly during the pre-stage reindexing. However, it is
certainly fast enough for now.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Prep work for the next commit, which will add a new FFT algorithm
which makes the iMDCT over 3x faster than it is currently (standalone,
the FFT is with some framesizes over 10x faster).
The new FFT algorithm uses the already thouroughly SIMD'd power of two
FFT which already has SIMD for AArch64, so users of that platform will
still see an improvement.
The previous FFT+SIMD was barely 2.5x faster than the C versions on these
platforms.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
CID: 1398228
Passing null pointer dirname to strlen, which dereferences it.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
in get_default_pattern_localtime_fmt the default pattern contains
%Y%m%d%H%I%S but the original intention was %Y%m%d%H%M%S
Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
fix CID: 1398364 Resource leak
refine the code of the new options
Reviewed-by: Bodecs Bela <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
A wrong, unitialized variable is used for testing. This patch fixes this
typo.
Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
1st:
This patch makes it possible to put actual segment file size (measured
in bytes) and/or duration (calculated in microseconds) into segment
filenames. This feature is useful when post-processing live streaming
access log files. New behaviour works only when -use_localtime option
is set and second_level_segment_size or/and
second_level_segment_duration new hls_flags are specified. %%s is the
placeholder for size and %%t for duration in hls_segment_filename
option. Fix sized trailing zeropadding also works eg. %%09s or %%023t.
A command to test new features:
./ffmpeg -loglevel info -y -f lavfi -i color=c=red:size=640x480:r=25 -f
lavfi -i sine=f=440:b=4:r=44100 -c:v mpeg2video -g 25 -acodec aac
-cutoff 20000 -ac 2 -ar 44100 -ab 192k -f hls -hls_time 3 -hls_list_size
5 -hls_flags
second_level_segment_index+second_level_segment_size+second_level_segment_duration
-use_localtime 1 -use_localtime_mkdir 1 -hls_segment_filename
"segment_%Y%m%d%H%M%S_%%04d_%%08s_%%013t.ts" stream.m3u8
2nd:
doc/muxers: beside second_level_segment_duration and second_level_segment_size,
added some more details and example to hls_segment_filename,
use_localtime, use_localtime_mkdir, hls_flags. hls_flags option list
reformatted to table
Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
if the http server don't response the http command,
then the thread will be blocked and never be interrupted.
Reported-by: yinyunjiang <yinyunjiang1991@qq.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Round qpIntra and qpInter calculation instead of old floor behavior.
Adopted from vaapi_encode_h264.c
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
Current implementation of finding duplicate segment filenames may fail
if use_localtime_mkdir and use_localtime are in effect and
segment_filename option expression contains subdirectories with
date/time specifiers. This patch fixes this false behaviour.
Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
in hlcenc.c, in the hls_write_header() function the default format
string for strftime() function contains %s specifier when use_localtime
is true. This %s specifier will insert the seconds since EPOCH. But %s
is not available on all system/environment. This patch check %s
availabilty at runtine and alter the default format string if necessary.
Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
cutoff is implemented as an option global to lavc, but supported only
by a few encoders. This fact is now reflected in its documentation. ac3's
support of this option is added for completeness.
Signed-off-by: Moritz Barsnick <barsnick@gmx.net>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Pass the cutoff option from lavc's avcodec_options[] to libmp3lame's
lowpass option, without allowing to adjust its default behavior.
Signed-off-by: Moritz Barsnick <barsnick@gmx.net>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Prevents memory leak when read_samples_from_audio_fifo() is
called more than once by deallocating before reallocating
more memory.
Fixes space indentation for contents in ERROR().
Signed-off-by: Thomas Turner <thomastdt@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Old ICC verions don't advertise having a full C11 implementation but
may nonetheless include a feature-incomplete stdatomic.h header.
Fixes ticket #6049
Signed-off-by: James Almer <jamrial@gmail.com>
In ff_index_search_timestamp(), if b == num_entries,
m == num_entries - 1, and entries[m].flags & AVINDEX_DISCARD_FRAME is
true, then the search for the next non-discarded packet could access
entries[nb_entries], exceeding its bounds. This change adds a protection
against that scenario. Reference: https://crbug.com/666770
Reviewed-by: Sasi Inguva <isasi@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When delete_segments hls_flag is specified, deleting old segments may
fail in certain cases when use_localtime_mkdir is in effect and
hls_segment_filename expression contains subdirs. This patch fixes this
behaviour.
Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Additional/Modified FATE tests improve code coverage from 63.7% to 98.1%.
Changed fate-suite sample files:
* filter/hdcd-mix.flac (958K) added. It is a much better test than
filter/hdcd.flac (910K), which is now unused, but can't be removed.
* filter/hdcd-fake20bit.flac (168K) added. It is the first second of
filter/hdcd.flac, with the 16-bit LSB copied into bit 20 of a 24-bit
stream. There isn't an actual non-16-bit HDCD sample available to test.
Signed-off-by: Burt P <pburt0@gmail.com>
The checked bitstream reader does that already. To allow parsing of
superframes split over a packet boundary, we always decode the last
superframe in each packet at the start of the next packet, even if
theoretically we could have decoded it. The last superframe in the
last packet is decoded using AV_CODEC_CAP_DELAY.
in filenames
Putting date/time values into segment filenames is very usefull.
But to produce non-conflicting segment filenames with -use_localtime
option with date/time
values in hls_segment_filename option, sometimes is not enough.
Like in cases when multiple segments produced in the same second.
But hlsenc currently does not make possible to use segment index (%d) at
the
same time whe use_localtime is in effect, due to identifier conflict.
This patch makes possible to use strftime identifiers and still put
segment index (%d) at same time in segment filenames by introducing
second_level_segment_index flag. When -use_localtime is active,
identifier %d is for month day index, so %%d is the segment index
placeholder. This enhanced behaviour only exists when new
second_level_segment_index flag is specified.
For instance putting 'segment_%Y%m%d%H%M%S_%%05d.ts' value into
-hls_segment_filename option and specifing -hls_flags
second_level_segment_index and -use_localtime 1, may produce segment
filename as 'segment_20161230235758_00002.ts'
An example:
ffmpeg -loglevel info -y -f lavfi -i color=c=red:size=640x480:r=25 -f
lavfi -i anullsrc=r=44100:cl=stereo -c:v mpeg2video -g 25 -acodec aac
-cutoff 20000 -ac 2 -ar 44100 -ab 192k -f hls -hls_time 3 -hls_list_size
5 -hls_flags delete_segments+second_level_segment_index -use_localtime 1
-hls_segment_filename "segment_%Y%m%d%H%M%S_%%05d.ts" stream.m3u8
will produce segments filenames:
....
segment_20161227005902_00013.ts
segment_20161227005902_00014.ts
segment_20161227005902_00015.ts
segment_20161227005903_00016.ts
segment_20161227005903_00017.ts
segment_20161227005903_00018.ts
segment_20161227005903_00019.ts
segment_20161227005903_00020.ts
....
Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
initial_prog_date_time shouldn't be adjusted when deleting segments
from disk, but rather when segments are removed from the playlist.
Signed-off-by: Jesper Ek <deadbeef84@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
It is now bitexact with the ssse3 and sse4.1 versions of the function.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
ffmpeg-devel
with use_localtime parameter hlsenc may produce identical filenames for
different but still existing segments. It happens when
hls_segment_filename contains
syntacticaly correct but inadequate format parameters. Currently there
is no any log message when such a situaton occurs but these cases should
be avoided in most times. This patch generate warning log messages in
these cases.
ticketID: #6043
Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
Signed-off-by: Steven Liu <lingjiujianke@gmail.com>
This should fix issues on BSD
CLOCKS_PER_SEC is 128 on BSD while SUSv2 requires it to be a million
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes pts gaps when reading AVI files > 256GiB generated by FFmpeg.
Signed-off-by: Tobias Rapp <t.rapp@noa-archive.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
With min_samples, if a frame arrives but is too small, it clears
frame_wanted_out. In most cases, the destination filter would be
activated again later because of frame_wanted_out on its own
outputs, but not sinks.
avfilter_graph_request_oldest() is doing the work of the sink
itself, and is therefore allowed to use frame_blocked_in.
Configure checks if the ebx register can be used for asm and it has to
be saved if and only if this is not the case.
Without this the build fails when configuring with --toolchain=hardened
--disable-pic on i386 using gcc 4.8:
error: PIC register clobbered by '%ebx' in 'asm'
In that case gcc 4.8 reserves the ebx register for the GOT needed for
PIE, so it can't be used in asm directly.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
A lot of changes happen at the same time:
- Add a framequeue fifo to AVFilterLink.
- split AVFilterLink.status into status_in and status_out: requires
changes to the few filters and programs that use it directly
(f_interleave, split, filtfmts).
- Add a field ready to AVFilterContext, marking when the filter is ready
and its activation priority.
- Add flags to mark blocked links.
- Change ff_filter_frame() to enqueue the frame.
- Change all filtering functions to update the ready field and the
blocked flags.
- Update ff_filter_graph_run_once() to use the ready field.
- buffersrc: always push the frame immediately.
This makes it possible to decode motion jpeg 2000
encoded in a transport stream without a correct PMT/PAT.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The assumption that avcodec_send_packet makes regarding decoders
consuming the entire packet is not true if the codec supports
truncated decoding mode and the truncated flag is turned on.
Steps to reproduce:
./ffmpeg_g -flags truncated \
-i "http://samples.ffmpeg.org/MPEG2/test-ebu-422.40000.pakets.ts" \
-c:v ffv1 -c:a copy -y /tmp/truncated.nut
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Clang is not able to eliminate the reference to ff_spdif_probe() when
there is a goto target in the same block and optimization is disabled.
This fixes the following build failure on OS X:
./configure --disable-everything --disable-doc \
--enable-decoder=pcm_s16le --enable-demuxer=wav \
--enable-protocol=file --disable-optimizations --cc=clang
make
...
Undefined symbols for architecture x86_64:
"_ff_spdif_probe", referenced from:
_set_spdif in libavformat.a(wavdec.o)
ld: symbol(s) not found for architecture x86_64
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Most decoders return the amount of data used.
This is more consistent
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Most decoders return the amount of data used.
This is more consistent
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Most decoders return the amount of data used.
This is more consistent
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Most decoders return the amount of data used.
This is more consistent
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
MPEG Audio frame header must be 4 bytes. If we fail to read
4 bytes bail early to avoid Use-of-uninitialized-value msan error.
Reference https://crbug.com/666874.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Height of canvas produced by drawtext varies depending on symbols in
text, so add example for printing separate texts aligned horizontally.
Wording suggested by Lou Logan <lou@lrcd.com>
Signed-off-by: Andrey Utkin <andrey.utkin@pb.com>
Signed-off-by: Lou Logan <lou@lrcd.com>
Decode the Image Data Section (which contains merged pictures).
Support RGB/A and Grayscale/A in 8bits and 16 bits per channel.
Support uncompress and rle decompression in Image Data Section.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
integrate it inside multiple_resample
allow some calculations to be performed outside loop
Suggested-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
-pie was added to C flags for ThreadSanitizer in commit
19f251a288. Under clang 3.8.0, the -pie
flag causes a compiler warning and a linker error when running configure
--toolchain=clang-tsan. Here is an excerpt from config.log:
clang ... -fsanitize=thread -pie -std=c11 -fomit-frame-pointer -pthread -c -o /tmp/ffconf.hL61stP9.o /tmp/ffconf.YO6ZaSFG.c
clang: warning: argument unused during compilation: '-pie'
clang -fsanitize=thread -pie -Wl,--as-needed -Wl,-z,noexecstack -o /tmp/ffconf.W5c2e41l /tmp/ffconf.hL61stP9.o -lbz2 -pthread
/usr/bin/ld: /tmp/ffconf.hL61stP9.o: relocation R_X86_64_PC32 against undefined symbol `atan2f@@GLIBC_2.2.5' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
clang: error: linker command failed with exit code 1 (use -v to see invocation)
To be conservative, I changed -pie to -fPIE. But the documentation seems
to imply just -fsanitize=thread is enough:
http://clang.llvm.org/docs/ThreadSanitizer.htmlhttps://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual
Signed-off-by: Wan-Teh Chang <wtc@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Make the one-time initialization in av_get_cpu_flags() thread-safe. The
static variable |cpu_flags| in libavutil/cpu.c is read and written using
normal load and store operations. These are considered as data races.
The fix is to use atomic load and store operations.
The fix can be verified by running the libavutil/tests/cpu_init.c test
program under ThreadSanitizer:
./configure --toolchain=clang-tsan
make libavutil/tests/cpu_init
libavutil/tests/cpu_init
There should be no warnings from ThreadSanitizer.
Co-author: Dmitry Vyukov of Google, who suggested the data race fix.
Signed-off-by: Wan-Teh Chang <wtc@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The demuxer doesn't fill the defaults if the master isn't present.
This results in codecpar->color_space being set with a value of
zero (RGB) on such files.
Signed-off-by: James Almer <jamrial@gmail.com>
As I used simple RGBA formats for subtitles and for the video texture if
avfilter is disabled I kind of assumed that sws_scale won't access data
pointers and strides above index 0, but apparently that is not the case.
Fixes Coverity CID 1396737, 1396738, 1396739, 1396740.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Marton Balint <cus@passwd.hu>
And only enable them, if they haven't been disabled.
This is needed for the following patch.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
ff_parse_close expects priv_data to be the ParseContext directly and
thus doesn't work if it isn't at the beginning of OpusParseContext.
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Make ff_opus_parse_extradata free allocated memory on error instead of
expecting callers to free it in that case.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Since the default in the libav fork is to only allow known layouts, making
unknown layouts allowed by default here can be a security risk for filters
directly merged from libav. However, usually it is simple to detect such cases,
use of av_get_channel_layout_nb_channels is a good indicator, so I suggest we
change this regardless.
See http://ffmpeg.org/pipermail/ffmpeg-devel/2016-November/203204.html.
This patch indirectly adds unknown channel layout support for filters where
query_formats is not specified:
abench
afifo
ainterleave
anullsink
apad
aperms
arealtime
aselect
asendcmd
asetnsamples
asetpts
asettb
ashowinfo
azmq
It introduces a query_formats callback for the asyncts filter, which only
supports known channel layouts since it is using libavresample.
And it removes .query_formats callback from filters where it was only there to
support unknown layouts, as this is now the default:
aloop
ametadata
anull
asidedata
asplit
atrim
Acked-by: Nicolas George <george@nsup.org>
Signed-off-by: Marton Balint <cus@passwd.hu>
This decreases the amount of computations and memory needed for analysing mpeg1/2 streams
the properties update is moved from code that is skiped if skip_frame is set
to code that is not skiped so the change doesnt loose that
from being executed
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is faster 2871 -> 2189 cycles for int16 matrixbench -> 23456hz
Fixes a integer overflow in a artificial corner case
Fixes part of 668007-media
Found-by: Matt Wolenetz <wolenetz@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
We are checking during encoding if there is enough space as version 4 needs that
check.
Fixes Ticket6005
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Examples use the native FFmpeg AAC encoder but it is no longer
considered experimental and therefore not required.
Signed-off-by: Lou Logan <lou@lrcd.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit 6a62795d40)
Cherry pick Suggested-by: Martin Storsjö
This should fix the build failure on macosx
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This implements Spherical Video V1 and V2, as described in the
spatial-media collection by Google.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
While no decoder currently exports spherical information, this type
represents a frame property that has to be passed through from container
to frames.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
av_find_stream_info() was deprecated by avformat_find_stream_info(),
correct the warning message in the avformat_find_stream_info() and
comments in the avformat.h
Signed-off-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is a bit messy as codecar does not support AVOptions so we need
to use AVCodecContext where AVOptions are required and copy back and forth.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The speex specification does not seem to restrict these values, thus
the limits where choosen so as to avoid multiplicative overflow
Fixes undefined behavior
Fixes: 635422.ogg
Found-by: Matt Wolenetz <wolenetz@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This would be simpler if codecpar supported AVOptions
modern ffserver should be unaffected by this, older ffserver which required the
muxer to directly access the encoder could have issues with this, but this
direct access is just wrong and unsafe
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This accesses the private encoder context, it should not be used by
the current ffserver it may affect old ffserver versions but i believe
there is consens that accessing the private encoder context from the muxer
is completely wrong.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Note, this temporarly drops the ability to set ffmpeg encoder debug and flags2 via ffserver.conf
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This reverts parts of c16582579b. The hard
coded 30 seconds are a lot, and finishing the seek can takes several
seconds when the source is on a network share. Remove this code
entirely, because it does more bad than good.
(Commit message provided by committer, based on the original messages
by the patch author.)
Signed-off-by: Rainer Hochecker <fernetmenta@online.de>
Signed-off-by: wm4 <nfxjfg@googlemail.com>
It randomly causes failures with an error like:
"Failed to set value '-f' for option 'd': Error number -920332800 occurred"
(The error number is different every time.)
Reviewed-by: Reynaldo H. Verdejo Pinochet <reynaldo@osg.samsung.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
This matrix needs to be applied after all others have (currently only
display matrix from trak), but cannot be handled in movie box, since
streams are not allocated yet. So store it in main context, and apply
it when appropriate, that is after parsing the tkhd one.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
When input surfaces are cuda frames, we will not know what the actual
underlying format (nv12, p010, etc) is at surface allocation time.
On the other hand, we will know when the input frames are actually
registered and associated with a surface.
So, let's delay format discovery until registration time, which is
actually how we handle other frame properties, such as dimensions.
By itself, this change doesn't allow for transcoding of 10bit
content from cuvid, but it reduces the problem to the hardcoding of
the sw format in ffmpeg_cuvid.c
Signed-off-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
Using the decode interrupt feature of ffmpeg may cause crashes by
accessing previously freed pointers in matroska_read_close.
To prevent this reset nb_elem to zero after freeing the elements,
because ffmpeg normally tests for nb_elem.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
The code calls av_new_packet a few lines above and the allocated memory
has to be freed in case of an error.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
The delta escape (2) is supposed to work the same in 4-bit RLE as in
8-bit RLE. This is documented in the MSDN Bitmap Compression page:
https://msdn.microsoft.com/en-us/library/windows/desktop/dd183383(v=vs.85).aspx
The unchecked modification of line is safe, since the loop condition
(line >= 0) will check it before any pixel data is written.
Fixes ticket #5153 (output now matches ImageMagick for the provided sample).
Signed-off-by: Daniel Verkamp <daniel@drv.nu>
This fixes some differences between runs of the ffserver tests
(in my local tree 2 runs gave the same result with this but i had other
changes too)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This fixes a segmentation fault caused by calling memcpy with NULL as
second argument in handle_p_frame_apng.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
This should not be needed, our AVParsers should do this
I do not have a testcase though, please help testing this and please
add fate tests if you can.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes null pointer dereference
Testcase is simply a ffmpeg instance sending a stream to ffserver while another ffmpeg reads from it
This reverts commit 6f0a1710d7.
Since this is a C11 feature, it requires -std=c11.
Not actually used for anything yet, that will be added in the following
commits.
This merges libav commit 13f5d2bf75.
Signed-off-by: Wan-Teh Chang <wtc@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
mythtv have problem with non-seekable dont write duration and filesize
and there have problem with some other server and player with 0 value
duation and filesize.
So add a flv flags to fix the ticket and make a choose for users.
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
give very bad quality for soxr resampler.
linear_interp is intended for using linear interpolation
between filter bank so quality will be better.
i guess this is misunderstood as 'do not use filter bank,
but directly interpolate linearly between samples'.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
This fixes a heap-buffer-overflow in ff_er_frame_end when decoding mss2
with coded_width/coded_height larger than width/height.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
This dubious behaviour in nvenc was finally removed by nvidia, and
as we refuse to run on anything older than 7.0, we don't need to
keep it around for old versions.
floats are not necessarily normalized, so a normalized softfloat needs
MIN_EXP lowered by 23 to cover that range.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Otherwise the codec context and codecpar might disagree on the codec id,
triggering asserts in av_parser_parse2.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
separate dsp.resample to dsp.resample_common and dsp.resample_linear
and choose to call faster resample_common even when linear_interp=on
when c->frac and c->dst_incr_mod are both zero
speed up resampling when exact_rational and linear_interp are both
enabled because exact_rational force c->frac and c->dst_incr_mod to
be zero when soft compensation does not happen
benchmark on exact_rational=on:linear_interp=on
old new
real 8.432s 5.097s
user 7.679s 4.989s
sys 0.125s 0.107s
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
Fixes make checkheaders on systems without the Cuda Toolkit, which
was broken after the dynlink changes.
Signed-off-by: James Almer <jamrial@gmail.com>
Move global thread variables to better place.
Use correct variable for simple and complex filtergraphs.
This makes number of threads set per filter work again.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Expands the parser to also accept the separator '-' in addition to
'+', and take the negative sign into consideration.
The optional sign for the first factor in the expression is already
covered by parsing for an integer.
Signed-off-by: Moritz Barsnick <barsnick@gmx.net>
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
scaling list is already transfered to raster scan during head parsing,
so no need to transfer it again.
And after this fix, FATE test SLIST_A_Sony_4/SLIST_B_Sony_8/
SLIST_C_Sony_3/SLIST_D_Sony_9 will pass in i965/Skylake.
Signed-off-by: Wang, Yi A <yi.a.wamg@intel.com>
Signed-off-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Remove the |checked| variable because the invalid value of -1 for
|flags| can be used to indicate the same condition. Also rename |flags|
to |cpu_flags| because there are a local variable and a function
parameter named |flags| in the same file.
Co-author: Dmitry Vyukov of Google
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
I moved this into the handle_video_sequence callback because that's
the earliest time you can make an accurate decision as to what the
format should be.
However, transcoding requires that the decision between using
the accelerated PIX_FMT_CUDA vs a normal pix format happen at init()
time. There is enough information available to make that decision
and things work out with the underlying format only being discovered
in the sequence callback.
This patch moves the av_frame_make_writable() call from fill_yuv_image
to get_video_frame so that its argument can be the actual frame that
will be sent to the encoder.
This fixes data corruption issues in codecs that keep references on
one or several previous frames.
Signed-off-by: Sam Hocevar <sam@hocevar.net>
Reviewed-by: wm4
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Requested-by: wm4 ([FFmpeg-devel] [PATCH] avutil/opt: Support max > INT64_MAX in write_number() with AV_OPT_TYPE_INT64)
Requested-by: ronald ([FFmpeg-devel] [PATCH] avutil/opt: Support max > INT64_MAX in write_number() with AV_OPT_TYPE_INT64)
Reviewed-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The nvidia 375.xx driver introduces support for P016 output surfaces,
for 10bit and 12bit HEVC content (it's also the first driver to support
hardware decoding of 12bit content).
The cuvid api, as far as I can tell, only declares one output format
that they appear to refer to as P016 in the driver strings. Of course,
10bit content in P016 is identical to P010, and it is useful for
compatibility purposes to declare the format to be P010 to work with
other components that only know how to consume P010 (and to avoid
triggering swscale conversions that are lossy when they shouldn't be).
For simplicity, this change does not maintain the previous ability
to output dithered NV12 for 10/12 bit input video - the user will need
to update their driver to decode such videos.
P016 is the 16-bit variant of NV12 (planar luma, packed chroma), using
two bytes per component.
It may, and in fact is most likely to, be used in situations where
there are less than 16 bits of data. It is the responsibility of
the writer to zero out any unused LSBs.
Currently, it forces IDR frames for both true and false.
Not entirely sure what the original idea behind the tri-state bool
option is.
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
User selectable surfaces are not working correctly, if you set number of
surfaces on cmdline, it will always use minimum 32 or 48 depends on
selected resolution, but in nvenc it is not necessary to use so many
surfaces.
So from now you can define as low as 1 surface and nvenc will still
work, it will ofcourse lower GPU memory usage by 95% and async_delay to zero
That was the easy part, now littlebit more...
Next part of this patch is to always prefer rc_lookahead to be more
important for number of surfaces, than user defined surfaces value.
Maximum rc_lookahead from nvidia documentation is 32, but could increase
in future generations so there is no limit for this yet. Value
async_depth is still accepted and prefered over rc_lookahead.
There were also bug when you request more than rc_lookahead > 31, it
will always set maximum 31, because surface numbers recalculation was
after setting lookahead, which is now fixed.
Results:
If you set -rc_lookahead 32 and -bf 3 it will now use only 40 surfaces
and lower GPU memory usage by 20%, also it will now increase PSNR by 0.012dB
Two more comments:
1. from my internal test, i don't understand addition of 4 more surfaces
when lookahead is calculated, i didn't used this and everything works as
with those 4 more extra surfaces, does anybody know what is going on
there? I looks like it was used for B frames which are calculated
separately, because B frames maximum is 4.
2. rc_lookahead is defined default to -1, but in test condition if
(ctx->rc_lookahead) which sets lookahead it will be always true, i don't
know if this is intended behavior, so in default behavior is lookahead
always on!
This is default condition when rc_lokkahead is -1 (not defined on
cmdline), whis is maybe something that is not intended:
ctx->encode_config.rcParams.enableLookahead = 1;
ctx->encode_config.rcParams.lookaheadDepth = 0;
ctx->encode_config.rcParams.disableIadapt = 0;
ctx->encode_config.rcParams.disableBadapt = 0;
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
when meeting IDR frame, vaapi_encode_h264 poc number don't reset, now fix
this issue based on h264 spec. Some decoder don't care this case, but this
fix will enhance the encoder action. Before this fix, poc number is
negative in some case.
Reviewed-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
This was not observed earlier because the only syntax element which
it normally misses with the current setup is slice_qp_delta, but that
is always going to be zero (in IDR frames QP isn't varied on the
slice) which will always exp-golomb code as a single 1 bit. The
immediately following part is the byte alignment, which is always a 1
bit followed by 0s which are ignored, so as long as the bitstream is
never aligned at that point we will never notice because the only
difference is that an ignored bit is a 1 instead of a 0.
(cherry picked from commit fc30a90898)
While outwardly bizarre, this change makes the behaviour consistent
with other VAAPI encoders which sync to the encode /input/ picture in
order to wait for /output/ from the encoder. It is not harmful on
i965 (because synchronisation already happens in vaRenderPicture(),
so it has no effect there), and it allows the encoder to work on
mesa/gallium which assumes this behaviour.
(cherry picked from commit 086e4b58b5)
This allows better checking of capabilities and will make it easier
to add more functionality later.
It also commonises some duplicated code around rate control setup
and adds more comments explaining the internals.
(cherry picked from commit 80a5d05108)
There should be an extra offset of 6 on bit_rate_scale and of 4 on
cpb_size_scale which were not accounted for here.
(cherry picked from commit 3a9662af6c)
FLAC streams originating from the FLAC encoder send updated and more
complete STREAMINFO metadata as part of the last packet, so write that
to CodecPrivate instead of the incomplete one available in extradata
during init.
Signed-off-by: James Almer <jamrial@gmail.com>
A negative extradata size for example gets passed to memcpy in
avcodec_parameters_from_context causing a segmentation fault.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
If realloc fails, the pointer is overwritten and the previously allocated
buffer is leaked, which goes against the expected behavior of keeping the
packet unchanged in case of error.
Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
This makes av_stream_add_side_data() consistent with av_packet_add_side_data().
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
Functionally similar to av_packet_add_side_data(). Allows the use of an
already allocated buffer as stream side data.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
when parsing keyframe index metadata, list the message by trace log
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This reverts commit e0c6b32046.
Said commit changed the behavior of the demuxer and decoder in a non
backwards compatible way.
Demuxers should make extradata available at init if possible, and send
new extradata as side data within a packet if needed.
A better fix for the remuxing crash will follow.
Signed-off-by: James Almer <jamrial@gmail.com>
* commit '8d07e941b04d63fc4443dd986e3dc7b69cdcca43':
FATE: add a test of H.264 SEI recovery in an intra refresh stream
Our H264 decoder drops 3 frames from the beginning of the stream, but
all frames after those match, hence the difference in the fate test.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '46278ec90ac5ad1dab5e85991f176afe49003fee':
mp3enc: write trailing padding
Noop, we have our own implementation for mp3 gapless.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'd60c2d5216930ef98c7d4d6837d6229b37e0dcb3':
mp3dec: read the initial/trailing padding from the LAME tag
Noop, we have our own implementation for mp3 gapless tags.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '2d097c16b833c532ac974a7f1fd05c0a1f3b7675':
libopenh264enc: Return a more sensible error code in some init failure paths
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '36b380dcd52ef47d7ba0559ed51192c88d82a9bd':
libopenh264dec: Simplify the init thanks to FF_CODEC_CAP_INIT_CLEANUP being set
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'd0b1e6049b06eeeeca146ece4d2f199c5dba1565':
libopenh264dec: Fix cleanup if the init failed early
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
Just the presence of a hw frames context is not enough to detect whether
the transfer is an upload or a download, because hw frames mapped to
system memory will have a hw frames context attached.
D3DLOCK_READONLY properly corresponds to the absence of the write flag,
not to the presence of the read flag, while D3DLOCK_DISCARD is
equivalent to the overwrite flag.
Fixes division by 0
This is similar to how avg_frame_rate is checked elsewhere
Fixes: 6d24add0455f41b1b45b7ba615cd46f3/asan_generic_dc34c3_5480_0a2ef411cae999b9871ed71a2e481b71.mov
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This matches the other branch
Fixes out of array read
Fixes: 4d142ca76d39fe685effcf5017098723/asan_heap-oob_31ae824_8611_348fdb64f9009b63c8a8eae9a0e497c5.mkv
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '61bd0ed781b56eea1e8e851aab34a2ee3b59fbac':
h264: Log more information about invalid NALu size
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '80fbb7becae530167373fe5178966b7d7604306e':
checkasm: vp8.mc: initialize the full src buffer after ec32574209
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '17c99b6158f2c6720af74e81ee727ee50d2e7e96':
h2645_parse: handle embedded Annex B NAL units in size prefixed NAL units
This commit is a noop, see a9bb4cf87d
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'a8cbe5a0ccebf60a8a8b0aba5d5716dd54c1595c':
h264_ps: export actual height in MBs as SPS.mb_height
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '99cf943339a2e5171863c48cd1a73dd43dc243e1':
d3d11va: don't keep the context lock while waiting for a frame
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '2866d108c9e9da7baf53ff57a51d470691049a57':
vp8dsp: Remove the comment saying that the height is equal to the width
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '8c816c0c9b12fdefd9046415e97df299880bc9b8':
checkasm/arm: align the clobber check data properly for ldrd
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'ec32574209f36467ef0d22c21a7e811ba98c15b6':
checkasm: vp8: mc: test unequal width/height for partitions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'fc5cdc0d5372f5103c71d5dede296734fe71ead2':
doc: escape left brace in texi2pod.pl regex
This commit is a noop, see e43ea1cbb2
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'd825b1a5306576dcd0553b7d0d24a3a46ad92864':
libopenh264: Support building with the 1.6 release
This commit is a noop, see 293676c476
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '4f7723cb3b913c577842a5bb088c804ddacac8df':
movenc: Add an option for skipping writing the mfra/tfra/mfro trailer
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
A number of new pix_fmts* have been added to AviSynth+:
16-bit packed RGB and RGBA
10-, 12-, 14, and 16-bit YUV 4:2:0, 4:2:2, and 4:4:4
8-, 10-, 12-, 14-, and 16-bit Planar RGB
8-, 10-, 12-, 14-, and 16-bit Planar YUVA and Planar RGBA
10-, 12-, 14-, and 16-bit GRAY variants
32-bit floating point Planar YUV(A), Planar RGB(A), and GRAY
*some of which are not currently available pix_fmts here and were
not added to the demuxer due to this
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Stream timebase should be set using avpriv_set_pts_info, otherwise
avctx->pkt_timebase is not correct, leading to A/V desync.
Signed-off-by: Marton Balint <cus@passwd.hu>
Reviewed-by: Stephen Hutchinson <qyot27@gmail.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
uint32 need 4 bytes not 1.
Fix decoding when there is half/float and uint32 channel.
This fixes crashes due to pointer corruption caused by invalid writes.
The problem was introduced in commit
03152e74df.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
This fixes NULL pointer dereferencing for formats, where frame->data[1]
is not allocated.
The problem was introduced in commit
257fbc3af4.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
This prevented the code from correctly exporting the rotation matrix
which caused a few samples to be displayed wrong.
Introduced in ecd2ec69ce.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Fixes building with --disable-everything --enable-shared --enable-dxva2
The hwcontext DXVA2 implementation in avutil needs this library now, instead
of just the ffmpeg program.
The dc-only mode is already checked to work correctly above, but this
allows benchmarking this mode for performance tuning, and allows making
sure that it actually is correctly hooked up.
Signed-off-by: Martin Storsjö <martin@martin.st>
The latter is 1 cycle faster on a cortex-53 and since the operands are
bytewise (or larger) bitmask (impossible to overflow to zero) both are
equivalent.
Since aarch64 has enough free general purpose registers use them to
branch to the appropiate storage code. 1-2 cycles faster for the
functions using loop_filter 8/16, ... on a cortex-a53. Mixed results
(up to 2 cycles faster/slower) on a cortex-a57.
In the latest git commits of libilbc developers removed WebRtc_xxx typedefs.
This commit uses int types instead. It's safe to apply also for previous
versions since WebRtc_Word16 was always a typedef of int16_t and
WebRtc_UWord16 a typedef of uint16_t.
Reviewed-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
This work is sponsored by, and copyright, Google.
These are ported from the ARM version; thanks to the larger
amount of registers available, we can do the loop filters with
16 pixels at a time. The implementation is fully templated, with
a single macro which can generate versions for both 8 and
16 pixels wide, for both 4, 8 and 16 pixels loop filters
(and the 4/8 mixed versions as well).
For the 8 pixel wide versions, it is pretty close in speed (the
v_4_8 and v_8_8 filters are the best examples of this; the h_4_8
and h_8_8 filters seem to get some gain in the load/transpose/store
part). For the 16 pixels wide ones, we get a speedup of around
1.2-1.4x compared to the 32 bit version.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
ARM AArch64
vp9_loop_filter_h_4_8_neon: 144.0 127.2
vp9_loop_filter_h_8_8_neon: 207.0 182.5
vp9_loop_filter_h_16_8_neon: 415.0 328.7
vp9_loop_filter_h_16_16_neon: 672.0 558.6
vp9_loop_filter_mix2_h_44_16_neon: 302.0 203.5
vp9_loop_filter_mix2_h_48_16_neon: 365.0 305.2
vp9_loop_filter_mix2_h_84_16_neon: 365.0 305.2
vp9_loop_filter_mix2_h_88_16_neon: 376.0 305.2
vp9_loop_filter_mix2_v_44_16_neon: 193.2 128.2
vp9_loop_filter_mix2_v_48_16_neon: 246.7 218.4
vp9_loop_filter_mix2_v_84_16_neon: 248.0 218.5
vp9_loop_filter_mix2_v_88_16_neon: 302.0 218.2
vp9_loop_filter_v_4_8_neon: 89.0 88.7
vp9_loop_filter_v_8_8_neon: 141.0 137.7
vp9_loop_filter_v_16_8_neon: 295.0 272.7
vp9_loop_filter_v_16_16_neon: 546.0 453.7
The speedup vs C code in checkasm tests is around 2-7x, which is
pretty much the same as for the 32 bit version. Even if these functions
are faster than their 32 bit equivalent, the C version that we compare
to also became around 1.3-1.7x faster than the C version in 32 bit.
Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 4-5x.
Examples of runtimes vs C on a Cortex A57 (for a slightly older version
of the patch):
A57 gcc-5.3 neon
loop_filter_h_4_8_neon: 256.6 93.4
loop_filter_h_8_8_neon: 307.3 139.1
loop_filter_h_16_8_neon: 340.1 254.1
loop_filter_h_16_16_neon: 827.0 407.9
loop_filter_mix2_h_44_16_neon: 524.5 155.4
loop_filter_mix2_h_48_16_neon: 644.5 173.3
loop_filter_mix2_h_84_16_neon: 630.5 222.0
loop_filter_mix2_h_88_16_neon: 697.3 222.0
loop_filter_mix2_v_44_16_neon: 598.5 100.6
loop_filter_mix2_v_48_16_neon: 651.5 127.0
loop_filter_mix2_v_84_16_neon: 591.5 167.1
loop_filter_mix2_v_88_16_neon: 855.1 166.7
loop_filter_v_4_8_neon: 271.7 65.3
loop_filter_v_8_8_neon: 312.5 106.9
loop_filter_v_16_8_neon: 473.3 206.5
loop_filter_v_16_16_neon: 976.1 327.8
The speed-up compared to the C functions is 2.5 to 6 and the cortex-a57
is again 30-50% faster than the cortex-a53.
This is an adapted cherry-pick from libav commits
9d2afd1eb8 and
31756abe29.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This work is sponsored by, and copyright, Google.
These are ported from the ARM version; thanks to the larger
amount of registers available, we can do the 16x16 and 32x32
transforms in slices 8 pixels wide instead of 4. This gives
a speedup of around 1.4x compared to the 32 bit version.
The fact that aarch64 doesn't have the same d/q register
aliasing makes some of the macros quite a bit simpler as well.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
ARM AArch64
vp9_inv_adst_adst_4x4_add_neon: 90.0 87.7
vp9_inv_adst_adst_8x8_add_neon: 400.0 354.7
vp9_inv_adst_adst_16x16_add_neon: 2526.5 1827.2
vp9_inv_dct_dct_4x4_add_neon: 74.0 72.7
vp9_inv_dct_dct_8x8_add_neon: 271.0 256.7
vp9_inv_dct_dct_16x16_add_neon: 1960.7 1372.7
vp9_inv_dct_dct_32x32_add_neon: 11988.9 8088.3
vp9_inv_wht_wht_4x4_add_neon: 63.0 57.7
The speedup vs C code (2-4x) is smaller than in the 32 bit case,
mostly because the C code ends up significantly faster (around
1.6x faster, with GCC 5.4) when built for aarch64.
Examples of runtimes vs C on a Cortex A57 (for a slightly older version
of the patch):
A57 gcc-5.3 neon
vp9_inv_adst_adst_4x4_add_neon: 152.2 60.0
vp9_inv_adst_adst_8x8_add_neon: 948.2 288.0
vp9_inv_adst_adst_16x16_add_neon: 4830.4 1380.5
vp9_inv_dct_dct_4x4_add_neon: 153.0 58.6
vp9_inv_dct_dct_8x8_add_neon: 789.2 180.2
vp9_inv_dct_dct_16x16_add_neon: 3639.6 917.1
vp9_inv_dct_dct_32x32_add_neon: 20462.1 4985.0
vp9_inv_wht_wht_4x4_add_neon: 91.0 49.8
The asm is around factor 3-4 faster than C on the cortex-a57 and the asm
is around 30-50% faster on the a57 compared to the a53.
This is an adapted cherry-pick from libav commit
3c9546dfaf.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This work is sponsored by, and copyright, Google.
These are ported from the ARM version; it is essentially a 1:1
port with no extra added features, but with some hand tuning
(especially for the plain copy/avg functions). The ARM version
isn't very register starved to begin with, so there's not much
to be gained from having more spare registers here - we only
avoid having to clobber callee-saved registers.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
ARM AArch64
vp9_avg4_neon: 27.2 23.7
vp9_avg8_neon: 56.5 54.7
vp9_avg16_neon: 169.9 167.4
vp9_avg32_neon: 585.8 585.2
vp9_avg64_neon: 2460.3 2294.7
vp9_avg_8tap_smooth_4h_neon: 132.7 125.2
vp9_avg_8tap_smooth_4hv_neon: 478.8 442.0
vp9_avg_8tap_smooth_4v_neon: 126.0 93.7
vp9_avg_8tap_smooth_8h_neon: 241.7 234.2
vp9_avg_8tap_smooth_8hv_neon: 690.9 646.5
vp9_avg_8tap_smooth_8v_neon: 245.0 205.5
vp9_avg_8tap_smooth_64h_neon: 11273.2 11280.1
vp9_avg_8tap_smooth_64hv_neon: 22980.6 22184.1
vp9_avg_8tap_smooth_64v_neon: 11549.7 10781.1
vp9_put4_neon: 18.0 17.2
vp9_put8_neon: 40.2 37.7
vp9_put16_neon: 97.4 99.5
vp9_put32_neon/armv8: 346.0 307.4
vp9_put64_neon/armv8: 1319.0 1107.5
vp9_put_8tap_smooth_4h_neon: 126.7 118.2
vp9_put_8tap_smooth_4hv_neon: 465.7 434.0
vp9_put_8tap_smooth_4v_neon: 113.0 86.5
vp9_put_8tap_smooth_8h_neon: 229.7 221.6
vp9_put_8tap_smooth_8hv_neon: 658.9 621.3
vp9_put_8tap_smooth_8v_neon: 215.0 187.5
vp9_put_8tap_smooth_64h_neon: 10636.7 10627.8
vp9_put_8tap_smooth_64hv_neon: 21076.8 21026.9
vp9_put_8tap_smooth_64v_neon: 9635.0 9632.4
These are generally about as fast as the corresponding ARM
routines on the same CPU (at least on the A53), in most cases
marginally faster.
The speedup vs C code is pretty much the same as for the 32 bit
case; on the A53 it's around 6-13x for ther larger 8tap filters.
The exact speedup varies a little, since the C versions generally
don't end up exactly as slow/fast as on 32 bit.
This is an adapted cherry-pick from libav commit
383d96aa22.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
With apple tools, the linker fails with errors like these, if the
offset is negative:
ld: in section __TEXT,__text reloc 8: symbol index out of range for architecture arm64
This is cherry-picked from libav commit
c44a8a3eab.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This work is sponsored by, and copyright, Google.
The implementation tries to have smart handling of cases
where no pixels need the full filtering for the 8/16 width
filters, skipping both calculation and writeback of the
unmodified pixels in those cases. The actual effect of this
is hard to test with checkasm though, since it tests the
full filtering, and the benefit depends on how many filtered
blocks use the shortcut.
Examples of relative speedup compared to the C version, from checkasm:
Cortex A7 A8 A9 A53
vp9_loop_filter_h_4_8_neon: 2.72 2.68 1.78 3.15
vp9_loop_filter_h_8_8_neon: 2.36 2.38 1.70 2.91
vp9_loop_filter_h_16_8_neon: 1.80 1.89 1.45 2.01
vp9_loop_filter_h_16_16_neon: 2.81 2.78 2.18 3.16
vp9_loop_filter_mix2_h_44_16_neon: 2.65 2.67 1.93 3.05
vp9_loop_filter_mix2_h_48_16_neon: 2.46 2.38 1.81 2.85
vp9_loop_filter_mix2_h_84_16_neon: 2.50 2.41 1.73 2.85
vp9_loop_filter_mix2_h_88_16_neon: 2.77 2.66 1.96 3.23
vp9_loop_filter_mix2_v_44_16_neon: 4.28 4.46 3.22 5.70
vp9_loop_filter_mix2_v_48_16_neon: 3.92 4.00 3.03 5.19
vp9_loop_filter_mix2_v_84_16_neon: 3.97 4.31 2.98 5.33
vp9_loop_filter_mix2_v_88_16_neon: 3.91 4.19 3.06 5.18
vp9_loop_filter_v_4_8_neon: 4.53 4.47 3.31 6.05
vp9_loop_filter_v_8_8_neon: 3.58 3.99 2.92 5.17
vp9_loop_filter_v_16_8_neon: 3.40 3.50 2.81 4.68
vp9_loop_filter_v_16_16_neon: 4.66 4.41 3.74 6.02
The speedup vs C code is around 2-6x. The numbers are quite
inconclusive though, since the checkasm test runs multiple filterings
on top of each other, so later rounds might end up with different
codepaths (different decisions on which filter to apply, based
on input pixel differences). Disabling the early-exit in the asm
doesn't give a fair comparison either though, since the C code
only does the necessary calcuations for each row.
Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 4-9x.
This is pretty similar in runtime to the corresponding routines
in libvpx. (This is comparing vpx_lpf_vertical_16_neon,
vpx_lpf_horizontal_edge_8_neon and vpx_lpf_horizontal_edge_16_neon
to vp9_loop_filter_h_16_8_neon, vp9_loop_filter_v_16_8_neon
and vp9_loop_filter_v_16_16_neon - note that the naming of horizonal
and vertical is flipped between the libraries.)
In order to have stable, comparable numbers, the early exits in both
asm versions were disabled, forcing the full filtering codepath.
Cortex A7 A8 A9 A53
vp9_loop_filter_h_16_8_neon: 597.2 472.0 482.4 415.0
libvpx vpx_lpf_vertical_16_neon: 626.0 464.5 470.7 445.0
vp9_loop_filter_v_16_8_neon: 500.2 422.5 429.7 295.0
libvpx vpx_lpf_horizontal_edge_8_neon: 586.5 414.5 415.6 383.2
vp9_loop_filter_v_16_16_neon: 905.0 784.7 791.5 546.0
libvpx vpx_lpf_horizontal_edge_16_neon: 1060.2 751.7 743.5 685.2
Our version is consistently faster on on A7 and A53, marginally slower on
A8, and sometimes faster, sometimes slower on A9 (marginally slower in all
three tests in this particular test run).
This is an adapted cherry-pick from libav commit
dd299a2d6d.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This work is sponsored by, and copyright, Google.
For the transforms up to 8x8, we can fit all the data (including
temporaries) in registers and just do a straightforward transform
of all the data. For 16x16, we do a transform of 4x16 pixels in
4 slices, using a temporary buffer. For 32x32, we transform 4x32
pixels at a time, in two steps of 4x16 pixels each.
Examples of relative speedup compared to the C version, from checkasm:
Cortex A7 A8 A9 A53
vp9_inv_adst_adst_4x4_add_neon: 3.39 5.83 4.17 4.01
vp9_inv_adst_adst_8x8_add_neon: 3.79 4.86 4.23 3.98
vp9_inv_adst_adst_16x16_add_neon: 3.33 4.36 4.11 4.16
vp9_inv_dct_dct_4x4_add_neon: 4.06 6.16 4.59 4.46
vp9_inv_dct_dct_8x8_add_neon: 4.61 6.01 4.98 4.86
vp9_inv_dct_dct_16x16_add_neon: 3.35 3.44 3.36 3.79
vp9_inv_dct_dct_32x32_add_neon: 3.89 3.50 3.79 4.42
vp9_inv_wht_wht_4x4_add_neon: 3.22 5.13 3.53 3.77
Thus, the speedup vs C code is around 3-6x.
This is mostly marginally faster than the corresponding routines
in libvpx on most cores, tested with their 32x32 idct (compared to
vpx_idct32x32_1024_add_neon). These numbers are slightly in libvpx's
favour since their version doesn't clear the input buffer like ours
do (although the effect of that on the total runtime probably is
negligible.)
Cortex A7 A8 A9 A53
vp9_inv_dct_dct_32x32_add_neon: 18436.8 16874.1 14235.1 11988.9
libvpx vpx_idct32x32_1024_add_neon 20789.0 13344.3 15049.9 13030.5
Only on the Cortex A8, the libvpx function is faster. On the other cores,
ours is slightly faster even though ours has got source block clearing
integrated.
This is an adapted cherry-pick from libav commits
a67ae67083 and
52d196fb30.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This work is sponsored by, and copyright, Google.
The filter coefficients are signed values, where the product of the
multiplication with one individual filter coefficient doesn't
overflow a 16 bit signed value (the largest filter coefficient is
127). But when the products are accumulated, the resulting sum can
overflow the 16 bit signed range. Instead of accumulating in 32 bit,
we accumulate the largest product (either index 3 or 4) last with a
saturated addition.
(The VP8 MC asm does something similar, but slightly simpler, by
accumulating each half of the filter separately. In the VP9 MC
filters, each half of the filter can also overflow though, so the
largest component has to be handled individually.)
Examples of relative speedup compared to the C version, from checkasm:
Cortex A7 A8 A9 A53
vp9_avg4_neon: 1.71 1.15 1.42 1.49
vp9_avg8_neon: 2.51 3.63 3.14 2.58
vp9_avg16_neon: 2.95 6.76 3.01 2.84
vp9_avg32_neon: 3.29 6.64 2.85 3.00
vp9_avg64_neon: 3.47 6.67 3.14 2.80
vp9_avg_8tap_smooth_4h_neon: 3.22 4.73 2.76 4.67
vp9_avg_8tap_smooth_4hv_neon: 3.67 4.76 3.28 4.71
vp9_avg_8tap_smooth_4v_neon: 5.52 7.60 4.60 6.31
vp9_avg_8tap_smooth_8h_neon: 6.22 9.04 5.12 9.32
vp9_avg_8tap_smooth_8hv_neon: 6.38 8.21 5.72 8.17
vp9_avg_8tap_smooth_8v_neon: 9.22 12.66 8.15 11.10
vp9_avg_8tap_smooth_64h_neon: 7.02 10.23 5.54 11.58
vp9_avg_8tap_smooth_64hv_neon: 6.76 9.46 5.93 9.40
vp9_avg_8tap_smooth_64v_neon: 10.76 14.13 9.46 13.37
vp9_put4_neon: 1.11 1.47 1.00 1.21
vp9_put8_neon: 1.23 2.17 1.94 1.48
vp9_put16_neon: 1.63 4.02 1.73 1.97
vp9_put32_neon: 1.56 4.92 2.00 1.96
vp9_put64_neon: 2.10 5.28 2.03 2.35
vp9_put_8tap_smooth_4h_neon: 3.11 4.35 2.63 4.35
vp9_put_8tap_smooth_4hv_neon: 3.67 4.69 3.25 4.71
vp9_put_8tap_smooth_4v_neon: 5.45 7.27 4.49 6.52
vp9_put_8tap_smooth_8h_neon: 5.97 8.18 4.81 8.56
vp9_put_8tap_smooth_8hv_neon: 6.39 7.90 5.64 8.15
vp9_put_8tap_smooth_8v_neon: 9.03 11.84 8.07 11.51
vp9_put_8tap_smooth_64h_neon: 6.78 9.48 4.88 10.89
vp9_put_8tap_smooth_64hv_neon: 6.99 8.87 5.94 9.56
vp9_put_8tap_smooth_64v_neon: 10.69 13.30 9.43 14.34
For the larger 8tap filters, the speedup vs C code is around 5-14x.
This is significantly faster than libvpx's implementation of the same
functions, at least when comparing the put_8tap_smooth_64 functions
(compared to vpx_convolve8_horiz_neon and vpx_convolve8_vert_neon from
libvpx).
Absolute runtimes from checkasm:
Cortex A7 A8 A9 A53
vp9_put_8tap_smooth_64h_neon: 20150.3 14489.4 19733.6 10863.7
libvpx vpx_convolve8_horiz_neon: 52623.3 19736.4 21907.7 25027.7
vp9_put_8tap_smooth_64v_neon: 14455.0 12303.9 13746.4 9628.9
libvpx vpx_convolve8_vert_neon: 42090.0 17706.2 17659.9 16941.2
Thus, on the A9, the horizontal filter is only marginally faster than
libvpx, while our version is significantly faster on the other cores,
and the vertical filter is significantly faster on all cores. The
difference is especially large on the A7.
The libvpx implementation does the accumulation in 32 bit, which
probably explains most of the differences.
This is an adapted cherry-pick from libav commits
ffbd1d2b00,
392caa65df,
557c1675cf and
11623217e3.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
We reset .Lpic_gp to zero at the start of each function, which means
that the logic within movrelx for clearing gp when necessary will
be missed.
This fixes using movrelx in different functions with a different
helper register.
This is cherry-picked from libav commit
824e8c2840.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Make them aligned, to allow efficient access to them from simd.
This is an adapted cherry-pick from libav commit
a4cfcddcb0.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Also a small cosmetic change to the avx2 idct16 version to make it
explicit that one of the arguments to the write-out macros is unused
for >=avx2 (it uses pmovzxbw instead of punpcklbw).
libavfilter/af_asyncts.c:212:9: warning: absolute value function 'labs' given an argument of type 'int64_t' (aka 'long long') but has parameter of type 'long' which may cause truncation of value [-Wabsolute-value]
This was correct for H.26[45], because libmfx uses the same values
derived from profile_idc and the constraint_set flags, but it is
wrong for other codecs.
Also avoid passing FF_LEVEL_UNKNOWN (-99) as the level, as this is
certainly invalid.
* commit 'dc08bbf63a217c839aa4c143f2a1d0b7e2e6d997':
vp8dsp: Clarify the first dimension of the mc function tables
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '924e2ecd2b7d51cca60c79351ef16b04dd4245c3':
qsvdec: when a frames ctx is supplied, use its frame dimensions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '92736c74fb1633e36f7134a880422a9b7db14d3f':
qsvdec: add support for P010 (10-bit 420) decoding
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'ce320cf1c4daab3e2e3726ed7d2e879d10f7b991':
qsvdec: use the same mfxFrameInfo for allocating frames that was passed to DECODE_Init
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '536bb17e9659c5ed7576a218d4085cdd6d5742fa':
qsvdec: make ff_qsv_map_pixfmt() return a MFX fourcc as well
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '4926fa9a4aa03f3b751f52e900b9efb87fea0591':
hwcontext_vaapi: Add driver quirks to the hwdevice
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'e78e5b735fd559bc7aa3f5a01e9c8d37dc2ec6d8':
swscale: add P010 input support
This commit is a noop, see 2e31434d84
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'b7c5f885233a7b8692140c920d9f43220dc830d9':
pixfmt: add P010 pixel format
This commit is a noop, see c2869b4640
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'b55566db4c51d920a6496455bb30a608e5a50a41':
avconv: use avcodec_parameters_copy() with streamcopy
The fate-aac-autobsf-adtstoasc changes from writing an audio bitdepth
based on the sample format, which is now available.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'be3e807c8fad1f82766c083073e44396799f155b':
oggparseopus: export pre-skip
Noop, we already export this information
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '029cf99c5166b36f33381cd8ebfa5f1f1f463d1f':
mov: Save number of stsd elements after stream extradata allocation
Mostly noop, see 8b43ee4054
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '6c445990e64124ad64c79423dfd3764520648c89':
tiffenc: Check zlib support for deflate option during initialization
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'd8f3b0fb584677d4882e3a2d7c28f8b15c7319f5':
targaenc: Move size check to initialization function
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '67cb2c0f73ec08bdcecd675c1ffe25c3a5b26ef2':
checkasm: hevc: Iterate over features first, then over bitdepths
Noop, we don't have these checkasm tests.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'fe27792fd779ac4cdd5e57be5f6f488483c307b2':
build: Move ff_mpeg12_frame_rate_tab to a separate file
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '8c929037ec75fbe9f367e0a31ee34839e92de481':
build: Add a new component for H.264 parsing code
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '3c08b7bc761b6411f55db68189721638dde2c46a':
ffv1: Report additional bitstream information in verbose mode
Noop, we already have bitstream information printing.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'fe6e5cbea7dbd5d2c67d79b5570e26debb70e95b':
ffv1: Remove version 2 and mark version 3 as non-experimental
Noop, our ffv1 decoder is far more advanced and version 3 has been stable for a while.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '7c55fac7dfa8bad9644dea5d03309da30be69563':
fate: Add test for webp
Noop, we already have a variety of webp tests, including a fate-webp target,
which would collide with this test.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
__MAC_10_11 can be present in updated revision of an older SDK so it
can't reliably detect availability of kAudioFormatEnhancedAC3 constant.
Fixes: b4daa2c40f ('lavc/audiotoolboxdec: add eac3 decoder')
Cc: Rodger Combs <rodger.combs@gmail.com>
Signed-off-by: Dmitry Kalinkin <dmitry.kalinkin@gmail.com>
Previous version reviewed by: Rodger Combs <rodger.combs@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This work is sponsored by, and copyright, Google.
These are ported from the ARM version; thanks to the larger
amount of registers available, we can do the 16x16 and 32x32
transforms in slices 8 pixels wide instead of 4. This gives
a speedup of around 1.4x compared to the 32 bit version.
The fact that aarch64 doesn't have the same d/q register
aliasing makes some of the macros quite a bit simpler as well.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
ARM AArch64
vp9_inv_adst_adst_4x4_add_neon: 90.0 87.7
vp9_inv_adst_adst_8x8_add_neon: 400.0 354.7
vp9_inv_adst_adst_16x16_add_neon: 2526.5 1827.2
vp9_inv_dct_dct_4x4_add_neon: 74.0 72.7
vp9_inv_dct_dct_8x8_add_neon: 271.0 256.7
vp9_inv_dct_dct_16x16_add_neon: 1960.7 1372.7
vp9_inv_dct_dct_32x32_add_neon: 11988.9 8088.3
vp9_inv_wht_wht_4x4_add_neon: 63.0 57.7
The speedup vs C code (2-4x) is smaller than in the 32 bit case,
mostly because the C code ends up significantly faster (around
1.6x faster, with GCC 5.4) when built for aarch64.
Examples of runtimes vs C on a Cortex A57 (for a slightly older version
of the patch):
A57 gcc-5.3 neon
vp9_inv_adst_adst_4x4_add_neon: 152.2 60.0
vp9_inv_adst_adst_8x8_add_neon: 948.2 288.0
vp9_inv_adst_adst_16x16_add_neon: 4830.4 1380.5
vp9_inv_dct_dct_4x4_add_neon: 153.0 58.6
vp9_inv_dct_dct_8x8_add_neon: 789.2 180.2
vp9_inv_dct_dct_16x16_add_neon: 3639.6 917.1
vp9_inv_dct_dct_32x32_add_neon: 20462.1 4985.0
vp9_inv_wht_wht_4x4_add_neon: 91.0 49.8
The asm is around factor 3-4 faster than C on the cortex-a57 and the asm
is around 30-50% faster on the a57 compared to the a53.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
These are ported from the ARM version; thanks to the larger
amount of registers available, we can do the loop filters with
16 pixels at a time. The implementation is fully templated, with
a single macro which can generate versions for both 8 and
16 pixels wide, for both 4, 8 and 16 pixels loop filters
(and the 4/8 mixed versions as well).
For the 8 pixel wide versions, it is pretty close in speed (the
v_4_8 and v_8_8 filters are the best examples of this; the h_4_8
and h_8_8 filters seem to get some gain in the load/transpose/store
part). For the 16 pixels wide ones, we get a speedup of around
1.2-1.4x compared to the 32 bit version.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
ARM AArch64
vp9_loop_filter_h_4_8_neon: 144.0 127.2
vp9_loop_filter_h_8_8_neon: 207.0 182.5
vp9_loop_filter_h_16_8_neon: 415.0 328.7
vp9_loop_filter_h_16_16_neon: 672.0 558.6
vp9_loop_filter_mix2_h_44_16_neon: 302.0 203.5
vp9_loop_filter_mix2_h_48_16_neon: 365.0 305.2
vp9_loop_filter_mix2_h_84_16_neon: 365.0 305.2
vp9_loop_filter_mix2_h_88_16_neon: 376.0 305.2
vp9_loop_filter_mix2_v_44_16_neon: 193.2 128.2
vp9_loop_filter_mix2_v_48_16_neon: 246.7 218.4
vp9_loop_filter_mix2_v_84_16_neon: 248.0 218.5
vp9_loop_filter_mix2_v_88_16_neon: 302.0 218.2
vp9_loop_filter_v_4_8_neon: 89.0 88.7
vp9_loop_filter_v_8_8_neon: 141.0 137.7
vp9_loop_filter_v_16_8_neon: 295.0 272.7
vp9_loop_filter_v_16_16_neon: 546.0 453.7
The speedup vs C code in checkasm tests is around 2-7x, which is
pretty much the same as for the 32 bit version. Even if these functions
are faster than their 32 bit equivalent, the C version that we compare
to also became around 1.3-1.7x faster than the C version in 32 bit.
Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 4-5x.
Examples of runtimes vs C on a Cortex A57 (for a slightly older version
of the patch):
A57 gcc-5.3 neon
loop_filter_h_4_8_neon: 256.6 93.4
loop_filter_h_8_8_neon: 307.3 139.1
loop_filter_h_16_8_neon: 340.1 254.1
loop_filter_h_16_16_neon: 827.0 407.9
loop_filter_mix2_h_44_16_neon: 524.5 155.4
loop_filter_mix2_h_48_16_neon: 644.5 173.3
loop_filter_mix2_h_84_16_neon: 630.5 222.0
loop_filter_mix2_h_88_16_neon: 697.3 222.0
loop_filter_mix2_v_44_16_neon: 598.5 100.6
loop_filter_mix2_v_48_16_neon: 651.5 127.0
loop_filter_mix2_v_84_16_neon: 591.5 167.1
loop_filter_mix2_v_88_16_neon: 855.1 166.7
loop_filter_v_4_8_neon: 271.7 65.3
loop_filter_v_8_8_neon: 312.5 106.9
loop_filter_v_16_8_neon: 473.3 206.5
loop_filter_v_16_16_neon: 976.1 327.8
The speed-up compared to the C functions is 2.5 to 6 and the cortex-a57
is again 30-50% faster than the cortex-a53.
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit 'e48746deec48e9ff195841bc3266b4e153a878cd':
checkasm: h264dsp: Move the x and y variables into the randomize_buffer macro
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '82b7525173f20702a8cbc26ebedbf4b69b8fecec':
Add an OpenH264 decoder wrapper
This commit is a noop, see c5d326f551
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '785c25443b56adb6dbbb78d68cccbd9bd4a42e05':
movenc: Apply offsets on timestamps when peeking into interleaving queues
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'eccfb9778ae939764d17457f34338d140832d9e1':
qsvdec_hevc: add the UID of the HEVC HW decoder plugin
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'c3f113d58488df7594a489bdbb993a69ad47063c':
vf_hwdownload: allocate the destination frame for the pool size
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'fdfe01365d579189d9a55b3741dba2ac46eb1df8':
hwcontext: allocate the destination frame for the pool size
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '5fcae3b3f93fd02b3d1e009b9d9b17410fca9498':
hwcontext: clarify the behaviour of transfer_data() for cropped frames
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '94ebf5565849e4dc036d2ca43979571ed3736457':
avconv: restructure sending EOF to filters
Noop, as its a fixup to a previously skipped commit
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'd2e56cf753a6c462041dee897d9d0c90f349988c':
avconv: move flushing the queued frames to configure_filtergraph()
Noop, as its a fixup to a previously skipped commit
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
The Intel binary iHD driver does not support the
VASurfaceAttribMemoryType, so surface allocation will fail when using
it.
(cherry picked from commit 2124711b95)
If no string argument is supplied when av_hwdevice_ctx_create() is
called to create a VAAPI device, we currently only try the default
X11 display (that is, $DISPLAY) to find a device, and will therefore
fail in the absence of an X server to connect to. Change the logic
to also look for a device via the first DRM render node (that is,
"/dev/dri/renderD128"), which is probably the right thing to use in
most simple configurations which only have one DRM device.
(cherry picked from commit 121f34d5f0)
No longer leaks memory when used with a driver with the "render does
not destroy param buffers" quirk (i.e. Intel i965).
(cherry picked from commit 221ffca631)
Fixes ticket #5871.
The driver being used is detected inside av_hwdevice_ctx_init() and
the quirks field then set from a table of known device. If this
behaviour is unwanted, the user can also set the quirks field
manually.
Also adds the Intel i965 driver quirk (it does not destroy parameter
buffers used in a call to vaRenderPicture()) and detects that driver
to set it.
(cherry picked from commit 4926fa9a4a)
Set up the encoder with a hardware context which will match the one
the decoder will use when it starts later.
Includes 02c2761973, with additional
hackery to get around a3a0230a98 being
skipped.
* commit '8a62d2c28fbacd1ae20c35887a1eecba2be14371':
vaapi_encode: Maintain a pool of bitstream output buffers
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '4a081f224e12f4227ae966bcbdd5384f22121ecf':
libavcodec: fix constness in clobber test avcodec_open2() wrappers
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '02c2761973dfc886e94a60a9c7d6d30c296d5b8c':
avconv_qsv: use the device creation API
Not merged, our ffmpeg hwaccel infra is not quite the same as avconvs.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '232399e3ee219d16d0e0d482c9f31a26202d4993':
avconv: pass the hwaccel frames context to the decoder
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'a3a0230a9870b9018dc7415ae5872784d524cfe5':
avconv: init filtergraphs only after we have a frame on each input
This commit is a noop since it doesn't apply cleanly due to differences
in the dataflow between avconv and ffmpeg, and thus fixing this in the
scope of a merge is unfeasible.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '3e265ca58f0505470186dce300ab66a6eac3978e':
avconv: do packet ts rescaling in write_packet()
This commit is a noop since it doesn't apply cleanly due to differences
in the dataflow between avconv and ffmpeg, and thus fixing this in the
scope of a merge is unfeasible.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'ba7397baef796ca3991fe1c921bc91054407c48b':
avconv: factor out initializing stream parameters for encoding
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
The handling of the other block sizes was limited to 'SCALED == 0' in
commit dc96c0f9fc, so this assert should
be disabled, too, as it can now be triggered.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Fixes building for Windows x86 with MSVC using the link libraries distributed with the CUDA SDK.
check_lib2 is required here because it includes the header to get the full signature of the
function, including the stdcall calling convention and all of its arguments, which enables
the linker to determine the fully qualified object name and resolve it through the import
library, since the CUDA SDK libraries do not include un-qualified aliases.
AVFilterLink.frame_count is supposed to count the number of frames
that were passed on the link, but with min_samples, that number is
not always the same for the source and destination filters.
With the addition of a FIFO on the link, the difference will become
more significant.
Split the variable in two: frame_count_in counts the number of
frames that entered the link, frame_count_out counts the number
of frames that were sent to the destination filter.
The test is not supposed to cover audio.
Also, using -vframes along with an audio stream depends on
the exact order the frames are processed by filters, it is
too much constraint to guarantee.
Fixes valgrind warning about "Conditional jump or move depends on uninitialised value(s)"
Reviewed-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
libavcodec/ratecontrol.c:120:9: warning: ISO C forbids initialization between function pointer and ‘void *’ [-Wpedantic]
libavcodec/ratecontrol.c:121:9: warning: ISO C forbids initialization between function pointer and ‘void *’ [-Wpedantic]
Otherwise put_bits can be called with a value that doesn't fit in the
sample_len, causing an assertion failure.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
From 'man ppm': The maximum color value (Maxval), again in ASCII decimal.
Must be less than 65536.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Also contains the following changes to the library:
- add ff_ prefix to functions
- remove cplusplus defines.
- add FF_ prefix to contants and some structs
- remove true peak calculation feature, since it uses its own resampler, and
af_loudnorm does not need it.
- remove version info and some fprintf(stderr) functions
- convert to use av_malloc
- always use histogram mode for LRA calculation, otherwise LRA data is slowly
consuming memory making af_loudnorm unfit for 24/7 operation. It also uses a
BSD style linked list implementation which is probably not available on all
platforms. So let's just remove the classic mode which not uses histogram.
- add ff_thread_once for calculating static histogram tables
- convert some functions to void which cannot fail
- remove intrinsics and some unused headers
- add support for planar audio
- remove channel / sample rate changer function, in ffmpeg usually we simply
alloc a new context
- convert some static variables to defines
- declare static histogram variables as aligned
- convert some initalizations to mallocz
- add window size parameter to init function and remove window size setter
function
- convert return codes to AVERROR
- fix indentation
Signed-off-by: Marton Balint <cus@passwd.hu>
This work is sponsored by, and copyright, Google.
The implementation tries to have smart handling of cases
where no pixels need the full filtering for the 8/16 width
filters, skipping both calculation and writeback of the
unmodified pixels in those cases. The actual effect of this
is hard to test with checkasm though, since it tests the
full filtering, and the benefit depends on how many filtered
blocks use the shortcut.
Examples of relative speedup compared to the C version, from checkasm:
Cortex A7 A8 A9 A53
vp9_loop_filter_h_4_8_neon: 2.72 2.68 1.78 3.15
vp9_loop_filter_h_8_8_neon: 2.36 2.38 1.70 2.91
vp9_loop_filter_h_16_8_neon: 1.80 1.89 1.45 2.01
vp9_loop_filter_h_16_16_neon: 2.81 2.78 2.18 3.16
vp9_loop_filter_mix2_h_44_16_neon: 2.65 2.67 1.93 3.05
vp9_loop_filter_mix2_h_48_16_neon: 2.46 2.38 1.81 2.85
vp9_loop_filter_mix2_h_84_16_neon: 2.50 2.41 1.73 2.85
vp9_loop_filter_mix2_h_88_16_neon: 2.77 2.66 1.96 3.23
vp9_loop_filter_mix2_v_44_16_neon: 4.28 4.46 3.22 5.70
vp9_loop_filter_mix2_v_48_16_neon: 3.92 4.00 3.03 5.19
vp9_loop_filter_mix2_v_84_16_neon: 3.97 4.31 2.98 5.33
vp9_loop_filter_mix2_v_88_16_neon: 3.91 4.19 3.06 5.18
vp9_loop_filter_v_4_8_neon: 4.53 4.47 3.31 6.05
vp9_loop_filter_v_8_8_neon: 3.58 3.99 2.92 5.17
vp9_loop_filter_v_16_8_neon: 3.40 3.50 2.81 4.68
vp9_loop_filter_v_16_16_neon: 4.66 4.41 3.74 6.02
The speedup vs C code is around 2-6x. The numbers are quite
inconclusive though, since the checkasm test runs multiple filterings
on top of each other, so later rounds might end up with different
codepaths (different decisions on which filter to apply, based
on input pixel differences). Disabling the early-exit in the asm
doesn't give a fair comparison either though, since the C code
only does the necessary calcuations for each row.
Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 4-9x.
This is pretty similar in runtime to the corresponding routines
in libvpx. (This is comparing vpx_lpf_vertical_16_neon,
vpx_lpf_horizontal_edge_8_neon and vpx_lpf_horizontal_edge_16_neon
to vp9_loop_filter_h_16_8_neon, vp9_loop_filter_v_16_8_neon
and vp9_loop_filter_v_16_16_neon - note that the naming of horizonal
and vertical is flipped between the libraries.)
In order to have stable, comparable numbers, the early exits in both
asm versions were disabled, forcing the full filtering codepath.
Cortex A7 A8 A9 A53
vp9_loop_filter_h_16_8_neon: 597.2 472.0 482.4 415.0
libvpx vpx_lpf_vertical_16_neon: 626.0 464.5 470.7 445.0
vp9_loop_filter_v_16_8_neon: 500.2 422.5 429.7 295.0
libvpx vpx_lpf_horizontal_edge_8_neon: 586.5 414.5 415.6 383.2
vp9_loop_filter_v_16_16_neon: 905.0 784.7 791.5 546.0
libvpx vpx_lpf_horizontal_edge_16_neon: 1060.2 751.7 743.5 685.2
Our version is consistently faster on on A7 and A53, marginally slower on
A8, and sometimes faster, sometimes slower on A9 (marginally slower in all
three tests in this particular test run).
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
For the transforms up to 8x8, we can fit all the data (including
temporaries) in registers and just do a straightforward transform
of all the data. For 16x16, we do a transform of 4x16 pixels in
4 slices, using a temporary buffer. For 32x32, we transform 4x32
pixels at a time, in two steps of 4x16 pixels each.
Examples of relative speedup compared to the C version, from checkasm:
Cortex A7 A8 A9 A53
vp9_inv_adst_adst_4x4_add_neon: 3.39 5.83 4.17 4.01
vp9_inv_adst_adst_8x8_add_neon: 3.79 4.86 4.23 3.98
vp9_inv_adst_adst_16x16_add_neon: 3.33 4.36 4.11 4.16
vp9_inv_dct_dct_4x4_add_neon: 4.06 6.16 4.59 4.46
vp9_inv_dct_dct_8x8_add_neon: 4.61 6.01 4.98 4.86
vp9_inv_dct_dct_16x16_add_neon: 3.35 3.44 3.36 3.79
vp9_inv_dct_dct_32x32_add_neon: 3.89 3.50 3.79 4.42
vp9_inv_wht_wht_4x4_add_neon: 3.22 5.13 3.53 3.77
Thus, the speedup vs C code is around 3-6x.
This is mostly marginally faster than the corresponding routines
in libvpx on most cores, tested with their 32x32 idct (compared to
vpx_idct32x32_1024_add_neon). These numbers are slightly in libvpx's
favour since their version doesn't clear the input buffer like ours
do (although the effect of that on the total runtime probably is
negligible.)
Cortex A7 A8 A9 A53
vp9_inv_dct_dct_32x32_add_neon: 18436.8 16874.1 14235.1 11988.9
libvpx vpx_idct32x32_1024_add_neon 20789.0 13344.3 15049.9 13030.5
Only on the Cortex A8, the libvpx function is faster. On the other cores,
ours is slightly faster even though ours has got source block clearing
integrated.
Signed-off-by: Martin Storsjö <martin@martin.st>
Documents options and behaviour, noting when 'chunks' option will
not be honoured.
Signed-off-by: Tom Butterworth <bangnoise@gmail.com>
Signed-off-by: Martin Vignali <martin.vignali@gmail.com>
It can read less than the requested amount, in which case buf contains
uninitialized data, causing problems like segmentation faults later on.
Also make sure that image->size is positive, so that it can't match a
negative error code.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
When decoding with threads enabled, the get_format callback will be
called with one of the per-thread codec contexts rather than with the
outer context. If a hwaccel is in use too, this will add a reference
to the hardware frames context on that codec context, which will then
propagate to all of the other per-thread contexts for decoding. Once
the decoder finishes, however, the per-thread contexts are not freed
normally, so these references leak.
The implicit checks via v_data_size and a_data_size don't work in the case
'(hdr_size > 7) && !ctx->alpha_info'.
This fixes segmentation faults due to invalid reads.
This problem was introduced in commit
547c2f002a.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
The secondary compression in Hap is optional, this change exposes that option to
the user as some use-cases favour higher bitrate files to reduce workload
decoding.
Adds "none" or "snappy" as options for "compressor". Selecting "none" disregards
"chunks" option: chunking is only of benefit decompressing Snappy.
Reviewed-by: Martin Vignali <martin.vignali@gmail.com>
Signed-off-by: Tom Butterworth <bangnoise@gmail.com>
This fixes crashes since 557c1675cf in linux PIC builds.
Previously, movrelx silently used r12 as helper register, which
doesn't work when r12 is the destination register.
Signed-off-by: Martin Storsjö <martin@martin.st>
We reset .Lpic_gp to zero at the start of each function, which means
that the logic within movrelx for clearing gp when necessary will
be missed.
This fixes using movrelx in different functions with a different
helper register.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
The speedup for the large horizontal filters is surprisingly
big on A7 and A53, while there's a minor slowdown (almost within
measurement noise) on A8 and A9.
Cortex A7 A8 A9 A53
orig:
vp9_put_8tap_smooth_64h_neon: 20270.0 14447.3 19723.9 10910.9
new:
vp9_put_8tap_smooth_64h_neon: 20165.8 14466.5 19730.2 10668.8
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
These are ported from the ARM version; it is essentially a 1:1
port with no extra added features, but with some hand tuning
(especially for the plain copy/avg functions). The ARM version
isn't very register starved to begin with, so there's not much
to be gained from having more spare registers here - we only
avoid having to clobber callee-saved registers.
Examples of runtimes vs the 32 bit version, on a Cortex A53:
ARM AArch64
vp9_avg4_neon: 27.2 23.7
vp9_avg8_neon: 56.5 54.7
vp9_avg16_neon: 169.9 167.4
vp9_avg32_neon: 585.8 585.2
vp9_avg64_neon: 2460.3 2294.7
vp9_avg_8tap_smooth_4h_neon: 132.7 125.2
vp9_avg_8tap_smooth_4hv_neon: 478.8 442.0
vp9_avg_8tap_smooth_4v_neon: 126.0 93.7
vp9_avg_8tap_smooth_8h_neon: 241.7 234.2
vp9_avg_8tap_smooth_8hv_neon: 690.9 646.5
vp9_avg_8tap_smooth_8v_neon: 245.0 205.5
vp9_avg_8tap_smooth_64h_neon: 11273.2 11280.1
vp9_avg_8tap_smooth_64hv_neon: 22980.6 22184.1
vp9_avg_8tap_smooth_64v_neon: 11549.7 10781.1
vp9_put4_neon: 18.0 17.2
vp9_put8_neon: 40.2 37.7
vp9_put16_neon: 97.4 99.5
vp9_put32_neon/armv8: 346.0 307.4
vp9_put64_neon/armv8: 1319.0 1107.5
vp9_put_8tap_smooth_4h_neon: 126.7 118.2
vp9_put_8tap_smooth_4hv_neon: 465.7 434.0
vp9_put_8tap_smooth_4v_neon: 113.0 86.5
vp9_put_8tap_smooth_8h_neon: 229.7 221.6
vp9_put_8tap_smooth_8hv_neon: 658.9 621.3
vp9_put_8tap_smooth_8v_neon: 215.0 187.5
vp9_put_8tap_smooth_64h_neon: 10636.7 10627.8
vp9_put_8tap_smooth_64hv_neon: 21076.8 21026.9
vp9_put_8tap_smooth_64v_neon: 9635.0 9632.4
These are generally about as fast as the corresponding ARM
routines on the same CPU (at least on the A53), in most cases
marginally faster.
The speedup vs C code is pretty much the same as for the 32 bit
case; on the A53 it's around 6-13x for ther larger 8tap filters.
The exact speedup varies a little, since the C versions generally
don't end up exactly as slow/fast as on 32 bit.
Signed-off-by: Martin Storsjö <martin@martin.st>
With apple tools, the linker fails with errors like these, if the
offset is negative:
ld: in section __TEXT,__text reloc 8: symbol index out of range for architecture arm64
Signed-off-by: Martin Storsjö <martin@martin.st>
FLAC streams originating from the FLAC encoder send updated and more
complete STREAMINFO metadata as part of the last packet, so write that
to CodecPrivate instead of the incomplete one available in extradata
during init.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
aac_adtstoasc makes the aac extradata available only after the first packet
is filtered, and as packet side data.
Assume extradata will be available as part of the first packet if
avpriv_mpeg4audio_get_config() fails the first time due to missing extradata
and reserve space for the OutputSampleRate element in the Tracks master.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Add keyframe index metadata
Used to facilitate seeking; particularly for HTTP pseudo streaming.
1. read live streaming or file by sequence
2. if use add_keyframe_index option, add a mark flag at the position,
use to insert new context at the last step.
3. add the keyframes *offset* and *timestamp* into a list
4. if use add_keyframe_index option, shift the metadata data from
mark flag offset
5. insert the keyframes *offset* and *timestamp* from the list by
sequence
6. free the list
7. end.
Add FATE test case;
Reviewed-by: Lou Logan <lou@lrcd.com>
Signed-off-by: Steven Liu <liuqi@gosun.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This allows a subsequent change to compress directly into the output packet when possible.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Tom Butterworth <bangnoise@gmail.com>
If the value is negative then it means padding at the start of the packet
instead of at the end.
Based on a patch by Hendrik Leppkes.
Reviewed-by: James Zern <jzern-at-google.com@ffmpeg.org>
Signed-off-by: James Almer <jamrial@gmail.com>
Compare using AVCodecParameters instead of the deprecated
AVStream.codec field
Signed-off-by: Reynaldo H. Verdejo Pinochet <reynaldo@osg.samsung.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The libopus encoder does the same thing and its better than
keeping track of when the empty flush frames appear.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The API does not allow returning AVERROR codes.
It triggers an assert in av_parser_parse2.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Commit 04964ac311 ("avformat/hls: Fix missing streams in some
cases with MPEG TS") caused a regression where subdemuxer streams that
use probing (e.g. dts/eac3/mp2 in mpegts) no longer get probed properly.
This is because the codec parameters from the subdemuxer stream, once
probed, are not passed on to the main stream.
Fix that by updating the codec parameters if the codec id changes.
Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi>
This will allow implementing the allocator more fully, which is needed
by the HEVC encoder plugin with video memory input.
Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>
For encoding, this avoids modifying the input surface, which we are not
allowed to do.
This will also be useful in the following commits.
Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>
Uploading/downloading data through VPP may not work for some formats, in
that case we can still try to call av_hwframe_transfer_data() on the
child context.
Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>
Certain pixel formats (e.g. P8) might not be supported for
download/upload through VPP operations, but can still be used otherwise.
Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>
When using GPU surfaces with QSV, one needs to supply a frame allocator,
which will be invoked to pass surface pools to libmfx.
For encoding, this allocator gets invoked not only for the pool of input
frames, but also for a separate pool of (apparently) reconstructed frames
and another pool of MFX_FOURCC_P8, which on Windows needs to return
D3DFMT_P8 D3D surfaces. Those are probably used to store the encoded
bitstream on the GPU.
Signed-off-by: Maxym Dmytrychenko <maxym.dmytrychenko@intel.com>
AVCodecParameters.sample_rate is a signed integer, so
XMVAudioPacket.sample_rate should be, too.
A negative sample rate doesn't make sense and triggers assertions in
av_rescale_rnd.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
A negative sample rate doesn't make sense and triggers assertions in
av_rescale_rnd.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
A negative sample rate doesn't make sense and triggers assertions in
av_rescale_rnd.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
A negative sample rate doesn't make sense and triggers assertions in
av_rescale_rnd.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
A negative sample rate doesn't make sense and triggers assertions in
av_rescale_rnd.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
A negative sample rate doesn't make sense and triggers assertions in
av_rescale_rnd.
Also check for errors from avpriv_mpeg4audio_get_config in
ff_mp4_read_dec_config_descr.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
A negative sample rate doesn't make sense and triggers assertions in
av_rescale_rnd.
fate-aac-al07_96 fails if sample_rate == 0 is rejected in
ff_mov_read_stsd_entries.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
A negative sample rate doesn't make sense and triggers assertions in
av_rescale_rnd.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
A negative sample rate doesn't make sense and triggers assertions in
av_rescale_rnd.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
This is required since some programs are not able to correctly recognize
the metadata. See H.222, 2.6.58 Metadata pointer descriptor.
putstr8() is modified in order to allow to skip writing the string
length.
This should be more useful for users since numerical values for channel
layout can be confusing and unintuitive.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
When ffplay is used to play from the RTSP URL that serves 24 bit audio
content, ffplay fails to recognize the audio codec format. The attached
patch adds support for playing 24 bit audio content over RTSP by
defining a dynamic payload handler for "L24".
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
libavcodec/x86/rv40dsp_init.c:97:2: warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]
libavcodec/x86/vp9dsp_init.c:94:40: warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]
Fixes valgrind warnings about usage of uninitialized values.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
The bitstream filters do not work with merged in side data
This leaves the input packet split if it is being split.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
This reverts commit fba2a8a254.
The changes were right for av_write_frame() but not for av_interleaved_write_frame().
The following commit will fix this in a simpler way.
Signed-off-by: James Almer <jamrial@gmail.com>
In recent lld-link versions, this command prints the version to
stdout, but also prints an error to stderr:
$ lld-link -flavor gnu --version
LLD 4.0.0 (trunk 285641)
lld-link: error: no input files
lld-link: error: target emulation unknown: -m or at least one .o file required
Signed-off-by: Martin Storsjö <martin@martin.st>
This fixes errors like this when building non-pic binaries with armv6
as baseline:
Error: invalid literal constant: pool needs to be closer
Signed-off-by: Martin Storsjö <martin@martin.st>
Otherwise it can be non-zero next time decode_lowdelay is called, causing
slice_params_buf not to be allocated, leading to a NULL pointer dereference.
The problem was introduced in commit
dcad4677d6.
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
"Vidvox Hap", not "Vidvox Hap encoder" or "Vidvox Hap decoder". Fixes
bad name in "ffmpeg -codecs", matches other codec naming.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2-channels convolution using complex fft
improves speed significantly
not sure if it should be enabled by default
so disable it by default
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
Takes a frame associated with a hardware context as input and maps it
to something else (another hardware frame or normal memory) for other
processing. If the frame to map was originally in the target format
(but mapped to something else), the original frame is output.
Also supports mapping backwards, where only the output has a hardware
context. The link immediately before will be supplied with mapped
hardware frames which it can write directly into, and this filter
then unmaps them back to the actual hardware frames.
Adds the new av_hwframe_map() function, which allows mapping between
hardware frames and normal memory, along with internal support for
implementing it.
Also adds av_hwframe_ctx_create_derived(), for creating a hardware
frames context associated with one device using frames mapped from
another by some hardware-specific means.
this is somewhat a magic number, which can be understood from reading section
"7.1.2 Exponent Strategy" of the ac3 specification, in short:
Three exponents each represented as number 0-4 are grouped together and
base-5 encoded, so the maximal correct value is 25*4 + 5*4 + 4 = 124.
Reviewed-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The framework will allocate a buffer and copy the data to it,
that takes time. But it avoids constently creating and
destroyng the shared memory segment, and that saves more time.
On my setup,
from ~200 to ~300 FPS at full screen (1920×1200),
from ~1400 to ~3300 at smaller size (640×480),
similar to legacy x11grab and confirmed by others.
Plus, shared memory segments are a scarce resource,
allocating potentially many is a bad idea.
Note: if the application were to drop all references to the
buffer before the next call to av_read_frame(), then passing
the shared memory segment as a refcounted buffer would be
even more efficient, but it is hard to guarantee, and it does
not happen with the ffmpeg command-line tool. Using a small
number of preallocated buffers and resorting to a copy when
the pool is exhausted would be a solution to get the better
of both worlds.
According to spec ISO_IEC_15444_12 "For any media stream for which no segment index is present, referred to as non‐indexed stream, the media stream associated with the first Segment Index box in the segment serves as a reference stream in a sense that it also describes the subsegments for any non‐indexed media stream."
Signed-off-by: Sasi Inguva <isasi@google.com>
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
libavutil/x86/float_dsp_init.c(144) : warning C4028: formal parameter 1 different from declaration
libavutil/x86/float_dsp_init.c(144) : warning C4028: formal parameter 2 different from declaration
libavcodec/dnxhdenc.c(326) : warning C4028: formal parameter 1 different from declaration
libavcodec/dnxhdenc.c(329) : warning C4028: formal parameter 1 different from declaration
libavcodec/pixblockdsp.c(58) : warning C4028: formal parameter 1 different from declaration
libavcodec/pixblockdsp.c(63) : warning C4028: formal parameter 1 different from declaration
libavcodec/pixblockdsp.c(66) : warning C4028: formal parameter 1 different from declaration
libavcodec/ituh263dec.c(215) : warning C4028: formal parameter 1 different from declaration
libavcodec/ituh263dec.c(215) : warning C4028: formal parameter 2 different from declaration
The include of config.h was added in 2012 in 1d9c2dc8, due to
the use of CONFIG_SNOW_ENCODER ifdefs within options_table.h.
When the snow codec was dropped later (in a0c5917f8 in 2013),
this include no longer served any purpose.
options_table.h is included in builds for the host as well, when
building documentation. config.h should not be included in code
that is built for the host, since it can contain workarounds
for the target compiler/environment, like adding a missing define
of restrict, defining getenv(x) to NULL for environments that lack
getenv.
The seemingly innocent include reordering in 2025d37871 broke
builds that have getenv(x) defined to NULL in config.h (Windows CE
and Windows Phone/RT), since libavcodec/options_table.h include
config.h, while libavformat/options_table.h end up bringing in
more system headers, and those system headers can contain a proper
definition of getenv, which clash with the getenv define in config.h.
This was avoided earlier as long as libavformat/options_table.h (or
avformat.h) was included before libavcodec/options_table.h.
This fixes builds for Windows Phone/RT and CE.
Signed-off-by: Martin Storsjö <martin@martin.st>
This work is sponsored by, and copyright, Google.
The filter coefficients are signed values, where the product of the
multiplication with one individual filter coefficient doesn't
overflow a 16 bit signed value (the largest filter coefficient is
127). But when the products are accumulated, the resulting sum can
overflow the 16 bit signed range. Instead of accumulating in 32 bit,
we accumulate the largest product (either index 3 or 4) last with a
saturated addition.
(The VP8 MC asm does something similar, but slightly simpler, by
accumulating each half of the filter separately. In the VP9 MC
filters, each half of the filter can also overflow though, so the
largest component has to be handled individually.)
Examples of relative speedup compared to the C version, from checkasm:
Cortex A7 A8 A9 A53
vp9_avg4_neon: 1.71 1.15 1.42 1.49
vp9_avg8_neon: 2.51 3.63 3.14 2.58
vp9_avg16_neon: 2.95 6.76 3.01 2.84
vp9_avg32_neon: 3.29 6.64 2.85 3.00
vp9_avg64_neon: 3.47 6.67 3.14 2.80
vp9_avg_8tap_smooth_4h_neon: 3.22 4.73 2.76 4.67
vp9_avg_8tap_smooth_4hv_neon: 3.67 4.76 3.28 4.71
vp9_avg_8tap_smooth_4v_neon: 5.52 7.60 4.60 6.31
vp9_avg_8tap_smooth_8h_neon: 6.22 9.04 5.12 9.32
vp9_avg_8tap_smooth_8hv_neon: 6.38 8.21 5.72 8.17
vp9_avg_8tap_smooth_8v_neon: 9.22 12.66 8.15 11.10
vp9_avg_8tap_smooth_64h_neon: 7.02 10.23 5.54 11.58
vp9_avg_8tap_smooth_64hv_neon: 6.76 9.46 5.93 9.40
vp9_avg_8tap_smooth_64v_neon: 10.76 14.13 9.46 13.37
vp9_put4_neon: 1.11 1.47 1.00 1.21
vp9_put8_neon: 1.23 2.17 1.94 1.48
vp9_put16_neon: 1.63 4.02 1.73 1.97
vp9_put32_neon: 1.56 4.92 2.00 1.96
vp9_put64_neon: 2.10 5.28 2.03 2.35
vp9_put_8tap_smooth_4h_neon: 3.11 4.35 2.63 4.35
vp9_put_8tap_smooth_4hv_neon: 3.67 4.69 3.25 4.71
vp9_put_8tap_smooth_4v_neon: 5.45 7.27 4.49 6.52
vp9_put_8tap_smooth_8h_neon: 5.97 8.18 4.81 8.56
vp9_put_8tap_smooth_8hv_neon: 6.39 7.90 5.64 8.15
vp9_put_8tap_smooth_8v_neon: 9.03 11.84 8.07 11.51
vp9_put_8tap_smooth_64h_neon: 6.78 9.48 4.88 10.89
vp9_put_8tap_smooth_64hv_neon: 6.99 8.87 5.94 9.56
vp9_put_8tap_smooth_64v_neon: 10.69 13.30 9.43 14.34
For the larger 8tap filters, the speedup vs C code is around 5-14x.
This is significantly faster than libvpx's implementation of the same
functions, at least when comparing the put_8tap_smooth_64 functions
(compared to vpx_convolve8_horiz_neon and vpx_convolve8_vert_neon from
libvpx).
Absolute runtimes from checkasm:
Cortex A7 A8 A9 A53
vp9_put_8tap_smooth_64h_neon: 20150.3 14489.4 19733.6 10863.7
libvpx vpx_convolve8_horiz_neon: 52623.3 19736.4 21907.7 25027.7
vp9_put_8tap_smooth_64v_neon: 14455.0 12303.9 13746.4 9628.9
libvpx vpx_convolve8_vert_neon: 42090.0 17706.2 17659.9 16941.2
Thus, on the A9, the horizontal filter is only marginally faster than
libvpx, while our version is significantly faster on the other cores,
and the vertical filter is significantly faster on all cores. The
difference is especially large on the A7.
The libvpx implementation does the accumulation in 32 bit, which
probably explains most of the differences.
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes it match the pattern already used for VP8 MC functions.
This also makes the signature match ffmpeg's version of these
functions, easing porting of code in both directions.
Signed-off-by: Martin Storsjö <martin@martin.st>
This was broken by the following Libav commit:
4c387c7 ppc: dsputil: do unaligned block accesses correctly
The following tests fail due to this:
fate-checkasm
fate-vsynth1-dnxhd-2k-hr-hq fate-vsynth1-dnxhd-edge1-hr
fate-vsynth1-dnxhd-edge2-hr fate-vsynth1-dnxhd-edge3-hr
fate-vsynth1-dnxhd-hr-sq-mov fate-vsynth1-dnxhd-hr-hq-mov
fate-vsynth2-dnxhd-2k-hr-hq fate-vsynth2-dnxhd-edge1-hr
fate-vsynth2-dnxhd-edge2-hr fate-vsynth2-dnxhd-edge3-hr
fate-vsynth2-dnxhd-hr-sq-mov fate-vsynth2-dnxhd-hr-hq-mov
fate-vsynth3-dnxhd-2k-hr-hq fate-vsynth3-dnxhd-edge1-hr
fate-vsynth3-dnxhd-edge2-hr fate-vsynth3-dnxhd-edge3-hr
fate-vsynth3-dnxhd-hr-sq-mov fate-vsynth3-dnxhd-hr-hq-mov
Fixes trac ticket #5508.
Reviewed-by: Carl Eugen Hoyos <ceffmpeg@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
The parser depends on the codec and thus must not be used with a different one.
If it is, the 'avctx->codec_id == s->parser->codec_ids[0] ...' assert in
av_parser_parse2 gets triggered.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
The old code had to retain a partial frame across two calls in
the case of separate interlaced fields. Now, we know that we'll
get both fields within the same receive_frame call, and so we
don't need to manage the frame as private state any more.
It's not possible to return EAGAIN when we've passed input EOF and are
in draining mode. If do return EAGAIN, we're saying there's no way to
get any more output - which isn't true in many cases.
So let's handled these cases in an internal loop as best we can.
It seems that without all the other 1:1 heuristics, we don't have
a fundamental problem trusting the interlaced flag on output
pictures. That's a relief.
I'm not sure why, but the mpeg4_unpack_bframes bsf is not
interacting well with seeking. Looking at the code, it should be
ok, with possibly one warning shown, but I see it getting stuck
for an extended period of time after a seek where a packed frame
is cached to be shown later.
So, I gave up on that and went back to making the old hardware
based path work. Turns out that it wasn't broken except that some
samples have a 6 byte drop packet which I wasn't accounting for.
Now it works again and seeks are good.
The new decode API allows for m:n decode patterns, which is what
you need to use this hardware in a sane way. There are so many
situations where 1:1 doesn't happen naturally that it's a miracle
I got it working as well as I did.
With this change, we can throw all of the crazy heuristics and
sleeps(!) out, and things work correctly.
Why on earth the hardware returns garbage for the first sample of
a decoded picture is anyone's guess. The simplest reasonable way
to patch it up is to copy the first sample of the second line. This
should result in the correct chroma values (because the data was
original 4:2:0 upsampled to 4:2:2) even if the luma is isn't.
Also adds a new flag to mark filters which are aware of hwframes and
will perform this task themselves, and marks all appropriate filters
with this flag.
This is required to allow software-mapped hardware frames to work,
because we need to have the frames context available for any later
mapping operation in the filter graph.
The output from the filter graph should only propagate further to an
encoder if the hardware format actually matches the visible format
(mapped frames are valid here and have an hw_frames_ctx, but this
should not be given to the encoder as its hardware context).
This avoids potential rounding errors and guarantees the source aspect
ratio is preserved.
Keep writing pixel values when Stereo 3D Mode is enabled and for WebM,
as the format doesn't support anything else.
This fixes ticket #5743, implementing the suggestion from ticket #5903.
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes the following warning:
libavcodec/hapenc.c:122:20: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘size_t’ [-Wformat]
Based on a patch by Diego Biurrun.
If there are no index entries, e_old = st->index_entries is only one
byte large, since it was created by av_realloc called with size 0.
Thus accessing e_old[0].timestamp causes a heap buffer overflow.
Reviewed-by: Sasi Inguva <isasi@google.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
This matrix needs to be applied after all others have (currently only
display matrix from trak), but cannot be handled in movie box, since
streams are not allocated yet. So store it in main context, and apply
it when appropriate, that is after parsing the tkhd one.
Fate tests are updated accordingly.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
This is needed for improved fate testing and it is modeled after
-show_format_entry. The main behavioral difference is that when a print
function is called with an empty key, rather than discarding it, the
closes key in the hierarchy is used instead.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
The use of TLSv1_*_method() disallows newer protocol versions; instead
use SSLv23_*_method() and then explicitly disable the deprecated
protocol versions which should not be supported.
Merged as-at libav 398f015, and therefore includes outstanding
skipped merges 04b17ff and 130e1f1.
All features not in libav are preserved, and no options change.
pkg-config(1) expects uninstalled pc files to follow the
blah-uninstalled.pc naming convention and the behavior
of the program is impacted by it. Without this fix
overriding PKGP_CONFIG_LIBDIR is required to ensure
uninstalled files are preferred (overkill), instead of
just adding pc-uninstalled/ to the utility's search path
by setting PKG_CONFIG_PATH accordingly.
Signed-off-by: Reynaldo H. Verdejo Pinochet <reynaldo@osg.samsung.com>
The number of channels is used as divisor in decode_frame, so it must
not be zero to avoid SIGFPE crashes.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
The use of TLSv1_*_method() disallows newer protocol versions; instead
use SSLv23_*_method() and then explicitly disable the deprecated
protocol versions which should not be supported.
Fixes ticket #5915.
libavcodec/hapenc.c:121:20: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘size_t {aka unsigned int}’ [-Wformat=]
libavcodec/hapenc.c:121:20: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘size_t {aka unsigned int}’ [-Wformat=]
Since avversion.h is a generated header it must be created before
dependencies can be determined as a side effect of compilation.
Otherwise Make stops and restarts the build process to generate
avversion.h and produces related error messages.
The dynamic buffer does not contain the CRC32 element so calls to avio_tell()
don't take it into account. This resulted in CueRelativePosition values being
six bytes short.
This is a regression since 6724525a15
Instead of adding yet another custom check for CRC32 to fix a size or an offset,
remove the existing ones and reserve the six bytes in the dynamic buffer.
Signed-off-by: James Almer <jamrial@gmail.com>
When the macro is expanded with a semicolon following it and the
macro itself contains a semicolon, we ended up in double semicolons,
which is treated as a statement that disallows further declarations.
This avoids errors about mixed declarations and statements on gcc,
after ee05079766.
Signed-off-by: Martin Storsjö <martin@martin.st>
The buffer map/unmap code was in an early version of this before it
was committed, but the unmap was never removed. While wrong, this
was harmless (and therefore unnoticed) because the buffers can't be
mapped at this point - all drivers just did nothing with the call.
Use new H264Ref.reference field to track field picture flags. The
H264Picture.reference flag in DPB is now irrelevant here.
This is a regression from git commit a12d3188, and that affected
multiple interlaced video streams.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
When decoding interlaced pictures, the structure is reused to render
to the same surface twice. The parameter buffers were not being
cleared, which caused the i965 driver to error out.
Instead use our own struct, which we already use when using
gcrypt and gnutls.
In OpenSSL 1.1, the DH struct has been made opaque.
Signed-off-by: Martin Storsjö <martin@martin.st>
sigaction is not defined in standards as a struct starting with another
struct. Some *BSD variants do however, resulting in a warning from the
zero initialization, which this change eliminates.
This partially reverts a92be9b856.
For 'nclx', the latest edition of the standard switched from JPEG XR
to 23001-8, which matches the current order of our entries. Bounds
are preserved as a sanity check.
For 'nclc', qtff edition 2016-09-13 introduced a few new entries.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
x29 (FP) is a callee saved register and should be restored on
return. Instead of backing up x29 and restoring it here, back up
sp in a register that we are allowed to overwrite.
This fixes crashes in checkasm on aarch64 since f1b3e13138.
For some reason, gcc builds didn't crash, but clang builds do.
Signed-off-by: Martin Storsjö <martin@martin.st>
This, combined with clobbering the stack space prior to the call,
increases the chances of finding cases where 32 bit parameters
are erroneously treated as 64 bit.
Signed-off-by: Martin Storsjö <martin@martin.st>
Even if MAX_ARGS - 2 (for arm) or MAX_ARGS - 7 (for aarch64) parameters
are passed on the stack to checkasm_checked_call, we actually only
need to store MAX_ARGS - 4 (for arm) or MAX_ARGS - 8 (for aarch64)
parameters on the stack when calling the tested function.
Signed-off-by: Martin Storsjö <martin@martin.st>
This also fixes a minor bug introduced in the codecpar conversion, where
the termination condition for extracting the extradata does not match
the actual extradata setting code. As a result, the packet durations
made up by lavf go back to their values before the codecpar conversion.
That is of little consequence since that code should eventually be
dropped completely.
This way they can be reused by other code without including the whole
decoder-specific hevcdec.h
Also, add the HEVC_ prefix to them, since similarly named values exist
for H.264 as well and are sometimes used in the same code.
The spec says
9: Interlaced with bottom field displayed first and top field stored first
14: Interlaced with top field displayed first and bottom field stored first
And avcodec.h states
AV_FIELD_TB, //< Top coded first, bottom displayed first
AV_FIELD_BT, //< Bottom coded first, top displayed first
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
According to the public RTMP specification, these 4 bytes should
be zero.
librtmp in server mode assumes that the RTMPE (FP9) handshake is
used if these bytes are nonzero.
Signed-off-by: Martin Storsjö <martin@martin.st>
When acting as server, the server can include a "clientid" property
in some status messages. But this should be a unique number
identifying the client session, not identifying the server itself.
In practice, omitting it works just as well as including this
incorrect field.
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes sure that e.g. Adobe FME actually reacts to it. As long
as the value we've been sending is the default one (128), the bug
hasn't been noticed.
Signed-off-by: Martin Storsjö <martin@martin.st>
Some applications such as Adobe FME append lots of parameters
here, making it easily overflow the current limit.
Signed-off-by: Martin Storsjö <martin@martin.st>
The decoding buffer index expected by D3D11VA is the one from the
ID3D11Texture2D not the one from the ID3D11VideoDecoderOutputView array
in AVD3D11VAContext.
Otherwise, when providing decoder slices that do not start from 0,
pictures appear in bogus order. For an invalid index crashes and
image corruption can occur.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
filter16 goes from 508 to 482 (h) or 346 to 314 (v) cycles; filter88
goes from 240 to 238 (h) or 174 to 165 (v) cycles, measured on TOS.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Similar gains as the ssse3 version once again
Additional improvements by Clément Bœsch <u@pkh.me>.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The randomize_buffer() implementation assures that "most of the time",
we'll do a good mix of wide16/wide8/hev/regular/no filters for complete
code coverage. However, this is not mathematically assured because that
would make the code either much more complex, or much less random.
Some fixes and improvements by Rodger Combs <rodger.combs@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This was not observed earlier because the only syntax element which
it normally misses with the current setup is slice_qp_delta, but that
is always going to be zero (in IDR frames QP isn't varied on the
slice) which will always exp-golomb code as a single 1 bit. The
immediately following part is the byte alignment, which is always a 1
bit followed by 0s which are ignored, so as long as the bitstream is
never aligned at that point we will never notice because the only
difference is that an ignored bit is a 1 instead of a 0.
Errors during decoding are currently considered non-fatal and do not
terminate transcoding, so even if parts of the data are corrupted, the
rest may be decodable.
However, that should apply only to the actual decoding calls, not to the
failures elsewhere (e.g. configuring filters).
The filtergraph's existence is used in several places to mean that the
filtergraph is fully configured. This causes problems if it's allocated,
but the initialization fails (e.g. if a non-existent filter is
specified).
Adds a wrapper function for downmixing which detects channel count changes
and updates the selected downmix function accordingly.
Simplification and porting to current x86inc infrastructure by Diego Biurrun.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Also use (float **) instead of (float (*)[2]). This matches the matrix
layout in libavresample so we can reuse assembly code between the two.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
It is supposed to be a flag. The only currently defined value is
AVIO_SEEKABLE_NORMAL, but other ones may be added in the future.
However all the current lavf code treats this field as a bool (mainly
for historical reasons).
Change all those cases to properly check for AVIO_SEEKABLE_NORMAL.
While outwardly bizarre, this change makes the behaviour consistent
with other VAAPI encoders which sync to the encode /input/ picture in
order to wait for /output/ from the encoder. It is not harmful on
i965 (because synchronisation already happens in vaRenderPicture(),
so it has no effect there), and it allows the encoder to work on
mesa/gallium which assumes this behaviour.
This allows better checking of capabilities and will make it easier
to add more functionality later.
It also commonises some duplicated code around rate control setup
and adds more comments explaining the internals.
Follow a 420, 422, 444 order instead of a random one.
This simplifies double-checking additions of new formats.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
This version, which is the only one doing two processing cycles per loop
iteration, computes the load/store indices incorrectly for the second
cycle.
CC: libav-stable@libav.org
This was introduced in bc2a32969e.
The whole block that the statement was added to is only
relevant when used as a demuxer, but the other statements
there have had other if statements guarding them. Make
sure to only run this whole block if being used as a
demuxer.
CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
If the input has been decoded from a stream which uses edge cropping
then the whole surface need not be valid. This defines an input
region for the scaler so we only use the active area of the frame.
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
Also adjust parameter names to be "stride" everywhere.
The latter can do everything the former can do, but also handle conditions
the former cannot like multiple header #includes and checking for headers
and functions in a single test program, which is necessary for certain
library tests.
With some old libva versions <va/va.h> does not automatically include
the per-codec subsidiary headers, so we need to include the right one
explicitly ourselves.
After init_opts() there needs to be an uninit_opts() call
to free the swscale context and other buffers.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
This was added before edts support existed, and is no longer
valid.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
This breaks files with legitimate single-entry edit lists,
and the hack, introduced in f03a081df0,
has no link to any known sample in its commit message.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
libavcodec/libvpxdec.c:100:57: warning: passing argument 3 of 'av_image_copy' from incompatible pointer type
av_image_copy(picture->data, picture->linesize, img->planes,
libavutil/imgutils.h:116:6: note: expected 'const uint8_t **' but argument is of type 'unsigned char **'
void av_image_copy(uint8_t *dst_data[4], int dst_linesizes[4],
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
Also adjust parameter names to be "linesize" everywhere.
This avoids SIMD-optimized functions having to sign-extend their
stride argument manually to be able to do pointer arithmetic.
Also adjust parameter names to be "stride" everywhere.
The code currently reads the coded dimensions from the extradata, but
expects the display dimensions to be set by the caller, and does not
check that they are compatible (i.e. that the displayed size is smaller
than the coded size).
Make sure that when the display dimensions are set, they are also valid.
Fixes possible invalid memory access.
CC: libav-stable@libav.org
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
It is passed to the get_bits API, which requires buffers to be padded.
Fixes possible invalid reads.
CC: libav-stable@libav.org
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
height - me_y is the line from which we read, so it must be strictly
smaller than the frame height. Fixes possible invalid reads in corrupted
files.
Also, use a proper context for logging the error.
CC: libav-stable@libav.org
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
If we encounter an END element before anything is decoded, we would
return success even though the output frame has not been allocated,
which is invalid.
CC: libav-stable@libav.org
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
If no string argument is supplied when av_hwdevice_ctx_create() is
called to create a VAAPI device, we currently only try the default
X11 display (that is, $DISPLAY) to find a device, and will therefore
fail in the absence of an X server to connect to. Change the logic
to also look for a device via the first DRM render node (that is,
"/dev/dri/renderD128"), which is probably the right thing to use in
most simple configurations which only have one DRM device.
We need more information from last/cur_frame than from reference
buffers, so we can use a simplified structure for reference buffers,
and then store mvs and segmentation map information in last/cur.
This prepares the decoder for frame threading support.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Not from the underlying frame. Fixes races with frame threading in
field-coded files, where decoding would wait for the wrong field (e.g.
random failures in mixed-nal-coding).
Bug-Id: 954
A non-existent av_buffer_pool_can_uninit() function is mentioned instead
of av_buffer_pool_uninit(). Also, this function is to be called by the
caller, not the pool itself.
The frame dimensions are 16bit, so the mv bounds can easily overflow
int16 for large videos.
Bug-Id: Handbrake/46
CC: libav-stable@libav.org
Signed-off-by: Anton Khirnov <anton@khirnov.net>
In such a case behave as if the buffer was not reallocatable -- allocate a
new one and copy the data (preserving just the part described by the
reference passed to av_buffer_realloc).
CC: libav-stable@libav.org
Reported-By: wm4 <nfxjfg@googlemail.com>
pavgb is an sse integer instruction, so the mmxext flag is enough
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This reverts commit 014773b66b.
Since 230b1c070, the bytewise AV_W*() macros only expand their
argument once, i.e. doing exactly the same change as was done
in the AV_COPY*U macros, so this change is no longer necessary.
Signed-off-by: Martin Storsjö <martin@martin.st>
This reverts commit 25bacd0a0c.
Since 230b1c070, the bytewise AV_W*() macros only expand their
argument once, so revert to the more readable version of these.
Signed-off-by: Martin Storsjö <martin@martin.st>
AV_WN64 is meant for unaligned data, but the existing av_alias* unions
(without a definition for the av_alias attribute - we don't have one
for MSVC) indicate to the compiler that they would have sufficient
alignment for normal access, i.e. the compiler is free to assume
8 byte alignment.
On ARM, this makes sure that AV_WN64 (or two consecutive AV_WN32) is
done with two str instructions instead of one strd.
Signed-off-by: Martin Storsjö <martin@martin.st>
This avoids issues with expanding the argument multiple times,
and makes sure that it is of the right type for the following shifts.
Even if the caller of a macro could be expected not to pass parameters
that have side effects if expanded multiple times, these fallback
codepaths are rarely, if ever, tested, so it is expected that such
issues can arise.
Thefore, for safety, make sure the fallback codepaths only expand
the arguments once.
Signed-off-by: Martin Storsjö <martin@martin.st>
If AV_RN and AV_WN are macros with multiple individual reads and
writes, the previous version of the AV_COPYU macro would fail if
the reads and writes overlap.
This should not be any less efficient in any case, given a
sensibly optimizing compiler.
Signed-off-by: Martin Storsjö <martin@martin.st>
AV_WB32 can be implemented as a macro that expands its parameters
multiple times (in case AV_HAVE_FAST_UNALIGNED isn't set and the
compiler doesn't support GCC attributes); make sure not to read
multiple times from the source in this case.
Signed-off-by: Martin Storsjö <martin@martin.st>
The reference frames are used in update_thread_context(), so modifying
them after finish_setup() is a race. The frame in question will be
released during the next decode call.
CC: libav-stable@libav.org
This allows doing this redirection, if building with clang against
old enough MSVC headers that lack strtoll (2012 and older).
Signed-off-by: Martin Storsjö <martin@martin.st>
This reverts commit 0e0538aefc.
The valgrind warning was a false positive due to OSX implementation of
printf (invoking a strnlen), while this code is actually fine, since the
format specifier %.*s guarantes that no more than buf_size bytes from
buf will be read.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Clang normally disguises as GCC (defining __GNUC__), and thus get
all the normal GCC specific attributes.
Clang can also work as a drop-in replacement for MSVC, and in these
cases, it doesn't define __GNUC__, but defines _MSC_VER instead.
Even in these setups, it still supports the GCC style attributes,
thus use them, especially where there isn't any MSVC specific
version, or where the MSVC specific version doesn't work on clang
(for DECLARE_ASM_CONST).
Signed-off-by: Martin Storsjö <martin@martin.st>
When targeting COFF (windows), clang doesn't support this
directive (while binutils supports it for all targets).
Signed-off-by: Martin Storsjö <martin@martin.st>
There are samples with invalid stsc that may work fine as is and
do not need extradata change. So ignore any out of range index, and
error out only when explode is set.
Found-by: Matthieu Bouron <matthieu.bouron@stupeflix.com>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
The current code will ignore the init_get_bits() failure and do an
invalid read from the uninitialized GetBitContext.
Found-By: Jan Ruge <jan.s.ruge@gmail.com>
Bug-Id: 952
This fixes retrieving a valid profile for many of the FATE conformance samples,
allowing them to be properly decoded by the HWAccel after adding a profile check.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
These bits are set by exceptions in NEON instructions.
Also print the differing bits when FPSCR is clobbered,
and use bic instead of lsl, for clearing the topmost bits.
Signed-off-by: Martin Storsjö <martin@martin.st>
Each const block needs to be terminated by one endconst
invocation so either call endconst after each, or just
declare plain labels to the later strings.
This fixes errors such as this, on some binutils versions:
checkasm.S:38: Error: Macro `endconst' was already defined
Signed-off-by: Martin Storsjö <martin@martin.st>
Since we only know whether a NAL unit corresponds to a new field after
parsing the slice header, this requires reorganizing the calls to slice
parsing, per-slice/field/frame init and actual decoding.
In the previous code, the function for slice header decoding also
immediately started a new field/frame as necessary, so any slices
already queued for decoding would no longer be decodable.
After this patch, we first parse the slice header, and if we determine
that a new field needs to be started we decode all the queued slices.
This function's purpose is not very well defined. Currently it does two
(only marginally related) things: selecting the next output frame and
calling ff_thread_finish_setup() for frame threading. The first of those
more properly belongs under field_start(), while the second can be
called directly from decode_nal_units().
The stack used by checkasm_checked_call_vfp was a multiple of 4 when the
checked function is called. AAPCS requires a double word (8 byte)
aligned stack public interfaces. Since both calls are public interfaces
the stack is misaligned when the checked is called.
Might fix the SIGBUS error in the armv7-linux-clang-3.7 fate config.
Fixes a regression in ca2f19b9cc with some mov/mp4 files. The files have
several NAL units in the supposed single NAL unit after the size field.
Annex B start code prefixes are used to separate them. The first NAL unit
is correctly parsed but the buffer does not point to the next size field.
Instead semi random data (it seems to be the rbsp_stop_one_bit and the
start code prefix) is then parsed as length and will exceed the
remaining length of the buffer.
Patch based on the code in h264's decode_nal_units() and a similar
patch by Hendrik Leppkes in FFmpeg (a9bb4cf87d).
Bug-Id: ffmpeg/trac5529
Reported-By: Vittorio Giovara
Currently, SPS.mb_height is actually what the spec calls
PicHeightInMapUnits, which is half the frame height when interlacing is
allowed. Calling this 'mb_height' is quite confusing, and there are at
least two associated bugs where this field is treated as the actual
frame height - in the h264 parser and in the code computing maximum
reordering buffer size for a given level.
Fix those issues (and avoid possible future ones) by exporting the real
frame height in this field.
This comment isn't true, the height can be different from the width
for these functions (which is why the height is passed as a parameter
to them).
Signed-off-by: Martin Storsjö <martin@martin.st>
GNU as evaluates true as '-1' while Apple's variant and llvm's internal
assembler evaluate it as '1'. The best way to avoid this madness is to
eliminate boolean expressions instead of trying to fix it with
preprocessor directives. Use a direct formula to calculate the
required temporary space on the stack in
ff_put_vp8_{epel,bilin}{4,8,16}_h[246]v[246]_armv6().
Fixes a checkasm segfault in vp8dsp.mc when using llvm's internal
assembler for a non-Apple target.
When writing a fragmented file, we by default write an index pointing
to all the fragments at the end of the file. This causes constantly
increasing memory usage during the muxing. For live streams, the
index might not be useful at all.
A similar fragment index is written (but at the start of the file) if
the global_sidx flag is set. If ism_lookahead is set, we need to keep
data about the last ism_lookahead+1 fragments.
If no fragment index is to be written, we don't need to store information
about all fragments, avoiding increasing the memory consumption
linearly with the muxing runtime.
This fixes out of memory situations with long live mp4 streams.
Signed-off-by: Martin Storsjö <martin@martin.st>
The driver being used is detected inside av_hwdevice_ctx_init() and
the quirks field then set from a table of known device. If this
behaviour is unwanted, the user can also set the quirks field
manually.
Also adds the Intel i965 driver quirk (it does not destroy parameter
buffers used in a call to vaRenderPicture()) and detects that driver
to set it.
P010 is the 10-bit variant of NV12 (planar luma, packed chroma), using two
bytes per component to store 10-bit data plus 6-bit zeroes in the LSBs.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This preserves all the information in the codec parameters.
The wavpack ref changes are caused by the fact that now the sample
format is set, so matroskaenc can use it to set the bit depth.
Bug-Id: 945, along with the previous commit
This avoids listing the same feature multiple times in the
test output. Previously the output contained something like this:
SSE2:
- hevc_mc.qpel [OK]
- hevc_mc.epel [OK]
- hevc_mc.unweighted_pred [OK]
- hevc_mc.qpel [OK]
- hevc_mc.epel [OK]
- hevc_mc.unweighted_pred [OK]
Signed-off-by: Martin Storsjö <martin@martin.st>
This avoids the risk of accidentally clobbering such variables outside
of the macro if the same variables are used there.
Signed-off-by: Martin Storsjö <martin@martin.st>
This fixes valgrind warnings about conditional jumps based on
uninitialized data (even though the uninitialized data only ever
was compared with a direct copy of the same uninitialized data).
Signed-off-by: Martin Storsjö <martin@martin.st>
While it is less featureful (and slower) than the built-in H264
decoder, one could potentially want to use it to take advantage
of the cisco patent license offer.
Signed-off-by: Martin Storsjö <martin@martin.st>
The hw frame used as reference has an attached size but it need not
match the actual size of the surface, so enforcing that the sw frame
used in copying matches its size exactly is not useful.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The source frame may be cropped, so that its dimensions are smaller than
the pool dimensions. The transfer_data API requires the allocated size
of the destination frame to be the same as the pool size.
Be more careful when an input stream encounters EOF when its filtergraph
has not been configured yet. The current code would immediately mark the
corresponding output streams as finished, while there may still be
buffered frames waiting for frames to appear on other filtergraph
inputs.
This should fix the random FATE failures for complex filtergraph tests
after a3a0230a98
Previously we would allocate a new one for every frame. This instead
maintains an AVBufferPool of them to use as-needed.
Also makes the maximum size of an output buffer adapt to the frame
size - the fixed upper bound was a bit too easy to hit when encoding
large pictures at high quality.
This makes sure the actual stream parameters are used, which is
important mainly for hardware decoding+filtering cases, which would
previously require various weird workarounds to handle the fact that a
fake software graph has to be constructed, but never used.
This should also improve behaviour in rare cases where
avformat_find_stream_info() does not provide accurate information.
Currently, a filtergraph will pull in the output constraints from its
corresponding decoder context, which breaks proper layering. Instead,
explicitly send the constaints on the output parameters to the
filtergraph.
This is similar to what is done for filtergraph inputs in
30ab4c51a180610d9f1720c75518d763515c0d9f
Setting the filter input parameters is moved to init_input_stream(),
so that it is done before the decoder is opened, potentially overwriting
the information from avformat_find_stream_info() with less accurate
data.
This commit temporarily disables QSV transcoding with hw frames. The
functionality will be re-added in the following commits.
Currently, calling configure_filtergraph() will pull in the input
parameters from the corresponding decoder context. This has the
following disadvantages:
- the decoded frame is a more proper source for this information
- a filter accessing decoder data breaks proper layering
Add functions for explicitly sending the input stream parameters to a
filtergraph input - currently from a frame and a decoder. The decoder
one will be dropped in future commits after some more restructuring.
2016-06-25 11:20:50 +02:00
1692 changed files with 93674 additions and 32557 deletions
- avcodec/vp8dsp: vp7_luma_dc_wht_c: Fix multiple runtime error: signed integer overflow: -1366381240 + -1262413604 cannot be represented in type 'int'
- avcodec/avcodec: Limit the number of side data elements per packet
- avcodec/texturedsp: Fix runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
- avcodec/g723_1dec: Fix runtime error: left shift of negative value -1
- avcodec/wmv2dsp: Fix runtime error: signed integer overflow: 181 * -17047030 cannot be represented in type 'int'
- avcodec/diracdec: Fix Assertion frame->buf[0] failed at libavcodec/decode.c:610
- avcodec/msmpeg4dec: Check for cbpy VLC errors
- avcodec/cllc: Check num_bits
- avcodec/cllc: Factor VLC_BITS/DEPTH out, do not use repeated literal numbers
- avcodec/scpr: Check y in first line loop in decompress_i()
- avcodec/dvbsubdec: Check entry_id
- avcodec/aacdec_fixed: Fix multiple shift exponent 33 is too large for 32-bit type 'int'
- avcodec/mpeg12dec: Fixes runtime error: division by zero
- avcodec/pixlet: Fix runtime error: signed integer overflow: 436207616 * -5160230545260541 cannot be represented in type 'long'
- avcodec/webp: Always set pix_fmt
- avfilter/vf_uspp: Fix currently unused input frame dimensions
- avcodec/truemotion1: Fix multiple runtime error: left shift of negative value -1
- avcodec/eatqi: Fix runtime error: signed integer overflow: 4466147 * 1075 cannot be represented in type 'int'
- avcodec/dss_sp: Fix runtime error: signed integer overflow: 2147481189 + 4096 cannot be represented in type 'int'
- avformat/wavdec: Check chunk_size
- avcodec/cavs: Check updated MV
- avcodec/y41pdec: Fix width in input buffer size check
- avcodec/svq3: Fix multiple runtime error: signed integer overflow: -237341 * 24552 cannot be represented in type 'int'
- avcodec/texturedsp: Fix runtime error: left shift of 218 by 24 places cannot be represented in type 'int'
- avcodec/lagarith: Check scale_factor
- avcodec/lagarith: Fix runtime error: left shift of negative value -1
- avcodec/takdec: Fix multiple runtime error: left shift of negative value -1
- avcodec/indeo2: Check for invalid VLCs
- avcodec/g723_1dec: Fix several integer related cases of undefined behaviour
- avcodec/htmlsubtitles: Check for string truncation and return error
- avcodec/bmvvideo: Fix runtime error: left shift of 137 by 24 places cannot be represented in type 'int'
- avcodec/dss_sp: Fix multiple runtime error: signed integer overflow: -15699 * -164039 cannot be represented in type 'int'
- avcodec/dvbsubdec: check region dimensions
- avcodec/vp8dsp: Fixes: runtime error: signed integer overflow: 1330143360 - -1023040530 cannot be represented in type 'int'
- avcodec/hqxdsp: Fix multiple runtime error: signed integer overflow: 248220 * 21407 cannot be represented in type 'int' in idct_col()
- avcodec/cavsdec: Check sym_factor
- avcodec/cdxl: Check format for BGR24
- avcodec/ffv1dec: Fix copying planes of paletted formats
- avcodec/wmv2dsp: Fix runtime error: signed integer overflow: 181 * -12156865 cannot be represented in type 'int'
- avcodec/xwddec: Check bpp more completely
- avcodec/aacdec_template: Do not decode 2nd PCE if it will lead to failure
- avcodec/s302m: Fix left shift of 8 by 28 places cannot be represented in type 'int'
- avcodec/eamad: Fix runtime error: signed integer overflow: 49674 * 49858 cannot be represented in type 'int'
- avcodec/g726: Fix runtime error: left shift of negative value -2
- avcodec/magicyuv: Check len to be supported
- avcodec/ra144: Fix runtime error: left shift of negative value -798
- avcodec/mss34dsp: Fix multiple signed integer overflow
- avcodec/targa_y216dec: Fix width type
- avcodec/texturedsp: Fix multiple runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
- avcodec/ivi_dsp: Fix multiple left shift of negative value -2
- avcodec/svq3: Fix multiple runtime error: signed integer overflow: 44161 * 61694 cannot be represented in type 'int'
- avcodec/msmpeg4dec: Correct table depth
- avcodec/dds: Fix runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
- avcodec/cdxl: Check format parameter
- avutil/softfloat: Fix overflow in av_div_sf()
- avcodec/hq_hqa: Fix runtime error: left shift of negative value -207
- avcodec/mss3: Change types in rac_get_model_sym() to match the types they are initialized from
- avcodec/shorten: Check k in get_uint()
- avcodec/webp: Fix null pointer dereference
- avcodec/dfa: Fix signed integer overflow: -2147483648 - 1 cannot be represented in type 'int'
- avcodec/g723_1: Fix multiple runtime error: left shift of negative value
- avcodec/mimic: Fix runtime error: left shift of negative value -1
- avcodec/clearvideo: Fix multiple runtime error: left shift of negative value -1024
- avcodec/fic: Fix multiple left shift of negative value -15
- avcodec/mlpdec: Fix runtime error: left shift of negative value -22
- avcodec/snowdec: Check qbias
- avutil/softfloat: Fix multiple runtime error: left shift of negative value -8
- avcodec/aacsbr_template: Do not leave bs_num_env invalid
- avcodec/mdec: Fix signed integer overflow: 28835400 * 83 cannot be represented in type 'int'
- avcodec/dfa: Fix off by 1 error
- avcodec/nellymoser: Fix multiple left shift of negative value -8591
- avcodec/cdxl: Fix signed integer overflow: 14243456 * 164 cannot be represented in type 'int'
- avcodec/g722: Fix multiple runtime error: left shift of negative value -1
- avcodec/dss_sp: Fix multiple left shift of negative value -466
- avcodec/wnv1: Fix runtime error: left shift of negative value -1
- avcodec/tiertexseqv: set the fixed dimenasions, do not depend on the demuxer doing so
- avcodec/mjpegdec: Fix runtime error: signed integer overflow: -24543 * 2031616 cannot be represented in type 'int'
- avcodec/cavsdec: Fix undefined behavior from integer overflow
- avcodec/dvdsubdec: Fix runtime error: left shift of 242 by 24 places cannot be represented in type 'int'
- libavcodec/mpeg4videodec: Convert sprite_offset to 64bit
- avcodec/pngdec: Use ff_set_dimensions()
- avcodec/msvideo1: Check buffer size before re-getting the frame
- avcodec/h264_cavlc: Fix undefined behavior on qscale overflow
- avcodec/dcadsp: Fix runtime error: signed integer overflow
- avcodec/svq3: Reject dx/dy beyond 16bit
- avcodec/svq3: Increase offsets to prevent integer overflows
- avcodec/indeo2: Check remaining bits in ir2_decode_plane()
- avcodec/vp3: Check remaining bits in unpack_dct_coeffs()
- doc/developer: Add terse documentation of assumed C implementation defined behavior
- avcodec/bmp: Use ff_set_dimensions()
- avcodec/mdec: Fix runtime error: left shift of negative value -127
- avcodec/x86/vc1dsp_init: Fix build failure with --disable-optimizations and clang
- libavcodec/exr : fix float to uint16 conversion for negative float value
- avformat/webmdashenc: Validate the 'streams' adaptation sets parameter
- avformat/webmdashenc: Require the 'adaptation_sets' option to be set
- lavfi/avfiltergraph: only return EOF in avfilter_graph_request_oldest if all sinks EOFed
- ffmpeg: check for unconnected outputs
- avformat/utils: free AVStream.codec properly in free_stream()
- avcodec/options: do a more thorough clean up in avcodec_copy_context()
@@ -13,8 +13,9 @@ You can disable all the demuxers using the configure option
the option @code{--enable-demuxer=@var{DEMUXER}}, or disable it
with the option @code{--disable-demuxer=@var{DEMUXER}}.
The option @code{-formats} of the ff* tools will display the list of
enabled demuxers.
The option @code{-demuxers} of the ff* tools will display the list of
enabled demuxers. Use @code{-formats} to view a combined list of
enabled demuxers and muxers.
The description of some of the currently available demuxers follows.
@@ -243,11 +244,17 @@ file subdir/file-2.wav
@end example
@end itemize
@section flv
@section flv, live_flv
Adobe Flash Video Format demuxer.
This demuxer is used to demux FLV files and RTMP network streams.
This demuxer is used to demux FLV files and RTMP network streams. In case of live network streams, if you force format, you may use live_flv option instead of flv to survive timestamp discontinuities.
Enable variable-size block motion compensation. Motion estimation is applied with smaller block sizes at object boundaries in order to make the them less blur. Default is @code{0} (disabled).
@end table
@end table
@@ -9665,7 +9730,7 @@ Scene change detection method. Scene change leads motion vectors to be in random
@item none
Disable scene change detection.
@item fdiff
Frame difference. Corresponding pixel values are compared and if it statisfies @var{scd_threshold} scene change is detected.
Frame difference. Corresponding pixel values are compared and if it satisfies @var{scd_threshold} scene change is detected.
@end table
Default method is @samp{fdiff}.
@@ -10142,7 +10207,10 @@ force YUV422 output
force YUV444 output
@item rgb
force RGB output
force packed RGB output
@item gbrp
force planar RGB output
@end table
Default value is @samp{yuv420}.
@@ -10367,6 +10435,24 @@ Specify the color of the padded area. For the syntax of this option,
check the "Color" section in the ffmpeg-utils manual.
The default value of @var{color} is "black".
@item eval
Specify when to evaluate @var{width}, @var{height}, @var{x} and @var{y} expression.
It accepts the following values:
@table @samp
@item init
Only evaluate expressions once during the filter initialization or when
a command is processed.
@item frame
Evaluate expressions for each incoming frame.
@end table
Default value is @samp{init}.
@end table
The value for the @var{width}, @var{height}, @var{x}, and @var{y}
@@ -10941,6 +11027,12 @@ Set medium thresholding (good results, default).
@end table
@end table
@section premultiply
Apply alpha premultiply effect to input video stream using first plane
of second stream as alpha.
Both streams must have same dimensions and same pixel format.
@section prewitt
Apply prewitt operator to input video stream.
@@ -11170,6 +11262,76 @@ less than @code{0}, the filter will try to use a good random seed on a
best effort basis.
@end table
@section readeia608
Read closed captioning (EIA-608) information from the top lines of a video frame.
This filter adds frame metadata for @code{lavfi.readeia608.X.cc} and
@code{lavfi.readeia608.X.line}, where @code{X} is the number of the identified line
with EIA-608 data (starting from 0). A description of each metadata value follows:
@table @option
@item lavfi.readeia608.X.cc
The two bytes stored as EIA-608 data (printed in hexadecimal).
@item lavfi.readeia608.X.line
The number of the line on which the EIA-608 data was identified and read.
@end table
This filter accepts the following options:
@table @option
@item scan_min
Set the line to start scanning for EIA-608 data. Default is @code{0}.
@item scan_max
Set the line to end scanning for EIA-608 data. Default is @code{29}.
@item mac
Set minimal acceptable amplitude change for sync codes detection.
Default is @code{0.2}. Allowed range is @code{[0.001 - 1]}.
@item spw
Set the ratio of width reserved for sync code detection.
Default is @code{0.27}. Allowed range is @code{[0.01 - 0.7]}.
@item mhd
Set the max peaks height difference for sync code detection.
Default is @code{0.1}. Allowed range is @code{[0.0 - 0.5]}.
@item mpd
Set max peaks period difference for sync code detection.
Default is @code{0.1}. Allowed range is @code{[0.0 - 0.5]}.
@item msd
Set the first two max start code bits differences.
Default is @code{0.02}. Allowed range is @code{[0.0 - 0.5]}.
@item bhd
Set the minimum ratio of bits height compared to 3rd start code bit.
Default is @code{0.75}. Allowed range is @code{[0.01 - 1]}.
@item th_w
Set the white color threshold. Default is @code{0.35}. Allowed range is @code{[0.1 - 1]}.
@item th_b
Set the black color threshold. Default is @code{0.15}. Allowed range is @code{[0.0 - 0.5]}.
@item chp
Enable checking the parity bit. In the event of a parity error, the filter will output
@code{0x00} for that character. Default is false.
@end table
@subsection Examples
@itemize
@item
Output a csv with presentation time and the first two lines of identified EIA-608 captioning data.
- 1f821750f hevcdsp: split the qpel functions by width instead of by the subpixel fraction
- 818bfe7f0 hevcdsp: split the epel functions by width
- 688417399 hevcdsp: split the pred functions by width
- a853388d2 hevc: change the stride of the MC buffer to be in bytes instead of elements
- 0cef06df0 checkasm: add HEVC MC tests
- e7078e842 hevcdsp: add x86 SIMD for MC
- VAAPI VP8 decode hwaccel (currently under review: http://ffmpeg.org/pipermail/ffmpeg-devel/2017-February/thread.html#207348)
- Removal of the custom atomic API (5cc0057f49, see http://ffmpeg.org/pipermail/ffmpeg-devel/2017-March/209003.html)
- Use the new bitstream filter for extracting extradata (8e2ea69135 and 096a8effa3, see https://ffmpeg.org/pipermail/ffmpeg-devel/2017-March/209068.html)
- Read aac_adtstoasc extradata updates from packet side data on Matroska once mov and the bsf in question are fixed (See 13a211e632 and 5ef1959080)
Collateral damage that needs work locally:
------------------------------------------
@@ -96,3 +105,11 @@ Collateral damage that needs work locally:
- Merge proresdec2.c and proresdec_lgpl.c
- Merge proresenc_anatoliy.c and proresenc_kostya.c
- Remove ADVANCED_PARSER in libavcodec/hevc_parser.c
- Fix MIPS AC3 downmix
- hlsenc encryption support may need some adjustment (see edc43c571d)
Extra changes needed to be aligned with Libav:
----------------------------------------------
- Switching our examples to the new encode/decode API (see 67d28f4a0f)
- HEVC IDCT bit depth 12-bit support (Libav added 8 and 10 but doesn't have 12)
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.