Skip to content
Open
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 28 additions & 2 deletions src/torchcodec/_core/Encoder.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -631,24 +631,42 @@ void tryToValidateCodecOption(
}

void sortCodecOptions(
const AVFormatContext* avFormatContext,
const std::map<std::string, std::string>& extraOptions,
UniqueAVDictionary& codecDict,
UniqueAVDictionary& formatDict) {
// Accepts a map of options as input, then sorts them into codec options and
// format options. The sorted options are returned into two separate dicts.
const AVClass* formatClass = avformat_get_class();
const AVClass* muxerClass =
avFormatContext->oformat ? avFormatContext->oformat->priv_class : nullptr;
for (const auto& [key, value] : extraOptions) {
// Check if option is generic format option
const AVOption* fmtOpt = av_opt_find2(
&formatClass,
key.c_str(),
nullptr,
0,
AV_OPT_SEARCH_CHILDREN | AV_OPT_SEARCH_FAKE_OBJ,
nullptr);
if (fmtOpt) {
// Check if option is muxer-specific option
// (Returned from `ffmpeg -h muxer=mp4`)
const AVOption* muxerOpt = nullptr;
if (muxerClass) {
muxerOpt = av_opt_find2(
&muxerClass,
key.c_str(),
nullptr,
0,
AV_OPT_SEARCH_FAKE_OBJ,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to use AV_OPT_SEARCH_CHILDREN here? It's used above for fmtOpt

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only use AV_OPT_SEARCH_CHILDREN when an AVClass has children to iterate over.
I checked that the AVFormatContext class defines child_next and child_class_iterate to iterate through child fields:
https://github.com/FFmpeg/FFmpeg/blob/release/8.0/libavformat/options.c#L128-L137

While the muxer class for MOV does not define these functions:
https://github.com/FFmpeg/FFmpeg/blob/release/8.0/libavformat/movenc.c#L132-L137

nullptr);
}
if (fmtOpt || muxerOpt) {
// Pass container-format options to formatDict to be used in
// avformat_write_header
av_dict_set(formatDict.getAddress(), key.c_str(), value.c_str(), 0);
} else {
// Default to codec option (includes AVCodecContext + encoder-private)
// By default, pass as codec option to be used in avcodec_open2
av_dict_set(codecDict.getAddress(), key.c_str(), value.c_str(), 0);
}
}
Expand Down Expand Up @@ -835,6 +853,7 @@ void VideoEncoder::initializeEncoder(
tryToValidateCodecOption(*avCodec, key.c_str(), value);
}
sortCodecOptions(
avFormatContext_.get(),
videoStreamOptions.extraOptions.value(),
avCodecOptions,
avFormatOptions_);
Expand Down Expand Up @@ -921,6 +940,13 @@ void VideoEncoder::encode() {
flushBuffers();

status = av_write_trailer(avFormatContext_.get());
// av_write_trailer returns mfra atom size (positive) for fragmented
// containers, which we'd misinterpret as an error. So we replace positive
// values with AVSUCCESS. See:
// https://github.com/FFmpeg/FFmpeg/blob/n8.0/libavformat/movenc.c#L8666
if (status > 0) {
status = AVSUCCESS;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this conversion, we interpret the positive returned integer as a random low level error unrelated to the encoding process (ex. 24 raises EMFILE: too many open files). It might just be a bug in the FFmpeg implementation that we can work around.

Since FFmpeg errors are negative, I think its fine to set the status to successful here.

Copy link
Contributor

@mollyxu mollyxu Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for providing context! Would we want to also add a comment on how ffmpeg errors are negative and link to source code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

TORCH_CHECK(
status == AVSUCCESS,
"Error in av_write_trailer: ",
Expand Down
42 changes: 42 additions & 0 deletions test/test_encoders.py
Original file line number Diff line number Diff line change
Expand Up @@ -1509,3 +1509,45 @@ def test_nvenc_against_ffmpeg_cli(
assert color_range == encoder_metadata["color_range"]
if color_space is not None:
assert color_space == encoder_metadata["color_space"]

@pytest.mark.parametrize("format", ["mp4", "mov"])
@pytest.mark.parametrize(
"extra_options",
[
# frag_keyframe with empty_moov (new fragment every keyframe)
{"movflags": "+frag_keyframe+empty_moov"},
# frag_duration creates fragments based on duration (in microseconds)
{"movflags": "+empty_moov", "frag_duration": "1000000"},
],
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a fun test!
Just to make sure, did you verify that it breaks if we don't pass the extra_option flags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it fails to open the file when using extra_options={}:
RuntimeError: Could not open input file: .../fragmented_output.mp4 Invalid data found when processing input

Hopefully this fun test does not introduce flakiness!

def test_fragmented_mp4(
self,
tmp_path,
extra_options,
format,
):
# Test that VideoEncoder can write fragmented files using movflags.
# Fragmented files store metadata interleaved with data rather than
# all at the end, making them decodable even if writing is interrupted.
source_frames, frame_rate = self.decode_and_get_frame_rate(TEST_SRC_2_720P.path)
encoder = VideoEncoder(frames=source_frames, frame_rate=frame_rate)
encoded_path = str(tmp_path / f"fragmented_output.{format}")
encoder.to_file(dest=encoded_path, extra_options=extra_options)

# Decode the file to get reference frames
reference_decoder = VideoDecoder(encoded_path)
reference_frames = [reference_decoder.get_frame_at(i) for i in range(10)]

# Truncate the file to simulate interrupted write
with open(encoded_path, "rb") as f:
full_content = f.read()
truncated_size = int(len(full_content) * 0.5)
with open(encoded_path, "wb") as f:
f.write(full_content[:truncated_size])

# Decode the truncated file and verify first 10 frames match reference
truncated_decoder = VideoDecoder(encoded_path)
assert len(truncated_decoder) >= 10
for i in range(10):
truncated_frame = truncated_decoder.get_frame_at(i)
assert torch.equal(truncated_frame.data, reference_frames[i].data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use torch.testing.assert_equal(..., rtol=0, atol=0), it provides better error messages than torch,equal.

Loading