How to Reduce the Number of Cached Frames in Intel VPL Decoder and Lower Decoding Latency #190

@lion117

Background Description

  1. Hardware: Intel i7-10500, 16 GB memory
  2. Operating System: Ubuntu 22.04
  3. Library Version: Intel VPL 2.15.0

Problem Description

I have a player project that requires implementing hardware decoding using Intel VPL, with the core requirement of achieving minimal latency. After decoding via VPL, the NV12 data needs to be retrieved from memory for post-processing and subsequent rendering.
To track the decoding behavior of VPL, I print the Presentation Timestamps (PTS) both before feeding frames into the decoder and after retrieving the decoded frames.
When decoding H.264 streams that contain only I-frames and P-frames, I have to feed at least 5 consecutive frames (I-P-P-P-P) before the decoder outputs the first decoded frame, which suggests that the decoder holds at least 4 frames internally. At 30 FPS each frame lasts about 33.3 ms, so this buffering adds roughly 133 ms of latency (in any case more than 100 ms), which is unacceptable for our latency optimization goals.
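The PTS bookkeeping described above can be sketched as a small helper. This is only an illustration of the measurement, not the code from my player: `PtsTracker` and `bufferedLatencyMs` are hypothetical names, nothing here calls into Intel VPL, and the latency formula assumes a constant frame rate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical helper, not part of Intel VPL: remembers which PTS values
// have been fed to the decoder and have not been returned yet.
class PtsTracker {
public:
    // Call right before a packet with this PTS is submitted for decoding.
    void onFeed(std::int64_t pts) { inFlight_.push_back(pts); }

    // Call when a decoded frame with this PTS comes back.
    // Returns the number of frames the decoder is still holding.
    std::size_t onOutput(std::int64_t pts) {
        auto it = std::find(inFlight_.begin(), inFlight_.end(), pts);
        if (it != inFlight_.end()) inFlight_.erase(it);
        return inFlight_.size();
    }

    // Frames fed so far that have not been output.
    std::size_t pending() const { return inFlight_.size(); }

private:
    std::deque<std::int64_t> inFlight_;
};

// Delay added by `heldFrames` buffered frames at a constant frame rate.
double bufferedLatencyMs(std::size_t heldFrames, double fps) {
    assert(fps > 0.0);
    return static_cast<double>(heldFrames) * 1000.0 / fps;
}
```

With the I-P-P-P-P pattern above, five calls to `onFeed` followed by the first `onOutput` leave 4 frames pending, and `bufferedLatencyMs(4, 30.0)` gives about 133 ms.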

Configuration code

    LOGI("begin to initDecoderWithParams ");
    memset(&m_decParams, 0, sizeof(m_decParams));
    m_decParams.mfx.CodecId = codecId;
    m_decParams.IOPattern = MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
    m_decParams.mfx.FrameInfo.FourCC = MFX_FOURCC_NV12;
    m_decParams.mfx.FrameInfo.ChromaFormat = MFX_CHROMAFORMAT_YUV420;
    m_decParams.mfx.FrameInfo.Width = ALIGN16(width);
    m_decParams.mfx.FrameInfo.Height = ALIGN16(height);
    m_decParams.mfx.FrameInfo.CropW = width;
    m_decParams.mfx.FrameInfo.CropH = height;

    if (m_bLowLantency)
    {
        m_decParams.AsyncDepth = 1; // reasonable async depth
        m_decParams.mfx.MaxDecFrameBuffering = 1;

        // m_decParams.mfx.DecodedOrder = 1;   // with B-frames, output may be out of order
        m_decParams.mfx.SkipOutput = MFX_CODINGOPTION_OFF;
        LOGI("set low latency decode mode  ");
    }
    else {
        m_decParams.AsyncDepth = 6; // reasonable async depth
        m_decParams.mfx.MaxDecFrameBuffering = 6;
    }
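The snippet uses an `ALIGN16` macro that is not shown. A minimal sketch of an equivalent helper follows; the function names are hypothetical and not taken from the project. As far as I understand the VPL documentation, `Width` should be a multiple of 16, and `Height` a multiple of 16 for progressive content and of 32 otherwise, so the height helper takes a flag for that.

```cpp
#include <cassert>
#include <cstdint>

// Rounds `value` up to the next multiple of `alignment`.
// `alignment` must be a non-zero power of two.
inline std::uint16_t alignUp(std::uint16_t value, std::uint16_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return static_cast<std::uint16_t>((value + alignment - 1) & ~(alignment - 1));
}

// Hypothetical wrappers for the FrameInfo.Width / FrameInfo.Height fields.
inline std::uint16_t alignedWidth(std::uint16_t width) {
    return alignUp(width, 16);
}

inline std::uint16_t alignedHeight(std::uint16_t height, bool progressive) {
    // Assumption: 16 for progressive streams, 32 for interlaced ones.
    return alignUp(height, progressive ? 16 : 32);
}
```

For a 1920x1080 progressive stream this gives 1920x1088, while `CropW` and `CropH` keep the real 1920x1080 picture size.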

Question

What optimization methods are available to adjust the decoder settings, enabling it to output decoded frames immediately after processing and reducing the number of internally cached frames?
