【FFmpeg】The AVFrame struct

Published: 2024-06-28

Reference:
Analysis of FFmpeg structures: AVFrame

Example projects:
【FFmpeg】Using the ffmpeg libraries for H.264 software encoding
【FFmpeg】Using the ffmpeg libraries for H.264 software decoding
【FFmpeg】Using the ffmpeg libraries for RTMP streaming (push and pull)
【FFmpeg】Using the ffmpeg libraries with SDL2 to render decoded frames

Flow analysis:
【FFmpeg】A brief analysis of the main functions in the encoding path
【FFmpeg】A brief analysis of the main functions in the decoding path

Structure analysis:
【FFmpeg】The AVCodec struct
【FFmpeg】The AVCodecContext struct
【FFmpeg】The AVStream struct
【FFmpeg】The AVFormatContext struct
【FFmpeg】The AVIOContext struct
【FFmpeg】The AVPacket struct
1. Definition of the AVFrame struct

The AVFrame struct is defined in libavutil/frame.h. It stores uncompressed data (raw data before encoding or after decoding).

/**
 * This structure describes decoded (raw) audio or video data.
 *
 * AVFrame must be allocated using av_frame_alloc(). Note that this only
 * allocates the AVFrame itself, the buffers for the data must be managed
 * through other means (see below).
 * AVFrame must be freed with av_frame_free().
 *
 * AVFrame is typically allocated once and then reused multiple times to hold
 * different data (e.g. a single AVFrame to hold frames received from a
 * decoder). In such a case, av_frame_unref() will free any references held by
 * the frame and reset it to its original clean state before it
 * is reused again.
 *
 * The data described by an AVFrame is usually reference counted through the
 * AVBuffer API. The underlying buffer references are stored in AVFrame.buf /
 * AVFrame.extended_buf. An AVFrame is considered to be reference counted if at
 * least one reference is set, i.e. if AVFrame.buf[0] != NULL. In such a case,
 * every single data plane must be contained in one of the buffers in
 * AVFrame.buf or AVFrame.extended_buf.
 * There may be a single buffer for all the data, or one separate buffer for
 * each plane, or anything in between.
 *
 * sizeof(AVFrame) is not a part of the public ABI, so new fields may be added
 * to the end with a minor bump.
 *
 * Fields can be accessed through AVOptions, the name string used, matches the
 * C structure field name for fields accessible through AVOptions. The AVClass
 * for AVFrame can be obtained from avcodec_get_frame_class()
 */
// This structure describes decoded (raw) audio or video data.
//
// 1. An AVFrame must be allocated with av_frame_alloc(). Note that this only allocates the
//    AVFrame itself; the buffers for the data must be managed through other means (see below).
//    An AVFrame must be freed with av_frame_free().
//
// 2. An AVFrame is typically allocated once and then reused multiple times to hold different
//    data (e.g. a single AVFrame holding frames received from a decoder). In that case,
//    av_frame_unref() frees any references held by the frame and resets it to its original
//    clean state before it is reused again.
//
// 3. The data described by an AVFrame is usually reference counted through the AVBuffer API.
//    The underlying buffer references are stored in AVFrame.buf / AVFrame.extended_buf.
//    An AVFrame is considered reference counted if at least one reference is set, i.e. if
//    AVFrame.buf[0] != NULL. In that case, every single data plane must be contained in one
//    of the buffers in AVFrame.buf or AVFrame.extended_buf.
//
// 4. sizeof(AVFrame) is not part of the public ABI, so new fields may be added at the end
//    with a minor version bump.
//
// 5. Fields can be accessed through AVOptions; the name string used matches the C structure
//    field name. The AVClass for AVFrame can be obtained from avcodec_get_frame_class().
typedef struct AVFrame {
#define AV_NUM_DATA_POINTERS 8
    /**
     * pointer to the picture/channel planes.
     * This might be different from the first allocated byte. For video,
     * it could even point to the end of the image data.
     *
     * All pointers in data and extended_data must point into one of the
     * AVBufferRef in buf or extended_buf.
     *
     * Some decoders access areas outside 0,0 - width,height, please
     * see avcodec_align_dimensions2(). Some filters and swscale can read
     * up to 16 bytes beyond the planes, if these filters are to be used,
     * then 16 extra bytes must be allocated.
     *
     * NOTE: Pointers not needed by the format MUST be set to NULL.
     *
     * @attention In case of video, the data[] pointers can point to the
     * end of image data in order to reverse line order, when used in
     * combination with negative values in the linesize[] array.
     */
    // 1. Pointers to the picture/channel planes. This might differ from the first allocated
    //    byte; for video it may even point to the end of the image data.
    //
    // 2. All pointers in data and extended_data must point into one of the AVBufferRef in
    //    buf or extended_buf.
    //
    // 3. Some decoders access areas outside 0,0 - width,height; see
    //    avcodec_align_dimensions2(). Some filters and swscale can read up to 16 bytes
    //    beyond the planes; if such filters are to be used, 16 extra bytes must be allocated.
    //
    // @note Pointers not needed by the format MUST be set to NULL.
    // @attention For video, the data[] pointers can point to the end of the image data in
    //    order to reverse the line order, when combined with negative values in linesize[].
    uint8_t *data[AV_NUM_DATA_POINTERS];

    /**
     * For video, a positive or negative value, which is typically indicating
     * the size in bytes of each picture line, but it can also be:
     * - the negative byte size of lines for vertical flipping
     *   (with data[n] pointing to the end of the data
     * - a positive or negative multiple of the byte size as for accessing
     *   even and odd fields of a frame (possibly flipped)
     *
     * For audio, only linesize[0] may be set. For planar audio, each channel
     * plane must be the same size.
     *
     * For video the linesizes should be multiples of the CPUs alignment
     * preference, this is 16 or 32 for modern desktop CPUs.
     * Some code requires such alignment other code can be slower without
     * correct alignment, for yet other it makes no difference.
     *
     * @note The linesize may be larger than the size of usable data -- there
     * may be extra padding present for performance reasons.
     *
     * @attention In case of video, line size values can be negative to achieve
     * a vertically inverted iteration over image lines.
     */
    // 1. For video, a positive or negative value, typically the size in bytes of each
    //    picture line, but it can also be:
    //    (1) the negative byte size of lines, for vertical flipping (with data[n] pointing
    //        to the end of the data)
    //    (2) a positive or negative multiple of the byte size, for accessing the even and
    //        odd fields of a frame (possibly flipped)
    //
    // 2. For audio, only linesize[0] may be set. For planar audio, each channel plane must
    //    be the same size.
    //
    // 3. For video, the linesizes should be multiples of the CPU's preferred alignment,
    //    which is 16 or 32 for modern desktop CPUs. Some code requires such alignment,
    //    other code is merely slower without it, and for yet other code it makes no
    //    difference.
    //
    // @note The linesize may be larger than the size of the usable data -- extra padding
    //    may be present for performance reasons.
    // @attention For video, linesize values can be negative to achieve a vertically
    //    inverted iteration over the image lines.
    int linesize[AV_NUM_DATA_POINTERS];

    /**
     * pointers to the data planes/channels.
     *
     * For video, this should simply point to data[].
     *
     * For planar audio, each channel has a separate data pointer, and
     * linesize[0] contains the size of each channel buffer.
     * For packed audio, there is just one data pointer, and linesize[0]
     * contains the total size of the buffer for all channels.
     *
     * Note: Both data and extended_data should always be set in a valid frame,
     * but for planar audio with more channels that can fit in data,
     * extended_data must be used in order to access all channels.
     */
    // Pointers to the data planes/channels (video or audio).
    // 1. For video, this simply points to data[].
    // 2. For planar audio, each channel has a separate data pointer, and linesize[0]
    //    contains the size of each channel buffer. For packed audio, there is just one
    //    data pointer, and linesize[0] contains the total buffer size for all channels.
    // @note Both data and extended_data should always be set in a valid frame, but for
    //    planar audio with more channels than data can hold, extended_data must be used
    //    to access all channels.
    uint8_t **extended_data;

    /**
     * @name Video dimensions
     * Video frames only. The coded dimensions (in pixels) of the video frame,
     * i.e. the size of the rectangle that contains some well-defined values.
     *
     * @note The part of the frame intended for display/presentation is further
     * restricted by the @ref cropping "Cropping rectangle".
     * @{
     */
    // Width and height of the video frame (coded dimensions, in pixels)
    int width, height;
    /**
     * @}
     */

    /**
     * number of audio samples (per channel) described by this frame
     */
    // Number of audio samples (per channel) described by this frame
    int nb_samples;

    /**
     * format of the frame, -1 if unknown or unset
     * Values correspond to enum AVPixelFormat for video frames,
     * enum AVSampleFormat for audio)
     */
    // Format of the frame; -1 if unknown or unset.
    // Values correspond to enum AVPixelFormat for video frames and enum AVSampleFormat
    // for audio frames.
    int format;

#if FF_API_FRAME_KEY
    /**
     * 1 -> keyframe, 0-> not
     *
     * @deprecated Use AV_FRAME_FLAG_KEY instead
     */
    // attribute_deprecated marks a function, variable or type as no longer recommended;
    // it may be removed in a future version. Here key_frame is deprecated in favour of
    // AV_FRAME_FLAG_KEY.
    attribute_deprecated
    int key_frame;
#endif

    /**
     * Picture type of the frame.
     */
    // Picture type of the frame (e.g. I, P or B frame)
    enum AVPictureType pict_type;

    /**
     * Sample aspect ratio for the video frame, 0/1 if unknown/unspecified.
     */
    // Sample aspect ratio of the video frame; 0/1 if unknown/unspecified
    AVRational sample_aspect_ratio;

    /**
     * Presentation timestamp in time_base units (time when frame should be shown to user).
     */
    // Presentation timestamp in time_base units (the time at which the frame should be shown to the user)
    int64_t pts;

    /**
     * DTS copied from the AVPacket that triggered returning this frame. (if frame threading isn't used)
     * This is also the Presentation time of this AVFrame calculated from
     * only AVPacket.dts values without pts values.
     */
    // DTS copied from the AVPacket that triggered returning this frame (if frame
    // threading isn't used). This is also the presentation time of this AVFrame,
    // calculated from AVPacket.dts values only, without using pts values.
    int64_t pkt_dts;

    /**
     * Time base for the timestamps in this frame.
     * In the future, this field may be set on frames output by decoders or
     * filters, but its value will be by default ignored on input to encoders
     * or filters.
     */
    // Time base for the timestamps in this frame.
    // In the future this field may be set on frames output by decoders or filters, but
    // by default its value is ignored on input to encoders or filters.
    AVRational time_base;

    /**
     * quality (between 1 (good) and FF_LAMBDA_MAX (bad))
     */
    // Quality, between 1 (good) and FF_LAMBDA_MAX (bad),
    // where #define FF_LAMBDA_MAX (256*128-1)
    int quality;

    /**
     * Frame owner's private data.
     *
     * This field may be set by the code that allocates/owns the frame data.
     * It is then not touched by any library functions, except:
     * - it is copied to other references by av_frame_copy_props() (and hence by
     *   av_frame_ref());
     * - it is set to NULL when the frame is cleared by av_frame_unref()
     * - on the caller's explicit request. E.g. libavcodec encoders/decoders
     *   will copy this field to/from @ref AVPacket "AVPackets" if the caller sets
     *   @ref AV_CODEC_FLAG_COPY_OPAQUE.
     *
     * @see opaque_ref the reference-counted analogue
     */
    // Frame owner's private data.
    // This field may be set by the code that allocates/owns the frame data.
    // It is then not touched by any library functions, except:
    // (1) it is copied to other references by av_frame_copy_props() (and hence by
    //     av_frame_ref());
    // (2) it is set to NULL when the frame is cleared by av_frame_unref();
    // (3) on the caller's explicit request. E.g. libavcodec encoders/decoders will copy
    //     this field to/from @ref AVPacket "AVPackets" if the caller sets
    //     @ref AV_CODEC_FLAG_COPY_OPAQUE.
    void *opaque;

    /**
     * Number of fields in this frame which should be repeated, i.e. the total
     * duration of this frame should be repeat_pict + 2 normal field durations.
     *
     * For interlaced frames this field may be set to 1, which signals that this
     * frame should be presented as 3 fields: beginning with the first field (as
     * determined by AV_FRAME_FLAG_TOP_FIELD_FIRST being set or not), followed
     * by the second field, and then the first field again.
     *
     * For progressive frames this field may be set to a multiple of 2, which
     * signals that this frame's duration should be (repeat_pict + 2) / 2
     * normal frame durations.
     *
     * @note This field is computed from MPEG2 repeat_first_field flag and its
     * associated flags, H.264 pic_struct from picture timing SEI, and
     * their analogues in other codecs. Typically it should only be used when
     * higher-layer timing information is not available.
     */
    // Number of fields in this frame which should be repeated, i.e. the total duration of
    // this frame should be repeat_pict + 2 normal field durations.
    //
    // 1. For interlaced frames this field may be set to 1, meaning the frame should be
    //    presented as 3 fields: the first field (as determined by whether
    //    AV_FRAME_FLAG_TOP_FIELD_FIRST is set), then the second field, then the first
    //    field again.
    //
    // 2. For progressive frames this field may be set to a multiple of 2, meaning the
    //    frame's duration should be (repeat_pict + 2) / 2 normal frame durations.
    //
    // @note This field is computed from the MPEG-2 repeat_first_field flag and its
    //    associated flags, from the H.264 pic_struct in the picture timing SEI, and from
    //    their analogues in other codecs. It should typically only be used when
    //    higher-layer timing information is unavailable.
    int repeat_pict;

#if FF_API_INTERLACED_FRAME
    /**
     * The content of the picture is interlaced.
     *
     * @deprecated Use AV_FRAME_FLAG_INTERLACED instead
     */
    // Whether the content of the picture is interlaced.
    // Deprecated: use AV_FRAME_FLAG_INTERLACED instead.
    attribute_deprecated
    int interlaced_frame;

    /**
     * If the content is interlaced, is top field displayed first.
     *
     * @deprecated Use AV_FRAME_FLAG_TOP_FIELD_FIRST instead
     */
    // If the content is interlaced, whether the top field is displayed first.
    // Deprecated: use AV_FRAME_FLAG_TOP_FIELD_FIRST instead.
    attribute_deprecated
    int top_field_first;
#endif

#if FF_API_PALETTE_HAS_CHANGED
    /**
     * Tell user application that palette has changed from previous frame.
     */
    // Tells the user application that the palette has changed from the previous frame
    attribute_deprecated
    int palette_has_changed;
#endif

    /**
     * Sample rate of the audio data.
     */
    // Sample rate of the audio data
    int sample_rate;

    /**
     * AVBuffer references backing the data for this frame. All the pointers in
     * data and extended_data must point inside one of the buffers in buf or
     * extended_buf. This array must be filled contiguously -- if buf[i] is
     * non-NULL then buf[j] must also be non-NULL for all j < i.
     *
     * There may be at most one AVBuffer per data plane, so for video this array
     * always contains all the references. For planar audio with more than
     * AV_NUM_DATA_POINTERS channels, there may be more buffers than can fit in
     * this array. Then the extra AVBufferRef pointers are stored in the
     * extended_buf array.
     */
    // 1. AVBuffer references backing the data of this frame. All pointers in data and
    //    extended_data must point inside one of the buffers in buf or extended_buf.
    //    This array must be filled contiguously: if buf[i] is non-NULL, then buf[j] must
    //    also be non-NULL for all j < i.
    //
    // 2. There may be at most one AVBuffer per data plane, so for video this array always
    //    contains all the references. For planar audio with more than
    //    AV_NUM_DATA_POINTERS channels, there may be more buffers than fit in this array;
    //    the extra AVBufferRef pointers are then stored in the extended_buf array.
    AVBufferRef *buf[AV_NUM_DATA_POINTERS];

    /**
     * For planar audio which requires more than AV_NUM_DATA_POINTERS
     * AVBufferRef pointers, this array will hold all the references which
     * cannot fit into AVFrame.buf.
     *
     * Note that this is different from AVFrame.extended_data, which always
     * contains all the pointers. This array only contains the extra pointers,
     * which cannot fit into AVFrame.buf.
     *
     * This array is always allocated using av_malloc() by whoever constructs
     * the frame. It is freed in av_frame_unref().
     */
    // 1. For planar audio requiring more than AV_NUM_DATA_POINTERS AVBufferRef pointers,
    //    this array holds all the references that cannot fit into AVFrame.buf.
    //
    // 2. Note that this differs from AVFrame.extended_data, which always contains all the
    //    pointers. This array contains only the extra pointers that cannot fit into
    //    AVFrame.buf.
    //
    // 3. This array is always allocated with av_malloc() by whoever constructs the frame,
    //    and is freed in av_frame_unref().
    AVBufferRef **extended_buf;
    /**
     * Number of elements in extended_buf.
     */
    // Number of elements in extended_buf
    int        nb_extended_buf;
    // Side data attached to this frame
    AVFrameSideData **side_data;
    // Number of side data entries
    int            nb_side_data;

/**
 * @defgroup lavu_frame_flags AV_FRAME_FLAGS
 * @ingroup lavu_frame
 * Flags describing additional frame properties.
 *
 * @{
 */

/**
 * The frame data may be corrupted, e.g. due to decoding errors.
 */
// The frame data may be corrupted, e.g. due to decoding errors
#define AV_FRAME_FLAG_CORRUPT       (1 << 0)
/**
 * A flag to mark frames that are keyframes.
 */
// A flag marking frames that are keyframes
#define AV_FRAME_FLAG_KEY (1 << 1)
/**
 * A flag to mark the frames which need to be decoded, but shouldn't be output.
 */
// A flag marking frames that need to be decoded but should not be output
#define AV_FRAME_FLAG_DISCARD   (1 << 2)
/**
 * A flag to mark frames whose content is interlaced.
 */
// A flag marking frames whose content is interlaced
#define AV_FRAME_FLAG_INTERLACED (1 << 3)
/**
 * A flag to mark frames where the top field is displayed first if the content
 * is interlaced.
 */
// A flag marking frames where, if the content is interlaced, the top field is displayed first
#define AV_FRAME_FLAG_TOP_FIELD_FIRST (1 << 4)
/**
 * @}
 */

    /**
     * Frame flags, a combination of @ref lavu_frame_flags
     */
    // Frame flags, a combination of @ref lavu_frame_flags
    int flags;
	
    /**
     * MPEG vs JPEG YUV range.
     * - encoding: Set by user
     * - decoding: Set by libavcodec
     */
    // Value range of the visual content (MPEG vs JPEG YUV range)
    enum AVColorRange color_range;
    // Chromaticity coordinates of the source primaries
    enum AVColorPrimaries color_primaries;
    // Colour transfer characteristic
    enum AVColorTransferCharacteristic color_trc;

    /**
     * YUV colorspace type.
     * - encoding: Set by user
     * - decoding: Set by libavcodec
     */
    // Type of the YUV colorspace
    enum AVColorSpace colorspace;
    // Location of the chroma samples
    enum AVChromaLocation chroma_location;

    /**
     * frame timestamp estimated using various heuristics, in stream time base
     * - encoding: unused
     * - decoding: set by libavcodec, read by user.
     */
    // Frame timestamp estimated using various heuristics, in stream time base
    int64_t best_effort_timestamp;

#if FF_API_FRAME_PKT
    /**
     * reordered pos from the last AVPacket that has been input into the decoder
     * - encoding: unused
     * - decoding: Read by user.
     * @deprecated use AV_CODEC_FLAG_COPY_OPAQUE to pass through arbitrary user
     *             data from packets to frames
     */
    // Reordered position from the last AVPacket that was input into the decoder.
    // @deprecated Use AV_CODEC_FLAG_COPY_OPAQUE to pass arbitrary user data from packets
    //    to frames.
    attribute_deprecated
    int64_t pkt_pos;
#endif

    /**
     * metadata.
     * - encoding: Set by user.
     * - decoding: Set by libavcodec.
     */
    // Metadata
    AVDictionary *metadata;

    /**
     * decode error flags of the frame, set to a combination of
     * FF_DECODE_ERROR_xxx flags if the decoder produced a frame, but there
     * were errors during the decoding.
     * - encoding: unused
     * - decoding: set by libavcodec, read by user.
     */
    // Decode error flags of the frame. Set to a combination of FF_DECODE_ERROR_xxx flags
    // if the decoder produced a frame but there were errors during decoding.
    int decode_error_flags;
    // Error flags:
    // 1: invalid bitstream (INVALID_BITSTREAM)
    // 2: missing reference frame (MISSING_REFERENCE)
    // 4: error concealment was applied (CONCEALMENT_ACTIVE)
    // 8: slice decoding error (DECODE_SLICES)
#define FF_DECODE_ERROR_INVALID_BITSTREAM   1
#define FF_DECODE_ERROR_MISSING_REFERENCE   2
#define FF_DECODE_ERROR_CONCEALMENT_ACTIVE  4
#define FF_DECODE_ERROR_DECODE_SLICES       8

#if FF_API_FRAME_PKT
    /**
     * size of the corresponding packet containing the compressed
     * frame.
     * It is set to a negative value if unknown.
     * - encoding: unused
     * - decoding: set by libavcodec, read by user.
     * @deprecated use AV_CODEC_FLAG_COPY_OPAQUE to pass through arbitrary user
     *             data from packets to frames
     */
    // Size of the corresponding packet containing the compressed frame;
    // set to a negative value if unknown.
    attribute_deprecated
    int pkt_size;
#endif

    /**
     * For hwaccel-format frames, this should be a reference to the
     * AVHWFramesContext describing the frame.
     */
    // For hwaccel-format frames, this should be a reference to the AVHWFramesContext describing the frame
    AVBufferRef *hw_frames_ctx;
	
    /**
     * Frame owner's private data.
     *
     * This field may be set by the code that allocates/owns the frame data.
     * It is then not touched by any library functions, except:
     * - a new reference to the underlying buffer is propagated by
     *   av_frame_copy_props() (and hence by av_frame_ref());
     * - it is unreferenced in av_frame_unref();
     * - on the caller's explicit request. E.g. libavcodec encoders/decoders
     *   will propagate a new reference to/from @ref AVPacket "AVPackets" if the
     *   caller sets @ref AV_CODEC_FLAG_COPY_OPAQUE.
     *
     * @see opaque the plain pointer analogue
     */
    // Frame owner's private data (the reference-counted analogue of opaque).
    // This field may be set by the code that allocates/owns the frame data.
    // It is then not touched by any library functions, except:
    // (1) a new reference to the underlying buffer is propagated by av_frame_copy_props()
    //     (and hence by av_frame_ref());
    // (2) it is unreferenced in av_frame_unref();
    // (3) on the caller's explicit request. E.g. libavcodec encoders/decoders will
    //     propagate a new reference to/from @ref AVPacket "AVPackets" if the caller sets
    //     @ref AV_CODEC_FLAG_COPY_OPAQUE.
    AVBufferRef *opaque_ref;

    /**
     * @anchor cropping
     * @name Cropping
     * Video frames only. The number of pixels to discard from the the
     * top/bottom/left/right border of the frame to obtain the sub-rectangle of
     * the frame intended for presentation.
     * @{
     */
    // Video frames only. The number of pixels to discard from the top/bottom/left/right
    // border of the frame, giving the sub-rectangle intended for presentation.
    size_t crop_top;
    size_t crop_bottom;
    size_t crop_left;
    size_t crop_right;
    /**
     * @}
     */

    /**
     * AVBufferRef for internal use by a single libav* library.
     * Must not be used to transfer data between libraries.
     * Has to be NULL when ownership of the frame leaves the respective library.
     *
     * Code outside the FFmpeg libs should never check or change the contents of the buffer ref.
     *
     * FFmpeg calls av_buffer_unref() on it when the frame is unreferenced.
     * av_frame_copy_props() calls create a new reference with av_buffer_ref()
     * for the target frame's private_ref field.
     */
    // AVBufferRef for internal use by a single libav* library. Must not be used to
    // transfer data between libraries. Must be NULL when ownership of the frame leaves
    // the respective library.
    //
    // 1. Code outside the FFmpeg libraries should never check or change the contents of
    //    this buffer reference.
    //
    // 2. FFmpeg calls av_buffer_unref() on it when the frame is unreferenced.
    //    av_frame_copy_props() creates a new reference with av_buffer_ref() for the
    //    target frame's private_ref field.
    AVBufferRef *private_ref;

    /**
     * Channel layout of the audio data.
     */
    // Channel layout of the audio data
    AVChannelLayout ch_layout;

    /**
     * Duration of the frame, in the same units as pts. 0 if unknown.
     */
    // Duration of the frame, in the same units as pts; 0 if unknown
    int64_t duration;
} AVFrame;

The key fields recorded in the AVFrame struct include:
(1) uint8_t *data[AV_NUM_DATA_POINTERS]: the frame data planes
(2) int linesize[AV_NUM_DATA_POINTERS]: the size in bytes of each picture line (a byte stride, not a pixel count), possibly including padding
(3) int width, height: video width and height
(4) int format: format of the frame
(5) int key_frame: whether this is a keyframe (deprecated in favour of AV_FRAME_FLAG_KEY)
(6) enum AVPictureType pict_type: picture type of the frame
(7) int64_t pts: presentation timestamp
(8) int64_t pkt_dts: decoding timestamp copied from the packet
(9) int quality: quality of the frame
(10) AVBufferRef *buf[AV_NUM_DATA_POINTERS]: the AVBuffer references backing the frame data; all pointers in data and extended_data must point into one of the buffers in buf or extended_buf
(11) enum AVColorXXX: colour-related settings
(12) size_t crop_XXX: cropping of the display rectangle

Compared with older versions, the AVFrame defined in recent FFmpeg adds items such as colour handling, cropping and hardware codec information, while removing some lower-level fields such as qp, mv and dct tables, which are now encapsulated elsewhere. One remaining question: opaque, side_data and metadata all describe extra or private data, so how do they differ? My understanding is that all three carry additional information for audio/video processing:
(1) opaque: the user's private data, which the user can access and modify and whose lifetime the user must manage. It suits cases where some context or configuration must be passed through to the decoder. It has no particular format; it is declared as void* and its meaning depends entirely on the user.
(2) side_data: used by encoders and decoders to store auxiliary data associated with the current frame. It is accessed through AVFrame's side_data and nb_side_data, is managed automatically by the codecs, and is commonly used to carry codec-specific information such as H.264 SEI (Supplemental Enhancement Information).
(3) metadata: stores key-value metadata for codecs and other components. It is managed through an AVDictionary, which provides an API for setting and getting entries, and is used to pass codec configuration or additional stream information such as subtitles. Storing it as key-value pairs makes it easy for codecs and other components to read and parse.

Referring to Lei Xiaohua's article, data can be understood as follows:
(1) For packed formats (e.g. RGB24), all the data is stored in data[0].
(2) For planar formats (e.g. YUV420P), the data is split across data[0], data[1], data[2], ... (for YUV420P, data[0] holds Y, data[1] holds U, data[2] holds V).

enum AVPictureType pict_type

This enum defines the picture type of the frame.

/**
 * @}
 * @}
 * @defgroup lavu_picture Image related
 *
 * AVPicture types, pixel formats and basic image planes manipulation.
 *
 * @{
 */
enum AVPictureType {
    AV_PICTURE_TYPE_NONE = 0, ///< Undefined
    AV_PICTURE_TYPE_I,     ///< Intra
    AV_PICTURE_TYPE_P,     ///< Predicted
    AV_PICTURE_TYPE_B,     ///< Bi-dir predicted
    AV_PICTURE_TYPE_S,     ///< S(GMC)-VOP MPEG-4
    AV_PICTURE_TYPE_SI,    ///< Switching Intra
    AV_PICTURE_TYPE_SP,    ///< Switching Predicted
    AV_PICTURE_TYPE_BI,    ///< BI type
};

Both P frames and SP frames use motion-compensated inter prediction. From the material I have read, an SP frame can reconstruct the same picture from different reference frames, which makes it useful for switching between streams, splicing, random access and error recovery; I do not yet understand it in depth.

AVFrameSideData **side_data

In the older FFmpeg versions documented by Lei Xiaohua, side_data was not wrapped in a struct, but in newer versions it is, containing the type, the data, the data size, plus metadata and buf:

/**
 * Structure to hold side data for an AVFrame.
 *
 * sizeof(AVFrameSideData) is not a part of the public ABI, so new fields may be added
 * to the end with a minor bump.
 */
typedef struct AVFrameSideData {
    enum AVFrameSideDataType type;
    uint8_t *data;
    size_t   size;
    AVDictionary *metadata;
    AVBufferRef *buf;
} AVFrameSideData;

AVFrameSideDataType is defined as follows:

/**
 * @defgroup lavu_frame AVFrame
 * @ingroup lavu_data
 *
 * @{
 * AVFrame is an abstraction for reference-counted raw multimedia data.
 */
enum AVFrameSideDataType {
    /**
     * The data is the AVPanScan struct defined in libavcodec.
     */
    AV_FRAME_DATA_PANSCAN,
    /**
     * ATSC A53 Part 4 Closed Captions.
     * A53 CC bitstream is stored as uint8_t in AVFrameSideData.data.
     * The number of bytes of CC data is AVFrameSideData.size.
     */
    AV_FRAME_DATA_A53_CC,
    /**
     * Stereoscopic 3d metadata.
     * The data is the AVStereo3D struct defined in libavutil/stereo3d.h.
     */
    AV_FRAME_DATA_STEREO3D,
    /**
     * The data is the AVMatrixEncoding enum defined in libavutil/channel_layout.h.
     */
    AV_FRAME_DATA_MATRIXENCODING,
    /**
     * Metadata relevant to a downmix procedure.
     * The data is the AVDownmixInfo struct defined in libavutil/downmix_info.h.
     */
    AV_FRAME_DATA_DOWNMIX_INFO,
    /**
     * ReplayGain information in the form of the AVReplayGain struct.
     */
    AV_FRAME_DATA_REPLAYGAIN,
    ///....

CSDN : https://blog.csdn.net/weixin_42877471
Github : https://github.com/DoFulangChen