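[FFmpeg] The AVFrame Structure
Example projects:
[FFmpeg] Software H.264 encoding with the FFmpeg libraries
[FFmpeg] Software H.264 decoding with the FFmpeg libraries
[FFmpeg] RTMP push and pull streaming with the FFmpeg libraries
[FFmpeg] Decoding with the FFmpeg libraries and rendering with SDL2
Workflow analysis:
[FFmpeg] A brief analysis of the main functions on the encoding path
[FFmpeg] A brief analysis of the main functions on the decoding path
Structure analysis:
[FFmpeg] The AVCodec structure
[FFmpeg] The AVCodecContext structure
[FFmpeg] The AVStream structure
[FFmpeg] The AVFormatContext structure
[FFmpeg] The AVIOContext structure
[FFmpeg] The AVPacket structure
1. Definition of the AVFrame structure
AVFrame is defined in libavutil/frame.h and holds uncompressed data (before encoding or after decoding).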
/**
* This structure describes decoded (raw) audio or video data.
*
* AVFrame must be allocated using av_frame_alloc(). Note that this only
* allocates the AVFrame itself, the buffers for the data must be managed
* through other means (see below).
* AVFrame must be freed with av_frame_free().
*
* AVFrame is typically allocated once and then reused multiple times to hold
* different data (e.g. a single AVFrame to hold frames received from a
* decoder). In such a case, av_frame_unref() will free any references held by
* the frame and reset it to its original clean state before it
* is reused again.
*
* The data described by an AVFrame is usually reference counted through the
* AVBuffer API. The underlying buffer references are stored in AVFrame.buf /
* AVFrame.extended_buf. An AVFrame is considered to be reference counted if at
* least one reference is set, i.e. if AVFrame.buf[0] != NULL. In such a case,
* every single data plane must be contained in one of the buffers in
* AVFrame.buf or AVFrame.extended_buf.
* There may be a single buffer for all the data, or one separate buffer for
* each plane, or anything in between.
*
* sizeof(AVFrame) is not a part of the public ABI, so new fields may be added
* to the end with a minor bump.
*
* Fields can be accessed through AVOptions, the name string used, matches the
* C structure field name for fields accessible through AVOptions. The AVClass
* for AVFrame can be obtained from avcodec_get_frame_class()
*/
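// This structure describes decoded (raw) audio or video data.
//
// 1. An AVFrame must be allocated with av_frame_alloc(). This allocates only the AVFrame itself; the data buffers must be managed by other means (see below). It must be freed with av_frame_free().
//
// 2. An AVFrame is typically allocated once and reused many times to hold different data (e.g. one AVFrame receiving every frame from a decoder).
// In that case av_frame_unref() releases any references held by the frame and resets it to a clean state before reuse.
//
// 3. The data described by an AVFrame is usually reference counted through the AVBuffer API. The buffer references are stored in AVFrame.buf / AVFrame.extended_buf.
// A frame counts as reference counted if at least one reference is set, i.e. AVFrame.buf[0] != NULL. In that case every data plane
// must be contained in one of the buffers in AVFrame.buf or AVFrame.extended_buf.
//
// 4. sizeof(AVFrame) is not part of the public ABI, so new fields may be appended at the end.
//
// 5. Fields can be accessed through AVOptions; the option name matches the C field name.
// The AVClass for AVFrame can be obtained from avcodec_get_frame_class().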
typedef struct AVFrame {
#define AV_NUM_DATA_POINTERS 8
/**
* pointer to the picture/channel planes.
* This might be different from the first allocated byte. For video,
* it could even point to the end of the image data.
*
* All pointers in data and extended_data must point into one of the
* AVBufferRef in buf or extended_buf.
*
* Some decoders access areas outside 0,0 - width,height, please
* see avcodec_align_dimensions2(). Some filters and swscale can read
* up to 16 bytes beyond the planes, if these filters are to be used,
* then 16 extra bytes must be allocated.
*
* NOTE: Pointers not needed by the format MUST be set to NULL.
*
* @attention In case of video, the data[] pointers can point to the
* end of image data in order to reverse line order, when used in
* combination with negative values in the linesize[] array.
*/
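// 1. Pointers to the picture/channel planes. A pointer may differ from the first allocated byte; for video it can even point to the end of the image data.
//
// 2. Every pointer in data and extended_data must point into one of the AVBufferRefs in buf or extended_buf.
//
// 3. Some decoders access areas outside the (0,0)-(width,height) rectangle; see avcodec_align_dimensions2().
// Some filters and swscale can read up to 16 bytes beyond the planes, so 16 extra bytes must be allocated if they are used.
//
// @note: pointers not needed by the format must be set to NULL.
// @attention: for video, data[] pointers can point to the end of the image data, combined with negative values in linesize[], to reverse the line order.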
uint8_t *data[AV_NUM_DATA_POINTERS];
/**
* For video, a positive or negative value, which is typically indicating
* the size in bytes of each picture line, but it can also be:
* - the negative byte size of lines for vertical flipping
* (with data[n] pointing to the end of the data
* - a positive or negative multiple of the byte size as for accessing
* even and odd fields of a frame (possibly flipped)
*
* For audio, only linesize[0] may be set. For planar audio, each channel
* plane must be the same size.
*
* For video the linesizes should be multiples of the CPUs alignment
* preference, this is 16 or 32 for modern desktop CPUs.
* Some code requires such alignment other code can be slower without
* correct alignment, for yet other it makes no difference.
*
* @note The linesize may be larger than the size of usable data -- there
* may be extra padding present for performance reasons.
*
* @attention In case of video, line size values can be negative to achieve
* a vertically inverted iteration over image lines.
*/
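// 1. For video, a positive or negative value that usually gives the size in bytes of each picture line, but it can also be:
// (1) the negative byte size of a line, for vertical flipping (with data[n] pointing to the end of the data)
// (2) a positive or negative multiple of the line byte size, for accessing the even and odd fields of a frame (possibly flipped)
//
// 2. For audio, only linesize[0] may be set. For planar audio, every channel plane must be the same size.
//
// 3. For video, linesize should be a multiple of the CPU's preferred alignment, which is 16 or 32 on modern desktop CPUs.
// Some code requires this alignment, other code merely runs slower without it, and for yet other code it makes no difference.
//
// @note: linesize may be larger than the usable data size; extra padding may be present for performance reasons.
// @attention: for video, linesize values can be negative to iterate over the image lines in vertically inverted order.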
int linesize[AV_NUM_DATA_POINTERS];
/**
* pointers to the data planes/channels.
*
* For video, this should simply point to data[].
*
* For planar audio, each channel has a separate data pointer, and
* linesize[0] contains the size of each channel buffer.
* For packed audio, there is just one data pointer, and linesize[0]
* contains the total size of the buffer for all channels.
*
* Note: Both data and extended_data should always be set in a valid frame,
* but for planar audio with more channels that can fit in data,
* extended_data must be used in order to access all channels.
*/
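// Pointers to the data planes (video) or channels (audio).
// 1. For video, this simply points to data[].
// 2. For planar audio, each channel has its own data pointer, and linesize[0] holds the size of each channel buffer.
// For packed audio, there is a single data pointer, and linesize[0] holds the total buffer size for all channels.
// Note: data and extended_data should both be set in a valid frame, but for planar audio with more channels than fit in data[],
// extended_data must be used to access all channels.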
uint8_t **extended_data;
/**
* @name Video dimensions
* Video frames only. The coded dimensions (in pixels) of the video frame,
* i.e. the size of the rectangle that contains some well-defined values.
*
* @note The part of the frame intended for display/presentation is further
* restricted by the @ref cropping "Cropping rectangle".
* @{
*/
// Width and height of the video frame (coded dimensions, in pixels)
int width, height;
/**
* @}
*/
/**
* number of audio samples (per channel) described by this frame
*/
// Number of audio samples (per channel) described by this frame
int nb_samples;
/**
* format of the frame, -1 if unknown or unset
* Values correspond to enum AVPixelFormat for video frames,
* enum AVSampleFormat for audio)
*/
// Format of the frame; -1 if unknown or unset.
// Values correspond to enum AVPixelFormat for video frames and enum AVSampleFormat for audio frames.
int format;
#if FF_API_FRAME_KEY
/**
* 1 -> keyframe, 0-> not
*
* @deprecated Use AV_FRAME_FLAG_KEY instead
*/
// attribute_deprecated marks a function, variable or type as no longer recommended; it may be removed in a future version.
// key_frame is deprecated: test the AV_FRAME_FLAG_KEY bit in flags instead.
attribute_deprecated
int key_frame;
#endif
/**
* Picture type of the frame.
*/
// Picture type of the frame (e.g. I, P or B)
enum AVPictureType pict_type;
/**
* Sample aspect ratio for the video frame, 0/1 if unknown/unspecified.
*/
// Sample aspect ratio of the video frame; 0/1 if unknown or unspecified
AVRational sample_aspect_ratio;
/**
* Presentation timestamp in time_base units (time when frame should be shown to user).
*/
// Presentation timestamp in time_base units (the time at which the frame should be shown to the user)
int64_t pts;
/**
* DTS copied from the AVPacket that triggered returning this frame. (if frame threading isn't used)
* This is also the Presentation time of this AVFrame calculated from
* only AVPacket.dts values without pts values.
*/
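// DTS copied from the AVPacket that triggered returning this frame (if frame threading is not used).
// It is also the presentation time of this AVFrame as computed from AVPacket.dts values alone, without pts values.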
int64_t pkt_dts;
/**
* Time base for the timestamps in this frame.
* In the future, this field may be set on frames output by decoders or
* filters, but its value will be by default ignored on input to encoders
* or filters.
*/
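// Time base for the timestamps in this frame.
// In the future this field may be set on frames output by decoders or filters, but by default its value is ignored on input to encoders or filters.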
AVRational time_base;
/**
* quality (between 1 (good) and FF_LAMBDA_MAX (bad))
*/
// Quality, between 1 (good) and FF_LAMBDA_MAX (bad)
// #define FF_LAMBDA_MAX (256*128-1)
int quality;
/**
* Frame owner's private data.
*
* This field may be set by the code that allocates/owns the frame data.
* It is then not touched by any library functions, except:
* - it is copied to other references by av_frame_copy_props() (and hence by
* av_frame_ref());
* - it is set to NULL when the frame is cleared by av_frame_unref()
* - on the caller's explicit request. E.g. libavcodec encoders/decoders
* will copy this field to/from @ref AVPacket "AVPackets" if the caller sets
* @ref AV_CODEC_FLAG_COPY_OPAQUE.
*
* @see opaque_ref the reference-counted analogue
*/
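// Private data of the frame owner.
// This field may be set by the code that allocates/owns the frame data.
// No library function touches it afterwards, except that:
// (1) av_frame_copy_props() (and hence av_frame_ref()) copies it to other references;
// (2) av_frame_unref() sets it to NULL when clearing the frame;
// (3) it is used on the caller's explicit request. E.g. libavcodec encoders/decoders copy this field
// to/from AVPackets if the caller sets AV_CODEC_FLAG_COPY_OPAQUE.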
void *opaque;
/**
* Number of fields in this frame which should be repeated, i.e. the total
* duration of this frame should be repeat_pict + 2 normal field durations.
*
* For interlaced frames this field may be set to 1, which signals that this
* frame should be presented as 3 fields: beginning with the first field (as
* determined by AV_FRAME_FLAG_TOP_FIELD_FIRST being set or not), followed
* by the second field, and then the first field again.
*
* For progressive frames this field may be set to a multiple of 2, which
* signals that this frame's duration should be (repeat_pict + 2) / 2
* normal frame durations.
*
* @note This field is computed from MPEG2 repeat_first_field flag and its
* associated flags, H.264 pic_struct from picture timing SEI, and
* their analogues in other codecs. Typically it should only be used when
* higher-layer timing information is not available.
*/
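// Number of fields in this frame that should be repeated, i.e. the total duration of the frame is repeat_pict + 2 normal field durations.
//
// 1. For interlaced frames this may be set to 1, meaning the frame should be presented as 3 fields:
// the first field (as determined by whether AV_FRAME_FLAG_TOP_FIELD_FIRST is set), then the second field, then the first field again.
//
// 2. For progressive frames this may be set to a multiple of 2, meaning the frame lasts (repeat_pict + 2) / 2 normal frame durations.
//
// @note: this value is computed from the MPEG-2 repeat_first_field flag and its associated flags, H.264 pic_struct from the picture timing SEI,
// and their analogues in other codecs. It should typically be used only when higher-layer timing information is unavailable.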
int repeat_pict;
#if FF_API_INTERLACED_FRAME
/**
* The content of the picture is interlaced.
*
* @deprecated Use AV_FRAME_FLAG_INTERLACED instead
*/
// Whether the picture content is interlaced.
// Deprecated: use AV_FRAME_FLAG_INTERLACED instead.
attribute_deprecated
int interlaced_frame;
/**
* If the content is interlaced, is top field displayed first.
*
* @deprecated Use AV_FRAME_FLAG_TOP_FIELD_FIRST instead
*/
// If the content is interlaced, whether the top field is displayed first.
// Deprecated: use AV_FRAME_FLAG_TOP_FIELD_FIRST instead.
attribute_deprecated
int top_field_first;
#endif
#if FF_API_PALETTE_HAS_CHANGED
/**
* Tell user application that palette has changed from previous frame.
*/
// Tells the user application that the palette has changed since the previous frame
attribute_deprecated
int palette_has_changed;
#endif
/**
* Sample rate of the audio data.
*/
// Sample rate of the audio data
int sample_rate;
/**
* AVBuffer references backing the data for this frame. All the pointers in
* data and extended_data must point inside one of the buffers in buf or
* extended_buf. This array must be filled contiguously -- if buf[i] is
* non-NULL then buf[j] must also be non-NULL for all j < i.
*
* There may be at most one AVBuffer per data plane, so for video this array
* always contains all the references. For planar audio with more than
* AV_NUM_DATA_POINTERS channels, there may be more buffers than can fit in
* this array. Then the extra AVBufferRef pointers are stored in the
* extended_buf array.
*/
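// 1. AVBuffer references backing the data of this frame. Every pointer in data and extended_data must point inside one of the buffers in buf or extended_buf.
// The array must be filled contiguously: if buf[i] is non-NULL, then buf[j] must also be non-NULL for all j < i.
//
// 2. There is at most one AVBuffer per data plane, so for video this array always contains all the references.
// For planar audio with more than AV_NUM_DATA_POINTERS channels, there may be more buffers than fit in this array;
// the extra AVBufferRef pointers are then stored in the extended_buf array.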
AVBufferRef *buf[AV_NUM_DATA_POINTERS];
/**
* For planar audio which requires more than AV_NUM_DATA_POINTERS
* AVBufferRef pointers, this array will hold all the references which
* cannot fit into AVFrame.buf.
*
* Note that this is different from AVFrame.extended_data, which always
* contains all the pointers. This array only contains the extra pointers,
* which cannot fit into AVFrame.buf.
*
* This array is always allocated using av_malloc() by whoever constructs
* the frame. It is freed in av_frame_unref().
*/
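// 1. For planar audio that needs more than AV_NUM_DATA_POINTERS AVBufferRef pointers, this array holds all the references that do not fit in AVFrame.buf.
//
// 2. This differs from AVFrame.extended_data, which always contains all the pointers. This array contains only the extra pointers that do not fit in AVFrame.buf.
//
// 3. The array is always allocated with av_malloc() by whoever constructs the frame, and it is freed in av_frame_unref().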
AVBufferRef **extended_buf;
/**
* Number of elements in extended_buf.
*/
// Number of elements in extended_buf
int nb_extended_buf;
// Side data attached to the frame
AVFrameSideData **side_data;
// Number of side data entries
int nb_side_data;
/**
* @defgroup lavu_frame_flags AV_FRAME_FLAGS
* @ingroup lavu_frame
* Flags describing additional frame properties.
*
* @{
*/
/**
* The frame data may be corrupted, e.g. due to decoding errors.
*/
// The frame data may be corrupted, e.g. due to decoding errors
#define AV_FRAME_FLAG_CORRUPT (1 << 0)
/**
* A flag to mark frames that are keyframes.
*/
// Marks frames that are keyframes
#define AV_FRAME_FLAG_KEY (1 << 1)
/**
* A flag to mark the frames which need to be decoded, but shouldn't be output.
*/
// Marks frames that must be decoded but should not be output
#define AV_FRAME_FLAG_DISCARD (1 << 2)
/**
* A flag to mark frames whose content is interlaced.
*/
// Marks frames whose content is interlaced
#define AV_FRAME_FLAG_INTERLACED (1 << 3)
/**
* A flag to mark frames where the top field is displayed first if the content
* is interlaced.
*/
// Marks frames in which the top field is displayed first, if the content is interlaced
#define AV_FRAME_FLAG_TOP_FIELD_FIRST (1 << 4)
/**
* @}
*/
/**
* Frame flags, a combination of @ref lavu_frame_flags
*/
// Frame flags, a combination of the lavu_frame_flags values above
int flags;
/**
* MPEG vs JPEG YUV range.
* - encoding: Set by user
* - decoding: Set by libavcodec
*/
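// Value range of the content (MPEG/limited vs JPEG/full YUV range)
enum AVColorRange color_range;
// Chromaticity coordinates of the source primaries
enum AVColorPrimaries color_primaries;
// Color transfer characteristic
enum AVColorTransferCharacteristic color_trc;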
/**
* YUV colorspace type.
* - encoding: Set by user
* - decoding: Set by libavcodec
*/
// YUV colorspace type
enum AVColorSpace colorspace;
// Location of the chroma samples
enum AVChromaLocation chroma_location;
/**
* frame timestamp estimated using various heuristics, in stream time base
* - encoding: unused
* - decoding: set by libavcodec, read by user.
*/
// Frame timestamp estimated using various heuristics, in the stream time base
int64_t best_effort_timestamp;
#if FF_API_FRAME_PKT
/**
* reordered pos from the last AVPacket that has been input into the decoder
* - encoding: unused
* - decoding: Read by user.
* @deprecated use AV_CODEC_FLAG_COPY_OPAQUE to pass through arbitrary user
* data from packets to frames
*/
// Reordered pos from the last AVPacket that was input into the decoder.
// @deprecated: use AV_CODEC_FLAG_COPY_OPAQUE to pass arbitrary user data from packets to frames.
attribute_deprecated
int64_t pkt_pos;
#endif
/**
* metadata.
* - encoding: Set by user.
* - decoding: Set by libavcodec.
*/
// Metadata
AVDictionary *metadata;
/**
* decode error flags of the frame, set to a combination of
* FF_DECODE_ERROR_xxx flags if the decoder produced a frame, but there
* were errors during the decoding.
* - encoding: unused
* - decoding: set by libavcodec, read by user.
*/
// Decode error flags of the frame: set to a combination of FF_DECODE_ERROR_xxx flags if the decoder produced a frame but errors occurred during decoding
int decode_error_flags;
// Error flags:
// 1: invalid bitstream (INVALID_BITSTREAM)
// 2: missing reference frame (MISSING_REFERENCE)
// 4: error concealment active (CONCEALMENT_ACTIVE)
// 8: error while decoding slices (DECODE_SLICES)
#define FF_DECODE_ERROR_INVALID_BITSTREAM 1
#define FF_DECODE_ERROR_MISSING_REFERENCE 2
#define FF_DECODE_ERROR_CONCEALMENT_ACTIVE 4
#define FF_DECODE_ERROR_DECODE_SLICES 8
#if FF_API_FRAME_PKT
/**
* size of the corresponding packet containing the compressed
* frame.
* It is set to a negative value if unknown.
* - encoding: unused
* - decoding: set by libavcodec, read by user.
* @deprecated use AV_CODEC_FLAG_COPY_OPAQUE to pass through arbitrary user
* data from packets to frames
*/
// Size of the corresponding packet containing the compressed frame.
// Set to a negative value if unknown.
attribute_deprecated
int pkt_size;
#endif
/**
* For hwaccel-format frames, this should be a reference to the
* AVHWFramesContext describing the frame.
*/
// For hwaccel-format frames, this should be a reference to the AVHWFramesContext describing the frame
AVBufferRef *hw_frames_ctx;
/**
* Frame owner's private data.
*
* This field may be set by the code that allocates/owns the frame data.
* It is then not touched by any library functions, except:
* - a new reference to the underlying buffer is propagated by
* av_frame_copy_props() (and hence by av_frame_ref());
* - it is unreferenced in av_frame_unref();
* - on the caller's explicit request. E.g. libavcodec encoders/decoders
* will propagate a new reference to/from @ref AVPacket "AVPackets" if the
* caller sets @ref AV_CODEC_FLAG_COPY_OPAQUE.
*
* @see opaque the plain pointer analogue
*/
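// Private data of the frame owner (the reference-counted analogue of opaque).
// This field may be set by the code that allocates/owns the frame data.
// No library function touches it afterwards, except that:
// (1) av_frame_copy_props() (and hence av_frame_ref()) propagates a new reference to the underlying buffer;
// (2) av_frame_unref() unreferences it;
// (3) it is used on the caller's explicit request. E.g. libavcodec encoders/decoders propagate a new reference to/from AVPackets if the caller sets AV_CODEC_FLAG_COPY_OPAQUE.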
AVBufferRef *opaque_ref;
/**
* @anchor cropping
* @name Cropping
* Video frames only. The number of pixels to discard from the the
* top/bottom/left/right border of the frame to obtain the sub-rectangle of
* the frame intended for presentation.
* @{
*/
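// Video frames only.
// Number of pixels to discard from the top/bottom/left/right border of the frame to obtain the sub-rectangle intended for presentation.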
size_t crop_top;
size_t crop_bottom;
size_t crop_left;
size_t crop_right;
/**
* @}
*/
/**
* AVBufferRef for internal use by a single libav* library.
* Must not be used to transfer data between libraries.
* Has to be NULL when ownership of the frame leaves the respective library.
*
* Code outside the FFmpeg libs should never check or change the contents of the buffer ref.
*
* FFmpeg calls av_buffer_unref() on it when the frame is unreferenced.
* av_frame_copy_props() calls create a new reference with av_buffer_ref()
* for the target frame's private_ref field.
*/
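// AVBufferRef for internal use by a single libav* library. It must not be used to transfer data between libraries, and must be NULL when ownership of the frame leaves the respective library.
//
// 1. Code outside the FFmpeg libraries should never check or change the contents of this buffer ref.
//
// 2. FFmpeg calls av_buffer_unref() on it when the frame is unreferenced. av_frame_copy_props() creates a new reference
// with av_buffer_ref() for the target frame's private_ref field.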
AVBufferRef *private_ref;
/**
* Channel layout of the audio data.
*/
// Channel layout of the audio data (not just the channel count)
AVChannelLayout ch_layout;
/**
* Duration of the frame, in the same units as pts. 0 if unknown.
*/
// Duration of the frame, in the same units as pts; 0 if unknown
int64_t duration;
} AVFrame;
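The repeat_pict comment in the definition above describes a small piece of arithmetic: a frame lasts repeat_pict + 2 field durations. A minimal sketch of that rule follows. The helper names and the field_duration parameter are hypothetical and are not part of FFmpeg.

```c
#include <assert.h>
#include <stdint.h>

/* Number of field durations a frame occupies, following the repeat_pict
 * comment: total = repeat_pict + 2. */
static int64_t demo_frame_fields(int repeat_pict)
{
    assert(repeat_pict >= 0);
    return (int64_t)repeat_pict + 2;
}

/* Frame duration in caller-chosen time units. field_duration is a
 * hypothetical parameter: the duration of one normal field. */
static int64_t demo_frame_duration(int repeat_pict, int64_t field_duration)
{
    return demo_frame_fields(repeat_pict) * field_duration;
}
```

With repeat_pict = 0 a frame covers the usual 2 fields; with repeat_pict = 1 an interlaced frame covers 3 fields; with repeat_pict = 2 a progressive frame covers 4 fields, i.e. (2 + 2) / 2 = 2 normal frame durations.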
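The key information recorded in AVFrame includes:
(1) uint8_t *data[AV_NUM_DATA_POINTERS]: pointers to the data planes of the frame
(2) int linesize[AV_NUM_DATA_POINTERS]: size in bytes of one line of each plane (not a pixel count; it may include padding and may be negative)
(3) int width, height: width and height of the video
(4) int format: format of the frame
(5) int key_frame: whether the frame is a keyframe (deprecated in favor of the AV_FRAME_FLAG_KEY bit in flags)
(6) enum AVPictureType pict_type: picture type of the frame
(7) int64_t pts: presentation timestamp
(8) int64_t pkt_dts: DTS copied from the AVPacket that triggered returning this frame
(9) int quality: quality of the frame
(10) AVBufferRef *buf[AV_NUM_DATA_POINTERS]: AVBuffer references backing the data of this frame. Every pointer in data and extended_data must point into one of the buffers in buf or extended_buf
(11) enum AVColorXXX: color-related properties
(12) size_t crop_XXX: cropping of the displayed rectangle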
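Since key_frame, interlaced_frame and top_field_first are deprecated in favor of bits in flags, reading those properties becomes a bit test. The sketch below uses local DEMO_ copies of the bit values shown in the definition above so that it compiles without FFmpeg. Real code would use the AV_FRAME_FLAG_* macros from libavutil/frame.h, and the helper names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies of the bit values listed in the AVFrame definition. */
#define DEMO_FLAG_CORRUPT         (1 << 0)
#define DEMO_FLAG_KEY             (1 << 1)
#define DEMO_FLAG_DISCARD         (1 << 2)
#define DEMO_FLAG_INTERLACED      (1 << 3)
#define DEMO_FLAG_TOP_FIELD_FIRST (1 << 4)

/* Replacement for reading the deprecated key_frame field. */
static bool demo_is_keyframe(int flags)
{
    return (flags & DEMO_FLAG_KEY) != 0;
}

/* Top-field-first is only meaningful when the content is interlaced. */
static bool demo_top_field_first(int flags)
{
    return (flags & DEMO_FLAG_INTERLACED) &&
           (flags & DEMO_FLAG_TOP_FIELD_FIRST);
}

/* Frames marked DISCARD must be decoded but should not be output. */
static bool demo_should_output(int flags)
{
    return (flags & DEMO_FLAG_DISCARD) == 0;
}

/* Usage example: a keyframe from interlaced, top-field-first content. */
static void demo_flags_example(void)
{
    int flags = DEMO_FLAG_KEY | DEMO_FLAG_INTERLACED | DEMO_FLAG_TOP_FIELD_FIRST;
    assert(demo_is_keyframe(flags));
    assert(demo_top_field_first(flags));
    assert(demo_should_output(flags));
}
```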
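Compared with older versions, the AVFrame definition in newer versions adds content such as color handling, cropping and hardware codec information, and removes lower-level details such as qp, mv and dct, which have been moved elsewhere. One question remains: opaque, side_data and metadata all describe extra or private data, so how do they differ? My understanding is that all three carry additional information for audio/video processing:
(1) opaque: the user's private data. The user can read and modify it and must manage its lifetime. It suits cases where some context or configuration data is passed to the decoder. It has no required format and is declared as void*, so its meaning depends on the user.
(2) side data: used by encoders and decoders to store auxiliary data related to the current frame. It is accessed through the side_data and nb_side_data fields of AVFrame, and its management is handled by the codec. It commonly carries codec-specific information such as H.264 SEI (supplemental enhancement information).
(3) metadata: stores key-value metadata for use by codecs and other components. It is managed through AVDictionary, which provides an API for setting and getting entries. It is used to pass codec configuration or additional information about the video stream, such as subtitles. The key-value form makes it easy for codecs and other components to read and parse.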
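Following Dr. Lei's articles, data can be understood as follows:
(1) Packed formats (e.g. RGB24) are stored entirely in data[0].
(2) Planar formats (e.g. YUV420P) are split across data[0], data[1], data[2], ... (for YUV420P, data[0] holds Y, data[1] holds U and data[2] holds V).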
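Because each plane has its own linesize, which counts bytes, may include padding and may be negative, a plane has to be walked row by row and cannot be treated as one contiguous block. The sketch below shows this on plain byte arrays. demo_copy_plane and demo_chroma_dim are hypothetical helpers, with the stride arguments playing the role of linesize[].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy one plane row by row. row_bytes is the number of useful bytes per
 * row; a stride may be larger than row_bytes (padding) or negative
 * (the pointer then refers to the last row and rows are walked upwards). */
static void demo_copy_plane(uint8_t *dst, ptrdiff_t dst_stride,
                            const uint8_t *src, ptrdiff_t src_stride,
                            size_t row_bytes, int rows)
{
    for (ptrdiff_t y = 0; y < rows; y++)
        memcpy(dst + y * dst_stride, src + y * src_stride, row_bytes);
}

/* Width or height of a YUV420P chroma plane (U or V): half the luma
 * dimension, rounded up. */
static int demo_chroma_dim(int luma_dim)
{
    return (luma_dim + 1) / 2;
}

/* Usage example: flip a padded 4x2 byte plane vertically. */
static void demo_flip_example(void)
{
    /* 2 rows of 4 useful bytes, stored with a stride of 6 (2 padding bytes). */
    const uint8_t src[12] = { 1, 2, 3, 4, 0, 0,
                              5, 6, 7, 8, 0, 0 };
    uint8_t dst[8] = { 0 };

    /* Point at the last row and use a negative stride to read bottom-up. */
    demo_copy_plane(dst, 4, src + 6, -6, 4, 2);

    assert(dst[0] == 5 && dst[3] == 8); /* old bottom row is now on top */
    assert(dst[4] == 1 && dst[7] == 4); /* old top row is now at the bottom */
}
```

For a YUV420P frame the same loop would run three times: once for Y with the full width and height, and once each for U and V with the halved dimensions.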
enum AVPictureType pict_type
This enum defines the picture type of a frame
/**
* @}
* @}
* @defgroup lavu_picture Image related
*
* AVPicture types, pixel formats and basic image planes manipulation.
*
* @{
*/
enum AVPictureType {
AV_PICTURE_TYPE_NONE = 0, ///< Undefined
AV_PICTURE_TYPE_I, ///< Intra
AV_PICTURE_TYPE_P, ///< Predicted
AV_PICTURE_TYPE_B, ///< Bi-dir predicted
AV_PICTURE_TYPE_S, ///< S(GMC)-VOP MPEG-4
AV_PICTURE_TYPE_SI, ///< Switching Intra
AV_PICTURE_TYPE_SP, ///< Switching Predicted
AV_PICTURE_TYPE_BI, ///< BI type
};
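P and SP frames both use motion-compensated inter prediction. According to some material I consulted, an SP frame can reconstruct an identical picture from different reference frames, and is typically used for stream switching, splicing, random access and error recovery. I do not yet understand it well.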
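When logging decoded frames it is handy to print the picture type as a single character. libavutil provides av_get_picture_type_char() for this. The sketch below reproduces the same idea on a local DEMO_ copy of the enum so that it compiles on its own; the demo names are hypothetical.

```c
#include <assert.h>

/* Local mirror of the AVPictureType values listed above. */
enum DemoPictureType {
    DEMO_PICTURE_TYPE_NONE = 0,
    DEMO_PICTURE_TYPE_I,
    DEMO_PICTURE_TYPE_P,
    DEMO_PICTURE_TYPE_B,
    DEMO_PICTURE_TYPE_S,
    DEMO_PICTURE_TYPE_SI,
    DEMO_PICTURE_TYPE_SP,
    DEMO_PICTURE_TYPE_BI,
};

/* Map a picture type to one character for log output; lowercase letters
 * are used for the SI, SP and BI variants, and '?' for anything else. */
static char demo_picture_type_char(enum DemoPictureType type)
{
    switch (type) {
    case DEMO_PICTURE_TYPE_I:  return 'I';
    case DEMO_PICTURE_TYPE_P:  return 'P';
    case DEMO_PICTURE_TYPE_B:  return 'B';
    case DEMO_PICTURE_TYPE_S:  return 'S';
    case DEMO_PICTURE_TYPE_SI: return 'i';
    case DEMO_PICTURE_TYPE_SP: return 'p';
    case DEMO_PICTURE_TYPE_BI: return 'b';
    default:                   return '?';
    }
}

/* Usage example. */
static void demo_picture_type_example(void)
{
    assert(demo_picture_type_char(DEMO_PICTURE_TYPE_I) == 'I');
    assert(demo_picture_type_char(DEMO_PICTURE_TYPE_NONE) == '?');
}
```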
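AVFrameSideData **side_data
In the older FFmpeg version documented by Dr. Lei, side_data was not wrapped in a structure. Newer versions wrap it in one that holds the type, the data, the data size, plus metadata and buf.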
/**
* Structure to hold side data for an AVFrame.
*
* sizeof(AVFrameSideData) is not a part of the public ABI, so new fields may be added
* to the end with a minor bump.
*/
typedef struct AVFrameSideData {
enum AVFrameSideDataType type;
uint8_t *data;
size_t size;
AVDictionary *metadata;
AVBufferRef *buf;
} AVFrameSideData;
AVFrameSideDataType is defined as follows:
/**
* @defgroup lavu_frame AVFrame
* @ingroup lavu_data
*
* @{
* AVFrame is an abstraction for reference-counted raw multimedia data.
*/
enum AVFrameSideDataType {
/**
* The data is the AVPanScan struct defined in libavcodec.
*/
AV_FRAME_DATA_PANSCAN,
/**
* ATSC A53 Part 4 Closed Captions.
* A53 CC bitstream is stored as uint8_t in AVFrameSideData.data.
* The number of bytes of CC data is AVFrameSideData.size.
*/
AV_FRAME_DATA_A53_CC,
/**
* Stereoscopic 3d metadata.
* The data is the AVStereo3D struct defined in libavutil/stereo3d.h.
*/
AV_FRAME_DATA_STEREO3D,
/**
* The data is the AVMatrixEncoding enum defined in libavutil/channel_layout.h.
*/
AV_FRAME_DATA_MATRIXENCODING,
/**
* Metadata relevant to a downmix procedure.
* The data is the AVDownmixInfo struct defined in libavutil/downmix_info.h.
*/
AV_FRAME_DATA_DOWNMIX_INFO,
/**
* ReplayGain information in the form of the AVReplayGain struct.
*/
AV_FRAME_DATA_REPLAYGAIN,
///....
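Since side_data is an array of nb_side_data pointers, each tagged with a type, finding one kind of side data is a linear search by type. libavutil offers av_frame_get_side_data() for this. The sketch below shows the same lookup on simplified stand-in types: the Demo structs and enum are hypothetical and keep only the fields needed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for AVFrameSideDataType, AVFrameSideData and the
 * two side data fields of AVFrame. */
enum DemoSideDataType {
    DEMO_DATA_PANSCAN,
    DEMO_DATA_A53_CC,
    DEMO_DATA_STEREO3D,
};

typedef struct DemoSideData {
    enum DemoSideDataType type;
    uint8_t *data;
    size_t size;
} DemoSideData;

typedef struct DemoFrame {
    DemoSideData **side_data;
    int nb_side_data;
} DemoFrame;

/* Return the first entry of the requested type, or NULL if none exists. */
static const DemoSideData *demo_get_side_data(const DemoFrame *frame,
                                              enum DemoSideDataType type)
{
    for (int i = 0; i < frame->nb_side_data; i++) {
        if (frame->side_data[i]->type == type)
            return frame->side_data[i];
    }
    return NULL;
}

/* Usage example: a frame carrying a single closed-caption entry. */
static void demo_side_data_example(void)
{
    uint8_t cc_bytes[3] = { 0xFC, 0x94, 0x2C }; /* placeholder payload */
    DemoSideData cc = { DEMO_DATA_A53_CC, cc_bytes, sizeof(cc_bytes) };
    DemoSideData *entries[] = { &cc };
    DemoFrame frame = { entries, 1 };

    const DemoSideData *found = demo_get_side_data(&frame, DEMO_DATA_A53_CC);
    assert(found != NULL && found->size == 3);
    assert(demo_get_side_data(&frame, DEMO_DATA_STEREO3D) == NULL);
}
```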
CSDN : https://blog.csdn.net/weixin_42877471
Github : https://github.com/DoFulangChen