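H.265 Bitstream Parsing - EW帮帮网

This article gives a brief overview of some concepts in the H.265 bitstream. Concepts that are the same as in H.264 are not repeated here.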

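1. NALU

The NALU structure of an H.265 (HEVC) bitstream differs somewhat from that of AVC and can be seen as an extended version. An HEVC NALU is structured as follows:

NALU Header:
- forbidden_zero_bit: 1 bit, must be 0; any other value marks the NALU as invalid;
- nal_unit_type: 6 bits, the type of the NALU, indicating its function and content, e.g. VPS, SPS, PPS, SEI, slice;
- nuh_layer_id: 6 bits, the ID of the layer the NALU belongs to, mainly used for multi-layer coding;
- nuh_temporal_id_plus1: 3 bits, the temporal ID of the NALU plus 1 (TemporalId = nuh_temporal_id_plus1 - 1), mainly used for temporal scalability; the value 0 is not allowed.
NALU Payload: the data that follows the header.

As shown, the HEVC NALU header has grown to 2 bytes (AVC uses 1 byte), and the key field nal_unit_type is now 6 bits wide. Only the commonly used types are covered here: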
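The field widths listed above (1 + 6 + 6 + 3 bits) map onto the two header bytes as in the following minimal sketch. The struct and function names are made up for illustration and do not come from any library.

```cpp
#include <cassert>
#include <cstdint>

// Fields of the 2-byte HEVC NALU header (hypothetical helper type).
struct HevcNalHeader {
    bool    forbiddenZeroBit;  // 1 bit, must be 0 in a valid NALU
    uint8_t nalUnitType;       // 6 bits
    uint8_t nuhLayerId;        // 6 bits, split across both bytes
    uint8_t temporalIdPlus1;   // 3 bits, TemporalId = value - 1
};

// b0 and b1 are the first two bytes of the NALU.
inline HevcNalHeader parseHevcNalHeader(uint8_t b0, uint8_t b1) {
    HevcNalHeader h;
    h.forbiddenZeroBit = (b0 & 0x80) != 0;
    h.nalUnitType      = (b0 >> 1) & 0x3f;
    h.nuhLayerId       = static_cast<uint8_t>(((b0 & 0x01) << 5) | (b1 >> 3));
    h.temporalIdPlus1  = b1 & 0x07;
    return h;
}
```

For example, the header bytes 0x40 0x01 decode to nal_unit_type 32 (VPS), layer 0, TemporalId 0, and 0x26 0x01 decodes to nal_unit_type 19 (IDR_W_RADL).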

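nal_unit_type	NALU type
`0`	TRAIL_N: a slice segment of an ordinary trailing picture, i.e. a picture that follows its associated IRAP picture in output order. Most NALUs in a stream are of this type or TRAIL_R. "N" stands for sub-layer non-reference: the picture is not used as a prediction reference by other pictures of the same temporal sub-layer
`1`	TRAIL_R: same as TRAIL_N, except that the picture may be used as a prediction reference ("R" for reference)
`2`	TSA_N: Temporal Sub-layer Access. Marks a point at which the decoder can switch up to the sub-layer of this picture or to any higher temporal sub-layer; sub-layer non-reference variant
`3`	TSA_R: the reference variant of TSA_N
`4`	STSA_N: Step-wise Temporal Sub-layer Access. Marks a point at which the decoder can switch up to the sub-layer of this picture only; sub-layer non-reference variant
`5`	STSA_R: the reference variant of STSA_N
`6`	RADL_N: Random Access Decodable Leading. A leading picture (it precedes its IRAP picture in output order) that can still be decoded when decoding starts at that IRAP picture
`7`	RADL_R
`8`	RASL_N: Random Access Skipped Leading. A leading picture that may reference pictures preceding the IRAP picture in decoding order, so it is skipped when decoding starts at that IRAP picture
`9`	RASL_R
`16`-`18`	BLA_W_LP, BLA_W_RADL, BLA_N_LP: Broken Link Access pictures, a kind of IRAP picture
`19`	IDR_W_RADL: Instantaneous Decoding Refresh. "W_RADL" means "with RADL": the IDR picture may have associated RADL leading pictures. An IDR picture resets the decoder state and can be decoded without any earlier picture
`20`	IDR_N_LP: Instantaneous Decoding Refresh, No Leading Pictures. An IDR picture that has no associated leading pictures
`21`	CRA_NUT: Clean Random Access; "NUT" is short for NAL Unit Type. A random access picture that may have associated RASL or RADL leading pictures
`22`	RSV_IRAP_VCL22: "RSV" means reserved; a reserved IRAP (Intra Random Access Point) VCL type
`23`	RSV_IRAP_VCL23
`32`	VPS_NUT: Video Parameter Set. Carries parameters that apply to the whole video, such as the number of layers and sub-layers, profile/tier/level and optional timing information
`33`	SPS_NUT: Sequence Parameter Set. Carries the basic parameters of a coded video sequence, such as picture size, chroma format and bit depth
`34`	PPS_NUT: Picture Parameter Set. Carries the parameters needed to decode one or more pictures (frames)
`35`	AUD_NUT: Access Unit Delimiter. Marks the start of an access unit (one video frame and its associated data)
`36`	EOS_NUT: End of Sequence. Marks the end of a coded video sequence
`37`	EOB_NUT: End of Bitstream. Marks the end of the bitstream
`38`	FD_NUT: Filler Data. Padding inserted into the stream
`39`	PREFIX_SEI_NUT: prefix Supplemental Enhancement Information. Carries information that is not required to decode the pictures but helps with display or provides extra features
`40`	SUFFIX_SEI_NUT: suffix Supplemental Enhancement Information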

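From the table above, an IDR picture in H.265 has nal_unit_type 19 (IDR_W_RADL) or 20 (IDR_N_LP). Both are random access points and contain only I slices. IDR pictures are not the only intra-coded random access pictures, though: the IRAP range 16 to 23 also covers BLA (16 to 18) and CRA (21), so a keyframe check usually needs to look at the whole IRAP range rather than at 19 and 20 alone.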
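The numeric ranges in the type table can be turned into small predicates. This is a minimal sketch; the function names are hypothetical, and the ranges follow the H.265 numbering (0 to 31 are VCL types, 16 to 23 are IRAP types, 19 and 20 are IDR).

```cpp
#include <cassert>
#include <cstdint>

// nal_unit_type 0..31 are VCL types (some of them reserved); 32..63 are non-VCL.
inline bool isVcl(uint8_t type) { return type <= 31; }

// 16..23 is the IRAP range: BLA (16..18), IDR (19, 20), CRA (21), reserved (22, 23).
inline bool isIrap(uint8_t type) { return type >= 16 && type <= 23; }

// IDR_W_RADL and IDR_N_LP.
inline bool isIdr(uint8_t type) { return type == 19 || type == 20; }

// Types that currently carry a slice segment: 0..9 and 16..21.
inline bool isSliceSegment(uint8_t type) {
    return type <= 9 || (type >= 16 && type <= 21);
}
```

A player that wants to start decoding at a seek point would typically wait for isIrap() rather than isIdr(), since CRA pictures are valid entry points as well.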

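In an H.265 bitstream the NALUs usually appear in this order: VPS, then SPS, then PPS, followed by an IRAP picture (an IDR picture, for example) and then the other pictures.

For reference, the layout of the HEVC csd (codec-specific data) buffer: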
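To observe this order in a real stream, the NALUs have to be located first. The sketch below assumes an Annex B style byte stream in which every NALU is preceded by a 00 00 01 or 00 00 00 01 start code (the article itself does not state the container format), and simply lists the nal_unit_type values in stream order. The function name is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the nal_unit_type of every NALU in the buffer, in stream order.
// A NALU is assumed to start right after a 00 00 01 pattern; a 4-byte
// start code 00 00 00 01 ends with the same three bytes.
inline std::vector<uint8_t> listNalTypes(const uint8_t *data, size_t size) {
    std::vector<uint8_t> types;
    for (size_t i = 0; i + 3 < size; ++i) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            types.push_back((data[i + 3] >> 1) & 0x3f);  // first header byte
            i += 2;  // skip the remaining start code bytes
        }
    }
    return types;
}
```

For a stream that starts at a random access point, the result typically begins with 32, 33, 34 (VPS, SPS, PPS) followed by an IRAP type such as 19.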

typedef struct HEVCDecoderConfigurationRecord {
	uint8_t  configurationVersion;
	uint8_t  general_profile_space;
	uint8_t  general_tier_flag;
	uint8_t  general_profile_idc;
	uint32_t general_profile_compatibility_flags;
	uint64_t general_constraint_indicator_flags;
	uint8_t  general_level_idc;
	uint16_t min_spatial_segmentation_idc;
	uint8_t  parallelismType;
	uint8_t  chromaFormat;
	uint8_t  bitDepthLumaMinus8;
	uint8_t  bitDepthChromaMinus8;
	uint16_t avgFrameRate;
	uint8_t  constantFrameRate;
	uint8_t  numTemporalLayers;
	uint8_t  temporalIdNested;
	uint8_t  lengthSizeMinusOne;
	uint8_t  numOfArrays;
	HVCCNALUnitArray *array;
} HEVCDecoderConfigurationRecord;

For reference, Android code that builds the HEVC csd buffer:

sp<MetaData> MakeHEVCCodecSpecificData(const sp<ABuffer> &accessUnit){
	ALOGI("MakeHEVCCodecSpecificData ++");
	const uint8_t *data = accessUnit->data();
	size_t size = accessUnit->size();
	size_t numOfParamSets = 0;
	const uint8_t VPS_NAL_TYPE = 32;
	const uint8_t SPS_NAL_TYPE = 33;
	const uint8_t PPS_NAL_TYPE = 34;
	
	// Find the VPS. Only the first VPS is used;
	// TODO: check whether all VPSs need to be sent to the decoder.
	sp<ABuffer> videoParamSet = FindHEVCNAL(data, size, VPS_NAL_TYPE, NULL);
	if (videoParamSet == NULL) {
		ALOGW("no vps found !!!");
		//return NULL;
	}else{
		numOfParamSets++;
		ALOGI("find vps, size =%zu",videoParamSet->size());
	}
	
	// Find the SPS. Only the first SPS is used;
	// TODO: check whether all SPSs need to be sent to the decoder.
	sp<ABuffer> seqParamSet = FindHEVCNAL(data, size, SPS_NAL_TYPE, NULL);
	if (seqParamSet == NULL) {
		ALOGW("no sps found !!!");
		return NULL;
	}else{
		numOfParamSets++;
		ALOGI("find sps, size =%zu",seqParamSet->size());
	}

	
	int32_t width, height;
	FindHEVCDimensions(seqParamSet, &width, &height);
	
	// Find the PPS. Only the first PPS is used;
	// TODO: check whether all PPSs need to be sent to the decoder.
	size_t stopOffset;
	sp<ABuffer> picParamSet = FindHEVCNAL(data, size, PPS_NAL_TYPE, &stopOffset);
	if (picParamSet == NULL) {
		ALOGW("no pps found !!!");
		return NULL;
	}else{	
		numOfParamSets++;
		ALOGI("find pps, size =%zu",picParamSet->size());
	}
	
	int32_t numbOfArrays = numOfParamSets;
	int32_t paramSetSize = 0;

	//only save one vps,sps,pps in codecConfig data
	if(videoParamSet != NULL){
		paramSetSize += 1 + 2 + 2 + videoParamSet->size();
	}
	if(seqParamSet != NULL){
		paramSetSize += 1 + 2 + 2 + seqParamSet->size();
	}
	if(picParamSet != NULL){
		paramSetSize += 1 + 2 + 2 + picParamSet->size();
	}
	
	size_t csdSize =
		1 + 1 + 4 + 6 + 1 + 2 + 1 + 1 + 1 + 1 + 2 + 1 
		+ 1 + paramSetSize;
	ALOGI("MakeHEVCCodecSpecificData,codec config data size =%zu",csdSize);
	sp<ABuffer> csd = new ABuffer(csdSize);
	uint8_t *out = csd->data();

	*out++ = 0x01;	// configurationVersion
	
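	/* Copy the profile_tier_level info from the SPS, containing
	1 byte: general_profile_space(2), general_tier_flag(1), general_profile_idc(5)
	4 bytes: general_profile_compatibility_flags, 6 bytes: general_constraint_indicator_flags
	1 byte: general_level_idc
	The offset 3 skips the 2-byte NALU header and the byte that holds
	sps_video_parameter_set_id, sps_max_sub_layers_minus1 and sps_temporal_id_nesting_flag.
	NOTE: this copies raw NALU bytes, so it is only correct if no emulation
	prevention byte (0x03 after two zero bytes) falls inside these 12 bytes.
	*/
	memcpy(out,seqParamSet->data() + 3, 1 + 4 + 6 + 1);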

	uint8_t profile = out[0] & 0x1f;
	uint8_t level = out[11];
	
	out += 1 + 4 + 6 + 1;

	*out++ = 0xf0; // reserved(4 bits, 1111b) + min_spatial_segmentation_idc (high 4 bits)
	*out++ = 0x00; // min_spatial_segmentation_idc (low 8 bits)
	*out++ = 0xfc; // reserved(6 bits, 111111b) + parallelismType(2) (0=unknown, 1=slices, 2=tiles, 3=WPP)
	*out++ = 0xfd; // reserved(6 bits, 111111b) + chromaFormat(2) (0=monochrome, 1=4:2:0, 2=4:2:2, 3=4:4:4); hard-coded to 4:2:0

	*out++ = 0xf8; // reserved(5 bits, 11111b) + bitDepthLumaMinus8(3); hard-coded to 8 bits
	*out++ = 0xf8; // reserved(5 bits, 11111b) + bitDepthChromaMinus8(3); hard-coded to 8 bits
	
	uint16_t avgFrameRate = 0;
	*out++ = avgFrameRate >> 8; // avgFrameRate (16 bits, in units of frames/256 seconds; 0 means unspecified)
	*out++ = avgFrameRate & 0xff;

	// constantFrameRate(2 bits, 0 = may not be constant), numTemporalLayers(3 bits),
	// temporalIdNested(1 bit), lengthSizeMinusOne(2 bits, 3 = 4-byte NALU lengths)
	*out++ = 0x03;

	*out++ = numbOfArrays;//numOfArrays

	if(videoParamSet != NULL){
		*out++ = 0x3f & VPS_NAL_TYPE; //array_completeness(1bit)+reserved(1bit,0)+NAL_unit_type(6bits)

		//num of vps
		uint16_t numNalus = 1;
		*out++ = numNalus >> 8;
		*out++ =  numNalus & 0xff;

		//vps nal length
		*out++ = videoParamSet->size() >> 8;
		*out++ = videoParamSet->size() & 0xff;

		memcpy(out,videoParamSet->data(),videoParamSet->size());
		out += videoParamSet->size();
	}

	if(seqParamSet != NULL){
		
		*out++ = 0x3f & SPS_NAL_TYPE; //array_completeness(1bit)+reserved(1bit,0)+NAL_unit_type(6bits)
		
		//num of sps
		uint16_t numNalus = 1;
		*out++ = numNalus >> 8;
		*out++ = numNalus & 0xff;

		//sps nal length
		*out++ = seqParamSet->size() >> 8;
		*out++ = seqParamSet->size() & 0xff;

		memcpy(out,seqParamSet->data(),seqParamSet->size());
		out += seqParamSet->size();

	}
	if(picParamSet != NULL){
	
		*out++ = 0x3f & PPS_NAL_TYPE; //array_completeness(1bit)+reserved(1bit,0)+NAL_unit_type(6bits)
		
		//num of pps
		uint16_t numNalus = 1;
		*out++ = numNalus >> 8;
		*out++ = numNalus & 0xff;

		//pps nal length
		*out++ = picParamSet->size() >> 8;
		*out++ = picParamSet->size() & 0xff;

		memcpy(out,picParamSet->data(),picParamSet->size());
		// last write into csd, so out does not need to be advanced
	}


	sp<MetaData> meta = new MetaData;
	meta->setCString(kKeyMIMEType, MEDIA_MIMETYPE_VIDEO_HEVC);

	meta->setData(kKeyHVCC, kTypeHVCC, csd->data(), csd->size());
	meta->setInt32(kKeyWidth, width);
	meta->setInt32(kKeyHeight, height);

	// general_level_idc is 30 times the level number (e.g. 93 means level 3.1)
	ALOGI("found HEVC codec config (%d x %d, %s-profile level %d.%d)",
		 width, height, HEVCProfileToString(profile), level / 30, (level % 30) / 3);
	ALOGI("MakeHEVCCodecSpecificData --");
	return meta;
}
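Reading the record back is a handy way to check the layout that the function above writes: 23 fixed bytes ending with numOfArrays, then for each array 1 byte of NAL type, a 2-byte NALU count, and for each NALU a 2-byte length plus the payload. The following is a minimal sketch of such a reader; HvccNal and readHvccArrays are hypothetical names, not part of the Android sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct HvccNal {
    uint8_t type;                  // NAL_unit_type stored in the array header
    std::vector<uint8_t> payload;  // NALU bytes as stored in the record
};

// Walk the parameter set arrays of an hvcC record.
// Returns an empty vector if the buffer is truncated.
inline std::vector<HvccNal> readHvccArrays(const uint8_t *csd, size_t size) {
    std::vector<HvccNal> out;
    const size_t kFixedSize = 23;  // bytes up to and including numOfArrays
    if (size < kFixedSize) return {};
    size_t pos = kFixedSize - 1;
    const uint8_t numArrays = csd[pos++];
    for (uint8_t a = 0; a < numArrays; ++a) {
        if (size - pos < 3) return {};
        const uint8_t type = csd[pos] & 0x3f;  // drop array_completeness + reserved
        const size_t numNalus = (csd[pos + 1] << 8) | csd[pos + 2];
        pos += 3;
        for (size_t n = 0; n < numNalus; ++n) {
            if (size - pos < 2) return {};
            const size_t len = (csd[pos] << 8) | csd[pos + 1];
            pos += 2;
            if (size - pos < len) return {};
            out.push_back({type, std::vector<uint8_t>(csd + pos, csd + pos + len)});
            pos += len;
        }
    }
    return out;
}
```

Applied to the buffer built by MakeHEVCCodecSpecificData, this would be expected to return the VPS (if one was found), the SPS and the PPS, in that order.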

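2. Slice

How can we tell whether a slice is the first slice of a picture?

Generally, a nal_unit_type in the range 0 to 9 or 16 to 21 means that the NALU carries a slice (more precisely, a slice segment). A slice consists of a header and data. The slice header contains, among others, the following fields: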

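first_slice_segment_in_pic_flag: whether the current slice segment is the first one of the picture;
slice_pic_parameter_set_id: which Picture Parameter Set (PPS) is used to decode the current slice;
dependent_slice_segment_flag: whether this slice segment depends on a preceding slice segment;
slice_segment_address: the position of the slice segment within the picture;
slice_type: the type of the current slice: I (intra coded), P (predictive coded) or B (bi-predictive coded);
pic_output_flag: whether the picture is to be output;
colour_plane_id: when the three colour planes of a 4:4:4 picture are coded separately, the colour plane the current slice belongs to;
slice_pic_order_cnt_lsb: the least significant bits of the picture order count of the current picture;
short_term_ref_pic_set: the configuration of the short-term reference picture set;
num_ref_idx_active_override_flag: whether the number of active reference indices given in the slice header overrides the default from the PPS;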

The first bit of the slice header therefore tells whether the NALU is the first NALU of a picture. If first_slice_segment_in_pic_flag is 1, the current slice segment is the first slice segment of the picture; if it is 0, it is not.
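Because the slice header starts directly after the 2-byte NALU header, the flag is the most significant bit of the third NALU byte. A minimal sketch of that check follows; the function name is hypothetical, and nalu is assumed to point at the first header byte with any start code already removed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if the NALU is a slice segment whose first_slice_segment_in_pic_flag is 1.
inline bool isFirstSliceSegmentInPic(const uint8_t *nalu, size_t size) {
    if (size < 3) return false;  // 2 header bytes + at least 1 payload byte
    const uint8_t type = (nalu[0] >> 1) & 0x3f;
    const bool isSlice = type <= 9 || (type >= 16 && type <= 21);
    // The flag is the first bit of the slice segment header.
    return isSlice && (nalu[2] & 0x80) != 0;
}
```

A frame assembler can use this as one of its access unit boundary signals: when the function returns true, the NALU starts a new picture.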

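3. AUD

First, one concept: an access unit is the set made up of one picture of the video sequence and its associated data. An access unit may contain one or more NALUs. Some are VCL NALUs that carry the coded slices of the picture (I, P or B); others are non-VCL (Non-Video Coding Layer) NALUs that carry metadata or other auxiliary information, such as SEI.

The AUD (Access Unit Delimiter) marks the boundary between access units. Its structure is as follows: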

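pic_type: 3 bits, indicates which slice types may be present in the picture of this access unit;
rbsp_stop_one_bit: 1 bit, must be 1;
rbsp_alignment_zero_bits: zero bits that follow until the next byte boundary is reached;

The H.265 AUD carries no aud_irap_or_idr_flag bit. A leading 1-bit IRAP flag followed by aud_pic_type is the layout of the H.266 (VVC) AUD, not of the H.265 one.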

pic_type describes the picture of the access unit that the AUD opens, i.e. the picture that follows the AUD. Its values map to the slice types that may appear in that picture:

0: I slices only
1: P and I slices
2: B, P and I slices
3-7: reserved for future extension
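Reading the picture type out of an AUD takes only a shift. The sketch below assumes the H.265 access unit delimiter layout, in which pic_type occupies the top three bits of the first payload byte (a commonly seen AUD is the byte sequence 46 01 50, which carries pic_type 2). The function name is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns pic_type (0..7) of an AUD NALU, or -1 if the buffer is not an AUD.
// nalu is assumed to point at the first header byte, start code removed.
inline int audPicType(const uint8_t *nalu, size_t size) {
    if (size < 3) return -1;
    const uint8_t type = (nalu[0] >> 1) & 0x3f;
    if (type != 35) return -1;  // AUD_NUT
    return nalu[2] >> 5;        // top 3 bits of the first payload byte
}
```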

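4. How to extract one complete frame from the bitstream

dequeueAccessUnitH265

Follow the official account 《青山渺渺》 to read the full content. If you have questions, send a private message through the account, or join the audio and video development discussion group.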
