首页 > 其他分享 >使用NVENC API编码D3D12材质

使用NVENC API编码D3D12材质

时间:2025-01-14 19:45:13浏览次数:1  
标签:编码 ENC encoder NVENC API nvencAPI D3D12 GUID NV

前言

  之前在写图形引擎的时候就有个想法,想让我的图形引擎以一个固定的时间步进(DeltaTime)来渲染材质,并且把连续渲染的材质以视频的方式保存下来。其实我很久之前就把这个东西实现了,最近也是修改了下代码,准备写一篇关于这个的随笔。

介绍

  看了些网上的视频以及相关的文章,把连续渲染的材质保存为视频的过程,其实是先将材质编码为压缩视频帧,然后用容器来封装。到今天为止,压缩视频帧有三种编码格式分别为:H264HEVCAV1。去NVIDIA的官网上面看了下,发现我目前使用的GPU支持H264HEVC这两种编码格式,而且相关的API已经写好了,直接从动态链接库载入相关的函数就行了。接下来我们让图形引擎以60帧的速率渲染出BGRA材质,然后用NVENC API把材质编码为H264压缩视频帧,最后借助FFMPEG把H264压缩视频帧封装到MP4容器。

准备过程

编码器的初始化

  首先我们需要载入相关的API,用NvEncodeAPICreateInstance这个函数就行,而这个函数就在nvEncodeAPI64.dll里。

moduleNvEncAPI = LoadLibraryA("nvEncodeAPI64.dll");

if (moduleNvEncAPI == 0)
{
	throw "unable to load nvEncodeAPI64.dll";
}

NVENCSTATUS(__stdcall * NVENCAPICreateInstance)(NV_ENCODE_API_FUNCTION_LIST*) = (NVENCSTATUS(*)(NV_ENCODE_API_FUNCTION_LIST*))GetProcAddress(moduleNvEncAPI, "NvEncodeAPICreateInstance");

std::cout << "[class NvidiaEncoder] api instance create status " << NVENCAPICreateInstance(&nvencAPI) << "\n";

  NvEncodeAPICreateInstance要求传入一个名为NV_ENCODE_API_FUNCTION_LIST结构体的地址,这个结构体实际上是一个和编码有关的函数表,接下来我们可以用nvencAPI来为我们实现编码的具体操作。载入API后我们的下一步操作就是打开编码会议。

NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS sessionParams = { NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS_VER };
sessionParams.device = GraphicsDevice::get();
sessionParams.deviceType = NV_ENC_DEVICE_TYPE_DIRECTX;
sessionParams.apiVersion = NVENCAPI_VERSION;

NVENCCALL(nvencAPI.nvEncOpenEncodeSessionEx(&sessionParams, &encoder));

  其中device是ID3D12Device指针,打开编码会议后NVENC API会给我们一个编码器实例也就是上面代码的encoder,接下来我们需要初始化encoder。初始化encoder需要提供一个叫NV_ENC_INITIALIZE_PARAMS的结构体,由它的名字就可以知道这个结构体储存的是编码器的初始化参数。这个结构体有很多成员,官方的文档说明了该如何设置这个结构体,而我们要关注的成员有以下这些:bufferFormat、encodeConfig、encodeGUID、presetGUID、tuningInfo、encodeWidth、encodeHeight、darWidth、darHeight、maxEncodeWidth、maxEncodeHeight、frameRateNum、frameRateDen。接下来我们介绍下这些参数是什么。

bufferFormat:输入的材质格式,如果引擎渲染出的是DXGI_B8G8R8A8_UNORM的材质,这个参数应该设置为NV_ENC_BUFFER_FORMAT_ARGB

encodeConfig:和编码有关的配置,它涉及到很多和视频编码相关的知识,例如比特率控制模式、平均比特率、最大比特率等等。我们可以通过nvEncGetEncodePresetConfigEx得到一个粗糙的配置,然后可以在这基础上进行修改。

NV_ENC_PRESET_CONFIG presetConfig = { NV_ENC_PRESET_CONFIG_VER,{NV_ENC_CONFIG_VER} };

NVENCCALL(nvencAPI.nvEncGetEncodePresetConfigEx(encoder, codec, preset, tuningInfo, &presetConfig));

NV_ENC_CONFIG config;
memcpy(&config, &presetConfig.presetCfg, sizeof(NV_ENC_CONFIG));
config.profileGUID = profile;

//high quality encode
config.gopLength = 120;
config.frameIntervalP = 1;
config.rcParams.enableLookahead = 0;
config.rcParams.rateControlMode = NV_ENC_PARAMS_RC_VBR;
config.rcParams.averageBitRate = 20000000U;
config.rcParams.maxBitRate = 40000000U;
config.rcParams.vbvBufferSize = config.rcParams.maxBitRate * 4;
config.rcParams.enableAQ = 1;

这里的profileGUID指的是与编码格式对应的规格,指代的是一系列用于编码的特性,应该和具体的应用场景有关

// =========================================================================================
// *   Encode Profile GUIDS supported by the NvEncodeAPI interface.
// =========================================================================================

// {BFD6F8E7-233C-4341-8B3E-4818523803F4}
static const GUID NV_ENC_CODEC_PROFILE_AUTOSELECT_GUID =
{ 0xbfd6f8e7, 0x233c, 0x4341, { 0x8b, 0x3e, 0x48, 0x18, 0x52, 0x38, 0x3, 0xf4 } };

// {0727BCAA-78C4-4c83-8C2F-EF3DFF267C6A}
static const GUID  NV_ENC_H264_PROFILE_BASELINE_GUID =
{ 0x727bcaa, 0x78c4, 0x4c83, { 0x8c, 0x2f, 0xef, 0x3d, 0xff, 0x26, 0x7c, 0x6a } };

// {60B5C1D4-67FE-4790-94D5-C4726D7B6E6D}
static const GUID  NV_ENC_H264_PROFILE_MAIN_GUID =
{ 0x60b5c1d4, 0x67fe, 0x4790, { 0x94, 0xd5, 0xc4, 0x72, 0x6d, 0x7b, 0x6e, 0x6d } };

// {E7CBC309-4F7A-4b89-AF2A-D537C92BE310}
static const GUID NV_ENC_H264_PROFILE_HIGH_GUID =
{ 0xe7cbc309, 0x4f7a, 0x4b89, { 0xaf, 0x2a, 0xd5, 0x37, 0xc9, 0x2b, 0xe3, 0x10 } };

// {7AC663CB-A598-4960-B844-339B261A7D52}
static const GUID  NV_ENC_H264_PROFILE_HIGH_444_GUID =
{ 0x7ac663cb, 0xa598, 0x4960, { 0xb8, 0x44, 0x33, 0x9b, 0x26, 0x1a, 0x7d, 0x52 } };

// {40847BF5-33F7-4601-9084-E8FE3C1DB8B7}
static const GUID NV_ENC_H264_PROFILE_STEREO_GUID =
{ 0x40847bf5, 0x33f7, 0x4601, { 0x90, 0x84, 0xe8, 0xfe, 0x3c, 0x1d, 0xb8, 0xb7 } };

// {B405AFAC-F32B-417B-89C4-9ABEED3E5978}
static const GUID NV_ENC_H264_PROFILE_PROGRESSIVE_HIGH_GUID =
{ 0xb405afac, 0xf32b, 0x417b, { 0x89, 0xc4, 0x9a, 0xbe, 0xed, 0x3e, 0x59, 0x78 } };

// {AEC1BD87-E85B-48f2-84C3-98BCA6285072}
static const GUID NV_ENC_H264_PROFILE_CONSTRAINED_HIGH_GUID =
{ 0xaec1bd87, 0xe85b, 0x48f2, { 0x84, 0xc3, 0x98, 0xbc, 0xa6, 0x28, 0x50, 0x72 } };

// {B514C39A-B55B-40fa-878F-F1253B4DFDEC}
static const GUID NV_ENC_HEVC_PROFILE_MAIN_GUID =
{ 0xb514c39a, 0xb55b, 0x40fa, { 0x87, 0x8f, 0xf1, 0x25, 0x3b, 0x4d, 0xfd, 0xec } };

// {fa4d2b6c-3a5b-411a-8018-0a3f5e3c9be5}
static const GUID NV_ENC_HEVC_PROFILE_MAIN10_GUID =
{ 0xfa4d2b6c, 0x3a5b, 0x411a, { 0x80, 0x18, 0x0a, 0x3f, 0x5e, 0x3c, 0x9b, 0xe5 } };

// For HEVC Main 444 8 bit and HEVC Main 444 10 bit profiles only
// {51ec32b5-1b4c-453c-9cbd-b616bd621341}
static const GUID NV_ENC_HEVC_PROFILE_FREXT_GUID =
{ 0x51ec32b5, 0x1b4c, 0x453c, { 0x9c, 0xbd, 0xb6, 0x16, 0xbd, 0x62, 0x13, 0x41 } };

// {5f2a39f5-f14e-4f95-9a9e-b76d568fcf97}
static const GUID NV_ENC_AV1_PROFILE_MAIN_GUID =
{ 0x5f2a39f5, 0xf14e, 0x4f95, { 0x9a, 0x9e, 0xb7, 0x6d, 0x56, 0x8f, 0xcf, 0x97 } };

encodeGUID:编码格式,可选的有三个

// =========================================================================================
// Encode Codec GUIDS supported by the NvEncodeAPI interface.
// =========================================================================================

// {6BC82762-4E63-4ca4-AA85-1E50F321F6BF}
static const GUID NV_ENC_CODEC_H264_GUID =
{ 0x6bc82762, 0x4e63, 0x4ca4, { 0xaa, 0x85, 0x1e, 0x50, 0xf3, 0x21, 0xf6, 0xbf } };

// {790CDC88-4522-4d7b-9425-BDA9975F7603}
static const GUID NV_ENC_CODEC_HEVC_GUID =
{ 0x790cdc88, 0x4522, 0x4d7b, { 0x94, 0x25, 0xbd, 0xa9, 0x97, 0x5f, 0x76, 0x3 } };

// {0A352289-0AA7-4759-862D-5D15CD16D254}
static const GUID NV_ENC_CODEC_AV1_GUID =
{ 0x0a352289, 0x0aa7, 0x4759, { 0x86, 0x2d, 0x5d, 0x15, 0xcd, 0x16, 0xd2, 0x54 } };

presetGUID:预设等级,官方的介绍如下。数字越高质量越高但是性能越低,对于H264编码格式有P3到P7可选

// =========================================================================================
// *   Preset GUIDS supported by the NvEncodeAPI interface.
// =========================================================================================
// Performance degrades and quality improves as we move from P1 to P7. Presets P3 to P7 for H264 and Presets P2 to P7 for HEVC have B frames enabled by default
// for HIGH_QUALITY and LOSSLESS tuning info, and will not work with Weighted Prediction enabled. In case Weighted Prediction is required, disable B frames by
// setting frameIntervalP = 1
// {FC0A8D3E-45F8-4CF8-80C7-298871590EBF}
static const GUID NV_ENC_PRESET_P1_GUID   =
{ 0xfc0a8d3e, 0x45f8, 0x4cf8, { 0x80, 0xc7, 0x29, 0x88, 0x71, 0x59, 0xe, 0xbf } };

// {F581CFB8-88D6-4381-93F0-DF13F9C27DAB}
static const GUID NV_ENC_PRESET_P2_GUID   =
{ 0xf581cfb8, 0x88d6, 0x4381, { 0x93, 0xf0, 0xdf, 0x13, 0xf9, 0xc2, 0x7d, 0xab } };

// {36850110-3A07-441F-94D5-3670631F91F6}
static const GUID NV_ENC_PRESET_P3_GUID   =
{ 0x36850110, 0x3a07, 0x441f, { 0x94, 0xd5, 0x36, 0x70, 0x63, 0x1f, 0x91, 0xf6 } };

// {90A7B826-DF06-4862-B9D2-CD6D73A08681}
static const GUID NV_ENC_PRESET_P4_GUID   =
{ 0x90a7b826, 0xdf06, 0x4862, { 0xb9, 0xd2, 0xcd, 0x6d, 0x73, 0xa0, 0x86, 0x81 } };

// {21C6E6B4-297A-4CBA-998F-B6CBDE72ADE3}
static const GUID NV_ENC_PRESET_P5_GUID   =
{ 0x21c6e6b4, 0x297a, 0x4cba, { 0x99, 0x8f, 0xb6, 0xcb, 0xde, 0x72, 0xad, 0xe3 } };

// {8E75C279-6299-4AB6-8302-0B215A335CF5}
static const GUID NV_ENC_PRESET_P6_GUID   =
{ 0x8e75c279, 0x6299, 0x4ab6, { 0x83, 0x2, 0xb, 0x21, 0x5a, 0x33, 0x5c, 0xf5 } };

// {84848C12-6F71-4C13-931B-53E283F57974}
static const GUID NV_ENC_PRESET_P7_GUID   =
{ 0x84848c12, 0x6f71, 0x4c13, { 0x93, 0x1b, 0x53, 0xe2, 0x83, 0xf5, 0x79, 0x74 } };

tuningInfo:调教信息,用来应对不同的视频编码用途,例如串流、云游戏、视频会议等等,有以下可选

typedef enum NV_ENC_TUNING_INFO
{
    NV_ENC_TUNING_INFO_UNDEFINED         = 0,                                     /**< Undefined tuningInfo. Invalid value for encoding. */
    NV_ENC_TUNING_INFO_HIGH_QUALITY      = 1,                                     /**< Tune presets for latency tolerant encoding.*/
    NV_ENC_TUNING_INFO_LOW_LATENCY       = 2,                                     /**< Tune presets for low latency streaming.*/
    NV_ENC_TUNING_INFO_ULTRA_LOW_LATENCY = 3,                                     /**< Tune presets for ultra low latency streaming.*/
    NV_ENC_TUNING_INFO_LOSSLESS          = 4,                                     /**< Tune presets for lossless encoding.*/
    NV_ENC_TUNING_INFO_COUNT                                                      /**< Count number of tuningInfos. Invalid value. */
}NV_ENC_TUNING_INFO;

encodeWidth:编码材质的宽度

encodeHeight:编码材质的高度

darWidth:纵横比分子

darHeight:纵横比分母

maxEncodeWidth:编码材质的最大宽度

maxEncodeHeight:编码材质的最大高度

frameRateNum:帧率分子

frameRateDen:帧率分母

NVENC的编程指南提供了如下推荐设置
img

我使用的NV_ENC_INITIALIZE_PARAMS设置如下:

static constexpr NV_ENC_BUFFER_FORMAT bufferFormat = NV_ENC_BUFFER_FORMAT_ARGB;

static constexpr NV_ENC_TUNING_INFO tuningInfo = NV_ENC_TUNING_INFO_HIGH_QUALITY;

const GUID codec = NV_ENC_CODEC_H264_GUID;

const GUID preset = NV_ENC_PRESET_P7_GUID;

const GUID profile = NV_ENC_H264_PROFILE_HIGH_GUID;
NV_ENC_PRESET_CONFIG presetConfig = { NV_ENC_PRESET_CONFIG_VER,{NV_ENC_CONFIG_VER} };

NVENCCALL(nvencAPI.nvEncGetEncodePresetConfigEx(encoder, codec, preset, tuningInfo, &presetConfig));

NV_ENC_CONFIG config;
memcpy(&config, &presetConfig.presetCfg, sizeof(NV_ENC_CONFIG));
config.version = NV_ENC_CONFIG_VER;
config.profileGUID = profile;

//high quality encode
config.gopLength = 120;
config.frameIntervalP = 1;
config.rcParams.enableLookahead = 0;
config.rcParams.rateControlMode = NV_ENC_PARAMS_RC_VBR;
config.rcParams.averageBitRate = 20000000U;
config.rcParams.maxBitRate = 40000000U;
config.rcParams.vbvBufferSize = config.rcParams.maxBitRate * 4;
config.rcParams.enableAQ = 1;

NV_ENC_INITIALIZE_PARAMS encoderParams = { NV_ENC_INITIALIZE_PARAMS_VER };
encoderParams.bufferFormat = bufferFormat;
encoderParams.encodeConfig = &config;
encoderParams.encodeGUID = codec;
encoderParams.presetGUID = preset;
encoderParams.tuningInfo = tuningInfo;
encoderParams.encodeWidth = Graphics::getWidth();
encoderParams.encodeHeight = Graphics::getHeight();
encoderParams.darWidth = Graphics::getWidth();
encoderParams.darHeight = Graphics::getHeight();
encoderParams.maxEncodeWidth = Graphics::getWidth();
encoderParams.maxEncodeHeight = Graphics::getHeight();
encoderParams.frameRateNum = frameRate;
encoderParams.frameRateDen = 1;
encoderParams.enablePTD = 1;
encoderParams.enableOutputInVidmem = 0;
encoderParams.enableEncodeAsync = 0;

设置好后我们可以用它来初始化编码器

NVENCCALL(nvencAPI.nvEncInitializeEncoder(encoder, &encoderParams));

FFMPEG的初始化

  初始化编码器后,我们就可以开始用相关的API把材质编码为压缩视频帧了。但是我们还有另外一个问题,也就是压缩视频帧封装的处理。我准备借助FFMPEG来封装压缩视频帧。我们首先初始化一个输出容器类型为mp4,输出文件名称为output.mp4的输出上下文。

avformat_alloc_output_context2(&outCtx, nullptr, "mp4", "output.mp4");

接下来为输出上下文初始化一个视频流,并且设置其编码格式、编码数据类型、帧的宽、帧的高

outStream = avformat_new_stream(outCtx, nullptr);

outStream->id = 0;

AVCodecParameters* vpar = outStream->codecpar;
vpar->codec_id = AV_CODEC_ID_H264;
vpar->codec_type = AVMEDIA_TYPE_VIDEO;
vpar->width = Graphics::getWidth();
vpar->height = Graphics::getHeight();

设置完后我们打开输出文件,并写入头部信息

avio_open(&outCtx->pb, "output.mp4", AVIO_FLAG_WRITE);

avformat_write_header(outCtx, nullptr);

最后我们初始化AVPacket来写入我们的压缩视频帧

pkt = av_packet_alloc();

编码资源的初始化

  输出的压缩视频帧实际上是用一定字节大小的比特流来表示的,官方的编程指南指出对于编码D3D12材质来说,我们还要分配一个ReadbackHeap用于取出比特流,它的推荐大小为材质大小的两倍。

readbackHeap(new ReadbackHeap(2 * 4 * Graphics::getWidth() * Graphics::getHeight()))

编码和封装过程

每帧的编码和封装

  如果要将外部创建的资源用于输入材质和输出比特流这两个用途,官方的编程指南指出应该先注册资源然后映射资源。接下来我们分别讲解输入材质和输出比特流的注册以及输入材质和输出比特流的映射。

我们可以用nvEncRegisterResource来注册资源

NV_ENC_REGISTER_RESOURCE registerInputResource = { NV_ENC_REGISTER_RESOURCE_VER };
registerInputResource.bufferFormat = bufferFormat;
registerInputResource.bufferUsage = NV_ENC_INPUT_IMAGE;
registerInputResource.resourceType = NV_ENC_INPUT_RESOURCE_TYPE_DIRECTX;
registerInputResource.resourceToRegister = inputTexture->getResource();
registerInputResource.subResourceIndex = 0;
registerInputResource.width = Graphics::getWidth();
registerInputResource.height = Graphics::getHeight();
registerInputResource.pitch = 0;
registerInputResource.pInputFencePoint = nullptr;

NVENCCALL(nvencAPI.nvEncRegisterResource(encoder, &registerInputResource));
NV_ENC_REGISTER_RESOURCE registerOutputResource = { NV_ENC_REGISTER_RESOURCE_VER };
registerOutputResource.bufferFormat = NV_ENC_BUFFER_FORMAT_U8;
registerOutputResource.bufferUsage = NV_ENC_OUTPUT_BITSTREAM;
registerOutputResource.resourceType = NV_ENC_INPUT_RESOURCE_TYPE_DIRECTX;
registerOutputResource.resourceToRegister = readbackHeap->getResource();
registerOutputResource.subResourceIndex = 0;
registerOutputResource.width = 2 * 4 * Graphics::getWidth() * Graphics::getHeight();
registerOutputResource.height = 1;
registerOutputResource.pitch = 0;
registerOutputResource.pInputFencePoint = nullptr;

NVENCCALL(nvencAPI.nvEncRegisterResource(encoder, &registerOutputResource));

这里的resourceToRegister指的是ID3D12Resource指针,注册资源后NVENC API会返回一个注册句柄例如registerInputResource.registeredResource,我们接下来可以用这个句柄和nvEncMapInputResource来映射资源

NV_ENC_MAP_INPUT_RESOURCE mapInputResource = { NV_ENC_MAP_INPUT_RESOURCE_VER };
mapInputResource.registeredResource = registerInputResource.registeredResource;

NVENCCALL(nvencAPI.nvEncMapInputResource(encoder, &mapInputResource));
NV_ENC_MAP_INPUT_RESOURCE mapOutputResource = { NV_ENC_MAP_INPUT_RESOURCE_VER };
mapOutputResource.registeredResource = registerOutputResource.registeredResource;

NVENCCALL(nvencAPI.nvEncMapInputResource(encoder, &mapOutputResource));

映射资源后同样的也会给我们一个映射句柄,比如mapInputResource.mappedResource,有了映射句柄后我们接下来可以用nvEncEncodePicture提交我们的输入材质和输出比特流。对于D3D12来说,这个函数要求提供一个NV_ENC_INPUT_RESOURCE_D3D12结构体和一个NV_ENC_OUTPUT_RESOURCE_D3D12结构体,代码如下

NV_ENC_INPUT_RESOURCE_D3D12 inputResource = { NV_ENC_INPUT_RESOURCE_D3D12_VER };
inputResource.pInputBuffer = mapInputResource.mappedResource;
inputResource.inputFencePoint = NV_ENC_FENCE_POINT_D3D12{ NV_ENC_FENCE_POINT_D3D12_VER };

NV_ENC_OUTPUT_RESOURCE_D3D12 outputResource = { NV_ENC_INPUT_RESOURCE_D3D12_VER };
outputResource.pOutputBuffer = mapOutputResource.mappedResource;
outputResource.outputFencePoint = NV_ENC_FENCE_POINT_D3D12{ NV_ENC_FENCE_POINT_D3D12_VER };
outputResource.outputFencePoint.pFence = outputFence.Get();
outputResource.outputFencePoint.signalValue = ++outputFenceValue;
outputResource.outputFencePoint.bSignal = true;

相信你也注意到了,对于输入材质和输出比特流分别需要一个inputFencePoint和一个outputFencePointinputFencePoint用来知道什么时候材质被渲染完了可以用作编码用途,而outputFencePoint用来知道什么时候材质被编码完了可以取出对应的比特流。由于我的图形引擎有等待当前帧完成的方法,于是省略了inputFencePoint。接下来我们使用nvEncEncodePicture来指定提交的材质以及输出的比特流,代码如下

NV_ENC_PIC_PARAMS picParams = { NV_ENC_PIC_PARAMS_VER };

picParams.pictureStruct = NV_ENC_PIC_STRUCT_FRAME;

picParams.inputBuffer = &inputResource;

picParams.outputBitstream = &outputResource;

picParams.bufferFmt = bufferFormat;

picParams.inputWidth = Graphics::getWidth();

picParams.inputHeight = Graphics::getHeight();

picParams.completionEvent = nullptr;

NVENCCALL(nvencAPI.nvEncEncodePicture(encoder, &picParams));

编码完成后我们可以使用nvEncLockBitstream取出比特流,代码如下

NV_ENC_LOCK_BITSTREAM lockBitstream = { NV_ENC_LOCK_BITSTREAM_VER };

lockBitstream.outputBitstream = &outputResource;

lockBitstream.doNotWait = 0;

NVENCCALL(nvencAPI.nvEncLockBitstream(encoder, &lockBitstream));

uint8_t* const bitstreamPtr = (uint8_t*)lockBitstream.bitstreamBufferPtr;

const int bitstreamSize = lockBitstream.bitstreamSizeInBytes;

有了比特流指针以及比特流字节大小,我们接下来可以进行封装。查阅相关资料后,写入压缩视频帧得通过av_write_frame这个函数来执行,它要求输入一个输出上下文以及一个AVPacket,我们的比特流就是用AVPacket来承载的,官方文档对于这个函数的解释如下
img

查看参数解释后,我们应该把AVPacketstream_indexptsdtsduration设置到正确的值。stream_index指代的应该就是我们之前创建的视频流的索引outStream->index,而ptsdtsduration这三个成员我是一点也不懂,查阅了下官方的文档

img
发现这三个成员都是以AVStream->time_base为单位的值,其中duration是压缩帧的持续时间,dtspts分别是解压时间戳和呈现时间戳,前者决定了压缩比特流什么时候被解压,后者决定了被解压后的画面什么时候被呈现到用户面前。FFMPEG里有对应的函数av_rescale_q用来计算这几个成员的值,我们可以通过已编码的帧数和帧率还有AVStream->time_base来计算,下面是代码

frameEncoded++;

pkt->pts = av_rescale_q(frameEncoded, AVRational{ 1,(int)FrameRate }, outStream->time_base);

pkt->dts = pkt->pts;

pkt->duration = av_rescale_q(1, AVRational{ 1,(int)FrameRate }, outStream->time_base);

pkt->stream_index = outStream->index;

pkt->data = bitstreamPtr;

pkt->size = bitstreamSize;

av_write_frame(outCtx, pkt);

av_write_frame(outCtx, nullptr);

编码完这一帧并取出数据后,我们接下来解除对比特流的锁定、取消映射、取消注册

NVENCCALL(nvencAPI.nvEncLockBitstream(encoder, &lockBitstream));

NVENCCALL(nvencAPI.nvEncUnmapInputResource(encoder, mapInputResource.mappedResource));

NVENCCALL(nvencAPI.nvEncUnregisterResource(encoder, registerInputResource.registeredResource));

NVENCCALL(nvencAPI.nvEncUnmapInputResource(encoder, mapOutputResource.mappedResource));

NVENCCALL(nvencAPI.nvEncUnregisterResource(encoder, registerOutputResource.registeredResource));

结束编码和封装

将所有的帧编码封装完毕后,我们得用NVENC API来发送EOS信息

NV_ENC_PIC_PARAMS picParams = { NV_ENC_PIC_PARAMS_VER };

picParams.encodePicFlags = NV_ENC_PIC_FLAG_EOS;

NVENCCALL(nvencAPI.nvEncEncodePicture(encoder, &picParams));

发送完EOS信息后我们做点收尾工作,例如结束对文件的输出还有释放资源什么的

NVENCCALL(nvencAPI.nvEncDestroyEncoder(encoder));
delete readbackHeap;
FreeLibrary(moduleNvEncAPI);

av_packet_free(&pkt);
av_write_trailer(outCtx);
avio_close(outCtx->pb);
avformat_free_context(outCtx);

开启Lookahead

  上面就是最基本的步骤,也许你也注意到了在配置编码参数时我用config.rcParams.enableLookahead = 0;关闭了Lookahead,首先我们了解下什么是Lookahead,NVENC API编程指南对Lookahead的解释如下
img

由此可以见在编码当前帧时,可以通过参考后面的视频帧,来更精确地分配比特率,从而获得更好的画质。如果要开启Lookahead我们首先应该把config.rcParams.enableLookahead设置为1然后设置config.rcParams.lookaheadDepth。假设config.rcParams.lookaheadDepth = 4;下面是启用Lookahead的编码流程

nvEncEncodePicture frame0

nvEncEncodePicture frame1

nvEncEncodePicture frame2

nvEncEncodePicture frame3

nvEncEncodePicture frame4

nvEncLockBitstream frame0

nvEncEncodePicture frame5

nvEncLockBitstream frame1

......

我们首先得分配config.rcParams.lookaheadDepth + 1个材质供图形引擎循环渲染,下面是修改后的和编码有关的代码

NvidiaEncoder::NvidiaEncoder(const UINT frameToEncode) :
	Encoder(frameToEncode), encoder(nullptr),
	readbackHeap(new ReadbackHeap(2 * 4 * Graphics::getWidth() * Graphics::getHeight())),
	outCtx(nullptr), outStream(nullptr), pkt(nullptr),
	nvencAPI{ NV_ENCODE_API_FUNCTION_LIST_VER },
	outputFenceValue(0)
{
	moduleNvEncAPI = LoadLibraryA("nvEncodeAPI64.dll");

	if (moduleNvEncAPI == 0)
	{
		throw "unable to load nvEncodeAPI64.dll";
	}

	NVENCSTATUS(__stdcall * NVENCAPICreateInstance)(NV_ENCODE_API_FUNCTION_LIST*) = (NVENCSTATUS(*)(NV_ENCODE_API_FUNCTION_LIST*))GetProcAddress(moduleNvEncAPI, "NvEncodeAPICreateInstance");

	std::cout << "[class NvidiaEncoder] api instance create status " << NVENCAPICreateInstance(&nvencAPI) << "\n";

	NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS sessionParams = { NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS_VER };
	sessionParams.device = GraphicsDevice::get();
	sessionParams.deviceType = NV_ENC_DEVICE_TYPE_DIRECTX;
	sessionParams.apiVersion = NVENCAPI_VERSION;

	NVENCCALL(nvencAPI.nvEncOpenEncodeSessionEx(&sessionParams, &encoder));

	NV_ENC_PRESET_CONFIG presetConfig = { NV_ENC_PRESET_CONFIG_VER,{NV_ENC_CONFIG_VER} };

	NVENCCALL(nvencAPI.nvEncGetEncodePresetConfigEx(encoder, codec, preset, tuningInfo, &presetConfig));

	NV_ENC_CONFIG config;
	memcpy(&config, &presetConfig.presetCfg, sizeof(NV_ENC_CONFIG));
	config.version = NV_ENC_CONFIG_VER;
	config.profileGUID = profile;

	//high quality encode
	config.gopLength = 120;
	config.frameIntervalP = 1;
	config.rcParams.enableLookahead = 1;
	config.rcParams.lookaheadDepth = LookaheadDepth;
	config.rcParams.rateControlMode = NV_ENC_PARAMS_RC_VBR;
	config.rcParams.averageBitRate = 20000000U;
	config.rcParams.maxBitRate = 40000000U;
	config.rcParams.vbvBufferSize = config.rcParams.maxBitRate * 4;
	config.rcParams.enableAQ = 1;

	NV_ENC_INITIALIZE_PARAMS encoderParams = { NV_ENC_INITIALIZE_PARAMS_VER };
	encoderParams.bufferFormat = bufferFormat;
	encoderParams.encodeConfig = &config;
	encoderParams.encodeGUID = codec;
	encoderParams.presetGUID = preset;
	encoderParams.tuningInfo = tuningInfo;
	encoderParams.encodeWidth = Graphics::getWidth();
	encoderParams.encodeHeight = Graphics::getHeight();
	encoderParams.darWidth = Graphics::getWidth();
	encoderParams.darHeight = Graphics::getHeight();
	encoderParams.maxEncodeWidth = Graphics::getWidth();
	encoderParams.maxEncodeHeight = Graphics::getHeight();
	encoderParams.frameRateNum = FrameRate;
	encoderParams.frameRateDen = 1;
	encoderParams.enablePTD = 1;
	encoderParams.enableOutputInVidmem = 0;
	encoderParams.enableEncodeAsync = 0;

	NVENCCALL(nvencAPI.nvEncInitializeEncoder(encoder, &encoderParams));

	GraphicsDevice::get()->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&outputFence));

	std::cout << "[class NvidiaEncoder] render at " << Graphics::getWidth() << " x " << Graphics::getHeight() << "\n";

	std::cout << "[class NvidiaEncoder] frameRate " << FrameRate << "\n";

	std::cout << "[class NvidiaEncoder] frameToEncode " << frameToEncode << "\n";

	std::cout << "[class NvidiaEncoder] start encoding\n";

	avformat_alloc_output_context2(&outCtx, nullptr, "mp4", "output.mp4");

	outStream = avformat_new_stream(outCtx, nullptr);

	outStream->id = 0;

	AVCodecParameters* vpar = outStream->codecpar;
	vpar->codec_id = AV_CODEC_ID_H264;
	vpar->codec_type = AVMEDIA_TYPE_VIDEO;
	vpar->width = Graphics::getWidth();
	vpar->height = Graphics::getHeight();

	avio_open(&outCtx->pb, "output.mp4", AVIO_FLAG_WRITE);

	avformat_write_header(outCtx, nullptr);

	pkt = av_packet_alloc();

	NV_ENC_REGISTER_RESOURCE registerOutputResource = { NV_ENC_REGISTER_RESOURCE_VER };
	registerOutputResource.bufferFormat = NV_ENC_BUFFER_FORMAT_U8;
	registerOutputResource.bufferUsage = NV_ENC_OUTPUT_BITSTREAM;
	registerOutputResource.resourceType = NV_ENC_INPUT_RESOURCE_TYPE_DIRECTX;
	registerOutputResource.resourceToRegister = readbackHeap->getResource();
	registerOutputResource.subResourceIndex = 0;
	registerOutputResource.width = 2 * 4 * Graphics::getWidth() * Graphics::getHeight();
	registerOutputResource.height = 1;
	registerOutputResource.pitch = 0;
	registerOutputResource.pInputFencePoint = nullptr;

	NVENCCALL(nvencAPI.nvEncRegisterResource(encoder, &registerOutputResource));

	registeredOutputResourcePtr = registerOutputResource.registeredResource;

	NV_ENC_MAP_INPUT_RESOURCE mapOutputResource = { NV_ENC_MAP_INPUT_RESOURCE_VER };
	mapOutputResource.registeredResource = registerOutputResource.registeredResource;

	NVENCCALL(nvencAPI.nvEncMapInputResource(encoder, &mapOutputResource));

	mappedOutputResourcePtr = mapOutputResource.mappedResource;
}

NvidiaEncoder::~NvidiaEncoder()
{
	if (moduleNvEncAPI)
	{
		nvencAPI.nvEncUnmapInputResource(encoder, mappedOutputResourcePtr);

		nvencAPI.nvEncUnregisterResource(encoder, registeredOutputResourcePtr);

		while (mappedInputResourcePtrs.size())
		{
			nvencAPI.nvEncUnmapInputResource(encoder, mappedInputResourcePtrs.front());

			mappedInputResourcePtrs.pop();
		}

		while (registeredInputResourcePtrs.size())
		{
			nvencAPI.nvEncUnregisterResource(encoder, registeredInputResourcePtrs.front());

			registeredInputResourcePtrs.pop();
		}

		NVENCCALL(nvencAPI.nvEncDestroyEncoder(encoder));

		delete readbackHeap;

		FreeLibrary(moduleNvEncAPI);

		av_packet_free(&pkt);
		av_write_trailer(outCtx);
		avio_close(outCtx->pb);
		avformat_free_context(outCtx);
	}
}

bool NvidiaEncoder::encode(Texture* const inputTexture)
{
	timeStart = std::chrono::steady_clock::now();

	if (frameEncoded == frameToEncode)
	{
		NV_ENC_PIC_PARAMS picParams = { NV_ENC_PIC_PARAMS_VER };

		picParams.encodePicFlags = NV_ENC_PIC_FLAG_EOS;

		NVENCCALL(nvencAPI.nvEncEncodePicture(encoder, &picParams));

		encoding = false;

		std::cout << "\n[class NvidiaEncoder] encode complete!\n";

		std::cout << "[class NvidiaEncoder] frame encode avg speed " << frameToEncode / encodeTime << "\n";
	}
	else
	{
		NVENCSTATUS status;

		{
			NV_ENC_REGISTER_RESOURCE registerInputResource = { NV_ENC_REGISTER_RESOURCE_VER };
			registerInputResource.bufferFormat = bufferFormat;
			registerInputResource.bufferUsage = NV_ENC_INPUT_IMAGE;
			registerInputResource.resourceType = NV_ENC_INPUT_RESOURCE_TYPE_DIRECTX;
			registerInputResource.resourceToRegister = inputTexture->getResource();
			registerInputResource.subResourceIndex = 0;
			registerInputResource.width = Graphics::getWidth();
			registerInputResource.height = Graphics::getHeight();
			registerInputResource.pitch = 0;
			registerInputResource.pInputFencePoint = nullptr;

			NVENCCALL(nvencAPI.nvEncRegisterResource(encoder, &registerInputResource));

			registeredInputResourcePtrs.push(registerInputResource.registeredResource);

			NV_ENC_MAP_INPUT_RESOURCE mapInputResource = { NV_ENC_MAP_INPUT_RESOURCE_VER };
			mapInputResource.registeredResource = registerInputResource.registeredResource;

			NVENCCALL(nvencAPI.nvEncMapInputResource(encoder, &mapInputResource));

			mappedInputResourcePtrs.push(mapInputResource.mappedResource);

			NV_ENC_INPUT_RESOURCE_D3D12 inputResource = { NV_ENC_INPUT_RESOURCE_D3D12_VER };
			inputResource.pInputBuffer = mapInputResource.mappedResource;
			inputResource.inputFencePoint = NV_ENC_FENCE_POINT_D3D12{ NV_ENC_FENCE_POINT_D3D12_VER };

			NV_ENC_OUTPUT_RESOURCE_D3D12 outputResource = { NV_ENC_INPUT_RESOURCE_D3D12_VER };
			outputResource.pOutputBuffer = mappedOutputResourcePtr;
			outputResource.outputFencePoint = NV_ENC_FENCE_POINT_D3D12{ NV_ENC_FENCE_POINT_D3D12_VER };
			outputResource.outputFencePoint.pFence = outputFence.Get();
			outputResource.outputFencePoint.signalValue = ++outputFenceValue;
			outputResource.outputFencePoint.bSignal = true;

			outputResources.push(outputResource);

			NV_ENC_PIC_PARAMS picParams = { NV_ENC_PIC_PARAMS_VER };

			picParams.pictureStruct = NV_ENC_PIC_STRUCT_FRAME;

			picParams.inputBuffer = &inputResource;

			picParams.outputBitstream = &outputResource;

			picParams.bufferFmt = bufferFormat;

			picParams.inputWidth = Graphics::getWidth();

			picParams.inputHeight = Graphics::getHeight();

			picParams.completionEvent = nullptr;

			status = nvencAPI.nvEncEncodePicture(encoder, &picParams);

			if (status != NV_ENC_SUCCESS && status != NV_ENC_ERR_NEED_MORE_INPUT)
			{
				const char* error = nvencAPI.nvEncGetLastErrorString(encoder);

				std::cout << "status " << status << "\n";

				std::cout << error << "\n";

				__debugbreak();
			}
		}

		if ((outputResources.size() == LookaheadDepth + 1) && (status == NV_ENC_SUCCESS || status == NV_ENC_ERR_NEED_MORE_INPUT))
		{
			frameEncoded++;

			NV_ENC_LOCK_BITSTREAM lockBitstream = { NV_ENC_LOCK_BITSTREAM_VER };

			lockBitstream.outputBitstream = &outputResources.front();

			lockBitstream.doNotWait = 0;

			NVENCCALL(nvencAPI.nvEncLockBitstream(encoder, &lockBitstream));

			uint8_t* const bitstreamPtr = (uint8_t*)lockBitstream.bitstreamBufferPtr;

			const int bitstreamSize = lockBitstream.bitstreamSizeInBytes;

			pkt->pts = av_rescale_q(frameEncoded, AVRational{ 1,(int)FrameRate }, outStream->time_base);

			pkt->dts = pkt->pts;

			pkt->duration = av_rescale_q(1, AVRational{ 1,(int)FrameRate }, outStream->time_base);

			pkt->stream_index = outStream->index;

			pkt->data = bitstreamPtr;

			pkt->size = bitstreamSize;

			av_write_frame(outCtx, pkt);

			av_write_frame(outCtx, nullptr);

			NVENCCALL(nvencAPI.nvEncUnlockBitstream(encoder, lockBitstream.outputBitstream));

			outputResources.pop();

			nvencAPI.nvEncUnmapInputResource(encoder, mappedInputResourcePtrs.front());

			mappedInputResourcePtrs.pop();

			nvencAPI.nvEncUnregisterResource(encoder, registeredInputResourcePtrs.front());

			registeredInputResourcePtrs.pop();
		}
		else if (status != NV_ENC_SUCCESS && status != NV_ENC_ERR_NEED_MORE_INPUT)
		{
			const char* error = nvencAPI.nvEncGetLastErrorString(encoder);

			std::cout << "status " << status << "\n";

			std::cout << error << "\n";

			__debugbreak();
		}
	}

	displayProgress();

	timeEnd = std::chrono::steady_clock::now();

	const float frameTime = std::chrono::duration<float>(timeEnd - timeStart).count();

	encodeTime += frameTime;

	return encoding;
}

总结

  上述就是所有编码和封装的流程,完成这件事还是学到了挺多知识的。目前只遇到了一个问题,使用H264编码格式时在启用Lookahead和B帧时编码出现了问题导致无法开启B帧,NVENC API头文件对B帧编码的解释如下
img
也就是说B帧的编码需要后面的帧提交来参考并进行编码,但是我已经开启了Lookahead,理论上来说应该是不需要考虑这个问题的,但是D3D12 Debug LayerNVENC API会报如下错误

D3D12 ERROR: ID3D12CommandQueue::ExecuteCommandLists: Non-simultaneous-access Texture Resource (0x000001FC1F9B38F0:'Unnamed Object') is still referenced by write|transition_barrier GPU operations in-flight on another Command Queue (0x000001FC1EF2B1E0:'Unnamed ID3D12CommandQueue Object'). It is not safe to start transition_barrier GPU operations now on this Command Queue (0x000001FC1F201230:'Unnamed ID3D12CommandQueue Object'). This can result in race conditions and application instability. [ EXECUTION ERROR #1047: OBJECT_ACCESSED_WHILE_STILL_IN_USE]
D3D12 ERROR: ID3D12CommandQueue::ExecuteCommandLists: Non-simultaneous-access Texture Resource (0x000001FC1F9B6930:'Unnamed Object') is still referenced by write|transition_barrier GPU operations in-flight on another Command Queue (0x000001FC1EF2B1E0:'Unnamed ID3D12CommandQueue Object'). It is not safe to start transition_barrier GPU operations now on this Command Queue (0x000001FC1F201230:'Unnamed ID3D12CommandQueue Object'). This can result in race conditions and application instability. [ EXECUTION ERROR #1047: OBJECT_ACCESSED_WHILE_STILL_IN_USE]
D3D12 ERROR: ID3D12CommandQueue::ExecuteCommandLists: Non-simultaneous-access Texture Resource (0x000001FC1F9B7540:'Unnamed Object') is still referenced by write|transition_barrier GPU operations in-flight on another Command Queue (0x000001FC1EF2B1E0:'Unnamed ID3D12CommandQueue Object'). It is not safe to start transition_barrier GPU operations now on this Command Queue (0x000001FC1F201230:'Unnamed ID3D12CommandQueue Object'). This can result in race conditions and application instability. [ EXECUTION ERROR #1047: OBJECT_ACCESSED_WHILE_STILL_IN_USE]
error occured at function nvencAPI.nvEncLockBitstream(encoder, &lockBitstream)
status 8 INVALID PARAM

真的是个超级奇怪的问题,虽然B帧有最大的压缩率,但是好像不擅长动作非常复杂的情况。有时候可能会让引擎渲染一些非常复杂的画面,关掉B帧对我来说其实还挺合适的,索性就不管这个问题了。

参考资料

NVENC Video Encoder API Programming Guide

Official Samples

标签:编码,ENC,encoder,NVENC,API,nvencAPI,D3D12,GUID,NV
From: https://www.cnblogs.com/PuddingCoke/p/18656474

相关文章

  • UnityAPI:利器CullingGroup
    https://docs.unity3d.com/Manual/CullingGroupAPI.html这个API非常强大,可以快速的实现自定义的Occlusionculling和Lod系统,并且性能表现极佳。简要原理CullingGroup为了性能考虑,把所有的物体模拟为球形,传入摄像机后,检测球形与相机视窗的交集,通过onStateChanged通知应用......
  • Composition API与Options API的区别
    CompositionAPI相比OptionsAPI的优点主要体现在代码的灵活性、可重用性、逻辑组织等方面,尤其是在大型项目或复杂组件中更为显著。1.更好的逻辑组织在OptionsAPI中,组件的不同逻辑通常分散在data、computed、methods、mounted等不同选项中。当组件逻辑复杂或包含多个功......
  • 诗歌创作 AI 大师 API 数据接口
    诗歌创作AI大师API数据接口AI/文本生成基于AI模型的诗歌创作大师诗歌创作/文学生成。1.产品功能支持根据主题生成高质量原创诗歌;自动识别主题内容,生成符合情感和语境的诗歌;多语言支持,可创作不同语言的诗歌;基于AI模型,持续优化诗歌生成质量;适用于文学创作、......
  • 使用OpenAI GPT-3 API构建智能对话系统
    技术背景介绍在人工智能飞速发展的今天,智能对话系统已经渗透到各个行业中。无论是客服机器人、虚拟助手,还是智能家居控制系统,智能对话系统都展现出了巨大的应用潜力。GPT-3作为目前最先进的自然语言处理模型之一,通过API可以方便地集成到我们的应用中,实现智能对话功能。核......
  • 【HarmonyOS NAPI 深度探索4】安装开发环境(Node.js、C++ 编译器、node-gyp)
    【HarmonyOSNAPI深度探索4】安装开发环境(Node.js、C++编译器、node-gyp)要使用N-API开发原生模块,第一步就是配置好开发环境。虽然HarmonyOSNext中提供了DevEco-Studio一站式IDE,可以直接帮助我们完成开发环境的搭建,但是为了更深入的了解NAPI,我们用最原始的编译工具一步......
  • SDK与API
    1.1.SDK的定义SDK是SoftwareDevelopmentKit的缩写,翻译成中文是:软件开发工具包。SDK是一组工具、库、文档和示例代码的集合,旨在帮助开发者更轻松地创建应用程序或集成特定服务。SDK通常由硬件平台、操作系统或服务提供商提供,以便开发者能够利用其平台或服务的功能。1.2.SD......
  • 使用OpenAI API进行文本生成的实践指南
    在AI技术日新月异的发展中,文本生成已经成为一项重要应用。通过使用OpenAI的API,开发者可以轻松地实现复杂的文本生成任务。在本文中,我们将深入探讨如何使用OpenAIAPI进行文本生成,从技术背景、核心原理到实际代码实现,并结合应用场景提供实践建议。技术背景介绍文本生成是自......
  • 如何使用 Java 的 Spring Boot 创建一个 RESTful API?
    大家好,我是V哥,使用Java的SpringBoot创建RESTfulAPI可以满足多种开发场景,它提供了快速开发、易于配置、可扩展、可维护的优点,尤其适合现代软件开发的需求,帮助你快速构建出高性能的后端服务。例如,在企业级应用中,通常需要开发大量的业务功能,并且要求系统具有可扩展......
  • Flux Images Generation API 对接说明
    本文将介绍一种FluxImagesGenerationAPI对接说明,它是可以通过输入自定义参数来生成Flux官方的图片。接下来介绍下FluxImagesGenerationAPI的对接说明。申请流程要使用API,需要先到FluxImagesGenerationAPI对应页面申请对应的服务,进入页面之后,点击「Acquir......
  • 解锁电商新可能:详解主流电商平台API接口
    在数字化浪潮中,电商平台正以前所未有的速度发展,而API(应用程序编程接口)接口作为不同软件系统之间进行数据交换和通信的桥梁,对电商平台的重要性不言而喻。以下是对主流电商平台API接口的详细解析:一、主流电商平台API接口概述主流电商平台如阿里巴巴、京东、淘宝、拼多多等,都提......