Nvrhi | ShaderTool | Blurred code




Categories: nvrhi




ShaderBlob is an asset format defined by NVRHI that packages multiple shader variants into a single blob file. The tool that produces it was originally shipped as a submodule of NVRHI (although it already worked as a standalone tool) and has since been moved to its own repository. Any RHI needs a similar tool for compiling and managing shaders, especially an RHI targeting both Vulkan and DX, because the two APIs consume different shader formats.

Reference: NVIDIAGameWorks/ShaderMake: Shader Compilation Tool

Blob Format

There are two variants of the blob format. The first is for a shader without any permutations: the file is just the bytecode generated by FXC/DXC. In this case ShaderMake does no additional work beyond translating the compilation flags for the corresponding compiler.

The second format is for a shader with multiple variants (each variant is one combination of macro values). In this case a magic header is prepended to the binary data, and it contains enough information to parse and unpack the file.

The logic for appending the header:

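struct ShaderBlobEntry
{
	uint32_t permutationSize; // length of the permutation string, without the trailing '\0'
	uint32_t dataSize;        // length of the compiled shader binary
};

// Fill in the header for this permutation
nvrhi::ShaderBlobEntry binaryEntry;
binaryEntry.permutationSize = (uint32_t)entry.permutation.size();
binaryEntry.dataSize = (uint32_t)fileSize;

// Write the header, then the permutation string, then the shader binary
fwrite(&binaryEntry, 1, sizeof(binaryEntry), outputFile);
fwrite(entry.permutation.data(), 1, entry.permutation.size(), outputFile);
fwrite(buffer, 1, fileSize, outputFile);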


A blob with permutations starts with the magic string NVSP (4 bytes), which marks it as a combination blob rather than a plain one. After the NVSP signature, each entry begins with two uint32_t values: the length of the permutation string (without the trailing \0) and the length of the shader binary data. So when parsing a combination blob, skip the first 4 bytes (the NVSP characters), then read the next 8 bytes to learn the sizes of the permutation string and the shader binary that follow.
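The layout described above can be sketched as a small in-memory writer and reader. This is only an illustration under stated assumptions: `ParsedEntry`, `AppendEntry`, and `ParseCombinationBlob` are hypothetical names, not part of NVRHI or ShaderMake, and the sketch assumes entries are simply stored back to back after the signature.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Same per-entry header as in the writing code above.
struct ShaderBlobEntry
{
    uint32_t permutationSize; // permutation string length, no trailing '\0'
    uint32_t dataSize;        // shader binary length
};

// Hypothetical result type used only by this sketch.
struct ParsedEntry
{
    std::string permutation;
    std::vector<uint8_t> data;
};

// Hypothetical helper: appends one entry (header, permutation, binary) to a blob in memory.
void AppendEntry(std::vector<uint8_t>& blob, const std::string& permutation,
                 const std::vector<uint8_t>& data)
{
    ShaderBlobEntry header;
    header.permutationSize = static_cast<uint32_t>(permutation.size());
    header.dataSize = static_cast<uint32_t>(data.size());
    const uint8_t* raw = reinterpret_cast<const uint8_t*>(&header);
    blob.insert(blob.end(), raw, raw + sizeof(header));
    blob.insert(blob.end(), permutation.begin(), permutation.end());
    blob.insert(blob.end(), data.begin(), data.end());
}

// Hypothetical helper: walks a combination blob and returns all entries.
// Returns an empty vector if the blob lacks the NVSP signature or is truncated.
std::vector<ParsedEntry> ParseCombinationBlob(const uint8_t* blob, size_t size)
{
    std::vector<ParsedEntry> entries;
    const size_t signatureSize = 4;
    if (!blob || size < signatureSize || std::memcmp(blob, "NVSP", signatureSize) != 0)
        return entries; // plain blob: the whole file is a single shader binary

    size_t offset = signatureSize;
    while (size - offset >= sizeof(ShaderBlobEntry))
    {
        ShaderBlobEntry header;
        std::memcpy(&header, blob + offset, sizeof(header)); // copy to avoid unaligned reads
        offset += sizeof(header);

        const size_t payload = size_t(header.permutationSize) + header.dataSize;
        if (size - offset < payload)
            return {}; // not enough bytes left for this entry

        ParsedEntry entry;
        entry.permutation.assign(reinterpret_cast<const char*>(blob + offset),
                                 header.permutationSize);
        offset += header.permutationSize;
        entry.data.assign(blob + offset, blob + offset + header.dataSize);
        offset += header.dataSize;
        entries.push_back(std::move(entry));
    }
    return entries;
}
```

To try it, start a `std::vector<uint8_t>` with the four bytes `N`, `V`, `S`, `P`, call `AppendEntry` once per permutation, and pass the result to `ParseCombinationBlob`. A blob without the signature comes back empty, which is the cue to treat the file as a single plain shader binary.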

Looking for a variant in a combination blob

The naive way to find a variant in a combination blob is to compare strings.

std::stringstream ss;
for (uint32_t n = 0; n < numConstants; n++)
{
    const ShaderConstant& constant = constants[n];

    // Note: this leaves an additional trailing space in the permutation string
    ss << constant.name << "=" << constant.value << " ";
}
// Concatenated macro strings, for example: FOO=1 BAR=2
std::string permutation = ss.str();
// Compare against the permutation string stored in the blob, looking only at the first n characters.
// For example, if the blob stores `FOO=1 BAR=2`, a lookup for `FOO=1` also finds a match.
// In the reverse case, where the blob stores `FOO=1` and the lookup asks for `FOO=1 BAR=2`,
// the lookup should be treated as a failure.
const bool matches = strncmp(entryPermutation, permutation.data(), permutation.size()) == 0;


This is a relatively simple and error-prone implementation, because the macros are not sorted or normalized in any way. The permutations FOO=1 BAR=2 and BAR=2 FOO=1 describe the same variant, but the implementation above treats them as different. As a result, the lookup code has to list the macros in exactly the same order that was used when the shader was compiled.
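One possible way around the ordering problem is to normalize the permutation string before it is written and before it is looked up. The following is a minimal sketch, not what NVRHI or ShaderMake does: `MakeCanonicalPermutation` and `PermutationMatches` are hypothetical helpers, the `ShaderConstant` struct is a stand-in for the real type, and the approach only works if the tool that writes the blob builds its keys the same way.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for the ShaderConstant type used above; the real definition may differ.
struct ShaderConstant
{
    std::string name;
    std::string value;
};

// Hypothetical helper: builds a key that does not depend on the order
// in which the macros were supplied, by sorting them by name first.
std::string MakeCanonicalPermutation(std::vector<ShaderConstant> constants)
{
    std::sort(constants.begin(), constants.end(),
              [](const ShaderConstant& a, const ShaderConstant& b) { return a.name < b.name; });

    std::ostringstream ss;
    for (size_t n = 0; n < constants.size(); n++)
    {
        if (n != 0)
            ss << ' '; // separator only between macros, so there is no trailing space
        ss << constants[n].name << '=' << constants[n].value;
    }
    return ss.str();
}

// Hypothetical helper: exact comparison. The lengths must be equal,
// so a lookup for "FOO=1" cannot match a stored "FOO=1 BAR=2" by prefix.
bool PermutationMatches(const char* entryPermutation, size_t entrySize, const std::string& wanted)
{
    if (entrySize != wanted.size())
        return false;
    return entrySize == 0 || std::memcmp(entryPermutation, wanted.data(), entrySize) == 0;
}
```

With this sketch, `FOO=1 BAR=2` and `BAR=2 FOO=1` both normalize to the same key, and comparing lengths before comparing bytes removes the prefix-match behavior of the `strncmp` call. The cost is that every producer and consumer of the blob has to agree on the same normalization.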

The cfg file that tells ShaderMake what to compile:

passes/gbuffer_ps.hlsl -T ps_5_0 -D MOTION_VECTORS={0,1} -D ALPHA_TESTED={0,1}
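Each `{0,1}` in the line above stands for a set of values, so two such macros yield four permutations. The enumeration can be sketched as a Cartesian product. `MacroOptions` and `ExpandPermutations` are hypothetical names for illustration, and the order and exact string format that ShaderMake produces are assumptions here, not something confirmed by the tool's source.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one -D NAME={a,b,...} option.
struct MacroOptions
{
    std::string name;
    std::vector<std::string> values;
};

// Hypothetical helper: enumerates every combination of macro values,
// keeping the macros in the order they appear on the cfg line.
std::vector<std::string> ExpandPermutations(const std::vector<MacroOptions>& macros)
{
    std::vector<std::string> result{ "" }; // start with a single empty permutation
    for (const MacroOptions& macro : macros)
    {
        std::vector<std::string> next;
        for (const std::string& prefix : result)
        {
            for (const std::string& value : macro.values)
            {
                // Extend each partial permutation with one value of this macro
                const std::string separator = prefix.empty() ? "" : " ";
                next.push_back(prefix + separator + macro.name + "=" + value);
            }
        }
        result = std::move(next);
    }
    return result;
}
```

For the line above, feeding in `MOTION_VECTORS` and `ALPHA_TESTED` with the values `0` and `1` gives four strings, from `MOTION_VECTORS=0 ALPHA_TESTED=0` through `MOTION_VECTORS=1 ALPHA_TESTED=1`. Because the macro order follows the cfg line, this also shows why the lookup side has to use that same order.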

The C++ code that reads the variant from the compiled blob. Note that the macros must be in the same order as in the cfg file.

std::vector<ShaderMacro> PixelShaderMacros;
PixelShaderMacros.push_back(ShaderMacro("MOTION_VECTORS", params.enableMotionVectors ? "1" : "0"));
PixelShaderMacros.push_back(ShaderMacro("ALPHA_TESTED", alphaTested ? "1" : "0"));
return shaderFactory.CreateShader("donut/passes/gbuffer_ps.hlsl", "main", &PixelShaderMacros, nvrhi::ShaderType::Pixel);