A crash caused by D3DX9 in a 32-bit program that uses 3 GB of user virtual memory

Date: 2021-02-09 08:42:45

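To give our 32-bit program more user virtual memory, we build it with the /LARGEADDRESSAWARE linker option so that it can use up to 3 GB. Whether the full 3 GB is actually available also depends on the platform, the operating system, and its configuration. The relevant passage is quoted below for reference; see the official Microsoft documentation for details.[i]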

Limits on memory and address space vary by platform, operating system, and by whether the IMAGE_FILE_LARGE_ADDRESS_AWARE value of the LOADED_IMAGE structure and 4-gigabyte tuning (4GT) are in use. IMAGE_FILE_LARGE_ADDRESS_AWARE is set or cleared by using the /LARGEADDRESSAWARE linker option.

4-gigabyte tuning (4GT), also known as application memory tuning, or the /3GB switch, is a technology (only applicable to 32 bit systems) that alters the amount of virtual address space available to user mode applications. Enabling this technology reduces the overall size of the system virtual address space and therefore system resource maximums. For more information, see What is 4GT.

Limits on physical memory for 32-bit platforms also depend on the Physical Address Extension (PAE), which allows 32-bit Windows systems to use more than 4 GB of physical memory.

Memory and Address Space Limits

The following table specifies the limits on memory and address space for supported releases of Windows. Unless otherwise noted, the limits in this table apply to all supported releases.

| Memory type | Limit on X86 | Limit in 64-bit Windows |
|---|---|---|
| User-mode virtual address space for each 32-bit process | 2 GB; up to 3 GB with IMAGE_FILE_LARGE_ADDRESS_AWARE and 4GT | 2 GB with IMAGE_FILE_LARGE_ADDRESS_AWARE cleared (default); 4 GB with IMAGE_FILE_LARGE_ADDRESS_AWARE set |
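Since everything below depends on the IMAGE_FILE_LARGE_ADDRESS_AWARE bit, it can be handy to confirm that a built executable really carries it. The following is a minimal sketch, assuming the file has already been read into memory; the helper name IsLargeAddressAware is made up for illustration and is not part of any SDK. It only inspects the Characteristics field of the PE file header and says nothing about whether 4GT is enabled on the system.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// IMAGE_FILE_LARGE_ADDRESS_AWARE, as defined in winnt.h.
constexpr std::uint16_t kLargeAddressAware = 0x0020;

// Hypothetical helper (not part of any SDK): returns true if the raw bytes of
// a PE image have the large-address-aware bit set in their file header.
bool IsLargeAddressAware(const std::vector<std::uint8_t>& image)
{
    // The DOS header starts with "MZ"; e_lfanew at offset 0x3C holds the
    // file offset of the PE header.
    if (image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return false;
    const std::size_t pe = static_cast<std::size_t>(image[0x3C])
                         | (static_cast<std::size_t>(image[0x3D]) << 8)
                         | (static_cast<std::size_t>(image[0x3E]) << 16)
                         | (static_cast<std::size_t>(image[0x3F]) << 24);
    // Need the 4-byte "PE\0\0" signature plus the 20-byte IMAGE_FILE_HEADER.
    if (pe > image.size() || image.size() - pe < 24)
        return false;
    if (image[pe] != 'P' || image[pe + 1] != 'E' ||
        image[pe + 2] != 0 || image[pe + 3] != 0)
        return false;
    // Characteristics is the last 16-bit field of IMAGE_FILE_HEADER
    // (offset 18 inside the header, so 22 bytes after the signature start).
    const std::uint16_t characteristics = static_cast<std::uint16_t>(
        image[pe + 22] | (image[pe + 23] << 8));
    return (characteristics & kLargeAddressAware) != 0;
}
```

Reading the executable into a std::vector<std::uint8_t> and passing it to this function is enough to tell whether /LARGEADDRESSAWARE took effect at link time.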

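The benefits of a larger user virtual address space need no explanation. What follows is a crash that was hard for us to track down: the engine crashed intermittently while parsing shaders. Here is the sample code: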

  

LPD3DXINCLUDE pInclude = NULL;
DWORD dwFlag = 0;
dwFlag |= D3DCOMPILE_PACK_MATRIX_ROW_MAJOR;
LPD3DXBUFFER pErrorBuffer = NULL;
LPD3DXCONSTANTTABLE pConstantTable = NULL;
HRESULT hr = D3DXCompileShader(m_strSource.c_str(), m_strSource.size(), &macroVec[0],
    pInclude, m_strEntryFunc.c_str(), m_strProfile.c_str(), dwFlag,
    &m_pShaderBuffer, &pErrorBuffer, &pConstantTable);
if (hr != S_OK)
{
    // error handling
} // if

D3DXCONSTANTTABLE_DESC desc;
pConstantTable->GetDesc(&desc);
for (UINT i = 0; i < desc.Constants; ++i)
{
    D3DXCONSTANT_DESC constantDesc;
    D3DXHANDLE handle = pConstantTable->GetConstant(NULL, i);
    UINT numCount = 1; // in/out: on input, the capacity of constantDesc
    hr = pConstantTable->GetConstantDesc(handle, &constantDesc, &numCount);
    VERIFY_D3D_RESULT(hr);
    if (constantDesc.RegisterSet == D3DXRS_SAMPLER)
    {
        // handle the sampler
    }
} // for

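The crash occurred on the line hr = pConstantTable->GetConstantDesc(handle, &constantDesc, &numCount);. At first we only saw that the call stack ended inside D3DX9, where the cause was hard to pin down, so we did not look closely. Later it turned out that the crash was fairly frequent. We first suspected the compiled shader, but the binary taken from memory was identical to the one generated offline, so the problem was shelved again. Eventually we caught a crash in the debugger and stepped through the D3DX9 disassembly. After a good deal of tracing and comparison, one piece of code looked suspicious: GetConstantDesc itself, declared as STDMETHOD(GetConstantDesc)(THIS_ D3DXHANDLE hConstant, D3DXCONSTANT_DESC *pConstantDesc, UINT *pCount) PURE. Its disassembly is as follows: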

 D3DXShader::CConstantTable::GetConstantDesc:
0F52AFF0 mov edi,edi
0F52AFF2 push ebp
0F52AFF3 mov ebp,esp
0F52AFF5 push esi
0F52AFF6 mov esi,dword ptr [ebp+10h]
0F52AFF9 push edi
0F52AFFA mov edi,dword ptr [ebp+14h]
0F52AFFD test esi,esi
0F52AFFF jne D3DXShader::CConstantTable::GetConstantDesc+1Ch (0F52B00Ch)
0F52B001 test edi,edi
0F52B003 jne D3DXShader::CConstantTable::GetConstantDesc+1Ch (0F52B00Ch)
0F52B005 mov eax,8876086Ch
0F52B00A jmp D3DXShader::CConstantTable::GetConstantDesc+8Dh (0F52B07Dh)
0F52B00C mov ecx,dword ptr [ebp+0Ch] ; ecx now holds hConstant
0F52B00F test ecx,ecx
0F52B011 je D3DXShader::CConstantTable::GetConstantDesc+15h (0F52B005h)
0F52B013 mov edx,dword ptr [ebp+8]
0F52B016 mov eax,dword ptr [edx+]
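0F52B019 or eax,ecx ; Note this instruction: together with the jge below, it tests whether the top bit of ecx is 1.
; If it is 1, execution falls through to neg ecx at 0F52B01D; otherwise it jumps to 0F52B021.
; That jump is where the crash begins: it finally crashes in a function called on that path
; (FindConstantByName at 0F52B028 is the first call reached).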
0F52B01B jge D3DXShader::CConstantTable::GetConstantDesc+31h (0F52B021h)
0F52B01D neg ecx ; why the value is negated here is explained after this listing
0F52B01F jmp D3DXShader::CConstantTable::GetConstantDesc+44h (0F52B034h)
0F52B021 lea eax,[ebp+10h]
0F52B024 push eax
0F52B025 push ecx
0F52B026 mov ecx,edx
0F52B028 call D3DXShader::CConstantTable::FindConstantByName (0F52AE0Dh)
0F52B02D test eax,eax
0F52B02F js D3DXShader::CConstantTable::GetConstantDesc+8Dh (0F52B07Dh)
0F52B031 mov ecx,dword ptr [ebp+10h]
0F52B034 xor edx,edx
0F52B036 xor eax,eax
0F52B038 inc edx
0F52B039 push ebx
0F52B03A mov ebx,ecx
0F52B03C test ecx,ecx
0F52B03E je D3DXShader::CConstantTable::GetConstantDesc+58h (0F52B048h)
0F52B040 mov ebx,dword ptr [ebx+24h]
0F52B043 inc eax
0F52B044 test ebx,ebx
0F52B046 jne D3DXShader::CConstantTable::GetConstantDesc+50h (0F52B040h)
0F52B048 pop ebx
0F52B049 test edi,edi
0F52B04B je D3DXShader::CConstantTable::GetConstantDesc+6Ch (0F52B05Ch)
0F52B04D cmp dword ptr [edi],0
0F52B050 je D3DXShader::CConstantTable::GetConstantDesc+64h (0F52B054h)
0F52B052 mov edx,dword ptr [edi]
0F52B054 cmp edx,eax
0F52B056 jbe D3DXShader::CConstantTable::GetConstantDesc+6Ah (0F52B05Ah)
0F52B058 mov edx,eax
0F52B05A mov dword ptr [edi],eax
0F52B05C test esi,esi
0F52B05E je D3DXShader::CConstantTable::GetConstantDesc+8Bh (0F52B07Bh)
0F52B060 jmp D3DXShader::CConstantTable::GetConstantDesc+87h (0F52B077h)
0F52B062 test edx,edx
0F52B064 je D3DXShader::CConstantTable::GetConstantDesc+8Bh (0F52B07Bh)
0F52B066 push esi
0F52B067 call D3DXShader::CConstant::GetDesc (0F5275B6h)
0F52B06C test eax,eax
0F52B06E js D3DXShader::CConstantTable::GetConstantDesc+8Dh (0F52B07Dh)
0F52B070 mov ecx,dword ptr [ecx+24h]
0F52B073 add esi,30h
0F52B076 dec edx
0F52B077 test ecx,ecx
0F52B079 jne D3DXShader::CConstantTable::GetConstantDesc+72h (0F52B062h)
0F52B07B xor eax,eax
0F52B07D pop edi
0F52B07E pop esi
0F52B07F pop ebp
0F52B080 ret 10h

Why is the handle negated by the neg ecx at 0F52B01D above? The disassembly of GetConstant makes it clear:

 D3DXShader::CConstantTable::GetConstant:
0F52B0D0 mov edi,edi
0F52B0D2 push ebp
0F52B0D3 mov ebp,esp
0F52B0D5 mov ecx,dword ptr [ebp+0Ch]
0F52B0D8 test ecx,ecx
0F52B0DA jne D3DXShader::CConstantTable::GetConstant+23h (0F52B0F3h)
0F52B0DC mov ecx,dword ptr [ebp+10h]
0F52B0DF mov eax,dword ptr [ebp+8]
0F52B0E2 cmp ecx,dword ptr [eax+1Ch]
0F52B0E5 jb D3DXShader::CConstantTable::GetConstant+1Bh (0F52B0EBh)
0F52B0E7 xor eax,eax
0F52B0E9 jmp D3DXShader::CConstantTable::GetConstant+52h (0F52B122h)
0F52B0EB mov eax,dword ptr [eax+18h]
0F52B0EE mov eax,dword ptr [eax+ecx*4]
0F52B0F1 jmp D3DXShader::CConstantTable::GetConstant+50h (0F52B120h)
0F52B0F3 mov edx,dword ptr [ebp+8]
0F52B0F6 mov eax,dword ptr [edx+]
0F52B0F9 or eax,ecx
0F52B0FB jge D3DXShader::CConstantTable::GetConstant+31h (0F52B101h)
0F52B0FD neg ecx
0F52B0FF jmp D3DXShader::CConstantTable::GetConstant+44h (0F52B114h)
0F52B101 lea eax,[ebp+8]
0F52B104 push eax
0F52B105 push ecx
0F52B106 mov ecx,edx
0F52B108 call D3DXShader::CConstantTable::FindConstantByName (0F52AE0Dh)
0F52B10D test eax,eax
0F52B10F js D3DXShader::CConstantTable::GetConstant+17h (0F52B0E7h)
0F52B111 mov ecx,dword ptr [ebp+8]
0F52B114 push dword ptr [ebp+10h]
0F52B117 call D3DXShader::CConstant::GetConstantMember (0F52765Fh)
0F52B11C test eax,eax
0F52B11E je D3DXShader::CConstantTable::GetConstant+17h (0F52B0E7h)
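0F52B120 neg eax ; The value about to be returned is negated here, which is why GetConstantDesc
; has to negate the handle again to recover the original value. Why it is done this way
; is not entirely clear to me; if you know, please tell me.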
0F52B122 pop ebp
0F52B123 ret 0Ch

For reference, D3DXHANDLE is defined in the D3DX9 headers as follows:

//----------------------------------------------------------------------------
// D3DXHANDLE:
// -----------
// Handle values used to efficiently reference shader and effect parameters.
// Strings can be used as handles. However, handles are not always strings.
//----------------------------------------------------------------------------
#ifndef D3DXFX_LARGEADDRESS_HANDLE
typedef LPCSTR D3DXHANDLE;
#else
typedef UINT_PTR D3DXHANDLE;
#endif
typedef D3DXHANDLE *LPD3DXHANDLE;
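
One plausible reading of the two listings, offered as a sketch rather than a confirmed description of D3DX9 internals: a handle is either a string pointer or a negated object pointer, and the sign bit is used to tell the two apart. That only works while every address stays below 2 GB. The model below uses plain 32-bit integers; all names are hypothetical, and the tableField parameter stands in for the value loaded from the table object before the or instruction, whose meaning is unknown to me (0 is assumed here).

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit D3DXHANDLE; names are illustrative only.
using Handle32 = std::uint32_t;

// Mirrors `neg eax` at the end of GetConstant: the object address is negated
// (two's complement) before being handed out as a handle.
constexpr Handle32 EncodeObjectHandle(std::uint32_t objectAddress)
{
    return 0u - objectAddress;
}

// Mirrors `or eax,ecx` followed by `jge`: the handle is taken as an encoded
// object pointer only if the sign bit of (tableField | handle) is set;
// otherwise it is treated as a string and looked up by name.
// ASSUMPTION: tableField is 0 for a table that is not large-address-aware.
constexpr bool TreatedAsObjectHandle(Handle32 handle, std::uint32_t tableField = 0)
{
    return ((tableField | handle) & 0x80000000u) != 0;
}

// Mirrors `neg ecx` in GetConstantDesc: negating again restores the address.
constexpr std::uint32_t DecodeObjectHandle(Handle32 handle)
{
    return 0u - handle;
}

// An object below 2 GB: the negated address has its sign bit set, so the
// handle is recognised and decodes back to the original address.
static_assert(TreatedAsObjectHandle(EncodeObjectHandle(0x12345678u)),
              "handle for a low address is recognised");
static_assert(DecodeObjectHandle(EncodeObjectHandle(0x12345678u)) == 0x12345678u,
              "negating twice restores the address");

// An object above 2 GB: the negated address has its sign bit clear, so the
// same test misreads the handle as a string pointer.
static_assert(!TreatedAsObjectHandle(EncodeObjectHandle(0x9ABCDEF0u)),
              "handle for a high address is misread as a string");
```

Under these assumptions, a constant object allocated above the 2 GB line produces a handle that takes the lookup-by-name path and is dereferenced as if it were a string, which would match both the intermittent nature of the crash and the path on which it occurred.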

The Microsoft documentation for D3DXCompileShader states the restriction explicitly:

HRESULT D3DXCompileShader(
_In_ LPCSTR pSrcData,
_In_ UINT srcDataLen,
_In_ const D3DXMACRO *pDefines,
_In_ LPD3DXINCLUDE pInclude,
_In_ LPCSTR pFunctionName,
_In_ LPCSTR pProfile,
_In_ DWORD Flags,
_Out_ LPD3DXBUFFER *ppShader,
_Out_ LPD3DXBUFFER *ppErrorMsgs,
_Out_ LPD3DXCONSTANTTABLE *ppConstantTable
);

Parameters

ppConstantTable [out]
Type: LPD3DXCONSTANTTABLE*
Returns an ID3DXConstantTable interface, which can be used to access shader constants. This value can be NULL. If you compile your application as large address aware (that is, you use the /LARGEADDRESSAWARE linker option to handle addresses larger than 2 GB), you cannot use this parameter and must set it to NULL. Instead, you must use the D3DXGetShaderConstantTableEx function to retrieve the shader-constant table that is embedded inside the shader. In this D3DXGetShaderConstantTableEx call, you must pass the D3DXCONSTTABLE_LARGEADDRESSAWARE flag to the Flags parameter to specify to access up to 4 GB of virtual address space.[ii]
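
The requirement quoted above could be applied to the sample code roughly as follows. This is a minimal sketch, not the engine's actual fix: the helper name CompileAndReflect and its parameters are made up for illustration, and it assumes the legacy DirectX SDK headers (d3dx9.h) are available. Defining D3DXFX_LARGEADDRESS_HANDLE is optional; based on the typedef shown earlier, it turns D3DXHANDLE into a UINT_PTR so that accidentally passing a string as a handle should fail to compile.

```cpp
// Optional (see the D3DXHANDLE typedef): make handles UINT_PTR, not LPCSTR.
#define D3DXFX_LARGEADDRESS_HANDLE
#include <d3dx9.h>

// Hypothetical helper, not taken from the original engine code.
HRESULT CompileAndReflect(const char* source, UINT sourceLen,
                          const D3DXMACRO* defines,
                          const char* entry, const char* profile,
                          DWORD flags, LPD3DXBUFFER* ppShader)
{
    LPD3DXBUFFER pErrors = NULL;

    // In a large-address-aware process ppConstantTable must be NULL.
    HRESULT hr = D3DXCompileShader(source, sourceLen, defines, NULL,
                                   entry, profile, flags,
                                   ppShader, &pErrors, NULL);
    if (pErrors)
        pErrors->Release();   // a real caller would log the messages first
    if (FAILED(hr))
        return hr;

    // Retrieve the constant table embedded in the compiled shader instead.
    LPD3DXCONSTANTTABLE pTable = NULL;
    hr = D3DXGetShaderConstantTableEx(
        static_cast<const DWORD*>((*ppShader)->GetBufferPointer()),
        D3DXCONSTTABLE_LARGEADDRESSAWARE, &pTable);
    if (FAILED(hr))
        return hr;

    D3DXCONSTANTTABLE_DESC desc;
    hr = pTable->GetDesc(&desc);
    for (UINT i = 0; SUCCEEDED(hr) && i < desc.Constants; ++i)
    {
        D3DXHANDLE handle = pTable->GetConstant(NULL, i);
        D3DXCONSTANT_DESC constantDesc;
        UINT count = 1;       // in/out: room for one description
        hr = pTable->GetConstantDesc(handle, &constantDesc, &count);
        if (SUCCEEDED(hr) && constantDesc.RegisterSet == D3DXRS_SAMPLER)
        {
            // handle the sampler
        }
    }

    pTable->Release();
    return hr;
}
```

Compared with the crashing version, the only structural changes are the NULL passed as the last argument of D3DXCompileShader and the separate D3DXGetShaderConstantTableEx call; the loop over the constants stays the same.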

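In other words, once the 3 GB user virtual address space is enabled, D3DXCompileShader must be given NULL for ppConstantTable, and the shader constant table must instead be obtained through D3DXGetShaderConstantTableEx with the D3DXCONSTTABLE_LARGEADDRESSAWARE flag. With that change the crash was completely resolved. I hope this was worth your time and that you got something out of it. If anything in this post is described incorrectly, corrections are welcome.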