Let's Make Malware – Position-Independent Code
Position-Independent Code
What is Position-Independent Code (PIC)? What is shellcode? What is machine code? How do these relate?
There are a lot of resources both online and offline about these topics. No matter how deep you would like to dive, you will find material. What is harder to find, as always, is something that puts all of it into simple terms.
And that is exactly what we aim to do here. So, let’s make malware.
PIC
First, let’s talk about PIC. In Computer Science, it is defined as “a body of machine code that executes properly regardless of its memory address”.
Executable files (and most files in general) have a format to them. We won’t discuss the entire Portable Executable (PE) file format, better known as the Windows .exe, here.
For simplicity’s sake, just know that within this format are sections. Important for our purposes here are the .text section and the .data section. There are many other sections as well.
The .text section is where all the instructions are.
Example: if 'a' == 'a': print("Yes, a is equal to a. Much wow.")
Example: mov eax, 4
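The .data section is where things like global variables and strings (or human-readable text) are. Depending on the compiler, constant strings may land in the read-only .rdata section instead, but the point is the same: they live outside of .text.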
When we want to create Position-Independent Code, we want everything to live inside the .text section. This means we must write our code in an abnormal (for a proper Software Engineer) way.
No more globally scoped variables.
Instead of this:
int x = 4;
int main() {}
Now, we must do this:
int main() {
    int x = 4;
    return 0;
}
This, of course, is a simplified example so it doesn’t highlight just how tricky it can be to put everything within functions. What happens when one function needs to access variables defined in another function?
We can either redefine that variable in every function that needs it, assuming they don’t need to have the same value, or we can start messing around with pointers.
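To make the pointer option a little more concrete, here is a minimal sketch in C. The Context struct and the function names are made up for illustration. The idea is simply that one function owns the data on its own stack and hands a pointer to whichever function needs it, so nothing has to live at global scope.

```c
/* Hypothetical bundle of state that would otherwise be global variables. */
typedef struct {
    int attempts;     /* how many times record_attempt() was called */
    int last_status;  /* status passed to the most recent call */
} Context;

/* Works on the caller's Context through a pointer instead of a global. */
void record_attempt(Context *ctx, int status) {
    ctx->attempts += 1;
    ctx->last_status = status;
}

/* Owns the Context as a local variable and shares it by address. */
int run(void) {
    Context ctx = {0, 0};      /* lives on run()'s stack */
    record_attempt(&ctx, 5);
    record_attempt(&ctx, 7);
    /* Pack both fields into one number so the result is easy to check. */
    return ctx.attempts * 10 + ctx.last_status;
}
```

Calling run() returns 27 here: two recorded attempts, the last of which had status 7. Every function that needs the shared state takes the pointer as a parameter, which is tedious but keeps the data out of the data sections.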
That is the first issue we had to address to create PIC. We can also no longer use normal strings.
As we said above, strings generally end up in the .data section. We don’t want that. Now, we must use Stack Strings.
Stack Strings
What are Stack Strings? They are really just a fancy way to say character arrays. In most languages, we can do something like strExample = "This is a string".
This makes it incredibly simple to use strings. But for our purposes, we cannot do this. Instead, we must do something like this:
int main() {
    /* 16 characters plus the terminating null byte */
    char strExample[17];
    strExample[0] = 'T';
    strExample[1] = 'h';
    strExample[2] = 'i';
    strExample[3] = 's';
    strExample[4] = ' ';
    strExample[5] = 'i';
    strExample[6] = 's';
    strExample[7] = ' ';
    strExample[8] = 'a';
    strExample[9] = ' ';
    strExample[10] = 's';
    strExample[11] = 't';
    strExample[12] = 'r';
    strExample[13] = 'i';
    strExample[14] = 'n';
    strExample[15] = 'g';
    strExample[16] = '\0';
}
Very tedious to do but it gets us one step closer to PIC.
No More Standard Library
The final thing we need to do is also the one that makes our lives that much more difficult. And that’s getting rid of the standard library.
If you don’t already know, the standard library in any language contains all of the most-commonly needed, helpful functions, macros, and data types.
You cannot use something like printf in C without it.
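As a rough idea of what life without the standard library looks like, here is a sketch of hand-rolled replacements for a few common string and memory helpers. The my_ prefixed names are our own invention, not part of any library, and the only header used is stddef.h, which just supplies the size_t type.

```c
#include <stddef.h>  /* size_t only; no library functions are used */

/* Counts characters up to, but not including, the terminating null byte. */
size_t my_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Copies n bytes from src to dst one byte at a time; regions must not overlap. */
void *my_memcpy(void *dst, const void *src, size_t n) {
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    for (size_t i = 0; i < n; i++) {
        d[i] = s[i];
    }
    return dst;
}

/* Returns 0 if equal, negative if a sorts before b, positive otherwise. */
int my_strcmp(const char *a, const char *b) {
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (int)(unsigned char)*a - (int)(unsigned char)*b;
}
```

One caveat: depending on the compiler and optimization level, simple loops like these can be recognized and turned back into calls to the library routines, so it is worth inspecting the generated output rather than assuming.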
Furthermore, we no longer have access to the Windows API functions if we want to create PIC.
Why?
Normally, we link to these functions dynamically. These aren’t functions we have defined anywhere in our programs, so we load a library (.DLL files in Windows) and find the address of the function within that library so we can use it. This information is stored in the Import Address Table (IAT). The IAT is located within the .idata section. And we can’t have that.
However, this does not mean we don’t have access to these functions at all. This just means that we have to resolve the library addresses and function addresses manually.
On Windows, ntdll.dll and kernel32.dll are loaded into every process during initialization so we don’t actually need to load these libraries ourselves. We only need their memory addresses.
In other words, we need to create our own custom equivalents to the Windows API functions GetProcAddress and GetModuleHandle.
There are tons of resources on how to do this elsewhere, so we will only be covering the theory quickly.
First, we need to access the Process Environment Block (PEB). This is a special structure that holds a ton of information about the process. We can access the PEB with a bit of assembly. Again, we will not be diving too deep into this. Just note that on 64-bit Windows, a special register called the GS register points at a per-thread structure (the Thread Environment Block), and the pointer stored at offset 0x60 from GS is the address of the PEB. On 32-bit Windows, the equivalent is offset 0x30 from the FS register.
Inside the PEB, among other things, is a list of the loaded modules and their addresses. This is where we find the address of kernel32.dll.
Once we have the address for kernel32.dll, with a bit of knowledge about the PE file format, we can access other functions. This also applies to every library we may want to load.
What we need to know is there is something called the Export Table. You can think of it as the opposite of the Import Table. It is here that we can find all the functions a library is allowing others to access. By going through this list, we can find whichever function we want to use.
Alternatively, we can do this only once to find the address of kernel32.dll’s GetProcAddress and just use that function for everything else.
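To illustrate the idea of walking an export list by name, here is a deliberately simplified toy model in C. It does not touch a real PE file or a real process. MockExportTable, find_export, and the two demo functions are all hypothetical. The toy borrows only the general shape of the real thing: a list of names, a parallel list of ordinals, and a list of function addresses that the ordinals index into.

```c
#include <stddef.h>

typedef int (*ExportedFn)(int);

/* Two stand-in "exported" functions for the demo. */
int add_one(int x)   { return x + 1; }
int times_two(int x) { return x * 2; }

/* Toy stand-in for an export table: names[i] pairs with name_ordinals[i],
   and that ordinal is an index into functions[]. */
typedef struct {
    size_t number_of_names;
    const char *const *names;
    const unsigned short *name_ordinals;
    const ExportedFn *functions;
} MockExportTable;

/* Minimal string comparison so the sketch needs no library calls. */
int names_equal(const char *a, const char *b) {
    while (*a != '\0' && *a == *b) { a++; b++; }
    return *a == *b;
}

/* Walks the name list; on a match, follows the ordinal to the function. */
ExportedFn find_export(const MockExportTable *t, const char *wanted) {
    for (size_t i = 0; i < t->number_of_names; i++) {
        if (names_equal(t->names[i], wanted)) {
            return t->functions[t->name_ordinals[i]];
        }
    }
    return NULL;  /* name not exported */
}

/* Builds a two-entry demo table on the stack and looks up one name. */
ExportedFn lookup_in_demo_table(const char *wanted) {
    const ExportedFn functions[] = { times_two, add_one };
    const char *const names[] = { "add_one", "times_two" };
    const unsigned short ordinals[] = { 1, 0 };
    MockExportTable table = { 2, names, ordinals, functions };
    return find_export(&table, wanted);
}
```

Looking up "times_two" in the demo table follows ordinal 0 to the first function slot, and a name that is not listed returns NULL. The indirection through ordinals is the part worth noticing: the name list and the function list do not have to be in the same order.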
So, now that we have everything in the .text section, we can compile our malicious application and extract the machine code. How we choose to do that is up to us.
We could use Python. We could do it manually by opening our application in a hex-editor and copy-pasting all of the contents of the .text section into a new file. We could tell our compiler and linker to produce an object file instead of a full executable.
Machine Code
Now then, what exactly is machine code? Simply put, it is the binary-encoded instructions that a CPU executes directly, commonly displayed as hexadecimal bytes.
b8 21 0a 00 00 #moving "!\n" into eax
a3 0c 10 00 06 #moving eax into first memory location
b8 6f 72 6c 64 #moving "orld" into eax
a3 08 10 00 06 #moving eax into next memory location
b8 6f 2c 20 57 #moving "o, W" into eax
a3 04 10 00 06 #moving eax into next memory location
b8 48 65 6c 6c #moving "Hell" into eax
a3 00 10 00 06 #moving eax into next memory location
b9 00 10 00 06 #moving pointer to start of memory location into ecx
ba 10 00 00 00 #moving string size into edx
bb 01 00 00 00 #moving "stdout" number to ebx
b8 04 00 00 00 #moving "print out" syscall number to eax
cd 80 #calling the linux kernel to execute our print to stdout
b8 01 00 00 00 #moving "sys_exit" call number to eax
cd 80 #executing it via linux sys_call
That is Machine Code.
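To see how the bytes in that listing line up with their comments, here is a small sketch in C that picks apart one of the five-byte instructions. It assumes the listing is 32-bit x86, where the byte 0xB8 is the opcode for "mov eax, imm32" and the four bytes after it are the value, stored least significant byte first. The function names are made up for this example.

```c
#include <stddef.h>

/* Opcode byte for "mov eax, imm32" on x86. */
#define MOV_EAX_IMM32 0xB8

/* If insn is a "mov eax, imm32", copies its four immediate bytes into out
   as characters (plus a null terminator) and returns 1; otherwise returns 0. */
int immediate_as_text(const unsigned char insn[5], char out[5]) {
    if (insn[0] != MOV_EAX_IMM32) {
        return 0;
    }
    for (size_t i = 0; i < 4; i++) {
        out[i] = (char)insn[1 + i];
    }
    out[4] = '\0';
    return 1;
}

/* Reassembles the little-endian immediate into the number the CPU sees. */
unsigned long immediate_as_number(const unsigned char insn[5]) {
    return  (unsigned long)insn[1]
         | ((unsigned long)insn[2] << 8)
         | ((unsigned long)insn[3] << 16)
         | ((unsigned long)insn[4] << 24);
}
```

Feeding it the bytes b8 48 65 6c 6c from the listing gives the text "Hell" and the number 0x6c6c6548. The bytes appear in reading order in memory, which is why the string can be written four characters at a time.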
How does it differ from shellcode? Well, all shellcode is machine code. But not all machine code is shellcode!
Confused yet? Shellcode is position-independent machine code that commonly results in a shell (whether bind or reverse is irrelevant). Nowadays, we don’t always want a shell but the point still stands.
Also of note: while Machine Code and Assembly map almost one-to-one, they are, technically speaking, different. Computers understand machine code (0x88). Humans understand Assembly (mov).
Finally, let’s speak quickly about why anyone would go through the painstaking labors involved with creating PIC, or rather shellcode, for the purposes of Cybersecurity and more specifically Offensive Security.
Simply put, stealth.
Whenever an executable file runs, there is an event logged saying as much. Sometimes, this is fine. “File A is now running” does not indicate anything bad is happening on its own. But combined with other factors, it may indeed scream that a threat actor has accessed your machine.
Because shellcode is not a complete, functioning executable, there is no logging for it. Or, rather, no “process start” event log associated with it. You cannot run it on its own.
Sidenote: there may be logs associated with other actions the shellcode performs. However, these aren’t attributed to the shellcode itself but rather the process the shellcode is running under.
Another benefit of shellcode is that you can separate the actions the shellcode performs from any kind of added protections and evasion techniques. In other words, you can build on top of the shellcode itself without having to directly modify the shellcode.