SLAE32: Assignment 4

# Introduction

For assignment #4 I had to create a custom shellcode encoding scheme (an encoder and a decoder). I wrote the encoder script in Python3 and the decoder, of course, in NASM.

# Encoding Scheme

The scheme I decided upon is fairly simple. I thought it would be cool to convert all bytes into lowercase alphabetic characters (a-z), so that's what I did. In the ASCII table a-z is 0x61 (97) - 0x7a (122), so there are not enough letters to represent all 256 byte values one-to-one. To solve this, my scheme encodes each byte as two letters: the first letter encodes the byte's high nibble (upper four bits) and the second letter encodes the low nibble (lower four bits). A nibble has 16 possible values, so only the letters a-p are ever used, and the 16 x 16 = 256 letter pairs cover every byte value.

The exact steps are:
  1. XOR the byte with a set key. This provides a bit of extra 'randomness' so that a pattern isn't too obvious.
  2. Split the byte into its two nibbles (4-bit halves).
  3. Add 0x61 to each nibble. This is not a random number; it is the hex value of a lowercase 'a' character. Since a nibble ranges from 0x0 to 0xf, 0x61+nibble always falls between 0x61 ('a') and 0x70 ('p'), so the sum is always the code of some lowercase alphabetic character.
  4. These two sums (bytes now) are combined into a word, which represents the encoded byte.
Since each byte is encoded into a word, any encoded shellcode is twice as long as the original shellcode, so if space is a concern then perhaps this isn't the right encoder for you.
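As a worked example of the four steps, the sketch below encodes the single byte 0x31 with the key 0x41 (the key that appears in the decoder later in this post). The helper name `encode_steps` is made up for illustration and is not part of my encoder script.

```python
KEY = 0x41  # example key; the same value appears in the decoder stub


def encode_steps(b, key=KEY):
    """Hypothetical helper: return the intermediate values of the scheme."""
    x = b ^ key                            # step 1: XOR with the key
    hi, lo = x >> 4, x & 0x0F              # step 2: split into two nibbles
    first, second = hi + 0x61, lo + 0x61   # step 3: shift into 'a'..'p'
    return x, (hi, lo), chr(first) + chr(second)  # step 4: two-letter word


x, nibbles, word = encode_steps(0x31)
print(hex(x), nibbles, word)  # 0x70 (7, 0) ha
```

Running the helper over all 256 byte values only ever produces the sixteen letters a to p, and no two bytes share the same letter pair.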

I created a flowchart to visualize the encoding process a bit better:



As I mentioned above I created an encoding script in Python3 which basically just calls this function on each byte and prints it out nicely:
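def encode_byte(b):
	b = b ^ key                 # XOR with the key (defined elsewhere in the script)
	l = (b & 0xf0) >> 4         # high nibble
	r = b & 0x0f                # low nibble
	l_enc = l + 97              # 97 == 0x61 == 'a'
	r_enc = r + 97
	return chr(l_enc) + chr(r_enc)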
The script prints three lines: "LENGTH", "KEY" and "SHELLCODE". You (the user) need to copy each of these values into decoder.nasm before compiling.
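For readers who want to reproduce the output, here is a minimal sketch of how such a script could be put together. The exact line layout is an assumption (the labels come from the description above, the formatting after each label is guessed), and `encode` and `report` are hypothetical names. The sample input is the four bytes that the first eight letters of the string in decoder.nasm decode to with key 0x41.

```python
def encode_byte(b, key):
    """Same scheme as above, with the key passed in explicitly."""
    b ^= key
    return chr((b >> 4) + 97) + chr((b & 0x0F) + 97)


def encode(shellcode, key):
    """Encode every byte and join the two-letter words."""
    return "".join(encode_byte(b, key) for b in shellcode)


def report(shellcode, key):
    # Assumed layout: the real script's formatting may differ.
    return "\n".join([
        f"LENGTH: {hex(len(shellcode))}",
        f"KEY: {hex(key)}",
        f'SHELLCODE: "{encode(shellcode, key)}"',
    ])


sample = bytes.fromhex("31c0b068")
print(report(sample, 0x41))
```

Note that LENGTH is the length of the original shellcode, not of the encoded string, because the decoder counts decoded bytes.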

# Decoder

The decoder itself is fairly short, partly because I spent a bunch of time finding ways to combine instructions to save space. The first thing that happens is a Jump-Call-Pop to figure out where the shellcode is in memory.

Jump:
; A4 - Custom Encoding Scheme
; William Moody
; PA-25640
; 28.12.2021

global _start
section .text

_start:

	jmp short call_decoder
Call. The shellcode is defined directly after the CALL instruction, so the return address that CALL pushes onto the stack points at it.
call_decoder:

	call decoder

	; === PASTE SHELLCODE FROM ENCODER HERE ===
	shellcode: db "haibpbcjbbcjgocdcadccjgocdcicpmikchaibbbmikdbcmikapbekimmb"
	; =========================================
And Pop. I stored the pointer to the shellcode variable in ESI. ECX acts as the offset from the beginning of the shellcode for both locations: ECX is the write offset and ECX*2 is the read offset. I used one register for both purposes in order to shorten the decoder stub. I chose the full ECX register as opposed to just CX so that the decoder can work for longer shellcodes.
decoder:

	pop esi                     ; ESI points to shellcode
	xor ecx, ecx                ; ECX is the offset to write location,
	                            ; ECX*2 is the offset to read location
Right after this, the decoder falls into the main loop. The decoding process is just the reverse of the encoding, so we need to:
  1. Subtract 0x61 from both bytes (letters)
  2. Combine the two nibbles back into one byte
  3. XOR with the key to retrieve the original byte
decode_loop:

	mov ax, word [esi+ecx*2]    ; al = first letter, ah = second letter (little-endian load)
	sub ax, 0x6161              ; al -= 97, ah -= 97
	shl al, 0x4                 ; al = 0xA -> 0xA0
	add al, ah                  ; ah = 0xB -> al = 0xAB

	; === PASTE KEY FROM ENCODER HERE ===
	xor al, 0x41                ; al = 0xAB ^ key
	; ===================================
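The register arithmetic can be checked outside the assembler. The sketch below models one pass of the loop body in Python; `decode_pair` is a hypothetical name used only for this illustration. Because x86 is little-endian, the 16-bit load puts the first letter in AL and the second letter in AH, and since both letters are at least 0x61, the single `sub ax, 0x6161` never borrows from AH into AL's subtraction.

```python
import struct


def decode_pair(pair, key):
    """Model of the loop body for one two-letter pair (illustrative only)."""
    (ax,) = struct.unpack("<H", pair)   # mov ax, word [...]: little-endian load
    ax = (ax - 0x6161) & 0xFFFF         # sub ax, 0x6161
    al, ah = ax & 0xFF, ax >> 8         # al = first letter - 97, ah = second - 97
    al = (al << 4) & 0xFF               # shl al, 4
    al = (al + ah) & 0xFF               # add al, ah
    return al ^ key                     # xor al, key


print(hex(decode_pair(b"ha", 0x41)))  # 0x31
```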
Once the byte is decoded, it is written back into the memory space of the shellcode variable, overwriting letters that have already been read. The offset ECX is then compared with the length of the shellcode (0x1d in this case) to see whether all bytes have been decoded. JNAE jumps if ECX is not above or equal to 0x1d, in other words while ECX is still below the length. Once all bytes are decoded, the decoder jumps into the shellcode variable and hands over control.
	mov byte [esi+ecx], al      ; shellcode[j] = al
	inc ecx                     ; Increment offset(s) to read/write locations

	; === PASTE LENGTH FROM ENCODER HERE ===
	cmp ecx, 0x1d
	; ======================================
	
	jnae decode_loop            ; Loop if i < len(shellcode)
	jmp short shellcode         ; Hand over control to decoded shellcode
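
To sanity-check the whole loop without running the stub, here is a rough Python model of it using the string, key and length from the listing above. The `bytearray` stands in for the memory at ESI, and `decode_in_place` is a hypothetical name. The model also shows why decoding in place is safe: the write offset (ECX) never gets ahead of the read offset (ECX*2).

```python
ENCODED = b"haibpbcjbbcjgocdcadccjgocdcicpmikchaibbbmikdbcmikapbekimmb"
KEY = 0x41
LENGTH = 0x1D


def decode_in_place(buf, length, key):
    """Illustrative model of decode_loop; buf plays the role of memory at ESI."""
    ecx = 0
    while True:
        first, second = buf[ecx * 2], buf[ecx * 2 + 1]   # read at ESI+ECX*2
        value = (((first - 0x61) << 4) + (second - 0x61)) & 0xFF
        buf[ecx] = value ^ key                           # write at ESI+ECX
        ecx += 1
        if not ecx < length:                             # jnae: loop while ECX < length
            break
    return bytes(buf[:length])


decoded = decode_in_place(bytearray(ENCODED), LENGTH, KEY)
print(decoded.hex())
```

The decoded bytes begin with 31 c0 (`xor eax, eax`) and end with cd 80 (`int 0x80`), which is a quick sign that the key and length were pasted correctly.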

# Compiling

To test it all out, I threw together a bash / Python3 script which:
  1. Compiles / Links the decoder.nasm file
  2. Dumps the shellcode of decoder
  3. Pastes this shellcode into a shellcode runner C file
  4. Compiles the shellcode runner C file
One thing to note is this: dumping the shellcode of the decoder with objdump did not work for some reason (for key values which are smaller than the length of the shellcode). One byte was missing from the output, which caused me a lot of confusion: the result decoded incorrectly, and I assumed that my decoder was at fault rather than the dumped shellcode itself. To remedy this, I wrote a Python3 script which dumps the shellcode and included it in the compile.sh script.
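
The formatting half of such a dump script can be quite small. The sketch below is not my actual script; it assumes the raw bytes of the .text section are already available as a `bytes` object (for example read from a flat binary produced by `objcopy -O binary -j .text`, which is only one possible way to get them), and both function names are hypothetical. Counting the escapes afterwards is a cheap guard against the missing-byte problem described above.

```python
def to_c_string(raw):
    """Format raw bytes as the body of a C string literal."""
    return "".join(f"\\x{b:02x}" for b in raw)


def count_escapes(literal):
    """Number of \\xNN escapes in the literal; should equal the byte count."""
    return literal.count("\\x")


raw = bytes.fromhex("31c0cd80")  # placeholder bytes, not the real decoder stub
literal = to_c_string(raw)
print(literal, count_escapes(literal) == len(raw))
```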

# SLAE32 Exam Statement

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://securitytube-training.com/online-courses/securitytube-linx-assembly-expert/
Student ID: PA-25640

All my code for the exam is available in my SLAE32 exam GitHub repository.