ARM64 Assembly and Ghidra Analysis Basics

Notes from analyzing Roblox purchase system binaries. Covers ARM64 assembly fundamentals, Ghidra navigation patterns, and common C++ binary structures.

ARM64 Basics

Registers

ARM64 has two views of the same registers:

x0 through x30
    64-bit general-purpose registers
w0 through w30
    Lower 32 bits of the corresponding x register

Each w register is the low half of the same physical register as its x counterpart. Writing to a w register zeroes the upper 32 bits of that x register.
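
The two views can be modeled in a few lines. This is only an illustrative sketch; the Reg type and its member names are made up for this note and do not come from any tool or library.

```cpp
#include <cstdint>

// Model of one ARM64 general-purpose register: a 64-bit cell with two views.
struct Reg {
    std::uint64_t x = 0;  // full 64-bit view (xN)

    // Reading wN returns the low 32 bits of xN.
    std::uint32_t read_w() const { return static_cast<std::uint32_t>(x); }

    // Writing wN replaces the low 32 bits and zeroes bits 32-63 of xN.
    void write_w(std::uint32_t value) { x = value; }
};
```

This matters when reading decompiled code: a value that passes through a w register can no longer carry a full 64-bit pointer.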

Special Purpose Registers

sp
    Stack pointer - points to the top of the stack (the lowest address in use)
x29
    Frame pointer - marks the base of the current function's stack frame
x30
    Link register - holds the return address after a branch with link
x0-x7
    Function arguments; return values come back in x0 (and x1 for 16-byte results)

Common Instructions

stp x29, x30, [sp, #-0x20]!
    Store pair, pre-indexed - subtracts 0x20 from sp, then saves two registers at the new sp
ldp x29, x30, [sp], #0x20
    Load pair, post-indexed - restores two registers from sp, then adds 0x20 to sp
adrp x8, 0x104b10000
    Load page address - sets x8 to the base of the 4 KB page containing the target (PC-relative)
add x8, x8, #0x50
    Add immediate - supplies the low 12 bits to complete the address after adrp
bl FUN_1039d6b58
    Branch with link - calls the function and saves the return address in x30
ret
    Return - branches to the address in x30

The Stack

The stack stores local variables, saved registers, and return addresses. It grows downward (toward lower addresses), and sp must stay 16-byte aligned.

Typical function prologue:

stp x29, x30, [sp, #-0x20]!      ; Save frame pointer and return address
mov x29, sp                      ; Set new frame pointer
sub sp, sp, #0x450               ; Allocate stack space

Typical epilogue:

add sp, sp, #0x450               ; Free stack space
ldp x29, x30, [sp], #0x20        ; Restore frame pointer and return address
ret                              ; Return to caller
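
The sp arithmetic in the prologue and epilogue above can be traced by hand. The sketch below does it in code, assuming only the two adjustments shown (0x20 for the saved pair, 0x450 for locals). FrameTrace and trace_frame are names invented for this note.

```cpp
#include <cstdint>

// Values of sp and x29 at each step of the prologue/epilogue shown above.
struct FrameTrace {
    std::uint64_t sp_after_stp;   // after stp x29, x30, [sp, #-0x20]!
    std::uint64_t frame_pointer;  // x29 after mov x29, sp
    std::uint64_t sp_after_sub;   // after sub sp, sp, #0x450
    std::uint64_t sp_at_ret;      // after add + ldp, just before ret
};

inline FrameTrace trace_frame(std::uint64_t sp_on_entry) {
    FrameTrace t{};
    t.sp_after_stp  = sp_on_entry - 0x20;             // pre-index: sp moves, then x29/x30 are stored
    t.frame_pointer = t.sp_after_stp;                 // x29 points at the saved x29/x30 pair
    t.sp_after_sub  = t.sp_after_stp - 0x450;         // locals live between sp and x29
    t.sp_at_ret     = t.sp_after_sub + 0x450 + 0x20;  // epilogue undoes both adjustments
    return t;
}
```

The function uses 0x470 bytes of stack in total, and sp at ret equals sp on entry, which is a quick sanity check when a patch touches the prologue.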

Ghidra Navigation

Function vs Label

FUN_
    Function entry point - callable subroutine
LAB_
    Label - branch target inside a function (loops, conditionals)

Cross References (XREF)

Right-click any symbol → References → Show References to. The results list every place where a function is called or data is accessed.

Example:

s_ProximityPrompt_Triggered_1040cc7a7    XREF[1]: 10191adc4(*)
1040cc7a7    "ProximityPrompt_Triggered"

XREF shows this string is referenced from address 0x10191adc4.

Use Search → For Strings to find text. Use Search → Memory (hex search) to find byte patterns or runs of consecutive zeros (code caves).

C++ Binary Patterns

Small String Optimization

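libc++ std::string layout (24 bytes; the inline buffer overlaps the heap fields):

Offset   Field           Purpose
+0x00    data pointer    Points to string data (if heap allocated)
+0x00    inline buffer   Small strings stored here, in place of the pointer and size
+0x08    size            String length (if heap allocated)
+0x17    flag byte       Bit 7 set = heap allocated, else inline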

Common pattern in decompiled code:

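if (local_71 < '\0') {
    FUN_1039f9090(local_88);  // Free heap-allocated string
}

This checks the sign bit of the flag byte to determine the storage type. The string object starts at local_88, so local_71 is the byte at offset 0x17 (0x88 - 0x71 = 0x17).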
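
The same test can be written against a raw 24-byte copy of the string object. This is a sketch under the layout assumed in these notes (flag byte at +0x17, bit 7 = heap). The helper names are hypothetical, and reading the inline length from the low 7 bits of the flag byte is an assumption about this libc++ build.

```cpp
#include <cstddef>

constexpr std::size_t kStringSize = 0x18;  // assumed size of the string object
constexpr std::size_t kFlagOffset = 0x17;  // last byte of the object

// Equivalent to `if (local_71 < '\0')`: a negative signed char means bit 7 is set.
inline bool is_heap_allocated(const unsigned char (&raw)[kStringSize]) {
    return (raw[kFlagOffset] & 0x80) != 0;
}

// Assumption: in inline mode the low 7 bits of the flag byte hold the length.
inline std::size_t inline_length(const unsigned char (&raw)[kStringSize]) {
    return raw[kFlagOffset] & 0x7F;
}
```

When dumping memory in a debugger, checking byte +0x17 first tells you whether the first 8 bytes are a pointer to follow or the characters themselves.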

Simple Getters

Libraries use accessor functions to keep structures opaque:

undefined8 FUN_103aaf480(long param_1)
{
    return *(undefined8 *)(param_1 + 0x40);
}

Reads the 8-byte value at offset 0x40 of the structure. This is a common pattern for internal data access.
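
Once a getter like this is found, it helps to write down a placeholder struct so later offsets have somewhere to go. The sketch below is hypothetical: only the field at +0x40 comes from the decompilation, and OpaqueObject, field_40 and get_field_40 are invented names.

```cpp
#include <cstddef>
#include <cstdint>

// Placeholder layout: everything before +0x40 is still unidentified.
struct OpaqueObject {
    unsigned char unknown[0x40];  // fields not yet identified
    std::uint64_t field_40;       // the value the getter returns
};
static_assert(offsetof(OpaqueObject, field_40) == 0x40, "field must sit at +0x40");

// Typed equivalent of FUN_103aaf480.
inline std::uint64_t get_field_40(const OpaqueObject* obj) {
    return obj->field_40;
}
```

The same layout can be entered in Ghidra's structure editor so the decompiler shows a field name instead of pointer arithmetic.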

Reference Counting

Shared pointer patterns:

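LOAcquire();
lVar2 = plVar3[1];
plVar3[1] = lVar2 + -1;
LORelease();
if (lVar2 == 0) {
    (**(code **)(*plVar3 + 0x10))(plVar3);
    __ZNSt3__119__shared_weak_count14__release_weakEv(plVar3);
}

Decrements the reference count atomically. libc++ stores the owner count minus one, so an old value of 0 (lVar2 == 0) means the last owner was just released. The code then calls the virtual function at vtable offset 0x10 to destroy the managed object and drops the weak reference with __release_weak.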
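
For comparison, here is a minimal source-level sketch of the same release logic using std::atomic. It assumes the libc++ convention that the counter stores the number of owners minus one. ControlBlock and its members are hypothetical stand-ins, not the real libc++ types.

```cpp
#include <atomic>

// Minimal stand-in for a shared-pointer control block.
struct ControlBlock {
    std::atomic<long> shared_owners{0};  // owners - 1, so 0 means one owner
    bool destroyed = false;

    void add_shared() { shared_owners.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when this call released the last owner.
    bool release_shared() {
        // fetch_sub returns the old value, like lVar2 in the decompilation.
        const long old = shared_owners.fetch_sub(1, std::memory_order_acq_rel);
        if (old == 0) {
            destroyed = true;  // stands in for the virtual call at vtable + 0x10
            return true;
        }
        return false;
    }
};
```

Comparing the old value with 0 rather than 1 is the detail that identifies this convention in a binary.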

HTTP Request Patterns

Host Resolution

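Each service has a wrapper function:

void Get_Host_Service(void) {
    Http_ResolveHost("service_name");
}

The core resolver constructs the full domain from the service name and environment. Ghidra shows the wrapper as returning void with no parameters, but on ARM64 a std::string result is written through a hidden pointer passed in x8, which is likely why the calling code treats the wrapper as producing the host string.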

Request Building

Standard pattern:

std::string host = Get_Host_Service();
std::string path = "/api/endpoint";
std::string url = Http_BuildUrl("https", host, path, query);
Http_SubmitRequest(ctx, url);

Subdomain Validation

The subdomain check validates that a host matches a suffix only at domain boundaries:

bool HostIsSubdomainOf(const std::string& host, const char* suffix)

Returns true if, ignoring an optional :port after the host name:

  • Host exactly matches suffix, or
  • Host ends with .suffix
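
A minimal sketch of a check with these rules, reusing the signature above. The body is a reconstruction from the described behavior, not the binary's code. It assumes case-sensitive ASCII comparison and no IPv6 literals.

```cpp
#include <cstddef>
#include <string>

// Sketch only: behavior inferred from the rules listed in these notes.
bool HostIsSubdomainOf(const std::string& host, const char* suffix) {
    const std::string want(suffix);
    if (want.empty()) return false;

    // Drop an optional ":port" before comparing.
    const std::string name = host.substr(0, host.find(':'));

    if (name == want) return true;                 // exact match
    if (name.size() <= want.size()) return false;  // too short to be "x.suffix"

    const std::size_t start = name.size() - want.size();
    // The suffix must follow a '.', so "evilexample.com" does not match "example.com".
    return name[start - 1] == '.' && name.compare(start, want.size(), want) == 0;
}
```

The boundary check is the important part: a plain ends-with test would accept any host that merely shares trailing characters with the suffix.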

Logging Infrastructure

FLog_LogFmt at 0x1039d6b58 - main logging function with 4578 cross-references.

Function signature:

void FLog_LogFmt(
    void* logger_ctx,      // Logger context
    void* logger_sink,     // Output sink
    int level,             // Log level (5 = info)
    const char* fmt,       // Format string
    size_t fmt_len,        // Format string length
    int tag,               // Category tag
    void* arg0,            // First format argument
    uint32_t nargs         // Number of arguments
)

Example usage:

FLog_LogFmt(_DAT_10507bd58, DAT_10507bd60, 5,
            "[FLog::WebLoginProtocol] {}", 0x1b, 0xd,
            &stringVar, 1);

Format strings use {} placeholders. Function is called with level 5 (info) for most operational logging.
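
One way to check the guessed parameter meanings is to compare the constants in a call against the literal itself. The sketch below does that at compile time for the example call. The helper names are made up, and the placeholder counter assumes plain {} with no escaping rules, which these notes do not cover.

```cpp
#include <cstddef>

// Length of a string literal without its terminating NUL.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) { return N - 1; }

// Counts "{}" placeholders in a literal (no escape handling).
template <std::size_t N>
constexpr std::size_t count_placeholders(const char (&fmt)[N]) {
    std::size_t count = 0;
    for (std::size_t i = 0; i + 1 < N; ++i) {
        if (fmt[i] == '{' && fmt[i + 1] == '}') ++count;
    }
    return count;
}

// The constants from the example call agree with the literal.
static_assert(literal_length("[FLog::WebLoginProtocol] {}") == 0x1b, "fmt_len");
static_assert(count_placeholders("[FLog::WebLoginProtocol] {}") == 1, "nargs");
```

A match on several call sites is good evidence that the fifth argument is the format length and the last is the argument count.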

Reflection System

Roblox uses a metaclass architecture for properties:

Property Setter Pattern

bool SetPropertyByName(
    Instance* instance,
    const char* propertyName,
    ClassDescriptor* cls,
    int flags,
    bool quiet)
{
    // Failure here corresponds to "Could not find class descriptor"
    ClassDescriptor* cd = FindClassDescriptor(cls);
    if (!cd) return false;

    // Failure here corresponds to "Could not find property descriptor"
    PropertyDescriptor* pd = FindPropertyDescriptor(cd, propertyName);
    if (!pd) return false;

    PropertyInstance* p = CreatePropertyInstance(instance, pd);
    p->flags = flags;
    return true;
}

Uses runtime lookup of class and property descriptors. Error messages "Could not find class descriptor" or "Could not find property descriptor" indicate reflection failures.

OpenSSL Components

Some functions were identified as OpenSSL's Base64 BIO filter based on:

  • Assert strings: assertion failed: ctx->buf_len <= (int)sizeof(ctx->buf)
  • File path: crypto/evp/bio_b64.c
  • Function patterns matching b64_write() and b64_read()

Context structure fields: buf_len, buf_off, tmp_len, init.

Analysis Techniques

String-Based Discovery

Start with known strings:

  1. Search for error messages, API paths, or feature names
  2. View cross-references to find usage
  3. Examine calling functions
  4. Rename functions based on context

Pattern Recognition

Host resolution wrappers all follow the same pattern:

void Get_Host_<Service>(void) {
    FUN_1039319c0("<service>");
}

HTTP request builders follow:

Get_Host_<Service>(&host);
Http_BuildUrl(&url, "https", host, path, query);
Http_SubmitRequest(ctx, url);

Function Renaming

Renaming improves readability. Example:

Before:

lVar2 = FUN_100242e24(param_1);

After renaming to GetLocalPlayer:

lVar2 = GetLocalPlayer(param_1);

Context from surrounding code and variable usage indicates function purpose.

Ghidra Patching

Code Caves

Code caves are regions of executable memory with unused space. Search for runs of consecutive 0x00 bytes in the __TEXT segment.

Requirements:

  • Must be in initialized memory (not ??)
  • Must have execute permission
  • Size depends on patch - typically need 0x40-0x80 bytes
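
Outside Ghidra, the same search can be run over a dumped segment. The sketch below scans a byte buffer for zero runs of a minimum length. It does not check permissions or whether the bytes are initialized in the file, so those requirements still have to be verified separately. Cave and find_code_caves are names invented for this note.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Cave {
    std::size_t offset;  // start of the zero run within the buffer
    std::size_t length;  // number of consecutive 0x00 bytes
};

// Returns every run of 0x00 bytes that is at least min_len long.
std::vector<Cave> find_code_caves(const std::vector<std::uint8_t>& bytes,
                                  std::size_t min_len) {
    std::vector<Cave> caves;
    std::size_t run_start = 0;
    std::size_t run_len = 0;
    // Iterate one past the end so a run that reaches the last byte is flushed.
    for (std::size_t i = 0; i <= bytes.size(); ++i) {
        const bool is_zero = i < bytes.size() && bytes[i] == 0x00;
        if (is_zero) {
            if (run_len == 0) run_start = i;
            ++run_len;
        } else {
            if (run_len > 0 && run_len >= min_len) caves.push_back({run_start, run_len});
            run_len = 0;
        }
    }
    return caves;
}
```

Calling it with min_len = 0x40 matches the lower bound given above. Add the segment's load address to each offset to get the address to jump to in Ghidra.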

Uninitialized vs Initialized

?? in Ghidra
    Uninitialized memory - cannot assemble code here
Hex bytes (blue)
    Initialized memory - can patch and export

Use Patch Program → Fill to initialize regions before assembling.

ARM64 Address Loading

A 64-bit address cannot fit in a single 4-byte instruction, so loading one takes a short sequence. Two methods:

Method 1 - adrp/add for page-relative:

adrp x3, 0x10507b000
add  x3, x3, #0xd58
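
The operands for Method 1 come from splitting the target at the 4 KB page boundary. A small sketch of that arithmetic follows. The helper names are hypothetical, and the pc argument stands for the address of the adrp instruction being patched in.

```cpp
#include <cstdint>

struct AdrpAdd {
    std::uint64_t page;    // value adrp leaves in the register (4 KB aligned)
    std::uint32_t offset;  // 12-bit immediate for the following add
};

// Splits an absolute target into its page base and low 12 bits.
constexpr AdrpAdd split_for_adrp(std::uint64_t target) {
    return AdrpAdd{target & ~std::uint64_t{0xFFF},
                   static_cast<std::uint32_t>(target & 0xFFF)};
}

// adrp encodes a signed page delta relative to the page of the instruction itself.
constexpr std::int64_t adrp_page_delta(std::uint64_t pc, std::uint64_t target) {
    return static_cast<std::int64_t>(target >> 12) - static_cast<std::int64_t>(pc >> 12);
}

// The pair shown above rebuilds 0x10507bd58.
static_assert(split_for_adrp(0x10507bd58).page == 0x10507b000, "adrp operand");
static_assert(split_for_adrp(0x10507bd58).offset == 0xd58, "add immediate");
```

Because the encoding is relative to the instruction's page, an adrp copied into a code cave on a different page must be reassembled at its new address. The delta is a signed 21-bit page count, so the reach is about ±4 GB.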

Method 2 - movz/movk for absolute:

movz x3, #0xB37C             // Bits 0-15 (clears the rest of x3)
movk x3, #0x063D, lsl #16    // Bits 16-31
movk x3, #0x0001, lsl #32    // Bits 32-47
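
The immediates for Method 2 are the 16-bit slices of the target, lowest first. The sketch below splits a value into those slices and rebuilds it the way the instruction sequence would. The function names are invented for this note.

```cpp
#include <array>
#include <cstdint>

// Splits a 64-bit value into four 16-bit chunks for shifts 0, 16, 32 and 48.
inline std::array<std::uint16_t, 4> split_for_movz_movk(std::uint64_t value) {
    return {static_cast<std::uint16_t>(value),
            static_cast<std::uint16_t>(value >> 16),
            static_cast<std::uint16_t>(value >> 32),
            static_cast<std::uint16_t>(value >> 48)};
}

// Rebuilds the register value: movz sets one chunk and zeroes the rest,
// each movk then fills in one more chunk and keeps the others.
inline std::uint64_t apply_movz_movk(const std::array<std::uint16_t, 4>& chunks) {
    std::uint64_t reg = chunks[0];  // movz xN, #chunk0
    for (int i = 1; i < 4; ++i) {
        reg |= static_cast<std::uint64_t>(chunks[i]) << (16 * i);  // movk xN, #chunk, lsl #(16*i)
    }
    return reg;
}
```

For the sequence above the chunks are 0xB37C, 0x063D, 0x0001 and 0x0000, giving 0x1063DB37C. The fourth movk can be omitted because movz already cleared bits 48-63. Unlike adrp/add, this sequence is position independent, which makes it convenient inside a code cave.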

Manual Analysis Workflow

Steps for tracing function purpose:

  1. Find relevant strings
  2. Check cross-references
  3. Examine calling context
  4. Decompile to C
  5. Ask ChatGPT about unfamiliar patterns
  6. Rename based on usage
  7. Trace related functions