ARM64 Assembly and Ghidra Analysis Basics
Notes from analyzing Roblox purchase system binaries. Covers ARM64 assembly fundamentals, Ghidra navigation patterns, and common C++ binary structures.
ARM64 Basics
Registers
ARM64 has two views of the same registers:
x0 through x30 - 64-bit general-purpose registers
w0 through w30 - Lower 32 bits of the corresponding x register
Writing to a w register clears the upper 32 bits of the x register. They reference the same physical register.
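The zero-extension rule is easy to get backwards, so here is a minimal sketch that models it with plain integers. The helper names read_w and write_w are made up for illustration; they are not Ghidra or ARM identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Value read through wN: the low 32 bits of xN.
constexpr std::uint32_t read_w(std::uint64_t x) {
    return static_cast<std::uint32_t>(x);
}

// New value of xN after writing `value` to wN. The old contents of xN
// play no part: the upper 32 bits are zeroed, not preserved.
constexpr std::uint64_t write_w(std::uint64_t /*old_x*/, std::uint32_t value) {
    return static_cast<std::uint64_t>(value);
}

static_assert(read_w(0xFFFFFFFF12345678ULL) == 0x12345678u, "w view is the low half");
static_assert(write_w(0xFFFFFFFF12345678ULL, 0xAABBCCDDu) == 0xAABBCCDDULL,
              "upper half is cleared");
```

This matters when reading decompiler output: a value that passed through a w register can be treated as a clean 32-bit quantity.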
Special Purpose Registers
sp - Stack pointer - points to the top of the stack
x29 - Frame pointer - marks the base of the current function's stack frame
x30 - Link register - holds the return address after a branch with link
x0-x7 - Function arguments; integer and pointer return values come back in x0
Common Instructions
stp x29, x30, [sp, #-0x20]! - Store pair - decrements sp by 0x20, then saves two registers at the new sp
ldp x29, x30, [sp], #0x20 - Load pair - restores two registers from sp, then increments sp by 0x20
adrp x8, 0x104b10000 - Load page address - computes the PC-relative base of a 4 KB memory page
add x8, x8, #0x50 - Add immediate - supplies the low 12 bits to complete the address after adrp
bl FUN_1039d6b58 - Branch with link - calls the function and saves the return address in x30
ret - Return - branches to the address in x30
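The adrp/add pair splits an address into a 4 KB page base and a 12-bit offset. A small sketch of that arithmetic, using the 0x104b10000 + 0x50 example from the list; the function names and the pc value 0x100000000 are assumptions chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kPageMask = 0xFFF;  // 4 KB pages: low 12 bits are the offset

// Page base that adrp leaves in the register for a given target address.
constexpr std::uint64_t page_of(std::uint64_t addr) { return addr & ~kPageMask; }

// Immediate the following add needs to reach the exact address.
constexpr std::uint64_t page_offset(std::uint64_t addr) { return addr & kPageMask; }

// Signed number of pages between the instruction's own page and the target
// page; this is the quantity adrp encodes, which is why it is PC-relative.
constexpr std::int64_t adrp_page_delta(std::uint64_t pc, std::uint64_t target) {
    return (static_cast<std::int64_t>(page_of(target)) -
            static_cast<std::int64_t>(page_of(pc))) / 4096;
}

static_assert(page_of(0x104b10050ULL) == 0x104b10000ULL, "adrp operand");
static_assert(page_offset(0x104b10050ULL) == 0x50, "add immediate");
static_assert(adrp_page_delta(0x100000000ULL, 0x104b10050ULL) == 0x4b10, "page delta");
```

Ghidra shows the already-resolved page address as the adrp operand, so in practice only the add immediate has to be combined by hand.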
The Stack
The stack stores local variables, saved registers, and return addresses. It grows downward (toward lower addresses).
Typical function prologue:
stp x29, x30, [sp, #-0x20]! ; Save frame pointer and return address
mov x29, sp ; Set new frame pointer
sub sp, sp, #0x450 ; Allocate stack space
Typical epilogue:
add sp, sp, #0x450 ; Free stack space
ldp x29, x30, [sp], #0x20 ; Restore frame pointer and return address
ret ; Return to caller
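To see how sp moves through the prologue and epilogue above, the following sketch replays the five stack-adjusting steps on a plain integer. The starting sp value in the checks is hypothetical; only the 0x20 and 0x450 adjustments come from the listing.

```cpp
#include <cassert>
#include <cstdint>

struct FrameTrace {
    std::uint64_t sp_after_stp;  // sp once x29/x30 have been saved
    std::uint64_t x29;           // frame pointer set by mov x29, sp
    std::uint64_t sp_in_body;    // sp after local space is allocated
    std::uint64_t sp_at_ret;     // sp once the epilogue has run
};

constexpr FrameTrace trace_frame(std::uint64_t sp) {
    FrameTrace t{};
    sp -= 0x20;  t.sp_after_stp = sp;  // stp x29, x30, [sp, #-0x20]!
    t.x29 = sp;                        // mov x29, sp
    sp -= 0x450; t.sp_in_body = sp;    // sub sp, sp, #0x450
    sp += 0x450;                       // add sp, sp, #0x450
    sp += 0x20;  t.sp_at_ret = sp;     // ldp x29, x30, [sp], #0x20
    return t;
}

// Hypothetical entry sp; the frame occupies 0x20 + 0x450 = 0x470 bytes below it.
static_assert(trace_frame(0x16fdff000ULL).sp_in_body == 0x16fdff000ULL - 0x470, "frame size");
static_assert(trace_frame(0x16fdff000ULL).sp_at_ret == 0x16fdff000ULL, "sp is restored");
```

The frame pointer ends up 0x450 bytes above sp for the whole function body, which is why locals show up at fixed offsets from either register.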
Ghidra Navigation
Function vs Label
FUN_- Function entry point - callable subroutine
LAB_- Label - branch target inside a function (loops, conditionals)
Cross References (XREF)
Right-click any symbol → Show References to. This lists every place where a function is called or a piece of data is accessed.
Example:
s_ProximityPrompt_Triggered_1040cc7a7 XREF[1]: 10191adc4(*)
1040cc7a7 "ProximityPrompt_Triggered"
XREF[1] means there is one reference: the string is used from address 0x10191adc4.
String Search
Search → For Strings finds text. Search → Memory finds byte patterns, including runs of consecutive zeros (code caves).
C++ Binary Patterns
Small String Optimization
libc++ std::string layout (0x18 bytes; the inline buffer shares storage with the heap fields):
| Offset | Field | Purpose |
|---|---|---|
| +0x00 | data pointer | Points to string data (if heap allocated) |
| +0x00 | inline buffer | Small strings stored here (if inline) |
| +0x08 | size | String length (if heap allocated) |
| +0x17 | flag byte | Bit 7 set = heap allocated, else inline |
Common pattern in decompiled code:
if (local_71 < '\0') {
FUN_1039f9090(local_88); // Free heap-allocated string
}
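This checks the sign bit of the flag byte to determine the storage type. local_71 sits 0x17 bytes above local_88 (0x88 - 0x71 = 0x17), so it is the flag byte of the string stored at local_88.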
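The same check can be reproduced on a raw 0x18-byte image of a string object. This is a sketch under assumptions: it takes the flag byte at +0x17 from the notes, and additionally assumes that the inline characters start at +0x00 and that the low 7 bits of the flag byte hold the inline length. All names are invented for illustration, and it deliberately does not touch a real std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kStringSize = 0x18;  // size of the string object in this layout
constexpr std::size_t kFlagOffset = 0x17;  // last byte of the object

// Bit test, as described in the notes: bit 7 set means heap allocated.
inline bool is_heap_string(const unsigned char* obj) {
    return (obj[kFlagOffset] & 0x80) != 0;
}

// The form the decompiler prints: a signed char compared against zero.
inline bool is_heap_string_signed(const unsigned char* obj) {
    return static_cast<signed char>(obj[kFlagOffset]) < 0;
}

// Assumed: for inline strings the low 7 bits of the flag byte are the length.
inline std::size_t inline_length(const unsigned char* obj) {
    return obj[kFlagOffset] & 0x7F;
}

// Builds the image of a short (inline) string for experimentation.
inline void make_inline_image(unsigned char* obj, const char* text) {
    const std::size_t len = std::strlen(text);
    assert(len < kFlagOffset);  // characters must fit in bytes 0x00..0x16
    std::memset(obj, 0, kStringSize);
    std::memcpy(obj, text, len);
    obj[kFlagOffset] = static_cast<unsigned char>(len);
}
```

Both predicates agree for every byte value, which is why `local_71 < '\0'` and "bit 7 set" describe the same test.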
Simple Getters
Libraries use accessor functions to keep structures opaque:
undefined8 FUN_103aaf480(long param_1)
{
return *(undefined8 *)(param_1 + 0x40);
}
Reads the 8-byte value at offset 0x40 of the structure. This is a common pattern for internal data access.
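When a getter like this turns up, it helps to write down the partial structure it implies. A minimal sketch, assuming nothing about the structure except the 8-byte field at +0x40; the type and field names are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Placeholder layout: only the field at +0x40 is known from the getter.
struct Opaque {
    unsigned char unknown[0x40];  // bytes 0x00..0x3f, not yet understood
    std::uint64_t field_40;       // the value the getter returns
};
static_assert(offsetof(Opaque, field_40) == 0x40, "field must sit at +0x40");

// Same read the decompiler shows: 8 bytes at base + 0x40.
inline std::uint64_t get_field_40(const void* base) {
    std::uint64_t value;
    std::memcpy(&value, static_cast<const unsigned char*>(base) + 0x40, sizeof value);
    return value;
}
```

Defining an equivalent structure in Ghidra's Data Type Manager and applying it to param_1 makes every other use of the same offset readable at once.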
Reference Counting
Shared pointer patterns:
LOAcquire();
lVar2 = plVar3[1];
plVar3[1] = lVar2 + -1;
LORelease();
if (lVar2 == 0) {
(**(code **)(*plVar3 + 0x10))(plVar3);
__ZNSt3__119__shared_weak_count14__release_weakEv(plVar3);
}
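Decrements the shared owner count atomically; LOAcquire and LORelease appear to be Ghidra's markers for the acquire/release ordering around the update. libc++ stores the owner count minus one, so a previous value of 0 means the last owner just went away. The code then calls the virtual function at vtable offset 0x10 to destroy the managed object and releases the weak count.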
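The "compare the old value with zero" detail is easier to follow in source form. Below is a simplified sketch of a control block that stores the owner count minus one, as libc++ does; the struct and its members are stand-ins, not the real std::__shared_weak_count.

```cpp
#include <atomic>
#include <cassert>

// Simplified control block. `owners` holds (owner count - 1), so a block
// with exactly one owner stores 0.
struct ControlBlock {
    std::atomic<long> owners{0};
    bool destroyed = false;  // stands in for the virtual call at vtable + 0x10

    void add_shared() { owners.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when this call released the last owner.
    bool release_shared() {
        const long previous = owners.fetch_sub(1, std::memory_order_acq_rel);
        if (previous == 0) {   // matches `if (lVar2 == 0)` in the listing
            destroyed = true;  // the real code destroys the managed object here
            return true;
        }
        return false;
    }
};
```

Seeing the old value tested against 0 rather than 1 is a quick way to recognize this libc++ pattern in other functions.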
HTTP Request Patterns
Host Resolution
Each service has a wrapper function:
void Get_Host_Service(void) {
Http_ResolveHost("service_name");
}
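The core resolver constructs the full domain from the service name and environment. The wrapper shows up as void in the decompiler even though it produces a string, most likely because the std::string result is returned through a hidden pointer argument.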
Request Building
Standard pattern:
std::string host = Get_Host_Service();
std::string path = "/api/endpoint";
std::string url = Http_BuildUrl("https", host, path, query);
Http_SubmitRequest(ctx, url);
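For reference, here is a guess at what the URL-building step does with its four inputs. build_url is a hypothetical stand-in for Http_BuildUrl: the real function's signature, escaping rules, and handling of empty parts are not known from the binary.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Http_BuildUrl: joins scheme, host, path and an
// optional query string. No escaping or validation is attempted.
inline std::string build_url(const std::string& scheme, const std::string& host,
                             const std::string& path, const std::string& query) {
    std::string url = scheme + "://" + host + path;
    if (!query.empty()) url += "?" + query;
    return url;
}
```

For example, build_url("https", "api.example.com", "/api/endpoint", "id=1") yields "https://api.example.com/api/endpoint?id=1"; the host name here is a placeholder.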
Subdomain Validation
The subdomain checking function validates whether a host matches a suffix at a domain boundary:
bool HostIsSubdomainOf(const std::string& host, const char* suffix)
Returns true if:
- Host exactly matches suffix, or
- Host ends with .suffix, and
- Optional :port allowed after suffix
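The rules above can be sketched as follows. This is a reconstruction from the described behavior, not the decompiled function: it assumes case-sensitive comparison, strips everything after the last ':' as the port, and ignores IPv6 literals. The example.com hosts are placeholders.

```cpp
#include <cassert>
#include <string>

inline bool HostIsSubdomainOf(const std::string& host, const char* suffix) {
    std::string name = host;
    // Drop an optional ":port" (assumption: no IPv6 literals).
    const std::string::size_type colon = name.rfind(':');
    if (colon != std::string::npos) name.erase(colon);

    const std::string want(suffix);
    if (want.empty() || name.size() < want.size()) return false;
    if (name.size() == want.size()) return name == want;  // exact match

    // Longer host: it must end with "." + suffix so that the match
    // starts at a domain boundary.
    const std::string::size_type start = name.size() - want.size();
    return name[start - 1] == '.' &&
           name.compare(start, std::string::npos, want) == 0;
}
```

The boundary check is the important part: "evilexample.com" ends with "example.com" but must not pass, because the character before the suffix is not a dot.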
Logging Infrastructure
FLog_LogFmt at 0x1039d6b58 - main logging function with 4578 cross-references.
Function signature:
void FLog_LogFmt(
void* logger_ctx, // Logger context
void* logger_sink, // Output sink
int level, // Log level (5 = info)
const char* fmt, // Format string
size_t fmt_len, // Format string length
int tag, // Category tag
void* arg0, // First format argument
uint32_t nargs // Number of arguments
)
Example usage:
FLog_LogFmt(_DAT_10507bd58, DAT_10507bd60, 5,
"[FLog::WebLoginProtocol] {}", 0x1b, 0xd,
&stringVar, 1);
Format strings use {} placeholders. Function is called with level 5 (info) for most operational logging.
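Two details of the call above can be checked in a few lines: the fmt_len argument is the length of the format string without its terminator, and a {} placeholder is replaced by an argument. format_one is a toy stand-in that handles a single string argument; the real formatter's behavior is not known beyond the {} syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length of a string literal without its terminating NUL.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) { return N - 1; }

static_assert(literal_length("[FLog::WebLoginProtocol] {}") == 0x1b,
              "matches the fmt_len argument in the example call");

// Replaces the first "{}" in fmt with arg; leaves fmt unchanged if there is none.
inline std::string format_one(const std::string& fmt, const std::string& arg) {
    std::string out = fmt;
    const std::string::size_type pos = out.find("{}");
    if (pos != std::string::npos) out.replace(pos, 2, arg);
    return out;
}
```

A length argument that matches the literal is a useful sanity check when assigning a signature to a logging function in Ghidra.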
Reflection System
Roblox uses a metaclass architecture for properties:
Property Setter Pattern
bool SetPropertyByName(
Instance* instance,
const char* propertyName,
ClassDescriptor* cls,
int flags,
bool quiet)
{
ClassDescriptor* cd = FindClassDescriptor(cls);
if (!cd) return false;
PropertyDescriptor* pd = FindPropertyDescriptor(cd, propertyName);
if (!pd) return false;
PropertyInstance* p = CreatePropertyInstance(instance, pd);
p->flags = flags;
return true;
}
Uses runtime lookup of class and property descriptors. Error messages "Could not find class descriptor" or "Could not find property descriptor" indicate reflection failures.
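The two-stage lookup and its two failure modes can be modeled with nested maps. Everything here is illustrative: the registry type, the enum, and the class and property names used with it are assumptions, not structures recovered from the binary.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct PropertyDescriptor { std::string name; };

struct ClassDescriptor {
    std::string name;
    std::unordered_map<std::string, PropertyDescriptor> properties;
};

// Hypothetical registry keyed by class name.
using Registry = std::unordered_map<std::string, ClassDescriptor>;

enum class Lookup { Ok, NoClass, NoProperty };

inline Lookup find_property(const Registry& reg, const std::string& cls,
                            const std::string& prop) {
    const auto c = reg.find(cls);
    if (c == reg.end()) return Lookup::NoClass;      // "Could not find class descriptor"
    const auto p = c->second.properties.find(prop);
    if (p == c->second.properties.end())
        return Lookup::NoProperty;                   // "Could not find property descriptor"
    return Lookup::Ok;
}
```

Which of the two error strings a call site references tells you which stage of the lookup that code path belongs to.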
OpenSSL Components
Some functions were identified as the OpenSSL Base64 BIO filter based on:
- Assert strings: assertion failed: ctx->buf_len <= (int)sizeof(ctx->buf)
- File path: crypto/evp/bio_b64.c
- Function patterns matching b64_write() and b64_read()
Context structure fields: buf_len, buf_off, tmp_len, init.
Analysis Techniques
String-Based Discovery
Start with known strings:
- Search for error messages, API paths, or feature names
- View cross-references to find usage
- Examine calling functions
- Rename functions based on context
Pattern Recognition
Host resolution wrappers all follow pattern:
void Get_Host_<Service>(void) {
FUN_1039319c0("<service>");
}
HTTP request builders follow:
Get_Host_<Service>(&host);
Http_BuildUrl(&url, "https", host, path, query);
Http_SubmitRequest(ctx, url);
Function Renaming
Renaming improves readability. Example:
Before:
lVar2 = FUN_100242e24(param_1);
After renaming to GetLocalPlayer:
lVar2 = GetLocalPlayer(param_1);
Context from surrounding code and variable usage indicates function purpose.
Ghidra Patching
Code Caves
Executable memory regions with unused space. Search for consecutive 0x00 bytes in __TEXT segment.
Requirements:
- Must be in initialized memory (not ??)
- Must have execute permission
- Size depends on patch - typically need 0x40-0x80 bytes
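The search for a cave is a scan for a long enough run of zero bytes. A minimal sketch over an in-memory byte buffer, for example a dumped __TEXT segment; the function name is made up, and the scan checks neither permissions nor whether the bytes are real padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the offset of the first run of at least min_len zero bytes,
// or bytes.size() when no such run exists (or min_len is 0).
inline std::size_t find_code_cave(const std::vector<std::uint8_t>& bytes,
                                  std::size_t min_len) {
    std::size_t run = 0;  // length of the zero run ending at the current byte
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        run = (bytes[i] == 0x00) ? run + 1 : 0;
        if (min_len > 0 && run >= min_len) return i + 1 - run;
    }
    return bytes.size();
}
```

ARM64 instructions are 4 bytes wide and 4-byte aligned, so a cave found this way should be rounded up to the next multiple of 4 before assembling into it.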
Uninitialized vs Initialized
?? in Ghidra - Uninitialized memory - cannot assemble code here
Hex bytes (blue) - Initialized memory - can patch and export
Use Patch Program → Fill to initialize regions before assembling.
ARM64 Address Loading
A 64-bit address cannot fit in a single 4-byte instruction, so it is built up across several. Two methods:
Method 1 - adrp/add for PC-relative, page-based addressing:
adrp x3, 0x10507b000
add x3, x3, #0xd58
Method 2 - movz/movk for absolute:
movz x3, #0xB37C // Bits 0-15
movk x3, #0x063D, lsl #16 // Bits 16-31
movk x3, #0x0001, lsl #32 // Bits 32-47
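To derive the three immediates for an arbitrary address, split it into 16-bit chunks. The sketch below does the split and rebuilds the value the way the movz/movk sequence does; the helper names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// 16-bit chunk of `value` that a movz/movk with `lsl #shift` would carry.
constexpr std::uint16_t chunk(std::uint64_t value, unsigned shift) {
    return static_cast<std::uint16_t>(value >> shift);
}

// Rebuilds the register value. movz zeroes the whole register before setting
// bits 0-15, so OR-ing in the later chunks matches what movk does here.
constexpr std::uint64_t movz_movk(std::uint16_t b0, std::uint16_t b16,
                                  std::uint16_t b32, std::uint16_t b48 = 0) {
    std::uint64_t x = b0;                         // movz x3, #b0
    x |= static_cast<std::uint64_t>(b16) << 16;   // movk x3, #b16, lsl #16
    x |= static_cast<std::uint64_t>(b32) << 32;   // movk x3, #b32, lsl #32
    x |= static_cast<std::uint64_t>(b48) << 48;   // movk x3, #b48, lsl #48 (if needed)
    return x;
}

static_assert(movz_movk(0xB37C, 0x063D, 0x0001) == 0x1063DB37CULL,
              "the three instructions above load 0x1063DB37C");
```

Unlike adrp/add, this sequence does not depend on where the patch is placed, which makes it the safer choice inside a code cave.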
Manual Analysis Workflow
Steps for tracing function purpose:
- Find relevant strings
- Check cross-references
- Examine calling context
- Decompile to C
- Ask ChatGPT about unfamiliar patterns
- Rename based on usage
- Trace related functions