x86 Assembly Reference
About x86 Assembly Reference
The x86 Assembly Reference is a searchable quick-reference covering the essential instructions of the x86 and x86-64 (AMD64) instruction set architecture using Intel syntax. The reference is organized into eight categories: Data Transfer (MOV, LEA, PUSH/POP, XCHG, MOVZX/MOVSX zero/sign extension, CMOVcc conditional moves), Arithmetic (ADD/SUB, INC/DEC, IMUL/MUL signed/unsigned multiplication, IDIV/DIV division, NEG negation, ADC/SBB add-with-carry and subtract-with-borrow for multi-word arithmetic such as 64-bit operations on 32-bit registers), Logic (AND/OR/XOR bitwise operations, NOT complement, SHL/SHR logical shifts, SAR/SAL arithmetic shifts, where SAL is the same operation as SHL, ROL/ROR rotation, TEST flag-only AND), Compare & Branch (CMP comparison, JMP unconditional jump, JE/JNE equality jumps, JG/JL/JGE/JLE signed comparison, JA/JB/JAE/JBE unsigned comparison, LOOP counter-based iteration), Function Calls (CALL/RET, cdecl and stdcall calling conventions, function prologue/epilogue patterns, ENTER/LEAVE shorthand), String Operations (REP MOVSB/D for memcpy, REP STOSB/D for memset, REPNE SCASB for byte search, REPE CMPSB for memcmp), Registers (EAX-EDX general purpose, ESI/EDI index, ESP/EBP stack, EFLAGS, EIP), and System (INT 0x80 Linux syscall, SYSCALL x64, NOP, CPUID, RDTSC).
Understanding x86 assembly is fundamental for reverse engineering, vulnerability research, malware analysis, compiler development, and low-level systems programming. When analyzing compiled binaries, security researchers need to read disassembly output to understand program flow, identify buffer overflows, trace function calls, and locate shellcode. Performance-critical code in game engines, cryptographic implementations, and operating system kernels often includes hand-written assembly for specific CPU features. The calling convention section (cdecl vs stdcall) is particularly important for understanding how functions pass arguments and clean up the stack, which is essential for writing exploits, debugging crashes, and interfacing with foreign function interfaces.
Each instruction entry includes the mnemonic, a clear description of its operation, and practical examples showing common usage patterns. The examples use realistic scenarios: LEA for fast multiplication (LEA EAX, [EAX+EAX*2] for EAX*3), XOR EAX,EAX for efficient register clearing, function prologue/epilogue patterns that appear in virtually every compiled function, and Linux/x64 system call sequences for sys_write. The register section documents the conventional roles of each register (EAX as accumulator/return value, ECX as counter, ESP/EBP for stack frames) along with the EFLAGS register bits (CF, ZF, SF, OF, DF, IF). All content is searchable with instant filtering and dark mode support.
Key Features
- Data transfer instructions: MOV, LEA effective address calculation, PUSH/POP stack operations, XCHG, MOVZX/MOVSX extensions, CMOVcc conditionals
- Arithmetic instructions: ADD/SUB, INC/DEC, IMUL/MUL signed/unsigned multiply, IDIV/DIV with CDQ/XOR EDX preparation, NEG, ADC/SBB carry operations
- Logic and shift instructions: AND/OR/XOR bitwise ops, NOT complement, SHL/SHR/SAR/SAL shifts, ROL/ROR rotation, TEST for flag-only testing
- Compare and branch: CMP, JMP (direct/indirect/table), conditional jumps for signed (JG/JL) and unsigned (JA/JB) comparisons, LOOP instruction
- Function call conventions: CALL/RET, cdecl (caller cleanup) vs stdcall (callee cleanup), complete prologue/epilogue stack frame patterns, ENTER/LEAVE
- String operations: REP MOVSB/D (memcpy), REP STOSB/D (memset), REPNE SCASB (byte search, as in memchr), REPE CMPSB (memcmp), with CLD clearing the direction flag for forward processing
- Register documentation: EAX-EDX roles, ESI/EDI string ops, ESP/EBP stack frame layout with argument/local variable offsets, EFLAGS bits, EIP
- System instructions: INT 0x80 (Linux 32-bit syscall), SYSCALL (x64), NOP/multi-byte NOP padding, CPUID for CPU identification, RDTSC timestamp counter
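The string operations listed above can be pictured with a small C model. This is only a sketch: the function names `rep_movsb` and `repne_scasb` and the `df` parameter are hypothetical stand-ins for the instructions and the direction flag, and the model ignores segment registers and the exact final register values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of REP MOVSB: copy ECX bytes from [ESI] to [EDI].
 * df mirrors the direction flag: 0 (after CLD) walks upward in memory,
 * 1 (after STD) walks downward. */
static void rep_movsb(uint8_t *edi, const uint8_t *esi, size_t ecx, int df)
{
    ptrdiff_t step = df ? -1 : 1;
    while (ecx != 0) {
        *edi = *esi;      /* MOVSB: move one byte */
        edi += step;      /* both pointers advance by one byte */
        esi += step;
        ecx--;            /* REP: decrement the counter until it reaches zero */
    }
}

/* Hypothetical model of REPNE SCASB with DF = 0: scan up to ECX bytes at [EDI]
 * for the value in AL. Returns the index of the first match, or ecx if no
 * byte matched. */
static size_t repne_scasb(const uint8_t *edi, uint8_t al, size_t ecx)
{
    size_t i = 0;
    while (i < ecx && edi[i] != al)
        i++;
    return i;
}
```

With `df = 0` the copy behaves like a forward memcpy, which is why compiled code typically executes CLD before a REP string instruction.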
Frequently Asked Questions
What x86 instructions does this reference cover?
The reference covers eight categories: Data Transfer (MOV, LEA, PUSH/POP, XCHG, MOVZX/MOVSX, CMOVcc), Arithmetic (ADD/SUB, INC/DEC, IMUL/MUL, IDIV/DIV, NEG, ADC/SBB), Logic (AND/OR/XOR, NOT, SHL/SHR, SAR/SAL, ROL/ROR, TEST), Compare & Branch (CMP, JMP, JE/JNE, JG/JL/JGE/JLE signed, JA/JB/JAE/JBE unsigned, LOOP), Function Calls (CALL/RET, cdecl, stdcall, prologue/epilogue, ENTER/LEAVE), String Operations (REP MOVSB/D, REP STOSB/D, REPNE SCASB, REPE CMPSB), Registers (EAX-EDX, ESI/EDI, ESP/EBP, EFLAGS, EIP), and System (INT 0x80, SYSCALL, NOP, CPUID, RDTSC).
What is the difference between cdecl and stdcall calling conventions?
In cdecl (the standard C calling convention), the caller pushes arguments right-to-left onto the stack and also cleans up the stack after the call (for example, ADD ESP, 8 for two 4-byte arguments). stdcall (used by the Windows API) also pushes arguments right-to-left, but the callee cleans the stack using RET N (e.g., RET 8). The return value is in EAX for both. cdecl supports variadic functions (like printf) because only the caller knows how many arguments it pushed, while stdcall produces slightly smaller code since the cleanup is encoded once in the callee rather than at every call site.
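The stack bookkeeping can be sketched as a toy C model that tracks only ESP around a call with two 4-byte arguments. The function names and the constants are hypothetical, chosen for illustration; nothing here executes real calls.

```c
#include <stdint.h>

/* Illustrative sizes: two 4-byte arguments and a 4-byte return address. */
enum { ARG_BYTES = 8, RET_ADDR_BYTES = 4 };

/* cdecl: the callee ends with a plain RET, then the caller runs ADD ESP, 8. */
static uint32_t esp_after_cdecl_call(uint32_t esp)
{
    esp -= ARG_BYTES;       /* PUSH arg2; PUSH arg1 (right to left) */
    esp -= RET_ADDR_BYTES;  /* CALL pushes the return address */
    esp += RET_ADDR_BYTES;  /* callee: RET pops the return address */
    esp += ARG_BYTES;       /* caller: ADD ESP, 8 */
    return esp;
}

/* stdcall: the callee ends with RET 8, which pops the return address
 * and then discards the 8 bytes of arguments. */
static uint32_t esp_after_stdcall_call(uint32_t esp)
{
    esp -= ARG_BYTES;                   /* PUSH arg2; PUSH arg1 */
    esp -= RET_ADDR_BYTES;              /* CALL pushes the return address */
    esp += RET_ADDR_BYTES + ARG_BYTES;  /* callee: RET 8 */
    return esp;
}
```

Both functions return the ESP value they started with; the only difference is which side of the call performs the final adjustment.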
How does the LEA instruction differ from MOV?
MOV copies data between registers and memory. LEA (Load Effective Address) computes an address using the x86 addressing modes but stores the computed address itself, not the data at that address, and never accesses memory. This makes LEA useful for fast arithmetic: LEA EAX, [EBX+ECX*4] computes EBX + ECX*4 in one instruction, LEA EAX, [EAX+EAX*2] multiplies EAX by 3, and LEA EAX, [EBP-0x10] computes a stack variable address. Unlike ADD or IMUL, LEA does not modify the EFLAGS register.
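As a rough illustration, the 32-bit effective-address calculation can be modeled in C. The helper name `lea32` is hypothetical; the model assumes the scale is one of 1, 2, 4, or 8, as the addressing mode requires, and wraps modulo 2^32 as the hardware does.

```c
#include <stdint.h>

/* Hypothetical model of LEA dst, [base + index*scale + disp] in 32-bit mode.
 * Only the address is computed; no memory is read and no flags change. */
static uint32_t lea32(uint32_t base, uint32_t index, uint32_t scale, int32_t disp)
{
    /* Unsigned arithmetic wraps modulo 2^32, matching 32-bit address math. */
    return base + index * scale + (uint32_t)disp;
}
```

For example, `lea32(5, 5, 2, 0)` models LEA EAX, [EAX+EAX*2] with EAX = 5 and yields 15, and `lea32(0x2000, 0, 1, -0x10)` models LEA EAX, [EBP-0x10] with EBP = 0x2000 and yields 0x1FF0.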
Why is XOR EAX, EAX used instead of MOV EAX, 0?
XOR EAX, EAX encodes as 2 bytes (31 C0) while MOV EAX, 0 encodes as 5 bytes (B8 00 00 00 00). Both set EAX to zero, but XOR is shorter, and modern CPUs recognize it as a zeroing idiom: it does not wait for the previous EAX value, so it breaks the dependency chain on the register. Compilers generally prefer XOR for register zeroing. However, XOR modifies flags (sets ZF and PF, clears CF, OF, and SF) while MOV does not, which matters in rare flag-sensitive contexts.
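The size difference and the flag side effect can be written down in a short C sketch. The array names, the `XorResult` struct, and `xor32` are hypothetical names used only for illustration; the byte values are the encodings quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Machine-code bytes for the two ways of zeroing EAX. */
static const uint8_t XOR_EAX_EAX[] = { 0x31, 0xC0 };                   /* 2 bytes */
static const uint8_t MOV_EAX_0[]   = { 0xB8, 0x00, 0x00, 0x00, 0x00 }; /* 5 bytes */

/* Hypothetical model of the XOR result plus three of the flags it writes. */
typedef struct {
    uint32_t value;
    bool zf, cf, of;
} XorResult;

static XorResult xor32(uint32_t a, uint32_t b)
{
    XorResult r;
    r.value = a ^ b;
    r.zf = (r.value == 0);  /* ZF reflects the result */
    r.cf = false;           /* XOR always clears CF */
    r.of = false;           /* XOR always clears OF */
    return r;
}
```

Calling `xor32(x, x)` for any `x` gives a zero value with ZF set, which is the flag change that a MOV-based zeroing would not make.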
What is a function prologue and epilogue in x86?
The prologue sets up a stack frame: PUSH EBP saves the caller's frame pointer, MOV EBP, ESP establishes the new frame pointer, and SUB ESP, N reserves space for local variables. Callee-saved registers (EBX, ESI, EDI) are pushed next. The epilogue reverses this: POP the saved registers, MOV ESP, EBP restores the stack, POP EBP restores the frame pointer, then RET returns. The shorthand ENTER N,0 replaces the three prologue instructions and LEAVE replaces the two epilogue instructions.
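The frame setup and teardown can be traced with a toy C model of ESP, EBP, and a small stack. The `Cpu` struct and all function names are hypothetical; the sketch omits the callee-saved register pushes and assumes addresses stay below 256 bytes.

```c
#include <stdint.h>

/* Toy machine state: two registers and a 256-byte stack addressed in
 * 4-byte words (byte address / 4 indexes mem). */
typedef struct {
    uint32_t esp, ebp;
    uint32_t mem[64];
} Cpu;

static void push32(Cpu *c, uint32_t v)
{
    c->esp -= 4;                 /* the stack grows toward lower addresses */
    c->mem[c->esp / 4] = v;
}

static uint32_t pop32(Cpu *c)
{
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

static void prologue(Cpu *c, uint32_t local_bytes)
{
    push32(c, c->ebp);           /* PUSH EBP */
    c->ebp = c->esp;             /* MOV EBP, ESP */
    c->esp -= local_bytes;       /* SUB ESP, N */
}

static void epilogue(Cpu *c)
{
    c->esp = c->ebp;             /* MOV ESP, EBP (first half of LEAVE) */
    c->ebp = pop32(c);           /* POP EBP (second half of LEAVE) */
}
```

Starting from ESP = 128 and EBP = 200, `prologue(&c, 16)` leaves EBP = 124 and ESP = 108, and `epilogue(&c)` restores both registers to their starting values, ready for RET.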
How do signed and unsigned conditional jumps differ?
After a CMP instruction, signed jumps (JG/JL/JGE/JLE) check the Sign Flag (SF), Overflow Flag (OF), and Zero Flag (ZF) to compare values interpreted as two's complement signed integers. Unsigned jumps (JA/JB/JAE/JBE) check the Carry Flag (CF) and Zero Flag (ZF) to compare values interpreted as unsigned integers. For example, after CMP EAX, 1 with EAX = 0xFFFFFFFF: as signed, EAX is -1, which is less than 1, so JL would jump; as unsigned, EAX is 4294967295, which is greater than 1, so JA would jump. Using the wrong jump type after CMP is a common source of bugs.
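The flag logic can be made concrete with a C model of a 32-bit CMP and four of the jump conditions. The `Flags` struct and the function names are hypothetical; the conditions themselves follow the standard definitions (JL: SF != OF, JG: ZF = 0 and SF = OF, JB: CF = 1, JA: CF = 0 and ZF = 0).

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool cf, zf, sf, of;
} Flags;

/* Hypothetical model of CMP a, b: compute a - b, keep only the flags. */
static Flags cmp32(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    Flags f;
    f.cf = a < b;                            /* unsigned borrow */
    f.zf = (r == 0);
    f.sf = (r >> 31) & 1;                    /* sign bit of the result */
    f.of = (((a ^ b) & (a ^ r)) >> 31) & 1;  /* signed overflow on subtraction */
    return f;
}

static bool jl(Flags f) { return f.sf != f.of; }           /* signed less */
static bool jg(Flags f) { return !f.zf && f.sf == f.of; }  /* signed greater */
static bool jb(Flags f) { return f.cf; }                   /* unsigned below */
static bool ja(Flags f) { return !f.cf && !f.zf; }         /* unsigned above */
```

For `cmp32(0xFFFFFFFF, 1)` the model reports both `jl` and `ja` as taken and `jg` and `jb` as not taken: the same bit pattern is less than 1 when read as signed and above 1 when read as unsigned.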
How does the Linux system call interface work in x86?
In 32-bit Linux, system calls use INT 0x80: place the syscall number in EAX (e.g., 4 for sys_write), arguments in EBX, ECX, EDX, ESI, EDI, EBP, and execute INT 0x80. The return value appears in EAX. In 64-bit Linux (x86-64), use SYSCALL: place the syscall number in RAX (e.g., 1 for sys_write), arguments in RDI, RSI, RDX, R10, R8, R9, and execute SYSCALL. The return value appears in RAX, and the instruction overwrites RCX and R11. The 64-bit interface is faster because SYSCALL avoids the interrupt descriptor table lookup used by INT 0x80.
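The two register assignments are easy to mix up, so here they are as a small C lookup table. This is only a memory aid: the enum, array, and function names are hypothetical, and the code describes the conventions rather than issuing any system call.

```c
#include <stddef.h>

typedef enum { ABI_I386_INT80, ABI_X86_64_SYSCALL } LinuxAbi;

/* Argument registers in order, first argument at index 0. */
static const char *const I386_ARGS[6]   = { "EBX", "ECX", "EDX", "ESI", "EDI", "EBP" };
static const char *const X86_64_ARGS[6] = { "RDI", "RSI", "RDX", "R10", "R8", "R9" };

/* Register that carries argument n (0-based), or NULL if n is out of range. */
static const char *syscall_arg_register(LinuxAbi abi, size_t n)
{
    if (n >= 6)
        return NULL;
    return abi == ABI_I386_INT80 ? I386_ARGS[n] : X86_64_ARGS[n];
}

/* Syscall number for sys_write: placed in EAX (32-bit) or RAX (64-bit). */
static int sys_write_number(LinuxAbi abi)
{
    return abi == ABI_I386_INT80 ? 4 : 1;
}
```

For a sys_write(fd, buf, len) call, the three arguments therefore go in EBX, ECX, EDX on 32-bit Linux and in RDI, RSI, RDX on x86-64.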
Is any data sent to a server when using this reference?
No. The complete x86 assembly instruction reference is embedded in the page and rendered entirely client-side. Searching by instruction name, filtering by category, and browsing entries all happen within your browser using JavaScript. No assembly code, search queries, or data are transmitted to any server.