ARM Assembly Reference
About ARM Assembly Reference
This ARM Assembly Reference is a comprehensive instruction set guide covering ARM32 (ARMv7) and ARM64 (AArch64/ARMv8) architectures. It includes data transfer instructions (MOV, LDR/STR with pre/post-indexing, LDRB/LDRH byte/halfword variants, LDM/STM multiple register operations, ADR/LDR pseudo-instructions), arithmetic operations (ADD/SUB, MUL/MLA, SDIV/UDIV, ADC/SBC for 64-bit), and logical operations (AND/ORR/EOR/BIC, barrel shifter LSL/LSR/ASR/ROR, TST/TEQ, CLZ).
The reference covers branch and conditional execution including CMP/CMN comparisons, unconditional branch (B), conditional branches (BEQ/BNE, BGT/BLT for signed, BHI/BLO for unsigned), Thumb-2 extensions (CBZ/CBNZ, IT blocks), condition codes (EQ, NE, GT, LT, GE, LE, HI, LO, CS, CC, MI, PL), the S suffix for flag updates, and ARM mode conditional instruction execution.
Advanced topics include function calling (BL, AAPCS convention with R0-R3 arguments, R4-R11 callee-saved, 8-byte SP alignment), system calls (SVC), register architecture (R0-R15, CPSR flags N/Z/C/V, VFP/NEON S0-S31/D0-D31/Q0-Q15), Thumb/Thumb-2 16/32-bit instruction modes, NEON SIMD load/store and parallel arithmetic, and AArch64 64-bit instructions with X0-X30 registers, LDP/STP pairs, and RET.
Key Features
- Data transfer: MOV, LDR/STR with pre/post-indexing, LDRB/LDRH/LDRSB/LDRSH byte/halfword variants, LDM/STM/PUSH/POP
- Arithmetic operations: ADD/SUB/RSB, MUL/MLA/UMULL/SMULL, SDIV/UDIV (ARMv7 and later; optional on ARMv7-A cores), ADC/SBC for 64-bit carry chains
- Logic and shifts: AND/ORR/EOR/BIC masking, LSL/LSR/ASR/ROR barrel shifter, TST/TEQ bit testing, CLZ leading zeros
- Branch instructions: B/BL/BX/BLX, conditional BEQ/BNE/BGT/BLT/BGE/BLE/BHI/BLO, CBZ/CBNZ, IT blocks
- AAPCS calling convention: R0-R3 arguments, R0 return, R4-R11 callee-saved, 8-byte SP alignment, SVC system calls
- Register reference: R0-R15 (SP/LR/PC), CPSR flags (N/Z/C/V/T), VFP S0-S31, NEON D0-D31 and Q0-Q15
- NEON SIMD: VLD1/VST1 load/store, VADD/VSUB/VMUL parallel arithmetic, VCEQ/VCGT compare, VAND/VORR/VEOR logic
- AArch64 (ARMv8): X0-X30/W0-W30 registers, XZR zero register, LDP/STP pair operations, RET, SVC system calls
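The logic and shift entries in the list above can be illustrated with a small C model. This is only a sketch of the 32-bit semantics, not code from the reference itself: the function names are hypothetical, and the shift amount is assumed to be in the range 0 to 31 (the register-specified shift forms, which accept larger amounts, are not modelled).

```c
#include <stdint.h>

/* Illustrative C models of 32-bit ARM shift and bit operations.
   Function names are hypothetical; n is assumed to be in 0..31. */

/* LSL: logical shift left, zeros shifted in on the right. */
static uint32_t lsl(uint32_t x, unsigned n) { return x << n; }

/* LSR: logical shift right, zeros shifted in on the left. */
static uint32_t lsr(uint32_t x, unsigned n) { return x >> n; }

/* ASR: arithmetic shift right, the sign bit is replicated on the left. */
static uint32_t asr(uint32_t x, unsigned n) {
    uint32_t fill = (x & 0x80000000u) ? (uint32_t)~(0xFFFFFFFFu >> n) : 0u;
    return (x >> n) | fill;
}

/* ROR: rotate right, bits leaving on the right re-enter on the left. */
static uint32_t ror(uint32_t x, unsigned n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}

/* BIC: bit clear, i.e. AND with the complement of the mask. */
static uint32_t bic(uint32_t x, uint32_t mask) { return x & ~mask; }

/* CLZ: count leading zeros; returns 32 for an input of 0. */
static unsigned clz(uint32_t x) {
    unsigned count = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1) {
        count++;
    }
    return count;
}
```

The difference between LSR and ASR only shows for values with the top bit set, which is why ASR is the one to use when dividing signed values by powers of two.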
Frequently Asked Questions
What ARM data transfer instructions are covered?
The reference covers MOV for register-to-register moves and immediates, LDR/STR with multiple addressing modes (base, offset, pre-index with !, post-index), byte variants LDRB/STRB and LDRSB for sign extension, halfword variants LDRH/STRH and LDRSH, multiple register LDM/STM (LDMIA/STMDB with ! for writeback), PUSH/POP shortcuts, and ADR/LDR pseudo-instructions for PC-relative addressing and literal pool constants.
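To make the three LDR/STR addressing modes concrete, here is a minimal C model. It is a sketch under stated assumptions: `mem` is a hypothetical byte array standing in for memory, `rn` stands in for the base register, and all function names are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: mem stands in for memory, rn for the base register. */

/* LDR Rt, [Rn, #off]  : offset addressing, the base register is unchanged. */
static uint32_t ldr_offset(const uint8_t *mem, uint32_t rn, int32_t off) {
    uint32_t value;
    memcpy(&value, mem + (uint32_t)(rn + (uint32_t)off), sizeof value);
    return value;
}

/* LDR Rt, [Rn, #off]! : pre-index, the base is updated first, then used. */
static uint32_t ldr_pre(const uint8_t *mem, uint32_t *rn, int32_t off) {
    *rn += (uint32_t)off;
    return ldr_offset(mem, *rn, 0);
}

/* LDR Rt, [Rn], #off  : post-index, the old base is used, then updated. */
static uint32_t ldr_post(const uint8_t *mem, uint32_t *rn, int32_t off) {
    uint32_t value = ldr_offset(mem, *rn, 0);
    *rn += (uint32_t)off;
    return value;
}
```

Post-indexing is the natural fit for walking an array (load, then advance), while pre-indexing with writeback matches stack-style accesses where the pointer moves before the transfer.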
How are arithmetic and multiply instructions documented?
The reference covers ADD/SUB with register and immediate operands, ADDS/SUBS with flag updates, RSB (reverse subtract, commonly used for negation), MUL for 32-bit multiply, MLA for multiply-accumulate, UMULL/SMULL for 64-bit unsigned/signed results, SDIV/UDIV for hardware division (available from ARMv7, but optional on ARMv7-A, so support depends on the core), and ADC/SBC for 64-bit addition/subtraction using carry chains across register pairs.
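The carry chain behind 64-bit ADDS/ADC and SUBS/SBC can be sketched in C as follows. The `pair32` type and function names are hypothetical; the model only tracks the C flag. Note that on ARM the C flag after a subtraction means "no borrow", which is why SBC subtracts the inverse of the carry.

```c
#include <stdint.h>

/* Hypothetical register pair: lo and hi hold the two halves of a 64-bit value. */
typedef struct { uint32_t lo, hi; } pair32;

/* Models: ADDS lo, lo, b.lo  followed by  ADC hi, hi, b.hi */
static pair32 add64_via_adc(pair32 a, pair32 b) {
    pair32 r;
    r.lo = a.lo + b.lo;                 /* ADDS: low words */
    uint32_t carry = (r.lo < a.lo);     /* C = 1 on unsigned overflow */
    r.hi = a.hi + b.hi + carry;         /* ADC: high words plus carry */
    return r;
}

/* Models: SUBS lo, lo, b.lo  followed by  SBC hi, hi, b.hi */
static pair32 sub64_via_sbc(pair32 a, pair32 b) {
    pair32 r;
    r.lo = a.lo - b.lo;                 /* SUBS: low words */
    uint32_t carry = (a.lo >= b.lo);    /* C = 1 when there is NO borrow */
    r.hi = a.hi - b.hi - (1u - carry);  /* SBC: Rn - Op2 - NOT(C) */
    return r;
}
```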
What branch and conditional execution features are explained?
The reference covers B (unconditional), BL (function call saving the return address in LR), BX/BLX (indirect branches with Thumb interworking), conditional branches for equality (BEQ/BNE), signed comparisons (BGT/BLT/BGE/BLE), and unsigned comparisons (BHI/BLO/BHS/BLS), Thumb-2 CBZ/CBNZ for zero checks, IT blocks (ITE/ITTT) for conditional execution sequences, all 15 condition codes (EQ through AL), and the S suffix for CPSR flag updates.
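Why signed and unsigned comparisons need different condition codes becomes clearer with a model of how CMP sets the N, Z, C, and V flags. The sketch below is illustrative only; the `flags` type and function names are hypothetical, and only four of the condition codes are shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the four condition flags. */
typedef struct { bool n, z, c, v; } flags;

/* CMP Rn, Op2: computes Rn - Op2, discards the result, and sets NZCV. */
static flags cmp(uint32_t rn, uint32_t op2) {
    uint32_t res = rn - op2;
    flags f;
    f.n = (res >> 31) != 0;                         /* result is negative */
    f.z = (res == 0);                               /* result is zero */
    f.c = (rn >= op2);                              /* no unsigned borrow */
    f.v = (((rn ^ op2) & (rn ^ res)) >> 31) != 0;   /* signed overflow */
    return f;
}

/* Signed conditions compare N with V. */
static bool cond_gt(flags f) { return !f.z && f.n == f.v; }
static bool cond_lt(flags f) { return f.n != f.v; }

/* Unsigned conditions look at C (and Z). */
static bool cond_hi(flags f) { return f.c && !f.z; }
static bool cond_lo(flags f) { return !f.c; }
```

For example, comparing 0xFFFFFFFF with 1 satisfies both LT (the value is -1 when read as signed) and HI (it is the largest 32-bit value when read as unsigned), so picking BLT versus BLO changes the outcome.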
How is the AAPCS calling convention presented?
The AAPCS entry documents R0-R3 for the first four arguments with the remainder passed on the stack, R0 for 32-bit return values (R0-R1 for 64-bit), R4-R11 as callee-saved registers, R0-R3 and R12 as caller-saved, the 8-byte SP alignment requirement at function call boundaries, LR (R14) for the return address, and PC (R15) as the program counter. Function prologue/epilogue patterns are shown that PUSH callee-saved registers together with LR and POP them back together with PC to return.
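The register/stack split and the 8-byte alignment rule lend themselves to a little arithmetic. The following C sketch is a simplification with hypothetical function names: it assumes every argument is a single 32-bit word (64-bit arguments follow additional pairing rules that are not modelled here).

```c
#include <stdint.h>

/* Simplified AAPCS arithmetic; assumes all arguments are 32-bit words. */

/* Round a byte count up to the 8-byte SP alignment. */
static uint32_t align8(uint32_t bytes) { return (bytes + 7u) & ~7u; }

/* Bytes of stack needed for arguments beyond the four passed in R0-R3. */
static uint32_t stack_arg_bytes(uint32_t n_args) {
    return n_args > 4u ? (n_args - 4u) * 4u : 0u;
}

/* Bytes moved by a PUSH or POP of n_regs 32-bit registers. */
static uint32_t push_bytes(uint32_t n_regs) { return n_regs * 4u; }
```

One practical consequence: a prologue such as PUSH {R4-R6, LR} moves 16 bytes and keeps SP 8-byte aligned, whereas pushing an odd number of registers moves a multiple of 4 that is not a multiple of 8, which is why compilers often push one extra register as padding.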
What ARM register information is included?
Three register entries are covered: general-purpose R0-R15 with roles (R0-R3 arguments/scratch, R4-R11 callee-saved, R12 IP, R13 SP, R14 LR, R15 PC), CPSR status register flags (N negative, Z zero, C carry, V overflow, T Thumb mode, I/F interrupt bits, M[4:0] processor mode), and VFP/NEON floating-point/SIMD registers (S0-S31 single, D0-D31 double, Q0-Q15 quadword) with usage examples.
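As an illustration of the CPSR layout, the sketch below decodes the fields named above from a raw 32-bit value. The struct and function names are hypothetical; the bit positions used are N=31, Z=30, C=29, V=28, I=7, F=6, T=5, and M in bits 4:0.

```c
#include <stdint.h>

/* Hypothetical container for the CPSR fields discussed above. */
typedef struct {
    unsigned n, z, c, v;   /* condition flags */
    unsigned i, f;         /* IRQ and FIQ disable bits */
    unsigned t;            /* Thumb state bit */
    unsigned mode;         /* M[4:0] processor mode */
} cpsr_fields;

static cpsr_fields decode_cpsr(uint32_t cpsr) {
    cpsr_fields out;
    out.n = (cpsr >> 31) & 1u;
    out.z = (cpsr >> 30) & 1u;
    out.c = (cpsr >> 29) & 1u;
    out.v = (cpsr >> 28) & 1u;
    out.i = (cpsr >> 7) & 1u;
    out.f = (cpsr >> 6) & 1u;
    out.t = (cpsr >> 5) & 1u;
    out.mode = cpsr & 0x1Fu;
    return out;
}
```

For instance, the value 0x600000D3 decodes to Z and C set, IRQ and FIQ masked, ARM state, and mode 0x13 (Supervisor).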
How are Thumb and Thumb-2 instruction modes explained?
The Thumb entry covers 16-bit instruction encoding, BX-based ARM/Thumb switching via the least significant bit of the target address, 2-operand format constraints, limited immediate ranges, and register restrictions. The Thumb-2 entry covers mixed 16/32-bit encoding with near ARM-equivalent functionality, MOVW/MOVT for loading full 32-bit constants, ADDW for wide immediate ranges, and access to nearly all features previously limited to ARM mode.
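Two of these mechanisms are simple bit manipulations and can be modelled directly in C: splitting a 32-bit constant into a MOVW/MOVT pair, and the state selection done by BX. The sketch below is illustrative and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Immediate for MOVW: the low 16 bits of the constant. */
static uint16_t movw_imm16(uint32_t value) { return (uint16_t)(value & 0xFFFFu); }

/* Immediate for MOVT: the high 16 bits of the constant. */
static uint16_t movt_imm16(uint32_t value) { return (uint16_t)(value >> 16); }

/* MOVW Rd, #imm16: writes the low half and zeroes the high half. */
static uint32_t movw(uint16_t imm16) { return imm16; }

/* MOVT Rd, #imm16: writes the high half and preserves the low half. */
static uint32_t movt(uint32_t rd, uint16_t imm16) {
    return (rd & 0xFFFFu) | ((uint32_t)imm16 << 16);
}

/* BX Rm: bit 0 of the target selects Thumb (1) or ARM (0) state. */
static bool bx_selects_thumb(uint32_t target) { return (target & 1u) != 0; }

/* The address actually branched to has bit 0 cleared. */
static uint32_t bx_branch_address(uint32_t target) { return target & ~1u; }
```

Because MOVW clears the upper half while MOVT leaves the lower half alone, the pair must be issued in that order: MOVW first, then MOVT.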
What NEON SIMD operations are documented?
The NEON section covers load/store operations (VLD1.8/VLD1.32 for 8-byte loads into a D register or 16-byte loads into a Q register, VST1 for stores, multi-register VLD1 {D0-D3}), parallel arithmetic (VADD.I32 on 4x32-bit integer lanes of a Q register, VSUB.I16 on 4x16-bit lanes of a D register, VMUL.F32 on 4x32-bit float lanes, VABS for absolute value), and compare/logic operations (VCEQ/VCGT for comparisons that produce per-lane mask vectors, VAND/VORR/VEOR for bitwise operations).
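The lane-wise behaviour of these instructions can be sketched with a scalar C model. This is not NEON intrinsics code: `q_reg_u32` is a hypothetical stand-in for a Q register viewed as four 32-bit lanes, and the function names simply mirror the mnemonics.

```c
#include <stdint.h>

/* Hypothetical view of a 128-bit Q register as four 32-bit lanes. */
typedef struct { uint32_t lane[4]; } q_reg_u32;

/* VADD.I32 Qd, Qn, Qm: each lane is added independently, wrapping on overflow. */
static q_reg_u32 vadd_i32(q_reg_u32 a, q_reg_u32 b) {
    q_reg_u32 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* VCEQ.I32 Qd, Qn, Qm: each lane becomes all ones if equal, else all zeros. */
static q_reg_u32 vceq_i32(q_reg_u32 a, q_reg_u32 b) {
    q_reg_u32 r;
    for (int i = 0; i < 4; i++) r.lane[i] = (a.lane[i] == b.lane[i]) ? 0xFFFFFFFFu : 0u;
    return r;
}

/* VAND Qd, Qn, Qm: bitwise AND across the whole register. */
static q_reg_u32 vand(q_reg_u32 a, q_reg_u32 b) {
    q_reg_u32 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] & b.lane[i];
    return r;
}
```

The all-ones/all-zeros masks are what make compare results composable with the logic instructions: ANDing a vector with a VCEQ mask keeps the lanes that matched and zeroes the rest, with no branching.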
What AArch64 (ARMv8) instructions are included?
The AArch64 section covers the 64-bit register file (X0-X30 64-bit, W0-W30 as their 32-bit lower halves, XZR/WZR zero registers, V0-V31 128-bit SIMD/FP), basic instructions (ADD/SUB/LDR/STR with 64-bit operands, LDP/STP register pair operations for efficient push/pop), branch instructions (B/BL with X30 as LR, RET returning through X30 by default, CBZ/CBNZ, TBNZ for bit testing), and system calls using SVC #0 with the syscall number in X8 (the Linux convention).
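The relationship between X and W registers, and the tests performed by CBZ and TBNZ, can be sketched in C. The function names below are hypothetical. One detail worth modelling explicitly is that writing a W register zeroes the upper 32 bits of the corresponding X register rather than preserving them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reading Wn returns the low 32 bits of Xn. */
static uint32_t read_w(uint64_t x) { return (uint32_t)x; }

/* Writing Wn zero-extends into Xn; the previous upper half is discarded. */
static uint64_t write_w(uint64_t old_x, uint32_t value) {
    (void)old_x;   /* the old contents do not survive the write */
    return (uint64_t)value;
}

/* CBZ Xn, label: branch taken when the register is zero. */
static bool cbz_taken(uint64_t x) { return x == 0; }

/* TBNZ Xn, #bit, label: branch taken when the selected bit (0..63) is set. */
static bool tbnz_taken(uint64_t x, unsigned bit) { return ((x >> bit) & 1u) != 0; }
```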