How to Convert ARM Assembly Code to Machine Language

Table of Contents
- ARM Instruction Format
- ARM Opcodes
- ARM Condition Codes
- Example1: Converting ADD Instruction
- Example2: Converting SUB Instruction
- Example3: Converting an EOR Instruction
- Instruction Encoding with Shift Operations
- Example4: Converting Instruction with Shift
- Example5: Another Instruction with Shift
- Additional MOV Instructions with Shift Operations
- Example6: MOV Instruction with Shift
- Example7: MOV Instruction with Shift
- Conclusion
In this tutorial, we will learn how to convert ARM assembly code to machine language code and decode binary instructions back to assembly. This process is essential for understanding how ARM processors execute instructions at a low level.
ARM Instruction Format
An ARM instruction is divided into several fields:
Bits | Field | Description |
---|---|---|
31-28 | Condition | Determines if the instruction executes based on flags |
27-26 | Mode Bits | Specifies instruction type (Data Processing, Load/Store, etc.) |
25 | Immediate (I) | 1 if Operand-2 is an immediate value; 0 if Operand-2 is a register (with or without a shift) |
24-21 | Opcode | Defines the operation (ADD, SUB, MOV, etc.) |
20 | S (Set Flags) | Determines if condition flags are updated |
19-16 | Rn (Source Register) | First operand register |
15-12 | Rd (Destination Register) | Where the result is stored |
11-0 | Operand-2 | Second operand, which can be a register or immediate value |
Summary of the ARM instruction format:
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | Source1(Rn) | Destination(Rd) | Operand-2 |
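The field layout above maps directly onto shifts and ORs. The following is a minimal Python sketch of that packing; the function name `pack_data_processing` and its parameter names are hypothetical, chosen only for illustration. The demo values are the fields of ADD R1, R2, #12, which Example1 works through by hand.

```python
def pack_data_processing(cond, i, opcode, s, rn, rd, operand2):
    """Pack the data-processing fields into one 32-bit word.

    Bits 27-26 (the mode bits) are 00 for data processing,
    so they contribute nothing to the result.
    """
    # Reject values that would spill into a neighbouring field.
    for name, value, width in [
        ("cond", cond, 4), ("i", i, 1), ("opcode", opcode, 4), ("s", s, 1),
        ("rn", rn, 4), ("rd", rd, 4), ("operand2", operand2, 12),
    ]:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")

    return (
        (cond << 28)        # bits 31-28
        | (i << 25)         # bit 25
        | (opcode << 21)    # bits 24-21
        | (s << 20)         # bit 20
        | (rn << 16)        # bits 19-16
        | (rd << 12)        # bits 15-12
        | operand2          # bits 11-0
    )


# Fields of ADD R1, R2, #12.
word = pack_data_processing(cond=0b1110, i=1, opcode=0b0100, s=0,
                            rn=2, rd=1, operand2=12)
print(f"{word:032b}")   # 11100010100000100001000000001100
print(f"0x{word:08X}")  # 0xE282100C
```

The width check is a design choice for this sketch: without it, an oversized value would silently corrupt the field to its left.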
ARM Opcodes
ARM instructions use a 4-bit opcode to define the operation to be performed. Here are some common ARM opcodes:
Opcode | Instruction |
---|---|
0000 | AND |
0001 | EOR |
0010 | SUB |
0100 | ADD |
0101 | ADC |
0110 | SBC |
1010 | CMP |
1100 | ORR |
1101 | MOV / Shift |
1110 | BIC |
ARM Condition Codes
Each ARM instruction includes a 4-bit condition field that determines whether the instruction executes based on the processor's flags:
Condition Code | Meaning |
---|---|
0000 (EQ) | Z set (equal) |
0001 (NE) | Z clear (not equal) |
0010 (HS/CS) | C set (unsigned higher or same) |
0011 (LO/CC) | C clear (unsigned lower) |
0100 (MI) | N set (negative) |
0101 (PL) | N clear (positive or zero) |
0110 (VS) | V set (overflow) |
0111 (VC) | V clear (no overflow) |
1000 (HI) | C set and Z clear (unsigned higher) |
1001 (LS) | C clear or Z set (unsigned lower or same) |
1010 (GE) | N set and V set, or N clear and V clear (>=) |
1011 (LT) | N set and V clear, or N clear and V set (<) |
1100 (GT) | Z clear, and either N set and V set, or N clear and V clear (>) |
1101 (LE) | Z set, or N set and V clear, or N clear and V set (<=) |
1110 (AL) | Always execute |
1111 (NV) | Reserved |
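The condition table can also be read as a set of boolean tests on the N, Z, C, and V flags. The sketch below is one way to express it in Python; `condition_passes` is a hypothetical helper name, and the signed comparisons are written compactly as N == V (both set or both clear) and N != V.

```python
def condition_passes(cond, n, z, c, v):
    """Return True if the 4-bit condition code passes for the given flags."""
    checks = {
        0b0000: z,                    # EQ
        0b0001: not z,                # NE
        0b0010: c,                    # HS/CS
        0b0011: not c,                # LO/CC
        0b0100: n,                    # MI
        0b0101: not n,                # PL
        0b0110: v,                    # VS
        0b0111: not v,                # VC
        0b1000: c and not z,          # HI
        0b1001: (not c) or z,         # LS
        0b1010: n == v,               # GE
        0b1011: n != v,               # LT
        0b1100: (not z) and n == v,   # GT
        0b1101: z or n != v,          # LE
        0b1110: True,                 # AL
    }
    if cond not in checks:
        raise ValueError("reserved or invalid condition code")
    return bool(checks[cond])


# Flags as they might look after comparing two equal values:
# Z set, C set, N clear, V clear.
flags = dict(n=False, z=True, c=True, v=False)
print(condition_passes(0b0000, **flags))  # True  (EQ)
print(condition_passes(0b1100, **flags))  # False (GT)
print(condition_passes(0b1010, **flags))  # True  (GE)
```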
Example1: Converting ADD Instruction
Assembly Instruction
ADD R1, R2, #12
This means:
- ADD: Opcode for addition.
- R1: Destination register.
- R2: First operand register.
- #12: Immediate value (decimal 12).
Encoding the Instruction
Breaking down the encoding:
- Condition: 1110 (Always execute)
- Mode Bits: 00 (Data Processing instruction)
- I Bit: 1 (Operand-2 is an immediate value)
- Opcode: 0100 (ADD)
- S Bit: 0 (Do not update flags)
- Rn (Source Register): 0010 (R2)
- Rd (Destination Register): 0001 (R1)
- Operand-2: Immediate 000000001100 (12-bit binary of 12)
Final Machine Code
ADD R1, R2, #12
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | Source(R2) | Destination(R1) | Operand-2 |
1110 | 00 | 1 | 0100 | 0 | 0010 | 0001 | 000000001100 |
Written as a 32-bit binary:
11100010100000100001000000001100
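As a quick check, the 32-bit string can be grouped into 4-bit nibbles, each of which corresponds to one hexadecimal digit. The snippet below is a small illustrative sketch using only Python built-ins; the variable names are arbitrary.

```python
bits = "11100010100000100001000000001100"  # ADD R1, R2, #12

# Split into eight 4-bit groups; each group is one hexadecimal digit.
nibbles = [bits[i:i + 4] for i in range(0, len(bits), 4)]
print(" ".join(nibbles))
# 1110 0010 1000 0010 0001 0000 0000 1100

# int(..., 2) parses the binary string; the format spec pads to 8 hex digits.
word = int(bits, 2)
print(f"0x{word:08X}")  # 0xE282100C
```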
Decoding Machine Code Back to Assembly
Given a binary instruction like:
11100010100000100001000000001100
- Identify Condition: 1110 → Always execute
- Identify Opcode: 0100 → ADD
- Identify I Bit: 1 → Operand-2 is an immediate value
- Identify Registers: Rn = 0010 (R2), Rd = 0001 (R1)
- Identify Operand-2: Immediate value 000000001100 → 12
- Convert back to assembly:
ADD R1, R2, #12
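The decoding steps can be sketched in Python as a series of shift-and-mask operations. `decode_immediate_form` is a hypothetical helper written for this tutorial's immediate examples only. It assumes a three-operand opcode from the opcode table, writes the S suffix after the condition (as in SUBEQS), and rejects words whose upper four Operand-2 bits are set, because on real hardware those bits hold a rotation amount that this sketch does not model.

```python
THREE_OPERAND = {0b0000: "AND", 0b0001: "EOR", 0b0010: "SUB", 0b0100: "ADD",
                 0b0101: "ADC", 0b0110: "SBC", 0b1100: "ORR", 0b1110: "BIC"}
# Index = condition code; index 14 (AL) has no suffix.
COND_SUFFIX = ["EQ", "NE", "HS", "LO", "MI", "PL", "VS", "VC",
               "HI", "LS", "GE", "LT", "GT", "LE", ""]


def decode_immediate_form(word):
    """Decode a data-processing word with an immediate Operand-2."""
    cond = (word >> 28) & 0xF
    mode = (word >> 26) & 0x3
    i_bit = (word >> 25) & 0x1
    opcode = (word >> 21) & 0xF
    s_bit = (word >> 20) & 0x1
    rn = (word >> 16) & 0xF
    rd = (word >> 12) & 0xF
    operand2 = word & 0xFFF

    if mode != 0b00 or i_bit != 1:
        raise ValueError("not a data-processing instruction with an immediate")
    if cond == 0b1111 or opcode not in THREE_OPERAND:
        raise ValueError("condition or opcode not covered by this sketch")
    if operand2 > 0xFF:
        raise ValueError("upper Operand-2 bits set; not covered by this sketch")

    mnemonic = THREE_OPERAND[opcode] + COND_SUFFIX[cond] + ("S" if s_bit else "")
    return f"{mnemonic} R{rd}, R{rn}, #{operand2}"


word = int("11100010100000100001000000001100", 2)
print(decode_immediate_form(word))  # ADD R1, R2, #12
```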
What if Operand-2 is a register? For example,
ADD R1, R2, R3
Here the I bit is 0, because Operand-2 holds a register number (R3 = 0011) with no shift applied instead of an immediate value.
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | Source(R2) | Destination(R1) | Operand-2 |
1110 | 00 | 0 | 0100 | 0 | 0010 | 0001 | 000000000011 |
Written as a 32-bit binary:
11100000100000100001000000000011
Example2: Converting SUB Instruction
Assembly Instruction
SUBEQS R0, R1, #2
The EQ suffix selects condition 0000, and the S suffix sets the S bit to 1.
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | Source(R1) | Destination(R0) | Operand-2 |
0000 | 00 | 1 | 0010 | 1 | 0001 | 0000 | 000000000010 |
Example3: Converting an EOR Instruction
Convert the following ARM assembly code into machine language. Write the instruction in hexadecimal.
EORGTS R6, R3, #5
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | Source(R3) | Destination(R6) | Operand-2 |
1100 | 00 | 1 | 0001 | 1 | 0011 | 0110 | 000000000101 |
- Written as a 32-bit binary:
11000010001100110110000000000101
- Convert to hexadecimal:
0xC2336005
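Mnemonics such as EORGTS and SUBEQS bundle three fields: the opcode, the condition, and the S bit. The sketch below shows one way to split them apart and encode the immediate form. The names `split_mnemonic` and `encode_immediate` are hypothetical, the parser assumes the opcode-condition-S order used in this tutorial, and immediates are limited to 0-255 to keep the example simple.

```python
OPCODES = {"AND": 0b0000, "EOR": 0b0001, "SUB": 0b0010, "ADD": 0b0100,
           "ADC": 0b0101, "SBC": 0b0110, "ORR": 0b1100, "BIC": 0b1110}
CONDITIONS = {"EQ": 0, "NE": 1, "HS": 2, "CS": 2, "LO": 3, "CC": 3,
              "MI": 4, "PL": 5, "VS": 6, "VC": 7, "HI": 8, "LS": 9,
              "GE": 10, "LT": 11, "GT": 12, "LE": 13, "AL": 14}


def split_mnemonic(mnemonic):
    """Split e.g. 'EORGTS' into (opcode, cond, s)."""
    base, rest = mnemonic[:3].upper(), mnemonic[3:].upper()
    if base not in OPCODES:
        raise ValueError(f"unknown operation {base!r}")
    s = 0
    # A trailing S is the set-flags suffix only if what precedes it is
    # empty or a valid condition; this keeps CS, LS, VS and HS intact.
    if rest.endswith("S") and (rest[:-1] == "" or rest[:-1] in CONDITIONS):
        s, rest = 1, rest[:-1]
    if rest and rest not in CONDITIONS:
        raise ValueError(f"unknown condition {rest!r}")
    cond = CONDITIONS[rest] if rest else CONDITIONS["AL"]
    return OPCODES[base], cond, s


def encode_immediate(mnemonic, rd, rn, imm):
    """Encode '<mnemonic> Rd, Rn, #imm' with the I bit set."""
    opcode, cond, s = split_mnemonic(mnemonic)
    if not 0 <= imm <= 0xFF:
        raise ValueError("this sketch only handles immediates 0-255")
    return ((cond << 28) | (1 << 25) | (opcode << 21) | (s << 20)
            | (rn << 16) | (rd << 12) | imm)


print(f"0x{encode_immediate('EORGTS', rd=6, rn=3, imm=5):08X}")  # 0xC2336005
print(f"0x{encode_immediate('SUBEQS', rd=0, rn=1, imm=2):08X}")  # 0x02510002
```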
Instruction Encoding with Shift Operations
ARM instructions can include shift operations on the second operand. The encoding format changes based on whether the shift is immediate or register-defined.
Case 1: Immediate Shift (Shift by a Constant)
31-28 | 27-26 | 25 (I) | 24-21 | 20 (S) | 19-16 (Rn) | 15-12 (Rd) | 11-7 | 6-5 | 4 | 3-0 (Rm) |
---|---|---|---|---|---|---|---|---|---|---|
Condition | Mode Bits | 0 (Operand-2 is a register) | Opcode | Set Flags | Source Register | Destination Register | Shift Amount (5-bit) | Shift Type (2-bit) | 0 (shift by constant) | Register containing the value to be shifted |
Case 2: Register-Defined Shift (Shift by Register)
31-28 | 27-26 | 25 (I) | 24-21 | 20 (S) | 19-16 (Rn) | 15-12 (Rd) | 11-8 (Rs) | 7 | 6-5 | 4 | 3-0 (Rm) |
---|---|---|---|---|---|---|---|---|---|---|---|
Condition | Mode Bits | 0 (Operand-2 is a register) | Opcode | Set Flags | Source Register | Destination Register | Register holding shift amount (4-bit) | 0 | Shift Type (2-bit) | 1 (shift by register) | Register containing the value to be shifted |
Shift Types in ARM Encoding
ARM instructions can apply shifts to registers before using them as operands. The shift type is represented by two bits:
Shift Type | Meaning |
---|---|
00 | Logical Left (LSL) |
01 | Logical Right (LSR) |
10 | Arithmetic Right (ASR) |
11 | Rotate Right (ROR) |
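The two Operand-2 layouts differ only in bits 11-4. The sketch below builds the 12-bit field for each case; the function names are hypothetical, and the shift-by-constant helper assumes an amount between 0 and 31 without modelling any special meaning that particular shift types may give to an amount of 0.

```python
SHIFT_TYPES = {"LSL": 0b00, "LSR": 0b01, "ASR": 0b10, "ROR": 0b11}


def operand2_shift_by_constant(rm, shift, amount):
    """Case 1: bits 11-7 amount, bits 6-5 type, bit 4 = 0, bits 3-0 Rm."""
    if not 0 <= amount < 32:
        raise ValueError("shift amount must fit in 5 bits")
    return (amount << 7) | (SHIFT_TYPES[shift] << 5) | (0 << 4) | rm


def operand2_shift_by_register(rm, shift, rs):
    """Case 2: bits 11-8 Rs, bit 7 = 0, bits 6-5 type, bit 4 = 1, bits 3-0 Rm."""
    if not 0 <= rs < 16:
        raise ValueError("Rs must be a register number 0-15")
    return (rs << 8) | (0 << 7) | (SHIFT_TYPES[shift] << 5) | (1 << 4) | rm


# R2 shifted left by the constant 2, then by the amount held in R3.
print(f"{operand2_shift_by_constant(rm=2, shift='LSL', amount=2):012b}")
# 000100000010
print(f"{operand2_shift_by_register(rm=2, shift='LSL', rs=3):012b}")
# 001100010010
```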
Example4: Converting Instruction with Shift
Assembly Instruction:
ADD R0, R1, R2, LSL #2
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | Source(R1) | Destination(R0) | Operand-2 |
1110 | 00 | 0 | 0100 | 0 | 0001 | 0000 | 000100000010 |
Operand-2 Breakdown
- 00010 → Shift amount (Decimal 2)
- 00 → Shift Type (00 means Logical Left Shift LSL)
- 0 → Immediate flag (0 means immediate shift amount is used)
- 0010 → R2 (Register to be shifted)
What if the shift amount is a register? For example,
ADD R0, R1, R2, LSL R3
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | Source(R1) | Destination(R0) | Operand-2 |
1110 | 00 | 0 | 0100 | 0 | 0001 | 0000 | 001100010010 |
Operand-2 Breakdown
- 0011 → Rs, the register holding the shift amount (R3)
- 0 → Always 0 for register-defined shift
- 00 → Shift Type (00 means Logical Left Shift LSL)
- 1 → Immediate flag (1 means register-defined shift amount)
- 0010 → R2 (Register to be shifted)
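Going the other way, bit 4 of Operand-2 tells a decoder which of the two layouts to apply. The following sketch, with the hypothetical helper name `describe_operand2`, turns a 12-bit register-form Operand-2 back into assembly text for the two variants of this example.

```python
SHIFT_NAMES = ["LSL", "LSR", "ASR", "ROR"]  # index = 2-bit shift type


def describe_operand2(operand2):
    """Render a register-form Operand-2 (I bit = 0) as assembly text."""
    rm = operand2 & 0xF
    shift = SHIFT_NAMES[(operand2 >> 5) & 0x3]
    if (operand2 >> 4) & 0x1:
        # Register-defined shift: bit 7 must be 0, bits 11-8 hold Rs.
        if (operand2 >> 7) & 0x1:
            raise ValueError("bit 7 must be 0 for a register-defined shift")
        rs = (operand2 >> 8) & 0xF
        return f"R{rm}, {shift} R{rs}"
    # Shift by constant: bits 11-7 hold the 5-bit amount.
    amount = (operand2 >> 7) & 0x1F
    return f"R{rm}, {shift} #{amount}"


print(describe_operand2(0b000100000010))  # R2, LSL #2
print(describe_operand2(0b001100010010))  # R2, LSL R3
```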
Example5: Another Instruction with Shift
Convert the following ARM assembly code into machine language. Write the instruction in hexadecimal.
ADD R1, R2, R3, LSL #9
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | Source(R2) | Destination(R1) | Operand-2 |
1110 | 00 | 0 | 0100 | 0 | 0010 | 0001 | 010010000011 |
- Written as a 32-bit binary:
11100000100000100001010010000011
- Convert to hexadecimal:
0xE0821483
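The result can be cross-checked by assembling the word from its fields. This is a minimal sketch limited to ADD with a constant shift, condition AL, and S = 0; `encode_add_shifted` is a hypothetical name and the shift type is passed as its 2-bit code.

```python
def encode_add_shifted(rd, rn, rm, shift_type, amount):
    """Encode ADD Rd, Rn, Rm, <shift> #amount (always execute, flags untouched)."""
    if not 0 <= amount < 32 or not 0 <= shift_type < 4:
        raise ValueError("amount needs 5 bits, shift_type needs 2 bits")
    # Operand-2: amount in bits 11-7, type in bits 6-5, bit 4 = 0, Rm in bits 3-0.
    operand2 = (amount << 7) | (shift_type << 5) | rm
    return ((0b1110 << 28)      # condition AL
            | (0 << 25)         # I = 0: Operand-2 is a register
            | (0b0100 << 21)    # opcode ADD
            | (rn << 16)
            | (rd << 12)
            | operand2)


word = encode_add_shifted(rd=1, rn=2, rm=3, shift_type=0b00, amount=9)
print(f"{word:032b}")   # 11100000100000100001010010000011
print(f"0x{word:08X}")  # 0xE0821483
```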
Additional MOV Instructions with Shift Operations
In ARM assembly, instructions such as MOV with a shift allow you to move a register's content into another register while applying a shift. The MOV opcode is 1101.
For a MOV instruction using shift, the instruction format is as follows:
Register-Defined Shift (Shift by Register)
31-28 | 27-26 | 25 (I) | 24-21 (Opcode) | 20 (S) | 19-16 (Rn) | 15-12 (Rd) | 11-8 (Rs) | 7 | 6-5 (Shift Type) | 4 | 3-0 (Rm) |
---|---|---|---|---|---|---|---|---|---|---|---|
Condition | Mode Bits | 0 | 1101 (MOV) | 0 | 0000 (unused) | Destination | Register holding shift amount | 0 | Shift Type | 1 (shift by register) | Source (Register to be shifted) |
Immediate Shift (Shift by a Constant)
31-28 | 27-26 | 25 (I) | 24-21 (Opcode) | 20 (S) | 19-16 (Rn) | 15-12 (Rd) | 11-7 (Shift Amount) | 6-5 (Shift Type) | 4 | 3-0 (Rm) |
---|---|---|---|---|---|---|---|---|---|---|
Condition | Mode Bits | 0 | 1101 (MOV) | 0 | 0000 (unused) | Destination | Shift amount | Shift Type | 0 (shift by constant) | Source (Register to be shifted) |
Example6: MOV Instruction with Shift
LSL R9, R6, R7
This is interpreted as: MOV R9, R6, LSL R7
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | (unused) | Destination(R9) | Operand-2 |
1110 | 00 | 0 | 1101 | 0 | 0000 | 1001 | 011100010110 |
What if the shift amount is an immediate value? For example,
LSL R9, R6, #5
This is interpreted as: MOV R9, R6, LSL #5
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | (unused) | Destination(R9) | Operand-2 |
1110 | 00 | 0 | 1101 | 0 | 0000 | 1001 | 001010000110 |
Example7: MOV Instruction with Shift
ASR R6, R7, R3
This is interpreted as: MOV R6, R7, ASR R3
31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-0 |
---|---|---|---|---|---|---|---|
Condition | Mode Bits | Immediate | Opcode | Set Flags | (unused) | Destination(R6) | Operand-2 |
1110 | 00 | 0 | 1101 | 0 | 0000 | 0110 | 001101010111 |
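Because a shift mnemonic is treated here as a MOV with a shifted Operand-2, Examples 6 and 7 can share one encoder. The sketch below is illustrative only: `encode_shift_alias` is a hypothetical name, it assumes condition AL and S = 0, and it expects exactly one of `rs` (shift by register) or `amount` (shift by constant).

```python
SHIFT_TYPES = {"LSL": 0b00, "LSR": 0b01, "ASR": 0b10, "ROR": 0b11}
MOV_OPCODE = 0b1101


def encode_shift_alias(shift, rd, rm, rs=None, amount=None):
    """Encode '<shift> Rd, Rm, Rs' or '<shift> Rd, Rm, #amount' as a MOV."""
    if (rs is None) == (amount is None):
        raise ValueError("give exactly one of rs or amount")
    if rs is not None:
        # Register-defined shift: Rs in bits 11-8, bit 7 = 0, bit 4 = 1.
        operand2 = (rs << 8) | (SHIFT_TYPES[shift] << 5) | (1 << 4) | rm
    else:
        if not 0 <= amount < 32:
            raise ValueError("shift amount must fit in 5 bits")
        # Shift by constant: amount in bits 11-7, bit 4 = 0.
        operand2 = (amount << 7) | (SHIFT_TYPES[shift] << 5) | rm
    # Rn (bits 19-16) is unused by MOV and left as 0000.
    return (0b1110 << 28) | (MOV_OPCODE << 21) | (rd << 12) | operand2


print(f"0x{encode_shift_alias('LSL', rd=9, rm=6, rs=7):08X}")  # 0xE1A09716
print(f"0x{encode_shift_alias('ASR', rd=6, rm=7, rs=3):08X}")  # 0xE1A06357
```

Passing `amount=5` instead of `rs` selects the shift-by-constant layout for the same mnemonic.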
Conclusion
By understanding the ARM instruction format and condition codes, you can manually encode and decode ARM instructions. This process is essential for low-level programming, debugging, and understanding how ARM processors execute instructions.