Microprocessor Systems: 8086 and ARM Architecture Reference

CMPS201 Microprocessor Systems Reference: Front

8086 Architecture and Layer Stack

Layer Stack (Bottom to Top):

  • Physics → Transistors → Logic Gates → Microarchitecture (This Course) → Instruction Set Architecture (ISA) → Operating System → Application

Microprocessor (MPU) vs. Microcontroller (MCU)

| Feature | MPU | MCU |
|---|---|---|
| Type | CPU only | System on Chip (CPU, RAM, ROM, I/O) |
| External Hardware Needed? | Yes (Motherboard, RAM) | No (Just battery and crystal) |
| Example | Intel i7, AMD Ryzen | STM32, ATmega, Arduino |
| Use Case | PC or Server | Embedded (Appliances, Cars) |
| Analogy | Ferrari Engine | Toyota Corolla (Ready to drive) |

Von Neumann vs. Harvard Architecture

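| Feature | Von Neumann | Harvard |
|---|---|---|
| Buses | 1 shared (code and data) | 2 separate (code, data) |
| Bottleneck? | Yes (1-lane bridge) | No |
| Instruction fetch and data access | Serial (one waits) | Simultaneous |
| 8086? | Yes | No (but a modern i7 uses split, Harvard-style L1 caches internally) |

Warning: The 8086 is a Von Neumann machine (one shared bus for code and data). It is not a Harvard design.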

Bus Interface Unit (BIU) vs. Execution Unit (EU)

| Unit | Role | Components |
|---|---|---|
| BIU (Logistics Manager) | External world interface; fetches instructions; reads/writes data; calculates 20-bit physical address. | 6-byte Instruction Queue, Segment Registers (CS, DS, SS, ES), Instruction Pointer (IP), Address Generator. |
| EU (Worker/Brain) | No external pins; gets instructions from BIU queue; executes; writes flags. | Control Circuit (Decoder), Arithmetic Logic Unit (ALU), Flag Register. |

  • 6-byte Instruction Queue: Primitive pipelining. While the EU is busy executing, the BIU prefetches up to 6 bytes of upcoming instructions.
  • Warning: Control-transfer instructions (JMP, CALL, RET, taken conditional jumps) flush the queue (Branch Penalty/Pipeline Flush), so the BIU must restart fetching from the new address.

8086 Registers

General Purpose (16-bit; can split to High/Low byte)

| Register | Name | Critical Rule |
|---|---|---|
| AX (AH/AL) | Accumulator | MUL and DIV implicitly use AX (AL for 8-bit operands). Math specialist. |
| BX (BH/BL) | Base / Pointer | The only one of AX, BX, CX, DX usable as a memory pointer: [BX] is legal; [AX], [CX], and [DX] are not. |
| CX (CH/CL) | Counter | LOOP instructions automatically decrement CX and check for zero. |
| DX (DH/DL) | Data / I/O | Port address for I/O. In 16×16 MUL, the upper 16 bits go to DX and the lower to AX. |
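
The DX:AX split for a 16×16 multiply can be modeled in C. This is a minimal sketch, not 8086 code; the names `dx_ax_t` and `mul16` are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Result of an 8086-style unsigned 16x16 multiply. */
typedef struct {
    uint16_t dx; /* upper 16 bits of the 32-bit product */
    uint16_t ax; /* lower 16 bits of the 32-bit product */
} dx_ax_t;

/* Hypothetical helper: models "AX * src -> DX:AX". */
dx_ax_t mul16(uint16_t ax, uint16_t src)
{
    uint32_t product = (uint32_t)ax * (uint32_t)src;
    dx_ax_t r;
    r.dx = (uint16_t)(product >> 16);
    r.ax = (uint16_t)(product & 0xFFFFu);
    return r;
}
```

For example, `mul16(0x1234, 0x1000)` gives DX = 0x0123 and AX = 0x4000, since the full product is 0x01234000.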

Index and Pointer Registers

  • SI (Source Index): Used for string operations (MOVSB reads from [SI]).
  • DI (Destination Index): Used for string operations (MOVSB writes to [DI]).
  • SP (Stack Pointer): Points to the top of the stack. Avoid modifying it directly; let PUSH, POP, CALL, and RET manage it.
  • BP (Base Pointer): Used to access stack locals and parameters without push/pop.
  • IP (Instruction Pointer): Program Counter. Always paired with CS; it cannot take a segment override or be loaded with MOV, and changes only through jumps, calls, and returns.

Segment Registers

  • CS (Code Segment): Executable instructions.
  • DS (Data Segment): Global variables; default for [BX].
  • SS (Stack Segment): Function variables and stack.
  • ES (Extra Segment): Destination for string copies.

Warning: Valid pointer registers inside brackets [] are BX, BP, SI, and DI only. AX, CX, and DX are illegal there and cause assembler errors.

Flag Register (Resides in EU)

  • Z (Zero): Result equals 0.
  • N/S (Negative/Sign): Most significant bit of the result is 1 (negative when the result is read as signed).
  • C (Carry): Unsigned overflow.
  • O/V (Overflow): Signed overflow.

Flags are updated by arithmetic and logic instructions (ADD, SUB, CMP, AND, ...). Data moves such as MOV leave them unchanged.
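
How the four flags are derived from a 16-bit addition can be sketched in C. This is an illustrative model only; `flags_t` and `add16_flags` are hypothetical names, and only ADD is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool z; /* Zero: result is 0 */
    bool s; /* Sign: bit 15 of the result */
    bool c; /* Carry: unsigned overflow out of bit 15 */
    bool o; /* Overflow: signed overflow */
} flags_t;

/* Hypothetical model of a 16-bit ADD: stores the sum, returns the flags. */
flags_t add16_flags(uint16_t a, uint16_t b, uint16_t *result)
{
    uint32_t wide = (uint32_t)a + (uint32_t)b; /* keep bit 16 to see the carry */
    uint16_t r = (uint16_t)wide;
    /* Signed overflow: both inputs share a sign and the result's sign differs. */
    uint32_t same_sign_in = ~((uint32_t)a ^ (uint32_t)b) & 0x8000u;
    uint32_t sign_changed = ((uint32_t)a ^ (uint32_t)r) & 0x8000u;
    flags_t f;

    assert(result != NULL);
    f.z = (r == 0);
    f.s = (r & 0x8000u) != 0;
    f.c = (wide > 0xFFFFu);
    f.o = (same_sign_in != 0) && (sign_changed != 0);
    *result = r;
    return f;
}
```

Adding 0x7FFF + 0x0001 sets S and O but not C (signed overflow only), while 0xFFFF + 0x0001 sets Z and C but not O (unsigned overflow only).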

Memory Addressing and Endianness

  • Byte-addressable: Smallest unit is 1 Byte (8 bits).
  • 16-bit offset: Maximum 64 KB directly addressable within one segment.
  • 8086 20-bit address bus: 1 MB address space.
  • Intel Little Endian: Least Significant Byte (LSB) is stored at the lowest address.

Example: 0x12345678 at address 1000:

  • Address 1000: 78 (LSB)
  • Address 1001: 56
  • Address 1002: 34
  • Address 1003: 12 (MSB)

Exam Trap: “What byte is at address 1000?” The answer is 78, not 12. Big Endian would store the MSB first.
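
The byte layout above can be reproduced with a short C sketch that writes a value in little-endian order explicitly, so the result does not depend on the machine running it. The name `store_le32` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian order, regardless of host CPU. */
void store_le32(uint8_t *mem, uint32_t value)
{
    assert(mem != NULL);
    for (size_t i = 0; i < 4; i++) {
        /* byte i holds bits 8i..8i+7, so the LSB lands at the lowest address */
        mem[i] = (uint8_t)(value >> (8 * i));
    }
}
```

Calling `store_le32(mem, 0x12345678)` leaves `mem[0]` = 0x78 and `mem[3]` = 0x12, matching the table of addresses 1000 to 1003.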

Segmentation and Physical Addresses

Physical Address = (Segment × 16) + Offset

  • Equivalent to: (Segment << 4) + Offset.
  • Example: DS = 0x2000, Offset = 0x0050 → 0x20000 + 0x0050 = 0x20050.

| Segment | Default Register | Content |
|---|---|---|
| CS | IP | Code (Instructions) |
| DS | BX, SI, DI | Global Variables |
| SS | SP, BP | Stack |
| ES | DI | String Destination |

Segment Override: MOV AX, CS:[BX] uses CS instead of DS. Note: You cannot override IP; the CPU always fetches from CS.
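
The segment:offset formula can be checked with a small C function. This is a sketch; `physical_address` is a hypothetical name, and the 20-bit mask models the 8086's 20 address lines.

```c
#include <stdint.h>

/* Physical Address = (Segment << 4) + Offset, truncated to 20 bits. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    uint32_t sum = ((uint32_t)segment << 4) + (uint32_t)offset;
    return sum & 0xFFFFFu; /* only 20 address lines, so the sum wraps at 1 MB */
}
```

`physical_address(0x2000, 0x0050)` returns 0x20050. Different segment:offset pairs can name the same byte: `physical_address(0x2005, 0x0000)` also returns 0x20050.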

Addressing Modes

| Mode | Syntax | Speed | Notes |
|---|---|---|---|
| Immediate | MOV AX, 5 | Fast | Data is in the instruction; no memory fetch. |
| Register | MOV AX, BX | Fastest | In-CPU; zero memory access. |
| Direct | MOV AX, [1000H] | Slower | Hardcoded address (DS:1000H). Used for global variables. |
| Register Indirect | MOV AX, [BX] | Slower | Pointer. BX holds the address. Only BX, BP, SI, DI allowed. |
| Based + Indexed | MOV AL, [BX+SI] | Slower | Arrays. Base (BX/BP) + Index (SI/DI). The effective-address calculation adds clock cycles on the 8086 (roughly 7 to 8 for base + index). |

Warning: MOV AX, BX copies the value of BX. MOV AX, [BX] goes to the address stored in BX (pointer dereference).

Instruction Cycles and CISC vs. RISC

Fetch-Decode-Execute Cycle

  1. FETCH: PC address → Address Bus → Memory → Instruction → Instruction Register (IR). PC auto-increments.
  2. DECODE: Decoder reads opcode bits and activates hardware paths (ALU/MOV).
  3. EXECUTE: ALU performs math/logic. Write-back to destination register. Flags update.
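
The three steps above can be sketched as a loop in C. The machine below is a made-up toy, not the 8086: the opcodes `OP_LOAD`, `OP_ADD`, `OP_HALT`, the 2-byte instruction format, and the names `toy_cpu_t` and `toy_run` are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 2-byte instructions: [opcode][operand]. */
enum { OP_HALT = 0x00, OP_LOAD = 0x01, OP_ADD = 0x02 };

typedef struct {
    size_t  pc;   /* program counter: index of the next instruction byte */
    uint8_t acc;  /* accumulator */
    int     zero; /* zero flag, updated by ADD only */
} toy_cpu_t;

/* Runs until HALT, an unknown opcode, or the end of memory.
   Returns the number of LOAD/ADD instructions executed. */
int toy_run(toy_cpu_t *cpu, const uint8_t *mem, size_t mem_size)
{
    int executed = 0;
    assert(cpu != NULL && mem != NULL);
    while (cpu->pc + 1 < mem_size) {
        /* FETCH: read opcode and operand; PC auto-increments. */
        uint8_t opcode  = mem[cpu->pc++];
        uint8_t operand = mem[cpu->pc++];
        /* DECODE: select the hardware path from the opcode. */
        switch (opcode) {
        case OP_LOAD:                       /* EXECUTE: plain data move */
            cpu->acc = operand;
            break;
        case OP_ADD:                        /* EXECUTE: ALU op, then flags */
            cpu->acc = (uint8_t)(cpu->acc + operand);
            cpu->zero = (cpu->acc == 0);
            break;
        default:                            /* HALT or unknown opcode */
            return executed;
        }
        executed++;
    }
    return executed;
}
```

Running the program `{ OP_LOAD, 5, OP_ADD, 3, OP_HALT, 0 }` executes two instructions and leaves 8 in the accumulator.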

CISC (x86) vs. RISC (ARM)

| Feature | CISC (8086) | RISC (ARM) |
|---|---|---|
| Philosophy | Hardware does complex work | Software breaks tasks into simple steps |
| Instruction Size | Variable length | Fixed length (32-bit in classic ARM; Thumb-2 on Cortex-M mixes 16- and 32-bit) |
| CPI | > 1 (Many cycles per instruction) | ~1 (Goal) |
| Memory Access | ALU can touch RAM directly | Load/Store only (ALU cannot touch RAM) |
| Power | High (Desktop) | Low (Mobile/Embedded) |

Warning: Modern CISC processors (Intel) translate instructions into internal micro-ops, so the core behaves like a RISC machine internally.

ARM Cortex-M Registers

| Register | Name | Purpose |
|---|---|---|
| R0–R3 | General Purpose | Arguments and return values. Caller-saved. |
| R4–R11 | General Purpose | Local variables. Callee-saved (must preserve). |
| R12 | IP (Scratch) | Intra-procedure scratch. Auto-saved on interrupt. |
| R13 (SP) | Stack Pointer | Full Descending stack. |
| R14 (LR) | Link Register | Stores return address on BL call. Fast (no RAM needed). |
| R15 (PC) | Program Counter | Writing to R15 causes a Jump. |

xPSR Flags: N (Negative), Z (Zero), C (Carry), V (Overflow). Note: Parity (P) is not an ARM flag.

ARM Assembly (UAL) and Control Flow

Format: OPCODE Destination, Source1, Source2

  • LDR R0, [R1]: Load from RAM address in R1 to R0.
  • STR R0, [R1]: Store R0 to RAM address in R1.
  • BIC R0, R1, #0x20: Bit Clear (R1 AND NOT mask).
  • BL function: Branch with Link (saves the return address, i.e. the address of the instruction after BL, in LR).
  • BX LR: Return from function (PC = LR).

Barrel Shifter: ADD R0, R1, R2, LSL #2 → R0 = R1 + (R2 × 4) in one cycle.

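Warning: MOV R0, #0x12345678 fails to assemble because a full 32-bit constant cannot fit inside an instruction that must also hold the opcode and register fields. Use LDR R0, =0x12345678, which lets the assembler load the constant from a literal pool.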

Performance and Power (Iron Law)

  • Time = Instruction Count × CPI × Clock Period
  • Dynamic Power (P) = C × V² × f
  • Voltage (V) is the most impactful factor because it is squared.
  • Race-to-Sleep: Run the CPU fast to finish tasks, then sleep immediately to save energy.
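
The two formulas above can be tried out numerically with a short C sketch. The function names and the sample numbers in the usage note are hypothetical, chosen only to illustrate the arithmetic.

```c
#include <assert.h>

/* Iron Law: Time = Instruction Count x CPI x Clock Period (period = 1 / f). */
double exec_time_s(double instr_count, double cpi, double clock_hz)
{
    assert(clock_hz > 0.0);
    return instr_count * cpi / clock_hz;
}

/* Dynamic power: P = C x V^2 x f. */
double dynamic_power_w(double cap_farads, double volts, double freq_hz)
{
    return cap_farads * volts * volts * freq_hz;
}
```

With 1,000,000 instructions at CPI 2 on a 100 MHz clock, `exec_time_s(1e6, 2.0, 1e8)` is 0.02 s. Halving the voltage in `dynamic_power_w` while keeping C and f fixed cuts the power to one quarter, which is why V dominates.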

CMPS201 Microprocessor Systems Reference: Back

GPIO and Memory-Mapped I/O

Peripherals are mapped to specific memory addresses. Writing to these addresses triggers hardware actions.

Register Address = Peripheral Base Address + Register Offset

GPIO Registers (STM32 Example)

| Offset | Register | Function | Key Values |
|---|---|---|---|
| 0x00 | MODER | Pin Direction (2 bits per pin) | 00=Input, 01=Output, 10=Alternate, 11=Analog |
| 0x10 | IDR | Input Data | Read-only; current logic level on each pin |
| 0x14 | ODR | Output Data | Read/Write; 1=High, 0=Low |
| 0x18 | BSRR | Atomic Set/Reset | Bits 0-15=Set; Bits 16-31=Reset |

Warning: Always enable the RCC Clock first. Without a clock, register writes are silently ignored.
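
The base-plus-offset rule and the register table can be expressed as a C struct overlay. This is a sketch under stated assumptions: the intermediate registers (OTYPER, OSPEEDR, PUPDR) follow the STM32F4-family layout, the base address `GPIOA_BASE_ASSUMED` is an assumed example value to be checked against the part's reference manual, and the helper names are hypothetical. The code runs against a plain struct in RAM; on real hardware the struct pointer would be the base address, and the RCC clock must be enabled first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register block whose member offsets match the table above. */
typedef struct {
    volatile uint32_t MODER;   /* 0x00: 2 mode bits per pin */
    volatile uint32_t OTYPER;  /* 0x04 */
    volatile uint32_t OSPEEDR; /* 0x08 */
    volatile uint32_t PUPDR;   /* 0x0C */
    volatile uint32_t IDR;     /* 0x10: input data */
    volatile uint32_t ODR;     /* 0x14: output data */
    volatile uint32_t BSRR;    /* 0x18: atomic set/reset */
} gpio_regs_t;

/* Assumed example base address, for illustration only. */
#define GPIOA_BASE_ASSUMED 0x40020000u

/* Register Address = Peripheral Base Address + Register Offset. */
uint32_t reg_address(uint32_t base, uint32_t offset)
{
    return base + offset;
}

/* Configure one pin as a general-purpose output (MODER field = 01). */
void gpio_set_output(gpio_regs_t *port, unsigned pin)
{
    uint32_t moder;
    assert(port != NULL && pin < 16);
    moder = port->MODER;
    moder &= ~(3u << (pin * 2)); /* clear the pin's 2-bit field */
    moder |=  (1u << (pin * 2)); /* 01 = output */
    port->MODER = moder;
}
```

Under the assumed base, `reg_address(GPIOA_BASE_ASSUMED, 0x14)` gives 0x40020014 for ODR, and `offsetof(gpio_regs_t, ODR)` is 0x14, so the struct and the table agree.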

Bit Manipulation (Read-Modify-Write)

  • Set Bit: REG |= (1 << 5); (OR with 1 forces set).
  • Clear Bit: REG &= ~(1 << 5); (AND with 0 forces clear).
  • Toggle Bit: REG ^= (1 << 5); (XOR with 1 flips bit).

Safety: Use volatile for hardware pointers to prevent compiler optimization from removing necessary hardware reads.
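
The three read-modify-write idioms can be wrapped in small helpers that take a volatile pointer. This is a minimal sketch with hypothetical function names; it is exercised here on an ordinary variable standing in for a hardware register.

```c
#include <assert.h>
#include <stdint.h>

/* volatile tells the compiler every access must really happen, in order. */
void bit_set(volatile uint32_t *reg, unsigned bit)
{
    assert(bit < 32);
    *reg |= (1u << bit);   /* OR with 1 forces the bit to 1 */
}

void bit_clear(volatile uint32_t *reg, unsigned bit)
{
    assert(bit < 32);
    *reg &= ~(1u << bit);  /* AND with 0 forces the bit to 0 */
}

void bit_toggle(volatile uint32_t *reg, unsigned bit)
{
    assert(bit < 32);
    *reg ^= (1u << bit);   /* XOR with 1 flips the bit */
}
```

The unsigned literal `1u` avoids shifting into the sign bit when `bit` is 31. Each helper is a separate read and write, so an interrupt can slip in between; that is the case BSRR-style atomic registers are meant to cover.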

Interrupts and the NVIC

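| Feature | Polling | Interrupts |
|---|---|---|
| CPU Load | 100% (Busy-wait) | Near 0% while idle (CPU sleeps or does other work) |
| Responsiveness | Delayed (depends on loop period) | Low latency (hardware trigger) |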

NVIC (Nested Vectored Interrupt Controller)

  • Nested: Priority-based pre-emption.
  • Vectored: Uses a lookup table for ISR addresses.
  • Priority: Lower numbers equal higher priority (0 is highest).
  • Tail-chaining: CPU skips unstacking/restacking between back-to-back interrupts to save ~12 cycles.

Critical: You must clear the pending flag inside the ISR (e.g., EXTI_ClearITPendingBit). If the flag stays set, the ISR is re-entered as soon as it returns, so the main program never runs again (an effective infinite loop).
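
The effect of a forgotten pending flag can be simulated on a host machine. Everything below is a stand-in: `fake_pending`, `fake_isr`, and `dispatch` are hypothetical names, and the loop only imitates how an interrupt controller keeps re-entering an ISR while its line is pending. On real hardware the clear operation is device-specific (some peripherals, for instance, clear a pending bit by writing 1 to it).

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-ins for hardware state (hypothetical). */
static volatile uint32_t fake_pending; /* one bit per interrupt line */
static int isr_runs;                   /* how many times the ISR was entered */

void fake_isr(int clear_flag)
{
    isr_runs++;
    if (clear_flag) {
        fake_pending &= ~(1u << 0); /* acknowledge line 0 */
    }
}

/* Imitates the controller: re-enter the ISR while line 0 is pending.
   max_entries caps the loop so the demonstration terminates. */
int dispatch(int clear_flag, int max_entries)
{
    assert(max_entries > 0);
    isr_runs = 0;
    fake_pending = 1u << 0;
    while ((fake_pending & 1u) && isr_runs < max_entries) {
        fake_isr(clear_flag);
    }
    return isr_runs;
}
```

`dispatch(1, 1000)` returns 1 because the ISR acknowledges the interrupt. `dispatch(0, 1000)` returns 1000: the ISR keeps running until the artificial cap stops it.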

Hardware Timers and PWM

  • Prescaler (PSC): Divides the clock frequency. F_timer = F_clk / (PSC + 1).
  • Auto-Reload (ARR): Defines the period. ARR = desired_count - 1.
  • Capture Compare (CCR): Defines the PWM duty cycle. Duty% = CCR / (ARR + 1) × 100%.

The +1 Rule: The counter counts from 0 up to the register value, so a setting of N yields N + 1 counts. Always subtract 1 from the desired count when setting PSC and ARR.
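
The timer formulas can be worked through in C. This is a sketch with hypothetical function names; the 72 MHz clock in the usage note is an assumed example value, not a property of any particular board.

```c
#include <assert.h>
#include <stdint.h>

/* F_timer = F_clk / (PSC + 1) */
uint32_t timer_freq_hz(uint32_t clk_hz, uint32_t psc)
{
    return clk_hz / (psc + 1u);
}

/* ARR = desired_count - 1, because the counter runs 0..ARR inclusive. */
uint32_t arr_for_count(uint32_t desired_count)
{
    assert(desired_count > 0);
    return desired_count - 1u;
}

/* PWM frequency: one period is ARR + 1 timer ticks. */
uint32_t pwm_freq_hz(uint32_t timer_hz, uint32_t arr)
{
    return timer_hz / (arr + 1u);
}

/* Duty% = CCR / (ARR + 1) x 100, in whole percent (rounded down). */
uint32_t duty_percent(uint32_t ccr, uint32_t arr)
{
    return (ccr * 100u) / (arr + 1u);
}
```

Assuming a 72 MHz clock: PSC = 71 gives a 1 MHz timer, a desired count of 1000 gives ARR = 999 and a 1 kHz PWM, and CCR = 250 gives a 25% duty cycle.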

Motor Control

  • H-Bridge: Controls direction. Forward (Q1+Q4), Reverse (Q3+Q2), Brake (Q2+Q4), Coast (All open).
  • Shoot-Through: If Q1 and Q2 are on simultaneously, it creates a short circuit. “Dead Time” delays prevent this.
  • Servo (PPM): Uses pulse width (time) to set position; typically a 1 to 2 ms pulse repeated about every 20 ms, where 1.5 ms = 90° (center).
  • Stepper: Open-loop control; moves in discrete steps (e.g., 1.8°).
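
The H-bridge states can be encoded as switch bitmasks with a shoot-through check. This is a sketch: the type and function names are hypothetical, and treating Q3/Q4 as the second leg (so Q3 + Q4 is also a short) is an assumption inferred from the Q1/Q2 pairing above.

```c
#include <stdint.h>

/* One bit per switch. */
enum { Q1 = 1 << 0, Q2 = 1 << 1, Q3 = 1 << 2, Q4 = 1 << 3 };

typedef enum { HB_COAST, HB_FORWARD, HB_REVERSE, HB_BRAKE } hbridge_mode_t;

/* Returns the set of switches that are closed in each mode. */
uint8_t hbridge_switches(hbridge_mode_t mode)
{
    switch (mode) {
    case HB_FORWARD: return Q1 | Q4;
    case HB_REVERSE: return Q3 | Q2;
    case HB_BRAKE:   return Q2 | Q4;
    case HB_COAST:
    default:         return 0; /* all switches open */
    }
}

/* True if both switches of one leg are closed (supply shorted to ground). */
int is_shoot_through(uint8_t sw)
{
    return ((sw & Q1) && (sw & Q2)) || ((sw & Q3) && (sw & Q4));
}
```

None of the four modes triggers `is_shoot_through`; the danger arises during transitions (for example Forward to Reverse), when one switch has not fully opened before its partner closes. That window is what dead time covers.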

Serial Communication Protocols

| Feature | UART | I2C | SPI |
|---|---|---|---|
| Wires | 2 (TX, RX) | 2 (SDA, SCL) | 4 (MOSI, MISO, SCK, CS) |
| Clock | Asynchronous | Synchronous | Synchronous |
| Addressing | None | 7-bit address sent in software | Hardware (Chip Select) |
| Duplex | Full | Half | Full |

UART Frame: Start Bit (Low), Data (LSB first), Optional Parity, Stop Bit (High). Note: GND must be shared between devices.
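
The frame layout can be sketched in C for the common case of 8 data bits, no parity, and 1 stop bit. The function name and the `UART_8N1_BITS` constant are hypothetical; the array holds line levels in the order they appear on the wire.

```c
#include <stdint.h>

#define UART_8N1_BITS 10 /* 1 start + 8 data + 1 stop */

/* Fill bits[] with line levels (0 = low, 1 = high) in transmission order. */
void uart_frame_8n1(uint8_t data, uint8_t bits[UART_8N1_BITS])
{
    bits[0] = 0;                                   /* start bit: low */
    for (int i = 0; i < 8; i++) {
        bits[1 + i] = (uint8_t)((data >> i) & 1u); /* data, LSB first */
    }
    bits[9] = 1;                                   /* stop bit: high */
}
```

For the byte 0x41 (binary 01000001), the wire sequence is 0, 1, 0, 0, 0, 0, 0, 1, 0, 1: the start bit, then the data bits from bit 0 to bit 7, then the stop bit.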

ADC, DMA, and Pipelining

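  • ADC: Converts analog voltage to digital. Value = Vin / Vref × (2^n - 1), rounded down to an integer code. Warning: Vin above Vref saturates the reading at full scale, and Vin beyond the pin's absolute maximum rating can destroy the hardware.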
  • DMA (Direct Memory Access): Copies data (e.g., ADC to RAM) without CPU involvement, freeing the CPU for other tasks.
  • Pipelining: Fetch-Decode-Execute-Writeback. Ideal throughput is 1 instruction per cycle. Branch mispredictions cause pipeline flushes.
  • Cache: L1 (fastest/smallest) → L2 → RAM (slowest/largest). Sequential access improves the “Cache Hit” rate.
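
The ADC formula from the list above can be evaluated with integer math. This is a sketch: `adc_code` is a hypothetical name, voltages are passed in millivolts to avoid floating point, and clamping at Vref only models the saturated reading, not any electrical damage.

```c
#include <assert.h>
#include <stdint.h>

/* Value = Vin / Vref x (2^n - 1), rounded down. */
uint32_t adc_code(uint32_t vin_mv, uint32_t vref_mv, unsigned n_bits)
{
    uint32_t full_scale;
    assert(vref_mv > 0 && n_bits > 0 && n_bits < 32);
    full_scale = (1u << n_bits) - 1u;
    if (vin_mv > vref_mv) {
        vin_mv = vref_mv; /* reading saturates at full scale */
    }
    /* 64-bit intermediate so the multiplication cannot overflow */
    return (uint32_t)(((uint64_t)vin_mv * full_scale) / vref_mv);
}
```

With a 12-bit converter and Vref = 3300 mV, an input of 1650 mV yields 2047 (half of 4095, rounded down), and 3300 mV yields 4095.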