Assembly Language for x86 Processors
Eighth Edition
Chapter 2
x86 Processor Architecture
Chapter Overview
• General Concepts
• IA-32 Processor Architecture
• IA-32 Memory Management
• 64-bit Processors
• Components of an IA-32 Microcomputer
• Input-Output System
General Concepts
• Basic microcomputer design
• Instruction execution cycle
• Reading from memory
• How programs run
Basic Microcomputer Design
• clock synchronizes CPU operations
• control unit (CU) coordinates sequence of
execution steps
• ALU performs arithmetic and bitwise processing
[Figure: block diagram of the Central Processor Unit and Memory Storage connected by the data bus, control bus, and address bus]
Clock
• synchronizes all CPU and bus operations
• machine (clock) cycle measures time of a single operation
• clock is used to trigger events
[Figure: clock waveform with one cycle marked]
• Timing waveforms example
• A CPU operating at 50 MHz (one clock cycle = 20 ns)
• A memory chip is designed with an access time not to
exceed 50 ns
• Each memory request will require at least two and a
half CPU clock cycles (50 ns / 20 ns = 2.5 clock cycles)
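The arithmetic in the timing example can be checked with a short sketch. The function names are hypothetical, and rounding up to a whole number of cycles is an assumption (a CPU that samples memory only on clock edges cannot wait for half a cycle):

```python
import math

def clock_period_ns(freq_mhz):
    """Length of one clock cycle in nanoseconds for a clock given in MHz."""
    return 1000.0 / freq_mhz

def cycles_for_access(access_ns, freq_mhz):
    """Return (exact cycles, whole cycles) needed to cover one memory access."""
    exact = access_ns / clock_period_ns(freq_mhz)
    # Assumption: the CPU can only wait a whole number of clock cycles.
    return exact, math.ceil(exact)

period = clock_period_ns(50)            # 50 MHz clock from the example
exact, whole = cycles_for_access(50, 50)  # 50 ns memory access time
print(f"clock period: {period} ns")
print(f"cycles needed: {exact} exact, {whole} whole")
```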
Instruction Execution Cycle
• Fetch
• Decode
• Fetch operands
• Execute
• Store output
Instruction Execution Cycle
• Typically simplified to:
– Fetch
– Decode
– Execute
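The simplified fetch, decode, execute loop can be sketched with a toy machine. This is only an illustration: the three-instruction set (LOAD, ADD, HALT), the single accumulator, and the function name `run` are hypothetical and are not x86 instructions.

```python
def run(program):
    """Run a toy program and return the final accumulator value."""
    acc = 0   # hypothetical single accumulator register
    ip = 0    # instruction pointer: index of the next instruction
    while True:
        instruction = program[ip]        # fetch the next instruction
        ip += 1                          # advance the instruction pointer
        opcode, *operands = instruction  # decode into opcode and operands
        if opcode == "LOAD":             # execute
            acc = operands[0]
        elif opcode == "ADD":
            acc += operands[0]
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")

program = [("LOAD", 5), ("ADD", 7), ("HALT",)]
print(run(program))
```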
Reading from Memory
Multiple machine cycles are required when reading from
memory, because it responds much more slowly than the
CPU. The steps are:
1. Place the address of the value you want to read on the address bus.
2. Assert (change the value of) the processor's RD (read) pin.
3. Wait one clock cycle for the memory chips to respond.
4. Copy the data from the data bus into the destination operand.
Cache Memory
• High-speed expensive static RAM both inside and
outside the CPU.
– Level-1 cache: inside the CPU
– Level-2 cache: outside the CPU
• Cache hit: when data to be read is already in
cache memory
• Cache miss: when data to be read is not in cache
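Cache hits and misses can be counted with a small simulation. This is a minimal sketch that assumes a direct-mapped cache (each address maps to exactly one cache line); the slides do not say how the cache is organized, and the class and attribute names are hypothetical.

```python
class DirectMappedCache:
    """Toy direct-mapped cache in front of a list that stands in for RAM."""

    def __init__(self, num_lines, memory):
        self.num_lines = num_lines
        self.memory = memory
        self.lines = [None] * num_lines  # each line holds (address, value) or None
        self.hits = 0
        self.misses = 0

    def read(self, address):
        index = address % self.num_lines   # the one line this address may occupy
        line = self.lines[index]
        if line is not None and line[0] == address:
            self.hits += 1                 # cache hit: data already in cache
            return line[1]
        self.misses += 1                   # cache miss: go to slower memory
        value = self.memory[address]
        self.lines[index] = (address, value)
        return value

memory = list(range(100, 116))             # 16 memory cells
cache = DirectMappedCache(4, memory)
for address in [0, 1, 0, 1, 4, 0]:
    cache.read(address)
print("hits:", cache.hits, "misses:", cache.misses)
```

Addresses 0 and 4 share a line in this four-line cache, so reading one evicts the other.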
How a Program Runs
• Program is loaded into memory by the program loader
• The OS then points to the program entry point
[Figure: program-loading diagram; surviving labels: "sends program name to", "gets starting cluster from", "searches for program in", "returns to", "loads and"]
IA-32 Processor Architecture
• Modes of operation
• Basic execution environment
• Floating-point unit
• Intel Microprocessor history
Modes of Operation
• Protected mode
– native mode (Windows, Linux)
• Real-address mode
– native MS-DOS
• System management mode
– power management, system security, diagnostics
• Virtual-8086 mode
– hybrid of Protected mode
– each program has its own 8086 computer
Basic Execution Environment
• Addressable memory
• General-purpose registers
• Index and base registers
• Specialized register uses
• Status flags
• Floating-point, MMX, XMM registers
Addressable Memory
• Protected mode
– 4 GB
– 32-bit address
• Real-address and Virtual-8086 modes
– 1 MB space
– 20-bit address
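The sizes above follow directly from the address widths: an n-bit address can select 2^n bytes. A quick check (the function name is hypothetical):

```python
def address_space_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits

protected = address_space_bytes(32)   # Protected mode: 32-bit address
real = address_space_bytes(20)        # Real-address mode: 20-bit address

print(protected // 2**30, "GB")       # size in gigabytes
print(real // 2**20, "MB")            # size in megabytes
print(hex(real - 1))                  # highest real-mode address
```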
General-Purpose Registers
Named storage locations inside the CPU, optimized
for speed.
[Figure: 32-bit general-purpose registers (EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI), EFLAGS, EIP, and 16-bit segment registers (CS, SS, DS, ES, FS, GS)]
Accessing Parts of Registers
• Use the 8-bit name, 16-bit name, or 32-bit name
• Applies to EAX, EBX, ECX, and EDX
[Figure: EAX (32 bits) contains AX (16 bits), which is split into AH and AL (8 bits + 8 bits)]
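The relationship between the 32-bit, 16-bit, and 8-bit names can be modeled with masks and shifts. This sketch treats EAX as a plain integer; the function name is hypothetical.

```python
def sub_registers(eax):
    """Split a 32-bit EAX value into the parts named AX, AH, and AL."""
    eax &= 0xFFFFFFFF          # keep only 32 bits
    ax = eax & 0xFFFF          # AX: lower 16 bits of EAX
    ah = (eax >> 8) & 0xFF     # AH: upper 8 bits of AX
    al = eax & 0xFF            # AL: lower 8 bits of AX
    return {"EAX": eax, "AX": ax, "AH": ah, "AL": al}

parts = sub_registers(0x12345678)
for name, value in parts.items():
    print(name, hex(value))
```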
Index and Base Registers
• Some registers have only a 16-bit name for their
lower half:
– ESI: SI
– EDI: DI
– EBP: BP
– ESP: SP
Some Specialized Register Uses (1 of 3)
• General-Purpose
– EAX – (extended) accumulator
▪ multiplication & division
– ECX – loop counter
– ESP – (extended) stack pointer
▪ addresses data on the stack
– ESI, EDI – index registers
▪ extended source and extended destination
– EBP – extended frame pointer (stack)
▪ used by high-level languages to reference function
parameters and local variables on the stack
Some Specialized Register Uses (2 of 3)
• Segment
– Indicate base addresses of preassigned memory
areas (segments)
▪ CS – code segment
▪ DS – data segment
▪ SS – stack segment
▪ ES, FS, GS – additional segments
– ES (extra segment), FS, and GS provide
additional segments for storing data.
Some Specialized Register Uses (3 of 3)
• EIP – instruction pointer
– register contains the address of the next
instruction to execute
• EFLAGS (extended flags), or just Flags
– status and control flags
– each flag is a single binary bit
Status Flags
• Carry
– unsigned arithmetic out of range
• Overflow
– signed arithmetic out of range
• Sign
– result is negative
• Zero
– result is zero
• Auxiliary Carry
– carry from bit 3 to bit 4
• Parity
– number of 1 bits in the low byte of the result is even
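How the status flags respond to an addition can be sketched for 8-bit operands. This models only the flag definitions listed above; the function name and the dictionary layout are hypothetical, and a real CPU sets the flags in hardware as a side effect of the instruction.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, status flags)."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": total > 0xFF,                               # unsigned out of range
        "OF": ((a ^ result) & (b ^ result) & 0x80) != 0,  # signed out of range
        "SF": (result & 0x80) != 0,                       # result is negative
        "ZF": result == 0,                                # result is zero
        "AF": ((a & 0x0F) + (b & 0x0F)) > 0x0F,           # carry from bit 3 to bit 4
        "PF": bin(result).count("1") % 2 == 0,            # even number of 1 bits
    }
    return result, flags

for a, b in [(0xFF, 0x01), (0x7F, 0x01)]:
    result, flags = add8_flags(a, b)
    set_flags = [name for name, is_set in flags.items() if is_set]
    print(f"{a:02X}h + {b:02X}h = {result:02X}h, flags set: {set_flags}")
```

The first pair overflows the unsigned range (255 + 1), and the second overflows the signed range (127 + 1), which is the distinction between Carry and Overflow.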
Floating-Point, MMX, XMM Registers
• Eight 80-bit floating-point data registers
– ST(0), ST(1), . . . , ST(7)
– arranged in a stack
– used for all floating-point arithmetic
• Eight 64-bit MMX registers
• Eight 128-bit XMM registers for single-instruction, multiple-data (SIMD) operations
IA-32 Memory Management
• Real-address mode
• Calculating linear addresses
• Protected mode
• Multi-segment model
• Paging
Real Address Mode
• 1 MB addressable RAM
– (00000h to FFFFFh)
• The processor can run only one program at a time
• Applications can access any memory location
• MS-DOS runs in Real Address Mode
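The chapter outline lists calculating linear addresses without showing the arithmetic. A minimal sketch, assuming the standard real-mode segment:offset scheme, in which the 16-bit segment value is multiplied by 16 (shifted left by 4 bits) and added to the 16-bit offset; the function name is hypothetical:

```python
def linear_address(segment, offset):
    """Combine a 16-bit segment and 16-bit offset into a real-mode linear address."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    # Assumption: standard real-mode rule, linear = segment * 16 + offset.
    return (segment << 4) + offset

# Segment 08F1h with offset 0100h
print(f"{linear_address(0x08F1, 0x0100):05X}h")
# Segment FFFFh with offset 000Fh reaches the top of the 1 MB space
print(f"{linear_address(0xFFFF, 0x000F):05X}h")
```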
Protected Mode
• 4 GB addressable RAM
– (00000000h to FFFFFFFFh)
• Each program assigned a memory partition of 4 GB which
is protected from other programs
• Designed for multitasking
• Supported by Linux & MS-Windows
Components of an IA-32 Microcomputer
• Motherboard
• Video output
• Memory
• Input-output ports
Motherboard
• CPU socket
• External cache memory slots
• Main memory slots
• BIOS chips
• Sound synthesizer chip (optional)
• Video controller chip (optional)
• IDE, parallel, serial, USB, video, keyboard, joystick,
network, and mouse connectors
• PCI bus connectors (expansion cards)
Intel D850MD Motherboard
[Figure: Intel D850MD motherboard, with labels for the PCI slots, AGP slot, mouse, keyboard, parallel, serial, and USB connectors, memory controller hub, Pentium 4, diskette connector, and IDE drive connectors]
Source: Intel® Desktop Board D850MD/D850MV Technical Product Specification
Memory (1 of 2)
• ROM
– read-only memory
• EPROM
– erasable programmable read-only memory
• Dynamic RAM (DRAM)
– inexpensive; must be refreshed constantly
• Static RAM (SRAM)
– expensive; used for cache memory; no refresh required
Memory (2 of 2)
• Video RAM (VRAM)
– dual ported; optimized for constant video refresh
• CMOS RAM
– complementary metal-oxide semiconductor
– system setup information
• See: Intel platform memory (Intel technology brief:
link address may change)
Input-Output Ports
• USB (universal serial bus)
– intelligent high-speed connection to devices
– up to 12 megabits/second (USB 1.1 full speed; later versions are faster)
– USB hub connects multiple devices
– enumeration: computer queries devices
– supports hot connections
• Parallel
– short cable, high speed
– common for printers
– bidirectional, parallel data transfer
– Intel 8255 controller chip
Input-Output Ports (cont)
• Serial
– RS-232 serial port
– one bit at a time
– uses long cables and modems
– 16550 UART (universal asynchronous receiver transmitter)
– programmable in assembly language
Device Interfaces
• ATA host adapters
– integrated drive electronics (IDE): hard drive, CD-ROM
• SATA (Serial ATA)
– inexpensive, fast, bidirectional
• FireWire
– high speed (800 megabits/sec), many devices at once
• Bluetooth
– small amounts of data, short distances, low power usage
• Wi-Fi (wireless Ethernet)
– IEEE 802.11 standard, faster than Bluetooth
Levels of Input-Output
• Level 3: High-level language function
– examples: C++, Java
– portable, convenient, not always the fastest
• Level 2: Operating system
– Application Programming Interface (API)
– extended capabilities, lots of details to master
• Level 1: BIOS
– drivers that communicate directly with devices
– OS security may prevent application-level code from working at
this level
Displaying a String of Characters
When a HLL program displays a string of characters, the
following steps take place:
• Level 3: Application Program
• Level 2: OS Function
• Level 1: BIOS Function
• Level 0: Hardware
Programming levels
Assembly language programs can perform input-output at each of the following levels:
• Level 2: OS function
• Level 1: BIOS function
• Level 0: hardware
Summary
• Central Processing Unit (CPU)
• Arithmetic Logic Unit (ALU)
• Instruction execution cycle
• Multitasking
• Floating Point Unit (FPU)
• Complex Instruction Set
• Real mode and Protected mode
• Motherboard components
• Memory types
• Input/Output and access levels
42 69 6E 61 72 79
What does this say?
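One way to check an answer is to treat each pair of hexadecimal digits as one ASCII character code. A minimal sketch, assuming the bytes are ASCII text (the function name is hypothetical):

```python
def decode_hex_ascii(hex_text):
    """Decode space-separated hexadecimal byte values as ASCII characters."""
    # bytes.fromhex skips the spaces between the two-digit byte values.
    return bytes.fromhex(hex_text).decode("ascii")

message = decode_hex_ascii("42 69 6E 61 72 79")
print(message)
```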
