Chapter 4: Processor Architecture_《Computer Systems A Programmer’s Perspective》

发布于:2025-04-05 ⋅ 阅读:(12) ⋅ 点赞:(0)


1. Y86-64 Instruction Set Architecture (ISA)

Key Concepts:

  • Y86-64 is a simplified 64-bit instruction set derived from x86-64.
  • Registers: 15 general-purpose registers (%rax, %rcx, etc.) and %rsp (stack pointer).
  • Instruction encoding: 1–10 bytes per instruction, including opcode, register specifiers, and immediates.
  • Status codes: AOK (normal), HLT (halt), ADR (invalid address), INS (invalid instruction).

Example Code (Y86-64 Assembly):

# Example Y86-64 program: Sum an array
.pos 0
    irmovq stack, %rsp   # Initialize stack pointer
    call main            # Call main function
    halt

main:
    irmovq array, %rdi   # Address of array
    irmovq 3, %rsi       # Length of array
    call sum             # Call sum function
    ret

sum:
    xorq %rax, %rax      # Initialize sum to 0
loop:
    andq %rsi, %rsi      # Check loop counter
    jle done
    addq (%rdi), %rax    # Add current element to sum
    irmovq 8, %rdx       # Offset for next element
    addq %rdx, %rdi      # Move to next element
    irmovq 1, %rdx
    subq %rdx, %rsi      # Decrement counter
    jmp loop
done:
    ret

# Data section
array:
    .quad 1, 2, 3, 4, 5
stack:

2. Logic Design & Hardware Control Language (HCL)

Key Concepts:

  • Combinational circuits (e.g., ALU, condition codes) and sequential circuits (e.g., registers, memory).
  • HCL: A hardware description language to model logic gates and processor behavior.
  • Register file: Read/write operations based on register IDs.
  • Memory: Read/write operations with address validation.

Example HCL Code (Condition Code Calculation):

# HCL code to compute ZF (Zero Flag)
word alu_output = [
    icode == IOPQ : aluA OP aluB;
    # Other cases...
];

bool ZF = (alu_output == 0);

3. Sequential Y86-64 Implementation (SEQ)

Key Concepts:

  • SEQ (Sequential Processor): Executes one instruction per clock cycle.
  • Stages: Fetch, Decode, Execute, Memory, Write-Back, PC Update.
  • PC Computation: Next PC is determined by the current instruction (e.g., jmp, call).

Example C Code (Simulating SEQ Stages):

#include <stdio.h>

typedef struct {
    int pc;
    int registers[16];
    unsigned char memory[4096];
} CPU;

void fetch(CPU *cpu) {
    // Simplified fetch stage: Read opcode from memory
    unsigned char opcode = cpu->memory[cpu->pc];
    printf("Fetch: Opcode = 0x%02X at PC = 0x%04X\n", opcode, cpu->pc);
    cpu->pc++;
}

int main() {
    CPU cpu = {0};
    cpu.memory[0] = 0x30;  // Example: irmovq instruction

    fetch(&cpu);
    return 0;
}

Output:

Fetch: Opcode = 0x30 at PC = 0x0000

4. Pipelined Y86-64 Implementation (PIPE)

Key Concepts:

  • Pipelining: Overlaps execution of multiple instructions.
  • Pipeline registers (F, D, E, M, W) hold intermediate state.
  • Hazards:
    • Data hazards: Solved via forwarding/stalling.
    • Control hazards: Solved via branch prediction.

Example Code (Pipeline Hazard Detection):

#include <stdio.h>

typedef struct {
    int srcA, srcB;
    int dstE, dstM;
    int valA, valB;
} PipelineRegister;

void detect_hazards(PipelineRegister *d, PipelineRegister *e) {
    if (d->srcA == e->dstE || d->srcA == e->dstM ||
        d->srcB == e->dstE || d->srcB == e->dstM) {
        printf("Data hazard detected!\n");
    }
}

int main() {
    PipelineRegister D = {.srcA = 1, .srcB = 2};
    PipelineRegister E = {.dstE = 1, .dstM = 3};
    detect_hazards(&D, &E);  // Hazard: D.srcA matches E.dstE
    return 0;
}

Output:

Data hazard detected!

5. Advanced Topics: Forwarding & Stalling

Key Concepts:

  • Forwarding: Bypass pipeline registers to resolve data dependencies.
  • Stalling: Insert bubbles to delay instruction execution.

Example Code (Forwarding Logic):

#include <stdio.h>

int main() {
    int alu_result = 100;  // Value computed in Execute stage
    int forward_val = alu_result;

    printf("Forwarded value: %d\n", forward_val);  // Simulate forwarding
    return 0;
}

Output:

Forwarded value: 100

6. Testing the Complete Pipeline

Example Code (Full Pipeline Simulation):

#include <stdio.h>

typedef struct {
    int pc;
    int stage;  // F=0, D=1, E=2, M=3, W=4
} PipelineStage;

void advance_pipeline(PipelineStage *p) {
    p->stage = (p->stage + 1) % 5;
    printf("Stage advanced to %d. PC = 0x%04X\n", p->stage, p->pc);
}

int main() {
    PipelineStage pipe = {.pc = 0x100, .stage = 0};
    for (int i = 0; i < 5; i++) {
        advance_pipeline(&pipe);
    }
    return 0;
}

Output:

Stage advanced to 1. PC = 0x0100
Stage advanced to 2. PC = 0x0100
Stage advanced to 3. PC = 0x0100
Stage advanced to 4. PC = 0x0100
Stage advanced to 0. PC = 0x0100

Summary:

  • Y86-64 ISA defines the instruction set and registers.
  • HCL models hardware behavior for combinational/sequential circuits.
  • SEQ executes one instruction per cycle, while PIPE uses pipelining for higher throughput.
  • Hazards are resolved via forwarding/stalling.
  • Test each component with small, verifiable examples to validate correctness.

Multiple-Choice Questions


Question 1: Pipeline Stages in Y86-64
Which of the following are mandatory stages in the Y86-64 pipeline? (Select all that apply.)
A. Fetch
B. Decode
C. Execute
D. Memory
E. Write-Back
F. Commit


Question 2: Data Hazards
Which techniques are used to resolve data hazards in a pipelined processor? (Select all that apply.)
A. Stalling the pipeline (inserting bubbles)
B. Branch prediction
C. Forwarding (bypassing)
D. Reordering instructions at compile time


Question 3: Control Logic and HCL
Which statements about HCL (Hardware Control Language) are true? (Select all that apply.)
A. HCL describes sequential logic using registers.
B. HCL equations evaluate to Boolean values (0 or 1).
C. The mux operation in HCL selects between multiple inputs based on a condition.
D. HCL directly synthesizes Verilog code for FPGA implementations.


Question 4: Exception Handling
In Y86-64, which scenarios trigger a processor exception? (Select all that apply.)
A. Invalid opcode detection
B. Memory access to an unaligned address
C. Arithmetic overflow during ADDQ
D. A ret instruction with an empty stack


Question 5: Forwarding Paths
Which forwarding paths are valid in a 5-stage pipeline to resolve a read-after-write (RAW) hazard? (Select all that apply.)
A. Forwarding the result from the Memory stage to the Execute stage.
B. Forwarding the result from the Write-Back stage to the Decode stage.
C. Forwarding the result from the Execute stage to the Decode stage.
D. Forwarding the result from the Fetch stage to the Execute stage.


Question 6: Branch Misprediction Penalty
What happens when a branch misprediction occurs in a pipelined processor? (Select all that apply.)
A. All instructions after the branch are squashed.
B. The pipeline is flushed, and execution resumes at the correct target.
C. The branch prediction buffer is updated.
D. The processor switches to in-order execution temporarily.


Question 7: HCL and Combinational Logic
Which HCL constructs represent combinational logic? (Select all that apply.)
A. bool eq = (A == B);
B. reg = in; (with clock triggering)
C. val = [ expr1 : expr2; … ]; (case expression)
D. mem_read = icode in { IRMMOVQ, IPOPQ };


Question 8: Pipeline Registers
What is the primary purpose of pipeline registers (e.g., between Fetch and Decode stages)? (Select all that apply.)
A. To store the results of each stage for the next cycle.
B. To synchronize the flow of instructions between stages.
C. To resolve structural hazards by buffering instructions.
D. To implement speculative execution.


Question 9: Performance Metrics
Which factors directly impact the CPI (Cycles Per Instruction) of a pipelined processor? (Select all that apply.)
A. The number of pipeline stages.
B. The frequency of branch mispredictions.
C. The clock frequency of the processor.
D. The number of data hazards requiring stalls.


Question 10: Multi-Cycle Instructions
Which Y86-64 instructions require multiple cycles in the Execute stage? (Select all that apply.)
A. ADDQ
B. MULQ (hypothetical multiplication instruction)
C. JMP
D. PUSHQ



Answers and Explanations


Question 1: Pipeline Stages in Y86-64
Correct Answers: A, B, C, D, E

  • Explanation: The Y86-64 pipeline includes Fetch (A), Decode (B), Execute ©, Memory (D), and Write-Back (E). “Commit” (F) is not part of the Y86-64 pipeline; it is a stage in more complex architectures like out-of-order processors.

Question 2: Data Hazards
Correct Answers: A, C

  • Explanation: Data hazards (e.g., RAW) are resolved by forwarding © to bypass stale results and stalling (A) if forwarding is insufficient. Branch prediction (B) handles control hazards, and reordering (D) is done by compilers/out-of-order engines, not the pipeline itself.

Question 3: Control Logic and HCL
Correct Answers: A, C

  • Explanation: HCL describes combinational logic (not sequential) using mux ©. HCL equations evaluate to integers, not Booleans (B is false). While HCL can guide Verilog (D), it does not directly synthesize it. Sequential logic (A) uses registers with clock triggers.

Question 4: Exception Handling
Correct Answers: A, D

  • Explanation: Y86-64 exceptions occur for invalid opcodes (A) and ret with an empty stack (D). Unaligned memory access (B) and arithmetic overflow © are not exceptions in Y86-64.

Question 5: Forwarding Paths
Correct Answers: A, B

  • Explanation: Forwarding from Memory (A) or Write-Back (B) stages to earlier stages resolves RAW hazards. Forwarding from Fetch (D) is invalid, and Execute-to-Decode © is not a standard path.

Question 6: Branch Misprediction Penalty
Correct Answers: B, C

  • Explanation: On misprediction, the pipeline flushes incorrect instructions (B), and the branch predictor updates its buffer ©. Instructions after the branch are not necessarily squashed (A), and in-order execution (D) is unrelated.

Question 7: HCL and Combinational Logic
Correct Answers: A, C, D

  • Explanation: Combinational logic includes Boolean expressions (A), case expressions ©, and signal assignments (D). reg = in (B) represents sequential logic.

Question 8: Pipeline Registers
Correct Answers: A, B

  • Explanation: Pipeline registers store stage outputs (A) and synchronize flow (B). Structural hazards © are resolved by resource duplication, not registers. Speculative execution (D) is unrelated.

Question 9: Performance Metrics
Correct Answers: B, D

  • Explanation: CPI increases with stalls (D) and mispredictions (B). Pipeline stages (A) affect latency, and clock frequency © affects cycle time, not CPI.

Question 10: Multi-Cycle Instructions
Correct Answers: B, D

  • Explanation: MULQ (B) and PUSHQ (D) may take multiple cycles (e.g., memory access). ADDQ (A) and JMP © are single-cycle in Y86-64.

These questions test deep understanding of pipeline mechanics, hazard resolution, HCL semantics, and performance analysis, aligning with the key challenges in Chapter 4.

After you are done, it is highly recommended to do the exercises from the original book