💾 Intro to Computer Architecture Unit 5 – Memory System Design

Memory system design is a critical aspect of computer architecture, focusing on optimizing data storage and access. This unit explores the hierarchy of memory components, from fast registers to slower secondary storage, and introduces key concepts like locality of reference and cache memory.

The unit delves into cache fundamentals, main memory design, and virtual memory principles. It covers performance metrics, advanced memory technologies, and practical applications across various computing domains. Understanding these concepts is crucial for designing efficient, high-performing computer systems.

Key Concepts in Memory Systems

  • Memory systems play a crucial role in computer architecture by storing and providing access to data and instructions
  • Hierarchy of memory components includes registers, cache, main memory, and secondary storage (hard disk drives, SSDs)
  • Locality of reference principle states that programs tend to access data and instructions that are close together in memory or have been recently accessed
    • Temporal locality refers to the reuse of specific data within a relatively small time duration
    • Spatial locality refers to the use of data elements within relatively close storage locations
  • Memory performance is measured in terms of latency (access time) and bandwidth (data transfer rate)
  • Cache memory is a small, fast memory located close to the CPU that stores frequently accessed data to reduce average access time
  • Virtual memory allows the separation of logical memory from physical memory, enabling programs to exceed the size of available physical memory
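The two forms of locality above can be seen in a minimal sketch (the function name and matrix are illustrative, not from the unit): traversing a 2D array in row-major order touches neighboring memory locations (spatial locality), while the running total is reused on every iteration (temporal locality).

```python
def row_major_sum(matrix):
    """Sum elements in row-major order: consecutive accesses touch
    neighboring memory locations, which caches reward."""
    total = 0
    for row in matrix:          # each row is stored contiguously
        for value in row:       # adjacent elements -> spatial locality
            total += value      # `total` reused each iteration -> temporal locality
    return total

grid = [[1, 2], [3, 4]]
print(row_major_sum(grid))  # 10
```

Iterating column-first over the same data would visit memory with large strides, defeating spatial locality and typically causing more cache misses.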

Memory Hierarchy Overview

  • Memory hierarchy consists of multiple levels of memory with varying capacities, speeds, and costs
  • Registers are the fastest and most expensive memory, located closest to the CPU
  • Cache memory is faster than main memory but smaller in capacity, acting as a buffer between CPU and main memory
    • Levels of cache (L1, L2, L3) with increasing capacity and latency as distance from CPU grows
  • Main memory, typically DRAM, is larger than cache but slower, storing currently executing programs and data
  • Secondary storage (hard disk drives, SSDs) has the largest capacity but slowest access times, used for long-term storage
  • Memory management unit (MMU) handles translation between logical and physical memory addresses
  • Effective memory hierarchy design balances cost, capacity, and performance to minimize average access time
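The capacity/latency trade-off down the hierarchy can be summarized with order-of-magnitude figures. The numbers below are illustrative assumptions for demonstration, not measurements of any real system; actual values vary widely by hardware.

```python
# Illustrative (order-of-magnitude) hierarchy figures -- assumptions only.
hierarchy = [
    # (level,     approx. latency in ns, approx. capacity)
    ("registers", 0.3,     "~1 KB"),
    ("L1 cache",  1,       "32-64 KB"),
    ("L2 cache",  4,       "256 KB-1 MB"),
    ("L3 cache",  15,      "8-64 MB"),
    ("DRAM",      80,      "8-128 GB"),
    ("SSD",       50_000,  "0.5-8 TB"),
]

for level, latency_ns, capacity in hierarchy:
    print(f"{level:10s} {latency_ns:>8} ns   {capacity}")
```

The key pattern, regardless of exact figures: each step away from the CPU trades roughly an order of magnitude in latency for a much larger capacity at lower cost per byte.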

Cache Memory Fundamentals

  • Cache memory exploits the locality of reference principle to reduce average memory access time
  • Data is transferred between memory levels in fixed-size blocks called cache lines or cache blocks
  • Cache hit occurs when requested data is found in the cache, resulting in faster access
  • Cache miss occurs when requested data is not found in the cache, requiring access to a lower level of memory hierarchy
    • Compulsory miss (cold start miss) occurs on the first access to a memory location
    • Capacity miss occurs when the cache cannot contain all the required data due to its limited size
    • Conflict miss occurs when multiple memory locations map to the same cache line, leading to evictions
  • Cache placement policies determine where data is stored in the cache (direct-mapped, set-associative, fully associative)
  • Cache replacement policies decide which cache line to evict when a miss occurs and the cache is full (LRU, LFU, random)
  • Write policies control how writes to the cache are handled (write-through, write-back)
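The placement and miss concepts above can be sketched with a toy direct-mapped cache: an address is split into tag, index, and offset fields, and a lookup compares the stored tag at that index. The sizes and class name here are assumptions for illustration.

```python
BLOCK_SIZE  = 64    # bytes per cache line (assumed)
NUM_LINES   = 256   # lines in the cache (assumed)
OFFSET_BITS = 6     # log2(BLOCK_SIZE)
INDEX_BITS  = 8     # log2(NUM_LINES)

def split_address(addr):
    """Decompose a byte address into (tag, index, offset)."""
    offset = addr & (BLOCK_SIZE - 1)
    index  = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag    = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * NUM_LINES   # one stored tag per line

    def access(self, addr):
        tag, index, _ = split_address(addr)
        if self.tags[index] == tag:
            return "hit"
        self.tags[index] = tag           # fill on miss, evicting the old line
        return "miss"

cache = DirectMappedCache()
print(cache.access(0x1234))  # miss (compulsory: first touch)
print(cache.access(0x1234))  # hit  (temporal locality)
```

Accessing a second address with the same index bits but a different tag would evict the first line, demonstrating a conflict miss in a direct-mapped design.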

Main Memory Design

  • Main memory, typically implemented using DRAM, stores the currently executing programs and their data
  • DRAM (Dynamic Random Access Memory) stores each bit in a separate capacitor, requiring periodic refresh to maintain data
  • SRAM (Static Random Access Memory) is faster but more expensive than DRAM, used for cache memory
  • Memory controllers manage the flow of data between the CPU and main memory, handling refresh and error correction
  • Interleaving memory accesses across multiple memory banks increases effective bandwidth and helps hide individual access latency
  • Error-correcting codes (ECC) are used to detect and correct bit errors in memory
  • Dual In-line Memory Modules (DIMMs) are the physical packaging of DRAM chips, connected to the memory bus
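Bank interleaving can be sketched in a few lines: with low-order interleaving, the bank is selected by the address bits just above the block offset, so consecutive blocks land in different banks and their accesses can overlap. The bank count and block size are illustrative assumptions.

```python
NUM_BANKS  = 4    # assumed bank count
BLOCK_SIZE = 64   # assumed bytes per block

def bank_of(addr):
    """Low-order interleaving: bank chosen by the bits just above
    the block offset, so sequential blocks rotate through banks."""
    return (addr // BLOCK_SIZE) % NUM_BANKS

# Four consecutive 64-byte blocks map to four different banks,
# allowing their accesses to proceed in parallel.
print([bank_of(64 * i) for i in range(4)])  # [0, 1, 2, 3]
```

A strided access pattern whose stride is a multiple of `NUM_BANKS * BLOCK_SIZE` would hit the same bank repeatedly and lose the bandwidth benefit, which is why stride-aware layout matters.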

Virtual Memory Principles

  • Virtual memory allows programs to access a larger address space than the available physical memory
  • Memory management unit (MMU) translates virtual addresses to physical addresses using a page table
  • Virtual address space is divided into fixed-size pages, while physical memory is divided into frames of the same size
  • Page table stores the mappings between virtual pages and physical frames
    • Page table entries (PTEs) contain the physical frame number and additional metadata (valid, dirty, access rights)
  • Translation lookaside buffer (TLB) caches recently used page table entries to speed up address translation
  • Page faults occur when a requested virtual page is not present in physical memory, triggering a page table walk or disk access
  • Demand paging loads pages into memory only when they are accessed, reducing memory usage
  • Swapping moves pages between main memory and secondary storage to accommodate memory demands

Memory Performance Metrics

  • Latency is the time taken to access a memory location, measured in clock cycles or nanoseconds
    • Total time to service a request is the latency plus the transfer time (block size divided by bandwidth)
  • Bandwidth is the rate at which data can be transferred between memory and the CPU, measured in bytes per second (B/s)
  • Miss rate is the fraction of memory accesses that result in a cache miss
    • Miss penalty is the additional time required to fetch data from a lower level of memory hierarchy upon a miss
  • Average memory access time (AMAT) is the average time to access memory considering hits and misses
    • AMAT = Hit time + (Miss rate × Miss penalty)
  • Memory stall cycles represent the number of CPU cycles wasted due to waiting for memory accesses
  • Throughput measures the number of memory operations completed per unit time
  • Speedup is the improvement in performance gained by using a faster memory system or optimization technique
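The AMAT formula above composes naturally across hierarchy levels: an L1 miss penalty is itself the AMAT of the level below. A minimal sketch, with illustrative hit times and miss rates (assumptions, not real hardware figures):

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Assume L2: 10-cycle hit, 10% miss rate, 100-cycle DRAM penalty.
l2_amat = amat(10, 0.10, 100)      # ~20 cycles
# Assume L1: 1-cycle hit, 2% miss rate; its miss penalty is L2's AMAT.
l1_amat = amat(1, 0.02, l2_amat)   # ~1.4 cycles
print(l1_amat)
```

Even with a 100-cycle DRAM penalty, the two cache levels keep the average access near the L1 hit time, which is the whole point of the hierarchy.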

Advanced Memory Technologies

  • 3D-stacked memory (HBM, HMC) integrates multiple DRAM dies vertically, providing higher bandwidth and lower latency
  • Non-volatile memory (NVM) retains data even when power is removed, offering persistence and high density
    • Examples include PCM (Phase-Change Memory), MRAM (Magnetoresistive RAM), and ReRAM (Resistive RAM)
  • Hybrid memory systems combine DRAM and NVM to balance performance, capacity, and cost
  • Processing-in-Memory (PIM) integrates computation units with memory to reduce data movement and improve performance
  • Transactional memory provides hardware support for atomic memory operations, simplifying concurrent programming
  • Memory compression techniques reduce memory footprint by exploiting data redundancy and encoding schemes
  • Prefetching techniques anticipate future memory accesses and fetch data in advance to hide latency
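Prefetching can be sketched with the simplest policy, a next-line prefetcher: on every access, the line after the current one is fetched into a buffer, so a sequential stream hits after the first miss. The class name, block size, and buffer model are illustrative assumptions.

```python
BLOCK_SIZE = 64   # assumed bytes per cache line

class NextLinePrefetcher:
    def __init__(self):
        self.buffer = set()            # line numbers already prefetched

    def access(self, addr):
        line = addr // BLOCK_SIZE
        hit = line in self.buffer
        self.buffer.add(line + 1)      # prefetch the next sequential line
        return "hit" if hit else "miss"

pf = NextLinePrefetcher()
results = [pf.access(64 * i) for i in range(4)]
print(results)  # ['miss', 'hit', 'hit', 'hit'] for a sequential stream
```

Real prefetchers are more sophisticated (stride detection, confidence counters), but the principle is the same: predict the next access and overlap its fetch with current work to hide latency.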

Practical Applications and Case Studies

  • Memory system design is crucial for high-performance computing (HPC) applications, such as scientific simulations and data analytics
  • Embedded systems often have tight memory constraints, requiring careful memory optimization and management
  • Mobile devices prioritize energy efficiency and low power consumption in memory system design
  • Gaming consoles and graphics processing units (GPUs) employ specialized memory architectures to meet the demands of real-time rendering and parallel processing
  • Databases and data-intensive applications rely on efficient memory systems to handle large datasets and frequent memory accesses
  • Virtualization and cloud computing environments require effective memory management and isolation techniques to support multiple virtual machines
  • Case studies showcase real-world examples of memory system optimizations in various domains (e.g., Google's TPU, Intel's Optane memory)


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.
