CS:APP--Chapter06 : memory hierarchy(part 1)
Tags: CS:APP
prologue
The simple model of memory we have used so far does not match how memory is actually implemented at the low level. We are used to describing memory as a linear array: give it an index, and it returns the data stored at that position.
In practice, the memory of a computer system can be regarded as a pyramid-shaped hierarchy, shown in section 3. All storage can be categorized into several classes, and these storage devices get larger, cheaper, and slower as we move from top to bottom.
It also means that each kind of storage device has its own capacity, cost, and speed. This raises a question: how do these storage devices work, how are they organized, and how are they integrated into a computer system? The details are covered later.
The memory hierarchy rests on a fundamental concept: locality. It is widely exploited in computer systems and even in networks, where a cache answers requests for recently accessed data directly, without involving the servers.
One of the most significant benefits of understanding the memory hierarchy is that we can optimize our code for better locality, that is, for more cache hits and fewer cache misses.
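To make the effect of locality concrete, here is a minimal Python sketch that counts hits in a toy cache for two traversal orders of the same 16 x 16 array. The cache model (fully associative, LRU, 4 blocks of 8 elements each) and the names `count_hits`, `row_major`, and `col_major` are assumptions chosen for illustration; they do not describe any real cache.

```python
from collections import OrderedDict

def count_hits(addresses, block_size=8, num_blocks=4):
    """Count hits in a toy fully associative LRU cache (illustrative model only)."""
    cache = OrderedDict()  # block number -> None, ordered from oldest to newest
    hits = 0
    for addr in addresses:
        block = addr // block_size
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = None            # miss: load the block
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits

ROWS, COLS = 16, 16
# Element (r, c) of a row-major array lives at address r * COLS + c.
row_major = [r * COLS + c for r in range(ROWS) for c in range(COLS)]
col_major = [r * COLS + c for c in range(COLS) for r in range(ROWS)]

print("row-major hits:   ", count_hits(row_major), "of", len(row_major))
print("column-major hits:", count_hits(col_major), "of", len(col_major))
```

Under these assumptions the row-major order hits on 7 of every 8 accesses, while the column-major order never hits, even though both orders touch exactly the same elements.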
1. storage technologies
The previous chapters introduced what a processor is and how it works by transferring data between registers and main memory. In the early days of computing, machines had only a small amount of main memory and storage.
By now there are many kinds of storage devices, such as SRAM, DRAM, conventional disks, and SSDs.
1.1 random access memory
RAM comes in two varieties: static RAM (SRAM) and dynamic RAM (DRAM).
SRAM is more expensive and faster than DRAM.
1.1.1 static RAM
A six-transistor circuit forms a bistable memory cell that stores one bit.
An SRAM cell can be pictured as an inverted pendulum with three states: two stable states, with the pendulum tilted fully left or fully right, and a middle state.
The middle state, in which the pendulum stays balanced in the vertical position, is called metastable.
One principle: as long as it is kept powered, the cell holds its position (that is, its stored value) indefinitely. From any other state, it quickly falls into one of the two stable positions.
Q: This recalls a point from OS: at boot, the computer runs a set of instructions stored in a fixed address block. Does that block live in SRAM? But SRAM is only bistable.
1.1.2 dynamic RAM
only one capacitor and a single access transistor -> memory cell
Unlike an SRAM cell, a DRAM cell is very sensitive to disturbance, and once disturbed it never recovers.
For a reference that shows what DRAM consists of from a hardware point of view, see: DRAM detail.
Because a DRAM cell holds its charge for only a short period, it must be refreshed periodically by reading each bit out and writing it back.
So SRAM is typically used for high-speed cache memory, while DRAM is used for main memory and frame buffers.
conventional DRAM
The cells in a DRAM chip are partitioned into \(d\) supercells, each consisting of \(w\) cells (\(w\) bits).
As shown above, the address travels over the address bus to the **memory controller**, which then sends it to the DRAM chip through the chip's address pins.
In addition, to read bits out and write bits in, data moves to and from the DRAM through its data pins. (Organizing the DRAM as a two-dimensional array lets the row and column addresses be multiplexed over the same address pins, so fewer address pins are needed.)
Each supercell is addressed by a row and column index \((i, j)\) supplied by the memory controller. The controller first sends the row address \(i\), which makes the DRAM copy the whole row into an internal row buffer, and then the column address \(j\), after which the DRAM sends the contents of supercell \((i, j)\) back to the memory controller.
trade-off | description |
---|---|
pros | fewer address pins are needed |
cons | longer access time, because the address is sent in two distinct steps |
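The pin trade-off can be checked with a small sketch. It assumes a hypothetical DRAM with 16 supercells arranged as a 4 x 4 array; the helper names `split_address` and `address_pins` are made up for this example.

```python
def split_address(index, cols):
    """Map a linear supercell index to (row, col) in an array with `cols` columns."""
    return divmod(index, cols)

def address_pins(rows, cols, multiplexed):
    """Number of address pins needed to select one supercell."""
    row_bits = (rows - 1).bit_length()  # bits needed to name a row
    col_bits = (cols - 1).bit_length()  # bits needed to name a column
    if multiplexed:
        # Row and column addresses share the same pins and are sent in two steps.
        return max(row_bits, col_bits)
    # Linear organization: the whole address is sent at once.
    return row_bits + col_bits

row, col = split_address(9, cols=4)
print("supercell 9 ->", (row, col))
print("pins, multiplexed:", address_pins(4, 4, multiplexed=True))
print("pins, linear:     ", address_pins(4, 4, multiplexed=False))
```

Here supercell 9 sits at row 2, column 1, and the two-dimensional organization needs 2 address pins instead of 4, at the cost of sending the address in two steps.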
memory modules
DRAM chips are packaged into memory modules that plug into expansion slots on the motherboard.
Take the reference machine, a Core i7, as an example:
It uses 240-pin dual inline memory modules (DIMMs), which transfer data to and from the memory controller in 64-bit chunks.
One point worth emphasizing: the module gathers eight 1-byte supercells, one from each of its eight DRAM chips, into a 64-bit word, which is returned to the memory controller.
At this point we can finally see why writing takes more time than reading.
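How eight 1-byte supercells become one 64-bit word can be sketched as follows. The function name `assemble_word` is hypothetical, and the sketch assumes that chip 0 supplies the lowest-order byte and chip 7 the highest-order byte.

```python
def assemble_word(supercells):
    """Combine eight 1-byte supercells into a 64-bit word (chip 0 = low-order byte)."""
    if len(supercells) != 8:
        raise ValueError("expected one byte from each of eight chips")
    word = 0
    for chip, byte in enumerate(supercells):
        word |= (byte & 0xFF) << (8 * chip)  # place this chip's byte in its lane
    return word

# One byte read from the same (i, j) position of each of the eight chips.
supercells = [0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08]
print(hex(assemble_word(supercells)))
```

With this byte order the result is the same as interpreting the eight bytes as a little-endian integer.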
enhanced memory
To keep this short: more information is provided on page 621.
accessing main memory
buses: collections of parallel wires that carry three kinds of signals: address, data, and control.
I/O bridge: an interface (which includes the memory controller) between the system bus and the memory bus.
MOVq %rax,A requires more clock cycles than MOVq A,%rax.
1.2 disk storage
Reading information from a disk takes on the order of milliseconds: about a hundred thousand times longer than from DRAM and about a million times longer than from SRAM.
disk geometry
The spindle in the center of the platter just spins the platter at a fixed rate.
Each surface consists of concentric rings called tracks. Each track is partitioned into sectors, which store data bits. Sectors are separated by gaps, which store no data bits but hold formatting bits that identify the sectors.
In the multiple-platter view (b), a cylinder is the set of tracks on all surfaces that are the same distance from the spindle, so in that view we usually speak of cylinders rather than tracks.
disk capacity
Disk capacity is the maximum number of bits that can be recorded on a disk.
To put it another way:
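\[ \text{capacity} = \frac{\text{bits}}{\text{sector}} \times \frac{\text{sectors}}{\text{track}} \times \frac{\text{tracks}}{\text{surface}} \times \frac{\text{surfaces}}{\text{platter}} \times \frac{\text{platters}}{\text{disk}} \]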
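As a worked instance of the capacity formula, the sketch below multiplies out the five factors. The disk parameters (512-byte sectors, 300 sectors per track on average, 20,000 tracks per surface, 2 surfaces per platter, 5 platters) are illustrative assumptions, and `disk_capacity_bits` is a hypothetical helper name.

```python
def disk_capacity_bits(bits_per_sector, sectors_per_track, tracks_per_surface,
                       surfaces_per_platter, platters_per_disk):
    """Multiply the five factors of the capacity formula."""
    return (bits_per_sector * sectors_per_track * tracks_per_surface
            * surfaces_per_platter * platters_per_disk)

capacity_bits = disk_capacity_bits(
    bits_per_sector=512 * 8,      # 512-byte sectors, expressed in bits
    sectors_per_track=300,        # average; real disks vary this per zone
    tracks_per_surface=20_000,
    surfaces_per_platter=2,
    platters_per_disk=5,
)
capacity_gb = capacity_bits / 8 / 10**9  # disk vendors use 1 GB = 10^9 bytes
print(f"{capacity_bits} bits = {capacity_gb:.2f} GB")
```

For these assumed parameters the disk holds 30.72 GB.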
disk operation
The disk reads and writes bits with a read/write head attached to the end of an actuator arm. Moving the arm back and forth along its radial axis positions the head over any track on the surface.
When the CPU issues a read of some disk block, three things must happen in order:
- the head must be positioned over the correct track.
- the first bit of the target sector must rotate under the head.
- the data can then be transferred.
Accordingly, the access time for a disk block has three main components:
- seek time
- rotational latency
- transfer time
Each component is the time needed to complete the corresponding step in the list above, in the same order.
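The three components can be estimated with a short calculation. The sketch assumes a disk that rotates at 7,200 RPM, has an average seek time of 9 ms, and holds 400 sectors per track on average; these numbers and the name `disk_access_time_ms` are illustrative only. Average rotational latency is taken as half a rotation, and transfer time as the time for one sector to pass under the head.

```python
def disk_access_time_ms(rpm, avg_seek_ms, avg_sectors_per_track):
    """Return (seek, rotational latency, transfer) in milliseconds for one sector."""
    ms_per_rotation = 60_000 / rpm
    avg_rotation_ms = ms_per_rotation / 2                      # wait half a turn on average
    avg_transfer_ms = ms_per_rotation / avg_sectors_per_track  # one sector passes the head
    return avg_seek_ms, avg_rotation_ms, avg_transfer_ms

seek, rotation, transfer = disk_access_time_ms(
    rpm=7200, avg_seek_ms=9, avg_sectors_per_track=400)
total_ms = seek + rotation + transfer
print(f"seek={seek} ms rotation={rotation:.2f} ms "
      f"transfer={transfer:.3f} ms total={total_ms:.2f} ms")
```

With these assumptions the access time is dominated by seek time and rotational latency; the transfer itself is only a small fraction of the total.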
logical disk blocks
To hide the complexity of disk geometry, the disk presents itself as a sequence of logical blocks. Firmware in the disk controller maps each logical block number to a (surface, track, sector) triple, playing a role similar to that of the memory controller.
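The kind of mapping the firmware performs can be sketched as below. This is a simplified assumption, not the scheme of any real disk: it supposes a fixed number of sectors per track and fills all surfaces of one cylinder before moving to the next track, whereas real firmware also handles zones and spare sectors. The name `block_to_triple` is hypothetical.

```python
def block_to_triple(block_number, surfaces, sectors_per_track):
    """Map a logical block number to a (surface, track, sector) triple."""
    sectors_per_cylinder = surfaces * sectors_per_track
    track = block_number // sectors_per_cylinder            # which cylinder
    surface = (block_number // sectors_per_track) % surfaces
    sector = block_number % sectors_per_track
    return surface, track, sector

# A toy disk with 4 surfaces and 10 sectors per track.
for block in (0, 10, 45):
    print(block, "->", block_to_triple(block, surfaces=4, sectors_per_track=10))
```

Software only ever sees the block number on the left; the triple on the right stays private to the disk controller.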
connecting I/O devices
Because I/O buses and external buses are incompatible, controllers (adapters) sit between them and play an important role in translating the data that passes through.
access disk
We can start with three techniques: memory-mapped I/O, direct memory access (DMA), and interrupts, which we will discuss later.
memory-mapped I/O:
From an abstract point of view, each I/O device is assigned a block of addresses. The CPU controls a device simply by reading and writing data in the address block that belongs to it.
DMA :
The disk controller can move data from the disk directly into a block of main memory without involving the processor.
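The interplay of memory-mapped I/O, DMA, and interrupts during a disk read can be imitated with a toy simulation. Everything here is hypothetical: the port address `DISK_PORT`, the classes `ToyBus` and `ToyDiskController`, and the three-value command format (command, logical block number, destination address) are assumptions made for illustration, not a real device interface.

```python
DISK_PORT = 0xA0  # hypothetical address reserved for the disk controller

class ToyDiskController:
    def __init__(self, blocks, memory):
        self.blocks = blocks      # logical block number -> bytes on "disk"
        self.memory = memory      # main memory shared with the CPU
        self.pending = []         # values the CPU has stored to the port so far
        self.interrupted = False  # set when a transfer has completed

    def store(self, value):
        """Receive one value that the CPU wrote to the controller's port."""
        self.pending.append(value)
        if len(self.pending) == 3:
            command, block_number, dest = self.pending
            self.pending = []
            if command == "read":
                self._dma_read(block_number, dest)

    def _dma_read(self, block_number, dest):
        data = self.blocks[block_number]
        self.memory[dest:dest + len(data)] = data  # copy without the CPU
        self.interrupted = True                    # notify the CPU afterwards

class ToyBus:
    def __init__(self):
        self.devices = {}  # port address -> device

    def attach(self, port, device):
        self.devices[port] = device

    def store(self, address, value):
        """A CPU store to a mapped address is routed to the device."""
        self.devices[address].store(value)

memory = bytearray(16)
disk = ToyDiskController({7: b"DATA"}, memory)
bus = ToyBus()
bus.attach(DISK_PORT, disk)

# The CPU starts a read with three ordinary stores to the disk's address.
for value in ("read", 7, 4):
    bus.store(DISK_PORT, value)

print(bytes(memory[4:8]), disk.interrupted)
```

The CPU side consists only of stores to an address; the copy into memory and the completion flag are handled entirely by the controller, which is the division of labor that DMA and interrupts provide.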