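Lab Requirements
- Implement a set-associative Data Cache and achieve the highest possible hit rate
- The Data Cache may use at most 16 KB of data storage
- Optional extra: implement an Instruction Cache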
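Analysis
- The lab's framework code (direct-mapped) is already provided. Converting it to a set-associative structure only requires changing the Data Cache access interface: determine the set index, then compare the Tag of every line in that set; if one matches the Address it is a hit, otherwise a miss
- Anyone would write that part the same way. The real question is which strategies raise the Cache hit rate
- Following the summary in Section 6.4.7 of the textbook, I considered optimizations along these lines:
  - Cache size (C): a larger Cache holds more data and substantially reduces line replacements, so the full 16 KB is used here
  - Number of sets (S), associativity (E), block size (B): I had no theoretical guidance for choosing these three, so the best combination was found by repeatedly adjusting and testing them in the experiments below
  - Replacement algorithm: I tried fixed replacement, LRU (Least Recently Used) and random replacement in turn. Judging by the results, the hit rates rank as random > fixed > LRU
  - Write policy: the write policy does not affect the hit rate, but write-back performs fewer memory writes than write-through and is therefore faster, so write-back is used here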
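The lookup described above starts by splitting an address into tag, set index and block offset. The following is a minimal sketch of that split, assuming S and B are powers of two; `AddrFields`, `log2_pow2` and `split_address` are hypothetical names used for illustration and are not part of the lab framework, which does the same job with macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three address fields. */
typedef struct {
    uint64_t tag;
    uint32_t set_index;
    uint32_t block_offset;
} AddrFields;

/* Returns log2(x) for a power of two x >= 1. */
static unsigned log2_pow2(uint64_t x)
{
    unsigned bits = 0;
    assert(x != 0 && (x & (x - 1)) == 0); /* must be a power of two */
    while (x > 1) {
        x >>= 1;
        bits++;
    }
    return bits;
}

/* Split an address for a cache with num_sets sets (S) and block_size-byte lines (B). */
static AddrFields split_address(uint64_t address, uint32_t num_sets, uint32_t block_size)
{
    unsigned b = log2_pow2(block_size); /* block offset bits */
    unsigned s = log2_pow2(num_sets);   /* set index bits */
    AddrFields f;
    f.block_offset = (uint32_t)(address & (block_size - 1));
    f.set_index = (uint32_t)((address >> b) & (num_sets - 1));
    f.tag = address >> (b + s);
    return f;
}
```

For example, with S = 16 and B = 16, the address `0x12345` splits into block offset `0x5`, set index `0x4` and tag `0x123`.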
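Experiment
Before starting, I ran the framework code and recorded its hit rate as a direct-mapped cache, 84.95%, as a baseline for comparison
Note: this figure and all later results are for the test set run with `./Cache traces/gedit.trace.zst`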
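Strategy 1: Fixed Replacement
- First parse `Address` into the tag, set index and block offset, then traverse every line in the set. If a line has `Valid` equal to 1 and a matching `Tag`, it is a hit (`H`); otherwise it is a miss (`M`)
- On a hit, perform the requested operation directly (this part of the code is almost unchanged), then `break`
  - Note: after a write or modify, set `Dirty` (the dirty bit) in the struct to 1 so that the line can be written back later
- On a miss, traverse the set. If there is an empty line (`Valid` is 0), `Load` the data into it and set the control bits. Otherwise traverse again: if there is a line whose dirty bit is 0 (unmodified), `Load` the data into that line and set the control bits (such a line needs no write-back, which saves time). If neither pass finds a suitable line, every line in the set has `Valid` equal to 1 and has been modified, so a write-back is needed on replacement and line 0 is always chosen: first recover the old address (`OldAddress`) from the Tag, write the Cache contents back, then `Load` the new data into the line. Finally, perform the requested operation
- The results are shown in the table below
- After switching to set-associative, the hit rate is lower than with direct mapping. What happens if the replacement policy is changed to LRU?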
| (S,E,B) | Hit rate (%) |
|---|---|
| (32,8,64) | 76.73 |
| (16,16,64) | 73.49 |
| (32,16,32) | 70.16 |
| (64,8,32) | 64.58 |
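The victim choice in Strategy 1 (invalid line first, then a clean line, else line 0) and the recovery of the old address can be sketched as follows. This is an illustration under stated assumptions rather than the code that produced the table: `LineMeta`, `choose_victim_fixed` and `victim_address` are hypothetical names, and the data bytes of each line are omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-line metadata; the data bytes are omitted. */
typedef struct {
    uint8_t valid;
    uint8_t dirty;
    uint64_t tag;
} LineMeta;

/* Pick the way to fill on a miss: an invalid line first,
   then a clean line (no write-back needed), else way 0. */
static int choose_victim_fixed(const LineMeta *set, int ways)
{
    assert(ways > 0);
    for (int i = 0; i < ways; i++)
        if (!set[i].valid)
            return i;
    for (int i = 0; i < ways; i++)
        if (!set[i].dirty)
            return i;
    return 0;
}

/* Rebuild the block-aligned address of an evicted line for write-back. */
static uint64_t victim_address(uint64_t tag, uint32_t set_index,
                               unsigned set_bits, unsigned offset_bits)
{
    return (tag << (set_bits + offset_bits)) | ((uint64_t)set_index << offset_bits);
}
```

With 4 set bits and 4 offset bits, a line with tag `0x123` in set 4 maps back to address `0x12340`.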
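Strategy 2: LRU
- Each line gets an extra variable `Time` that counts how many times the line was traversed without its `Tag` matching
  - `Time` starts at 0; it is reset to 0 when the line matches and incremented by 1 otherwise
  - On replacement, simply replace the line with the largest `Time`
- The rest is largely unchanged and is not repeated here
- The results are shown in the table below
- Compared with the previous strategy, the hit rate with LRU actually drops. The results also show that the lower the associativity, the higher the hit rate. The hit rate is highest when the associativity is 1, but isn't that just direct mapping?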
| (S,E,B) | Hit rate (%) |
|---|---|
| (32,8,64) | 72.19 |
| (16,16,64) | 68.18 |
| (32,4,128) | 76.83 |
| (64,2,128) | 81.95 |
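The `Time` bookkeeping in Strategy 2 can be sketched as an age counter per line plus a scan for the oldest line. This is a minimal sketch, not the code behind the table; `AgedLine`, `touch_set` and `choose_victim_lru` are hypothetical names. One detail worth checking in such a scan: with an unsigned counter, a running maximum initialized to `-1` wraps to the largest representable value, so a `>` comparison against it never succeeds. Seeding the scan with the first line avoids that.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-line metadata for an age-based LRU approximation. */
typedef struct {
    uint8_t valid;
    uint64_t age; /* set accesses since this line was last hit or filled */
    uint64_t tag;
} AgedLine;

/* Update ages after one access to a set.
   hit_way is the way that matched (or was just filled), or -1 if none. */
static void touch_set(AgedLine *set, int ways, int hit_way)
{
    assert(ways > 0);
    for (int i = 0; i < ways; i++) {
        if (i == hit_way)
            set[i].age = 0;
        else if (set[i].valid)
            set[i].age++;
    }
}

/* Return the way with the largest age (assumes every line in the set is valid).
   Ties go to the lowest-numbered way. */
static int choose_victim_lru(const AgedLine *set, int ways)
{
    int victim = 0;
    assert(ways > 0);
    for (int i = 1; i < ways; i++)
        if (set[i].age > set[victim].age)
            victim = i;
    return victim;
}
```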
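Strategy 3: Random Replacement
- Simplicity wins: back to basics
- When the set is full, random replacement uses `rand()` to pick a line to replace, which is much simpler to implement than LRU
- The results are shown in the table below
- Random replacement is clearly the best of the three. The best configuration is `(S,E,B)=(16,64,16)`, with a hit rate of 93.58%, which is 8.63 percentage points (about 10.2% in relative terms) above the framework code (direct-mapped)
  - The results also show that, within a certain range, higher associativity gives a higher hit rate, and the peak is definitely not at direct mapping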
| (S,E,B) | Hit rate (%) |
|---|---|
| (32,8,64) | 88.99 |
| (64,4,64) | 88.53 |
| (64,2,128) | 86.83 |
| (32,16,32) | 92.09 |
| (32,32,16) | 93.39 |
| (16,64,16) | 93.58 |
| (16,128,8) | 93.50 |
| (8,256,8) | 93.54 |
| (2,1024,8) | 93.56 |
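The random policy itself fits in a few lines. The sketch below fills an empty line if one exists and only falls back to `rand()` when the set is full, as described above; `choose_victim_random` is a hypothetical name, and the valid bits are passed as a plain array for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Pick a way to fill on a miss: the first invalid line if any,
   otherwise a uniformly chosen way via rand(). */
static int choose_victim_random(const uint8_t *valid, int ways)
{
    assert(ways > 0);
    for (int i = 0; i < ways; i++)
        if (!valid[i])
            return i;
    return rand() % ways;
}
```

Because `rand()` is never seeded in this sketch, the sequence of victims repeats from run to run, which keeps measured hit rates reproducible.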
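Extra: Implementing the Instruction Cache
- The approach is similar to the DCache, and because the instruction Cache is read-only (`Operation` is always `I`), it is simpler to implement
- Based on the results above, I went straight to random replacement, which is both easier to implement and more effective
- With `(S,E,B)=(16,64,16)`, the hit rate reaches 98.87%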
Results
Data Cache hit rate: 93.58%
Inst Cache hit rate: 98.87%
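Discussion
- Why does random replacement, which looks like guesswork, beat the more principled LRU here? After some reading, I arrived at the following explanations:
  - Access pattern: if accesses are highly random with no predictable pattern, a random policy can do as well as, or even better than, a more complex one. LRU assumes that the least recently used data is also unlikely to be accessed in the future; if the access pattern is completely random, that assumption does not hold
  - Prefetching and locality: LRU relies on the principle of locality, i.e. recently accessed data is likely to be accessed again in the near future. If the access pattern in the test set violates this principle, LRU's advantage does not show
  - Adaptability: LRU needs some time to "learn" or "adapt to" an access pattern. When the pattern changes very quickly, LRU may have to make replacement decisions before it has adapted to the new pattern
- Why put data and instructions in two separate Caches?
  - Performance: data and instructions can be accessed at the same time (in parallel), which reduces CPU wait time and improves overall performance
  - Specialized optimization: each Cache can be tuned to the characteristics of its contents; for example, data accesses may tend to be more sequential while instruction accesses may be more random (although with this lab's test set both look fairly random)
  - Security: in some systems, separating data and instructions can improve security and help guard against certain kinds of attack, such as buffer overflow attacks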
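On the first question above, one well-known pattern that defeats LRU is a loop over slightly more blocks than a set can hold: LRU always evicts exactly the block that is needed next, while random replacement keeps some of them by chance. The sketch below simulates a single set to illustrate this. It is a toy model under stated assumptions (`WAYS` and `count_hits` are hypothetical, and the cyclic trace is synthetic), not an analysis of the gedit trace.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define WAYS 4 /* associativity of the single simulated set */

/* Count hits when `accesses` references cycle over `blocks` distinct tags
   in one WAYS-way set. use_lru != 0 selects true LRU, otherwise random. */
static long count_hits(int blocks, long accesses, int use_lru)
{
    long tags[WAYS] = {0};
    long last_used[WAYS] = {0};
    int filled = 0;
    long hits = 0;

    assert(blocks > 0);
    for (long t = 0; t < accesses; t++) {
        long tag = t % blocks; /* cyclic access pattern */
        int way = -1;
        for (int i = 0; i < filled; i++)
            if (tags[i] == tag) {
                way = i;
                break;
            }
        if (way >= 0) {
            hits++;
        } else if (filled < WAYS) {
            way = filled++; /* fill an empty line */
        } else if (use_lru) {
            way = 0; /* evict the least recently used line */
            for (int i = 1; i < WAYS; i++)
                if (last_used[i] < last_used[way])
                    way = i;
        } else {
            way = rand() % WAYS; /* evict a random line */
        }
        tags[way] = tag;
        last_used[way] = t;
    }
    return hits;
}
```

Calling `count_hits(5, 10000, 1)` returns 0 because every access under LRU misses, whereas `count_hits(5, 10000, 0)` returns a positive count. When the loop fits in the set (`blocks` equal to `WAYS`), both policies miss only on the first pass.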
Full Code
///
// Copyright 2024 by Lane. //
///
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "common.h"
#define DEBUG 0
#define GET_POWER_OF_2(X) (X == 0x00 ? 0 : \
X == 0x01 ? 0 : \
X == 0x02 ? 1 : \
X == 0x04 ? 2 : \
X == 0x08 ? 3 : \
X == 0x10 ? 4 : \
X == 0x20 ? 5 : \
X == 0x40 ? 6 : \
X == 0x80 ? 7 : \
X == 0x100 ? 8 : \
X == 0x200 ? 9 : \
X == 0x400 ? 10 : \
X == 0x800 ? 11 : \
X == 0x1000 ? 12 : \
X == 0x2000 ? 13 : \
X == 0x4000 ? 14 : \
X == 0x8000 ? 15 : \
X == 0x10000 ? 16 : \
X == 0x20000 ? 17 : \
X == 0x40000 ? 18 : \
X == 0x80000 ? 19 : \
X == 0x100000 ? 20 : \
X == 0x200000 ? 21 : \
X == 0x400000 ? 22 : \
X == 0x800000 ? 23 : \
X == 0x1000000 ? 24 : \
X == 0x2000000 ? 25 : \
X == 0x4000000 ? 26 : \
X == 0x8000000 ? 27 : \
X == 0x10000000 ? 28 : \
X == 0x20000000 ? 29 : \
X == 0x40000000 ? 30 : \
X == 0x80000000 ? 31 : \
X == 0x100000000 ? 32 : 0) // e.g. 2^32 = 0x100000000
/*
Set-associative Data Cache, 16 KB in total
Each line holds 16 bytes; 16 sets of 64 lines each
*/
#define DCACHE_SIZE 16384
#define DCACHE_DATA_PER_LINE 16 // bytes per DCache line; must be a multiple of 8
#define DCACHE_DATA_PER_LINE_ADDR_BITS GET_POWER_OF_2(DCACHE_DATA_PER_LINE) // must match the setting above
#define DCACHE_SET 16 // number of DCache sets
#define DCACHE_SET_ADDR_BITS GET_POWER_OF_2(DCACHE_SET) // must match the setting above
#define DCACHE_LINES_PER_SET (DCACHE_SIZE/DCACHE_DATA_PER_LINE/DCACHE_SET) // lines per set
#define DCACHE_LINES (DCACHE_SIZE/DCACHE_DATA_PER_LINE) // total number of DCache lines
#define ICACHE_SIZE 16384
#define ICACHE_DATA_PER_LINE 16 // bytes per ICache line; must be a multiple of 8
#define ICACHE_DATA_PER_LINE_ADDR_BITS GET_POWER_OF_2(ICACHE_DATA_PER_LINE) // must match the setting above
#define ICACHE_SET 16 // number of ICache sets
#define ICACHE_SET_ADDR_BITS GET_POWER_OF_2(ICACHE_SET) // must match the setting above
#define ICACHE_LINES_PER_SET (ICACHE_SIZE/ICACHE_DATA_PER_LINE/ICACHE_SET) // lines per set
#define ICACHE_LINES (ICACHE_SIZE/ICACHE_DATA_PER_LINE) // total number of ICache lines
// Cache line structure, including Valid, Tag and Data. All of your state must be stored in the Cache lines!
struct DCACHE_LineStruct
{
UINT8 Valid;
UINT8 Dirty; // dirty bit
UINT64 Time; // age counter used by the LRU variant
UINT64 Tag;
UINT8 Data[DCACHE_DATA_PER_LINE];
}DCache[DCACHE_LINES];
struct ICACHE_LineStruct
{
UINT8 Valid;
UINT64 Tag;
UINT8 Data[ICACHE_DATA_PER_LINE];
}ICache[ICACHE_LINES];
/*
DCache initialization; normally the Valid bit of every DCache line is set to 0
The simulator calls InitDataCache at startup
*/
void InitDataCache()
{
UINT32 i;
printf("[%s] +-----------------------------------+\n", __func__);
printf("[%s] | Lane's Data Cache initializing... |\n", __func__);
printf("[%s] +-----------------------------------+\n", __func__);
for (i = 0; i < DCACHE_LINES; i++)
{
DCache[i].Valid = 0;
DCache[i].Dirty = 0;
DCache[i].Time = 0;
}
}
/*
Load one line of data from Memory into the Data Cache
*/
void LoadDataCacheLineFromMemory(UINT64 Address, UINT32 CacheLineAddress)
{
// Read DCACHE_DATA_PER_LINE bytes from Memory into one Data Cache line in a single pass
UINT32 i;
UINT64 ReadData;
UINT64 AlignAddress;
UINT64* pp;
AlignAddress = Address & ~(DCACHE_DATA_PER_LINE - 1); // the address must be aligned to a DCACHE_DATA_PER_LINE-byte boundary
pp = (UINT64*)DCache[CacheLineAddress].Data;
for (i = 0; i < DCACHE_DATA_PER_LINE / 8; i++)
{
ReadData = ReadMemory(AlignAddress + 8LL * i);
pp[i] = ReadData;
if (DEBUG)
printf("[%s] Address=%016llX ReadData=%016llX\n", __func__, AlignAddress + 8LL * i, ReadData);
}
}
/*
Write one line of data from the Data Cache to Memory
*/
void StoreDataCacheLineToMemory(UINT64 Address, UINT32 CacheLineAddress)
{
// Write DCACHE_DATA_PER_LINE bytes from one Data Cache line to Memory in a single pass
// The provided function writes 8 bytes at a time
UINT32 i;
UINT64 WriteData;
UINT64 AlignAddress;
UINT64* pp;
AlignAddress = Address & ~(DCACHE_DATA_PER_LINE - 1); // the address must be aligned to a DCACHE_DATA_PER_LINE-byte boundary
pp = (UINT64*)DCache[CacheLineAddress].Data;
WriteData = 0;
for (i = 0; i < DCACHE_DATA_PER_LINE / 8; i++)
{
WriteData = pp[i];
WriteMemory(AlignAddress + 8LL * i, WriteData);
if (DEBUG)
printf("[%s] Address=%016llX WriteData=%016llX\n", __func__, AlignAddress + 8LL * i, WriteData);
}
}
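/*
Data Cache access interface; the simulator calls it to access your Data Cache
Address: byte address of the memory access
Operation: read ('L'), write ('S'), or read-modify-write ('M')
DataSize: data size: 1, 2, 4 or 8 bytes
StoreValue: the data to write, for a write operation
LoadResult: the data read from the Cache, for a read operation
*/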
UINT8 AccessDataCache(UINT64 Address, UINT8 Operation, UINT8 DataSize, UINT64 StoreValue, UINT64* LoadResult)
{
UINT32 CacheSetAddress;
UINT32 CacheLineAddress;
UINT8 BlockOffset;
UINT64 AddressTag;
UINT8 MissFlag = 'M';
UINT64 ReadValue;
*LoadResult = 0;
/*
* In a set-associative cache, Address is split into AddressTag, CacheSetAddress and BlockOffset
*/
CacheSetAddress = (Address >> DCACHE_DATA_PER_LINE_ADDR_BITS) % DCACHE_SET; // set index
BlockOffset = Address % DCACHE_DATA_PER_LINE; // block offset
AddressTag = (Address >> DCACHE_DATA_PER_LINE_ADDR_BITS) >> DCACHE_SET_ADDR_BITS; // the Tag is what remains after dropping the DCACHE_SET and DCACHE_DATA_PER_LINE bits. Warning: do not use the whole address as the Tag!
for(int i=0;i<DCACHE_LINES_PER_SET;i++)
{
CacheLineAddress=CacheSetAddress*DCACHE_LINES_PER_SET+i;
if (MissFlag == 'M' && DCache[CacheLineAddress].Valid == 1 && DCache[CacheLineAddress].Tag == AddressTag)
{
MissFlag = 'H'; // hit
DCache[CacheLineAddress].Time=0;
if (Operation == 'L') // read
{
ReadValue = 0;
switch (DataSize)
{
case 1: // 1 byte
ReadValue = DCache[CacheLineAddress].Data[BlockOffset + 0];
break;
case 2: // 2 bytes
BlockOffset = BlockOffset & 0xFE; // align to a 2-byte boundary
ReadValue = DCache[CacheLineAddress].Data[BlockOffset + 1]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 0];
break;
case 4: // 4 bytes
BlockOffset = BlockOffset & 0xFC; // align to a 4-byte boundary
ReadValue = DCache[CacheLineAddress].Data[BlockOffset + 3]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 2]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 1]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 0];
break;
case 8: // 8 bytes
BlockOffset = BlockOffset & 0xF8; // align to an 8-byte boundary
ReadValue = DCache[CacheLineAddress].Data[BlockOffset + 7]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 6]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 5]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 4]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 3]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 2]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 1]; ReadValue = ReadValue << 8;
ReadValue |= DCache[CacheLineAddress].Data[BlockOffset + 0];
break;
}
*LoadResult = ReadValue;
if (DEBUG)
printf("[%s] Address=%016llX Operation=%c DataSize=%u StoreValue=%016llX ReadValue=%016llX\n", __func__, Address, Operation, DataSize, StoreValue, ReadValue);
}
else if (Operation == 'S' || Operation == 'M') // write (a modify is treated as a write here)
{
if (DEBUG)
printf("[%s] Address=%016llX Operation=%c DataSize=%u StoreValue=%016llX\n", __func__, Address, Operation, DataSize, StoreValue);
DCache[CacheLineAddress].Dirty=1;
switch (DataSize)
{
case 1: // 1 byte
DCache[CacheLineAddress].Data[BlockOffset + 0] = StoreValue & 0xFF;
break;
case 2: // 2 bytes
BlockOffset = BlockOffset & 0xFE; // align to a 2-byte boundary
DCache[CacheLineAddress].Data[BlockOffset + 0] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 1] = StoreValue & 0xFF;
break;
case 4: // 4 bytes
BlockOffset = BlockOffset & 0xFC; // align to a 4-byte boundary
DCache[CacheLineAddress].Data[BlockOffset + 0] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 1] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 2] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 3] = StoreValue & 0xFF;
break;
case 8: // 8 bytes
BlockOffset = BlockOffset & 0xF8; // align to an 8-byte boundary
DCache[CacheLineAddress].Data[BlockOffset + 0] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 1] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 2] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 3] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 4] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 5] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 6] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 7] = StoreValue & 0xFF;
break;
}
}
}
else if(DCache[CacheLineAddress].Valid == 1)
DCache[CacheLineAddress].Time++;
}
if(MissFlag == 'M') // miss
{
UINT8 OKFlag=0;
if (DEBUG)
printf("[%s] Address=%016llX Operation=%c DataSize=%u StoreValue=%016llX\n", __func__, Address, Operation, DataSize, StoreValue);
// Is this set full?
for(int i=0;i<DCACHE_LINES_PER_SET;i++)
{
CacheLineAddress=CacheSetAddress*DCACHE_LINES_PER_SET+i;
if(DCache[CacheLineAddress].Valid == 0)
{
OKFlag=1;
break;
}
}
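// If the set is not full, load into CacheLineAddress; otherwise choose a line to replace
if(OKFlag==0)
{
// LRU variant (disabled): replace the line with the largest Time.
// Caution: if UINT64 is unsigned (as its name suggests), -1 wraps to the maximum value,
// so the `>` test below never succeeds and CacheLineAddress keeps the last line of the set.
// UINT64 Time_max=-1;
// UINT32 CacheLineAddress_tmp;
// for(int i=0;i<DCACHE_LINES_PER_SET;i++)
// {
// CacheLineAddress_tmp=CacheSetAddress*DCACHE_LINES_PER_SET+i;
// if(DCache[CacheLineAddress_tmp].Time>Time_max)
// {
// Time_max=DCache[CacheLineAddress_tmp].Time;
// CacheLineAddress=CacheLineAddress_tmp;
// }
// }
// CacheLineAddress=CacheSetAddress*DCACHE_LINES_PER_SET;
// Random replacement: pick a random line within the set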
UINT32 random_number = rand() % DCACHE_LINES_PER_SET;
CacheLineAddress=CacheSetAddress*DCACHE_LINES_PER_SET+random_number;
}
if(DCache[CacheLineAddress].Valid==1 && DCache[CacheLineAddress].Dirty==1)
{
// Evict the chosen Cache line; if it holds modified data, write it back to Memory first
UINT64 OldAddress;
// OldAddress => (Tag, Set, 000000)
OldAddress = ((DCache[CacheLineAddress].Tag << DCACHE_SET_ADDR_BITS) << DCACHE_DATA_PER_LINE_ADDR_BITS) | ((UINT64)CacheSetAddress << DCACHE_DATA_PER_LINE_ADDR_BITS); // recover the old address from the Tag
StoreDataCacheLineToMemory(OldAddress, CacheLineAddress);
}
LoadDataCacheLineFromMemory(Address, CacheLineAddress);
DCache[CacheLineAddress].Valid = 1;
DCache[CacheLineAddress].Tag = AddressTag;
DCache[CacheLineAddress].Dirty = 0;
DCache[CacheLineAddress].Time = 0;
if (Operation == 'L') // read
{
// Nothing to do for a read, since this access already missed
}
else if (Operation == 'S' || Operation == 'M') // write (a modify is treated as a write here)
{
DCache[CacheLineAddress].Dirty=1;
// A write must update the CacheLine with the new StoreValue
switch (DataSize)
{
case 1: // 1 byte
DCache[CacheLineAddress].Data[BlockOffset + 0] = StoreValue & 0xFF;
break;
case 2: // 2 bytes
BlockOffset = BlockOffset & 0xFE; // align to a 2-byte boundary
DCache[CacheLineAddress].Data[BlockOffset + 0] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 1] = StoreValue & 0xFF;
break;
case 4: // 4 bytes
BlockOffset = BlockOffset & 0xFC; // align to a 4-byte boundary
DCache[CacheLineAddress].Data[BlockOffset + 0] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 1] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 2] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 3] = StoreValue & 0xFF;
break;
case 8: // 8 bytes
BlockOffset = BlockOffset & 0xF8; // align to an 8-byte boundary
DCache[CacheLineAddress].Data[BlockOffset + 0] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 1] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 2] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 3] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 4] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 5] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 6] = StoreValue & 0xFF; StoreValue = StoreValue >> 8;
DCache[CacheLineAddress].Data[BlockOffset + 7] = StoreValue & 0xFF;
break;
}
}
}
return MissFlag;
}
/* Instruction Cache implementation (optional) */
void InitInstCache(void)
{
UINT32 i;
printf("[%s] +-----------------------------------+\n", __func__);
printf("[%s] | Lane's Inst Cache initializing... |\n", __func__);
printf("[%s] +-----------------------------------+\n", __func__);
for (i = 0; i < ICACHE_LINES; i++)
{
ICache[i].Valid = 0;
}
}
void LoadInstCacheLineFromMemory(UINT64 Address, UINT32 CacheLineAddress)
{
// Read ICACHE_DATA_PER_LINE bytes from Memory into one ICache line in a single pass
UINT32 i;
UINT64 ReadData;
UINT64 AlignAddress;
UINT64* pp;
AlignAddress = Address & ~(ICACHE_DATA_PER_LINE - 1); // the address must be aligned to an ICACHE_DATA_PER_LINE-byte boundary
pp = (UINT64*)ICache[CacheLineAddress].Data;
for (i = 0; i < ICACHE_DATA_PER_LINE / 8; i++)
{
ReadData = ReadMemory(AlignAddress + 8LL * i);
pp[i] = ReadData;
}
}
UINT8 AccessInstCache(UINT64 Address, UINT8 Operation, UINT8 InstSize, UINT64* InstResult)
{
// Return value: 'M' = Miss, 'H' = Hit
// Operation is always 'I' (read-only)
UINT32 CacheSetAddress;
UINT32 CacheLineAddress;
UINT8 BlockOffset;
UINT64 AddressTag;
UINT8 MissFlag = 'M';
UINT64 ReadValue;
*InstResult = 0;
/*
* In a set-associative cache, Address is split into AddressTag, CacheSetAddress and BlockOffset
*/
CacheSetAddress = (Address >> ICACHE_DATA_PER_LINE_ADDR_BITS) % ICACHE_SET; // set index
BlockOffset = Address % ICACHE_DATA_PER_LINE; // block offset
AddressTag = (Address >> ICACHE_DATA_PER_LINE_ADDR_BITS) >> ICACHE_SET_ADDR_BITS; // the Tag is what remains after dropping the ICACHE_SET and ICACHE_DATA_PER_LINE bits. Warning: do not use the whole address as the Tag!
for(int i=0;i<ICACHE_LINES_PER_SET;i++)
{
CacheLineAddress=CacheSetAddress*ICACHE_LINES_PER_SET+i;
if (ICache[CacheLineAddress].Valid == 1 && ICache[CacheLineAddress].Tag == AddressTag)
{
MissFlag = 'H'; // hit
ReadValue = 0;
switch (InstSize)
{
case 1: // 1 byte
ReadValue = ICache[CacheLineAddress].Data[BlockOffset + 0];
break;
case 2: // 2 bytes
BlockOffset = BlockOffset & 0xFE; // align to a 2-byte boundary
ReadValue = ICache[CacheLineAddress].Data[BlockOffset + 1]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 0];
break;
case 4: // 4 bytes
BlockOffset = BlockOffset & 0xFC; // align to a 4-byte boundary
ReadValue = ICache[CacheLineAddress].Data[BlockOffset + 3]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 2]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 1]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 0];
break;
case 8: // 8 bytes
BlockOffset = BlockOffset & 0xF8; // align to an 8-byte boundary
ReadValue = ICache[CacheLineAddress].Data[BlockOffset + 7]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 6]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 5]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 4]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 3]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 2]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 1]; ReadValue = ReadValue << 8;
ReadValue |= ICache[CacheLineAddress].Data[BlockOffset + 0];
break;
}
*InstResult = ReadValue;
}
}
if(MissFlag == 'M') // miss
{
UINT8 OKFlag=0;
// Is this set full?
for(int i=0;i<ICACHE_LINES_PER_SET;i++)
{
CacheLineAddress=CacheSetAddress*ICACHE_LINES_PER_SET+i;
if(ICache[CacheLineAddress].Valid == 0)
{
OKFlag=1;
break;
}
}
if(OKFlag==0)
{
// Random replacement: pick a random line within the set
UINT32 random_number = rand() % ICACHE_LINES_PER_SET;
CacheLineAddress=CacheSetAddress*ICACHE_LINES_PER_SET+random_number;
}
LoadInstCacheLineFromMemory(Address, CacheLineAddress);
ICache[CacheLineAddress].Valid = 1;
ICache[CacheLineAddress].Tag = AddressTag;
}
return MissFlag;
}
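The listing above repeats the same byte-by-byte `switch` four times. As a possible simplification, the little-endian assembly could be folded into two small helpers. This is a hedged sketch of that idea, not part of the original code: `read_le` and `write_le` are hypothetical names, and they use `<stdint.h>` types instead of the framework's `UINT8`/`UINT64`.

```c
#include <assert.h>
#include <stdint.h>

/* Read `size` bytes (1, 2, 4 or 8) in little-endian order from a cache line,
   aligning the offset down to a multiple of `size` first. */
static uint64_t read_le(const uint8_t *line, uint32_t offset, uint32_t size)
{
    uint64_t value = 0;
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    offset &= ~(size - 1);
    for (uint32_t i = size; i-- > 0;)
        value = (value << 8) | line[offset + i];
    return value;
}

/* Write the low `size` bytes of `value` in little-endian order into a cache line,
   aligning the offset down to a multiple of `size` first. */
static void write_le(uint8_t *line, uint32_t offset, uint32_t size, uint64_t value)
{
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    offset &= ~(size - 1);
    for (uint32_t i = 0; i < size; i++) {
        line[offset + i] = (uint8_t)(value & 0xFF);
        value >>= 8;
    }
}
```

With helpers like these, each `switch (DataSize)` block would reduce to a single call on the line's `Data` array, for both the hit path and the miss path.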
From: https://blog.csdn.net/Lane0218/article/details/140808501