Lab 3: User Environments
在这个lab中我们需要创建一个用户环境(UNIX中的进程,它们的接口和实现不同),加载一个程序并运行,并使内核能够处理一些常用的中断请求。
Part A: User Environments and Exception Handling
在kern/env.c
中可以找到内核维护的三个全局变量:
struct Env *envs = NULL; // All environments
struct Env *curenv = NULL; // The current env
static struct Env *env_free_list; // Free environment list
envs
是一个数组,每个元素都是一个Env
结构体。这个数组保存了所有的环境对应的结构体。curenv
指的是当前环境对应的Env
。env_free_list
保存空闲的环境,可以通过env_free_list->env_link
来访问下一项。
Environment State
在inc/env.h
中可以找到struct Env
的定义:
struct Env {
struct Trapframe env_tf; // Saved registers
struct Env *env_link; // Next free Env
envid_t env_id; // Unique environment identifier
envid_t env_parent_id; // env_id of this env's parent
enum EnvType env_type; // Indicates special system environments
unsigned env_status; // Status of the environment
uint32_t env_runs; // Number of times environment has run
// Address space
pde_t *env_pgdir; // Kernel virtual address of page dir
};
env_tf
当此环境未运行时,它保存此环境的寄存器的值,在恢复运行时恢复寄存器的值。env_link
指向下一个空闲的环境。env_id
在环境创建时分配的独有的标识符。env_parent_id
在一个环境中创建一个新的环境时,当前环境就视为新环境的「parent」,新环境的env_parent_id
就等于其「parent」的env_id
。env_type
环境的类型,大多数是ENV_TYPE_USER
,在之后的lab中会出现更多类型。env_status
环境的状态。env_runs
环境运行的次数。env_pgdir
环境对应的页目录的内核虚拟地址。
env_type:
ENV_FREE:
Indicates that the Env structure is inactive, and therefore on the env_free_list.
ENV_RUNNABLE:
Indicates that the Env structure represents an environment that is waiting to run on the processor.
ENV_RUNNING:
Indicates that the Env structure represents the currently running environment.
ENV_NOT_RUNNABLE:
Indicates that the Env structure represents a currently active environment, but it is not currently ready to run: for example, because it is waiting for an interprocess communication (IPC) from another environment.
ENV_DYING:
Indicates that the Env structure represents a zombie environment. A zombie environment will be freed the next time it traps to the kernel. We will not use this flag until Lab 4.
Allocating the Environments Array
为envs
数组分配空间。
在kern/pmap.c
中:
// Map the 'envs' array read-only by the user at linear address UENVS
// (ie. perm = PTE_U | PTE_P).
// Permissions:
// - the new image at UENVS -- kernel R, user R
// - envs itself -- kernel RW, user NONE
// LAB 3: Your code here.
boot_map_region(kern_pgdir, UENVS, NENV * sizeof(struct Env), PADDR(envs), PTE_U | PTE_P);
Creating and Running Environments
由于当前还没有文件系统,因此二进制文件是以ELF可执行映像格式嵌入在内核中的。
我们需要创建好用户环境,加载映像文件并运行。
kern/env.c:
env_init()
Initialize all of the Env structures in the envs array and add them to the env_free_list. Also calls env_init_percpu, which configures the segmentation hardware with separate segments for privilege level 0 (kernel) and privilege level 3 (user).
env_setup_vm()
Allocate a page directory for a new environment and initialize the kernel portion of the new environment's address space.
region_alloc()
Allocates and maps physical memory for an environment
load_icode()
You will need to parse an ELF binary image, much like the boot loader already does, and load its contents into the user address space of a new environment.
env_create()
Allocate an environment with env_alloc and call load_icode to load an ELF binary into it.
env_run()
Start a given environment running in user mode.
env_init
:
// Mark all environments in 'envs' as free, set their env_ids to 0,
// and insert them into the env_free_list.
// Make sure the environments are in the free list in the same order
// they are in the envs array (i.e., so that the first call to
// env_alloc() returns envs[0]).
//
void
env_init(void)
{
// Set up envs array
// LAB 3: Your code here.
env_free_list = NULL;
for (int i = NENV - 1; i >= 0; i--) {
envs[i].env_id = 0;
envs[i].env_link = env_free_list;
env_free_list = &envs[i];
}
// Per-CPU part of the initialization
env_init_percpu();
}
初始化envs
中所有的元素,注意链表的方向。
env_setup_vm
:
// Initialize the kernel virtual memory layout for environment e.
// Allocate a page directory, set e->env_pgdir accordingly,
// and initialize the kernel portion of the new environment's address space.
// Do NOT (yet) map anything into the user portion
// of the environment's virtual address space.
//
// Returns 0 on success, < 0 on error. Errors include:
// -E_NO_MEM if page directory or table could not be allocated.
//
static int
env_setup_vm(struct Env *e)
{
int i;
struct PageInfo *p = NULL;
// Allocate a page for the page directory
if (!(p = page_alloc(ALLOC_ZERO)))
return -E_NO_MEM;
// Now, set e->env_pgdir and initialize the page directory.
//
// Hint:
// - The VA space of all envs is identical above UTOP
// (except at UVPT, which we've set below).
// See inc/memlayout.h for permissions and layout.
// Can you use kern_pgdir as a template? Hint: Yes.
// (Make sure you got the permissions right in Lab 2.)
// - The initial VA below UTOP is empty.
// - You do not need to make any more calls to page_alloc.
// - Note: In general, pp_ref is not maintained for
// physical pages mapped only above UTOP, but env_pgdir
// is an exception -- you need to increment env_pgdir's
// pp_ref for env_free to work correctly.
// - The functions in kern/pmap.h are handy.
// LAB 3: Your code here.
p->pp_ref += 1;
e->env_pgdir = (pde_t *)page2kva(p);
memcpy(e->env_pgdir, kern_pgdir, PGSIZE);
// UVPT maps the env's own page table read-only.
// Permissions: kernel R, user R
e->env_pgdir[PDX(UVPT)] = PADDR(e->env_pgdir) | PTE_P | PTE_U;
return 0;
}
为环境的页目录分配一个内存页。然后增加该内存页的引用,将该页的虚拟地址作为环境的页目录地址。然后初始化该页目录。UVPT
映射到当前环境的页目录起始地址处。
region_alloc
:
// Allocate len bytes of physical memory for environment env,
// and map it at virtual address va in the environment's address space.
// Does not zero or otherwise initialize the mapped pages in any way.
// Pages should be writable by user and kernel.
// Panic if any allocation attempt fails.
//
static void
region_alloc(struct Env *e, void *va, size_t len)
{
// LAB 3: Your code here.
// (But only if you need it for load_icode.)
//
struct PageInfo *pp = NULL;
void *up = ROUNDUP(va + len, PGSIZE);
void *down = ROUNDDOWN(va, PGSIZE);
while (down < up) {
if((pp = page_alloc(0)) != NULL) {
if(page_insert(e->env_pgdir, pp, down, PTE_U | PTE_W) < 0){
panic("page_insert failed\n");
}
}
else {
panic("page_alloc failed\n");
}
down += PGSIZE;
}
// Hint: It is easier to use region_alloc if the caller can pass
// 'va' and 'len' values that are not page-aligned.
// You should round va down, and round (va + len) up.
// (Watch out for corner-cases!)
}
为环境e从虚拟地址va起分配len字节空间,需要注意的是地址对齐。
load_icode
:
// Set up the initial program binary, stack, and processor flags
// for a user process.
// This function is ONLY called during kernel initialization,
// before running the first user-mode environment.
//
// This function loads all loadable segments from the ELF binary image
// into the environment's user memory, starting at the appropriate
// virtual addresses indicated in the ELF program header.
// At the same time it clears to zero any portions of these segments
// that are marked in the program header as being mapped
// but not actually present in the ELF file - i.e., the program's bss section.
//
// All this is very similar to what our boot loader does, except the boot
// loader also needs to read the code from disk. Take a look at
// boot/main.c to get ideas.
//
// Finally, this function maps one page for the program's initial stack.
//
// load_icode panics if it encounters problems.
// - How might load_icode fail? What might be wrong with the given input?
//
static void
load_icode(struct Env *e, uint8_t *binary)
{
// Hints:
// Load each program segment into virtual memory
// at the address specified in the ELF segment header.
// You should only load segments with ph->p_type == ELF_PROG_LOAD.
// Each segment's virtual address can be found in ph->p_va
// and its size in memory can be found in ph->p_memsz.
// The ph->p_filesz bytes from the ELF binary, starting at
// 'binary + ph->p_offset', should be copied to virtual address
// ph->p_va. Any remaining memory bytes should be cleared to zero.
// (The ELF header should have ph->p_filesz <= ph->p_memsz.)
// Use functions from the previous lab to allocate and map pages.
//
// All page protection bits should be user read/write for now.
// ELF segments are not necessarily page-aligned, but you can
// assume for this function that no two segments will touch
// the same virtual page.
//
// You may find a function like region_alloc useful.
//
// Loading the segments is much simpler if you can move data
// directly into the virtual addresses stored in the ELF binary.
// So which page directory should be in force during
// this function?
//
// You must also do something with the program's entry point,
// to make sure that the environment starts executing there.
// What? (See env_run() and env_pop_tf() below.)
// LAB 3: Your code here.
struct Elf *ELFHDR = (struct Elf *)binary;
struct Proghdr *ph = NULL, *eph = NULL;
if (ELFHDR->e_magic != ELF_MAGIC) {
panic("ELFHDR->e_magic != ELF_MAGIC\n");
}
ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
eph = ph + ELFHDR->e_phnum;
lcr3(PADDR(e->env_pgdir));
for(; ph < eph; ph++){
if(ph->p_type == ELF_PROG_LOAD){
if(ph->p_filesz > ph->p_memsz){
panic("ph->filesz > ph->memsz\n");
}
region_alloc(e, (void *)ph->p_va, ph->p_memsz);
memcpy((void *)ph->p_va, (void *)(binary + ph->p_offset), ph->p_filesz);
memset((void *)(ph->p_va + ph->p_filesz), 0, ph->p_memsz - ph->p_filesz);
}
}
lcr3(PADDR(kern_pgdir));
e->env_tf.tf_eip = ELFHDR->e_entry;
// Now map one page for the program's initial stack
// at virtual address USTACKTOP - PGSIZE.
// LAB 3: Your code here.
region_alloc(e, (void *)USTACKTOP - PGSIZE, PGSIZE);
}
给定二进制文件的起始地址,我们需要将它加载进当前环境中。主要参考boot/main.c
的实现,先读取ELF头文件,计算每个段的偏移量并读取它们,因为是按页读取的,所以需要将多余的部分重新初始化一下。 lcr3(PADDR(e->env_pgdir));
设置使用用户的线性地址空间,此时使用内核的也可以,因为当前环境与内核的页目录只有一项不同。之后将环境的运行起始地址设为可执行文件的第一条指令。然后为程序分配一个栈。
env_create
:
// Allocates a new env with env_alloc, loads the named elf
// binary into it with load_icode, and sets its env_type.
// This function is ONLY called during kernel initialization,
// before running the first user-mode environment.
// The new env's parent ID is set to 0.
//
void
env_create(uint8_t *binary, enum EnvType type)
{
// LAB 3: Your code here.
struct Env *e = NULL;
if(env_alloc(&e, 0) == 0)
{
e->env_type = type;
load_icode(e, binary);
}
}
给定一个二进制文件和一个环境类型,我们要创建并初始化好这个环境。
直接调函数即可。env_alloc
的实现也在kern/env.c
中。该函数将创建好的环境保存在e中。我们直接设置环境类型并加载文件即可。
env_run
:
// Context switch from curenv to env e.
// Note: if this is the first call to env_run, curenv is NULL.
//
// This function does not return.
//
void
env_run(struct Env *e)
{
// Step 1: If this is a context switch (a new environment is running):
// 1. Set the current environment (if any) back to
// ENV_RUNNABLE if it is ENV_RUNNING (think about
// what other states it can be in),
// 2. Set 'curenv' to the new environment,
// 3. Set its status to ENV_RUNNING,
// 4. Update its 'env_runs' counter,
// 5. Use lcr3() to switch to its address space.
// Step 2: Use env_pop_tf() to restore the environment's
// registers and drop into user mode in the
// environment.
// Hint: This function loads the new environment's state from
// e->env_tf. Go back through the code you wrote above
// and make sure you have set the relevant parts of
// e->env_tf to sensible values.
// LAB 3: Your code here.
if (curenv != NULL && curenv->env_status == ENV_RUNNING) {
curenv->env_status = ENV_RUNNABLE;
}
curenv = e;
e->env_status = ENV_RUNNING;
e->env_runs += 1;
lcr3(PADDR(e->env_pgdir));
env_pop_tf(&(e->env_tf));
// panic("env_run not yet implemented");
}
从当前环境切换到环境e。设置当前环境和环境e的状态,增加环境e的运行次数,更新全局变量curenv
,加载环境e的地址空间。恢复保存的寄存器值。
env_pop_tf
:
// Restores the register values in the Trapframe with the 'iret' instruction.
// This exits the kernel and starts executing some environment's code.
//
// This function does not return.
//
void
env_pop_tf(struct Trapframe *tf)
{
asm volatile(
"\tmovl %0,%%esp\n"
"\tpopal\n"
"\tpopl %%es\n"
"\tpopl %%ds\n"
"\taddl $0x8,%%esp\n" /* skip tf_trapno and tf_errcode */
"\tiret\n"
: : "g" (tf) : "memory");
panic("iret failed"); /* mostly to placate the compiler */
}
这个函数接受一个struct Trapframe
为参数,这个结构的定义在inc/trap.h
中:
struct PushRegs {
/* registers as pushed by pusha */
uint32_t reg_edi;
uint32_t reg_esi;
uint32_t reg_ebp;
uint32_t reg_oesp; /* Useless */
uint32_t reg_ebx;
uint32_t reg_edx;
uint32_t reg_ecx;
uint32_t reg_eax;
} __attribute__((packed));
struct Trapframe {
struct PushRegs tf_regs;
uint16_t tf_es;
uint16_t tf_padding1;
uint16_t tf_ds;
uint16_t tf_padding2;
uint32_t tf_trapno;
/* below here defined by x86 hardware */
uint32_t tf_err;
uintptr_t tf_eip;
uint16_t tf_cs;
uint16_t tf_padding3;
uint32_t tf_eflags;
/* below here only when crossing rings, such as from user to kernel */
uintptr_t tf_esp;
uint16_t tf_ss;
uint16_t tf_padding4;
} __attribute__((packed));
再看看这个函数的汇编代码:
f010399a: 55 push %ebp
f010399b: 89 e5 mov %esp,%ebp
f010399d: 53 push %ebx
f010399e: 83 ec 08 sub $0x8,%esp
f01039a1: e8 c1 c7 ff ff call f0100167 <__x86.get_pc_thunk.bx>
f01039a6: 81 c3 7a 96 08 00 add $0x8967a,%ebx
asm volatile(
f01039ac: 8b 65 08 mov 0x8(%ebp),%esp
f01039af: 61 popa
f01039b0: 07 pop %es
f01039b1: 1f pop %ds
f01039b2: 83 c4 08 add $0x8,%esp
f01039b5: cf iret
"\tpopl %%es\n"
"\tpopl %%ds\n"
"\taddl $0x8,%%esp\n" /* skip tf_trapno and tf_errcode */
"\tiret\n"
: : "g" (tf) : "memory");
panic("iret failed"); /* mostly to placate the compiler */
f01039b6: 8d 83 e9 95 f7 ff lea -0x86a17(%ebx),%eax
f01039bc: 50 push %eax
f01039bd: 68 dd 01 00 00 push $0x1dd
f01039c2: 8d 83 6a 95 f7 ff lea -0x86a96(%ebx),%eax
f01039c8: 50 push %eax
f01039c9: e8 e3 c6 ff ff call f01000b1 <_panic>
tf_regs
保存了当前环境的寄存器的值,因为保存的寄存器为Trapframe的第一项,因此直接将tf
设当前的栈指针,然后通过popa
指令恢复通用寄存器的值。然后恢复寄存器es
和ds
的值。通过add $0x8,%esp
跳过tf_err
和tf_eip
。然后执行iret
指令,该指令会加载cs,eflags,esp,ss
等的值。
我们可以通过GDB调试查看一下执行iret
执行前后各个寄存器的值:
(gdb) b *0xf01039b5
Breakpoint 1 at 0xf01039b5: file kern/env.c, line 469.
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0xf01039b5 <env_pop_tf+27>: iret
Breakpoint 1, 0xf01039b5 in env_pop_tf (
tf=<error reading variable: Unknown argument list address for `tf'.>)
at kern/env.c:469
469 asm volatile(
(gdb) info registers
eax 0x0 0
ecx 0x0 0
edx 0x0 0
ebx 0x0 0
esp 0xf01d2030 0xf01d2030
ebp 0x0 0x0
esi 0x0 0
edi 0x0 0
eip 0xf01039b5 0xf01039b5 <env_pop_tf+27>
eflags 0x96 [ PF AF SF ]
cs 0x8 8
ss 0x10 16
ds 0x23 35
es 0x23 35
fs 0x23 35
gs 0x23 35
(gdb) si
=> 0x800020: cmp $0xeebfe000,%esp
0x00800020 in ?? ()
(gdb) info registers
eax 0x0 0
ecx 0x0 0
edx 0x0 0
ebx 0x0 0
esp 0xeebfe000 0xeebfe000
ebp 0x0 0x0
esi 0x0 0
edi 0x0 0
eip 0x800020 0x800020
eflags 0x2 [ ]
cs 0x1b 27
ss 0x23 35
ds 0x23 35
es 0x23 35
fs 0x23 35
gs 0x23 35
可以看到,执行iret后cs,eflags,esp,ss的值已经改变了,他们的值定义在env_alloc
函数里面:
……
// Set up appropriate initial values for the segment registers.
// GD_UD is the user data segment selector in the GDT, and
// GD_UT is the user text segment selector (see inc/memlayout.h).
// The low 2 bits of each segment register contains the
// Requestor Privilege Level (RPL); 3 means user mode. When
// we switch privilege levels, the hardware does various
// checks involving the RPL and the Descriptor Privilege Level
// (DPL) stored in the descriptors themselves.
e->env_tf.tf_ds = GD_UD | 3;
e->env_tf.tf_es = GD_UD | 3;
e->env_tf.tf_ss = GD_UD | 3;
e->env_tf.tf_esp = USTACKTOP;
e->env_tf.tf_cs = GD_UT | 3;
……
执行完iret
指令后,eip的值变为0x800020,这是hello程序第一条指令的位置。之后就是在用户环境里执行刚刚加载的程序了。
Handling Interrupts and Exceptions
在hello
程序里,我们会遇到int $0x30
这样一条指令。该指令是一个系统调用,将字符显示到控制台。
我们现在需要处理用户发出的系统调用请求。
前置芝士:Chapter 9 Exceptions and Interrupts
Basics of Protected Control Transfer
异常和中断都是受保护的控制转移,处理器从用户模式转向内核模式。中断通常是由外部设备引发的,异常则是由当前运行的代码产生的。为了使这两种控制转移确实是受保护的(受限的),x86提供了两种机制:
- The Interrupt Descriptor Table。产生中断或异常后,处理器执行的指令不能是随意的,应当由内核规定,并且用户无法修改。不同的中断信号所对应的中断向量号不同,每个都具有独特的处理程序。这个中断向量号可以作为IDT的索引。在描述符中有处理程序的地址和cs寄存器的值。
- The Task State Segment。在进入处理程序之前,我们需要保存当前环境的状态。而且保存的环境状态不能受到其他用户环境的干扰,否则可能会危害到内核。因此在从用户模式转到内核模式时,我们会使用内核的栈来保存环境状态。
A structure called the task state segment (TSS) specifies the segment selector and address where this stack lives. The processor pushes (on this new stack) SS, ESP, EFLAGS, CS, EIP, and an optional error code. Then it loads the CS and EIP from the interrupt descriptor, and sets the ESP and SS to refer to the new stack.
Types of Exceptions and Interrupts
x86可以产生的同步异常的中断向量号为 0-31。对应IDT的第0-31项。
Nested Exceptions and Interrupts
在内核模式出现异常或者中断时,不需要进行栈的切换,也就不需要保存当前环境的SS和ESP寄存器。
The processor can take exceptions and interrupts both from kernel and user mode. It is only when entering the kernel from user mode, however, that the x86 processor automatically switches stacks before pushing its old register state onto the stack and invoking the appropriate exception handler through the IDT. If the processor is already in kernel mode when the interrupt or exception occurs (the low 2 bits of the CS register are already zero), then the CPU just pushes more values on the same kernel stack. In this way, the kernel can gracefully handle nested exceptions caused by code within the kernel itself. This capability is an important tool in implementing protection, as we will see later in the section on system calls.
If the processor is already in kernel mode and takes a nested exception, since it does not need to switch stacks, it does not save the old SS or ESP registers. For exception types that do not push an error code, the kernel stack therefore looks like the following on entry to the exception handler:
+--------------------+ <---- old ESP
| old EFLAGS | " - 4
| 0x00000 | old CS | " - 8
| old EIP | " - 12
+--------------------+
For exception types that push an error code, the processor pushes the error code immediately after the old EIP, as before.
There is one important caveat to the processor's nested exception capability. If the processor takes an exception while already in kernel mode, and cannot push its old state onto the kernel stack for any reason such as lack of stack space, then there is nothing the processor can do to recover, so it simply resets itself. Needless to say, the kernel should be designed so that this can't happen.
Setting Up the IDT
在trapentry.S
有两个宏:
/* TRAPHANDLER defines a globally-visible function for handling a trap.
* It pushes a trap number onto the stack, then jumps to _alltraps.
* Use TRAPHANDLER for traps where the CPU automatically pushes an error code.
*
* You shouldn't call a TRAPHANDLER function from C, but you may
* need to _declare_ one in C (for instance, to get a function pointer
* during IDT setup). You can declare the function with
* void NAME();
* where NAME is the argument passed to TRAPHANDLER.
*/
#define TRAPHANDLER(name, num) \
.globl name; /* define global symbol for 'name' */ \
.type name, @function; /* symbol type is function */ \
.align 2; /* align function definition */ \
name: /* function starts here */ \
pushl $(num); \
jmp _alltraps
/* Use TRAPHANDLER_NOEC for traps where the CPU doesn't push an error code.
* It pushes a 0 in place of the error code, so the trap frame has the same
* format in either case.
*/
#define TRAPHANDLER_NOEC(name, num) \
.globl name; \
.type name, @function; \
.align 2; \
name: \
pushl $0; \
pushl $(num); \
jmp _alltraps
这两个宏向栈内压入一个数字num
,然后跳转到_alltraps
。TRPHANDLER_NOEC会向栈内多压一个0,这是为那些没有error code的中断和异常准备的,使所有中断和异常具有相同的格式,方便后续处理。
我们现在需要设置IDT表。在trapentry.S
中利用这两个宏来定义我们的处理程序:
TRAPHANDLER_NOEC(DIVIDE_HANDLER, T_DIVIDE);
TRAPHANDLER_NOEC(DEBUG_HANDLER, T_DEBUG);
TRAPHANDLER_NOEC(NMI_HANDLER, T_NMI);
TRAPHANDLER_NOEC(BRKPT_HANDLER, T_BRKPT);
TRAPHANDLER_NOEC(OFLOW_HANDLER, T_OFLOW);
TRAPHANDLER_NOEC(BOUND_HANDLER, T_BOUND);
TRAPHANDLER_NOEC(ILLOP_HANDLER, T_ILLOP);
TRAPHANDLER_NOEC(DEVICE_HANDLER, T_DEVICE);
TRAPHANDLER(DBLFLT_HANDLER, T_DBLFLT);
/* reserved */
TRAPHANDLER(TSS_HANDLER, T_TSS);
TRAPHANDLER(SEGNP_HANDLER, T_SEGNP);
TRAPHANDLER(STACK_HANDLER, T_STACK);
TRAPHANDLER(GPFLT_HANDLER, T_GPFLT);
TRAPHANDLER(PGFLT_HANDLER, T_PGFLT);
/* reserved */
TRAPHANDLER_NOEC(FPERR_HANDLER, T_FPERR);
TRAPHANDLER(ALIGN_HANDLER, T_ALIGN);
TRAPHANDLER_NOEC(MCHK_HANDLER, T_MCHK);
TRAPHANDLER_NOEC(SIMDERR_HANDLER, T_SIMDERR);
然后在trap.c
中定义我们的处理程序,然后使用SETGATE
加载IDT。
……
// LAB 3: Your code here.
void DIVIDE_HANDLER();
void DEBUG_HANDLER();
void NMI_HANDLER();
void BRKPT_HANDLER();
void OFLOW_HANDLER();
void BOUND_HANDLER();
void ILLOP_HANDLER();
void DEVICE_HANDLER();
void DBLFLT_HANDLER();
/* T_COPROC 9 reserved */
void TSS_HANDLER();
void SEGNP_HANDLER();
void STACK_HANDLER();
void GPFLT_HANDLER();
void PGFLT_HANDLER();
/* T_RES 15 reserved */
void FPERR_HANDLER();
void ALIGN_HANDLER();
void MCHK_HANDLER();
void SIMDERR_HANDLER();
SETGATE(idt[T_DIVIDE], 0, GD_KT, DIVIDE_HANDLER, 0);
SETGATE(idt[T_DEBUG], 0, GD_KT, DEBUG_HANDLER, 0);
SETGATE(idt[T_NMI], 0, GD_KT, NMI_HANDLER, 0);
SETGATE(idt[T_BRKPT], 0, GD_KT, BRKPT_HANDLER, 0);
SETGATE(idt[T_OFLOW], 0, GD_KT, OFLOW_HANDLER, 0);
SETGATE(idt[T_BOUND], 0, GD_KT, BOUND_HANDLER, 0);
SETGATE(idt[T_ILLOP], 0, GD_KT, ILLOP_HANDLER, 0);
SETGATE(idt[T_DEVICE], 0, GD_KT, DEVICE_HANDLER, 0);
SETGATE(idt[T_DBLFLT], 0, GD_KT, DBLFLT_HANDLER, 0);
/* reserved */
SETGATE(idt[T_TSS], 0, GD_KT, TSS_HANDLER, 0);
SETGATE(idt[T_SEGNP], 0, GD_KT, SEGNP_HANDLER, 0);
SETGATE(idt[T_STACK], 0, GD_KT, STACK_HANDLER, 0);
SETGATE(idt[T_GPFLT], 0, GD_KT, GPFLT_HANDLER, 0);
SETGATE(idt[T_PGFLT], 0, GD_KT, PGFLT_HANDLER, 0);
/* reserved */
SETGATE(idt[T_FPERR], 0, GD_KT, FPERR_HANDLER, 0);
SETGATE(idt[T_ALIGN], 0, GD_KT, ALIGN_HANDLER, 0);
SETGATE(idt[T_MCHK], 0, GD_KT, MCHK_HANDLER, 0);
SETGATE(idt[T_SIMDERR], 0, GD_KT, SIMDERR_HANDLER, 0);
……
现在在trapentry.S
中实现_alltraps
:
Your _alltraps should:
- push values to make the stack look like a struct Trapframe
- load GD_KD into %ds and %es
- pushl %esp to pass a pointer to the Trapframe as an argument to trap()
- call trap (can trap ever return?)
_alltraps:
pushl %ds
pushl %es
pushal
movw $GD_KD, %ax
movw %ax, %ds
movw %ax, %es
pushl %esp
call trap
如果是从用户模式转向内核模式,那么此时栈内的元素已经有ss,esp,eflags,cs,eip和可选的err。我们需要手动保存ds和es,然后保存通用寄存器。这个和env_pop_tf
的顺序刚好相反。然后调用trap函数进行处理。
Part B: Page Faults, Breakpoints Exceptions, and System Calls
Handling Page Faults
static void
trap_dispatch(struct Trapframe *tf)
{
// Handle processor exceptions.
// LAB 3: Your code here.
switch(tf->tf_trapno) {
case T_PGFLT:
page_fault_handler(tf);
return;
default:
break;
}
// Unexpected trap: The user process or the kernel has a bug.
print_trapframe(tf);
if (tf->tf_cs == GD_KT)
panic("unhandled trap in kernel");
else {
env_destroy(curenv);
return;
}
}
根据tf->tf_trapno来选择对应的处理程序。
The Breakpoint Exception
case T_BRKPT:
monitor(tf);
return;
在switch中添加一项即可。修改 SETGATE(idt[T_BRKPT], 0, GD_KT, BRKPT_HANDLER, 0);
的最后一项参数为3
。因为该函数的最后一项参数为描述符的特权等级,此处应该设置为用户所在的等级。
System calls
增加一项系统调用对应的处理程序。需要在trapentry.S
中添加:
TRAPHANDLER_NOEC(SYSCALL_HANDLER, T_SYSCALL);
在trap_init
中添加:
void SYSCALL_HANDLER();
SETGATE(idt[T_SYSCALL], 0, GD_KT, SYSCALL_HANDLER, 3);
在switch中添加:
case T_SYSCALL:
tf->tf_regs.reg_eax = syscall(tf->tf_regs.reg_eax,
tf->tf_regs.reg_edx,
tf->tf_regs.reg_ecx,
tf->tf_regs.reg_ebx,
tf->tf_regs.reg_edi,
tf->tf_regs.reg_esi);
return;
要注意系统调用的返回值保存在eax寄存器中。
syscall
:
// Dispatches to the correct kernel function, passing the arguments.
int32_t
syscall(uint32_t syscallno, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4, uint32_t a5)
{
// Call the function corresponding to the 'syscallno' parameter.
// Return any appropriate return value.
// LAB 3: Your code here.
// panic("syscall not implemented");
int32_t result;
switch (syscallno) {
case SYS_cgetc :
result = sys_cgetc();
break;
case SYS_cputs :
sys_cputs((char *)a1, a2);
result = 0;
break;
case SYS_env_destroy :
result = sys_env_destroy((envid_t)a1);
break;
case SYS_getenvid :
result = sys_getenvid();
break;
default:
result = -E_INVAL;
}
return result;
}
User-mode startup
void
libmain(int argc, char **argv)
{
// set thisenv to point at our Env structure in envs[].
// LAB 3: Your code here.
thisenv = &envs[ENVX(sys_getenvid())];
// save the name of the program so that panic() can use it
if (argc > 0)
binaryname = argv[0];
// call user main routine
umain(argc, argv);
// exit gracefully
exit();
}
Page faults and memory protection
page_fault_handler
:
void
page_fault_handler(struct Trapframe *tf)
{
uint32_t fault_va;
// Read processor's CR2 register to find the faulting address
fault_va = rcr2();
// Handle kernel-mode page faults.
if ((tf->tf_cs & 3) == 0) {
panic("page-fault in kernel!\n");
}
// LAB 3: Your code here.
// We've already handled kernel-mode exceptions, so if we get here,
// the page fault happened in user mode.
// Destroy the environment that caused the fault.
cprintf("[%08x] user fault va %08x ip %08x\n",
curenv->env_id, fault_va, tf->tf_eip);
print_trapframe(tf);
env_destroy(curenv);
}
根据cs的低2位来判断,00为内核模式。
user_mem_check
:
// Check that an environment is allowed to access the range of memory
// [va, va+len) with permissions 'perm | PTE_P'.
// Normally 'perm' will contain PTE_U at least, but this is not required.
// 'va' and 'len' need not be page-aligned; you must test every page that
// contains any of that range. You will test either 'len/PGSIZE',
// 'len/PGSIZE + 1', or 'len/PGSIZE + 2' pages.
//
// A user program can access a virtual address if (1) the address is below
// ULIM, and (2) the page table gives it permission. These are exactly
// the tests you should implement here.
//
// If there is an error, set the 'user_mem_check_addr' variable to the first
// erroneous virtual address.
//
// Returns 0 if the user program can access this range of addresses,
// and -E_FAULT otherwise.
//
int
user_mem_check(struct Env *env, const void *va, size_t len, int perm)
{
// LAB 3: Your code here.
int result = 0;
uint32_t cva = (uint32_t)va;
void *down = ROUNDDOWN((void *)cva, PGSIZE);
void *up = ROUNDUP((void *)cva + len, PGSIZE);
for (; down < up; down += PGSIZE) {
if((uint32_t)down >= ULIM) {
user_mem_check_addr = (uint32_t)down;
if((uint32_t)down < cva) user_mem_check_addr = cva;
result = -E_FAULT;
break;
}
pte_t *pte = pgdir_walk(env->env_pgdir, down, 0);
if(!pte || ((*pte) & (perm | PTE_P)) != (perm | PTE_P)){
user_mem_check_addr = (uint32_t)down;
if((uint32_t)down < cva) user_mem_check_addr = cva;
result = -E_FAULT;
break;
}
}
return result;
}
根据注释要求写即可。
// Print a string to the system console.
// The string is exactly 'len' characters long.
// Destroys the environment on memory errors.
static void
sys_cputs(const char *s, size_t len)
{
// Check that the user has permission to read memory [s, s+len).
// Destroy the environment if not.
// LAB 3: Your code here.
user_mem_assert(curenv, s, len, PTE_U);
// Print the string supplied by the user.
cprintf("%.*s", len, s);
}
make grade
:
divzero: OK (1.3s)
softint: OK (2.0s)
badsegment: OK (1.0s)
Part A score: 30/30
faultread: OK (2.0s)
faultreadkernel: OK (1.1s)
faultwrite: OK (1.9s)
faultwritekernel: OK (1.7s)
breakpoint: OK (1.3s)
testbss: OK (2.0s)
hello: OK (1.9s)
buggyhello: OK (1.1s)
buggyhello2: OK (1.0s)
evilhello: OK (1.6s)
Part B score: 50/50
Score: 80/80
标签:Lab3,environment,void,笔记,MIT6.828,HANDLER,user,env,tf
From: https://www.cnblogs.com/despot/p/despot_mit6828_lab3.html