MIT6.828学习笔记3(Lab3)

时间:2022-12-08 07:08:02

在这个lab中我们需要创建一个用户环境(UNIX中的进程,它们的接口和实现不同),加载一个程序并运行,并使内核能够处理一些常用的中断请求。

Part A: User Environments and Exception Handling

kern/env.c中可以找到内核维护的三个全局变量:

struct Env *envs = NULL;		// All environments
struct Env *curenv = NULL;		// The current env
static struct Env *env_free_list;	// Free environment list
  • envs是一个数组,每个元素都是一个Env结构体。这个数组保存了所有的环境对应的结构体。
  • curenv指的是当前环境对应的Env
  • env_free_list保存空闲的环境,可以通过env_free_list->env_link来访问下一项。

Environment State

inc/env.h中可以找到struct Env的定义:

struct Env {
	struct Trapframe env_tf;	// Saved registers
	struct Env *env_link;		// Next free Env
	envid_t env_id;			// Unique environment identifier
	envid_t env_parent_id;		// env_id of this env's parent
	enum EnvType env_type;		// Indicates special system environments
	unsigned env_status;		// Status of the environment
	uint32_t env_runs;		// Number of times environment has run

	// Address space
	pde_t *env_pgdir;		// Kernel virtual address of page dir
};
  • env_tf当此环境未运行时,它保存此环境的寄存器的值,在恢复运行时恢复寄存器的值。
  • env_link指向下一个空闲的环境。
  • env_id在环境创建时分配的独有的标识符。
  • env_parent_id在一个环境中创建一个新的环境时,当前环境就视为新环境的「parent」,新环境的env_parent_id就等于其「parent」的env_id
  • env_type环境的类型,大多数是ENV_TYPE_USER,在之后的lab中会出现更多类型。
  • env_status环境的状态。
  • env_runs环境运行的次数。
  • env_pgdir环境对应的页目录的内核虚拟地址。
env_type:

ENV_FREE:
    Indicates that the Env structure is inactive, and therefore on the env_free_list.
ENV_RUNNABLE:
    Indicates that the Env structure represents an environment that is waiting to run on the processor.
ENV_RUNNING:
    Indicates that the Env structure represents the currently running environment.
ENV_NOT_RUNNABLE:
    Indicates that the Env structure represents a currently active environment, but it is not currently ready to run: for example, because it is waiting for an interprocess communication (IPC) from another environment.
ENV_DYING:
    Indicates that the Env structure represents a zombie environment. A zombie environment will be freed the next time it traps to the kernel. We will not use this flag until Lab 4. 

Allocating the Environments Array

envs数组分配空间。
kern/pmap.c中:

	// Map the 'envs' array read-only by the user at linear address UENVS
	// (ie. perm = PTE_U | PTE_P).
	// Permissions:
	//    - the new image at UENVS  -- kernel R, user R
	//    - envs itself -- kernel RW, user NONE
	// LAB 3: Your code here.
	boot_map_region(kern_pgdir, UENVS, NENV * sizeof(struct Env), PADDR(envs), PTE_U | PTE_P);

Creating and Running Environments

由于当前还没有文件系统,因此二进制文件是以ELF可执行映像格式嵌入在内核中的。
我们需要创建好用户环境,加载映像文件并运行。

kern/env.c:

env_init()
    Initialize all of the Env structures in the envs array and add them to the env_free_list. Also calls env_init_percpu, which configures the segmentation hardware with separate segments for privilege level 0 (kernel) and privilege level 3 (user).
env_setup_vm()
    Allocate a page directory for a new environment and initialize the kernel portion of the new environment's address space.
region_alloc()
    Allocates and maps physical memory for an environment
load_icode()
    You will need to parse an ELF binary image, much like the boot loader already does, and load its contents into the user address space of a new environment.
env_create()
    Allocate an environment with env_alloc and call load_icode to load an ELF binary into it.
env_run()
    Start a given environment running in user mode. 

env_init:

// Mark all environments in 'envs' as free, set their env_ids to 0,
// and insert them into the env_free_list.
// Make sure the environments are in the free list in the same order
// they are in the envs array (i.e., so that the first call to
// env_alloc() returns envs[0]).
//
void
env_init(void)
{
	// Set up envs array
	// LAB 3: Your code here.
	env_free_list = NULL;
	for (int i = NENV - 1; i >= 0; i--) {
		envs[i].env_id = 0;
		envs[i].env_link = env_free_list;
		env_free_list = &envs[i];
	}
	// Per-CPU part of the initialization
	env_init_percpu();
}

初始化envs中所有的元素,注意链表的方向。

env_setup_vm:

// Initialize the kernel virtual memory layout for environment e.
// Allocate a page directory, set e->env_pgdir accordingly,
// and initialize the kernel portion of the new environment's address space.
// Do NOT (yet) map anything into the user portion
// of the environment's virtual address space.
//
// Returns 0 on success, < 0 on error.  Errors include:
//	-E_NO_MEM if page directory or table could not be allocated.
//
static int
env_setup_vm(struct Env *e)
{
	int i;
	struct PageInfo *p = NULL;

	// Allocate a page for the page directory
	if (!(p = page_alloc(ALLOC_ZERO)))
		return -E_NO_MEM;

	// Now, set e->env_pgdir and initialize the page directory.
	//
	// Hint:
	//    - The VA space of all envs is identical above UTOP
	//	(except at UVPT, which we've set below).
	//	See inc/memlayout.h for permissions and layout.
	//	Can you use kern_pgdir as a template?  Hint: Yes.
	//	(Make sure you got the permissions right in Lab 2.)
	//    - The initial VA below UTOP is empty.
	//    - You do not need to make any more calls to page_alloc.
	//    - Note: In general, pp_ref is not maintained for
	//	physical pages mapped only above UTOP, but env_pgdir
	//	is an exception -- you need to increment env_pgdir's
	//	pp_ref for env_free to work correctly.
	//    - The functions in kern/pmap.h are handy.

	// LAB 3: Your code here.
	p->pp_ref += 1;
	e->env_pgdir = (pde_t *)page2kva(p);
	memcpy(e->env_pgdir, kern_pgdir, PGSIZE);
	// UVPT maps the env's own page table read-only.
	// Permissions: kernel R, user R
	e->env_pgdir[PDX(UVPT)] = PADDR(e->env_pgdir) | PTE_P | PTE_U;

	return 0;
}

为环境的页目录分配一个内存页。然后增加该内存页的引用,将该页的虚拟地址作为环境的页目录地址。然后初始化该页目录。UVPT映射到当前环境的页目录起始地址处。

region_alloc:

// Allocate len bytes of physical memory for environment env,
// and map it at virtual address va in the environment's address space.
// Does not zero or otherwise initialize the mapped pages in any way.
// Pages should be writable by user and kernel.
// Panic if any allocation attempt fails.
//
static void
region_alloc(struct Env *e, void *va, size_t len)
{
	// LAB 3: Your code here.
	// (But only if you need it for load_icode.)
	//
	struct PageInfo *pp = NULL;
	void *up = ROUNDUP(va + len, PGSIZE);
	void *down = ROUNDDOWN(va, PGSIZE);
	while (down < up) {
		if((pp = page_alloc(0)) != NULL) {
			if(page_insert(e->env_pgdir, pp, down, PTE_U | PTE_W) < 0){
				panic("page_insert failed\n");
			}
		}
		else {
			panic("page_alloc failed\n");
		}
		down += PGSIZE;
	}
	// Hint: It is easier to use region_alloc if the caller can pass
	//   'va' and 'len' values that are not page-aligned.
	//   You should round va down, and round (va + len) up.
	//   (Watch out for corner-cases!)
}

为环境e从虚拟地址va起分配len字节空间,需要注意的是地址对齐。

load_icode:

// Set up the initial program binary, stack, and processor flags
// for a user process.
// This function is ONLY called during kernel initialization,
// before running the first user-mode environment.
//
// This function loads all loadable segments from the ELF binary image
// into the environment's user memory, starting at the appropriate
// virtual addresses indicated in the ELF program header.
// At the same time it clears to zero any portions of these segments
// that are marked in the program header as being mapped
// but not actually present in the ELF file - i.e., the program's bss section.
//
// All this is very similar to what our boot loader does, except the boot
// loader also needs to read the code from disk.  Take a look at
// boot/main.c to get ideas.
//
// Finally, this function maps one page for the program's initial stack.
//
// load_icode panics if it encounters problems.
//  - How might load_icode fail?  What might be wrong with the given input?
//
static void
load_icode(struct Env *e, uint8_t *binary)
{
	// Hints:
	//  Load each program segment into virtual memory
	//  at the address specified in the ELF segment header.
	//  You should only load segments with ph->p_type == ELF_PROG_LOAD.
	//  Each segment's virtual address can be found in ph->p_va
	//  and its size in memory can be found in ph->p_memsz.
	//  The ph->p_filesz bytes from the ELF binary, starting at
	//  'binary + ph->p_offset', should be copied to virtual address
	//  ph->p_va.  Any remaining memory bytes should be cleared to zero.
	//  (The ELF header should have ph->p_filesz <= ph->p_memsz.)
	//  Use functions from the previous lab to allocate and map pages.
	//
	//  All page protection bits should be user read/write for now.
	//  ELF segments are not necessarily page-aligned, but you can
	//  assume for this function that no two segments will touch
	//  the same virtual page.
	//
	//  You may find a function like region_alloc useful.
	//
	//  Loading the segments is much simpler if you can move data
	//  directly into the virtual addresses stored in the ELF binary.
	//  So which page directory should be in force during
	//  this function?
	//
	//  You must also do something with the program's entry point,
	//  to make sure that the environment starts executing there.
	//  What?  (See env_run() and env_pop_tf() below.)

	// LAB 3: Your code here.
	struct Elf *ELFHDR = (struct Elf *)binary;
	struct Proghdr *ph = NULL, *eph = NULL;
	if (ELFHDR->e_magic != ELF_MAGIC) {
		panic("ELFHDR->e_magic != ELF_MAGIC\n");
	}
	ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
	eph = ph + ELFHDR->e_phnum;
	lcr3(PADDR(e->env_pgdir));
	for(; ph < eph; ph++){
		if(ph->p_type == ELF_PROG_LOAD){
			if(ph->p_filesz > ph->p_memsz){
				panic("ph->filesz > ph->memsz\n");
			}
			region_alloc(e, (void *)ph->p_va, ph->p_memsz);
			memcpy((void *)ph->p_va, (void *)(binary + ph->p_offset), ph->p_filesz);
			memset((void *)(ph->p_va + ph->p_filesz), 0, ph->p_memsz - ph->p_filesz);
		}

	}
	lcr3(PADDR(kern_pgdir));
	e->env_tf.tf_eip = ELFHDR->e_entry;
	// Now map one page for the program's initial stack
	// at virtual address USTACKTOP - PGSIZE.

	// LAB 3: Your code here.
	region_alloc(e, (void *)USTACKTOP - PGSIZE, PGSIZE);
}	

给定二进制文件的起始地址,我们需要将它加载进当前环境中。主要参考boot/main.c的实现,先读取ELF头文件,计算每个段的偏移量并读取它们,因为是按页读取的,所以需要将多余的部分重新初始化一下。 lcr3(PADDR(e->env_pgdir));设置使用用户的线性地址空间,此时使用内核的也可以,因为当前环境与内核的页目录只有一项不同。之后将环境的运行起始地址设为可执行文件的第一条指令。然后为程序分配一个栈。

env_create:

// Allocates a new env with env_alloc, loads the named elf
// binary into it with load_icode, and sets its env_type.
// This function is ONLY called during kernel initialization,
// before running the first user-mode environment.
// The new env's parent ID is set to 0.
//
void
env_create(uint8_t *binary, enum EnvType type)
{
	// LAB 3: Your code here.
	struct Env *e = NULL;
	if(env_alloc(&e, 0) == 0)
	{
		e->env_type = type;
		load_icode(e, binary);
	}
}

给定一个二进制文件和一个环境类型,我们要创建并初始化好这个环境。
直接调函数即可。env_alloc的实现也在kern/env.c中。该函数将创建好的环境保存在e中。我们直接设置环境类型并加载文件即可。

env_run:

// Context switch from curenv to env e.
// Note: if this is the first call to env_run, curenv is NULL.
//
// This function does not return.
//
void
env_run(struct Env *e)
{
	// Step 1: If this is a context switch (a new environment is running):
	//	   1. Set the current environment (if any) back to
	//	      ENV_RUNNABLE if it is ENV_RUNNING (think about
	//	      what other states it can be in),
	//	   2. Set 'curenv' to the new environment,
	//	   3. Set its status to ENV_RUNNING,
	//	   4. Update its 'env_runs' counter,
	//	   5. Use lcr3() to switch to its address space.
	// Step 2: Use env_pop_tf() to restore the environment's
	//	   registers and drop into user mode in the
	//	   environment.

	// Hint: This function loads the new environment's state from
	//	e->env_tf.  Go back through the code you wrote above
	//	and make sure you have set the relevant parts of
	//	e->env_tf to sensible values.

	// LAB 3: Your code here.
	if (curenv != NULL && curenv->env_status == ENV_RUNNING) {
		curenv->env_status = ENV_RUNNABLE;
	}
	curenv = e;
	e->env_status = ENV_RUNNING;
	e->env_runs += 1;
	lcr3(PADDR(e->env_pgdir));
	env_pop_tf(&(e->env_tf));
	// panic("env_run not yet implemented");
}

从当前环境切换到环境e。设置当前环境和环境e的状态,增加环境e的运行次数,更新全局变量curenv,加载环境e的地址空间。恢复保存的寄存器值。

env_pop_tf:

// Restores the register values in the Trapframe with the 'iret' instruction.
// This exits the kernel and starts executing some environment's code.
//
// This function does not return.
//
void
env_pop_tf(struct Trapframe *tf)
{
	asm volatile(
		"\tmovl %0,%%esp\n"
		"\tpopal\n"
		"\tpopl %%es\n"
		"\tpopl %%ds\n"
		"\taddl $0x8,%%esp\n" /* skip tf_trapno and tf_errcode */
		"\tiret\n"
		: : "g" (tf) : "memory");
	panic("iret failed");  /* mostly to placate the compiler */
}

这个函数接受一个struct Trapframe为参数,这个结构的定义在inc/trap.h中:

struct PushRegs {
	/* registers as pushed by pusha */
	uint32_t reg_edi;
	uint32_t reg_esi;
	uint32_t reg_ebp;
	uint32_t reg_oesp;		/* Useless */
	uint32_t reg_ebx;
	uint32_t reg_edx;
	uint32_t reg_ecx;
	uint32_t reg_eax;
} __attribute__((packed));

struct Trapframe {
	struct PushRegs tf_regs;
	uint16_t tf_es;
	uint16_t tf_padding1;
	uint16_t tf_ds;
	uint16_t tf_padding2;
	uint32_t tf_trapno;
	/* below here defined by x86 hardware */
	uint32_t tf_err;
	uintptr_t tf_eip;
	uint16_t tf_cs;
	uint16_t tf_padding3;
	uint32_t tf_eflags;
	/* below here only when crossing rings, such as from user to kernel */
	uintptr_t tf_esp;
	uint16_t tf_ss;
	uint16_t tf_padding4;
} __attribute__((packed));

再看看这个函数的汇编代码:

f010399a:	55                   	push   %ebp
f010399b:	89 e5                	mov    %esp,%ebp
f010399d:	53                   	push   %ebx
f010399e:	83 ec 08             	sub    $0x8,%esp
f01039a1:	e8 c1 c7 ff ff       	call   f0100167 <__x86.get_pc_thunk.bx>
f01039a6:	81 c3 7a 96 08 00    	add    $0x8967a,%ebx
	asm volatile(
f01039ac:	8b 65 08             	mov    0x8(%ebp),%esp
f01039af:	61                   	popa   
f01039b0:	07                   	pop    %es
f01039b1:	1f                   	pop    %ds
f01039b2:	83 c4 08             	add    $0x8,%esp
f01039b5:	cf                   	iret   
		"\tpopl %%es\n"
		"\tpopl %%ds\n"
		"\taddl $0x8,%%esp\n" /* skip tf_trapno and tf_errcode */
		"\tiret\n"
		: : "g" (tf) : "memory");
	panic("iret failed");  /* mostly to placate the compiler */
f01039b6:	8d 83 e9 95 f7 ff    	lea    -0x86a17(%ebx),%eax
f01039bc:	50                   	push   %eax
f01039bd:	68 dd 01 00 00       	push   $0x1dd
f01039c2:	8d 83 6a 95 f7 ff    	lea    -0x86a96(%ebx),%eax
f01039c8:	50                   	push   %eax
f01039c9:	e8 e3 c6 ff ff       	call   f01000b1 <_panic>

tf_regs保存了当前环境的寄存器的值,因为保存的寄存器为Trapframe的第一项,因此直接将tf设当前的栈指针,然后通过popa指令恢复通用寄存器的值。然后恢复寄存器esds的值。通过add $0x8,%esp跳过tf_errtf_eip。然后执行iret指令,该指令会加载cs,eflags,esp,ss等的值。
我们可以通过GDB调试查看一下执行iret执行前后各个寄存器的值:

(gdb) b *0xf01039b5
Breakpoint 1 at 0xf01039b5: file kern/env.c, line 469.
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0xf01039b5 <env_pop_tf+27>:	iret   

Breakpoint 1, 0xf01039b5 in env_pop_tf (
    tf=<error reading variable: Unknown argument list address for `tf'.>)
    at kern/env.c:469
469		asm volatile(
(gdb) info registers
eax            0x0	0
ecx            0x0	0
edx            0x0	0
ebx            0x0	0
esp            0xf01d2030	0xf01d2030
ebp            0x0	0x0
esi            0x0	0
edi            0x0	0
eip            0xf01039b5	0xf01039b5 <env_pop_tf+27>
eflags         0x96	[ PF AF SF ]
cs             0x8	8
ss             0x10	16
ds             0x23	35
es             0x23	35
fs             0x23	35
gs             0x23	35
(gdb) si
=> 0x800020:	cmp    $0xeebfe000,%esp
0x00800020 in ?? ()
(gdb) info registers
eax            0x0	0
ecx            0x0	0
edx            0x0	0
ebx            0x0	0
esp            0xeebfe000	0xeebfe000
ebp            0x0	0x0
esi            0x0	0
edi            0x0	0
eip            0x800020	0x800020
eflags         0x2	[ ]
cs             0x1b	27
ss             0x23	35
ds             0x23	35
es             0x23	35
fs             0x23	35
gs             0x23	35

可以看到,执行iret后cs,eflags,esp,ss的值已经改变了,他们的值定义在env_alloc函数里面:

……
	// Set up appropriate initial values for the segment registers.
	// GD_UD is the user data segment selector in the GDT, and
	// GD_UT is the user text segment selector (see inc/memlayout.h).
	// The low 2 bits of each segment register contains the
	// Requestor Privilege Level (RPL); 3 means user mode.  When
	// we switch privilege levels, the hardware does various
	// checks involving the RPL and the Descriptor Privilege Level
	// (DPL) stored in the descriptors themselves.
	e->env_tf.tf_ds = GD_UD | 3;
	e->env_tf.tf_es = GD_UD | 3;
	e->env_tf.tf_ss = GD_UD | 3;
	e->env_tf.tf_esp = USTACKTOP;
	e->env_tf.tf_cs = GD_UT | 3;
……

执行完iret指令后,eip的值变为0x800020,这是hello程序第一条指令的位置。之后就是在用户环境里执行刚刚加载的程序了。

Handling Interrupts and Exceptions

hello程序里,我们会遇到int $0x30这样一条指令。该指令是一个系统调用,将字符显示到控制台。

我们现在需要处理用户发出的系统调用请求。

前置芝士:Chapter 9 Exceptions and Interrupts

Basics of Protected Control Transfer

异常和中断都是受保护的控制转移,处理器从用户模式转向内核模式。中断通常是由外部设备引发的,异常则是由当前运行的代码产生的。为了使这两种控制转移确实是受保护的(受限的),x86提供了两种机制:

  • The Interrupt Descriptor Table。产生中断或异常后,处理器执行的指令不能是随意的,应当由内核规定,并且用户无法修改。不同的中断信号所对应的中断向量号不同,每个都具有独特的处理程序。这个中断向量号可以作为IDT的索引。在描述符中有处理程序的地址和cs寄存器的值。
  • The Task State Segment。在进入处理程序之前,我们需要保存当前环境的状态。而且保存的环境状态不能受到其他用户环境的干扰,否则可能会危害到内核。因此在从用户模式转到内核模式时,我们会使用内核的栈来保存环境状态。

A structure called the task state segment (TSS) specifies the segment selector and address where this stack lives. The processor pushes (on this new stack) SS, ESP, EFLAGS, CS, EIP, and an optional error code. Then it loads the CS and EIP from the interrupt descriptor, and sets the ESP and SS to refer to the new stack.

Types of Exceptions and Interrupts

x86可以产生的同步异常的中断向量号为 0-31。对应IDT的第0-31项。

Nested Exceptions and Interrupts

在内核模式出现异常或者中断时,不需要进行栈的切换,也就不需要保存当前环境的SS和ESP寄存器。

 The processor can take exceptions and interrupts both from kernel and user mode. It is only when entering the kernel from user mode, however, that the x86 processor automatically switches stacks before pushing its old register state onto the stack and invoking the appropriate exception handler through the IDT. If the processor is already in kernel mode when the interrupt or exception occurs (the low 2 bits of the CS register are already zero), then the CPU just pushes more values on the same kernel stack. In this way, the kernel can gracefully handle nested exceptions caused by code within the kernel itself. This capability is an important tool in implementing protection, as we will see later in the section on system calls.

If the processor is already in kernel mode and takes a nested exception, since it does not need to switch stacks, it does not save the old SS or ESP registers. For exception types that do not push an error code, the kernel stack therefore looks like the following on entry to the exception handler:

                     +--------------------+ <---- old ESP
                     |     old EFLAGS     |     " - 4
                     | 0x00000 | old CS   |     " - 8
                     |      old EIP       |     " - 12
                     +--------------------+             

For exception types that push an error code, the processor pushes the error code immediately after the old EIP, as before.

There is one important caveat to the processor's nested exception capability. If the processor takes an exception while already in kernel mode, and cannot push its old state onto the kernel stack for any reason such as lack of stack space, then there is nothing the processor can do to recover, so it simply resets itself. Needless to say, the kernel should be designed so that this can't happen. 

Setting Up the IDT

trapentry.S有两个宏:

/* TRAPHANDLER defines a globally-visible function for handling a trap.
 * It pushes a trap number onto the stack, then jumps to _alltraps.
 * Use TRAPHANDLER for traps where the CPU automatically pushes an error code.
 *
 * You shouldn't call a TRAPHANDLER function from C, but you may
 * need to _declare_ one in C (for instance, to get a function pointer
 * during IDT setup).  You can declare the function with
 *   void NAME();
 * where NAME is the argument passed to TRAPHANDLER.
 */
#define TRAPHANDLER(name, num)						\
	.globl name;		/* define global symbol for 'name' */	\
	.type name, @function;	/* symbol type is function */		\
	.align 2;		/* align function definition */		\
	name:			/* function starts here */		\
	pushl $(num);							\
	jmp _alltraps

/* Use TRAPHANDLER_NOEC for traps where the CPU doesn't push an error code.
 * It pushes a 0 in place of the error code, so the trap frame has the same
 * format in either case.
 */
#define TRAPHANDLER_NOEC(name, num)					\
	.globl name;							\
	.type name, @function;						\
	.align 2;							\
	name:								\
	pushl $0;							\
	pushl $(num);							\
	jmp _alltraps

这两个宏向栈内压入一个数字num,然后跳转到_alltraps。TRPHANDLER_NOEC会向栈内多压一个0,这是为那些没有error code的中断和异常准备的,使所有中断和异常具有相同的格式,方便后续处理。

我们现在需要设置IDT表。在trapentry.S中利用这两个宏来定义我们的处理程序:

TRAPHANDLER_NOEC(DIVIDE_HANDLER, T_DIVIDE);
TRAPHANDLER_NOEC(DEBUG_HANDLER, T_DEBUG);
TRAPHANDLER_NOEC(NMI_HANDLER, T_NMI);
TRAPHANDLER_NOEC(BRKPT_HANDLER, T_BRKPT);
TRAPHANDLER_NOEC(OFLOW_HANDLER, T_OFLOW);
TRAPHANDLER_NOEC(BOUND_HANDLER, T_BOUND);
TRAPHANDLER_NOEC(ILLOP_HANDLER, T_ILLOP);
TRAPHANDLER_NOEC(DEVICE_HANDLER, T_DEVICE);
TRAPHANDLER(DBLFLT_HANDLER, T_DBLFLT);
/* reserved */
TRAPHANDLER(TSS_HANDLER, T_TSS);
TRAPHANDLER(SEGNP_HANDLER, T_SEGNP);
TRAPHANDLER(STACK_HANDLER, T_STACK);
TRAPHANDLER(GPFLT_HANDLER, T_GPFLT);
TRAPHANDLER(PGFLT_HANDLER, T_PGFLT);
/* reserved */
TRAPHANDLER_NOEC(FPERR_HANDLER, T_FPERR);
TRAPHANDLER(ALIGN_HANDLER, T_ALIGN);
TRAPHANDLER_NOEC(MCHK_HANDLER, T_MCHK);
TRAPHANDLER_NOEC(SIMDERR_HANDLER, T_SIMDERR);

MIT6.828学习笔记3(Lab3)

然后在trap.c中定义我们的处理程序,然后使用SETGATE加载IDT。

……
	// LAB 3: Your code here.
	void DIVIDE_HANDLER();
	void DEBUG_HANDLER();
	void NMI_HANDLER();
	void BRKPT_HANDLER();
	void OFLOW_HANDLER();
	void BOUND_HANDLER();
	void ILLOP_HANDLER();
	void DEVICE_HANDLER();
	void DBLFLT_HANDLER();
	/* T_COPROC 9 reserved */
	void TSS_HANDLER();
	void SEGNP_HANDLER();
	void STACK_HANDLER();
	void GPFLT_HANDLER();
	void PGFLT_HANDLER();
	/* T_RES 15 reserved */
	void FPERR_HANDLER();
	void ALIGN_HANDLER();
	void MCHK_HANDLER();
	void SIMDERR_HANDLER();


	SETGATE(idt[T_DIVIDE], 0, GD_KT, DIVIDE_HANDLER, 0);
	SETGATE(idt[T_DEBUG], 0, GD_KT, DEBUG_HANDLER, 0);
	SETGATE(idt[T_NMI], 0, GD_KT, NMI_HANDLER, 0);
	SETGATE(idt[T_BRKPT], 0, GD_KT, BRKPT_HANDLER, 0);
	SETGATE(idt[T_OFLOW], 0, GD_KT, OFLOW_HANDLER, 0);
	SETGATE(idt[T_BOUND], 0, GD_KT, BOUND_HANDLER, 0);
	SETGATE(idt[T_ILLOP], 0, GD_KT, ILLOP_HANDLER, 0);
	SETGATE(idt[T_DEVICE], 0, GD_KT, DEVICE_HANDLER, 0);
	SETGATE(idt[T_DBLFLT], 0, GD_KT, DBLFLT_HANDLER, 0);
	/* reserved */
	SETGATE(idt[T_TSS], 0, GD_KT, TSS_HANDLER, 0);
	SETGATE(idt[T_SEGNP], 0, GD_KT, SEGNP_HANDLER, 0);
	SETGATE(idt[T_STACK], 0, GD_KT, STACK_HANDLER, 0);
	SETGATE(idt[T_GPFLT], 0, GD_KT, GPFLT_HANDLER, 0);
	SETGATE(idt[T_PGFLT], 0, GD_KT, PGFLT_HANDLER, 0);
	/* reserved */
	SETGATE(idt[T_FPERR], 0, GD_KT, FPERR_HANDLER, 0);
	SETGATE(idt[T_ALIGN], 0, GD_KT, ALIGN_HANDLER, 0);
	SETGATE(idt[T_MCHK], 0, GD_KT, MCHK_HANDLER, 0);
	SETGATE(idt[T_SIMDERR], 0, GD_KT, SIMDERR_HANDLER, 0);
……

现在在trapentry.S中实现_alltraps:

Your _alltraps should:

  • push values to make the stack look like a struct Trapframe
  • load GD_KD into %ds and %es
  • pushl %esp to pass a pointer to the Trapframe as an argument to trap()
  • call trap (can trap ever return?)
_alltraps:
	pushl %ds
	pushl %es
	pushal
	movw $GD_KD, %ax 
	movw %ax, %ds
	movw %ax, %es
	pushl %esp
	call trap

如果是从用户模式转向内核模式,那么此时栈内的元素已经有ss,esp,eflags,cs,eip和可选的err。我们需要手动保存ds和es,然后保存通用寄存器。这个和env_pop_tf的顺序刚好相反。然后调用trap函数进行处理。


Part B: Page Faults, Breakpoints Exceptions, and System Calls

Handling Page Faults

static void
trap_dispatch(struct Trapframe *tf)
{
	// Handle processor exceptions.
	// LAB 3: Your code here.
	switch(tf->tf_trapno) {
		case T_PGFLT:
			page_fault_handler(tf);
			return;
		default:
			break;
	}
	// Unexpected trap: The user process or the kernel has a bug.
	print_trapframe(tf);
	if (tf->tf_cs == GD_KT)
		panic("unhandled trap in kernel");
	else {
		env_destroy(curenv);
		return;
	}
}

根据tf->tf_trapno来选择对应的处理程序。

The Breakpoint Exception

		case T_BRKPT:
			monitor(tf);
			return;

在switch中添加一项即可。修改 SETGATE(idt[T_BRKPT], 0, GD_KT, BRKPT_HANDLER, 0);的最后一项参数为3。因为该函数的最后一项参数为描述符的特权等级,此处应该设置为用户所在的等级。

System calls

增加一项系统调用对应的处理程序。需要在trapentry.S中添加:

TRAPHANDLER_NOEC(SYSCALL_HANDLER, T_SYSCALL);

trap_init中添加:

	void SYSCALL_HANDLER();
	SETGATE(idt[T_SYSCALL], 0, GD_KT, SYSCALL_HANDLER, 3);

在switch中添加:

case T_SYSCALL:
			tf->tf_regs.reg_eax = syscall(tf->tf_regs.reg_eax, 
										tf->tf_regs.reg_edx, 
										tf->tf_regs.reg_ecx,
										tf->tf_regs.reg_ebx, 
										tf->tf_regs.reg_edi,
					 					tf->tf_regs.reg_esi);
			return;

要注意系统调用的返回值保存在eax寄存器中。

syscall:

// Dispatches to the correct kernel function, passing the arguments.
int32_t
syscall(uint32_t syscallno, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4, uint32_t a5)
{
	// Call the function corresponding to the 'syscallno' parameter.
	// Return any appropriate return value.
	// LAB 3: Your code here.

	// panic("syscall not implemented");
	int32_t result;
	switch (syscallno) {
		case SYS_cgetc :
			result = sys_cgetc();
			break;
		case SYS_cputs :
			sys_cputs((char *)a1, a2);
			result = 0;
			break;
		case SYS_env_destroy :
			result = sys_env_destroy((envid_t)a1);
			break;
		case SYS_getenvid :
			result = sys_getenvid();
			break;
		default:
			result = -E_INVAL;
	}
	return result;
}

User-mode startup

void
libmain(int argc, char **argv)
{
	// set thisenv to point at our Env structure in envs[].
	// LAB 3: Your code here.
	thisenv = &envs[ENVX(sys_getenvid())];
	// save the name of the program so that panic() can use it
	if (argc > 0)
		binaryname = argv[0];

	// call user main routine
	umain(argc, argv);

	// exit gracefully
	exit();
}

Page faults and memory protection

page_fault_handler

void
page_fault_handler(struct Trapframe *tf)
{
	uint32_t fault_va;

	// Read processor's CR2 register to find the faulting address
	fault_va = rcr2();

	// Handle kernel-mode page faults.
	if ((tf->tf_cs & 3) == 0) {
		panic("page-fault in kernel!\n");
	}
	// LAB 3: Your code here.

	// We've already handled kernel-mode exceptions, so if we get here,
	// the page fault happened in user mode.

	// Destroy the environment that caused the fault.
	cprintf("[%08x] user fault va %08x ip %08x\n",
		curenv->env_id, fault_va, tf->tf_eip);
	print_trapframe(tf);
	env_destroy(curenv);
}

根据cs的低2位来判断,00为内核模式。
user_mem_check

// Check that an environment is allowed to access the range of memory
// [va, va+len) with permissions 'perm | PTE_P'.
// Normally 'perm' will contain PTE_U at least, but this is not required.
// 'va' and 'len' need not be page-aligned; you must test every page that
// contains any of that range.  You will test either 'len/PGSIZE',
// 'len/PGSIZE + 1', or 'len/PGSIZE + 2' pages.
//
// A user program can access a virtual address if (1) the address is below
// ULIM, and (2) the page table gives it permission.  These are exactly
// the tests you should implement here.
//
// If there is an error, set the 'user_mem_check_addr' variable to the first
// erroneous virtual address.
//
// Returns 0 if the user program can access this range of addresses,
// and -E_FAULT otherwise.
//
int
user_mem_check(struct Env *env, const void *va, size_t len, int perm)
{
	// LAB 3: Your code here.
	int result = 0;
	uint32_t cva = (uint32_t)va;
	void *down = ROUNDDOWN((void *)cva, PGSIZE);
	void *up = ROUNDUP((void *)cva + len, PGSIZE);
	for (; down < up; down += PGSIZE) {
		if((uint32_t)down >= ULIM) {
			user_mem_check_addr = (uint32_t)down;
			if((uint32_t)down < cva) user_mem_check_addr = cva;
			result = -E_FAULT;
			break;
		}
		pte_t *pte = pgdir_walk(env->env_pgdir, down, 0);
		if(!pte || ((*pte) & (perm | PTE_P)) != (perm | PTE_P)){
			user_mem_check_addr = (uint32_t)down;
			if((uint32_t)down < cva) user_mem_check_addr = cva;
			result = -E_FAULT;
			break;
		}

	}
	return result;
}

根据注释要求写即可。

// Print a string to the system console.
// The string is exactly 'len' characters long.
// Destroys the environment on memory errors.
static void
sys_cputs(const char *s, size_t len)
{
	// Check that the user has permission to read memory [s, s+len).
	// Destroy the environment if not.

	// LAB 3: Your code here.
	user_mem_assert(curenv, s, len, PTE_U);
	// Print the string supplied by the user.
	cprintf("%.*s", len, s);
}

make grade

divzero: OK (1.3s) 
softint: OK (2.0s) 
badsegment: OK (1.0s) 
Part A score: 30/30

faultread: OK (2.0s) 
faultreadkernel: OK (1.1s) 
faultwrite: OK (1.9s) 
faultwritekernel: OK (1.7s) 
breakpoint: OK (1.3s) 
testbss: OK (2.0s) 
hello: OK (1.9s) 
buggyhello: OK (1.1s) 
buggyhello2: OK (1.0s) 
evilhello: OK (1.6s) 
Part B score: 50/50

Score: 80/80