MIT6.828学习笔记1

时间:2022-11-22 07:13:28

Part 1: PC Bootstrap

The PC's Physical Address Space

MIT6.828学习笔记1

早期的PC机基于Intel的8088处理器,能够寻址1MB的物理内存。从0x00000000到0x000FFFFF。低640KB的空间被标注为「Low Memory」。这是早期PC机可以使用的RAM。

寻址1MB物理内存需要20位的地址总线,因此8088的地址总线是20位。但是8088的CPU中的ALU宽度依然是16位的。即数据总线宽度为16位。为了解决这个问题,8088的CPU中设置了4个段寄存器:CS、DS、SS和ES,分别用于代码段、数据段、堆栈段和其他段。每个段寄存器都是16位的。每条指令的地址在送上地址总线之前,会将段寄存器中的值进行一定量的偏移,然后相加得到20位的地址。

从0x000C0000到0x000FFFFF的384KB由硬件保留用于特殊用途,如视频显示缓冲和非易失性存储器中的固件。BIOS占用从0x000F0000到0x000FFFFF的64KB区域,早期PC的BIOS存储在真正的ROM中。当前的PC将BIOS存储在可更新的闪存中。

BIOS负责执行基本的系统初始化,例如激活显卡和检查安装的内存量。执行此初始化后,BIOS 从某个适当的位置(如软盘、硬盘、CD-ROM 或网络)加载操作系统,并将计算机的控制权传递给操作系统。

在后来出现的处理器中,寻址空间已经远不止1MB。如80286可寻址空间为4MB,80386可寻址空间为4GB。在这些机器中,BIOS的位置发生了变化,但为了保持兼容性,从0x000A0000到0x000FFFFF的空间被保留了。

The ROM BIOS

打开两个终端,分别输入make qemu-nox-gdbmake gdb

despot@ubuntu:~/6.828/lab$ make gdb
gdb -n -x .gdbinit
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
+ target remote localhost:26000
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
warning: A handler for the OS ABI "GNU/Linux" is not built into this configuration
of GDB.  Attempting to continue with the default i8086 settings.

The target architecture is assumed to be i8086
[f000:fff0]    0xffff0:	ljmp   $0xf000,$0xe05b
0x0000fff0 in ?? ()
+ symbol-file obj/kern/kernel
(gdb) 

其中,较为重要的是这一行:

 [f000:fff0] 0xffff0:	ljmp   $0xf000,$0xe05b

这是第一条将要被执行的指令。
从这条指令我们可以看出:

  • The IBM PC starts executing at physical address 0x000ffff0, which is at the very top of the 64KB area reserved for the ROM BIOS.
  • The PC starts executing with CS = 0xf000 and IP = 0xfff0.
  • The first instruction to be executed is a jmp instruction, which jumps to the segmented address CS = 0xf000 and IP = 0xe05b.

QEMU这样做的原因是BIOS「hard-wired」物理内存0x00F0000到0x00FFFFF。这样可以确保BIOS在系统重启时首先获得控制权。

QEMU 仿真器附带自己的 BIOS,它将其放置在处理器模拟物理地址空间中的此位置。在处理器重置时,(模拟)处理器进入实模式,并将 CS 设置为 0xf000,将 IP 设置为 0xfff0,以便从该 (CS:IP) 段地址开始执行。

该指令的实际地址为CS向左偏移4位加上IP。即0xF0000(16 * 0xF000) + 0xFFF0 = 0xFFFF0。

BIOS主要的工作是初始化中断向量表、各种设备。在加载完PCI总线和一些重要设备后,它开始搜索可引导设备,如软盘、硬盘驱动器或者CD-ROM,从磁盘读取bootloader并将控制权转移给它。具体的指令含义可以参考这篇文章

Part 2: The Boot Loader

对于PC来说,软盘和硬盘都被划分为一个个512KB的区域,一个这样的区域称为扇区。扇区是磁盘操作的最小粒度,即读取或者写入都需要以扇区为单位。如果一个磁盘可以用来启动操作,那么这个磁盘的第一个扇区叫做启动扇区(boot sector)。boot loader的代码就存放在这个扇区。当BIOS找到这个扇区后,它会将这个扇区的内容转移到内存0x7c00~0x7dff的空间中。然后将控制权交给boot loader。

6.828的boot loader包含两个文件:boot/boot.Sboot/main.c

boot loader主要有两个功能:

  • 从实模式转换到32位保护模式,这样才能访问超过1MB的地址空间。
  • 通过x86的I/O指令,将内核从硬盘读取到内存中。

关于实模式和操作模式,可以阅读这篇文章或者PC Assembly Language的1.2.7和1.2.8节。

boot.S

  1 #include <inc/mmu.h>
  2 
  3 # Start the CPU: switch to 32-bit protected mode, jump into C.
  4 # The BIOS loads this code from the first sector of the hard disk into
  5 # memory at physical address 0x7c00 and starts executing in real mode
  6 # with %cs=0 %ip=7c00.
  7 
  8 .set PROT_MODE_CSEG, 0x8         # kernel code segment selector
  9 .set PROT_MODE_DSEG, 0x10        # kernel data segment selector
 10 .set CR0_PE_ON,      0x1         # protected mode enable flag

第1行是包含的头文件。第3~6行是功能说明,第8~10行设置了一些全局变量。

 11 
 12 .globl start
 13 start:
 14   .code16                     # Assemble for 16-bit mode
 15   cli                         # Disable interrupts
 16   cld                         # String operations increment
 17 

cli指令关闭中断,与之相对的是sti指令,开启中断。cld指令复位方向标志位DF(direction flag)。即使DF=0.与之相对的是std,其置位DF。DF决定了串操作指令的地址增长方向。

 18   # Set up the important data segment registers (DS, ES, SS).
 19   xorw    %ax,%ax             # Segment number zero
 20   movw    %ax,%ds             # -> Data Segment
 21   movw    %ax,%es             # -> Extra Segment
 22   movw    %ax,%ss             # -> Stack Segment
 23 

第19行将ax清零,然后分别设置几个段寄存器。

 24   # Enable A20:
 25   #   For backwards compatibility with the earliest PCs, physical 26   
 26   #   address line 20 is tied low, so that addresses higher than
 27   #   1MB wrap around to zero by default.  This code undoes this.
 28 seta20.1:
 29   inb     $0x64,%al               # Wait for not busy
 30   testb   $0x2,%al
 31   jnz     seta20.1
 32 
 33   movb    $0xd1,%al               # 0xd1 -> port 0x64
 34   outb    %al,$0x64
 35 

这段代码的作用是使能A20地址线。在实模式下,A20地址线被禁止,寻址空间被限制在1MB,在转向保护模式前,需要打开A20地址线。

第29行代码inb $0x64,%al从端口0x64读取一个字节的数据到寄存器al中,然后第30行代码testb $0x2,%al检查寄存器al中数据的第2位(从1算起),如果该位为1,则跳转到seta20.1,重复上述操作。否则将数据0xd1写入寄存器al,然后将数据输出到端口0x64。

根据这份文件提供的信息,我们可以知道,0x64端口是键盘控制器的IO端口。我们在此处只需要关心bit 1的状态,第30行代码检测的位置。当该位为1时,说明输入缓冲有数据未被控制器取走,CPU需要等待直到键盘控制器的输入缓冲区为空。

0064	r	KB controller read status (MCA)
		 bit 7 = 1 parity error on transmission from keyboard
		 bit 6 = 1 general timeout
		 bit 5 = 1 mouse output buffer full
		 bit 4 = 0 keyboard inhibit
		 bit 3 = 1 data in input register is command
			 0 data in input register is data
		 bit 2	 system flag status: 0=power up or reset  1=selftest OK
		 bit 1 = 1 input buffer full (input 60/64 has data for 804x)
		 bit 0 = 1 output buffer full (output 60 has data for system)

当键盘控制器取走数据之后,boot loader向端口0x64写入数据0xd1。数据D1可以看做是一条控制指令,该条指令表示下一个写入0x0060端口的数据将被键盘控制器写到它的输出端口。有些机器使用输出端口的bit1来控制A20线。

D1	dbl   write output port. next byte written  to 0060
			      will be written to the 804x output port; the
			      original IBM AT and many compatibles use bit 1 of
			      the output port to control the A20 gate.
		      Compaq  The system speed bits are not set by this command
			      use commands A1-A6 (!) for speed functions.
 36 seta20.2:
 37   inb     $0x64,%al               # Wait for not busy
 38   testb   $0x2,%al
 39   jnz     seta20.2
 40 
 41   movb    $0xdf,%al               # 0xdf -> port 0x60
 42   outb    %al,$0x60
 43 

第37行~第39行再次等待键盘控制器将上一条指令取走。第41~第42行代码将数据0xdf输出到0x60端口,这条数据会被键盘控制器写入它的输出端口,此时输出端口的bit1为1,A20线被使能。

 44   # Switch from real to protected mode, using a bootstrap GDT
 45   # and segment translation that makes virtual addresses 
 46   # identical to their physical addresses, so that the 
 47   # effective memory map does not change during the switch.
 48   lgdt    gdtdesc
 49   movl    %cr0, %eax
 50   orl     $CR0_PE_ON, %eax
 51   movl    %eax, %cr0
 52 
 ……
 75 # Bootstrap GDT
 76 .p2align 2                                # force 4 byte alignment
 77 gdt:
 78   SEG_NULL                              # null seg
 79   SEG(STA_X|STA_R, 0x0, 0xffffffff)     # code seg
 80   SEG(STA_W, 0x0, 0xffffffff)           # data seg
 81 
 82 gdtdesc:
 83   .word   0x17                            # sizeof(gdt) - 1
 84   .long   gdt                             # address gdt
 85 

第48行~第51行代码从实模式转向保护模式。

第48行代码加载全局描述符。关于该条指令可以参考这里这里这里。该行代码还访问了第75~第85行所定义的数据。

lgdt取6个字节的数据,将前两个字节装入gdtr寄存器的limit部分,另外4个字节装入gdtr寄存器的base部分。lgdt是间接寻址的,需要用装入的数据间接找到真正的GDT的线性地址。

第49行~第51行代码将cr0控制器的最低位置1,处理器运行于保护模式。

关于实模式保护模式除了这两个链接,还可以参考这里

 53   # Jump to next instruction, but in 32-bit code segment.
 54   # Switches processor into 32-bit mode.
 55   ljmp    $PROT_MODE_CSEG, $protcseg
 56 

执行一条跳转指令。但处理器工作于32位保护模式。

 57   .code32                     # Assemble for 32-bit mode
 58 protcseg:
 59   # Set up the protected-mode data segment registers
 60   movw    $PROT_MODE_DSEG, %ax    # Our data segment selector
 61   movw    %ax, %ds                # -> DS: Data Segment
 62   movw    %ax, %es                # -> ES: Extra Segment
 63   movw    %ax, %fs                # -> FS
 64   movw    %ax, %gs                # -> GS
 65   movw    %ax, %ss                # -> SS: Stack Segment
 66 

设置一下段寄存器。前面提到过,在实模式下,指令的实际地址由段寄存器和指令寄存器组合给出。段寄存器的值左移4位加上指令寄存器的值得到实际地址。在保护模式下,段寄存器是为了获取段描述符表的某个项目。根据这份链接指出,在对GDT进行操作后,我们需要将新的段选择器加载到段寄存器。

Whatever you do with the GDT has no effect on the CPU until you load new Segment Selectors into Segment Registers. For most of these registers, the process is as simple as using MOV instructions, but changing the CS register requires code resembling a jump or call to elsewhere, as this is the only way its value is meant to be changed.

 67   # Set up the stack pointer and call into C.
 68   movl    $start, %esp
 69   call bootmain
 70 

设置esp的值,调用bootmain函数。

 71   # If bootmain returns (it shouldn't), loop.
 72 spin:
 73   jmp spin
 74 

如果从bootmain返回,死循环。

main.c

main.c的主要作用是将内核从磁盘加载到内存,然后将控制权转移给内核。

  1 #include <inc/x86.h>
  2 #include <inc/elf.h>
  3 

前三行是包含的头文件。

  4 /**********************************************************************
  5  * This a dirt simple boot loader, whose sole job is to boot
  6  * an ELF kernel image from the first IDE hard disk.
  7  *
  8  * DISK LAYOUT
  9  *  * This program(boot.S and main.c) is the bootloader.  It should
 10  *    be stored in the first sector of the disk.
 11  *
 12  *  * The 2nd sector onward holds the kernel image.
 13  *
 14  *  * The kernel image must be in ELF format.
 15  *
 16  * BOOT UP STEPS
 17  *  * when the CPU boots it loads the BIOS into memory and executes it
 18  *
 19  *  * the BIOS intializes devices, sets of the interrupt routines, and
 20  *    reads the first sector of the boot device(e.g., hard-drive)
 21  *    into memory and jumps to it.
 22  *
 23  *  * Assuming this boot loader is stored in the first sector of the
 24  *    hard-drive, this code takes over...
 25  *
 26  *  * control starts in boot.S -- which sets up protected mode,
 27  *    and a stack so C code then run, then calls bootmain()
 28  *
 29  *  * bootmain() in this file takes over, reads in the kernel and jumps to i    t.
 30  **********************************************************************/
 31

第4行~第31行介绍了main.c的功能和启动步骤。

 32 #define SECTSIZE        512
 33 #define ELFHDR          ((struct Elf *) 0x10000) // scratch space
 34 

定义一些变量。SCTSIZE是扇区大小,512KB。ELFHDR为一个内存地址。

 35 void readsect(void*, uint32_t);
 36 void readseg(uint32_t, uint32_t, uint32_t);
 37

一些函数的声明。readsect读取一个扇区的数据。readseg调用readsect读取数据。

 98 void
 99 waitdisk(void)
100 {
101         // wait for disk reaady
102         while ((inb(0x1F7) & 0xC0) != 0x40)
103                 /* do nothing */;
104 }
105 
106 void
107 readsect(void *dst, uint32_t offset)
108 {
109         // wait for disk to be ready
110         waitdisk();
111 
112         outb(0x1F2, 1);         // count = 1
113         outb(0x1F3, offset);
114         outb(0x1F4, offset >> 8);
115         outb(0x1F5, offset >> 16);
116         outb(0x1F6, (offset >> 24) | 0xE0);
117         outb(0x1F7, 0x20);      // cmd 0x20 - read sectors
118 
119         // wait for disk to be ready
120         waitdisk();
121 
122         // read a sector
123         insl(0x1F0, dst, SECTSIZE/4);
124 }
125

先看readsect函数。该函数接收两个参数。void *dst为数据装载的起始地址,offset为当前所装载的扇区距离内核起始地址的偏移量,以扇区为单位,一次装载为1个扇区。

waitdisk函数等待磁盘准备好。(inb(0x1F7) & 0xC0) != 0x40表示从0x1F7端口读取一个数据并检测该数据的高两位,当最高位为0且次高位为1时循环结束。此时磁盘已经准备好。端口0x1F7此链接可以看到相关信息。

 01F7	r	status register
		 bit 7 = 1  controller is executing a command
		 bit 6 = 1  drive is ready
		 bit 5 = 1  write fault
		 bit 4 = 1  seek complete
		 bit 3 = 1  sector buffer requires servicing
		 bit 2 = 1  disk data read successfully corrected
		 bit 1 = 1  index - set to 1 each disk revolution
		 bit 0 = 1  previous command ended in an error

bit7为0且bit6为1时表示控制器没有在执行命令且磁盘已经准备好。

outb是一个内联函数。接收两个参数。一个是port,一个是data。

static inline void
outb(int port, uint8_t data)
{
        asm volatile("outb %0,%w1" : : "a" (data), "d" (port));
}
 01F2	r/w	sector count
 01F3	r/w	sector number
 01F4	r/w	cylinder low
 01F5	r/w	cylinder high
 01F6	r/w	drive/head
		 bit 7	 = 1
		 bit 6	 = 0
		 bit 5	 = 1
		 bit 4	 = 0  drive 0 select
			     = 1  drive 1 select
		 bit 3-0      head select bits
 01F7	w	command register
		commands:
        ……
		 20	 read sectors with retry
        ……

通过上表我们可以看到一系列调用outb的含义,首先向0xF2写入1,表示一次读取一个扇区;0x1F6的低4位、0x1F30x1F40x1F5存放的是起始扇区的信息。

其中0x1F30x1F40x1F5分别存储第0~7位、第8~15位和第16~23位。0x1F6的低四位存储第24~27位。

最后向0x1F7写入命令0x20读取扇区。等待控制器读取完这些命令后执行insl进行读取。

insl也是一个内联函数,在x86.h中可以找到它的定义:

static inline void
insl(int port, void *addr, int cnt)
{
        asm volatile("cld\n\trepne\n\tinsl"
                : "=D" (addr), "=c" (cnt)
                : "d" (port), "0" (addr), "1" (cnt)
                : "memory", "cc");
}

01F0 r/w data register

insl从端口port读取cnt个双字(4字节)存储到基址为addr的内存中。

接下来看一下readseg函数。

 69 // Read 'count' bytes at 'offset' from kernel into physical address 'pa'.
 70 // Might copy more than asked
 71 void
 72 readseg(uint32_t pa, uint32_t count, uint32_t offset)
 73 {
 74         uint32_t end_pa;
 75 
 76         end_pa = pa + count;
 77 
 78         // round down to sector boundary
 79         pa &= ~(SECTSIZE - 1);
 80 
 81         // translate from bytes to sectors, and kernel starts at sector 1
 82         offset = (offset / SECTSIZE) + 1;
 83 
 84         // If this is too slow, we could read lots of sectors at a time.
 85         // We'd write more to memory than asked, but it doesn't matter --
 86         // we load in increasing order.
 87         while (pa < end_pa) {
 88                 // Since we haven't enabled paging yet and we're using
 89                 // an identity segment mapping (see boot.S), we can
 90                 // use physical addresses directly.  This won't be the
 91                 // case once JOS enables the MMU.
 92                 readsect((uint8_t*) pa, offset);
 93                 pa += SECTSIZE;
 94                 offset++;
 95         }
 96 }
 97

readseg函数接受3个参数。pa表示所读取数据在内存中存放的首地址;count表示读取的字节数;offset表示读取的数据位于距离内核起始的偏移。

end_pa表示读取的数据存放的最高地址。pa &= ~(SECTSIZE - 1)把pa重新定向到offset存储单元所在的扇区的起始地址,等价的汇编指令为and $0xfffffe00, %ebx,舍弃了低8位。第82行代码将字节的偏移量转化为了扇区的偏移量,扇区0存放的是boot loader,内核从扇区1开始存放。

接下来判断读取是否完成,没有完成则调用readsect读取数据。因为一次读取一个扇区,因此总的读取字节数可能超过count

接下来我们回到主函数。

 38 void
 39 bootmain(void)
 40 {
 41         struct Proghdr *ph, *eph;
 42 
 43         // read 1st page off disk
 44         readseg((uint32_t) ELFHDR, SECTSIZE*8, 0);
 45 
 46         // is this a valid ELF?
 47         if (ELFHDR->e_magic != ELF_MAGIC)
 48                 goto bad;
 49 
 50         // load each program segment (ignores ph flags)
 51         ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
 52         eph = ph + ELFHDR->e_phnum;
 53         for (; ph < eph; ph++)
 54                 // p_pa is the load address of this segment (as well
 55                 // as the physical address)
 56                 readseg(ph->p_pa, ph->p_memsz, ph->p_offset);
 57 
 58         // call the entry point from the ELF header
 59         // note: does not return!
 60         ((void (*)(void)) (ELFHDR->e_entry))();
 61 
 62 bad:
 63         outw(0x8A00, 0x8A00);
 64         outw(0x8A00, 0x8E00);
 65         while (1)
 66                 /* do nothing */;
 67 }
 68 

第41行代码定义了两个指向struct Proghdr的指针。这个结构体的定义在inc/elf.h,我们可以打开看一下。

struct Proghdr {
	uint32_t p_type;
	uint32_t p_offset;	//本段在文件内的偏移
	uint32_t p_va;
	uint32_t p_pa;		//段在物理内存的起始地址
	uint32_t p_filesz;
	uint32_t p_memsz;	//内存大小
	uint32_t p_flags;
	uint32_t p_align;
};

第44行代码从扇区1开始读取4KB数据到以EDFHDR(0x10000)为起始地址的内存中。这些数据其实是操作系统映像文件的elf头部。关于ELF文件,可以参考这个链接或者这里。我们使用的内核被编译为ELF格式的可执行文件。主要有ELF文件头、程序头表和相应的段组成。

ELF is a format for storing programs or fragments of programs on disk, created as a result of compiling and linking. An ELF file is divided into sections. For an executable program, these are the text section for the code, the data section for global variables and the rodata section that usually contains constant strings. The ELF file contains headers that describe how these sections should be stored in memory.

这个头部文件的结构定义也在inc/elf.h

#define ELF_MAGIC 0x464C457FU	/* "\x7FELF" in little endian */

struct Elf {
	uint32_t e_magic;		// must equal ELF_MAGIC
	uint8_t e_elf[12];
	uint16_t e_type;
	uint16_t e_machine;
	uint32_t e_version;
	uint32_t e_entry;
	uint32_t e_phoff;		//程序头表在文件内的偏移量
	uint32_t e_shoff;
	uint32_t e_flags;
	uint16_t e_ehsize;
	uint16_t e_phentsize;
	uint16_t e_phnum;		//程序头表条目数目,即段的数目
	uint16_t e_shentsize;
	uint16_t e_shnum;
	uint16_t e_shstrndx;
};

第47行代码检验这个文件是否有效。

如果无效则执行两条outw指令后进入一个死循环。outw是一个内联函数,定义在x86.h中。

static inline void
outw(int port, uint16_t data)
{
	asm volatile("outw %0,%w1" : : "a" (data), "d" (port));
}

检验完成后,通过ph指向程序头表,eph是程序头表最后一个段的地址。通过一个while循环,将各个段加载到内存中。

然后通过这条指令((void (*)(void)) (ELFHDR->e_entry))()将控制权转移给内核。

ELF header

An ELF binary starts with a fixed-length ELF header, followed by a variable-length program header listing each of the program sections to be loaded. The C definitions for these ELF headers are in inc/elf.h. The program sections we're interested in are:

  • .text: The program's executable instructions.
  • .rodata: Read-only data, such as ASCII string constants produced by the C compiler. (We will not bother setting up the hardware to prohibit writing, however.)
  • .data: The data section holds the program's initialized data, such as global variables declared with initializers like int x = 5

通过输入objdump -h obj/boot/boot.out我们可以看到,一些块的链接地址和加载地址是相同的:

obj/boot/boot.out:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000186  00007c00  00007c00  00000074  2**2
                  CONTENTS, ALLOC, LOAD, CODE
  1 .eh_frame     000000a8  00007d88  00007d88  000001fc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab         0000087c  00000000  00000000  000002a4  2**2
                  CONTENTS, READONLY, DEBUGGING
  3 .stabstr      00000925  00000000  00000000  00000b20  2**0
                  CONTENTS, READONLY, DEBUGGING
  4 .comment      00000029  00000000  00000000  00001445  2**0
                  CONTENTS, READONLY

通过输入objdump -x obj/kern/kernel我们可以看到:

Program Header:
    LOAD off    0x00001000 vaddr 0xf0100000 paddr 0x00100000 align 2**12
         filesz 0x0000759d memsz 0x0000759d flags r-x
    LOAD off    0x00009000 vaddr 0xf0108000 paddr 0x00108000 align 2**12
         filesz 0x0000b6a8 memsz 0x0000b6a8 flags rw-
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
         filesz 0x00000000 memsz 0x00000000 flags rwx

标注为LOAD的会被读取到内存。

Link and Load address

链接地址可以理解为通过编译器链接器处理形成的可执行程序中指令的地址,即逻辑地址。加载地址则是可执行文件真正被装入内存后运行的地址,即物理地址。

BIOS默认将boot loader的加载地址设为0x7c00,而它的链接地址在boot/Makefrag中给出:

……
$(OBJDIR)/boot/boot: $(BOOT_OBJS)
	@echo + ld boot/boot
	$(V)$(LD) $(LDFLAGS) -N -e start -Ttext 0x7C00 -o $@.out $^
	$(V)$(OBJDUMP) -S $@.out >$@.asm
	$(V)$(OBJCOPY) -S -O binary -j .text $@.out $@
	$(V)perl boot/sign.pl $(OBJDIR)/boot/boot
……

其中start -Ttext 0x7C00说明了它的链接地址。

我们再打开obj/boot/boot.asm看一下:

……
.globl start
start:
  .code16                     # Assemble for 16-bit mode
  cli                         # Disable interrupts
    7c00:	fa                   	cli    
  cld                         # String operations increment
    7c01:	fc                   	cld    
……

可见,boot loader的链接地址为0x7c00

现在,我们改变一下Makefrag中的参数,将0x7c00改为其他值,如0x6c00。在lab目录下输入make clean然后make,此时再来打开boot.asm

……
.globl start
start:
  .code16                     # Assemble for 16-bit mode
  cli                         # Disable interrupts
    6c00:	fa                   	cli    
  cld                         # String operations increment
    6c01:	fc                   	cld
……

可以看到,boot loader的链接地址已经发生了变化。此时我们重新运行一下BIOS,看看会发生什么。打开两个终端,分别输入make qemu-nox-gdbmake gdb

因为BIOS的加载地址是在0x7c00,我们断点还是打在这里。

(gdb) b *0x7c00
Breakpoint 1 at 0x7c00
(gdb) c
Continuing.
[   0:7c00] => 0x7c00:	cli    

Breakpoint 1, 0x00007c00 in ?? ()

第一条指令是正确的。

[   0:7c1e] => 0x7c1e:	lgdtw  0x6c64

(gdb) x/6xb 0x6c64
0x6c64:	0x00	0x00	0x00	0x00	0x00	0x00
(gdb) x/6xb 0x7c64
0x7c64:	0x17	0x00	0x4c	0x6c	0x00	0x00

运行到这一条指令时我们会发现,加载到GDT的值是位于0x6c64处的6个字节,而这六个字节的数据全部是0.在boot.asm中我们可以看到:

00006c64 <gdtdesc>:
    6c64:	17                   	pop    %ss
    6c65:	00 4c 6c 00          	add    %cl,0x0(%esp,%ebp,2)

gdtdesc的链接地址是0x6c64,但是它被加载到了0x7c64,这样导致了GDT的设置错误。然后我们继续运行:

(gdb) si
[   0:7c23] => 0x7c23:	mov    %cr0,%eax
0x00007c23 in ?? ()
(gdb) si
[   0:7c26] => 0x7c26:	or     $0x1,%eax
0x00007c26 in ?? ()
(gdb) si
[   0:7c2a] => 0x7c2a:	mov    %eax,%cr0
0x00007c2a in ?? ()
(gdb) si
[   0:7c2d] => 0x7c2d:	ljmp   $0x8,$0x6c32
0x00007c2d in ?? ()
(gdb) si
[   0:7c2d] => 0x7c2d:	ljmp   $0x8,$0x6c32
0x00007c2d in ?? ()

我们可以发现,将保护模式打开后执行的跳转指令发生了错误。此时处理器工作在保护模式,GDT的基址部分为0,而长度值也被设置为0。因此,处理器寻址不到目标处的指令,因此出现了死循环。

Part 3: The Kernel

Using virtual memory to work around position dependence

在进入内核之后,在执行mov %eax,%cr0指令之前,我们可以看到,在地址0x00100000的地方的数据为0x02,在地址0xf01000000的地方的数据为0x00。说明此时地址映射还没有完成。当执行完mov %eax,%cr0指令后,两个地址都映射到实际物理地址0x00100000的地方,此时,两个地址的数据为0x02。

(gdb) b *0x100025
Breakpoint 1 at 0x100025
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0x100025:	mov    %eax,%cr0

Breakpoint 1, 0x00100025 in ?? ()
(gdb) x/1b 0x00100000
0x100000:	0x02
(gdb) x/1b 0xf0100000
0xf0100000 <_start+4026531828>:	0x00
(gdb) stepi
=> 0x100028:	mov    $0xf010002f,%eax
0x00100028 in ?? ()
(gdb) x/1b 0x00100000
0x100000:	0x02
(gdb) x/1b 0xf0100000
0xf0100000 <_start+4026531828>:	0x02

kern/kernel.S中注释掉movl %eax, %cr0后我们会发现,在执行add %al,(%eax)指令时发生了错误,原因是Trying to execute code outside RAM or ROM at 0xf010002c,我们要寻址的地方超出了物理内存。

+ as kern/entry.S
+ ld obj/kern/kernel
ld: warning: section `.bss' type changed to PROGBITS
+ mk obj/kern/kernel.img
***
*** Now run 'make gdb'.
***
qemu-system-i386 -nographic -drive file=obj/kern/kernel.img,index=0,media=disk,format=raw -serial mon:stdio -gdb tcp::26000 -D qemu.log  -S
qemu: fatal: Trying to execute code outside RAM or ROM at 0xf010002c

EAX=f010002c EBX=00010094 ECX=00000000 EDX=000000a4
ESI=00010094 EDI=00000000 EBP=00007bf8 ESP=00007bec
EIP=f010002c EFL=00000086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     00007c4c 00000017
IDT=     00000000 000003ff
CR0=00000011 CR2=00000000 CR3=00112000 CR4=00000000
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
DR6=ffff0ff0 DR7=00000400
CCS=00000084 CCD=80010011 CCO=EFLAGS  
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
GNUmakefile:171: recipe for target 'qemu-nox-gdb' failed
make: *** [qemu-nox-gdb] Aborted (core dumped)
=> 0xf010002c <relocated>:	add    %al,(%eax)
relocated () at kern/entry.S:74
74		movl	$0x0,%ebp			# nuke frame pointer
(gdb) 
Remote connection closed

Formatted Printing to the Console

  1. Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?

cprintf(printf.c)调用了vcprintf(printf.c)vcprintf会调用vprintfmt(printfmt.c)函数,vprintfmt会调用putch(printf.c)函数,putch会调用cputchar(console.c)函数

  1. Explain the following from console.c:
1      if (crt_pos >= CRT_SIZE) {
2              int i;
3              memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
4              for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
5                      crt_buf[i] = 0x0700 | ' ';
6              crt_pos -= CRT_COLS;
7      }

console.h中我们可以看到CRT_SIZE定义为CRT_ROWS * CRT_COLSCRT_ROWS CRT_COLS的值分别为25和80。CRT(cathode ray tube)是阴极射线显示器。该显示器有80列,25行,每个字可容纳两个字节。当crt_pos大于或等于CRT_SIZE时说明显示器已经写满。

关于计算机显示的知识可以参考这里

memmove的定义在lib/string.c里:

void *
memmove(void *dst, const void *src, size_t n)
{
	const char *s;
	char *d;

	s = src;
	d = dst;
	if (s < d && s + n > d) {
		s += n;
		d += n;
		if ((int)s%4 == 0 && (int)d%4 == 0 && n%4 == 0)
			asm volatile("std; rep movsl\n"
				:: "D" (d-4), "S" (s-4), "c" (n/4) : "cc", "memory");
		else
			asm volatile("std; rep movsb\n"
				:: "D" (d-1), "S" (s-1), "c" (n) : "cc", "memory");
		// Some versions of GCC rely on DF being clear
		asm volatile("cld" ::: "cc");
	} else {
		if ((int)s%4 == 0 && (int)d%4 == 0 && n%4 == 0)
			asm volatile("cld; rep movsl\n"
				:: "D" (d), "S" (s), "c" (n/4) : "cc", "memory");
		else
			asm volatile("cld; rep movsb\n"
				:: "D" (d), "S" (s), "c" (n) : "cc", "memory");
	}
	return dst;
}

该函数接收3个参数。

  • dst 指向用于存储复制内容的目标数组,类型强制转换为 void* 指针。
  • src 指向要复制的数据源,类型强制转换为 void* 指针。
  • n 要被复制的字节数。

在上面进行的代码调用用显示器缓冲区的后24行数据覆盖前24行的数据,再将最后一行的数据填充为0x0700 | ' '。空格字符、0x0700进行或操作的目的是让空格的颜色为黑色。最后将当前位置移到最后一行的起始位置。

  1. Trace the execution of the following code step-by-step:
	int x = 1, y = 3, z = 4;
	cprintf("x %d, y %x, z %d\n", x, y, z);
  • In the call to cprintf(), to what does fmt point? To what does ap point?
  • List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.

先看一下cprintf的代码:

int
cprintf(const char *fmt, ...)
{
	va_list ap;
	int cnt;

	va_start(ap, fmt);
	cnt = vcprintf(fmt, ap);
	va_end(ap);

	return cnt;
}

函数首先声明了一个变量ap,它是va_list类型的。关于这种类型,可以参考这篇文章。在inc/stdarg.h中也可以看到一些关于它们的信息。ap是一个字符型的指针,指向可变参数的字符串,在题目中,cprintf的参数除了一个字符串,还有x, y, z

typedef __builtin_va_list va_list;

#define va_start(ap, last) __builtin_va_start(ap, last)

#define va_arg(ap, type) __builtin_va_arg(ap, type)

#define va_end(ap) __builtin_va_end(ap)

我们继续看题目所给代码的执行顺序。定义了一个变量cnt。然后调用了va_start(ap, fmt)va_startap真正指向可变参数列表。

  • va_list用于声明一个变量,我们知道函数的可变参数列表其实就是一个字符串,所以va_list才被声明为字符型指针,这个类型用于声明一个指向参数列表的字符型指针变量。
  • va_start(ap,v) 它的第一个参数是指向可变参数字符串的变量,第二个参数是可变参数函数的第一个参数,通常用于指定可变参数列表中参数的个数。
  • va_arg(ap,t) 它的第一个参数指向可变参数字符串的变量,第二个参数是可变参数的类型。
  • va_end(ap) 用于将存放可变参数字符串的变量清空(赋值为NULL)。

之后cprintf调用了vcprintf函数,并将返回值赋给了cnt

int
vcprintf(const char *fmt, va_list ap)
{
	int cnt = 0;

	vprintfmt((void*)putch, &cnt, fmt, ap);
	return cnt;
}

vcprintf调用了vprintfmtvprintf的定义太长就不在此展示,该函数位于lib/printfmt.c

static void
putch(int ch, int *cnt)
{
	cputchar(ch);
	*cnt++;
}
……
void
vprintfmt(void (*putch)(int, void*), void *putdat, const char *fmt, va_list ap)

vprintfmt函数中,首先遍历fmt所指向的字符串,通过调用传递的函数指针调用putch函数,putch函数随后调用cputchar函数并增加cnt的值,cputchar函数调用cons_putc函数输出字符。遍历fmt的操作通过while循环进行直到/0或者%

// output a character to the console
static void
cons_putc(int c)
{
	serial_putc(c);
	lpt_putc(c);
	cga_putc(c);
}

当遇到/0时,vprintfmt函数返回。当遇到%时,通过switch操作将输出根据要求进行格式化。在vprintfmt调用结束后,vcprintf返回输出的字节数,然后cprintf执行va_end将存放可变参数字符串的变量清空,然后返回cnt

  1. Run the following code.
	unsigned int i = 0x00646c72;
	cprintf("H%x Wo%s", 57616, &i);
  • What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise.
  • The output depends on that fact that the x86 is little-endian. If the x86 were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?

输出:He110 World

cprintfmt函数在找到一个%后退出while循环遍历,进行switch操作。而x对应的case如下:

……
// (unsigned) hexadecimal
case 'x':
	num = getuint(&ap, lflag);
	base = 16;
number:
	printnum(putch, putdat, num, base, width, padc);
	break;
……

首先从可变参数列表里获取到我们的参数57616,该参数的类型由lflag决定。在此例中,%后直接跟着xlflag的值为0,表示取的是一个无符号int型整数。

// Get an unsigned int of various possible sizes from a varargs list,
// depending on the lflag parameter.
static unsigned long long
getuint(va_list *ap, int lflag)
{
	if (lflag >= 2)
		return va_arg(*ap, unsigned long long);
	else if (lflag)
		return va_arg(*ap, unsigned long);
	else
		return va_arg(*ap, unsigned int);
}

将该参数取回后,根据给定的要求进行格式转换并输出。57616转化为16进制为e110

在进行下一次调用switch语句时,%后跟着s,代表输出的是一个字符串。

// string
case 's':
	if ((p = va_arg(ap, char *)) == NULL)
		p = "(null)";
	if (width > 0 && padc != '-')
		for (width -= strnlen(p, precision); width > 0; width--)
			putch(padc, putdat);
	for (; (ch = *p++) != '\0' && (precision < 0 || --precision >= 0); width--)
		if (altflag && (ch < ' ' || ch > '~'))
			putch('?', putdat);
		else
			putch(ch, putdat);
	for (; width > 0; width--)
		putch(' ', putdat);
	break;

0x00646c72按字节进行字符转换并输出。x86是小端模式,存储的数据从低地址开始应该是:72 6c 64 00。根据ASCII提供的信息,我们可以查到72 6c 64 00对应的字符是r l d \0。如果是大端序的机器。那么定义的变量应该是unsigned int i = 0x726c6400

  1. In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?
	cprintf("x=%d y=%d", 3);

va_arg在取完一个参数后,会将ap的值改变,使它指向下一个参数。如果可变参数列表的参数不够,则va_arg指向的地方的数据未被定义。具体信息可以参考这里

  1. Let's say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?

可以改变va_argva_start的宏实现,使它们地址的增长方向相反。

The Stack

x86的栈是向下生长的。stack pointer(esp)指向当前正在使用的栈的最低地址。向栈顶添加一个数据先减小esp的值再向当前指向的地址写入数据。从栈顶弹出一个数据先将数据读出来再增加esp的值。

Determine where the kernel initializes its stack, and exactly where in memory its stack is located. How does the kernel reserve space for its stack? And at which "end" of this reserved area is the stack pointer initialized to point to?

boot loader最后通过一个调用来将控制权交给kernel,在此之前的代码我们已经分析过了,并没有初始化栈。因此我们直接从这里开始调试,看看后面执行的指令。

	((void (*)(void)) (ELFHDR->e_entry))();
    7d6b:	ff 15 18 00 01 00    	call   *0x10018

在指令执行过程中,我们可以看到有这两条指令:

(gdb) 
=> 0xf010002f <relocated>:	mov    $0x0,%ebp
relocated () at kern/entry.S:74
74		movl	$0x0,%ebp			# nuke frame pointer
(gdb) 
=> 0xf0100034 <relocated+5>:	mov    $0xf0110000,%esp
relocated () at kern/entry.S:77
77		movl	$(bootstacktop),%esp

这两条指令在entry.S中:

	# Clear the frame pointer register (EBP)
	# so that once we get into debugging C code,
	# stack backtraces will be terminated properly.
	movl	$0x0,%ebp			# nuke frame pointer

	# Set the stack pointer
	movl	$(bootstacktop),%esp

可见,正是这两条指令初始化了栈,并且将栈的初始地址设为了0xf0110000,映射到实际物理地址是0x00110000

0xf0110000显然超出了我们实际具有的物理内存,而且我们现在还没有虚拟内存的机制,entry.S中通过这样一段代码来将0xf0000000~0xf04000000x00000000~0x00400000的地址都映射到实际物理地址0x00000000~0x00400000上。

	# Load the physical address of entry_pgdir into cr3.  entry_pgdir
	# is defined in entrypgdir.c.
	movl	$(RELOC(entry_pgdir)), %eax
	movl	%eax, %cr3
	# Turn on paging.
	movl	%cr0, %eax
	orl	$(CR0_PE|CR0_PG|CR0_WP), %eax
	movl	%eax, %cr0

inc/memlayout.h中我们可以找到这样一段定义:

// Kernel stack.
#define KSTACKTOP	KERNBASE
#define KSTKSIZE	(8*PGSIZE)   		// size of a kernel stack
#define KSTKGAP		(8*PGSIZE)   		// size of a kernel stack guard

代码定义了栈的大小为8页,一页为4KB,总的为32KB,因此栈的地址为从0xf0108000~0xf0110000的地址空间,实际地址为0x00108000~0x00110000

The ebp (base pointer) register, in contrast, is associated with the stack primarily by software convention. On entry to a C function, the function's prologue code normally saves the previous function's base pointer by pushing it onto the stack, and then copies the current esp value into ebp for the duration of the function. If all the functions in a program obey this convention, then at any given point during the program's execution, it is possible to trace back through the stack by following the chain of saved ebp pointers and determining exactly what nested sequence of function calls caused this particular point in the program to be reached. This capability can be particularly useful, for example, when a particular function causes an assert failure or panic because bad arguments were passed to it, but you aren't sure who passed the bad arguments. A stack backtrace lets you find the offending function.

ebp寄存器保存了当前函数的栈帧信息。并且在当前函数执行函数调用时将数据保存在栈上,并更新为新的函数的栈帧信息。

To become familiar with the C calling conventions on the x86, find the address of the test_backtrace function in obj/kern/kernel.asm, set a breakpoint there, and examine what happens each time it gets called after the kernel starts. How many 32-bit words does each recursive nesting level of test_backtrace push on the stack, and what are those words?

obj/kern/kernel.asm中,我们可以看到以下信息:

……
// Test the stack backtrace function (lab 1 only)
void
test_backtrace(int x)
{
f0100040:	55                   	push   %ebp
……

kern/init.c中我们可以找到这个函数的定义:

// Test the stack backtrace function (lab 1 only)
void
test_backtrace(int x)
{
	cprintf("entering test_backtrace %d\n", x);
	if (x > 0)
		test_backtrace(x-1);
	else
		mon_backtrace(0, 0, 0);
	cprintf("leaving test_backtrace %d\n", x);
}

mon_backtrace目前没有做任何事情:

int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
	// Your code here.
	return 0;
}
	// Test the stack backtrace function (lab 1 only)
	test_backtrace(5);
f01000e8:	c7 04 24 05 00 00 00 	movl   $0x5,(%esp)
f01000ef:	e8 4c ff ff ff       	call   f0100040 <test_backtrace>
f01000f4:	83 c4 10             	add    $0x10,%esp

kernel.asm的代码我们可以看到,test_backtrace第一次被调用是在地址0xf01000e8,传入的参数是5,我们在这里设置断点,追踪栈的信息。

当运行完call指令后,我们看一下esp寄存器的值,看看当前栈指针指向的位置:

(gdb) print $esp
$1 = (void *) 0xf010ffdc

我们再看看这个地址和前一个地址(栈向下生长,前一个地址数值更大)存储的数据:

(gdb) print/x *0xf010ffdc@2
$2 = {0xf01000f4, 0x5}

可以看到,我们传入的参数5被压入栈中,其次,还有一个地址0xf01000f4也在栈中,这个地址是test_backtrace返回后要执行的指令的首地址。

随后进入test_backtrace函数。首先执行以下指令,将调用者的栈帧信息保存在栈上,并将自己的栈帧信息存储在ebp中,保存调用者的esiebx数据。

f0100040:	55                   	push   %ebp
f0100041:	89 e5                	mov    %esp,%ebp
f0100043:	56                   	push   %esi
f0100044:	53                   	push   %ebx

查看一下栈里的信息:

(gdb) print/x *0xf010ffd0@5
$3 = {0xf0111308, 0x10094, 0xf010fff8, 0xf01000f4, 0x5}

从高地址开始依次是:传入的数据0x5test_backtrace返回后执行的指令的地址,i386_init在调用次函数时ebpesiebx的值。此时ebp保存的值是指向第三条数据的地址。

然后执行这三条指令:

f0100045:	e8 72 01 00 00       	call   f01001bc <__x86.get_pc_thunk.bx>
f010004a:	81 c3 be 12 01 00    	add    $0x112be,%ebx
f0100050:	8b 75 08             	mov    0x8(%ebp),%esi

首先是一个跳转指令,跳转到这个子程序:

f01001bc <__x86.get_pc_thunk.bx>:
f01001bc:	8b 1c 24             	mov    (%esp),%ebx
f01001bf:	c3                   	ret    

在执行call指令时,会将call返回后下一条指令的地址压入栈中,然后再跳转到给定位置。执行ret时,会将执行call时保存在栈中的地址取出,赋给eip。此时我们的栈中又多了一条数据:0xf010004a

之后执行了一个mov指令,将这条新的数据传递给了ebx,返回后又执行addmov指令。最后传递了一个数据给esi。我们看一下此时esi内的数据:

(gdb) print $esi
$4 = 5

因为全局变量相对于代码来说有固定的偏移量,因此我们可以通过这种方法来访问数据(要传入cprintf的字符串)。

esi的数据通过前面保存的ebp来完成。
当前的栈的信息为:

0xf010ffe0:	0x00000005	//传入的参数
0xf010ffdc:	0xf01000f4	//函数返回后执行的下一条指令的地址
0xf010ffd8:	0xf010fff8	//运行init.c时的ebp的数据
0xf010ffd4:	0x00010094	//运行init.c时的esi的数据
0xf010ffd0:	0xf0111308	//运行init.c时的ebx的数据
(gdb) print/x $ebp
$5 = 0xf010ffd8

此时ebp保存的是指向第三条数据的指针。因此0xf0100050处的指令访问的数据是0x5

接下来调用cprintf

	cprintf("entering test_backtrace %d\n", x);
f0100053:	83 ec 08             	sub    $0x8,%esp
f0100056:	56                   	push   %esi
f0100057:	8d 83 18 07 ff ff    	lea    -0xf8e8(%ebx),%eax
f010005d:	50                   	push   %eax
f010005e:	e8 e6 09 00 00       	call   f0100a49 <cprintf>

sub $0x8,%esp在栈中开辟一些空间,用于存放临时变量。然后将参数0x5压入栈中,在通过代码和全局变量之间的偏移量访问字符串,并将数据指针压如栈中,最后调用cprintf
此时栈内新增了5条数据。从上至下依次为两个空白区域参数5字符串指针cprintf返回后下一条指令的地址0xf0100063

调用返回后执行以下指令:

	if (x > 0)
f0100063:	83 c4 10             	add    $0x10,%esp
f0100066:	85 f6                	test   %esi,%esi
f0100068:	7f 2b                	jg     f0100095 <test_backtrace+0x55>

cprintf调用返回后esp的值为:0xf010ffc0。此时通过add操作删除了4个为调用cprintf作准备的元素,ret也会删除一个。然后判断变量x的值,如果大于0,则进行递归调用,如果小于0,则执行mon_backtrace

		mon_backtrace(0, 0, 0);
f010006a:	83 ec 04             	sub    $0x4,%esp
f010006d:	6a 00                	push   $0x0
f010006f:	6a 00                	push   $0x0
f0100071:	6a 00                	push   $0x0
f0100073:	e8 0b 08 00 00       	call   f0100883 <mon_backtrace>
f0100078:	83 c4 10             	add    $0x10,%esp
	cprintf("leaving test_backtrace %d\n", x);
f010007b:	83 ec 08             	sub    $0x8,%esp
f010007e:	56                   	push   %esi
f010007f:	8d 83 34 07 ff ff    	lea    -0xf8cc(%ebx),%eax
f0100085:	50                   	push   %eax
f0100086:	e8 be 09 00 00       	call   f0100a49 <cprintf>
}
f010008b:	83 c4 10             	add    $0x10,%esp
f010008e:	8d 65 f8             	lea    -0x8(%ebp),%esp
f0100091:	5b                   	pop    %ebx
f0100092:	5e                   	pop    %esi
f0100093:	5d                   	pop    %ebp
f0100094:	c3                   	ret
		test_backtrace(x-1);
f0100095:	83 ec 0c             	sub    $0xc,%esp
f0100098:	8d 46 ff             	lea    -0x1(%esi),%eax
f010009b:	50                   	push   %eax
f010009c:	e8 9f ff ff ff       	call   f0100040 <test_backtrace>
f01000a1:	83 c4 10             	add    $0x10,%esp
f01000a4:	eb d5                	jmp    f010007b <test_backtrace+0x3b>

每次进行递归调用,上面的操作都要重新走一遍。此时栈里一共有8个元素(上面提到的最初的5个加上三个空白区域)。每一次调用都会增加8个,除了最后一次。当程序进行到x = 0,并且运行到0xf0100068处的判断条件时,栈里一共有45个元素,他们的性质跟最初5+3个元素是重复的,不过具体的值不同。此时esp的值为:0xf010ff30

我们直接来看当x = 0时的情况。此时程序调用mon_backtrace函数。先在栈内开辟了一块区域,然后又传入三个参数,接着调用函数,返回后又删去了三个参数和开辟的区域。然后再次调用cprintf函数。

先开辟两个存放数据的区域,然后传入参数,接着调用,最后又删去了这些区域。

接着通过f010008e: 8d 65 f8 lea -0x8(%ebp),%esp这条指令来设置esp指向当前调用保存在栈中的ebx的值,然后恢复寄存器的值,此时esp指向的是调用者在调用返回后要执行的指令的地址,retesp的值加载到程序计数器里,然后弹出该元素。当x = 0调用返回时,它的返回地址是0xf01000a1,此时弹出4个空白区域,然后跳转到x = 1时,第二次调用vprintf的语句0xf010007b。一直返回到x = 5时,此时的返回地址是0xf01000f4,由init.c调用call保存在栈中的数据,此时栈中只剩下数据5了(我们关心的)。

Implement the backtrace function as specified above. Use the same format as in the example, since otherwise the grading script will be confused. When you think you have it working right, run make grade to see if its output conforms to what our grading script expects, and fix it if it doesn't. After you have handed in your Lab 1 code, you are welcome to change the output format of the backtrace function any way you like.

代码如下:

int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
	// Your code here.
	int *ebp = (int *)read_ebp();
	cprintf("Stack backtrace:\r\n");
	while(ebp != 0) {
		cprintf("  ebp %08x  eip %08x  args %08x %08x %08x %08x %08x\r\n", ebp, ebp[1], ebp[2]
			, ebp[3], ebp[4], ebp[5], ebp[6]);
		ebp = (int *)ebp[0];
	}
	
	return 0;
}

输出结果:

entering test_backtrace 5
entering test_backtrace 4
entering test_backtrace 3
entering test_backtrace 2
entering test_backtrace 1
entering test_backtrace 0
Stack backtrace:
  ebp f010ff18  eip f0100078  args 00000000 00000000 00000000 f010004a f0111308
  ebp f010ff38  eip f01000a1  args 00000000 00000001 f010ff78 f010004a f0111308
  ebp f010ff58  eip f01000a1  args 00000001 00000002 f010ff98 f010004a f0111308
  ebp f010ff78  eip f01000a1  args 00000002 00000003 f010ffb8 f010004a f0111308
  ebp f010ff98  eip f01000a1  args 00000003 00000004 00000000 f010004a f0111308
  ebp f010ffb8  eip f01000a1  args 00000004 00000005 00000000 f010004a f0111308
  ebp f010ffd8  eip f01000f4  args 00000005 00001aac 00000640 00000000 00000000
  ebp f010fff8  eip f010003e  args 00000003 00001003 00002003 00003003 00004003
leaving test_backtrace 0
leaving test_backtrace 1
leaving test_backtrace 2
leaving test_backtrace 3
leaving test_backtrace 4
leaving test_backtrace 5

Modify your stack backtrace function to display, for each eip, the function name, source file name, and line number corresponding to that eip.

关于Stabs我们可以查看这个链接的内容。在inc/stab.h中有结构体Stab的定义。

// Entries in the STABS table are formatted as follows.
struct Stab {
	uint32_t n_strx;	// index into string table of name
	uint8_t n_type;         // type of symbol
	uint8_t n_other;        // misc info (usually empty)
	uint16_t n_desc;        // description field
	uintptr_t n_value;	// value of symbol
};

我们先打开kern/kernel.ld看一下相关的信息:

/* Include debugging information in kernel memory */
	.stab : {
		PROVIDE(__STAB_BEGIN__ = .);
		*(.stab);
		PROVIDE(__STAB_END__ = .);
		BYTE(0)		/* Force the linker to allocate space
				   for this section */
	}

	.stabstr : {
		PROVIDE(__STABSTR_BEGIN__ = .);
		*(.stabstr);
		PROVIDE(__STABSTR_END__ = .);
		BYTE(0)		/* Force the linker to allocate space
				   for this section */
	}

__STAB_BEGIN____STAB_END____STABSTR_BEGIN____STABSTR_END__分别表示.stab.stabstr这两个段的起始和结束地址。

.代表当前地址。

输入:

objdump -h obj/kern/kernel

我们现在关注的是这两条信息:

Idx Name          Size      VMA       LMA       File off  Algn
	……
  2 .stab         00003c61  f010218c  0010218c  0000318c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .stabstr      0000195b  f0105ded  00105ded  00006ded  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
	……

这两条信息说明了这两个段的存放地址和大小,我们可以籍此计算出他们的结束地址。

输入:

objdump -G obj/kern/kernel

我们可以查看.stab段内的数据。

obj/kern/kernel:     file format elf32-i386

Contents of .stab section:

Symnum n_type n_othr n_desc n_value  n_strx String

-1     HdrSym 0      1287   0000195a 1     
0      SO     0      0      f0100000 1      {standard input}
1      SOL    0      0      f010000c 18     kern/entry.S
2      SLINE  0      44     f010000c 0      
……
474    FUN    0      0      f0100883 4237   mon_backtrace:F(0,1)
475    PSYM   0      0      00000008 4129   argc:p(0,1)

根据上面的链接,我们主要要知道以下几点:

  • n_type N_UNDF
  • n_othr Unused field, always zero. This may eventually be used to hold overflows from the count in the n_desc field.
  • n_desc Count of upcoming symbols, i.e., the number of remaining stabs for this source file.
  • n_value Size of the string table fragment associated with this source file, in bytes.
  • n_strx Relative to the start of the .stabstr section.

Symnum可以看做是标号,n_type是类型。FUN指的是函数,对应的String为函数名加上一些信息。因此,我们想要在mon_backtrace中找到函数名需要找到这条信息。

通过kern/kdebug.c的信息我们可以了解stab_binsearch函数的功能:

//	Given an instruction address, this function finds the single stab
//	entry of type 'type' that contains that address.

输入:

gcc -pipe -nostdinc -O2 -fno-builtin -I. -MD -Wall -Wno-format -DJOS_KERNEL -gstabs -c -S kern/init.c

我们可以查看init.S来获取更多信息。

为了查看符号表是否被加载进内存,我们可以直接用gdb调试查看该段起始地址的数据:

(gdb) x/5s 0xf0105ded
0xf0105ded:	""
0xf0105dee:	"{standard input}"
0xf0105dff:	"kern/entry.S"
0xf0105e0c:	"kern/entrypgdir.c"
0xf0105e1e:	"gcc2_compiled."

这与init.S的信息相同。说明符号表被加载进入内存了。不过这个地址需要在进入内核完成地址映射才能看到,否则需要查看的地址可以为0x00105ded

Complete the implementation of debuginfo_eip by inserting the call to stab_binsearch to find the line number for an address.

现在我们需要去debuginfo_eip函数中补充一些代码来完成找到行号的功能。而这个功能需要用到stab_binsearch

这个函数的代码和样例在kern/kdebug.c中均有说明。

补充代码如下:

	stab_binsearch(stabs, &lline, &rline, N_SLINE, addr);
	if (lline <= rline) {
		info->eip_line = stabs[lline].n_desc;
	} else return -1;

代码注释说info->eip_line应该设置为right line number,但我设置为lline才输出正确。

更改后的mon_backtrace如下:

int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
	// Your code here.
	int *ebp = (int *)read_ebp();
	struct Eipdebuginfo info;
	cprintf("Stack backtrace:\r\n");
	while(ebp != 0) {
		cprintf("  ebp %08x  eip %08x  args %08x %08x %08x %08x %08x\r\n", ebp, ebp[1], ebp[2], ebp[3], ebp[4], ebp[5], ebp[6]);
		memset(&info, 0, sizeof(struct Eipdebuginfo));
		if (debuginfo_eip(ebp[1], &info)) {
			cprintf("failed to get debuginfo for eip %x.\r\n", ebp[1]);
		}
		else
        {
            cprintf("\t%s:%d: %.*s+%u\r\n", info.eip_file, info.eip_line, info.eip_fn_namelen, info.eip_fn_name, ebp[1] - info.eip_fn_addr);
        }
		ebp = (int *)ebp[0];
	}
	
	return 0;
}

命令增加如下:

static struct Command commands[] = {
	{ "help", "Display this list of commands", mon_help },
	{ "kerninfo", "Display information about the kernel", mon_kerninfo },
	{ "mon_backtrace", "Display information about Stack trace", mon_backtrace },
};

最后make grade:

……
running JOS: (1.0s) 
  printf: OK 
  backtrace count: OK 
  backtrace arguments: OK 
  backtrace symbols: OK 
  backtrace lines: OK 
Score: 50/50

关于Stabs我还弄得不是很明白,有机会再补充。