[Completed] MIT 6.828 Lab 1: C, Assembly, Tools and Bootstrapping
Overview
It took me 30+ hours, but it's finally done orz
Part 1: PC Bootstrap
The PC's Physical Address Space
The 8086/8088 era
+------------------+  <- 0x00100000 (1MB)
|     BIOS ROM     |
+------------------+  <- 0x000F0000 (960KB)
|  16-bit devices, |
|  expansion ROMs  |
+------------------+  <- 0x000C0000 (768KB)
|   VGA Display    |
+------------------+  <- 0x000A0000 (640KB)
|                  |
|    Low Memory    |
|                  |
+------------------+  <- 0x00000000
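The 8086/8088 has only 20 address lines, so its physical address space is 2^20 bytes = 1MB, running from 0x00000 to 0xFFFFF. The first 640KB, starting at 0x00000, is called "low memory" and is the only RAM an early PC could actually use. The 384KB from 0xA0000 to 0xFFFFF is reserved by the hardware for things like video display buffers and the BIOS, which is held in non-volatile memory.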
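The 80286/80386 era and beyond
To stay backward compatible, the layout of the first 1MB is unchanged. The address space therefore appears to have a "hole" (to me it looks more like two "holes"... shouldn't only unused space be called a "hole"?). This "hole", 0xA0000 to 0xFFFFF, splits the RAM the PC can use into two parts: the 640KB from 0x00000000 to 0x0009FFFF, and everything from 0x00100000 up to 0xFFFFFFFF.
Modern processors can address more than 4GB of physical memory. To stay backward compatible with 32-bit devices, the BIOS leaves a second "hole" at the top of the 32-bit address range.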
The ROM BIOS
Booting under QEMU, the first instruction executed on entering the BIOS is
[f000:fff0] 0xffff0: ljmp $0xf000,$0xe05b
This shows that the physical address of the first instruction the PC executes is 0xffff0.
Stepping with the si command then gives the next few instructions:
[f000:e05b] 0xfe05b: cmpl $0x0,%cs:0x6ac8
[f000:e062] 0xfe062: jne 0xfd2e1
[f000:e066] 0xfe066: xor %dx,%dx
[f000:e068] 0xfe068: mov %dx,%ss
[f000:e06a] 0xfe06a: mov $0x7000,%esp
[f000:e070] 0xfe070: mov $0xf34c2,%edx
[f000:e076] 0xfe076: jmp 0xfd15c
[f000:d15c] 0xfd15c: mov %eax,%ecx
...
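If this only half makes sense, don't panic: you don't need to understand exactly what the BIOS is doing here. It is worth reviewing x86 assembly first, though; see General Registers (AX, BX, CX, and DX) and the Intel 80386 Reference Programmer's Manual Table of Contents. I also strongly recommend skimming the gdb_examining data tutorial, especially the sections on inspecting memory and registers, which helps a lot in figuring out what the BIOS is up to. (x [memory] shows the contents at an address, x/i [memory] prints the instruction at that address in human-readable form, and p/x $[register] prints a register's value in hex.)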
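So what does the BIOS roughly do? It sets up an interrupt descriptor table (which is how x86 implements the interrupt vector table), initializes some hardware devices, and then looks for a "bootable" device. If it finds one, the BIOS loads that device's boot loader into memory and hands control to it.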
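A few definitions first. A boot loader is a program that runs before the OS is loaded. It usually lives in the first sector of the disk, which is therefore called the boot sector. The more familiar master boot record (MBR) is just a special kind of boot sector used on partitioned media.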
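By the way, whether a device is "bootable" is decided by the two boot signature bytes 0x55 and 0xAA. Specifically, if the last two bytes of sector 0 (offsets 510 and 511) are 0x55 and 0xAA, the device is considered bootable. See boot sequence.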
Part 2: The Boot Loader
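After it finishes initializing, the BIOS loads the boot loader into memory at addresses 0x7c00 through 0x7dff (512 bytes, exactly one sector).
Where does the magic number 0x7c00 come from? It doesn't really matter, but if you are curious see Why BIOS loads MBR into 0x7C00 in x86 ? It is enough to know that this number comes from IBM's BIOS development team rather than from x86 itself.
The boot loader consists of one assembly file, boot/boot.S, and one C file, boot/main.c.
Let's first look at what boot/boot.S does.
Before that, though, it is worth reviewing real mode and protected mode.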
real mode / protected mode
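* [Real_mode](https://en.wikipedia.org/wiki/Real_mode) The address space is limited to 2^20 bytes (the address bus is 20 bits wide). There is no notion of virtual memory; all memory is real physical memory. In real mode, segments sit at fixed locations in physical memory.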
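* 16-bit Protected Mode debuted with the Intel 80286 and introduced virtual memory for the first time. Relying on the principle of locality, only the parts of a program needed to run are kept in memory, and the parts not currently needed are stored on disk. When a segment comes back from disk to memory, it may land somewhere different from before. Since segment locations are no longer fixed, the [Global Descriptor Table,GDT](https://en.wikipedia.org/wiki/Global_Descriptor_Table) was introduced to describe each segment: whether it is in memory, where it is if so, and its access permissions. Because registers are still 16 bits wide, a segment is still limited to 64KB. See also [OSTEP](http://pages.cs.wisc.edu/~remzi/OSTEP/)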
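* 32-bit Protected Mode debuted with the Intel 80386. Compared with the 80286, its registers are 32 bits wide, so the maximum segment size grows to 4GB (2^32). With segments no longer as small as 64KB, the old all-or-nothing policy (an entire segment is either in memory or on disk) stops making sense. [paging](https://en.wikipedia.org/wiki/Paging) was therefore introduced to split a segment into smaller pages, so that only part of a segment needs to be in memory. For paging, see chapter 18 of [OSTEP](http://pages.cs.wisc.edu/~remzi/OSTEP/).
It is worth noting that a CPU that supports protected mode still starts up in real mode, for backward compatibility, and switches to protected mode afterwards.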
_When a processor that supports x86 protected mode is powered on, it begins executing instructions in [real mode](https://en.wikipedia.org/wiki/Real_mode), in order to maintain [backward compatibility](https://en.wikipedia.org/wiki/Backward_compatibility) with earlier x86 processors.[[4]](https://en.wikipedia.org/wiki/Protected_mode#cite_note-Real_mode_on_powered_on-4) Protected mode may only be entered after the system software sets up one descriptor table and enables the Protection Enable (PE) [bit](https://en.wikipedia.org/wiki/Bit) in the [control register](https://en.wikipedia.org/wiki/Control_register) 0 (CR0)_
What boot/boot.S does
1
2 #include <inc/mmu.h>
3
4 # Start the CPU: switch to 32-bit protected mode, jump into C.
5 # The BIOS loads this code from the first sector of the hard disk into
6 # memory at physical address 0x7c00 and starts executing in real mode
7 # with %cs=0 %ip=7c00.
8
9 .set PROT_MODE_CSEG, 0x8 # kernel code segment selector
10 .set PROT_MODE_DSEG, 0x10 # kernel data segment selector
11 .set CR0_PE_ON, 0x1 # protected mode enable flag
12
13 .globl start
14 start:
15 .code16 # Assemble for 16-bit mode
16 cli # Disable interrupts
17 cld # String operations increment
18
19 # Set up the important data segment registers (DS, ES, SS).
20 xorw %ax,%ax # Segment number zero
21 movw %ax,%ds # - Data Segment
22 movw %ax,%es # - Extra Segment
23 movw %ax,%ss # - Stack Segment
24
25 # Enable A20:
26 # For backwards compatibility with the earliest PCs, physical
27 # address line 20 is tied low, so that addresses higher than
28 # 1MB wrap around to zero by default. This code undoes this.
29 seta20.1:
30 inb $0x64,%al # Wait for not busy
31 testb $0x2,%al
32 jnz seta20.1
33
34 movb $0xd1,%al # 0xd1 - port 0x64
35 outb %al,$0x64
36
37 seta20.2:
38 inb $0x64,%al # Wait for not busy
39 testb $0x2,%al
40 jnz seta20.2
41
42 movb $0xdf,%al # 0xdf - port 0x60
43 outb %al,$0x60
44
45 # Switch from real to protected mode, using a bootstrap GDT
46 # and segment translation that makes virtual addresses
47 # identical to their physical addresses, so that the
48 # effective memory map does not change during the switch.
49 lgdt gdtdesc # lgdt means load global descriptor table
50 movl %cr0, %eax
51 orl $CR0_PE_ON, %eax # cr0 = cr0 | 1
52 movl %eax, %cr0
53
54 # Jump to next instruction, but in 32-bit code segment.
55 # Switches processor into 32-bit mode.
56 ljmp $PROT_MODE_CSEG, $protcseg
57
58 .code32 # Assemble for 32-bit mode
59 protcseg:
60 # Set up the protected-mode data segment registers
61 movw $PROT_MODE_DSEG, %ax # Our data segment selector
62 movw %ax, %ds # - DS: Data Segment
63 movw %ax, %es # - ES: Extra Segment
64 movw %ax, %fs # - FS
65 movw %ax, %gs # - GS
66 movw %ax, %ss # - SS: Stack Segment
67
68 # Set up the stack pointer and call into C.
69 movl $start, %esp
70 call bootmain
71
72 # If bootmain returns (it shouldn't), loop.
73 spin:
74 jmp spin
75
76 # Bootstrap GDT
77 .p2align 2 # force 4 byte alignment
78 gdt:
79 SEG_NULL # null seg
80 SEG(STA_X|STA_R, 0x0, 0xffffffff) # code seg
81 SEG(STA_W, 0x0, 0xffffffff) # data seg
82
83 gdtdesc:
84 .word 0x17 # sizeof(gdt) - 1
85 .long gdt # address gdt
86
87
The first time I read this code, the Enable A20 part was rather baffling.
See A20 - a pain from the past. The key point is:
One sets the output port of the keyboard controller by first writing 0xd1 to port 0x64, and then the desired value of the output port to port 0x60. One usually sees the values 0xdd and 0xdf used to disable/enable A20.
The next potentially confusing part is the "bootstrap GDT". See the cs421 x86 Assembly Guide, in particular:
1 .data
2 var:
3 .byte 64 /* Declare a byte, referred to as location var, containing the value 64. */
4 .byte 10 /* Declare a byte with no label, containing the value 10. Its location is var + 1. */
5 x:
6 .short 42 /* Declare a 2-byte value initialized to 42, referred to as location x. */
7 y:
8 .long 30000 /* Declare a 4-byte value, referred to as location y, initialized to 30000. */
9
10
11 s:
12 .long 1, 2, 3 /* Declare three 4-byte values, initialized to 1, 2, and 3.
13 The value at location s + 8 will be 3. */
14 barr:
15 .zero 10 /* Declare 10 bytes starting at location barr, initialized to 0. */
16 str:
17 .string "hello" /* Declare 6 bytes starting at the address str initialized to
18 the ASCII character values for hello followed by a nul (0) byte. */
19
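From this we can see what the gdtdesc part does: at the location gdtdesc it defines a word (2 bytes) with value 0x17 which, per the comment, is sizeof(gdt) - 1, i.e. the 24 bytes of the three 8-byte entries under gdt, minus one. Then at gdtdesc+2 it defines a long (4 bytes) holding the address of gdt.
Here gdt and gdtdesc are both "labels". A label simply marks a memory address so that it is easy to refer to.
More precisely, the value of a "label" is the memory address of whatever immediately follows it, whether that is an instruction or data.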
Now for the gdt part. SEG looks like a macro; the relevant part of inc/mmu.h makes everything clear.
1 #ifdef __ASSEMBLER__
2
3 /*
4 * Macros to build GDT entries in assembly.
5 */
6 #define SEG_NULL \
7 .word 0, 0; \
8 .byte 0, 0, 0, 0
9 #define SEG(type,base,lim) \
10 .word (((lim) >> 12) & 0xffff), ((base) & 0xffff); \
11 .byte (((base) >> 16) & 0xff), (0x90 | (type)), \
12 (0xC0 | (((lim) >> 28) & 0xf)), (((base) >> 24) & 0xff)
13
14 #else // not __ASSEMBLER__
15
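The next unclear spot is probably the cr0 part.
At the top of the code there is CR0_PE_ON with value 0x1. The code then computes cr0 = cr0 | 0x1, which according to the comment turns protected mode on. Understanding that much is enough, but let me add a few words. Control registers are registers that control the CPU's behavior, and cr0 is one of the control registers in the x86 architecture. cr0 is a 32-bit register in which certain bits have names and fixed functions. Bit 0, for example, is named "Protected Mode Enable" (PE); when it is 1, protected mode is on.
One last detail is ".globl start". What does ".globl" mean, and why make the label start global? See What is global _start in assembly language? In plain words, a label declared with .globl is exported in the generated .o file; otherwise the linker cannot find the symbol. Since start is the entry point of boot.S, the linker needs to see it.
Finally, what does boot.S do as a whole? The previous section already said it: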
_When a processor that supports x86 protected mode is powered on, it begins executing instructions in [real mode](https://en.wikipedia.org/wiki/Real_mode), in order to maintain [backward compatibility](https://en.wikipedia.org/wiki/Backward_compatibility) with earlier x86 processors.[[4]](https://en.wikipedia.org/wiki/Protected_mode#cite_note-Real_mode_on_powered_on-4) Protected mode may only be entered after the system software sets up one descriptor table and enables the Protection Enable (PE) [bit](https://en.wikipedia.org/wiki/Bit) in the [control register](https://en.wikipedia.org/wiki/Control_register) 0 (CR0)_
What boot/main.c does
1
2 #include <inc/x86.h>
3 #include <inc/elf.h>
4
5 /**********************************************************************
6 * This a dirt simple boot loader, whose sole job is to boot
7 * an ELF kernel image from the first IDE hard disk.
8 *
9 * DISK LAYOUT
10 * * This program(boot.S and main.c) is the bootloader. It should
11 * be stored in the first sector of the disk.
12 *
13 * * The 2nd sector onward holds the kernel image.
14 *
15 * * The kernel image must be in ELF format.
16 *
17 * BOOT UP STEPS
18 * * when the CPU boots it loads the BIOS into memory and executes it
19 *
20 * * the BIOS intializes devices, sets of the interrupt routines, and
21 * reads the first sector of the boot device(e.g., hard-drive)
22 * into memory and jumps to it.
23 *
24 * * Assuming this boot loader is stored in the first sector of the
25 * hard-drive, this code takes over...
26 *
27 * * control starts in boot.S -- which sets up protected mode,
28 * and a stack so C code then run, then calls bootmain()
29 *
30 * * bootmain() in this file takes over, reads in the kernel and jumps to it.
31 **********************************************************************/
32
33 #define SECTSIZE 512
34 #define ELFHDR ((struct Elf *) 0x10000) // scratch space
35
36 void readsect(void*, uint32_t);
37 void readseg(uint32_t, uint32_t, uint32_t);
38
39 void
40 bootmain(void)
41 {
42 struct Proghdr *ph, *eph;
43
44 // read 1st page off disk
45 readseg((uint32_t) ELFHDR, SECTSIZE*8, 0);
46
47 // is this a valid ELF?
48 if (ELFHDR->e_magic != ELF_MAGIC)
49 goto bad;
50
51 // load each program segment (ignores ph flags)
52 ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
53 eph = ph + ELFHDR->e_phnum;
54 for (; ph < eph; ph++)
55 // p_pa is the load address of this segment (as well
56 // as the physical address)
57 readseg(ph->p_pa, ph->p_memsz, ph->p_offset);
58
59 // call the entry point from the ELF header
60 // note: does not return!
61 ((void (*)(void)) (ELFHDR->e_entry))();
62
63 bad:
64 outw(0x8A00, 0x8A00);
65 outw(0x8A00, 0x8E00);
66 while (1)
67 /* do nothing */;
68 }
69
70 // Read 'count' bytes at 'offset' from kernel into physical address 'pa'.
71 // Might copy more than asked
72 void
73 readseg(uint32_t pa, uint32_t count, uint32_t offset)
74 {
75 uint32_t end_pa;
76
77 end_pa = pa + count;
78
79 // round down to sector boundary
80 pa &= ~(SECTSIZE - 1);
81
82 // translate from bytes to sectors, and kernel starts at sector 1
83 offset = (offset / SECTSIZE) + 1;
84
85 // If this is too slow, we could read lots of sectors at a time.
86 // We'd write more to memory than asked, but it doesn't matter --
87 // we load in increasing order.
88 while (pa < end_pa) {
89 // Since we haven't enabled paging yet and we're using
90 // an identity segment mapping (see boot.S), we can
91 // use physical addresses directly. This won't be the
92 // case once JOS enables the MMU.
93 readsect((uint8_t*) pa, offset);
94 pa += SECTSIZE;
95 offset++;
96 }
97 }
98
99 void
100 waitdisk(void)
101 {
102 // wait for disk reaady
103 while ((inb(0x1F7) & 0xC0) != 0x40)
104 /* do nothing */;
105 }
106
107 void
108 readsect(void *dst, uint32_t offset)
109 {
110 // wait for disk to be ready
111 waitdisk();
112
113 outb(0x1F2, 1); // count = 1
114 outb(0x1F3, offset);
115 outb(0x1F4, offset >> 8);
116 outb(0x1F5, offset >> 16);
117 outb(0x1F6, (offset >> 24) | 0xE0);
118 outb(0x1F7, 0x20); // cmd 0x20 - read sectors
119
120 // wait for disk to be ready
121 waitdisk();
122
123 // read a sector
124 insl(0x1F0, dst, SECTSIZE/4);
125 }
126
The first thing to notice is a few things that look like assembly instructions, such as outb. Their definitions are in inc/x86.h.
1 static inline void
2 outb(int port, uint8_t data)
3 {
4 asm volatile("outb %0,%w1" : : "a" (data), "d" (port));
5 }
6
7
8 static inline void
9 insl(int port, void *addr, int cnt)
10 {
11 asm volatile("cld\n\trepne\n\tinsl"
12 : "=D" (addr), "=c" (cnt)
13 : "d" (port), "0" (addr), "1" (cnt)
14 : "memory", "cc");
15 }
16 static inline uint8_t
17 inb(int port)
18 {
19 uint8_t data;
20 asm volatile("inb %w1,%0" : "=a" (data) : "d" (port));
21 return data;
22 }
23
They turn out to be assembly wrapped in a thin layer of C. This is called "inline assembly"; see Brennan's Guide to Inline Assembly. The volatile keyword tells gcc not to optimize this statement away or move it.
If your assembly statement _must_ execute where you put it, (i.e. must not be moved out of a loop as an optimization), put the keyword **volatile** after **asm** and before the ()'s. To be ultra-careful, use asm volatile (...whatever...);
However, I would like to point out that if your assembly's only purpose is to calculate the output registers, with no other side effects, you should leave off the volatile keyword so your statement will be processed into GCC's common subexpression elimination optimization.
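The comment says the goal is to "boot an ELF kernel image from the first IDE hard disk", so we first need to know what ELF is. ELF is a file format whose full name is "Executable and Linkable Format"; see Executable_and_Linkable_Format#File_layout. I suggest reading that section in full: it is short and will be very useful later.
With inc/elf.h and the comments in main.c, the overall job of this code becomes clear: read the ELF-format kernel image from disk into memory, and hand control to the kernel image.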
1 #ifndef JOS_INC_ELF_H
2 #define JOS_INC_ELF_H
3
4 #define ELF_MAGIC 0x464C457FU /* "\x7FELF" in little endian */
5
6 struct Elf {
7 uint32_t e_magic; // must equal ELF_MAGIC
8 uint8_t e_elf[12];
9 /* e_elf[0] 1 for 32-bit, 2 for 64-bit
10 [1] 1 for little endianness ,2 for big endianness
11 [2] version type
12 [3] target OS
13 [4] ABI version
14 [5..11] unused
15 */
16 uint16_t e_type; // object file type
17 uint16_t e_machine; // instruction set arch , x86/MIPS/IA-64 and etc.
18 uint32_t e_version;
19 uint32_t e_entry; // the memory address of the entry point where process start executing.
20 uint32_t e_phoff; // points to the start of the program header table.
21 uint32_t e_shoff; // Points to the start of the section header table.
22 uint32_t e_flags;
23 uint16_t e_ehsize; // size of this header. 64byte for 64-bit,52bytes for 32-bit
24 uint16_t e_phentsize; // the size of a program header table entry.
25 uint16_t e_phnum; // the number of entries in the program header table.
26 uint16_t e_shentsize; // the size of a section header table entry.
27 uint16_t e_shnum; // the number of entries in the section header table.
28 uint16_t e_shstrndx;
29 };
30
31 struct Proghdr {
32 uint32_t p_type; // type of the segment
33 uint32_t p_offset; // offset of the segment in the file image
34 uint32_t p_va; // virtual address of the segment in memory
35 uint32_t p_pa; // physical address for segment(?)
36 uint32_t p_filesz; // Size in bytes of the segment in the file image. May be 0.
37 uint32_t p_memsz; // Size in bytes of the segment in memory. May be 0.
38 uint32_t p_flags;
39 uint32_t p_align; // 0 and 1 specify no alignment. Otherwise should be a positive, integral power of 2
40 };
41
42 struct Secthdr {
43 uint32_t sh_name; // An offset to a string in the .shstrtab section that represents the name of this section
44 uint32_t sh_type; // the type of this header
45 uint32_t sh_flags; // the attributes of the section
46 uint32_t sh_addr; // Virtual address of the section in memory
47 uint32_t sh_offset; // Offset of the section in the file image
48 uint32_t sh_size; // Size in bytes of the section in the file image. May be 0.
49 uint32_t sh_link; //
50 uint32_t sh_info;
51 uint32_t sh_addralign;
52 uint32_t sh_entsize;
53 /*
54 Contains the size, in bytes, of each entry, for sections that contain fixed-size entries.
55 Otherwise, this field contains zero.
56 */
57 };
58
59 // Values for Proghdr::p_type
60 #define ELF_PROG_LOAD 1
61
62 // Flag bits for Proghdr::p_flags
63 #define ELF_PROG_FLAG_EXEC 1
64 #define ELF_PROG_FLAG_WRITE 2
65 #define ELF_PROG_FLAG_READ 4
66
67 // Values for Secthdr::sh_type
68 #define ELF_SHT_NULL 0
69 #define ELF_SHT_PROGBITS 1
70 #define ELF_SHT_SYMTAB 2
71 #define ELF_SHT_STRTAB 3
72
73 // Values for Secthdr::sh_name
74 #define ELF_SHN_UNDEF 0
75
76 #endif /* !JOS_INC_ELF_H */
Now a few details. We know readsect reads one sector, but how do we know a sector is read this way? See the x86 Directions part of ATA_PIO_Mode.
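The second detail is "((void (*)(void)) (ELFHDR->e_entry))()". It looks intimidating at first, but it is just a function pointer: e_entry is the address of the entry function, and calling through it hands control to the ELF-format kernel image.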
Next, let's look at the disassembly produced by compiling boot.S and main.c.
1
2 obj/boot/boot.out: file format elf32-i386
3
4
5 Disassembly of section .text:
6
7 00007c00 <start>:
8 .set CR0_PE_ON, 0x1 # protected mode enable flag
9
10 .globl start
11 start:
12 .code16 # Assemble for 16-bit mode
13 cli # Disable interrupts
14 7c00: fa cli
15 cld # String operations increment
16 7c01: fc cld
17
18 # Set up the important data segment registers (DS, ES, SS).
19 xorw %ax,%ax # Segment number zero
20 7c02: 31 c0 xor x,x
21 movw %ax,%ds # - Data Segment
22 7c04: 8e d8 mov x,%ds
23 movw %ax,%es # - Extra Segment
24 7c06: 8e c0 mov x,%es
25 movw %ax,%ss # - Stack Segment
26 7c08: 8e d0 mov x,%ss
27
28 00007c0a <seta20.1>:
29 # Enable A20:
30 # For backwards compatibility with the earliest PCs, physical
31 # address line 20 is tied low, so that addresses higher than
32 # 1MB wrap around to zero by default. This code undoes this.
33 seta20.1:
34 inb $0x64,%al # Wait for not busy
35 7c0a: e4 64 in $0x64,%al
36 testb $0x2,%al
37 7c0c: a8 02 test $0x2,%al
38 jnz seta20.1
39 7c0e: 75 fa jne 7c0a <seta20.1>
40
41 movb $0xd1,%al # 0xd1 - port 0x64
42 7c10: b0 d1 mov $0xd1,%al
43 outb %al,$0x64
44 7c12: e6 64 out %al,$0x64
45
46 00007c14 <seta20.2>:
47
48 seta20.2:
49 inb $0x64,%al # Wait for not busy
50 7c14: e4 64 in $0x64,%al
51 testb $0x2,%al
52 7c16: a8 02 test $0x2,%al
53 jnz seta20.2
54 7c18: 75 fa jne 7c14 <seta20.2>
55
56 movb $0xdf,%al # 0xdf - port 0x60
57 7c1a: b0 df mov $0xdf,%al
58 outb %al,$0x60
59 7c1c: e6 60 out %al,$0x60
60
61 # Switch from real to protected mode, using a bootstrap GDT
62 # and segment translation that makes virtual addresses
63 # identical to their physical addresses, so that the
64 # effective memory map does not change during the switch.
65 lgdt gdtdesc # lgdt means load global descriptor table
66 7c1e: 0f 01 16 lgdtl (%esi)
67 7c21: 64 7c 0f fs jl 7c33 <protcseg+0x1>
68 movl %cr0, x
69 7c24: 20 c0 and %al,%al
70 orl $CR0_PE_ON, x # crx = crx | 1
71 7c26: 66 83 c8 01 or $0x1,%ax
72 movl x, %cr0
73 7c2a: 0f 22 c0 mov x,%cr0
74
75 # Jump to next instruction, but in 32-bit code segment.
76 # Switches processor into 32-bit mode.
77 ljmp $PROT_MODE_CSEG, $protcseg
78 7c2d: ea .byte 0xea
79 7c2e: 32 7c 08 00 xor 0x0(x,x,1),%bh
80
81 00007c32 <protcseg>:
82
83 .code32 # Assemble for 32-bit mode
84 protcseg:
85 # Set up the protected-mode data segment registers
86 movw $PROT_MODE_DSEG, %ax # Our data segment selector
87 7c32: 66 b8 10 00 mov $0x10,%ax
88 movw %ax, %ds # - DS: Data Segment
89 7c36: 8e d8 mov x,%ds
90 movw %ax, %es # - ES: Extra Segment
91 7c38: 8e c0 mov x,%es
92 movw %ax, %fs # - FS
93 7c3a: 8e e0 mov x,%fs
94 movw %ax, %gs # - GS
95 7c3c: 8e e8 mov x,%gs
96 movw %ax, %ss # - SS: Stack Segment
97 7c3e: 8e d0 mov x,%ss
98
99 # Set up the stack pointer and call into C.
100 movl $start, %esp
101 7c40: bc 00 7c 00 00 mov $0x7c00,%esp
102 call bootmain
103 7c45: e8 c0 00 00 00 call 7d0a <bootmain>
104
105 00007c4a <spin>:
106
107 # If bootmain returns (it shouldn't), loop.
108 spin:
109 jmp spin
110 7c4a: eb fe jmp 7c4a <spin>
111
112 00007c4c <gdt>:
113 ...
114 7c54: ff (bad)
115 7c55: ff 00 incl (x)
116 7c57: 00 00 add %al,(x)
117 7c59: 9a cf 00 ff ff 00 00 lcall $0x0,$0xffff00cf
118 7c60: 00 .byte 0x0
119 7c61: 92 xchg x,x
120 7c62: cf iret
121 ...
122
123 00007c64 <gdtdesc>:
124 7c64: 17 pop %ss
125 7c65: 00 4c 7c 00 add %cl,0x0(%esp,i,2)
126 ...
127
128 00007c6a <waitdisk>:
129 }
130 }
131
132 void
133 waitdisk(void)
134 {
135 7c6a: 55 push p
136
137 static inline uint8_t
138 inb(int port)
139 {
140 uint8_t data;
141 asm volatile("inb %w1,%0" : "=a" (data) : "d" (port));
142 7c6b: ba f7 01 00 00 mov $0x1f7,x
143 7c70: 89 e5 mov %esp,p
144 7c72: ec in (%dx),%al
145 // wait for disk reaady
146 while ((inb(0x1F7) & 0xC0) != 0x40)
147 7c73: 83 e0 c0 and $0xffffffc0,x
148 7c76: 3c 40 cmp $0x40,%al
149 7c78: 75 f8 jne 7c72 <waitdisk+0x8>
150 /* do nothing */;
151 }
152 7c7a: 5d pop p
153 7c7b: c3 ret
154
155 00007c7c <readsect>:
156
157 void
158 readsect(void *dst, uint32_t offset)
159 {
160 7c7c: 55 push p
161 7c7d: 89 e5 mov %esp,p
162 7c7f: 57 push i
163 7c80: 53 push x
164 7c81: 8b 5d 0c mov 0xc(p),x
165 // wait for disk to be ready
166 waitdisk();
167 7c84: e8 e1 ff ff ff call 7c6a <waitdisk>
168 }
169
170 static inline void
171 outb(int port, uint8_t data)
172 {
173 asm volatile("outb %0,%w1" : : "a" (data), "d" (port));
174 7c89: ba f2 01 00 00 mov $0x1f2,x
175 7c8e: b0 01 mov $0x1,%al
176 7c90: ee out %al,(%dx)
177 7c91: 0f b6 c3 movzbl %bl,x
178 7c94: b2 f3 mov $0xf3,%dl
179 7c96: ee out %al,(%dx)
180 7c97: 0f b6 c7 movzbl %bh,x
181 7c9a: b2 f4 mov $0xf4,%dl
182 7c9c: ee out %al,(%dx)
183
184 outb(0x1F2, 1); // count = 1
185 outb(0x1F3, offset);
186 outb(0x1F4, offset > 8);
187 outb(0x1F5, offset > 16);
188 7c9d: 89 d8 mov x,x
189 7c9f: b2 f5 mov $0xf5,%dl
190 7ca1: c1 e8 10 shr $0x10,x
191 7ca4: 0f b6 c0 movzbl %al,x
192 7ca7: ee out %al,(%dx)
193 outb(0x1F6, (offset > 24) | 0xE0);
194 7ca8: c1 eb 18 shr $0x18,x
195 7cab: b2 f6 mov $0xf6,%dl
196 7cad: 88 d8 mov %bl,%al
197 7caf: 83 c8 e0 or $0xffffffe0,x
198 7cb2: ee out %al,(%dx)
199 7cb3: b0 20 mov $0x20,%al
200 7cb5: b2 f7 mov $0xf7,%dl
201 7cb7: ee out %al,(%dx)
202 outb(0x1F7, 0x20); // cmd 0x20 - read sectors
203
204 // wait for disk to be ready
205 waitdisk();
206 7cb8: e8 ad ff ff ff call 7c6a <waitdisk>
207 }
208
209 static inline void
210 insl(int port, void *addr, int cnt)
211 {
212 asm volatile("cld\n\trepne\n\tinsl"
213 7cbd: 8b 7d 08 mov 0x8(p),i
214 7cc0: b9 80 00 00 00 mov $0x80,x
215 7cc5: ba f0 01 00 00 mov $0x1f0,x
216 7cca: fc cld
217 7ccb: f2 6d repnz insl (%dx),%es:(i)
218
219 // read a sector
220 insl(0x1F0, dst, SECTSIZE/4);
221 }
222 7ccd: 5b pop x
223 7cce: 5f pop i
224 7ccf: 5d pop p
225 7cd0: c3 ret
226
227 00007cd1 <readseg>:
228
229 // Read 'count' bytes at 'offset' from kernel into physical address 'pa'.
230 // Might copy more than asked
231 void
232 readseg(uint32_t pa, uint32_t count, uint32_t offset)
233 {
234 7cd1: 55 push p
235 7cd2: 89 e5 mov %esp,p
236 7cd4: 57 push i
237 uint32_t end_pa;
238
239 end_pa = pa + count;
240 7cd5: 8b 7d 0c mov 0xc(p),i
241
242 // Read 'count' bytes at 'offset' from kernel into physical address 'pa'.
243 // Might copy more than asked
244 void
245 readseg(uint32_t pa, uint32_t count, uint32_t offset)
246 {
247 7cd8: 56 push %esi
248 7cd9: 8b 75 10 mov 0x10(p),%esi
249 7cdc: 53 push x
250 7cdd: 8b 5d 08 mov 0x8(p),x
251
252 // round down to sector boundary
253 pa &= ~(SECTSIZE - 1);
254
255 // translate from bytes to sectors, and kernel starts at sector 1
256 offset = (offset / SECTSIZE) + 1;
257 7ce0: c1 ee 09 shr $0x9,%esi
258 void
259 readseg(uint32_t pa, uint32_t count, uint32_t offset)
260 {
261 uint32_t end_pa;
262
263 end_pa = pa + count;
264 7ce3: 01 df add x,i
265
266 // round down to sector boundary
267 pa &= ~(SECTSIZE - 1);
268
269 // translate from bytes to sectors, and kernel starts at sector 1
270 offset = (offset / SECTSIZE) + 1;
271 7ce5: 46 inc %esi
272 uint32_t end_pa;
273
274 end_pa = pa + count;
275
276 // round down to sector boundary
277 pa &= ~(SECTSIZE - 1);
278 7ce6: 81 e3 00 fe ff ff and $0xfffffe00,x
279 offset = (offset / SECTSIZE) + 1;
280
281 // If this is too slow, we could read lots of sectors at a time.
282 // We'd write more to memory than asked, but it doesn't matter --
283 // we load in increasing order.
284 while (pa < end_pa) {
285 7cec: 39 fb cmp i,x
286 7cee: 73 12 jae 7d02 <readseg+0x31>
287 // Since we haven't enabled paging yet and we're using
288 // an identity segment mapping (see boot.S), we can
289 // use physical addresses directly. This won't be the
290 // case once JOS enables the MMU.
291 readsect((uint8_t*) pa, offset);
292 7cf0: 56 push %esi
293 pa += SECTSIZE;
294 offset++;
295 7cf1: 46 inc %esi
296 while (pa < end_pa) {
297 // Since we haven't enabled paging yet and we're using
298 // an identity segment mapping (see boot.S), we can
299 // use physical addresses directly. This won't be the
300 // case once JOS enables the MMU.
301 readsect((uint8_t*) pa, offset);
302 7cf2: 53 push x
303 pa += SECTSIZE;
304 7cf3: 81 c3 00 02 00 00 add $0x200,x
305 while (pa < end_pa) {
306 // Since we haven't enabled paging yet and we're using
307 // an identity segment mapping (see boot.S), we can
308 // use physical addresses directly. This won't be the
309 // case once JOS enables the MMU.
310 readsect((uint8_t*) pa, offset);
311 7cf9: e8 7e ff ff ff call 7c7c <readsect>
312 pa += SECTSIZE;
313 offset++;
314 7cfe: 58 pop x
315 7cff: 5a pop x
316 7d00: eb ea jmp 7cec <readseg+0x1b>
317 }
318 }
319 7d02: 8d 65 f4 lea -0xc(p),%esp
320 7d05: 5b pop x
321 7d06: 5e pop %esi
322 7d07: 5f pop i
323 7d08: 5d pop p
324 7d09: c3 ret
325
326 00007d0a <bootmain>:
327 void readsect(void*, uint32_t);
328 void readseg(uint32_t, uint32_t, uint32_t);
329
330 void
331 bootmain(void)
332 {
333 7d0a: 55 push p
334 7d0b: 89 e5 mov %esp,p
335 7d0d: 56 push %esi
336 7d0e: 53 push x
337 struct Proghdr *ph, *eph;
338
339 // read 1st page off disk
340 readseg((uint32_t) ELFHDR, SECTSIZE*8, 0);
341 7d0f: 6a 00 push $0x0
342 7d11: 68 00 10 00 00 push $0x1000
343 7d16: 68 00 00 01 00 push $0x10000
344 7d1b: e8 b1 ff ff ff call 7cd1 <readseg>
345
346 // is this a valid ELF?
347 if (ELFHDR->e_magic != ELF_MAGIC)
348 7d20: 83 c4 0c add $0xc,%esp
349 7d23: 81 3d 00 00 01 00 7f cmpl $0x464c457f,0x10000
350 7d2a: 45 4c 46
351 7d2d: 75 38 jne 7d67 <bootmain+0x5d>
352 goto bad;
353
354 // load each program segment (ignores ph flags)
355 ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
356 7d2f: a1 1c 00 01 00 mov 0x1001c,x
357 7d34: 8d 98 00 00 01 00 lea 0x10000(x),x
358 eph = ph + ELFHDR->e_phnum;
359 7d3a: 0f b7 05 2c 00 01 00 movzwl 0x1002c,x
360 7d41: c1 e0 05 shl $0x5,x
361 7d44: 8d 34 03 lea (x,x,1),%esi
362 for (; ph < eph; ph++)
363 7d47: 39 f3 cmp %esi,x
364 7d49: 73 16 jae 7d61 <bootmain+0x57>
365 // p_pa is the load address of this segment (as well
366 // as the physical address)
367 readseg(ph->p_pa, ph->p_memsz, ph->p_offset);
368 7d4b: ff 73 04 pushl 0x4(x)
369 goto bad;
370
371 // load each program segment (ignores ph flags)
372 ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
373 eph = ph + ELFHDR->e_phnum;
374 for (; ph < eph; ph++)
375 7d4e: 83 c3 20 add $0x20,x
376 // p_pa is the load address of this segment (as well
377 // as the physical address)
378 readseg(ph->p_pa, ph->p_memsz, ph->p_offset);
379 7d51: ff 73 f4 pushl -0xc(x)
380 7d54: ff 73 ec pushl -0x14(x)
381 7d57: e8 75 ff ff ff call 7cd1 <readseg>
382 goto bad;
383
384 // load each program segment (ignores ph flags)
385 ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
386 eph = ph + ELFHDR->e_phnum;
387 for (; ph < eph; ph++)
388 7d5c: 83 c4 0c add $0xc,%esp
389 7d5f: eb e6 jmp 7d47 <bootmain+0x3d>
390 // as the physical address)
391 readseg(ph->p_pa, ph->p_memsz, ph->p_offset);
392
393 // call the entry point from the ELF header
394 // note: does not return!
395 ((void (*)(void)) (ELFHDR->e_entry))();
396 7d61: ff 15 18 00 01 00 call *0x10018
397 }
398
399 static inline void
400 outw(int port, uint16_t data)
401 {
402 asm volatile("outw %0,%w1" : : "a" (data), "d" (port));
403 7d67: ba 00 8a 00 00 mov $0x8a00,x
404 7d6c: b8 00 8a ff ff mov $0xffff8a00,x
405 7d71: 66 ef out %ax,(%dx)
406 7d73: b8 00 8e ff ff mov $0xffff8e00,x
407 7d78: 66 ef out %ax,(%dx)
408 7d7a: eb fe jmp 7d7a <bootmain+0x70>
409
410
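As we can see, the code above starts executing at 0x7c00, yet debugging with gdb shows that the first instruction the BIOS executes is at 0xf000:0xfff0. So here is the question: when does CS change from 0xf000 to 0? And what is the BIOS doing before it reaches 0x7c00?
Let's look at that part of the code in gdb: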
1 [f000:fff0] 0xffff0: ljmp $0xf000,$0xe05b
2 [f000:e05b] 0xfe05b: cmpl $0x0,%cs:0x6ac8
3 [f000:e062] 0xfe062: jne 0xfd2e1
4 [f000:e066] 0xfe066: xor %dx,%dx
5 [f000:e068] 0xfe068: mov %dx,%ss
6 [f000:e06a] 0xfe06a: mov $0x7000,%esp
7 [f000:e070] 0xfe070: mov $0xf34c2,%edx
8 [f000:e076] 0xfe076: jmp 0xfd15c
9 [f000:d15c] 0xfd15c: mov %eax,%ecx
10 [f000:d15f] 0xfd15f: cli
11 [f000:d160] 0xfd160: cld
12 [f000:d161] 0xfd161: mov $0x8f,%eax
13 [f000:d167] 0xfd167: out %al,$0x70
14 [f000:d169] 0xfd169: in $0x71,%al
15 [f000:d16b] 0xfd16b: in $0x92,%al
16 [f000:d16d] 0xfd16d: or $0x2,%al
17 [f000:d16f] 0xfd16f: out %al,$0x92
18 [f000:d171] 0xfd171: lidtw %cs:0x6ab8
19 [f000:d177] 0xfd177: lgdtw %cs:0x6a74
20 [f000:d17d] 0xfd17d: mov %cr0,%eax
21 [f000:d180] 0xfd180: or $0x1,%eax
22 [f000:d184] 0xfd184: mov %eax,%cr0
23 [f000:d187] 0xfd187: ljmpl $0x8,$0xfd18f
24
25 The target architecture is assumed to be i386
26 0xfd18f: mov $0x10,%eax
27 0xfd194: mov %eax,%ds
28 0xfd196: mov %eax,%es
29 0xfd198: mov %eax,%ss
30 0xfd19a: mov %eax,%fs
31 0xfd19c: mov %eax,%gs
32 0xfd19e: mov %ecx,%eax
33 0xfd1a0: jmp *%edx
34 0xf34c2: push x
35 0xf34c3: sub $0x2c,%esp
36 0xf34c6: movl $0xf5b5c,0x4(%esp)
37 0xf34ce: movl $0xf447b,(%esp)
38 0xf34d5: call 0xf099e
39 0xf099e: lea 0x8(%esp),x
40 0xf09a2: mov 0x4(%esp),x
41 0xf09a6: mov $0xf5b58,x
42 0xf09ab: call 0xf0574
43 0xf0574: push p
44 0xf0575: push i
45 0xf0576: push %esi
46 0xf0577: push x
47 0xf0578: sub $0xc,%esp
48 0xf057b: mov x,0x4(%esp)
49 0xf057f: mov x,p
50 0xf0581: mov x,%esi
51 0xf0583: movsbl 0x0(p),x
52 0xf0587: test %dl,%dl
53 0xf0589: je 0xf0758
54 0xf058f: cmp $0x25,%dl
55 0xf0592: jne 0xf0741
56 0xf0741: mov 0x4(%esp),x
57
58
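Here lidtw loads the interrupt descriptor table (IDT) register and lgdtw loads the global descriptor table (GDT) register. See LGDT/LIDT -- Load Global/Interrupt Descriptor Table Register.
For ports 0x70 and 0x71 on lines 13-14, see CMOS#Accessing_CMOS_Registers, though I think this is too much detail and can be skipped.
Lines 15-17 are the fast way to enable A20; see A20_Line.
Then lines 18-23... look familiar. Aren't these the steps for entering protected mode?
But the boot loader hasn't even been loaded yet, so why are we already entering protected mode? According to bootloader - switching processor to protected mode, some BIOS implementations briefly enter protected mode before loading the boot loader, possibly to use protected-mode features (such as 32-bit registers), and then switch back to real mode before jumping to the boot loader. Also, according to an expert in a 6.828 study group, the way protected mode is entered here differs from the way the boot loader does it, and there are four ways to enter protected mode in total. That feels like too much detail, so I'll set it aside for now.
As for the code after line 23... sorry, I don't really understand it either. It looks unimportant; I'll come back to it if it turns out to matter.
Now let's answer a few questions.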
* At what point does the processor start executing 32-bit code? What exactly causes the switch from 16- to 32-bit mode?
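The processor starts executing 32-bit code at 0x7c32, with the instruction mov $0x10,%ax. Setting bit 0 (PE) of control register 0 to 1 turns protected mode on; the ljmp $PROT_MODE_CSEG, $protcseg that follows reloads %cs with the 32-bit code segment, and that is what switches the processor from 16-bit to 32-bit mode.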
* What is the _last_ instruction of the boot loader executed, and what is the _first_ instruction of the kernel it just loaded?
The last boot loader instruction executed is 0x7d61: call *0x10018, which corresponds to the C code ((void (*)(void)) (ELFHDR->e_entry))(); The first kernel instruction executed after loading is movw $0x1234,0x472
* _Where_ is the first instruction of the kernel?
The first instruction of the kernel is at address 0x10000c
* How does the boot loader decide how many sectors it must read in order to fetch the entire kernel from disk? Where does it find this information?
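The boot loader first reads a small part of the kernel, namely 8 sectors, i.e. 1 page; the code is readseg((uint32_t) ELFHDR, SECTSIZE*8, 0); The part it has read contains the ELF header and the program headers, which say how large each segment of the kernel is and where it lives in the file. The layout of these structures is declared in inc/elf.h.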
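To make "how many sectors" concrete, here is a minimal sketch of the arithmetic a readseg-style loop performs. The function names sectors_needed and first_sector are hypothetical; the rounding and the "+ 1" (the kernel image starts at sector 1, after the boot sector) follow my reading of boot/main.c, so treat the details as an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define SECTSIZE 512u

/* Number of disk sectors read for a segment that must land at physical
 * address pa and is count bytes long. */
uint32_t sectors_needed(uint32_t pa, uint32_t count)
{
    uint32_t end_pa = pa + count;

    assert(end_pa >= pa);               /* no 32-bit wrap-around */
    pa &= ~(SECTSIZE - 1);              /* round down to a sector boundary */
    return (end_pa - pa + SECTSIZE - 1) / SECTSIZE;
}

/* Disk sector holding byte `offset` of the kernel image; sector 0 is the
 * boot sector, so the image starts at sector 1. */
uint32_t first_sector(uint32_t offset)
{
    return offset / SECTSIZE + 1;
}
```

For example, the initial read of SECTSIZE*8 bytes at 0x10000 gives sectors_needed(0x10000, 4096) == 8, matching the "8 sectors" above.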
Loading the Kernel
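Exercise 4 says you should be comfortable with C pointers, so I went and looked at the recommended "The C Programming Language". It turns out to be a really great introductory book; I used to think it was an intimidating tome like Introduction to Algorithms, to be admired only from afar. Too bad I'm no longer a beginner... Exercise 4 provides a piece of code that uses C pointers; for the 5th output, pay attention to endianness.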
#include <stdio.h>
#include <stdlib.h>

void
f(void)
{
    int a[4];
    int *b = malloc(16);
    int *c;
    int i;

    printf("1: a = %p, b = %p, c = %p\n", a, b, c);

    c = a;
    for (i = 0; i < 4; i++)
        a[i] = 100 + i;
    c[0] = 200;
    printf("2: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
           a[0], a[1], a[2], a[3]);

    c[1] = 300;
    *(c + 2) = 301;
    3[c] = 302;
    printf("3: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
           a[0], a[1], a[2], a[3]);

    c = c + 1;
    *c = 400;
    printf("4: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
           a[0], a[1], a[2], a[3]);

    c = (int *) ((char *) c + 1);
    *c = 500;
    printf("5: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
           a[0], a[1], a[2], a[3]);

    b = (int *) a + 1;
    c = (int *) ((char *) a + 1);
    printf("6: a = %p, b = %p, c = %p\n", a, b, c);
}

int
main(int ac, char **av)
{
    f();
    return 0;
}
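The 5th output is the one where byte order matters, so here is a small replay of just that step. It is a sketch rather than the lab's code: instead of doing the misaligned store for real, it models memory as a byte array and encodes every int as 4 little-endian bytes (the x86 order), so it gives the same answer on any host. The helper names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value as 4 little-endian bytes (x86 byte order). */
void store_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Load a 32-bit value from 4 little-endian bytes. */
uint32_t load_le32(const uint8_t *p)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)p[i] << (8 * i);
    return v;
}

/* Replay step 5: a = {200, 400, 301, 302}, then write 500 at byte offset 5. */
void replay_step5(uint32_t out[4])
{
    const uint32_t before[4] = {200, 400, 301, 302};
    uint8_t mem[16];

    assert(out != 0);
    for (int i = 0; i < 4; i++)
        store_le32(mem + 4 * i, before[i]);

    store_le32(mem + 5, 500);   /* c = (int *)((char *)&a[1] + 1); *c = 500; */

    for (int i = 0; i < 4; i++)
        out[i] = load_le32(mem + 4 * i);
}
```

The write lands on the upper three bytes of a[1] and the lowest byte of a[2], which gives a[1] = 0x0001f490 = 128144 and a[2] = 0x100 = 256, while a[0] and a[3] are untouched.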
Before moving on, take a careful look at the ELF file format: ELF
ELF files
An ELF file is divided into many sections. Typically the .data section holds initialized global/static variables, .text holds code, .rodata holds read-only data such as string constants, and .bss holds uninitialized global/static variables. The .bss section has no contents stored in the file, because uninitialized variables are required to default to 0, so there is no need to store them again. "Thus there is no need to store contents for .bss in the ELF binary; instead, the linker records just the address and size of the .bss section. The loader or the program itself must arrange to zero the .bss section."
The sections we care about most are .data, .text and .rodata.
We can use objdump -h to view an ELF file's section headers:
objdump -h obj/kern/kernel

obj/kern/kernel: file format elf32-i386

Sections:
Idx Name      Size     VMA      LMA      File off Algn
  0 .text     00001917 f0100000 00100000 00001000 2**4
              CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata   00000714 f0101920 00101920 00002920 2**5
              CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab     00003889 f0102034 00102034 00003034 2**2
              CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .stabstr  000018af f01058bd 001058bd 000068bd 2**0
              CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .data     0000a300 f0108000 00108000 00009000 2**12
              CONTENTS, ALLOC, LOAD, DATA
  5 .bss      00000648 f0112300 00112300 00013300 2**5
              CONTENTS, ALLOC, LOAD, DATA
  6 .comment  00000023 00000000 00000000 00013948 2**0
              CONTENTS, READONLY
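Here Size is the size of the section. VMA (Virtual Memory Address, called the link address in 6.828) is the memory address the section expects to execute from, and LMA (Load Memory Address) is the address at which the section is loaded into memory. Usually the two are the same, but not for the kernel: above, .text has VMA f0100000 and LMA 00100000.
The boot loader uses the program headers in the ELF file to decide how to load the sections. The program headers specify which parts of the ELF file must be loaded into memory and where each part goes. We can use objdump -x obj/kern/kernel to view all of the ELF headers.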
Exercise 5. Trace through the first few instructions of the boot loader again and identify the first instruction that would "break" or otherwise do the wrong thing if you were to get the boot loader's link address wrong. Then change the link address in boot/Makefrag to something wrong, run make clean, recompile the lab with make, and trace into the boot loader again to see what happens. Don't forget to change the link address back and make clean again afterward!
I changed the boot loader's link address from 0x7c00 to 0x9c00... and then single-stepped in gdb.
The operand of lgdtw showed up as a negative number, [ 0:7c1e] => 0x7c1e: lgdtw -0x639c. Execution then continued to [ 0:7c2d] => 0x7c2d: ljmp $0x8,$0x9c32, where it crashed.
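The negative operand looks odd at first, but it appears to be nothing more than gdb printing a 16-bit address as a signed number. A minimal sketch of that reading (as_signed16 is a name made up for this example):

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low 16 bits of v as a signed two's-complement number. */
int32_t as_signed16(uint32_t v)
{
    assert(v <= 0xFFFFu);
    return v >= 0x8000u ? (int32_t)v - 0x10000 : (int32_t)v;
}
```

as_signed16(0x9c64) is -0x639c, so the operand is really 0x9c64 = 0x9c00 + 0x64: an address computed from the wrong link address. Presumably this is the GDT descriptor, which would sit at 0x7c64 with the correct link address; the code actually in memory is still at 0x7c00, so lgdtw reads garbage from 0x9c64.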
Looking at the generated boot.asm, the addresses are indeed based at 0x9c00 now.
protcseg:
  # Set up the protected-mode data segment registers
  movw    $PROT_MODE_DSEG, %ax    # Our data segment selector
    9c32:       66 b8 10 00             mov    $0x10,%ax
  movw    %ax, %ds                # - DS: Data Segment
    9c36:       8e d8                   mov    %eax,%ds
  movw    %ax, %es                # - ES: Extra Segment
    9c38:       8e c0                   mov    %eax,%es
  movw    %ax, %fs                # - FS
    9c3a:       8e e0                   mov    %eax,%fs
  movw    %ax, %gs                # - GS
    9c3c:       8e e8                   mov    %eax,%gs
  movw    %ax, %ss                # - SS: Stack Segment
    9c3e:       8e d0                   mov    %eax,%ss

  # Set up the stack pointer and call into C.
  movl    $start, %esp
    9c40:       bc 00 9c 00 00          mov    $0x9c00,%esp
  call bootmain
    9c45:       e8 c0 00 00 00          call   9d0a <bootmain>

00009c4a <spin>:

  # If bootmain returns (it shouldn't), loop.
spin:
  jmp spin
    9c4a:       eb fe                   jmp    9c4a <spin>
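In reality, though, the BIOS still loads the boot loader at 0x7c00. Is this just a convention, with the BIOS ignoring the boot loader's link address and always loading at 0x7c00? As far as I can tell, yes: 0x7c00 is the fixed PC convention for where the BIOS places the 512-byte boot sector, and the boot sector is a raw binary image with no header, so the BIOS has no way of learning the link address. The link address only determines the absolute addresses baked into the code, which is why things break when the two disagree.
Exercise 6. Reset the machine (exit QEMU/GDB and start them again). Examine the 8 words of memory at 0x00100000 at the point the BIOS enters the boot loader, and then again at the point the boot loader enters the kernel. Why are they different? What is there at the second breakpoint? (You do not really need to use QEMU to answer this question. Just think.)
The question asks why the 8 words starting at address 0x00100000 differ between the moment the BIOS enters the boot loader (at 0x7c00) and the moment the boot loader enters the kernel (at 0x10000c).
At 0x7c00, the 8 words at 0x00100000 are all 0...
At 0x10000c, the contents of 0x00100000, disassembled as instructions, are: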
0x100000: add 0x1bad(%eax),%dh
0x100006: add %al,(%eax)
0x100008: decb 0x52(%edi)
0x10000b: in $0x66,%al
0x10000d: movl $0xb81234,0x472
0x100017: add %dl,(%eax)
0x100019: add %cl,(%edi)
0x10001b: and %al,%bl
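They differ because when the boot loader has just been entered, the kernel has not been loaded into memory yet, so the region is empty. By the second breakpoint the boot loader has copied the kernel there: 0x00100000 is the LMA of the kernel's .text section in the objdump output above.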
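One more observation about that listing: the first few "instructions" look like data that gdb decoded as code. Working backwards from the x86 encodings, add 0x1bad(%eax),%dh is the bytes 02 b0 ad 1b 00 00 and decb 0x52(%edi) followed by in $0x66,%al is fe 4f 52 e4 66, so the first three words are 0x1badb002, 0 and 0xe4524ffe. That matches a Multiboot header (magic, flags, checksum), which I believe kern/entry.S places at the start of .text. The sketch below checks this reading; the function names are mine, and the byte values are my own decoding, not gdb output.

```c
#include <assert.h>
#include <stdint.h>

#define MULTIBOOT_HEADER_MAGIC 0x1BADB002u

/* Assemble 4 bytes, lowest address first, into a little-endian word. */
uint32_t le32(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0 | (uint32_t)b1 << 8 |
           (uint32_t)b2 << 16 | (uint32_t)b3 << 24;
}

/* A Multiboot header is valid when magic + flags + checksum == 0 (mod 2^32). */
int multiboot_header_ok(uint32_t magic, uint32_t flags, uint32_t checksum)
{
    return magic == MULTIBOOT_HEADER_MAGIC &&
           (uint32_t)(magic + flags + checksum) == 0;
}
```

The checksum works out (0x1badb002 + 0 + 0xe4524ffe wraps to 0), and the real code then starts at 0x10000c, which agrees with the kernel entry address found earlier.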
Part 3: The Kernel
Using virtual memory to work around position dependence
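An OS kernel usually likes to be linked and run at a high virtual address, such as 0xf0100000, leaving the low addresses for user programs. But some machines do not have that much memory, so the physical address 0xf0100000 does not exist. We therefore need a mapping from virtual memory to physical memory. In this part of the lab we do not need to know how address mapping works, only what its effect is.
Specifically, before CR0_PG is set to 1, memory addresses are physical addresses (strictly speaking they are linear addresses, but boot/boot.S set up an identity mapping from linear to physical addresses). Once the CR0_PG flag is set to 1, addresses become virtual addresses. We can watch what happens in gdb.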
Exercise 7. Use QEMU and GDB to trace into the JOS kernel and stop at the `movl %eax, %cr0`. Examine memory at 0x00100000 and at 0xf0100000. Now, single step over that instruction using the stepi GDB command. Again, examine memory at 0x00100000 and at 0xf0100000. Make sure you understand what just happened. What is the first instruction after the new mapping is established that would fail to work properly if the mapping weren't in place? Comment out the `movl %eax, %cr0` in kern/entry.S, trace into it, and see if you were right.
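First set a breakpoint with b *0x10000c, the address where the JOS kernel starts running. Then single-step a few times and stop at movl %eax, %cr0, that is, just before the CR0_PG flag is set to 1. Look at the contents of 0x00100000 and 0xf0100000: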
(gdb) x/8x 0xf0100000
0xf0100000 <_start+4026531828>: 0x00000000 0x00000000 0x00000000 0x00000000
0xf0100010 <entry+4>: 0x00000000 0x00000000 0x00000000 0x00000000

(gdb) x/8i 0x00100000
0x100000: add 0x1bad(%eax),%dh
0x100006: add %al,(%eax)
0x100008: decb 0x52(%edi)
0x10000b: in $0x66,%al
0x10000d: movl $0xb81234,0x472
0x100017: add %dl,(%eax)
0x100019: add %cl,(%edi)
0x10001b: and %al,%bl
Then single-step once more and use x/8i to look at 8 instructions at 0x00100000 and at 0xf0100000 again:
(gdb) x/8i 0x00100000
0x100000: add 0x1bad(%eax),%dh
0x100006: add %al,(%eax)
0x100008: decb 0x52(%edi)
0x10000b: in $0x66,%al
0x10000d: movl $0xb81234,0x472
0x100017: add %dl,(%eax)
0x100019: add %cl,(%edi)
0x10001b: and %al,%bl
(gdb) x/8i 0xf0100000
0xf0100000 <_start+4026531828>: add 0x1bad(%eax),%dh
0xf0100006 <_start+4026531834>: add %al,(%eax)
0xf0100008 <_start+4026531836>: decb 0x52(%edi)
0xf010000b <_start+4026531839>: in $0x66,%al
0xf010000d <entry+1>: movl $0xb81234,0x472
0xf0100017 <entry+11>: add %dl,(%eax)
0xf0100019 <entry+13>: add %cl,(%edi)
0xf010001b <entry+15>: and %al,%bl
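We can see that before the CR0_PG flag is set to 1, there is nothing but zeros at address 0xf0100000.
After it is set to 1, the contents at 0xf0100000 match the contents at 0x00100000. Note that both addresses are now virtual addresses. Specifically: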
Once `CR0_PG` is set, memory references are virtual addresses that get translated by the virtual memory hardware to physical addresses. `entry_pgdir` translates virtual addresses in the range 0xf0000000 through 0xf0400000 to physical addresses 0x00000000 through 0x00400000, as well as virtual addresses 0x00000000 through 0x00400000 to physical addresses 0x00000000 through 0x00400000
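The two ranges in that quote can be modelled in a few lines. This is only a toy model of the effect of entry_pgdir, not how the page table hardware works, and entry_translate and MAPPED are names invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KERNBASE 0xF0000000u
#define MAPPED   0x00400000u   /* 4MB covered by each of the two ranges */

/* Translate va the way the quoted mapping does; false if va is unmapped. */
bool entry_translate(uint32_t va, uint32_t *pa)
{
    assert(pa != 0);
    if (va < MAPPED) {                                /* identity mapping */
        *pa = va;
        return true;
    }
    if (va >= KERNBASE && va - KERNBASE < MAPPED) {   /* high mapping */
        *pa = va - KERNBASE;
        return true;
    }
    return false;
}
```

Both 0x00100000 and 0xf0100000 translate to physical 0x00100000, which is why the two x/8i listings are identical once paging is on.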
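Then we comment out movl %eax, %cr0 in kern/entry.S.
Debugging with gdb again, execution crashes at 0x10002a: jmp *%eax. The cause is that paging was never enabled (protected mode itself has been on since the boot loader): eax holds the high link address of relocated (0xf010002c in kernel.asm), and with no mapping in place that is treated as a physical address where there is no memory.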
Formatted Printing to the Console
Formatted output in the style of printf does not come for free. First read through the relevant code: kern/printf.c, kern/console.c and lib/printfmt.c.
Exercise 8. We have omitted a small fragment of code - the code necessary to print octal numbers using patterns of the form "%o". Find and fill in this code fragment.
This one is easy. The modified code is:
    case 'o':
        // Replace this with your code.
        num = getuint(&ap, lflag);
        base = 8;
        goto number;
Next, let's answer a few questions.
1. Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?
The interface between printf.c and console.c is cputchar() in console.c, which prints one character to the console. printf.c calls cputchar() from its putch() function.
2. Explain the following from console.c:
if (crt_pos >= CRT_SIZE) {
    int i;
    memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
    for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
        crt_buf[i] = 0x0700 | ' ';
    crt_pos -= CRT_COLS;
}
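This code is straightforward. When the number of characters exceeds what the screen can hold (crt_pos >= CRT_SIZE), it moves the second through the last rows up by one row (overwriting the original first row), blanks the last row (its contents have already moved up to the second-to-last row), and moves the cursor position back by one row. In other words, it scrolls the screen by one line.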
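The same logic can be tried on a tiny buffer outside the kernel. The sketch below takes the dimensions as parameters instead of using CRT_SIZE and CRT_COLS; scroll_if_full is a hypothetical name, not something in console.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scroll a rows x cols text buffer up by one line when pos has run off the
 * end. Returns the (possibly adjusted) cursor position. */
int scroll_if_full(uint16_t *buf, int rows, int cols, int pos)
{
    int size = rows * cols;

    assert(buf != NULL && rows > 0 && cols > 0);
    if (pos >= size) {
        /* Rows 1..rows-1 move to rows 0..rows-2; the old row 0 is lost. */
        memmove(buf, buf + cols, (size_t)(size - cols) * sizeof(uint16_t));
        /* Blank the last row: attribute 0x07 (light grey on black) + ' '. */
        for (int i = size - cols; i < size; i++)
            buf[i] = 0x0700 | ' ';
        pos -= cols;
    }
    return pos;
}
```

On a 3x2 buffer holding a..f with the cursor at 6, the call leaves c, d, e, f followed by two blanks and returns 4. memmove rather than memcpy matters here because source and destination overlap.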
3. For the following questions you might wish to consult the notes for Lecture 2. These notes cover GCC's calling convention on the x86. Trace the execution of the following code step-by-step:
int x = 1, y = 3, z = 4;
cprintf("x %d, y %x, z %d\n", x, y, z);
* In the call to `cprintf()`, to what does `fmt` point? To what does `ap` point?
* List (in order of execution) each call to `cons_putc`, `va_arg`, and `vcprintf`. For `cons_putc`, list its argument as well. For `va_arg`, list what `ap` points to before and after the call. For `vcprintf` list the values of its two arguments.
To answer this, it helps to first read up on C variadic arguments and x86 calling conventions.
Let's look at the code in kern/printf.c first:
static void
putch(int ch, int *cnt)
{
    cputchar(ch);
    *cnt++;
}

int
vcprintf(const char *fmt, va_list ap)
{
    int cnt = 0;

    vprintfmt((void*)putch, &cnt, fmt, ap);
    return cnt;
}

int
cprintf(const char *fmt, ...)
{
    va_list ap;
    int cnt;

    va_start(ap, fmt);
    cnt = vcprintf(fmt, ap);
    va_end(ap);

    return cnt;
}
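Start with int cprintf(const char *fmt, ...). The parameter fmt is the format string familiar from C's printf, that is, the first argument.
The rest is the usual C variadic-argument routine, except that it does not call va_arg itself; instead it calls cnt = vcprintf(fmt, ap) and returns a count whose meaning is not yet clear.
Next is int vcprintf(const char *fmt, va_list ap), which doesn't have much to look at... Then comes vprintfmt, whose code is: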
void
vprintfmt(void (*putch)(int, void*), void *putdat, const char *fmt, va_list ap)
{
    register const char *p;
    register int ch, err;
    unsigned long long num;
    int base, lflag, width, precision, altflag;
    char padc;

    while (1) {
        while ((ch = *(unsigned char *) fmt++) != '%') {
            if (ch == '\0')
                return;
            putch(ch, putdat);
        }

        // Process a %-escape sequence
        padc = ' ';
        width = -1;
        precision = -1;
        lflag = 0;
        altflag = 0;
    reswitch:
        switch (ch = *(unsigned char *) fmt++) {

        // flag to pad on the right
        case '-':
            padc = '-';
            goto reswitch;

        // flag to pad with 0's instead of spaces
        case '0':
            padc = '0';
            goto reswitch;

        // width field
        case '1':
        case '2':
        case '3':
        case '4':
        case '5':
        case '6':
        case '7':
        case '8':
        case '9':
            for (precision = 0; ; ++fmt) {
                precision = precision * 10 + ch - '0';
                ch = *fmt;
                if (ch < '0' || ch > '9')
                    break;
            }
            goto process_precision;

        case '*':
            precision = va_arg(ap, int);
            goto process_precision;

        case '.':
            if (width < 0)
                width = 0;
            goto reswitch;

        case '#':
            altflag = 1;
            goto reswitch;

        process_precision:
            if (width < 0)
                width = precision, precision = -1;
            goto reswitch;

        // long flag (doubled for long long)
        case 'l':
            lflag++;
            goto reswitch;

        // character
        case 'c':
            putch(va_arg(ap, int), putdat);
            break;

        // error message
        case 'e':
            err = va_arg(ap, int);
            if (err < 0)
                err = -err;
            if (err >= MAXERROR || (p = error_string[err]) == NULL)
                printfmt(putch, putdat, "error %d", err);
            else
                printfmt(putch, putdat, "%s", p);
            break;

        // string
        case 's':
            if ((p = va_arg(ap, char *)) == NULL)
                p = "(null)";
            if (width > 0 && padc != '-')
                for (width -= strnlen(p, precision); width > 0; width--)
                    putch(padc, putdat);
            for (; (ch = *p++) != '\0' && (precision < 0 || --precision >= 0); width--)
                if (altflag && (ch < ' ' || ch > '~'))
                    putch('?', putdat);
                else
                    putch(ch, putdat);
            for (; width > 0; width--)
                putch(' ', putdat);
            break;

        // (signed) decimal
        case 'd':
            num = getint(&ap, lflag);
            if ((long long) num < 0) {
                putch('-', putdat);
                num = -(long long) num;
            }
            base = 10;
            goto number;

        // unsigned decimal
        case 'u':
            num = getuint(&ap, lflag);
            base = 10;
            goto number;

        // (unsigned) octal
        case 'o':
            // Replace this with your code.
            putch('X', putdat);
            putch('X', putdat);
            putch('X', putdat);
            break;

        // pointer
        case 'p':
            putch('0', putdat);
            putch('x', putdat);
            num = (unsigned long long)
                (uintptr_t) va_arg(ap, void *);
            base = 16;
            goto number;

        // (unsigned) hexadecimal
        case 'x':
            num = getuint(&ap, lflag);
            base = 16;
        number:
            printnum(putch, putdat, num, base, width, padc);
            break;

        // escaped '%' character
        case '%':
            putch(ch, putdat);
            break;

        // unrecognized escape sequence - just print it literally
        default:
            putch('%', putdat);
            for (fmt--; fmt[-1] != '%'; fmt--)
                /* do nothing */;
            break;
        }
    }
}
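A quick scan shows that this code handles the format specifiers: output type, precision, field width and so on.
Note that putch outputs one character to the console and keeps a running count of the characters printed.
Now let's answer the questions:
* In the call to cprintf, fmt points to "x %d, y %x, z %d\n", and ap points to the first variadic argument, i.e. the address of the variable x on the call stack.
* The calls to cons_putc, in order, are:
  * cons_putc('x')
  * cons_putc(' ')
  * cons_putc('1')
  * cons_putc(',')
  * cons_putc(' ')
  * cons_putc('y')
  * cons_putc(' ')
  * cons_putc('3')
  * cons_putc(',')
  * cons_putc(' ')
  * cons_putc('z')
  * cons_putc(' ')
  * cons_putc('4')
  * cons_putc('\n')
* va_arg is called three times:
  * Before the first call, ap points to the address of argument x on the stack; afterwards, it points to the address of argument y.
  * Before the second call, ap points to the address of argument y on the stack; afterwards, it points to the address of argument z.
  * Before the third call, ap points to the address of argument z on the stack; afterwards, it points to the address 4 bytes past z.
* The arguments of vcprintf are "x %d, y %x, z %d\n" and the address of argument x on the call stack.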
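The va_start / va_arg / va_end pattern traced above is ordinary standard C and can be tried in user space. A minimal sketch (sum_ints is a made-up function, unrelated to JOS):

```c
#include <assert.h>
#include <stdarg.h>

/* Sum `count` int arguments; each va_arg call consumes one argument. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    assert(count >= 0);
    va_start(ap, count);            /* ap now refers to the first variadic argument */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* read one int, then step to the next */
    va_end(ap);
    return total;
}
```

sum_ints(3, 1, 3, 4) walks the same three arguments as the cprintf call in the question. Note that the callee has no way of knowing how many arguments were really passed; here the count parameter plays the role that the format string plays for cprintf.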
4. Run the following code.
unsigned int i = 0x00646c72;
cprintf("H%x Wo%s", 57616, &i);
What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise. Here's an ASCII table that maps bytes to characters.
The output depends on that fact that the x86 is little-endian. If the x86 were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?
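The output is "He110 World". The e110 in the first half is just 57616 in hexadecimal. For the second half, %s reads the bytes of unsigned int i as a string of chars; the hexadecimal bytes 64, 6c and 72 correspond to the characters 'd', 'l' and 'r', and the 00 byte terminates the string.
First, a quick refresher on byte order. A static_cast between integer types has no byte-order issues, and ++ and -- on pointers involve neither casts nor byte order. Only a reinterpret_cast of a pointer type exposes byte order, for example:
int a = 0x12345678;
char *c = reinterpret_cast<char*>(&a);
printf("%x %x %x %x\n",c[0],c[1],c[2],c[3]);
// little-endian output: 78 56 34 12
// big-endian output: 12 34 56 78
Since x86 is little-endian, the bytes of i sit in memory in the order 72 6c 64 00, so what is actually printed is 'r', 'l', 'd'.
If x86 were big-endian, i would have to be changed to 0x726c6400 to produce the same output (with 0x00726c64 the first byte in memory would be 00, i.e. an empty string).
57616 does not need to change, because %x formats the integer by value and byte order plays no part.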
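The byte-order argument can be checked without a big-endian machine by laying the bytes out explicitly. This is a sketch: word_as_string is an invented helper that writes a 32-bit value into memory in the requested byte order and NUL-terminates it, so the result can be read the way %s would read &i.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Lay out v in the given byte order into out[0..3] and terminate it.
 * out must have room for 5 bytes. */
void word_as_string(uint32_t v, int big_endian, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        out[i] = (char)((v >> shift) & 0xFF);
    }
    out[4] = '\0';
}
```

With little-endian layout 0x00646c72 reads as "rld"; with big-endian layout 0x726c6400 reads as "rld" as well, whereas 0x00726c64 starts with a zero byte and reads as an empty string.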
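5. In the following code, what is going to be printed after `'y='`? (note: the answer is not a specific value.) Why does this happen?
cprintf("x=%d y=%d", 3);
x is printed as 3, and y as a meaningless integer. The reason is that va_arg is used to fetch the next argument when the va_list has no more arguments, and according to the va_arg specification that is undefined behaviour. In practice it reads whatever happens to sit on the stack right after the 3.
6. Let's say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change `cprintf` or its interface so that it would still be possible to pass it a variable number of arguments?
This seems hard to achieve by modifying cprintf alone, because the arguments are now pushed in the opposite order, so the va_arg macro would need to change... Alternatively, add a buffer: rather than handling one argument at a time, read all the arguments first, reverse their order, and then process them.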
The Stack
Exercise 9. Determine where the kernel initializes its stack, and exactly where in memory its stack is located. How does the kernel reserve space for its stack? And at which "end" of this reserved area is the stack pointer initialized to point to?
See obj/kern/kernel.asm:
f010002c <relocated>:
relocated:

    # Clear the frame pointer register (EBP)
    # so that once we get into debugging C code,
    # stack backtraces will be terminated properly.
    movl    $0x0,%ebp               # nuke frame pointer
f010002c:       bd 00 00 00 00          mov    $0x0,%ebp

    # Set the stack pointer
    movl    $(bootstacktop),%esp
f0100031:       bc 00 00 11 f0          mov    $0xf0110000,%esp
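From this we learn that the kernel initializes its stack with the instructions at 0xf010002c and 0xf0100031, and that the top of the stack is at 0xf0110000. As for how the kernel reserves space for its stack, my understanding is this: the stack now has an initial position, but how does it know how much room it has? In other words, the question is how the kernel decides the size of the stack. As far as I can tell, the space itself is set aside in the .data section of kern/entry.S (a .space KSTKSIZE between the labels bootstack and bootstacktop), and the size is defined in inc/memlayout.h: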
// All physical memory mapped at this address
#define KERNBASE        0xF0000000

// At IOPHYSMEM (640K) there is a 384K hole for I/O.  From the kernel,
// IOPHYSMEM can be addressed at KERNBASE + IOPHYSMEM.  The hole ends
// at physical address EXTPHYSMEM.
#define IOPHYSMEM       0x0A0000
#define EXTPHYSMEM      0x100000

// Kernel stack.
#define KSTACKTOP       KERNBASE
#define KSTKSIZE        (8*PGSIZE)              // size of a kernel stack
#define KSTKGAP         (8*PGSIZE)              // size of a kernel stack guard

// Memory-mapped IO.
#define MMIOLIM         (KSTACKTOP - PTSIZE)
#define MMIOBASE        (MMIOLIM - PTSIZE)
As for the last question: the stack grows downward on x86, so the stack pointer initially points to the high-address end (i.e. the top) of this reserved area.
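Putting the numbers together gives the exact range of the boot stack. A small sketch, assuming PGSIZE is 4096 (as I believe inc/mmu.h defines it); stack_bottom is a name made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE   4096u
#define KSTKSIZE (8 * PGSIZE)   /* as in inc/memlayout.h */

/* Lowest address of a downward-growing stack whose top is `top`. */
uint32_t stack_bottom(uint32_t top)
{
    assert(top >= KSTKSIZE);
    return top - KSTKSIZE;
}
```

With the top at 0xf0110000 the stack occupies 0xf0108000 up to 0xf0110000, 32KB in total. 0xf0108000 is also the VMA of .data in the objdump output, which fits with the stack being reserved at the start of that section.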
Exercise 10. To become familiar with the C calling conventions on the x86, find the address of the `test_backtrace` function in obj/kern/kernel.asm, set a breakpoint there, and examine what happens each time it gets called after the kernel starts. How many 32-bit words does each recursive nesting level of `test_backtrace` push on the stack, and what are those words? Note that, for this exercise to work properly, you should be using the patched version of QEMU available on the tools page or on Athena. Otherwise, you'll have to manually translate all breakpoint and memory addresses to linear addresses.
The entry address of test_backtrace is 0xf0100040. After setting a breakpoint there and letting the kernel run, the final output is:
entering test_backtrace 5
entering test_backtrace 4
entering test_backtrace 3
entering test_backtrace 2
entering test_backtrace 1
entering test_backtrace 0
leaving test_backtrace 0
leaving test_backtrace 1
leaving test_backtrace 2
leaving test_backtrace 3
leaving test_backtrace 4
leaving test_backtrace 5
Welcome to the JOS kernel monitor!
Each call to test_backtrace takes up 8 32-bit words (0x20 bytes) of stack, as the disassembly shows:
// Test the stack backtrace function (lab 1 only)
void
test_backtrace(int x)
{
f0100040:       55                      push   %ebp
f0100041:       89 e5                   mov    %esp,%ebp
f0100043:       53                      push   %ebx
f0100044:       83 ec 14                sub    $0x14,%esp
f0100047:       8b 5d 08                mov    0x8(%ebp),%ebx
    cprintf("entering test_backtrace %d\n", x);
f010004a:       89 5c 24 04             mov    %ebx,0x4(%esp)
f010004e:       c7 04 24 e0 18 10 f0    movl   $0xf01018e0,(%esp)
f0100055:       e8 d7 08 00 00          call   f0100931 <cprintf>
    if (x > 0)
f010005a:       85 db                   test   %ebx,%ebx
f010005c:       7e 0d                   jle    f010006b <test_backtrace+0x2b>
        test_backtrace(x-1);
f010005e:       8d 43 ff                lea    -0x1(%ebx),%eax
f0100061:       89 04 24                mov    %eax,(%esp)
f0100064:       e8 d7 ff ff ff          call   f0100040 <test_backtrace>
f0100069:       eb 1c                   jmp    f0100087 <test_backtrace+0x47>
    else
        mon_backtrace(0, 0, 0);
f010006b:       c7 44 24 08 00 00 00    movl   $0x0,0x8(%esp)
f0100072:       00
f0100073:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%esp)
f010007a:       00
f010007b:       c7 04 24 00 00 00 00    movl   $0x0,(%esp)
f0100082:       e8 18 07 00 00          call   f010079f <mon_backtrace>
    cprintf("leaving test_backtrace %d\n", x);
f0100087:       89 5c 24 04             mov    %ebx,0x4(%esp)
f010008b:       c7 04 24 fc 18 10 f0    movl   $0xf01018fc,(%esp)
f0100092:       e8 9a 08 00 00          call   f0100931 <cprintf>
}
f0100097:       83 c4 14                add    $0x14,%esp
f010009a:       5b                      pop    %ebx
f010009b:       5d                      pop    %ebp
f010009c:       c3                      ret
The words are: the return address pushed by call, the saved ebp, the saved ebx, and 5 words (0x14 bytes) reserved by sub $0x14,%esp. The argument x for the next level is not pushed separately; it is stored with mov %eax,(%esp) into the lowest of those reserved words. Saving the return address and ebp is routine, so I won't explain it. Pushing ebx may raise questions; see Why are these registers pushed to stack?
Next exercise:
Exercise 11. Implement the backtrace function as specified above. Use the same format as in the example, since otherwise the grading script will be confused. When you think you have it working right, run make grade to see if its output conforms to what our grading script expects, and fix it if it doesn't. _After_ you have handed in your Lab 1 code, you are welcome to change the output format of the backtrace function any way you like.
If you use read_ebp(), note that GCC may generate "optimized" code that calls read_ebp() before mon_backtrace()'s function prologue, which results in an incomplete stack trace (the stack frame of the most recent function call is missing). While we have tried to disable optimizations that cause this reordering, you may want to examine the assembly of mon_backtrace() and make sure the call to read_ebp() is happening after the function prologue.
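This exercise mostly relies on x86-calling-conventions. The key facts are that the word ebp points to holds the previous stack frame's ebp, ebp+4 holds the return address, ebp+8 holds the first argument, and the initial value of ebp is 0.
The final implementation: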
int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
    // Your code here.
    uint32_t *ebp = (uint32_t*)read_ebp();
    int i;

    // ebp was cleared to 0 in entry.S, so 0 marks the outermost frame.
    while (ebp)
    {
        // ebp[1] is the return address of the current frame.
        cprintf("ebp %08x eip %08x ", ebp, *(ebp+1));
        cprintf("args");
        // ebp[2..6] are the first five argument slots.
        for (i = 2; i < 7; i++)
        {
            cprintf(" %08x", *(ebp+i));
        }
        cprintf("\n");
        // Follow the saved ebp to the caller's frame.
        ebp = (uint32_t*)*ebp;
    }
    return 0;
}
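The saved-ebp chain that this loop follows can be illustrated on a toy stack in ordinary user-space C. In the sketch below "addresses" are just indices into an array, index 0 plays the role of the terminating ebp = 0, and walk_frames is a hypothetical helper; it mirrors the loop structure above rather than touching any real stack.

```c
#include <assert.h>
#include <stdint.h>

/* Walk a toy stack: the slot at index ebp holds the caller's ebp and the
 * slot at ebp + 1 holds the return address. Stores up to max_frames return
 * addresses in eips and returns how many frames were visited. */
int walk_frames(const uint32_t *stack, uint32_t nslots, uint32_t ebp,
                uint32_t *eips, int max_frames)
{
    int n = 0;

    assert(stack != 0 && eips != 0);
    while (ebp != 0 && n < max_frames) {
        assert(ebp + 1 < nslots);
        eips[n++] = stack[ebp + 1];   /* return address of this frame */
        ebp = stack[ebp];             /* follow the saved-ebp link */
    }
    return n;
}
```

For a stack laid out as {0, 0, 0, 0xf010003e, 2, 0xf01000f1, 4, 0xf010006d}, starting at index 6 visits three frames, innermost first, and stops when the saved ebp is 0.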
Then comes the last exercise:
Exercise 12. Modify your stack backtrace function to display, for each eip, the function name, source file name, and line number corresponding to that eip.
In debuginfo_eip, where do __STAB_* come from? This question has a long answer; to help you to discover the answer, here are some things you might want to do:
* look in the file kern/kernel.ld for __STAB_*
* run objdump -h obj/kern/kernel
* run objdump -G obj/kern/kernel
* run gcc -pipe -nostdinc -O2 -fno-builtin -I. -MD -Wall -Wno-format -DJOS_KERNEL -gstabs -c -S kern/init.c, and look at init.s.
* see if the bootloader loads the symbol table in memory as part of loading the kernel binary
Complete the implementation of debuginfo_eip by inserting the call to stab_binsearch to find the line number for an address.
Add a backtrace command to the kernel monitor, and extend your implementation of mon_backtrace to call debuginfo_eip and print a line for each stack frame of the form:
K> backtrace
Stack backtrace:
  ebp f010ff78  eip f01008ae  args 00000001 f010ff8c 00000000 f0110580 00000000
         kern/monitor.c:143: monitor+106
  ebp f010ffd8  eip f0100193  args 00000000 00001aac 00000660 00000000 00000000
         kern/init.c:49: i386_init+59
  ebp f010fff8  eip f010003d  args 00000000 00000000 0000ffff 10cf9a00 0000ffff
         kern/entry.S:70: <unknown>+0
K>
Each line gives the file name and line within that file of the stack frame's eip, followed by the name of the function and the offset of the eip from the first instruction of the function (e.g., monitor+106 means the return eip is 106 bytes past the beginning of monitor).
Be sure to print the file and function names on a separate line, to avoid confusing the grading script.
Tip: printf format strings provide an easy, albeit obscure, way to print non-null-terminated strings like those in STABS tables. printf("%.*s", length, string) prints at most length characters of string. Take a look at the printf man page to find out why this works.
You may find that some functions are missing from the backtrace. For example, you will probably see a call to monitor() but not to runcmd(). This is because the compiler in-lines some function calls. Other optimizations may cause you to see unexpected line numbers. If you get rid of the -O2 from GNUMakefile, the backtraces may make more sense (but your kernel will run more slowly).
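The "%.*s" tip is worth trying in isolation, since it is what makes printing eip_fn_name possible without copying it. Below is a user-space sketch using the standard snprintf; format_fn_name is a made-up helper, and the sample string "monitor:F(0,1)" is only my guess at what a function stab string looks like, modelled on the type strings shown later in the x/8s output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy the function name out of a stab-style string: everything before the
 * first ':' (or the whole string if there is none). Returns the name length. */
int format_fn_name(char *out, size_t outsz, const char *stab_str)
{
    assert(out != NULL && stab_str != NULL && outsz > 0);
    const char *colon = strchr(stab_str, ':');
    int namelen = colon ? (int)(colon - stab_str) : (int)strlen(stab_str);

    /* "%.*s" takes the maximum length as an int argument before the string. */
    return snprintf(out, outsz, "%.*s", namelen, stab_str);
}
```

format_fn_name(buf, sizeof buf, "monitor:F(0,1)") leaves "monitor" in buf. The precision has to be an int, which is why the length is computed as one.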
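First we need to learn a bit about stabs, which is, in short, a debugging data format. For details see stabs and Debugging formats DWARF and STAB.
The output of objdump -h obj/kern/kernel is: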
obj/kern/kernel: file format elf32-i386
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00001937 f0100000 00100000 00001000 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .rodata 0000079c f0101940 00101940 00002940 2**5
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .stab 000038e9 f01020dc 001020dc 000030dc 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .stabstr 000018f0 f01059c5 001059c5 000069c5 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .data 0000a300 f0108000 00108000 00009000 2**12
CONTENTS, ALLOC, LOAD, DATA
5 .bss 00000648 f0112300 00112300 00013300 2**5
CONTENTS, ALLOC, LOAD, DATA
6 .comment 00000023 00000000 00000000 00013948 2**0
CONTENTS, READONLY
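We can see that the link address (VMA) of the .stabstr section is f01059c5.
Then debug with gdb: first break at 0x10000c, the kernel's entry point once the boot loader has loaded it, and single-step a few instructions until paging is enabled. Examining address 0xf01059c5 at that point gives the following, which shows that the boot loader loaded the symbol table into memory along with the kernel: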
(gdb) x/8s 0xf01059c5
0xf01059c5: ""
0xf01059c6: "{standard input}"
0xf01059d7: "kern/entry.S"
0xf01059e4: "kern/entrypgdir.c"
0xf01059f6: "gcc2_compiled."
0xf0105a05: "int:t(0,1)=r(0,1);-2147483648;2147483647;"
0xf0105a2f: "char:t(0,2)=r(0,2);0;127;"
0xf0105a49: "long int:t(0,3)=r(0,3);-2147483648;2147483647;"
Next, look at the file we need to complete, kern/kdebug.c:
int
debuginfo_eip(uintptr_t addr, struct Eipdebuginfo *info)
{
    const struct Stab *stabs, *stab_end;
    const char *stabstr, *stabstr_end;
    int lfile, rfile, lfun, rfun, lline, rline;

    // Initialize *info
    info->eip_file = "<unknown>";
    info->eip_line = 0;
    info->eip_fn_name = "<unknown>";
    info->eip_fn_namelen = 9;
    info->eip_fn_addr = addr;
    info->eip_fn_narg = 0;

    // Find the relevant set of stabs
    if (addr >= ULIM) {
        stabs = __STAB_BEGIN__;
        stab_end = __STAB_END__;
        stabstr = __STABSTR_BEGIN__;
        stabstr_end = __STABSTR_END__;
    } else {
        // Can't search for user-level addresses yet!
        panic("User address");
    }

    // String table validity checks
    if (stabstr_end <= stabstr || stabstr_end[-1] != 0)
        return -1;

    // Now we find the right stabs that define the function containing
    // 'eip'.  First, we find the basic source file containing 'eip'.
    // Then, we look in that source file for the function.  Then we look
    // for the line number.

    // Search the entire set of stabs for the source file (type N_SO).
    lfile = 0;
    rfile = (stab_end - stabs) - 1;
    stab_binsearch(stabs, &lfile, &rfile, N_SO, addr);
    if (lfile == 0)
        return -1;

    // Search within that file's stabs for the function definition
    // (N_FUN).
    lfun = lfile;
    rfun = rfile;
    stab_binsearch(stabs, &lfun, &rfun, N_FUN, addr);

    if (lfun <= rfun) {
        // stabs[lfun] points to the function name
        // in the string table, but check bounds just in case.
        if (stabs[lfun].n_strx < stabstr_end - stabstr)
            info->eip_fn_name = stabstr + stabs[lfun].n_strx;
        info->eip_fn_addr = stabs[lfun].n_value;
        addr -= info->eip_fn_addr;
        // Search within the function definition for the line number.
        lline = lfun;
        rline = rfun;
    } else {
        // Couldn't find function stab!  Maybe we're in an assembly
        // file.  Search the whole file for the line number.
        info->eip_fn_addr = addr;
        lline = lfile;
        rline = rfile;
    }
    // Ignore stuff after the colon.
    info->eip_fn_namelen = strfind(info->eip_fn_name, ':') - info->eip_fn_name;


    // Search within [lline, rline] for the line number stab.
    // If found, set info->eip_line to the right line number.
    // If not found, return -1.
    //
    // Hint:
    //  There's a particular stabs type used for line numbers.
    //  Look at the STABS documentation and <inc/stab.h> to find
    //  which one.
    // use N_SLINE

    // Your code here.


    // Search backwards from the line number for the relevant filename
    // stab.
    // We can't just use the "lfile" stab because inlined functions
    // can interpolate code from a different file!
    // Such included source files use the N_SOL stab type.
    while (lline >= lfile
           && stabs[lline].n_type != N_SOL
           && (stabs[lline].n_type != N_SO || !stabs[lline].n_value))
        lline--;
    if (lline >= lfile && stabs[lline].n_strx < stabstr_end - stabstr)
        info->eip_file = stabstr + stabs[lline].n_strx;


    // Set eip_fn_narg to the number of arguments taken by the function,
    // or 0 if there was no containing function.
    if (lfun < rfun)
        for (lline = lfun + 1;
             lline < rfun && stabs[lline].n_type == N_PSYM;
             lline++)
            info->eip_fn_narg++;

    return 0;
}
The part to fill in... turns out to be easy to write? Two binary searches have already been done before the one we need to add, so we can just follow the same pattern.
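    stab_binsearch(stabs, &lline, &rline, N_SLINE, addr);
    // As with the N_FUN search above, lline > rline means nothing matched.
    if (lline > rline)
        return -1;
    // stabs[lline] is the line stab containing addr; n_desc holds the line number.
    info->eip_line = stabs[lline].n_desc;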
Then modify mon_backtrace in monitor.c to call debuginfo_eip, which is also easy.
int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
    // Your code here.
    uint32_t *ebp = (uint32_t*)read_ebp();
    cprintf("Stack backtrace:\n");
    int i;
    struct Eipdebuginfo info;
    while (ebp)
    {
        // ebp[1] is the return address of the current frame.
        uint32_t eip = ebp[1];
        cprintf("ebp %08x eip %08x ", ebp, eip);
        cprintf("args");
        for (i = 2; i < 7; i++)
        {
            cprintf(" %08x", *(ebp+i));
        }
        cprintf("\n");
        // Look up file, line and function for this return address.
        int status = debuginfo_eip(eip, &info);
        if (status == 0)
        {
            cprintf("%s:%d: ", info.eip_file, info.eip_line);
            // eip_fn_name is not NUL-terminated at the name's end, hence %.*s.
            cprintf("%.*s+%d\n", info.eip_fn_namelen, info.eip_fn_name, eip - info.eip_fn_addr);
        }
        ebp = (uint32_t*)*ebp;
    }

    return 0;
}
The final result looks roughly like this:
entering test_backtrace 5
entering test_backtrace 4
entering test_backtrace 3
entering test_backtrace 2
entering test_backtrace 1
entering test_backtrace 0
Stack backtrace:
ebp f0110ec8 eip f0100b09 args f0102499 f0102499 f0100b09 00000000 f0100d9c
kern/monitor.c:66: mon_backtrace+26
ebp f0110f18 eip f010008b args 00000000 00000000 00000000 00000000 f0102238
kern/init.c:19: test_backtrace+75
ebp f0110f38 eip f010006d args 00000000 00000001 f0110f64 00000000 f0102238
kern/init.c:16: test_backtrace+45
ebp f0110f58 eip f010006d args 00000001 00000002 f0110f84 00000000 f0102238
kern/init.c:16: test_backtrace+45
ebp f0110f78 eip f010006d args 00000002 00000003 f0110fa4 00000000 f0102238
kern/init.c:16: test_backtrace+45
ebp f0110f98 eip f010006d args 00000003 00000004 f0110fc4 00000000 f010226f
kern/init.c:16: test_backtrace+45
ebp f0110fb8 eip f010006d args 00000004 00000005 f0110fe4 00000000 00000000
kern/init.c:16: test_backtrace+45
ebp f0110fd8 eip f01000f1 args 00000005 00001aac 00000640 00000000 00000000
kern/init.c:43: i386_init+81
ebp f0110ff8 eip f010003e args 00000003 00001003 00002003 00003003 00004003
kern/entry.S:83: <unknown>+0
leaving test_backtrace 0
leaving test_backtrace 1
leaving test_backtrace 2
leaving test_backtrace 3
leaving test_backtrace 4
leaving test_backtrace 5
Welcome to the JOS kernel monitor!
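If some functions are missing from the output above, they may have been optimized away (inlined); try changing the compiler options in the makefile from O2 or O1 to O0.
With that, we have finished all of lab 1. Hooray~
It took thirty hours... but I really learned a lot. It felt like playing a puzzle game, with the questions around each exercise serving as the clues.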