The x86-64 doesn't just add two levels to page tables to support 64 bit
addresses, but is a different processor. For example, calling conventions,
system calls, and segmentation are different from 32-bit x86. Segmentation is
basically gone, but gs/fs in combination with MSRs can be used to hold a
per-core pointer. In general, x86-64 is more straightforward than 32-bit
x86. The port uses code from sv6 and the xv6 "rsc-amd64" branch.
A summary of the changes is as follows:
- Booting: switch to grub instead of xv6's bootloader (pass -kernel to qemu),
because xv6's boot loader doesn't understand 64bit ELF files. And, we don't
care anymore about booting.
- Makefile: use -m64 instead of -m32 flag for gcc, delete boot loader, xv6.img,
bochs, and memfs. For now dont' use -O2, since usertests with -O2 is bigger than
MAXFILE!
- Update gdb.tmpl to be for i386 or x86-64
- Console/printf: use stdarg.h and treat 64-bit addresses different from ints
(32-bit)
- Update elfhdr to be 64 bit
- entry.S/entryother.S: add code to switch to 64-bit mode: build a simple page
table in 32-bit mode before switching to 64-bit mode, share code for entering
boot processor and APs, and tweak boot gdt. The boot gdt is the gdt that the
kernel proper also uses. (In 64-bit mode, the gdt/segmentation and task state
mostly disappear.)
- exec.c: fix passing argv (64-bit now instead of 32-bit).
- initcode.c: use syscall instead of int.
- kernel.ld: load kernel very high, in top terabyte. 64 bits is a lot of
address space!
- proc.c: initial return is through new syscall path instead of trapret.
- proc.h: update struct cpu to have some scratch space since syscall saves less
state than int, update struct context to reflect x86-64 calling conventions.
- swtch: simplify for x86-64 calling conventions.
- syscall: add fetcharg to handle x86-64 calling convetions (6 arguments are
passed through registers), and fetchaddr to read a 64-bit value from user space.
- sysfile: update to handle pointers from user space (e.g., sys_exec), which are
64 bits.
- trap.c: no special trap vector for sys calls, because x86-64 has a different
plan for system calls.
- trapasm: one plan for syscalls and one plan for traps (interrupt and
exceptions). On x86-64, the kernel is responsible for switching user/kernel
stacks. To do, xv6 keeps some scratch space in the cpu structure, and uses MSR
GS_KERN_BASE to point to the core's cpu structure (using swapgs).
- types.h: add uint64, and change pde_t to uint64
- usertests: exit() when fork fails, which helped in tracking down one of the
bugs in the switch from 32-bit to 64-bit
- vectors: update to make them 64 bits
- vm.c: use bootgdt in kernel too, program MSRs for syscalls and core-local
state (for swapgs), walk 4 levels in walkpgdir, add DEVSPACETOP, use task
segment to set kernel stack for interrupts (but simpler than in 32-bit mode),
add an extra argument to freevm (size of user part of address space) to avoid
checking all entries till KERNBASE (there are MANY TB before the top 1TB).
- x86: update trapframe to have 64-bit entries, which is what the processor
pushes on syscalls and traps. simplify lgdt and lidt, using struct desctr,
which needs the gcc directives packed and aligned.
TODO:
- use int32 instead of int?
- simplify curproc(). xv6 has per-cpu state again, but this time it must have it.
- avoid repetition in walkpgdir
- fix validateint() in usertests.c
- fix bugs (e.g., observed one a case of entering kernel with invalid gs or proc
There is a potential memory leak when mappages() fails inside setupkvm().
A call to freevm() is added in this case so as to reclaim the lost mapping pages.
myproc() points to a different thread.
myproc();
sched();
myproc(); // this proc maybe different than the one before sched
Thus, in a function that operates on one thread better to retrieve the
current process once at the start of the function.
switchuvm() is supposed to switch the TSS and page table to the
process p it is passed. Alas, instead of using p to access the
kstack field, it used the global proc. This worked fine because
(a) most uses of switchuvm() pass proc anyway and (b) because in
the schedule, where we call switchuvm with the newly scheduled
process, we actually set the global proc before the call. But I
think it's still a bug, even if it never broke a test case. :-)
when a zero PDE is encountered while searching for present PTEs to free,
resume searching at first entry of the next page table instead of the
current entry of the next page table.
Previously, deallocuvm scanned from 0 to KERNBASE in one page
increments, which had a noticable effect on boot time. Now it skips
over missing page directories.
Allocate proper kernel page table immediately in main using boot allocator
Remove pginit
Simplify address space layout a tiny bit
More to come (e.g., superpages to simplify static table)