no more proc[] entry per cpu for idle loop
each cpu[] has its own gdt and tss
no per-proc gdt or tss, re-write cpu's in scheduler (you win, cliff)
main0() switches to cpu[0].mpstack
parent 69332d1918
commit 350e63f7a9
8 changed files with 455 additions and 615 deletions

Notes (283)

@@ -22,32 +22,14 @@ no kernel malloc(), just kalloc() for user core
user pointers aren't valid in the kernel

setting up first process
  we do want a process zero, as template
  but not runnable
  just set up return-from-trap frame on new kernel stack
  fake user program that calls exec

  map text read-only?
  shared text?

what's on the stack during a trap or sys call?
  PUSHA before scheduler switch? for callee-saved registers.
  segment contents?
  what does iret need to get out of the kernel?
  how does INT know what kernel stack to use?

are interrupts turned on in the kernel? probably.

per-cpu curproc
one tss per process, or one per cpu?
one segment array per cpu, or per process?
are interrupts turned on in the kernel? yes.

pass curproc explicitly, or implicit from cpu #?
  e.g. argument to newproc()?
  hmm, you need a global curproc[cpu] for trap() &c

test stack expansion
  no stack expansion

test running out of memory, process slots

we can't really use a separate stack segment, since stack addresses
@@ -56,16 +38,6 @@ data vs text. how can we have a gap between data and stack, so that
both can grow, without committing 4GB of physical memory? does this
mean we need paging?

what's the simplest way to add the paging we need?
  one page table, re-write it each time we leave the kernel?
  page table per process?
probably need to use 0-0xffffffff segments, so that
  both data and stack pointers always work (see sketch below)
  so is it now worth it to make a process's phys mem contiguous?
or could use segment limits and 4 meg pages?
  but limits would prevent using stack pointers as data pointers
how to write-protect text? not important?
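A sketch of the flat-segment idea above, with xv6-style SEG/gdt names (the macro, slot names, and the struct cpu *c are assumptions, not this tree's code): both user segments cover all of 0-0xffffffff, so a stack pointer is always a valid data pointer, and isolation would then have to come from paging.

  // Sketch only: flat 4GB user segments (names assumed).
  // With identical base/limit, data and stack pointers are
  // interchangeable; protection must come from paging instead.
  c->gdt[SEG_UCODE] = SEG(STA_X|STA_R, 0, 0xffffffff, DPL_USER);
  c->gdt[SEG_UDATA] = SEG(STA_W, 0, 0xffffffff, DPL_USER);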
perhaps have fixed-size stack, put it in the data segment?

oops, if kernel stack is in contiguous user phys mem, then moving

@@ -87,19 +59,6 @@ test children being inherited by grandparent &c

some sleep()s should be interruptible by kill()

cli/sti in acquire/release should nest!
  in case you acquire two locks
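One way to get that nesting (a sketch, not necessarily what this tree does; pushcli/popcli and the counter names are invented here): acquire calls pushcli() instead of cli(), release calls popcli() instead of sti(), and only the outermost popcli() re-enables interrupts, restoring whatever FL_IF was at the first pushcli().

  // Sketch: nestable cli/sti (names invented).  This state should be
  // per-cpu in a real kernel.
  static int ncli;     // depth of pushcli nesting
  static int intena;   // was FL_IF set before the outermost pushcli?

  void
  pushcli(void)
  {
    int eflags = read_eflags();
    cli();
    if(ncli++ == 0)
      intena = eflags & FL_IF;
  }

  void
  popcli(void)
  {
    if(--ncli == 0 && intena)
      sti();
  }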
what would need fixing if we got rid of kernel_lock?
  console output
  proc_exit() needs lock on proc *array* to deallocate
  kill() needs lock on proc *array*
  allocator's free list
  global fd table (really free-ness)
  sys_close() on fd table
  fork on proc list, also next pid
  hold lock until public slots in proc struct initialized

locks
  init_lock
    sequences CPU startup
@@ -110,37 +69,17 @@ locks
  memory allocator
  printf

wakeup needs proc_table_lock
  so we need recursive locks?
  or you must hold the lock to call wakeup?

in general, the table locks protect both free-ness and
public variables of table elements
in many cases you can use table elements w/o a lock
  e.g. if you are the process, or you are using an fd

lock code shouldn't call cprintf...

nasty hack to allow locks before first process,
and to allow them in interrupts when curproc may be zero

race between release and sleep in sys_wait()
race between sys_exit waking up parent and setting state=ZOMBIE
race in pipe code when full/empty

lock order
  per-pipe lock
  proc_table_lock fd_table_lock kalloc_lock
  console_lock

condition variable + mutex that protects it
  proc * (for wait()), proc_table_lock
  pipe structure, pipe lock

systematic way to test sleep races?
  print something at the start of sleep?

do you have to be holding the mutex in order to call wakeup()?
do you have to be holding the mutex in order to call wakeup()? yes
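A sketch of the discipline that "yes" implies, in the wait()/exit() pairing named above. The two-argument sleep() shown here, which releases and reacquires the lock around the sleep, is an assumption about the interface, and the function names are invented for illustration.

  // Sleeper side (e.g. wait()): the condition check and the sleep both
  // happen under proc_table_lock, so the wakeup cannot slip in between.
  int
  wait_for_zombie(struct proc *p)
  {
    acquire(&proc_table_lock);
    while(p->state != ZOMBIE)
      sleep(p, &proc_table_lock);   // assumed: drops the lock while asleep
    release(&proc_table_lock);
    return p->pid;
  }

  // Waker side (e.g. exit()): holding the same lock makes the state
  // change and the wakeup atomic from the sleeper's point of view.
  void
  become_zombie(struct proc *p)
  {
    acquire(&proc_table_lock);
    p->state = ZOMBIE;
    wakeup(p);                      // called with the lock held, per the note
    release(&proc_table_lock);
  }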
device interrupts don't clear FL_IF
  so a recursive timer interrupt is possible
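The notes come back to this below ("fixing trap.c to make interrupts turn off FL_IF"). On x86 that is the trap-gate vs interrupt-gate distinction: the processor clears FL_IF only when vectoring through an interrupt gate. A sketch with an xv6-style SETGATE macro, where the second argument is istrap (the macro and the vectors[] table are assumptions here):

  // Sketch: install every vector as an interrupt gate (istrap == 0) so
  // FL_IF is cleared on entry and a device interrupt cannot
  // immediately nest on top of itself.
  int i;
  for(i = 0; i < 256; i++)
    SETGATE(idt[i], 0, SEG_KCODE<<3, vectors[i], 0);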
@@ -156,202 +95,11 @@ inode->count counts in-memory pointers to the struct
blocks and inodes have ad-hoc sleep-locks
  provide a single mechanism?

need to lock bufs in bio between bread and brelse

test 14-character file names
  and file arguments longer than 14
  and directories longer than one sector

kalloc() can return 0; do callers handle this right?

why directing interrupts to cpu 1 causes trouble
  cpu 1 turns on interrupts with no tss!
  and perhaps a stale gdt (from boot)
  since it has never run a process, never called setupsegs()
  but does cpu really need the tss?
    not switching stacks
  fake process per cpu, just for tss?
    seems like a waste
  move tss to cpu[]?
    but tss points to per-process kernel stack
    would also give us a gdt
  OOPS that wasn't the problem

wait for other cpu to finish starting before enabling interrupts?
  some kind of crash in ide_init ioapic_enable cprintf
  move ide_init before mp_start?
    didn't do any good
  maybe cpu0 taking ide interrupt, cpu1 getting a nested lock error

cprintfs are screwed up if locking is off
  often loops forever
  hah, just use lpt alone

looks like cpu0 took the ide interrupt and was the last to hold
the lock, but cpu1 thinks it is nested
  cpu0 is in load_icode / printf / cons_putc
    probably b/c cpu1 cleared use_console_lock
  cpu1 is in scheduler() / printf / acquire

1: init timer
0: init timer
cpu 1 initial nlock 1
ne0s:t iidd el_occnkt rc
onsole cpu 1 old caller stack 1001A5 10071D 104DFF 1049FE
panic: acquire
^CNext at t=33002418
(0) [0x00100091] 0008:0x00100091 (unk. ctxt): jmp .+0xfffffffe ; ebfe
(1) [0x00100332] 0008:0x00100332 (unk. ctxt): jmp .+0xfffffffe

why is output interleaved even before panic?

does release turn on interrupts even inside an interrupt handler?

overflowing cpu[] stack?
  probably not, change from 512 to 4096 didn't do anything

1: init timer
0: init timer
cnpeus te11 linnitki aclo nnoolleek cp1u
ss oarltd sccahleldeul esrt aocnk cpu 0111 Ej6 buf1 01A3140 C5118
0
la anic1::7 0a0c0 uuirr e
^CNext at t=31691050
(0) [0x00100373] 0008:0x00100373 (unk. ctxt): jmp .+0xfffffffe ; ebfe
(1) [0x00100091] 0008:0x00100091 (unk. ctxt): jmp .+0xfffffffe ; ebfe

cpu0:

0: init timer
nested lock console cpu 0 old caller stack 1001e6 101a34 1 0
(that's mpmain)
panic: acquire

cpu1:

1: init timer
cpu 1 initial nlock 1
start scheduler on cpu 1 jmpbuf ...
la 107000 lr ...
that is, nlock != 0

maybe a race; acquire does
  locked = 1
  cpu = cpu()
what if another acquire calls holding w/ locked = 1 but
before cpu is set?
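Spelled out, the suspected window (field names assumed): if acquire writes lock->locked before lock->cpu, another cpu's holding() can observe locked == 1 paired with a stale cpu value. An atomic test-and-set, with the owner recorded only after the lock is won, is one shape of the fix; sketch only, not this tree's code.

  // Racy order the note describes:
  //   lock->locked = 1;     // (A) visible to other cpus immediately
  //   lock->cpu = cpu();    // (B) holding() between (A) and (B) misjudges
  //
  // Sketch of a repair: take the lock atomically, then record the owner.
  void
  acquire(struct spinlock *lock)
  {
    while(xchg(&lock->locked, 1) != 0)   // atomic test-and-set; spin
      ;
    lock->cpu = cpu();                   // written only by the owner
  }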
if I type a lot (kbd), i get a panic
  cpu1 in scheduler: panic "holding locks in scheduler"
  cpu0 also in the same panic!
  recursive interrupt?
    FL_IF is probably set during interrupt... is that correct?
  again:
    olding locks in scheduler
    trap v 33 eip 100ED3 c (that is, interrupt while holding a lock)
    100ed3 is in lapic_write
  again:
    trap v 33 eip 102A3C cpu 1 nlock 1 (in acquire)
    panic: interrupt while holding a lock
  again:
    trap v 33 eip 102A3C cpu 1 nlock 1
    panic: interrupt while holding a lock
  OR is it the cprintf("kbd overflow")?
    no, get panic even w/o that cprintf
  OR a release() at interrupt time turns interrupts back on?
    of course i don't think they were off...
  OK, fixing trap.c to make interrupts turn off FL_IF
    that makes it take longer, but still panics
    (maybe b/c release sets FL_IF)

shouldn't something (PIC?) prevent recursive interrupts of same IRQ?
or should FL_IF be clear during all interrupts?

maybe acquire should remember old FL_IF value, release should restore
if acquire did cli()

DUH the increment of nlock in acquire() happens before the cli!
  so the panic is probably not a real problem
  test nlock, cli(), then increment?
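The fix being proposed, side by side with the buggy order (nlock is the per-cpu held-lock counter from these notes; a sketch of the reordering, nothing more):

  // Buggy: nlock is bumped while interrupts are still enabled, so a
  // timer interrupt can land with nlock != 0 and no lock actually
  // held, tripping "interrupt while holding a lock".
  //   cpus[cpu()].nlock++;
  //   cli();

  // Proposed: close the window by disabling interrupts first.
  cli();
  cpus[cpu()].nlock++;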
BUT now userfs doesn't do the final cat README

AND w/ cprintf("kbd overflow"), panic holding locks in scheduler
  maybe also simultaneous panic("interrupt while holding a lock")

again (holding down x key):
kbd overflow
kbd oaaniicloowh
olding locks in scheduler
trap v 33 eip 100F5F c^CNext at t=32166285
(0) [0x0010033e] 0008:0010033e (unk. ctxt): jmp .+0xfffffffe (0x0010033e) ; ebfe
(1) [0x0010005c] 0008:0010005c (unk. ctxt): jmp .+0xfffffffe (0x0010005c) ; ebfe
cpu0 panicked due to holding locks in scheduler
cpu1 got panic("interrupt while holding a lock")
  again in lapic_write.
  while re-enabling an IRQ?

again:
  cpu 0 panic("holding locks in scheduler")
    but didn't trigger related panics earlier in scheduler or sched()
    of course the panic is right after release() and thus sti()
    so we may be seeing an interrupt that left locks held
  cpu 1 unknown panic
  why does it happen to both cpus at the same time?

again:
  cpu 0 panic("holding locks in scheduler")
    but trap() didn't see any held locks on return
  cpu 1 no apparent panic

again:
  cpu 0 panic: holding too many locks in scheduler
  cpu 1 panic: kbd_intr returned while holding a lock

again:
  cpu 0 panic: holding too man
    la 10d70c lr 10027b
    those don't seem to be locks...
    only place non-constant lock is used is sleep()'s 2nd arg
    maybe register not preserved across context switch?
      it's in %esi...
      sched() doesn't touch %esi
      %esi is evidently callee-saved
    something to do with interrupts? since ordinarily it works
  cpu 1 panic: kbd_int returned while holding a lock
    la 107340 lr 107300
    console_lock and kbd_lock

maybe console_lock is often not released due to change
in use_console_lock (panic on other cpu)

again:
  cpu 0: panic: h...
    la 10D78C lr 102CA0
  cpu 1: panic: acquire FL_IF (later than cpu 0)

but if sleep() were acquiring random locks, we'd see panics
in release, after sleep() returned.
actually when system is idle, maybe no-one sleeps at all.
just scheduler() and interrupts

questions:
  does userfs use pipes? or fork?
    no
  does anything bad happen if process 1 exits? eg exit() in cat.c
    looks ok
  are there really no processes left?
lock_init() so we can have a magic number?

HMM maybe the variables at the end of struct cpu are being overwritten
  nlocks, lastacquire, lastrelease
  by cpu->stack?
  adding junk buffers maybe causes crash to take longer...
  when do we run on cpu stack?
    just in scheduler()?
    and interrupts from scheduler()

OH! recursive interrupts will use up any amount of cpu[].stack!
  underflow and wrecks *previous* cpu's struct
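The layout that makes this bite (field names from the notes; the stack size and field order are assumptions): if the per-cpu stack sits at the front of struct cpu, frames pushed below its base land in the tail of the previous element of cpu[], exactly where nlocks, lastacquire, and lastrelease live.

  // Sketch of the underflow (sizes and field order assumed):
  struct cpu {
    char stack[4096];   // scheduler/interrupt stack; grows down from stack+4096
    // ... other per-cpu state ...
    int nlocks;         // tail fields sit just below the NEXT element's stack
    int lastacquire;
    int lastrelease;
  };
  struct cpu cpus[NCPU];

  // Recursive interrupts on cpus[i] keep pushing frames past
  // cpus[i].stack[0] and into cpus[i-1].nlocks et al.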
@@ -360,15 +108,26 @@ mkdir
sh arguments
sh redirection
indirect blocks
two bugs in unlink: don't just return if nlink > 0,
and search for name, not inum
is there a create/create race for same file name?
  resulting in two entries w/ same name in directory?
why does shell often ignore first line of input?

test: one process unlinks a file while another links to it
test: simultaneous create of same file
test: one process opens a file while another deletes it
test: mkdir. deadlock d/.. vs ../d

wdir should use writei, to avoid special-case block allocation
  also readi
  is dir locked? probably
make proc[0] runnable
cpu early tss and gdt
how do we get cpu0 scheduler() to use mpstack, not proc[0].kstack?
when iget() first sleeps, where does it longjmp to?
  maybe set up proc[0] to be runnable, with entry proc0main(), then
  have main() call scheduler()?
  perhaps so proc[0] uses right kstack?
  and scheduler() uses mpstack?
ltr sets the busy bit in the TSS, faults if already set
  so gdt and TSS per cpu?
we don't want to be using some random process's gdt when it changes it.
  maybe get rid of per-proc gdt and tss
  one per cpu
  refresh it when needed
  setupsegs(proc *)
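A sketch of what setupsegs(proc *) might look like with the gdt and tss moved into cpu[], in the spirit of the commit message (the slot names, SEG/SEG16 macros, and proc fields are assumptions): the scheduler rewrites the current cpu's table for whichever process is about to run, and ltr no longer faults because the busy bit is set in a descriptor private to this cpu.

  // Sketch: per-cpu gdt and tss, rewritten per process (names assumed).
  void
  setupsegs(struct proc *p)
  {
    struct cpu *c = &cpus[cpu()];

    c->ts.ss0 = SEG_KDATA << 3;
    c->ts.esp0 = p ? (uint)(p->kstack + KSTACKSIZE) : 0;  // stack for traps

    c->gdt[SEG_KCODE] = SEG(STA_X|STA_R, 0, 0xffffffff, 0);
    c->gdt[SEG_KDATA] = SEG(STA_W, 0, 0xffffffff, 0);
    c->gdt[SEG_TSS] = SEG16(STS_T32A, (uint)&c->ts, sizeof(c->ts)-1, 0);
    c->gdt[SEG_TSS].s = 0;
    if(p){
      c->gdt[SEG_UCODE] = SEG(STA_X|STA_R, (uint)p->mem, p->sz-1, DPL_USER);
      c->gdt[SEG_UDATA] = SEG(STA_W, (uint)p->mem, p->sz-1, DPL_USER);
    }

    lgdt(c->gdt, sizeof(c->gdt));
    ltr(SEG_TSS << 3);   // busy bit lands in this cpu's own gdt copy
  }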