separate atomic test-and-set from memory barrier.

* use xchg only for its atomicness.
* use __sync_synchronize() for both CPU and compiler barrier.
Robert Morris 2016-08-12 07:03:35 -04:00
parent 9c65b32d9e
commit 20d05d4411
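
The first bullet of the commit message, using xchg only for its atomicity, can be illustrated with a small user-space sketch. This is not xv6's code: `xchg_relaxed`, `try_acquire`, and `xchg_demo` are hypothetical names, and the GCC/Clang builtin `__atomic_exchange_n` with `__ATOMIC_RELAXED` stands in for xv6's inline-assembly `xchg` to model "atomic, but no ordering promised to the compiler".

```c
#include <assert.h>

// Hypothetical stand-in for xv6's xchg(): atomically store newval and
// return the previous value. __ATOMIC_RELAXED asks for atomicity only;
// it makes no ordering promise about surrounding loads and stores.
static inline unsigned int
xchg_relaxed(volatile unsigned int *addr, unsigned int newval)
{
  return __atomic_exchange_n(addr, newval, __ATOMIC_RELAXED);
}

// Test-and-set: returns 1 if the lock was free and the caller now
// holds it, 0 if it was already held.
static inline int
try_acquire(volatile unsigned int *locked)
{
  return xchg_relaxed(locked, 1) == 0;
}

// Small self-check of the return values described above.
static inline void
xchg_demo(void)
{
  volatile unsigned int locked = 0;

  assert(try_acquire(&locked) == 1);     // lock was free: acquired
  assert(try_acquire(&locked) == 0);     // already held: attempt fails
  assert(xchg_relaxed(&locked, 0) == 1); // old value was 1; now free
  assert(locked == 0);
}
```

The exchange alone decides who wins the lock; anything about when the critical section's loads and stores may happen has to come from a separate barrier, which is the split this commit makes.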

@@ -29,11 +29,14 @@ acquire(struct spinlock *lk)
     panic("acquire");

   // The xchg is atomic.
-  // It also serializes, so that reads after acquire are not
-  // reordered before it.
   while(xchg(&lk->locked, 1) != 0)
     ;

+  // Tell the C compiler and the processor to not move loads or stores
+  // past this point, to ensure that the critical section's memory
+  // references happen after the lock is acquired.
+  __sync_synchronize();
+
   // Record info about lock acquisition for debugging.
   lk->cpu = cpu;
   getcallerpcs(&lk, lk->pcs);
@@ -49,16 +52,15 @@ release(struct spinlock *lk)
   lk->pcs[0] = 0;
   lk->cpu = 0;

-  // The xchg serializes, so that reads before release are
-  // not reordered after it.  The 1996 PentiumPro manual (Volume 3,
-  // 7.2) says reads can be carried out speculatively and in
-  // any order, which implies we need to serialize here.
-  // But the 2007 Intel 64 Architecture Memory Ordering White
-  // Paper says that Intel 64 and IA-32 will not move a load
-  // after a store. So lock->locked = 0 would work here.
-  // The xchg being asm volatile ensures gcc emits it after
-  // the above assignments (and after the critical section).
-  xchg(&lk->locked, 0);
+  // Tell the C compiler and the processor to not move loads or stores
+  // past this point, to ensure that all the stores in the critical
+  // section are visible to other cores before the lock is released.
+  // Both the C compiler and the hardware may re-order loads and
+  // stores; __sync_synchronize() tells them both to not re-order.
+  __sync_synchronize();
+
+  // Release the lock.
+  lk->locked = 0;

   popcli();
 }
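
The acquire/release pattern after this change can be tried outside the kernel with a minimal sketch. Several things here are assumptions rather than xv6 code: the interrupt handling (`pushcli`/`popcli`) and debugging fields are omitted, `__atomic_exchange_n` replaces the inline-assembly `xchg`, a relaxed `__atomic_store_n` replaces the plain `lk->locked = 0` so that the user-space version has no data race on the lock word, and `demo_lock`, `demo_counter`, `worker`, and `run_counter_demo` are hypothetical names for a pthread-based test harness.

```c
#include <assert.h>
#include <pthread.h>

// Hypothetical user-space analogue of xv6's struct spinlock.
struct spinlock {
  volatile unsigned int locked;  // 0 = free, 1 = held
};

static struct spinlock demo_lock;
static long demo_counter;        // protected by demo_lock

static void
acquire(struct spinlock *lk)
{
  // Atomic test-and-set only; ordering comes from the barrier below.
  while(__atomic_exchange_n(&lk->locked, 1, __ATOMIC_RELAXED) != 0)
    ;
  // Full barrier: critical-section accesses stay after the acquisition.
  __sync_synchronize();
}

static void
release(struct spinlock *lk)
{
  // Full barrier: critical-section accesses finish before the unlock.
  __sync_synchronize();
  // Relaxed atomic store standing in for the commit's lk->locked = 0.
  __atomic_store_n(&lk->locked, 0, __ATOMIC_RELAXED);
}

static void *
worker(void *arg)
{
  long iters = *(long *)arg;

  for(long i = 0; i < iters; i++){
    acquire(&demo_lock);
    demo_counter++;
    release(&demo_lock);
  }
  return 0;
}

// Starts nthreads workers (at most 16), each adding iters to the
// shared counter, and returns the final count.
static long
run_counter_demo(int nthreads, long iters)
{
  pthread_t tids[16];

  assert(nthreads > 0 && nthreads <= 16);
  demo_counter = 0;
  for(int i = 0; i < nthreads; i++){
    int rc = pthread_create(&tids[i], 0, worker, &iters);
    assert(rc == 0);
  }
  for(int i = 0; i < nthreads; i++)
    pthread_join(tids[i], 0);
  return demo_counter;
}
```

Calling `run_counter_demo(4, 50000)` should return 200000 if the lock provides mutual exclusion; building it may need the `-pthread` flag, depending on the toolchain.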