2007-10-26 02:16:54

by Nick Piggin

Subject: [interesting] smattering of possible memory ordering bugs

Hi,

Just out of interest, I did a grep for files containing test_and_set_bit
as well as clear_bit (excluding obvious ones like include/asm-*/bitops.h).

Quite a few interesting things turned up. There is a lot of stuff in drivers/*
that could be suspect WRT memory barriers, including lots I didn't touch.
Lots of work. But forget that...

Some surprises with more obvious bugs, for which I have added some cc's.
powerpc seems to have an mmu context allocation bug (and a possible
improvement in hpte locking with the new bitops).

floppy is using a crazy open-coded bit mutex; convert it to a regular mutex.

vt has something strange too.

fs-writeback could be made a little safer by properly closing the "critical"
section (the existing sections aren't really critical, but you never know
what the future holds ;))

jfs seems to be missing an obvious smp_mb__before_clear_bit (and a less
obvious smp_mb__after_clear_bit)

xfs seems to be missing an smp_mb__before_clear_bit

Lots of code that allocates things (eg. msi interrupts) is suspicious. I'm
not exactly sure if these allocation bitmaps are actually used to protect
data structures or not...

I'll have to get round to preparing proper patches for these things if I
don't hear nacks...


Attachments:
bitops-fixups.patch (23.84 kB)

2007-10-26 03:36:03

by Benjamin Herrenschmidt

Subject: Re: [interesting] smattering of possible memory ordering bugs


> Index: linux-2.6/arch/powerpc/mm/hash_native_64.c
> ===================================================================
> --- linux-2.6.orig/arch/powerpc/mm/hash_native_64.c
> +++ linux-2.6/arch/powerpc/mm/hash_native_64.c
> @@ -113,7 +113,7 @@ static inline void native_lock_hpte(stru
> unsigned long *word = &hptep->v;
>
> while (1) {
> - if (!test_and_set_bit(HPTE_LOCK_BIT, word))
> + if (!test_and_set_bit_lock(HPTE_LOCK_BIT, word))
> break;
> while(test_bit(HPTE_LOCK_BIT, word))
> cpu_relax();
> @@ -124,8 +124,7 @@ static inline void native_unlock_hpte(st
> {
> unsigned long *word = &hptep->v;
>
> - asm volatile("lwsync":::"memory");
> - clear_bit(HPTE_LOCK_BIT, word);
> + clear_bit_unlock(HPTE_LOCK_BIT, word);
> }

Ack.
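
For reference, what the hunk above converts is essentially an open-coded bit
spinlock. A minimal sketch of the acquire/release idiom with the new bitops
(the helper names and bit number here are illustrative, not from the patch):

#include <linux/bitops.h>
#include <asm/processor.h>      /* cpu_relax() */

#define DEMO_LOCK_BIT   0       /* the hpte code uses HPTE_LOCK_BIT */

static inline void demo_bit_lock(unsigned long *word)
{
        while (test_and_set_bit_lock(DEMO_LOCK_BIT, word)) {
                /* Spin on a plain read so we don't hammer the bus with
                 * atomic ops while somebody else holds the lock. */
                while (test_bit(DEMO_LOCK_BIT, word))
                        cpu_relax();
        }
}

static inline void demo_bit_unlock(unsigned long *word)
{
        /* Release semantics: every load and store in the critical
         * section is ordered before the bit is cleared (lwsync on
         * powerpc), which is why the open-coded lwsync goes away. */
        clear_bit_unlock(DEMO_LOCK_BIT, word);
}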

> Index: linux-2.6/arch/ppc/xmon/start.c
> ===================================================================
> --- linux-2.6.orig/arch/ppc/xmon/start.c
> +++ linux-2.6/arch/ppc/xmon/start.c
> @@ -92,16 +92,15 @@ xmon_write(void *handle, void *ptr, int
> {
> char *p = ptr;
> int i, c, ct;
> -
> -#ifdef CONFIG_SMP
> - static unsigned long xmon_write_lock;
> - int lock_wait = 1000000;
> + static DEFINE_SPINLOCK(xmon_write_lock);
> + int lock_udelay = 10000;
> int locked;
>
> - while ((locked = test_and_set_bit(0, &xmon_write_lock)) != 0)
> - if (--lock_wait == 0)
> + while (!(locked = spin_trylock(&xmon_write_lock))) {
> + udelay(1);
> + if (--lock_udelay == 0)
> break;
> -#endif
> + }
>
> if (!scc_initialized)
> xmon_init_scc();
> @@ -122,10 +121,9 @@ xmon_write(void *handle, void *ptr, int
> eieio();
> }
>
> -#ifdef CONFIG_SMP
> - if (!locked)
> - clear_bit(0, &xmon_write_lock);
> -#endif
> + if (locked)
> + spin_unlock(&xmon_write_lock);
> +
> return nb;
> }

Ah yeah, some dirt in xmon... ;-) ack.
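
One subtlety in the xmon conversion: the sense of "locked" inverts, because
test_and_set_bit() returns nonzero on failure (bit already set) while
spin_trylock() returns nonzero on success. A sketch of the resulting
trylock-with-timeout idiom (demo_lock and the timeout value are illustrative):

#include <linux/spinlock.h>
#include <linux/delay.h>

static DEFINE_SPINLOCK(demo_lock);

static void demo_locked_write(void)
{
        int lock_udelay = 10000;        /* give up after ~10ms of 1us polls */
        int locked;

        while (!(locked = spin_trylock(&demo_lock))) {
                udelay(1);
                if (--lock_udelay == 0)
                        break;          /* timed out: proceed unlocked */
        }

        /* ... emit the output here, locked or not, as xmon does ... */

        if (locked)
                spin_unlock(&demo_lock);
}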

> Index: linux-2.6/drivers/net/ibm_newemac/mal.c
> ===================================================================
> --- linux-2.6.orig/drivers/net/ibm_newemac/mal.c
> +++ linux-2.6/drivers/net/ibm_newemac/mal.c
> @@ -318,7 +318,7 @@ static irqreturn_t mal_rxde(int irq, voi
> void mal_poll_disable(struct mal_instance *mal, struct mal_commac *commac)
> {
> /* Spinlock-type semantics: only one caller disable poll at a time */
> - while (test_and_set_bit(MAL_COMMAC_POLL_DISABLED, &commac->flags))
> + while (test_and_set_bit_lock(MAL_COMMAC_POLL_DISABLED, &commac->flags))
> msleep(1);
>
> /* Synchronize with the MAL NAPI poller */
> @@ -327,8 +327,7 @@ void mal_poll_disable(struct mal_instanc
>
> void mal_poll_enable(struct mal_instance *mal, struct mal_commac *commac)
> {
> - smp_wmb();
> - clear_bit(MAL_COMMAC_POLL_DISABLED, &commac->flags);
> + clear_bit_unlock(MAL_COMMAC_POLL_DISABLED, &commac->flags);
>
> /* Feels better to trigger a poll here to catch up with events that
> * may have happened on this channel while disabled. It will most

Ack.
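
The MAL pair is the same idiom as the hpte lock, only sleeping instead of
spinning. A sketch with stand-in names for the driver's flag bit, the point
being what the release orders that the old smp_wmb() did not:

#include <linux/bitops.h>
#include <linux/delay.h>

#define DEMO_POLL_DISABLED      0       /* stand-in for MAL_COMMAC_POLL_DISABLED */

static void demo_poll_disable(unsigned long *flags)
{
        /* Sleeping "trylock" loop: only one caller may disable polling
         * at a time, and callers are allowed to block. */
        while (test_and_set_bit_lock(DEMO_POLL_DISABLED, flags))
                msleep(1);
}

static void demo_poll_enable(unsigned long *flags)
{
        /* clear_bit_unlock() orders *all* prior accesses before the
         * release; the smp_wmb() it replaces ordered only stores, so
         * loads inside the disabled section were not covered. */
        clear_bit_unlock(DEMO_POLL_DISABLED, flags);
}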

> Index: linux-2.6/include/asm-powerpc/mmu_context.h
> ===================================================================
> --- linux-2.6.orig/include/asm-powerpc/mmu_context.h
> +++ linux-2.6/include/asm-powerpc/mmu_context.h
> @@ -129,7 +129,7 @@ static inline void get_mmu_context(struc
> steal_context();
> #endif
> ctx = next_mmu_context;
> - while (test_and_set_bit(ctx, context_map)) {
> + while (test_and_set_bit_lock(ctx, context_map)) {
> ctx = find_next_zero_bit(context_map, LAST_CONTEXT+1, ctx);
> if (ctx > LAST_CONTEXT)
> ctx = 0;
> @@ -158,7 +158,7 @@ static inline void destroy_context(struc
> {
> preempt_disable();
> if (mm->context.id != NO_CONTEXT) {
> - clear_bit(mm->context.id, context_map);
> + clear_bit_unlock(mm->context.id, context_map);
> mm->context.id = NO_CONTEXT;
> #ifdef FEW_CONTEXTS
> atomic_inc(&nr_free_contexts);

I don't think the previous code was wrong... it's not a locked section
and we don't care about ordering previous stores. It's an allocation; it
should be fine. In general, bitmap allocators should be alright.

Ignore the FEW_CONTEXTS stuff for now :-) At this point, it's UP only
and will be replaced sooner or later.

> Index: linux-2.6/include/asm-ppc/mmu_context.h
> ===================================================================
> --- linux-2.6.orig/include/asm-ppc/mmu_context.h
> +++ linux-2.6/include/asm-ppc/mmu_context.h
> @@ -128,7 +128,7 @@ static inline void get_mmu_context(struc
> steal_context();
> #endif
> ctx = next_mmu_context;
> - while (test_and_set_bit(ctx, context_map)) {
> + while (test_and_set_bit_lock(ctx, context_map)) {
> ctx = find_next_zero_bit(context_map, LAST_CONTEXT+1, ctx);
> if (ctx > LAST_CONTEXT)
> ctx = 0;
> @@ -157,7 +157,7 @@ static inline void destroy_context(struc
> {
> preempt_disable();
> if (mm->context.id != NO_CONTEXT) {
> - clear_bit(mm->context.id, context_map);
> + clear_bit_unlock(mm->context.id, context_map);
> mm->context.id = NO_CONTEXT;
> #ifdef FEW_CONTEXTS
> atomic_inc(&nr_free_contexts);

Same as above.

Cheers,
Ben.


2007-10-26 03:54:33

by Nick Piggin

Subject: Re: [interesting] smattering of possible memory ordering bugs

On Friday 26 October 2007 13:35, Benjamin Herrenschmidt wrote:

[acks]

Thanks for those...

> > Index: linux-2.6/include/asm-powerpc/mmu_context.h
> > ===================================================================
> > --- linux-2.6.orig/include/asm-powerpc/mmu_context.h
> > +++ linux-2.6/include/asm-powerpc/mmu_context.h
> > @@ -129,7 +129,7 @@ static inline void get_mmu_context(struc
> > steal_context();
> > #endif
> > ctx = next_mmu_context;
> > - while (test_and_set_bit(ctx, context_map)) {
> > + while (test_and_set_bit_lock(ctx, context_map)) {
> > ctx = find_next_zero_bit(context_map, LAST_CONTEXT+1, ctx);
> > if (ctx > LAST_CONTEXT)
> > ctx = 0;
> > @@ -158,7 +158,7 @@ static inline void destroy_context(struc
> > {
> > preempt_disable();
> > if (mm->context.id != NO_CONTEXT) {
> > - clear_bit(mm->context.id, context_map);
> > + clear_bit_unlock(mm->context.id, context_map);
> > mm->context.id = NO_CONTEXT;
> > #ifdef FEW_CONTEXTS
> > atomic_inc(&nr_free_contexts);
>
> I don't think the previous code was wrong... it's not a locked section
> and we don't care about ordering previous stores. It's an allocation, it
> should be fine. In general, bitmap allocators should be allright.

Well if it is just allocating an arbitrary _number_ out of a bitmap
and nothing else (eg. like the pid allocator), then you don't need
barriers.
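
A sketch of that barrier-free case, i.e. a bitmap that hands out bare
numbers and publishes nothing else (names and size are illustrative):

#include <linux/bitops.h>

#define DEMO_MAX_IDS    128
static unsigned long demo_id_map[BITS_TO_LONGS(DEMO_MAX_IDS)];

static int demo_id_alloc(void)
{
        int id;

        do {
                id = find_first_zero_bit(demo_id_map, DEMO_MAX_IDS);
                if (id >= DEMO_MAX_IDS)
                        return -1;      /* map exhausted */
                /* If we raced with another allocator, retry. The atomic
                 * RMW is the only synchronisation needed, because the
                 * bit itself is the entire shared state being claimed. */
        } while (test_and_set_bit(id, demo_id_map));

        return id;
}

static void demo_id_free(int id)
{
        /* Plain clear_bit(): nothing was published under this bit, so
         * there is nothing to order against the release. */
        clear_bit(id, demo_id_map);
}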


> Ignore the FEW_CONTEXTS stuff for now :-) At this point, it's UP only
> and will be replaced sooner or later.

OK. Then I agree, provided you're doing the correct synchronisation
or flushing etc. when destroying a context (which presumably you are).

I'll drop those bits then.

Thanks,
Nick

2007-10-26 04:33:00

by Benjamin Herrenschmidt

Subject: Re: [interesting] smattering of possible memory ordering bugs


On Fri, 2007-10-26 at 13:47 +1000, Nick Piggin wrote:

> > I don't think the previous code was wrong... it's not a locked section
> > and we don't care about ordering previous stores. It's an allocation; it
> > should be fine. In general, bitmap allocators should be alright.
>
> Well if it is just allocating an arbitrary _number_ out of a bitmap
> and nothing else (eg. like the pid allocator), then you don't need
> barriers.

Yup, that's what it does.

Cheers,
Ben.


2007-10-26 13:45:44

by Dave Kleikamp

Subject: Re: [interesting] smattering of possible memory ordering bugs

Memory barriers aren't one of my strengths, but this appears correct.

Acked-by: Dave Kleikamp <[email protected]>

On Fri, 2007-10-26 at 12:09 +1000, Nick Piggin wrote:
> Index: linux-2.6/fs/jfs/jfs_metapage.c
> ===================================================================
> --- linux-2.6.orig/fs/jfs/jfs_metapage.c
> +++ linux-2.6/fs/jfs/jfs_metapage.c
> @@ -39,11 +39,12 @@ static struct {
> #endif
>
> #define metapage_locked(mp) test_bit(META_locked, &(mp)->flag)
> -#define trylock_metapage(mp) test_and_set_bit(META_locked, &(mp)->flag)
> +#define trylock_metapage(mp) test_and_set_bit_lock(META_locked, &(mp)->flag)
>
> static inline void unlock_metapage(struct metapage *mp)
> {
> - clear_bit(META_locked, &mp->flag);
> + clear_bit_unlock(META_locked, &mp->flag);
> + smp_mb__after_clear_bit();
> wake_up(&mp->wait);
> }
>
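
The reason the extra barrier matters here is presumably the classic
clear-bit-then-wake pattern: wake_up() must not be reordered before the
clear, or a waiter can re-test the bit, still see it set, and sleep through
the only wakeup. A sketch of both sides, with a simplified stand-in for
struct metapage (names are illustrative):

#include <linux/wait.h>
#include <linux/bitops.h>

#define DEMO_LOCKED     0       /* stand-in for META_locked */

struct demo_page {              /* simplified stand-in for struct metapage */
        unsigned long flag;
        wait_queue_head_t wait;
};

static void demo_lock_page(struct demo_page *mp)
{
        /* Waiter: sleep until the acquire-semantics trylock succeeds. */
        wait_event(mp->wait,
                   !test_and_set_bit_lock(DEMO_LOCKED, &mp->flag));
}

static void demo_unlock_page(struct demo_page *mp)
{
        clear_bit_unlock(DEMO_LOCKED, &mp->flag);
        /* Order the clear before wake_up()'s inspection of the wait
         * queue; without it the waking side and the sleeping side can
         * miss each other. */
        smp_mb__after_clear_bit();
        wake_up(&mp->wait);
}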

Thanks,
Shaggy
--
David Kleikamp
IBM Linux Technology Center