2008-10-02 05:55:42

by Andrew Morton

Subject: Re: [PATCH 2/3] Memory management livelock


> Subject: [PATCH 2/3] Memory management livelock

Please don't send multiple patches with identical titles - think up a
good, unique, meaningful title for each patch.

On Wed, 24 Sep 2008 14:52:18 -0400 (EDT) Mikulas Patocka <[email protected]> wrote:

> Avoid starvation when walking address space.
>
> Signed-off-by: Mikulas Patocka <[email protected]>
>

Please include a full changelog with each iteration of each patch.

That changelog should explain the reason for playing games with
bitlocks so Linus doesn't have kittens when he sees it.

> include/linux/pagemap.h | 1 +
> mm/filemap.c | 20 ++++++++++++++++++++
> mm/page-writeback.c | 37 ++++++++++++++++++++++++++++++++++++-
> mm/truncate.c | 24 +++++++++++++++++++++++-
> 4 files changed, 80 insertions(+), 2 deletions(-)
>
> Index: linux-2.6.27-rc7-devel/include/linux/pagemap.h
> ===================================================================
> --- linux-2.6.27-rc7-devel.orig/include/linux/pagemap.h 2008-09-24 02:57:37.000000000 +0200
> +++ linux-2.6.27-rc7-devel/include/linux/pagemap.h 2008-09-24 02:59:04.000000000 +0200
> @@ -21,6 +21,7 @@
> #define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
> #define AS_ENOSPC (__GFP_BITS_SHIFT + 1) /* ENOSPC on async write */
> #define AS_MM_ALL_LOCKS (__GFP_BITS_SHIFT + 2) /* under mm_take_all_locks() */
> +#define AS_STARVATION (__GFP_BITS_SHIFT + 3) /* an anti-starvation barrier */
>
> static inline void mapping_set_error(struct address_space *mapping, int error)
> {
> Index: linux-2.6.27-rc7-devel/mm/filemap.c
> ===================================================================
> --- linux-2.6.27-rc7-devel.orig/mm/filemap.c 2008-09-24 02:59:33.000000000 +0200
> +++ linux-2.6.27-rc7-devel/mm/filemap.c 2008-09-24 03:13:47.000000000 +0200
> @@ -269,10 +269,19 @@ int wait_on_page_writeback_range(struct
> int nr_pages;
> int ret = 0;
> pgoff_t index;
> + long pages_to_process;
>
> if (end < start)
> return 0;
>
> + /*
> + * Estimate the number of pages to process. If we process significantly
> + * more than this, someone is making writeback pages under us.
> + * We must pull the anti-starvation plug.
> + */
> + pages_to_process = bdi_stat(mapping->backing_dev_info, BDI_WRITEBACK);
> + pages_to_process += (pages_to_process >> 3) + 16;

This sequence appears twice and it would probably be clearer to
implement it in a well-commented function.

> pagevec_init(&pvec, 0);
> index = start;
> while ((index <= end) &&
> @@ -288,6 +297,10 @@ int wait_on_page_writeback_range(struct
> if (page->index > end)
> continue;
>
> + if (pages_to_process >= 0)
> + if (!pages_to_process--)
> + wait_on_bit_lock(&mapping->flags, AS_STARVATION, wait_action_schedule, TASK_UNINTERRUPTIBLE);

This is copied three times and perhaps also should be factored out.

Please note that an effort has been made to make mm/filemap.c look
presentable in an 80-col display.

> wait_on_page_writeback(page);
> if (PageError(page))
> ret = -EIO;
> @@ -296,6 +309,13 @@ int wait_on_page_writeback_range(struct
> cond_resched();
> }
>
> + if (pages_to_process < 0) {
> + smp_mb__before_clear_bit();
> + clear_bit(AS_STARVATION, &mapping->flags);
> + smp_mb__after_clear_bit();
> + wake_up_bit(&mapping->flags, AS_STARVATION);
> + }

This sequence is repeated three or four times and should be pulled out
into a well-commented function. That comment should explain the logic
behind the use of these barriers, please.


2008-10-05 22:12:27

by Mikulas Patocka

Subject: RFC: one-bit mutexes (was: Re: [PATCH 2/3] Memory management livelock)

Hi

I removed the repeated code and created new bit mutexes. They are
space-efficient mutexes that consume only one bit each. See the next 3 patches.

If you are concerned about the size of an inode, I can convert other
mutexes to bit mutexes: i_mutex and inotify_mutex. I could also create
bit_spinlock (one-bit spinlock that uses test_and_set_bit) and save space
for address_space->tree_lock, address_space->i_mmap_lock,
address_space->private_lock, inode->i_lock.

Look at it and say what you think about the idea of condensing mutexes
into single bits.

Mikulas

>
> > Subject: [PATCH 2/3] Memory management livelock
>
> Please don't send multiple patches with identical titles - think up a
> good, unique, meaningful title for each patch.
>
> On Wed, 24 Sep 2008 14:52:18 -0400 (EDT) Mikulas Patocka <[email protected]> wrote:
>
> > Avoid starvation when walking address space.
> >
> > Signed-off-by: Mikulas Patocka <[email protected]>
> >
>
> Please include a full changelog with each iteration of each patch.
>
> That changelog should explain the reason for playing games with
> bitlocks so Linus doesn't have kittens when he sees it.
>
> > include/linux/pagemap.h | 1 +
> > mm/filemap.c | 20 ++++++++++++++++++++
> > mm/page-writeback.c | 37 ++++++++++++++++++++++++++++++++++++-
> > mm/truncate.c | 24 +++++++++++++++++++++++-
> > 4 files changed, 80 insertions(+), 2 deletions(-)
> >
> > Index: linux-2.6.27-rc7-devel/include/linux/pagemap.h
> > ===================================================================
> > --- linux-2.6.27-rc7-devel.orig/include/linux/pagemap.h 2008-09-24 02:57:37.000000000 +0200
> > +++ linux-2.6.27-rc7-devel/include/linux/pagemap.h 2008-09-24 02:59:04.000000000 +0200
> > @@ -21,6 +21,7 @@
> > #define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
> > #define AS_ENOSPC (__GFP_BITS_SHIFT + 1) /* ENOSPC on async write */
> > #define AS_MM_ALL_LOCKS (__GFP_BITS_SHIFT + 2) /* under mm_take_all_locks() */
> > +#define AS_STARVATION (__GFP_BITS_SHIFT + 3) /* an anti-starvation barrier */
> >
> > static inline void mapping_set_error(struct address_space *mapping, int error)
> > {
> > Index: linux-2.6.27-rc7-devel/mm/filemap.c
> > ===================================================================
> > --- linux-2.6.27-rc7-devel.orig/mm/filemap.c 2008-09-24 02:59:33.000000000 +0200
> > +++ linux-2.6.27-rc7-devel/mm/filemap.c 2008-09-24 03:13:47.000000000 +0200
> > @@ -269,10 +269,19 @@ int wait_on_page_writeback_range(struct
> > int nr_pages;
> > int ret = 0;
> > pgoff_t index;
> > + long pages_to_process;
> >
> > if (end < start)
> > return 0;
> >
> > + /*
> > + * Estimate the number of pages to process. If we process significantly
> > + * more than this, someone is making writeback pages under us.
> > + * We must pull the anti-starvation plug.
> > + */
> > + pages_to_process = bdi_stat(mapping->backing_dev_info, BDI_WRITEBACK);
> > + pages_to_process += (pages_to_process >> 3) + 16;
>
> This sequence appears twice and it would probably be clearer to
> implement it in a well-commented function.
>
> > pagevec_init(&pvec, 0);
> > index = start;
> > while ((index <= end) &&
> > @@ -288,6 +297,10 @@ int wait_on_page_writeback_range(struct
> > if (page->index > end)
> > continue;
> >
> > + if (pages_to_process >= 0)
> > + if (!pages_to_process--)
> > + wait_on_bit_lock(&mapping->flags, AS_STARVATION, wait_action_schedule, TASK_UNINTERRUPTIBLE);
>
> This is copied three times and perhaps also should be factored out.
>
> Please note that an effort has been made to make mm/filemap.c look
> presentable in an 80-col display.
>
> > wait_on_page_writeback(page);
> > if (PageError(page))
> > ret = -EIO;
> > @@ -296,6 +309,13 @@ int wait_on_page_writeback_range(struct
> > cond_resched();
> > }
> >
> > + if (pages_to_process < 0) {
> > + smp_mb__before_clear_bit();
> > + clear_bit(AS_STARVATION, &mapping->flags);
> > + smp_mb__after_clear_bit();
> > + wake_up_bit(&mapping->flags, AS_STARVATION);
> > + }
>
> This sequence is repeated three or four times and should be pulled out
> into a well-commented function. That comment should explain the logic
> behind the use of these barriers, please.
>
>

2008-10-05 22:14:22

by Mikulas Patocka

Subject: [PATCH 1/3] bit mutexes

A bit mutex implementation.

A bit mutex is a space-efficient mutex that consumes exactly one bit in
the appropriate structure (if mutex debugging is turned off). It is
implemented as a bit in an unsigned long field. Other bits in the field
may be used for other purposes, if the test/set/clear_bit functions are
used to manipulate them. There is no wait queue in the structure containing
the bit mutex; the hashed wait queues for waiting on bits are used instead.

An additional structure, struct bit_mutex, is needed; it contains lock
debugging & dependency information. When the kernel is compiled without
lock debugging, this structure is empty.

Bit mutexes are used with functions
bit_mutex_init
bit_mutex_lock
bit_mutex_unlock
bit_mutex_is_locked
These functions take 3 arguments: pointer to the unsigned long field,
the bit position and pointer to struct bit_mutex.

Signed-off-by: Mikulas Patocka <[email protected]>

---
include/linux/bit-mutex.h | 98 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/wait.h | 8 +++
kernel/Makefile | 2
kernel/bit-mutex-debug.c | 55 +++++++++++++++++++++++++
kernel/wait.c | 7 +++
5 files changed, 169 insertions(+), 1 deletion(-)

Index: linux-2.6.27-rc8-devel/include/linux/bit-mutex.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.27-rc8-devel/include/linux/bit-mutex.h 2008-10-05 19:58:30.000000000 +0200
@@ -0,0 +1,98 @@
+#ifndef __LINUX_BIT_MUTEX_H
+#define __LINUX_BIT_MUTEX_H
+
+#include <linux/bitops.h>
+#include <linux/lockdep.h>
+#include <linux/wait.h>
+#include <linux/sched.h>
+
+struct bit_mutex {
+#ifdef CONFIG_DEBUG_MUTEXES
+ unsigned long *field;
+ int bit;
+ struct thread_info *owner;
+ const char *name;
+ void *magic;
+#endif
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+#endif
+};
+
+#ifndef CONFIG_DEBUG_MUTEXES
+
+static inline void bit_mutex_init(unsigned long *field, int bit, struct bit_mutex *mtx)
+{
+ clear_bit(bit, field);
+ smp_mb__after_clear_bit();
+}
+
+static inline void bit_mutex_lock(unsigned long *field, int bit, struct bit_mutex *mtx)
+{
+ wait_on_bit_lock(field, bit, wait_action_schedule, TASK_UNINTERRUPTIBLE);
+}
+
+static inline void bit_mutex_unlock(unsigned long *field, int bit, struct bit_mutex *mtx)
+{
+ smp_mb__before_clear_bit();
+ clear_bit(bit, field);
+ smp_mb__after_clear_bit();
+ wake_up_bit(field, bit);
+}
+
+static inline int bit_mutex_is_locked(unsigned long *field, int bit, struct bit_mutex *mtx)
+{
+ return test_bit(bit, field);
+}
+
+#define __DEBUG_BIT_MUTEX_INITIALIZER(field, bit, mtx)
+
+#else
+
+void __bit_mutex_init(unsigned long *field, int bit, struct bit_mutex *mtx, const char *name, struct lock_class_key *key);
+
+#define bit_mutex_init(field, bit, mtx) \
+do { \
+ static struct lock_class_key __key; \
+ __bit_mutex_init(field, bit, mtx, #mtx, &__key); \
+} while (0)
+
+void __bit_mutex_lock(unsigned long *field, int bit, struct bit_mutex *mtx, int subclass);
+
+#define bit_mutex_lock(field, bit, mtx) \
+ __bit_mutex_lock(field, bit, mtx, 0)
+
+void __bit_mutex_unlock(unsigned long *field, int bit, struct bit_mutex *mtx, int subclass);
+
+#define bit_mutex_unlock(field, bit, mtx) \
+ __bit_mutex_unlock(field, bit, mtx, 0)
+
+int bit_mutex_is_locked(unsigned long *field, int bit, struct bit_mutex *mtx);
+
+#define __DEBUG_BIT_MUTEX_INITIALIZER(field_, bit_, mtx) \
+ .magic = &(mtx), \
+ .field = (field_), \
+ .bit = (bit_)
+
+#endif
+
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+
+#define __DEP_MAP_BIT_MUTEX_INITIALIZER(field, bit, mtx) \
+ , .dep_map = { .name = #mtx }
+
+#else
+
+#define __DEP_MAP_BIT_MUTEX_INITIALIZER(field, bit, mtx)
+
+#endif
+
+
+#define __BIT_MUTEX_INITIALIZER(field, bit, mtx) \
+ { \
+ __DEBUG_BIT_MUTEX_INITIALIZER(field, bit, mtx) \
+ __DEP_MAP_BIT_MUTEX_INITIALIZER(field, bit, mtx)\
+ }
+
+#endif
Index: linux-2.6.27-rc8-devel/include/linux/wait.h
===================================================================
--- linux-2.6.27-rc8-devel.orig/include/linux/wait.h 2008-10-04 13:40:46.000000000 +0200
+++ linux-2.6.27-rc8-devel/include/linux/wait.h 2008-10-04 13:41:21.000000000 +0200
@@ -513,7 +513,13 @@ static inline int wait_on_bit_lock(void
return 0;
return out_of_line_wait_on_bit_lock(word, bit, action, mode);
}
-
+
+/**
+ * wait_action_schedule - this function can be passed to wait_on_bit or
+ * wait_on_bit_lock and it will just call schedule().
+ */
+int wait_action_schedule(void *);
+
#endif /* __KERNEL__ */

#endif
Index: linux-2.6.27-rc8-devel/kernel/wait.c
===================================================================
--- linux-2.6.27-rc8-devel.orig/kernel/wait.c 2008-10-04 13:37:24.000000000 +0200
+++ linux-2.6.27-rc8-devel/kernel/wait.c 2008-10-04 13:38:21.000000000 +0200
@@ -251,3 +251,10 @@ wait_queue_head_t *bit_waitqueue(void *w
return &zone->wait_table[hash_long(val, zone->wait_table_bits)];
}
EXPORT_SYMBOL(bit_waitqueue);
+
+int wait_action_schedule(void *word)
+{
+ schedule();
+ return 0;
+}
+EXPORT_SYMBOL(wait_action_schedule);
Index: linux-2.6.27-rc8-devel/kernel/Makefile
===================================================================
--- linux-2.6.27-rc8-devel.orig/kernel/Makefile 2008-10-05 14:03:24.000000000 +0200
+++ linux-2.6.27-rc8-devel/kernel/Makefile 2008-10-05 14:11:25.000000000 +0200
@@ -17,6 +17,7 @@ ifdef CONFIG_FTRACE
# Do not trace debug files and internal ftrace files
CFLAGS_REMOVE_lockdep.o = -pg
CFLAGS_REMOVE_lockdep_proc.o = -pg
+CFLAGS_REMOVE_bit-mutex-debug.o = -pg
CFLAGS_REMOVE_mutex-debug.o = -pg
CFLAGS_REMOVE_rtmutex-debug.o = -pg
CFLAGS_REMOVE_cgroup-debug.o = -pg
@@ -29,6 +30,7 @@ obj-$(CONFIG_SYSCTL_SYSCALL_CHECK) += sy
obj-$(CONFIG_STACKTRACE) += stacktrace.o
obj-y += time/
obj-$(CONFIG_DEBUG_MUTEXES) += mutex-debug.o
+obj-$(CONFIG_DEBUG_MUTEXES) += bit-mutex-debug.o
obj-$(CONFIG_LOCKDEP) += lockdep.o
ifeq ($(CONFIG_PROC_FS),y)
obj-$(CONFIG_LOCKDEP) += lockdep_proc.o
Index: linux-2.6.27-rc8-devel/kernel/bit-mutex-debug.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.27-rc8-devel/kernel/bit-mutex-debug.c 2008-10-05 19:12:06.000000000 +0200
@@ -0,0 +1,55 @@
+#include <linux/module.h>
+#include <linux/bitops.h>
+#include <linux/debug_locks.h>
+#include <linux/bit-mutex.h>
+
+void __bit_mutex_init(unsigned long *field, int bit, struct bit_mutex *mtx, const char *name, struct lock_class_key *key)
+{
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ debug_check_no_locks_freed((void *)mtx, sizeof(*mtx));
+ lockdep_init_map(&mtx->dep_map, name, key, 0);
+#endif
+ mtx->field = field;
+ mtx->bit = bit;
+ mtx->owner = NULL;
+ mtx->magic = mtx;
+ clear_bit(bit, field);
+ smp_mb__after_clear_bit();
+}
+EXPORT_SYMBOL(__bit_mutex_init);
+
+void __bit_mutex_lock(unsigned long *field, int bit, struct bit_mutex *mtx, int subclass)
+{
+ DEBUG_LOCKS_WARN_ON(mtx->magic != mtx);
+ DEBUG_LOCKS_WARN_ON(field != mtx->field);
+ DEBUG_LOCKS_WARN_ON(bit != mtx->bit);
+ mutex_acquire(&mtx->dep_map, subclass, 0, _RET_IP_);
+ wait_on_bit_lock(field, bit, wait_action_schedule, TASK_UNINTERRUPTIBLE);
+ lock_acquired(&mtx->dep_map);
+ mtx->owner = current_thread_info();
+}
+EXPORT_SYMBOL(__bit_mutex_lock);
+
+void __bit_mutex_unlock(unsigned long *field, int bit, struct bit_mutex *mtx, int nested)
+{
+ DEBUG_LOCKS_WARN_ON(mtx->magic != mtx);
+ DEBUG_LOCKS_WARN_ON(mtx->owner != current_thread_info());
+ DEBUG_LOCKS_WARN_ON(mtx->field != field);
+ DEBUG_LOCKS_WARN_ON(mtx->bit != bit);
+ mtx->owner = NULL;
+ mutex_release(&mtx->dep_map, nested, _RET_IP_);
+ smp_mb__before_clear_bit();
+ clear_bit(bit, field);
+ smp_mb__after_clear_bit();
+ wake_up_bit(field, bit);
+}
+EXPORT_SYMBOL(__bit_mutex_unlock);
+
+int bit_mutex_is_locked(unsigned long *field, int bit, struct bit_mutex *mtx)
+{
+ DEBUG_LOCKS_WARN_ON(mtx->magic != mtx);
+ DEBUG_LOCKS_WARN_ON(field != mtx->field);
+ DEBUG_LOCKS_WARN_ON(bit != mtx->bit);
+ return test_bit(bit, field);
+}
+EXPORT_SYMBOL(bit_mutex_is_locked);

2008-10-05 22:15:18

by Mikulas Patocka

Subject: [PATCH 2/3] Fix fsync livelock

Fix starvation in memory management.

The bug happens when one process is doing sequential buffered writes to
a block device (or file) and another process is attempting to execute
sync(), fsync() or direct-IO on that device (or file). This syncing
process will wait indefinitely, until the first writing process
finishes.

For example, run these two commands:
dd if=/dev/zero of=/dev/sda1 bs=65536 &
dd if=/dev/sda1 of=/dev/null bs=4096 count=1 iflag=direct

The bug is caused by sequential walking of address space in
write_cache_pages and wait_on_page_writeback_range: if some other
process is constantly making dirty and writeback pages while these
functions run, the functions will wait on every new page, resulting in
indefinite wait.

To fix the starvation, I declared a bit mutex starvation_barrier in
struct address_space. The bit AS_STARVATION_BARRIER in
address_space->flags is used for actual locking. When the mutex is
taken, anyone making dirty pages on that address space should stop. The
functions that walk address space sequentially first estimate a number
of pages to process. If they process more than this amount (plus some
small delta), it means that someone is making dirty pages under them ---
they take the mutex and hold it until they finish. When the mutex is
taken, the function balance_dirty_pages will wait and not allow more
dirty pages to be made.

The mutex is not really used as a mutex, it does not prevent access to
any critical section. It is used as a barrier that blocks new dirty
pages from being created. I use a mutex and not a wait queue to make sure
that the starvation can't happen the other way (with a wait queue,
excessive calls to write_cache_pages and
wait_on_page_writeback_range would block balance_dirty_pages forever;
with a mutex it is guaranteed that every process eventually makes
progress).

The essential property of this patch is that if the starvation doesn't
happen, no additional locks are taken and no atomic operations are
performed. So the patch shouldn't damage performance.

The indefinite starvation was observed in write_cache_pages and
wait_on_page_writeback_range. Another possibility where it could happen
is invalidate_inode_pages2_range.

There are two more functions that walk all the pages in address space,
but I think they don't need this starvation protection:
truncate_inode_pages_range: it is called with i_mutex locked, so no new
pages are created under it.
__invalidate_mapping_pages: it skips locked, dirty and writeback pages,
so there's no point in protecting the function against starving on them.

Signed-off-by: Mikulas Patocka <[email protected]>

---
fs/inode.c | 1 +
include/linux/fs.h | 3 +++
include/linux/pagemap.h | 14 ++++++++++++++
mm/filemap.c | 5 +++++
mm/page-writeback.c | 18 +++++++++++++++++-
mm/swap_state.c | 1 +
mm/truncate.c | 10 +++++++++-
7 files changed, 50 insertions(+), 2 deletions(-)

Index: linux-2.6.27-rc8-devel/fs/inode.c
===================================================================
--- linux-2.6.27-rc8-devel.orig/fs/inode.c 2008-10-04 16:47:55.000000000 +0200
+++ linux-2.6.27-rc8-devel/fs/inode.c 2008-10-04 16:48:04.000000000 +0200
@@ -216,6 +216,7 @@ void inode_init_once(struct inode *inode
spin_lock_init(&inode->i_data.private_lock);
INIT_RAW_PRIO_TREE_ROOT(&inode->i_data.i_mmap);
INIT_LIST_HEAD(&inode->i_data.i_mmap_nonlinear);
+ bit_mutex_init(&inode->i_data.flags, AS_STARVATION_BARRIER, &inode->i_data.starvation_barrier);
i_size_ordered_init(inode);
#ifdef CONFIG_INOTIFY
INIT_LIST_HEAD(&inode->inotify_watches);
Index: linux-2.6.27-rc8-devel/include/linux/fs.h
===================================================================
--- linux-2.6.27-rc8-devel.orig/include/linux/fs.h 2008-10-04 16:47:54.000000000 +0200
+++ linux-2.6.27-rc8-devel/include/linux/fs.h 2008-10-04 16:49:54.000000000 +0200
@@ -289,6 +289,7 @@ extern int dir_notify_enable;
#include <linux/init.h>
#include <linux/pid.h>
#include <linux/mutex.h>
+#include <linux/bit-mutex.h>
#include <linux/capability.h>
#include <linux/semaphore.h>

@@ -539,6 +540,8 @@ struct address_space {
spinlock_t private_lock; /* for use by the address_space */
struct list_head private_list; /* ditto */
struct address_space *assoc_mapping; /* ditto */
+
+ struct bit_mutex starvation_barrier;/* taken when fsync starves */
} __attribute__((aligned(sizeof(long))));
/*
* On most architectures that alignment is already the case; but
Index: linux-2.6.27-rc8-devel/include/linux/pagemap.h
===================================================================
--- linux-2.6.27-rc8-devel.orig/include/linux/pagemap.h 2008-10-04 16:47:54.000000000 +0200
+++ linux-2.6.27-rc8-devel/include/linux/pagemap.h 2008-10-04 16:48:04.000000000 +0200
@@ -21,6 +21,7 @@
#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
#define AS_ENOSPC (__GFP_BITS_SHIFT + 1) /* ENOSPC on async write */
#define AS_MM_ALL_LOCKS (__GFP_BITS_SHIFT + 2) /* under mm_take_all_locks() */
+#define AS_STARVATION_BARRIER (__GFP_BITS_SHIFT + 3) /* an anti-starvation barrier */

static inline void mapping_set_error(struct address_space *mapping, int error)
{
@@ -424,4 +425,17 @@ static inline int add_to_page_cache(stru
return error;
}

+#define starvation_protection_head(n) \
+ long pages_to_process = (n); \
+ pages_to_process += (pages_to_process >> 3) + 16;
+
+#define starvation_protection_step(mapping) \
+ if (pages_to_process >= 0) \
+ if (!pages_to_process--) \
+ bit_mutex_lock(&(mapping)->flags, AS_STARVATION_BARRIER, &(mapping)->starvation_barrier);
+
+#define starvation_protection_end(mapping) \
+ if (pages_to_process < 0) \
+ bit_mutex_unlock(&(mapping)->flags, AS_STARVATION_BARRIER, &(mapping)->starvation_barrier);
+
#endif /* _LINUX_PAGEMAP_H */
Index: linux-2.6.27-rc8-devel/mm/filemap.c
===================================================================
--- linux-2.6.27-rc8-devel.orig/mm/filemap.c 2008-10-04 16:47:55.000000000 +0200
+++ linux-2.6.27-rc8-devel/mm/filemap.c 2008-10-04 16:48:04.000000000 +0200
@@ -269,6 +269,7 @@ int wait_on_page_writeback_range(struct
int nr_pages;
int ret = 0;
pgoff_t index;
+ starvation_protection_head(bdi_stat(mapping->backing_dev_info, BDI_WRITEBACK));

if (end < start)
return 0;
@@ -288,6 +289,8 @@ int wait_on_page_writeback_range(struct
if (page->index > end)
continue;

+ starvation_protection_step(mapping);
+
wait_on_page_writeback(page);
if (PageError(page))
ret = -EIO;
@@ -296,6 +299,8 @@ int wait_on_page_writeback_range(struct
cond_resched();
}

+ starvation_protection_end(mapping);
+
/* Check for outstanding write errors */
if (test_and_clear_bit(AS_ENOSPC, &mapping->flags))
ret = -ENOSPC;
Index: linux-2.6.27-rc8-devel/mm/page-writeback.c
===================================================================
--- linux-2.6.27-rc8-devel.orig/mm/page-writeback.c 2008-10-04 16:47:55.000000000 +0200
+++ linux-2.6.27-rc8-devel/mm/page-writeback.c 2008-10-04 16:48:04.000000000 +0200
@@ -435,6 +435,14 @@ static void balance_dirty_pages(struct a

struct backing_dev_info *bdi = mapping->backing_dev_info;

+ if (unlikely(bit_mutex_is_locked(&mapping->flags, AS_STARVATION_BARRIER,
+ &mapping->starvation_barrier))) {
+ bit_mutex_lock(&mapping->flags, AS_STARVATION_BARRIER,
+ &mapping->starvation_barrier);
+ bit_mutex_unlock(&mapping->flags, AS_STARVATION_BARRIER,
+ &mapping->starvation_barrier);
+ }
+
for (;;) {
struct writeback_control wbc = {
.bdi = bdi,
@@ -876,6 +884,7 @@ int write_cache_pages(struct address_spa
pgoff_t end; /* Inclusive */
int scanned = 0;
int range_whole = 0;
+ starvation_protection_head(bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE));

if (wbc->nonblocking && bdi_write_congested(bdi)) {
wbc->encountered_congestion = 1;
@@ -902,7 +911,11 @@ retry:

scanned = 1;
for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ struct page *page;
+
+ starvation_protection_step(mapping);
+
+ page = pvec.pages[i];

/*
* At this point we hold neither mapping->tree_lock nor
@@ -963,6 +976,9 @@ retry:

if (wbc->range_cont)
wbc->range_start = index << PAGE_CACHE_SHIFT;
+
+ starvation_protection_end(mapping);
+
return ret;
}
EXPORT_SYMBOL(write_cache_pages);
Index: linux-2.6.27-rc8-devel/mm/swap_state.c
===================================================================
--- linux-2.6.27-rc8-devel.orig/mm/swap_state.c 2008-10-04 16:47:55.000000000 +0200
+++ linux-2.6.27-rc8-devel/mm/swap_state.c 2008-10-04 17:14:03.000000000 +0200
@@ -43,6 +43,7 @@ struct address_space swapper_space = {
.a_ops = &swap_aops,
.i_mmap_nonlinear = LIST_HEAD_INIT(swapper_space.i_mmap_nonlinear),
.backing_dev_info = &swap_backing_dev_info,
+ .starvation_barrier = __BIT_MUTEX_INITIALIZER(&swapper_space.flags, AS_STARVATION_BARRIER, swapper_space.starvation_barrier),
};

#define INC_CACHE_INFO(x) do { swap_cache_info.x++; } while (0)
Index: linux-2.6.27-rc8-devel/mm/truncate.c
===================================================================
--- linux-2.6.27-rc8-devel.orig/mm/truncate.c 2008-10-04 16:47:55.000000000 +0200
+++ linux-2.6.27-rc8-devel/mm/truncate.c 2008-10-04 16:48:04.000000000 +0200
@@ -392,6 +392,7 @@ int invalidate_inode_pages2_range(struct
int ret2 = 0;
int did_range_unmap = 0;
int wrapped = 0;
+ starvation_protection_head(mapping->nrpages);

pagevec_init(&pvec, 0);
next = start;
@@ -399,9 +400,13 @@ int invalidate_inode_pages2_range(struct
pagevec_lookup(&pvec, mapping, next,
min(end - next, (pgoff_t)PAGEVEC_SIZE - 1) + 1)) {
for (i = 0; i < pagevec_count(&pvec); i++) {
- struct page *page = pvec.pages[i];
+ struct page *page;
pgoff_t page_index;

+ starvation_protection_step(mapping);
+
+ page = pvec.pages[i];
+
lock_page(page);
if (page->mapping != mapping) {
unlock_page(page);
@@ -449,6 +454,9 @@ int invalidate_inode_pages2_range(struct
pagevec_release(&pvec);
cond_resched();
}
+
+ starvation_protection_end(mapping);
+
return ret;
}
EXPORT_SYMBOL_GPL(invalidate_inode_pages2_range);

2008-10-05 22:17:46

by Mikulas Patocka

Subject: [PATCH 3/3] Fix fsync-vs-write misbehavior

Fix violation of sync()/fsync() semantics. Previous code walked up to
mapping->nrpages * 2 pages. Because pages could be created while
__filemap_fdatawrite_range was in progress, this could lead to
misbehavior. Example: there are two pages in the address space with indices
4, 5. Both are dirty. Someone calls __filemap_fdatawrite_range, it sets
.nr_to_write = 4. Meanwhile, some other process creates dirty pages 0,
1, 2, 3. __filemap_fdatawrite_range writes pages 0, 1, 2, 3, finds out
that it reached the limit and exits.

Result: pages that were dirty before __filemap_fdatawrite_range was
invoked were not written.

With starvation protection from the previous patch, this
mapping->nrpages * 2 logic is no longer needed.

Signed-off-by: Mikulas Patocka <[email protected]>

---
mm/filemap.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-2.6.27-rc7-devel/mm/filemap.c
===================================================================
--- linux-2.6.27-rc7-devel.orig/mm/filemap.c 2008-09-24 14:47:01.000000000 +0200
+++ linux-2.6.27-rc7-devel/mm/filemap.c 2008-09-24 15:01:23.000000000 +0200
@@ -202,6 +202,11 @@ static int sync_page_killable(void *word
* opposed to a regular memory cleansing writeback. The difference between
* these two operations is that if a dirty page/buffer is encountered, it must
* be waited upon, and not just skipped over.
+ *
+ * Because new dirty pages can be created while this is executing, the
+ * mapping->nrpages * 2 condition is unsafe. If we are doing a data integrity
+ * write, we must write all the pages. The AS_STARVATION_BARRIER bit will
+ * eventually prevent creating more dirty pages to avoid starvation.
*/
int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
loff_t end, int sync_mode)
@@ -209,7 +214,7 @@ int __filemap_fdatawrite_range(struct ad
int ret;
struct writeback_control wbc = {
.sync_mode = sync_mode,
- .nr_to_write = mapping->nrpages * 2,
+ .nr_to_write = sync_mode == WB_SYNC_NONE ? mapping->nrpages * 2 : LONG_MAX,
.range_start = start,
.range_end = end,
};

2008-10-05 22:33:22

by Arjan van de Ven

Subject: Re: [PATCH 2/3] Fix fsync livelock

On Sun, 5 Oct 2008 18:14:50 -0400 (EDT)
Mikulas Patocka <[email protected]> wrote:

> Fix starvation in memory management.
>
> The bug happens when one process is doing sequential buffered writes
> to a block device (or file) and another process is attempting to
> execute sync(), fsync() or direct-IO on that device (or file). This
> syncing process will wait indefinitely, until the first writing
> process finishes.
>
> For example, run these two commands:
> dd if=/dev/zero of=/dev/sda1 bs=65536 &
> dd if=/dev/sda1 of=/dev/null bs=4096 count=1 iflag=direct
>
> The bug is caused by sequential walking of address space in
> write_cache_pages and wait_on_page_writeback_range: if some other
> process is constantly making dirty and writeback pages while these
> functions run, the functions will wait on every new page, resulting in
> indefinite wait.

are you sure?
isn't the right fix to just walk the file pages only once?

2008-10-05 23:03:44

by Mikulas Patocka

Subject: Re: [PATCH 2/3] Fix fsync livelock



On Sun, 5 Oct 2008, Arjan van de Ven wrote:

> On Sun, 5 Oct 2008 18:14:50 -0400 (EDT)
> Mikulas Patocka <[email protected]> wrote:
>
> > Fix starvation in memory management.
> >
> > The bug happens when one process is doing sequential buffered writes
> > to a block device (or file) and another process is attempting to
> > execute sync(), fsync() or direct-IO on that device (or file). This
> > syncing process will wait indefinitely, until the first writing
> > process finishes.
> >
> > For example, run these two commands:
> > dd if=/dev/zero of=/dev/sda1 bs=65536 &
> > dd if=/dev/sda1 of=/dev/null bs=4096 count=1 iflag=direct
> >
> > The bug is caused by sequential walking of address space in
> > write_cache_pages and wait_on_page_writeback_range: if some other
> > process is constantly making dirty and writeback pages while these
> > functions run, the functions will wait on every new page, resulting in
> > indefinite wait.
>
> are you sure?
> isn't the right fix to just walk the file pages only once?

It walks the pages only once --- and waits on each of them. But because
new pages are constantly appearing under it, that "walk only once" becomes
an infinite loop (well, finite, until the whole disk is written).

Mikulas

2008-10-05 23:07:48

by Arjan van de Ven

Subject: Re: [PATCH 2/3] Fix fsync livelock

On Sun, 5 Oct 2008 19:02:57 -0400 (EDT)
Mikulas Patocka <[email protected]> wrote:

> > are you sure?
> > isn't the right fix to just walk the file pages only once?
>
> It walks the pages only once --- and waits on each of them. But
> because new pages are constantly appearing under it, that "walk only
> once" becomes an infinite loop (well, finite, until the whole disk is
> written).

well. fsync() promises that everything that's dirty at the time of the
call will hit the disk. That is not something you can compromise.
The only way out would be to not allow new dirtying during an fsync()...
which is imo even worse.

Submit them all in one go, then wait, should not be TOO bad. Unless a
lot was dirty already, but then you go back to "but it has to go to
disk".


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-10-05 23:18:49

by Mikulas Patocka

Subject: Re: [PATCH 2/3] Fix fsync livelock

On Sun, 5 Oct 2008, Arjan van de Ven wrote:

> On Sun, 5 Oct 2008 19:02:57 -0400 (EDT)
> Mikulas Patocka <[email protected]> wrote:
>
> > > are you sure?
> > > isn't the right fix to just walk the file pages only once?
> >
> > It walks the pages only once --- and waits on each of them. But
> > because new pages are constantly appearing under it, that "walk only
> > once" becomes an infinite loop (well, finite, until the whole disk is
> > written).
>
> well. fsync() promises that everything that's dirty at the time of the
> call will hit the disk. That is not something you can compromise.
> The only way out would be to not allow new dirtying during an fsync()...
> which is imo even worse.
>
> Submit them all in one go, then wait, should not be TOO bad. Unless a
> lot was dirty already, but then you go back to "but it has to go to
> disk".

The problem here is that you have two processes --- one is writing, the
other is simultaneously syncing. The syncing process can't distinguish the
pages that were created before fsync() was invoked (it has to wait on
them) and the pages that were created while fsync() was running (it
doesn't have to wait on them) --- so it waits on them all. The result is
a livelock: it waits indefinitely, because more and more pages are being
created.

The patch changes it so that if it waits long enough, it stops the other
writers creating dirty pages.

Or, how otherwise would you implement "Submit them all in one go, then
wait"? The current code is:
you grab page 0, see it is under writeback, wait on it
you grab page 1, see it is under writeback, wait on it
you grab page 2, see it is under writeback, wait on it
you grab page 3, see it is under writeback, wait on it
...
--- and the other process is just making more and more writeback pages
while your waiting routine runs. So the waiting is indefinite.

Mikulas

2008-10-05 23:29:15

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH 2/3] Fix fsync livelock

On Sun, 5 Oct 2008 19:18:22 -0400 (EDT)
Mikulas Patocka <[email protected]> wrote:

> On Sun, 5 Oct 2008, Arjan van de Ven wrote:
>
>
> The problem here is that you have two processes --- one is writing,
> the other is simultaneously syncing. The syncing process can't
> distinguish the pages that were created before fsync() was invoked
> (it has to wait on them) and the pages that were created while
> fsync() was running (it doesn't have to wait on them) --- so it waits
> on them all.

for pages it already passed that's not true.
but yes while you walk, you may find new ones. tough luck.
>
> Or, how otherwise would you implement "Submit them all in one go,
> then wait"? The current code is:
> you grab page 0, see it is under writeback, wait on it
> you grab page 1, see it is under writeback, wait on it
> you grab page 2, see it is under writeback, wait on it
> you grab page 3, see it is under writeback, wait on it

it shouldn't be doing this. It should be "submit the lot"
"wait on the lot that got submitted".

keeping away new writers is just shifting the latency to an innocent
party (since the fsync() is just bloody expensive, the person doing it
should be paying the price, not the rest of the system), and a grave
mistake.

If the fsync() implementation isn't smart enough, sure, lets improve
it. But not by shifting latency around... lets make it more efficient
at submitting IO.
If we need to invent something like "chained IO" where if you wait on
the last of the chain, you wait on the entire chain, so be it.



2008-10-06 00:02:16

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH 2/3] Fix fsync livelock



On Sun, 5 Oct 2008, Arjan van de Ven wrote:

> On Sun, 5 Oct 2008 19:18:22 -0400 (EDT)
> Mikulas Patocka <[email protected]> wrote:
>
> > On Sun, 5 Oct 2008, Arjan van de Ven wrote:
> >
> >
> > The problem here is that you have two processes --- one is writing,
> > the other is simultaneously syncing. The syncing process can't
> > distinguish the pages that were created before fsync() was invoked
> > (it has to wait on them) and the pages that were created while
> > fsync() was running (it doesn't have to wait on them) --- so it waits
> > on them all.
>
> for pages it already passed that's not true.
> but yes while you walk, you may find new ones. tough luck.
> >
> > Or, how otherwise would you implement "Submit them all in one go,
> > then wait"? The current code is:
> > you grab page 0, see it is under writeback, wait on it
> > you grab page 1, see it is under writeback, wait on it
> > you grab page 2, see it is under writeback, wait on it
> > you grab page 3, see it is under writeback, wait on it
>
> it shouldn't be doing this. It should be "submit the lot"
> "wait on the lot that got submitted".

And how do you want to implement that "wait on the lot that got
submitted"? Note that you can't allocate an array or linked list that
holds pointers to all submitted pages. And if you add anything to the page
structure or radix tree, you have to serialize concurrent sync and fsync
calls, otherwise they'd fight over these bits.

> keeping away new writers is just shifting the latency to an innocent
> party (since the fsync() is just bloody expensive, the person doing it
> should be paying the price, not the rest of the system), and a grave
> mistake.

I assume that if very few people complained about the livelock till now,
very few people will see degraded write performance. My patch blocks the
writes only if the livelock happens, so if the livelock doesn't happen in
unpatched kernel for most people, the patch won't make it worse.

> If the fsync() implementation isn't smart enough, sure, lets improve
> it. But not by shifting latency around... lets make it more efficient
> at submitting IO.
> If we need to invent something like "chained IO" where if you wait on
> the last of the chain, you wait on the entire chain, so be it.

This looks madly complicated. And ineffective, because if some page was
submitted before fsync() was invoked, and is under writeback while fsync()
is called, fsync() still has to wait on it.

Mikulas

2008-10-06 00:30:52

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH 2/3] Fix fsync livelock

On Sun, 5 Oct 2008 20:01:46 -0400 (EDT)
Mikulas Patocka <[email protected]> wrote:

> I assume that if very few people complained about the livelock till
> now, very few people will see degraded write performance. My patch
> blocks the writes only if the livelock happens, so if the livelock
> doesn't happen in unpatched kernel for most people, the patch won't
> make it worse.

I object to calling this a livelock. It's not.
And yes, fsync is slow and lots of people are seeing that.
It's not helped by how ext3 is implemented (where fsync is effectively
equivalent of a sync for many cases).
But again, moving the latency to "innocent" parties is not acceptable.

>
> > If the fsync() implementation isn't smart enough, sure, lets improve
> > it. But not by shifting latency around... lets make it more
> > efficient at submitting IO.
> > If we need to invent something like "chained IO" where if you wait
> > on the last of the chain, you wait on the entire chain, so be it.
>
> This looks madly complicated. And ineffective, because if some page
> was submitted before fsync() was invoked, and is under writeback
> while fsync() is called, fsync() still has to wait on it.

so?
just make a chain per inode always...



2008-10-06 02:51:30

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH 2/3] Fix fsync livelock

On Sun, Oct 05, 2008 at 08:01:46PM -0400, Mikulas Patocka wrote:
> This looks madly complicated. And ineffective, because if some page was
> submitted before fsync() was invoked, and is under writeback while fsync()
> is called, fsync() still has to wait on it.

fsync() waiting on pre-issued writeback pages is the correct
behaviour.

IOW, if the page is under writeback at the time an fsync() is
issued (e.g. issued by pdflush), the page was *not clean* at the
time the fsync() was called and hence must be clean when fsync()
returns. fsync() needs to wait for all pages under I/O at the time
it is called, not just the dirty pages it issues I/O on.....

Cheers,

Dave.
--
Dave Chinner
[email protected]

2008-10-06 03:31:45

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH 2/3] Fix fsync livelock



On Sun, 5 Oct 2008, Arjan van de Ven wrote:

> On Sun, 5 Oct 2008 20:01:46 -0400 (EDT)
> Mikulas Patocka <[email protected]> wrote:
>
> > I assume that if very few people complained about the livelock till
> > now, very few people will see degraded write performance. My patch
> > blocks the writes only if the livelock happens, so if the livelock
> > doesn't happen in unpatched kernel for most people, the patch won't
> > make it worse.
>
> I object to calling this a livelock. It's not.

It unlocks itself when the whole disk is written, and that can take
several hours (or days, if you have a many-terabyte array). So formally it
is not a livelock; from the user's experience it is --- he sees an
unkillable process in 'D' state for many hours.

> And yes, fsync is slow and lots of people are seeing that.
> It's not helped by how ext3 is implemented (where fsync is effectively
> equivalent of a sync for many cases).
> But again, moving the latency to "innocent" parties is not acceptable.
>
> >
> > > If the fsync() implementation isn't smart enough, sure, lets improve
> > > it. But not by shifting latency around... lets make it more
> > > efficient at submitting IO.
> > > If we need to invent something like "chained IO" where if you wait
> > > on the last of the chain, you wait on the entire chain, so be it.
> >
> > This looks madly complicated. And ineffective, because if some page
> > was submitted before fsync() was invoked, and is under writeback
> > while fsync() is called, fsync() still has to wait on it.
>
> so?
> just make a chain per inode always...

The point is that many fsync()s may run in parallel and you have just one
inode and just one chain. And if you add two-word list_head to a page, to
link it on this list, many developers will hate it for increasing its
size.

See the work done by Nick Piggin somewhere in this thread. He uses just
one bit in the radix tree to mark pages to process. But he needs to
serialize all syncs on a given file, so they no longer run in parallel.

Mikulas

2008-10-06 04:20:57

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH 2/3] Fix fsync livelock

On Sun, 5 Oct 2008 23:30:51 -0400 (EDT)
> The point is that many fsync()s may run in parallel and you have just
> one inode and just one chain. And if you add two-word list_head to a
> page, to link it on this list, many developers will hate it for
> increasing its size.

why to a page?
a list head in the inode and chain up the bios....
or not make an actual list but just a "is the previous one done" thing.
it's not all that hard to get something that works on a per-inode basis,
that gives "wait for all io up to this one".




2008-10-06 13:01:19

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH 2/3] Fix fsync livelock

On Sun, 5 Oct 2008, Arjan van de Ven wrote:

> On Sun, 5 Oct 2008 23:30:51 -0400 (EDT)
> > The point is that many fsync()s may run in parallel and you have just
> > one inode and just one chain. And if you add two-word list_head to a
> > page, to link it on this list, many developers will hate it for
> > increasing its size.
>
> why to a page?
> a list head in the inode and chain up the bios....

And if you want to wait for a bio submitted by a different process?
There's no way you can find the bio from the page.

> or not make an actual list but just a "is the previous one done" thing
> it's not all that hard to get something that works on a per inode basis,
> that gives "wait for all io upto this one".

So code it :)

Mikulas

2008-10-06 13:50:48

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH 2/3] Fix fsync livelock

On Mon, 6 Oct 2008 09:00:14 -0400 (EDT)
Mikulas Patocka <[email protected]> wrote:

> On Sun, 5 Oct 2008, Arjan van de Ven wrote:
>
> > On Sun, 5 Oct 2008 23:30:51 -0400 (EDT)
> > > The point is that many fsync()s may run in parallel and you have
> > > just one inode and just one chain. And if you add two-word
> > > list_head to a page, to link it on this list, many developers
> > > will hate it for increasing its size.
> >
> > why to a page?
> > a list head in the inode and chain up the bios....
>
> And if you want to wait for a bio submitted by a different process?
> There's no way you can find the bio from the page.

the point is that the kernel would always chain it to the inode,
independent of who submits it or when.



2008-10-06 20:45:52

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH 2/3] Fix fsync livelock

On Mon, 6 Oct 2008, Arjan van de Ven wrote:

> On Mon, 6 Oct 2008 09:00:14 -0400 (EDT)
> Mikulas Patocka <[email protected]> wrote:
>
> > On Sun, 5 Oct 2008, Arjan van de Ven wrote:
> >
> > > On Sun, 5 Oct 2008 23:30:51 -0400 (EDT)
> > > > The point is that many fsync()s may run in parallel and you have
> > > > just one inode and just one chain. And if you add two-word
> > > > list_head to a page, to link it on this list, many developers
> > > > will hate it for increasing its size.
> > >
> > > why to a page?
> > > a list head in the inode and chain up the bios....
> >
> > And if you want to wait for a bio submitted by a different process?
> > There's no way you can find the bio from the page.
>
> the point is that the kernel would always chain it to the inode,
> independent of who or when it is submitted

If you add a list to an inode, you need to protect it with a spinlock. So
you take one more spinlock for every write bio submitted --- a lot of
developers would hate it.

Another problem: how do you want to walk all dirty pages and submit bio
for them?

The act of allocating and submitting a bio can block (if you run out of
some mempool), and in that case it waits until some other bio finishes.
During this time, more dirty pages can be created.

Also, if you find a page that is both dirty and under writeback, you need
to wait until the writeback finishes and then initiate another writeback
(because the old writeback may be writing stale data). Again you block,
and more dirty pages can appear.

And if you block and more dirty pages appear, you are prone to the
livelock.

[ In Nick Piggin's patch, it is needed to lock the whole address space,
mark dirty pages in one non-blocking pass and write marked pages again in
a blocking pass --- so that if more dirty pages appear while bios are
submitted, the new pages will be skipped ]

Mikulas

2008-10-08 16:46:33

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 2/3] Fix fsync livelock

On Sun 2008-10-05 17:30:19, Arjan van de Ven wrote:
> On Sun, 5 Oct 2008 20:01:46 -0400 (EDT)
> Mikulas Patocka <[email protected]> wrote:
>
> > I assume that if very few people complained about the livelock till
> > now, very few people will see degraded write performance. My patch
> > blocks the writes only if the livelock happens, so if the livelock
> > doesn't happen in unpatched kernel for most people, the patch won't
> > make it worse.
>
> I object to calling this a livelock. It's not.

Eight hours of a process in D state is a livelock. And we can do a
minimal fix here; this almost never happens in real life anyway.

Latency imposed on the writer should not be a problem...
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-10-09 01:13:27

by Randy Dunlap

[permalink] [raw]
Subject: [PATCH] documentation: explain memory barriers

On Wed, 1 Oct 2008 22:54:04 -0700 Andrew Morton wrote:

> This sequence is repeated three or four times and should be pulled out
> into a well-commented function. That comment should explain the logic
> behind the use of these barriers, please.

and on 2008-OCT-08 Ben Hutchings wrote:

> All memory barriers need a comment to explain why and what they're doing.


Should this be added to CodingStyle or some other file?




From: Randy Dunlap <[email protected]>

We want all uses of memory barriers to be explained in the source
code.

Signed-off-by: Randy Dunlap <[email protected]>
---
Documentation/SubmitChecklist | 3 +++
1 file changed, 3 insertions(+)

--- lin2627-rc9-docs.orig/Documentation/SubmitChecklist
+++ lin2627-rc9-docs/Documentation/SubmitChecklist
@@ -85,3 +85,6 @@ kernel patches.
23: Tested after it has been merged into the -mm patchset to make sure
that it still works with all of the other queued patches and various
changes in the VM, VFS, and other subsystems.
+
+24: All memory barriers {e.g., barrier(), rmb(), wmb()} need a comment in the
+ source code that explains the logic of what they are doing and why.

2008-10-09 01:18:55

by Chris Snook

[permalink] [raw]
Subject: Re: [PATCH] documentation: explain memory barriers

Randy Dunlap wrote:
> On Wed, 1 Oct 2008 22:54:04 -0700 Andrew Morton wrote:
>
>> This sequence is repeated three or four times and should be pulled out
>> into a well-commented function. That comment should explain the logic
>> behind the use of these barriers, please.
>
> and on 2008-OCT-08 Ben Hutchings wrote:
>
>> All memory barriers need a comment to explain why and what they're doing.

Seriously? When a barrier is used, it's generally self-evident what
it's doing. The real problem is when a barrier is *not* used. It would
probably be more helpful to comment every non-barrier line of code to
explain why it's okay if memory operations are getting reordered.

-- Chris

2008-10-09 01:32:00

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] documentation: explain memory barriers

On Wed, 08 Oct 2008 21:17:58 -0400 Chris Snook <[email protected]> wrote:

> Randy Dunlap wrote:
> > On Wed, 1 Oct 2008 22:54:04 -0700 Andrew Morton wrote:
> >
> >> This sequence is repeated three or four times and should be pulled out
> >> into a well-commented function. That comment should explain the logic
> >> behind the use of these barriers, please.
> >
> > and on 2008-OCT-08 Ben Hutchings wrote:
> >
> >> All memory barriers need a comment to explain why and what they're doing.

I approve this message.

> Seriously? When a barrier is used, it's generally self-evident what
> it's doing.

fs/buffer.c:sync_buffer(). Have fun.

2008-10-09 01:52:48

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [PATCH] documentation: explain memory barriers

On Wed, 08 Oct 2008 18:12:23 PDT, Randy Dunlap said:

> +
> +24: All memory barriers {e.g., barrier(), rmb(), wmb()} need a comment in the
> + source code that explains the logic of what they are doing and why.

"what they are doing" will almost always be "flush value to RAM" or similar.
How about this instead:

+ 24: All memory barriers {e.g., barrier(), rmb(), wmb()} need a comment in the
+ source code that explains the race condition being prevented, and states
+ the location of the other code or hardware feature that races with this.
+
+ An example comment:
+
+ /*
+ * If we don't do a wmb() here, the RBFROBNIZ register on the XU293
+ * card will get confused and wedge the hardware...
+ */
+ wmb();



2008-10-09 05:53:11

by Chris Snook

[permalink] [raw]
Subject: Re: [PATCH] documentation: explain memory barriers

Andrew Morton wrote:
> On Wed, 08 Oct 2008 21:17:58 -0400 Chris Snook <[email protected]> wrote:
>
>> Randy Dunlap wrote:
>>> On Wed, 1 Oct 2008 22:54:04 -0700 Andrew Morton wrote:
>>>
>>>> This sequence is repeated three or four times and should be pulled out
>>>> into a well-commented function. That comment should explain the logic
>>>> behind the use of these barriers, please.
>>> and on 2008-OCT-08 Ben Hutchings wrote:
>>>
>>>> All memory barriers need a comment to explain why and what they're doing.
>
> I approve this message.
>
>> Seriously? When a barrier is used, it's generally self-evident what
>> it's doing.
>
> fs/buffer.c:sync_buffer(). Have fun.

The real disaster there is the clear_buffer_##name macro and friends, as
evidenced by fs/ext2/inode.c:599

clear_buffer_new(bh_result); /* What's this do? */

I'm completely in favor of documenting everything that can potentially interact
with that train wreck, but I maintain that the vast majority of memory barriers
are self-evident.

-- Chris

2008-10-09 06:29:26

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] documentation: explain memory barriers

On Thursday 09 October 2008 16:51, Chris Snook wrote:
> Andrew Morton wrote:
> > On Wed, 08 Oct 2008 21:17:58 -0400 Chris Snook <[email protected]> wrote:
> >> Randy Dunlap wrote:
> >>> On Wed, 1 Oct 2008 22:54:04 -0700 Andrew Morton wrote:
> >>>> This sequence is repeated three or four times and should be pulled out
> >>>> into a well-commented function. That comment should explain the logic
> >>>> behind the use of these barriers, please.
> >>>
> >>> and on 2008-OCT-08 Ben Hutchings wrote:
> >>>> All memory barriers need a comment to explain why and what they're
> >>>> doing.
> >
> > I approve this message.
> >
> >> Seriously? When a barrier is used, it's generally self-evident what
> >> it's doing.
> >
> > fs/buffer.c:sync_buffer(). Have fun.
>
> The real disaster there is the clear_buffer_##name macro and friends, as
> evidenced by fs/ext2/inode.c:599
>
> clear_buffer_new(bh_result); /* What's this do? */

That's not a disaster. It is relatively easy, even if you have no
idea about any of that code, to a) work out what the BH_New flag does,
see where it gets set and where it gets cleared, and then work out
what that does. Actually, buffer.c used to leak BH_New in some
cases, but now it should be a bug if BH_New is found to be set
there (buffer.c should take any BH_New buffers, initialize them
appropriately, and clear BH_New; a dangling BH_New would indicate
uninitialized data going to or coming from the block).

No, they're easy, because you can find every single place where anyone
cares about them with a single simple grep.

Again, fs/buffer.c:sync_buffer(). Which stores and/or loads is that
barrier sequencing against which subsequent stores and/or loads? Why?

For another fun example, mm/filemap.c:sync_page. This actually has a
big comment, but (to me) it isn't even evident then which loads and
stores are being sequenced against which subsequent ones because it
is not explicitly documented. And I do have some experience in adding
barriers to existing mm code where they have been missed completely.

mempool_free, set_dumpable, freeze_bdev.



> I'm completely in favor of documenting everything that can potentially
> interact with that train wreck,

What's the train-wreck, again?


> but I maintain that the vast majority of
> memory barriers are self-evident.

They are self-evident if you have spent a lot of time getting the
state machine and locking/concurrency model in your mind. If you
have not, then how do you know there is not some memory operation
way back or forward in the instruction stream that is supposed to
be ordered by this barrier?

All memory barriers have to be documented, except acquire/release
for critical sections, IMO.

2008-10-09 06:36:14

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] documentation: explain memory barriers

On Thursday 09 October 2008 12:50, [email protected] wrote:
> On Wed, 08 Oct 2008 18:12:23 PDT, Randy Dunlap said:
> > +
> > +24: All memory barriers {e.g., barrier(), rmb(), wmb()} need a comment in the
> > + source code that explains the logic of what they are doing and why.
>
> "what they are doing" will almost always be "flush value to RAM" or
> similar.

Memory barriers don't flush anything anywhere. It's simple: you must
explain which operations you are ordering against which.


> How about this instead:
>
> + 24: All memory barriers {e.g., barrier(), rmb(), wmb()} need a comment in the
> + source code that explains the race condition being prevented, and states
> + the location of the other code or hardware feature that races with this.
> +
> + An example comment:
> +
> + /*
> + * If we don't do a wmb() here, the RBFROBNIZ register on the XU293
> + * card will get confused and wedge the hardware...
> + */
> + wmb();

This comment is no good, because it doesn't tell you what the memory barrier
does. It tells you what might happen if you omit it.

/*
* If we don't do a wmb() here, the store to the RBFROBNIZ,
* above, might reach the device before the store X, below.
*
* If that happens, then the XU293 card will get confused
* and wedge the hardware...
*/
wmb();

If you don't comment like that, then how does the reader know that the wmb
is not *also* supposed to order the store with any other of the limitless
subsequent stores until the next memory ordering operation? Or any of the
previous stores since the last one?

2008-10-09 06:53:50

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [PATCH] documentation: explain memory barriers

On Fri, 10 Oct 2008 04:35:47 +1100, Nick Piggin said:
> On Thursday 09 October 2008 12:50, [email protected] wrote:
> > On Wed, 08 Oct 2008 18:12:23 PDT, Randy Dunlap said:
> > > +
> > > +24: All memory barriers {e.g., barrier(), rmb(), wmb()} need a comment in the
> > > + source code that explains the logic of what they are doing and why.
> >
> > "what they are doing" will almost always be "flush value to RAM" or
> > similar.
>
> Memory barriers don't flush anything anywhere.

That's what I get for commenting on stuff when I'm into a 40-hour week by
Wednesday. :)

I was speaking of the generic programmer who does stuff like:

x = 10; /* set x to 10 */

for "what they are doing". You know the type. ;)

"flush value to RAM", "force memory barrier operation", and I think I've seen
a few kzalloc()'s that have "allocate zero'ed memory" on them. "what they are
doing" is usually not worth writing down, but being verbose for the *why*
is almost always good, especially for things like memory barriers that
almost nobody can get their brains wrapped around (how many flame-fests per
year do we have about "volatile"? ;)

> /*
> * If we don't do a wmb() here, the store to the RBFROBNIZ,
> * above, might reach the device before the store X, below.
> *
> * If that happens, then the XU293 card will get confused
> * and wedge the hardware...
> */
> wmb();
>
> If you don't comment like that, then how does the reader know that the wmb
> is not *also* supposed to order the store with any other of the limitless
> subsequent stores until the next memory ordering operation? Or any of the
> previous stores since the last one?

Even better (as I missed the "also supposed to know" case). My general point
was that a concrete example would improve Randy's original patch by showing
what sort of things should be in the comment, and your correction pointed
out *why* a concrete example was needed. ;)



2008-10-09 09:58:35

by Ben Hutchings

[permalink] [raw]
Subject: Re: [PATCH] documentation: explain memory barriers

On Thu, 2008-10-09 at 01:51 -0400, Chris Snook wrote:
> Andrew Morton wrote:
> > On Wed, 08 Oct 2008 21:17:58 -0400 Chris Snook <[email protected]> wrote:
> >
> >> Randy Dunlap wrote:
> >>> On Wed, 1 Oct 2008 22:54:04 -0700 Andrew Morton wrote:
> >>>
> >>>> This sequence is repeated three or four times and should be pulled out
> >>>> into a well-commented function. That comment should explain the logic
> >>>> behind the use of these barriers, please.
> >>> and on 2008-OCT-08 Ben Hutchings wrote:
> >>>
> >>>> All memory barriers need a comment to explain why and what they're doing.
> >
> > I approve this message.
> >
> >> Seriously? When a barrier is used, it's generally self-evident what
> >> it's doing.
> >
> > fs/buffer.c:sync_buffer(). Have fun.
>
> The real disaster there is the clear_buffer_##name macro and friends, as
> evidenced by fs/ext2/inode.c:599
>
> clear_buffer_new(bh_result); /* What's this do? */
>
> I'm completely in favor of documenting everything that can potentially interact
> with that train wreck, but I maintain that the vast majority of memory barriers
> are self-evident.

Acquire and release barriers attached to operations are usually self-
evident; standalone wmb() and rmb() much less so. It is helpful to be
explicit about exactly which memory operations need to be ordered, which
are often not the memory operations immediately preceding and following
it. "all" may have been a bit strong though.

Ben.

--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

2008-10-09 10:28:17

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] documentation: explain memory barriers

On Thursday 09 October 2008 20:58, Ben Hutchings wrote:
> On Thu, 2008-10-09 at 01:51 -0400, Chris Snook wrote:

> > I'm completely in favor of documenting everything that can potentially
> > interact with that train wreck, but I maintain that the vast majority of
> > memory barriers are self-evident.
>
> Acquire and release barriers attached to operations are usually self-
> evident; standalone wmb() and rmb() much less so. It is helpful to be
> explicit about exactly which memory operations need to be ordered, which
> are often not the memory operations immediately preceding and following
> it. "all" may have been a bit strong though.

No, I don't think so. We should absolutely force "all". That allows nobody
to be lazy, no confusion, and reminds people that memory barriers are not
easy to follow for a new reader of the code, or necessarily even the author,
6 months later. If somebody is too lazy to write a comment, they can use
locks.

One last quick quiz, easier than the earlier ones...
mm/vmscan.c:__remove_mapping has a score of lines documenting exactly
what memory operations are being ordered, and even an example of what
happens if the ordering is not followed. This is a pretty good comment,
if I say so myself. However, it has one deficiency in that it doesn't
explicitly state where the write barrier(s) is (IMO the comments for one
part of an ordering protocol should reference the other parts of the
protocol).

Where are the store barriers, or why are they not required?

2008-10-11 12:06:38

by Nick Piggin

[permalink] [raw]
Subject: Re: RFC: one-bit mutexes (was: Re: [PATCH 2/3] Memory management livelock)

On Monday 06 October 2008 09:11, Mikulas Patocka wrote:
> Hi
>
> I removed the repeated code and create a new bit mutexes. They are
> space-efficient mutexes that consume only one bit. See the next 3 patches.

Pretty reasonable to have.


> If you are concerned about the size of an inode, I can convert other
> mutexes to bit mutexes: i_mutex and inotify_mutex.

I wouldn't worry for now. mutexes can be unlocked much faster than bit
mutexes, especially in the fastpath. And due to slab, it would be
unlikely to actually save any space.


> I could also create
> bit_spinlock (one-bit spinlock that uses test_and_set_bit) and save space
> for address_space->tree_lock, address_space->i_mmap_lock,
> address_space->private_lock, inode->i_lock.

We have that already. It is much much faster to unlock spinlocks than
bit spinlocks in general (if you own the word exclusively, then it's
not, but then you would be less likely to save space), and we can also
do proper FIFO ticket locks with a larger word.


> Look at it and say what you think about the idea of condensing mutexes
> into single bits.

Looks pretty good to me.

2008-10-20 20:14:50

by Mikulas Patocka

[permalink] [raw]
Subject: Re: RFC: one-bit mutexes (was: Re: [PATCH 2/3] Memory management livelock)

> > If you are concerned about the size of an inode, I can convert other
> > mutexes to bit mutexes: i_mutex and inotify_mutex.
>
> I wouldn't worry for now. mutexes can be unlocked much faster than bit
> mutexes, especially in the fastpath. And due to slab, it would be
> unlikely to actually save any space.

Maybe inotify_mutex. You are right that i_mutex is so heavily contended
that slowing it down to save a few words wouldn't be good. Do you know of
any inotify-intensive workload?

> > I could also create
> > bit_spinlock (one-bit spinlock that uses test_and_set_bit) and save space
> > for address_space->tree_lock, address_space->i_mmap_lock,
> > address_space->private_lock, inode->i_lock.
>
> We have that already. It is much much faster to unlock spinlocks than
> bit spinlocks in general (if you own the word exclusively, then it's
> not, but then you would be less likely to save space), and we can also
> do proper FIFO ticket locks with a larger word.

BTW, why do spinlocks on x86(64) have 32 bits and not 8 bits or 16 bits?
Are atomic 32-bit instructions faster?

Can an x86(64) system have more than 256 CPUs?

Mikulas

2008-10-21 01:51:30

by Nick Piggin

[permalink] [raw]
Subject: Re: RFC: one-bit mutexes (was: Re: [PATCH 2/3] Memory management livelock)

On Tuesday 21 October 2008 07:14, Mikulas Patocka wrote:
> > > If you are concerned about the size of an inode, I can convert other
> > > mutexes to bit mutexes: i_mutex and inotify_mutex.
> >
> > I wouldn't worry for now. mutexes can be unlocked much faster than bit
> > mutexes, especially in the fastpath. And due to slab, it would be
> > unlikely to actually save any space.
>
> Maybe inotify_mutex. You are right that i_mutex is so heavily contended
> that slowing it down to save a few words wouldn't be good. Do you know of
> any inotify-intensive workload?

Don't really know, no. I think most desktop environments use it to
some extent, but no idea how much.


> > > I could also create
> > > bit_spinlock (one-bit spinlock that uses test_and_set_bit) and save
> > > space for address_space->tree_lock, address_space->i_mmap_lock,
> > > address_space->private_lock, inode->i_lock.
> >
> > We have that already. It is much much faster to unlock spinlocks than
> > bit spinlocks in general (if you own the word exclusively, then it's
> > not, but then you would be less likely to save space), and we can also
> > do proper FIFO ticket locks with a larger word.
>
> BTW, why do spinlocks on x86(64) have 32 bits and not 8 bits or 16 bits?
> Are atomic 32-bit instructions faster?

In the case of <= 256 CPUs, they could be an unsigned short I think.
Probably it has never been found to be a huge win because they are
often beside other ints or longs. I think I actually booted up the
kernel with 16-bit spinlocks when doing the FIFO locks, but never
sent a patch for it... Don't let me stop you from trying though.
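To illustrate why the lock word size bounds the CPU count: a FIFO ticket
lock packs a "next ticket" counter and a "now serving" counter into the lock
word, so a 16-bit lock leaves only 8 bits per half and thus at most 256
waiters. A hypothetical userspace sketch (not the kernel's implementation,
which lives in the arch spinlock code):

```c
#include <stdatomic.h>
#include <stdint.h>

/* 16-bit ticket lock: two 8-bit halves. With 8-bit tickets, at most
 * 256 CPUs can queue before the counter wraps into a live ticket. */
struct ticket_lock {
        _Atomic uint8_t next;   /* next ticket to hand out */
        _Atomic uint8_t owner;  /* ticket currently being served */
};

static inline void ticket_lock(struct ticket_lock *l)
{
        /* Take a ticket; fetch_add returns our number. */
        uint8_t me = atomic_fetch_add(&l->next, 1);
        /* Spin until our number is called -- waiters are served in
         * strict FIFO order, unlike a plain test-and-set lock. */
        while (atomic_load(&l->owner) != me)
                ; /* busy-wait */
}

static inline void ticket_unlock(struct ticket_lock *l)
{
        /* Call the next ticket; no RMW race since only the owner
         * increments this half. */
        atomic_fetch_add(&l->owner, 1);
}
```

Doubling the word to 32 bits gives 16-bit halves and room for 65536 CPUs,
which is the "larger word" trade-off mentioned earlier in the thread.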


> Can an x86(64) system have more than 256 CPUs?

Well, none that I know of actually exists. SGI is hoping to have
4096-CPU x86 systems, as far as I can tell.