LinuxLists.cc - [GIT PULL] percpu fixes for 2.6.32-rc6

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

On Tue, 10 Nov 2009, Tejun Heo wrote:
>
> Please pull from the following percpu fix branch.

No way in hell.

> It fixes a possible deadlock caused by lock ordering inversion through
> irq.

.. and it does so by introducing a new bug. No thank you.

> +
> + /*
> + * pcpu_mem_free() might end up calling vfree() which uses
> + * IRQ-unsafe lock and thus can't be called with pcpu_lock
> + * held. Release and reacquire pcpu_lock if old map needs to
> + * be freed.
> + */
> + if (old) {
> + spin_unlock_irqrestore(&pcpu_lock, *flags);
> + pcpu_mem_free(old, size);
> + spin_lock_irqsave(&pcpu_lock, *flags);
> + }

Routines that drop and then re-take the lock should be banned, as it's
almost always a bug waiting to happen. As it is this time:

> return 0;

Now the caller will happily continue to traverse a list that may no longer
be valid, because you dropped the lock.

Really. This thing is total sh*t. It was misdesigned to start with, and
the calling convention is wrong. That 'pcpu_extend_area_map()' function
should be split up into two functions: 'pcpu_needs_to_extend()' that never
drops the lock, and 'pcpu_extend_area()' that _always_ drops the lock
(and then returns an error if it can't allocate the memory).

Not that shit-for-brains that may or may not drop the lock, and then
returns an incorrect error return depending on whether it did.

In other words: fix the sh*t, don't add even more to it. That 'return 0'
was and is wrong. It should have been a 'return 1'. And thank the Gods
that I looked at it,

Sure, you can fix the bug by just returning 1. But you can't fix the total
crap of a calling convention that way. Fix it properly as outlined above,
and remember: functions that drop locks that were held when called are
EVIL and almost always the source of really subtle races.

As it was in this case.

Linus

2009-11-10 18:34:21

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Hello, Linus.

Linus Torvalds wrote:
>> + if (old) {
>> + spin_unlock_irqrestore(&pcpu_lock, *flags);
>> + pcpu_mem_free(old, size);
>> + spin_lock_irqsave(&pcpu_lock, *flags);
>> + }
>
> Routines that drop and then re-take the lock should be banned, as it's
> almost always a bug waiting to happen. As it is this time:

Yeap, they usually are. In this case, it's a little bit different.
There are two locks - pcpu_alloc_mutex and pcpu_lock.
pcpu_alloc_mutex protects the whole alloc and reclaim paths while
pcpu_lock protects the part used by free path - area maps in chunks
and pcpu_slot array pointing to chunks.

So, under pcpu_alloc_mutex which pcpu_extend_area_map() is called with
and never drops, the chunk never goes away nor can anything be
allocated from it. Dropping pcpu_lock only allows free path to come
inbetween and mark areas in the area map of the chunk and maybe
relocate the chunk to another pcpu_slot according to new free area
size - both operations should be safe. IOW, it's not really dropping
the main lock here.

At first, both alloc and free paths were protected by single mutex.
The original percpu allocator always assumed GFP_KERNEL allocation, so
I thought that should do it. Unfortunately, I was forgetting about
the free path which was allowed to be called from atomic context, so
the lock was split into two so that the free path can be called from
atomic context.

The reason why this somewhat convoluted locking was chosen over making
both alloc and free paths irq-safe was partly because it was easier
but mostly because percpu allocator's dependence on vmalloc allocator.
All vmalloc area related lockings assume not to be called from irq
context which goes back to having to make cross cpu invalidate calls,
which make it difficult to make percpu allocator irqsafe as a whole.
So, the locking impedance matching was done in the percpu free path
and the temporary inner lock droppings in pcpu_extend_are_map() are
the byproducts.

Christoph Lameter has been saying that atomic percpu allocations would
be beneficial for certain callers. Pushing the problem down to the
vmalloc layer where the problem originates and making percpu locking
more usual would be nice too. I'm still not sure whether it would
justify the added complexity (it will probably require putting vmalloc
reclamation path to a workqueue) but that could be the ultimate right
thing to do here.

If I'm missing something, I'm sure you'll hammer it into me.

Thanks.

--
tejun

2009-11-10 18:55:32

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

On Wed, 11 Nov 2009, Tejun Heo wrote:
>
> If I'm missing something, I'm sure you'll hammer it into me.

Here's from the comments on that function:

* RETURNS:
* 0 if noop, 1 if successfully extended, -errno on failure.

and here's from one of the main callers:

list_for_each_entry(chunk, &pcpu_slot[slot], list) {
...
switch (pcpu_extend_area_map(chunk, &flags)) {
case 0:
break;
case 1:
goto restart; /* pcpu_lock dropped, restart */

where that '&pcpu_slot[slot]' list is protected by the pcpu_lock, and
nothing else. At least according to all the _other_ comments in that file.
Including the one that very much tries to _explain_ the locking at the
top, quote:

"The latter is a spinlock and protects the index data structures - chunk
slots, chunks and area maps in chunks."

So as far as I can tell, either the comments are all crap, the whole
restart code is pointless and in fact the whole spin-lock is seemingly
almost entirely pointless to begin with (since pcpu_alloc_mutex is the
only thing that matters), or the code is buggy.

Also, quite frankly, even if the code _isn't_ buggy, it's still wrong to
release a lock that somebody else took. It's a sure-fire way to write
unmaintainable code with bugs almost guaranteed in the future. Yes, it
happens, and sometimes it's the only sane way to do it, but in this case
that really isn't true as far as I can tell.

>From my (admittedly fairly quick) look, my suggested split-up really would
make the code _more_ readable (no need for that subtle "negative, zero or
positive all mean different things" logic), and hopefully avoid the whole
"drop the lock that somebody else took", because we could drop it in the
caller where it was taken.

So it all boils down to: the code is unquestionably ugly and almost
certainly broken. And if it isn't broken, then _all_ the comments are
total crap.

Your choice.

Linus

2009-11-10 19:25:52

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Hello,

Linus Torvalds wrote:
> On Wed, 11 Nov 2009, Tejun Heo wrote:
>> If I'm missing something, I'm sure you'll hammer it into me.
>
> Here's from the comments on that function:
>
> * RETURNS:
> * 0 if noop, 1 if successfully extended, -errno on failure.
>
> and here's from one of the main callers:
>
> list_for_each_entry(chunk, &pcpu_slot[slot], list) {
> ...
> switch (pcpu_extend_area_map(chunk, &flags)) {
> case 0:
> break;
> case 1:
> goto restart; /* pcpu_lock dropped, restart */
>
> where that '&pcpu_slot[slot]' list is protected by the pcpu_lock, and
> nothing else. At least according to all the _other_ comments in that file.
> Including the one that very much tries to _explain_ the locking at the
> top, quote:

Oh, yeah, right. I was too fixated on the part modified by the patch.

> "The latter is a spinlock and protects the index data structures - chunk
> slots, chunks and area maps in chunks."
>
> So as far as I can tell, either the comments are all crap, the whole
> restart code is pointless and in fact the whole spin-lock is seemingly
> almost entirely pointless to begin with (since pcpu_alloc_mutex is the
> only thing that matters), or the code is buggy.

The return value is wrong but it wouldn't lead to oops. There's a
very slight chance that it might end up creating extra chunk when not
necessary - probably why it went unnoticed all this time. The
spin-lock is only to allow free_percpu() to be called from atomic
context, so its usefulness would only be visible if you look at
free_percpu() too.

> Also, quite frankly, even if the code _isn't_ buggy, it's still wrong to
> release a lock that somebody else took. It's a sure-fire way to write
> unmaintainable code with bugs almost guaranteed in the future. Yes, it
> happens, and sometimes it's the only sane way to do it, but in this case
> that really isn't true as far as I can tell.
>
> From my (admittedly fairly quick) look, my suggested split-up really would
> make the code _more_ readable (no need for that subtle "negative, zero or
> positive all mean different things" logic), and hopefully avoid the whole
> "drop the lock that somebody else took", because we could drop it in the
> caller where it was taken.
>
> So it all boils down to: the code is unquestionably ugly and almost
> certainly broken. And if it isn't broken, then _all_ the comments are
> total crap.

Yeap, the return value definitely is broken and the rather ugly
calling convention is remanant from the days when there was only
single mutex protecting the whole thing.

I think this type of function is a bit special in locking requirement
tho. The initial step - checking whether the operation is necessary -
requires lock and the final step - copying over to the new thing and
installing it - also requires the lock, so unless there's one
unnecessary unlock/lock pair, the second function would be called
without lock but return with lock, which probably is safer than
releasing and regrabbing lock in the middle but still not quite
pretty.

In this case, as the second function needs to release to free the old
map, the extra unlock/lock pair is actually necessary. Splitting into
two functions is fine but I think it would be better to fix it first
and then split them in following patches so that it can be bisected if
I screw up while splitting, right?

Thanks.

--
tejun

2009-11-10 19:37:46

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

* Tejun Heo <[email protected]> wrote:

> In this case, as the second function needs to release to free the old
> map, the extra unlock/lock pair is actually necessary. Splitting into
> two functions is fine but I think it would be better to fix it first
> and then split them in following patches so that it can be bisected if
> I screw up while splitting, right?

i think Linus's point was that this hack was the last straw that broke
the camel's back, and that we are better off with a clean solution
straight away. If you send the clean approach i can help test it on a
good number of boxes.

Ingo

2009-11-10 19:44:52

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Tejun Heo wrote:
> The return value is wrong but it wouldn't lead to oops.

Correction: It may lead to oops due to wrong end-of-list condition and
I just remembered that I was thinking about it when I was modifying
the locking and adding the switch block there and then I forgot to
actually update the return value of the extend function. So, yeah,
three way return value sucks big time.

--
tejun

2009-11-10 19:50:52

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Ingo Molnar wrote:
> * Tejun Heo <[email protected]> wrote:
>
>> In this case, as the second function needs to release to free the old
>> map, the extra unlock/lock pair is actually necessary. Splitting into
>> two functions is fine but I think it would be better to fix it first
>> and then split them in following patches so that it can be bisected if
>> I screw up while splitting, right?
>
> i think Linus's point was that this hack was the last straw that broke
> the camel's back, and that we are better off with a clean solution
> straight away. If you send the clean approach i can help test it on a
> good number of boxes.

If you're talking about the locking itself, there really is no clean
solution at this point. Until vmalloc locking is updated, we'll have
to do locking impedance matching in percpu code. I'm still not quite
sure whether we really need to update vmalloc code to be irqsafe.

If you're talking about the three way return value, which I do agree
to be quite ugly, I think it will be a lot safer to have three patches
- one to fix the deadlock, another to fix the return value and the
final one to de-uglify the function, especially as we're pretty late
in the release cycle.

Thanks.

--
tejun

2009-11-10 21:43:28

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

On Wed, 11 Nov 2009, Tejun Heo wrote:
>
> If you're talking about the three way return value, which I do agree
> to be quite ugly, I think it will be a lot safer to have three patches
> - one to fix the deadlock, another to fix the return value and the
> final one to de-uglify the function, especially as we're pretty late
> in the release cycle.

I'm certainly ok with doing it in stages if that is how you want to do it.

That said, I'm not entirely sure it's _worthwhile_, since the "return 1"
case has apparently never ever actually worked. From a bisect standpoint,
what's the difference between seeing

- oh, now that we made it return the documented code and actually re-try
properly when dropping the lock, it turns out that the re-try code was
always buggy and we just hadn't noticed before because it didn't
trigger

or

- oh, now that we rewrote the function to be cleaner and do the lock
dropping and retry more obviously, it turns out that the retry doesn't
actually work and leads to deadlocks.

but I don't care deeply.

I want the cleanup because I think that the code sucks from a "future
proofing" and readability standpoint, but I really don't mind one way or
the other whether you want to finally do that one "return 1" correctly for
one commit, only to then fix it to not do that three-way test of a single
function in the next one.

So whatever works - as long as the end result both looks sane and doesn't
have the bug we clearly have now.

Linus

2009-11-11 03:56:04

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Hello,

Linus Torvalds wrote:
> I want the cleanup because I think that the code sucks from a "future
> proofing" and readability standpoint, but I really don't mind one way or
> the other whether you want to finally do that one "return 1" correctly for
> one commit, only to then fix it to not do that three-way test of a single
> function in the next one.
>
> So whatever works - as long as the end result both looks sane and doesn't
> have the bug we clearly have now.

I was mostly worrying about introducing unrelated bug while changing.
Anyways, one patch it is. I'll route it through tip as suggested by
Ingo in a few hours.

Thanks a lot.

--
tejun

2009-11-11 08:49:17

[permalink] [raw]

Subject: [PATCH percpu#for-linus] percpu: restructure pcpu_extend_area_map() to fix bugs and improve readability

pcpu_extend_area_map() had the following two bugs.

* It should return 1 if pcpu_lock was dropped and reacquired but it
returned 0. This could lead to oops if free_percpu() races with
area map extension.

* pcpu_mem_free() was called under pcpu_lock. pcpu_mem_free() might
end up calling vfree() which isn't IRQ safe. This could lead to
deadlock through lock order inversion via IRQ.

In addition, Linus pointed out that the temporary lock dropping and
subtle three-way return value of pcpu_extend_area_map() was very ugly
and suggested to split the function into two - pcpu_need_to_extend()
and pcpu_extend_area_map().

This patch restructures pcpu_extend_area_map() as suggested and fixes
the two bugs.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
I've also put this patch on percpu#for-linus but routing this through
tip would be safer. Went through usual battery of tests and also
verified both bugs are fixed. Ingo, can you please put this into tip?

Thanks.

mm/percpu.c | 122 +++++++++++++++++++++++++++++++++++++++-------------------
1 files changed, 82 insertions(+), 40 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index d907971..46f037a 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -355,62 +355,86 @@ static struct pcpu_chunk *pcpu_chunk_addr_search(void *addr)
}

/**
- * pcpu_extend_area_map - extend area map for allocation
- * @chunk: target chunk
+ * pcpu_need_to_extend - determine whether chunk area map needs to be extended
+ * @chunk: chunk of interest
*
- * Extend area map of @chunk so that it can accomodate an allocation.
- * A single allocation can split an area into three areas, so this
- * function makes sure that @chunk->map has at least two extra slots.
+ * Determine whether area map of @chunk needs to be extended to
+ * accomodate a new allocation.
*
* CONTEXT:
- * pcpu_alloc_mutex, pcpu_lock. pcpu_lock is released and reacquired
- * if area map is extended.
+ * pcpu_lock.
*
* RETURNS:
- * 0 if noop, 1 if successfully extended, -errno on failure.
+ * New target map allocation length if extension is necessary, 0
+ * otherwise.
*/
-static int pcpu_extend_area_map(struct pcpu_chunk *chunk, unsigned long *flags)
+static int pcpu_need_to_extend(struct pcpu_chunk *chunk)
{
int new_alloc;
- int *new;
- size_t size;

- /* has enough? */
if (chunk->map_alloc >= chunk->map_used + 2)
return 0;

- spin_unlock_irqrestore(&pcpu_lock, *flags);
-
new_alloc = PCPU_DFL_MAP_ALLOC;
while (new_alloc < chunk->map_used + 2)
new_alloc *= 2;

- new = pcpu_mem_alloc(new_alloc * sizeof(new[0]));
- if (!new) {
- spin_lock_irqsave(&pcpu_lock, *flags);
+ return new_alloc;
+}
+
+/**
+ * pcpu_extend_area_map - extend area map of a chunk
+ * @chunk: chunk of interest
+ * @new_alloc: new target allocation length of the area map
+ *
+ * Extend area map of @chunk to have @new_alloc entries.
+ *
+ * CONTEXT:
+ * Does GFP_KERNEL allocation. Grabs and releases pcpu_lock.
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
+static int pcpu_extend_area_map(struct pcpu_chunk *chunk, int new_alloc)
+{
+ int *old = NULL, *new = NULL;
+ size_t old_size = 0, new_size = new_alloc * sizeof(new[0]);
+ unsigned long flags;
+
+ new = pcpu_mem_alloc(new_size);
+ if (!new)
return -ENOMEM;
- }

- /*
- * Acquire pcpu_lock and switch to new area map. Only free
- * could have happened inbetween, so map_used couldn't have
- * grown.
- */
- spin_lock_irqsave(&pcpu_lock, *flags);
- BUG_ON(new_alloc < chunk->map_used + 2);
+ /* acquire pcpu_lock and switch to new area map */
+ spin_lock_irqsave(&pcpu_lock, flags);
+
+ if (new_alloc <= chunk->map_alloc)
+ goto out_unlock;

- size = chunk->map_alloc * sizeof(chunk->map[0]);
- memcpy(new, chunk->map, size);
+ old_size = chunk->map_alloc * sizeof(chunk->map[0]);
+ memcpy(new, chunk->map, old_size);

/*
* map_alloc < PCPU_DFL_MAP_ALLOC indicates that the chunk is
* one of the first chunks and still using static map.
*/
if (chunk->map_alloc >= PCPU_DFL_MAP_ALLOC)
- pcpu_mem_free(chunk->map, size);
+ old = chunk->map;

chunk->map_alloc = new_alloc;
chunk->map = new;
+ new = NULL;
+
+out_unlock:
+ spin_unlock_irqrestore(&pcpu_lock, flags);
+
+ /*
+ * pcpu_mem_free() might end up calling vfree() which uses
+ * IRQ-unsafe lock and thus can't be called under pcpu_lock.
+ */
+ pcpu_mem_free(old, old_size);
+ pcpu_mem_free(new, new_size);
+
return 0;
}

@@ -1049,7 +1073,7 @@ static void *pcpu_alloc(size_t size, size_t align, bool reserved)
static int warn_limit = 10;
struct pcpu_chunk *chunk;
const char *err;
- int slot, off;
+ int slot, off, new_alloc;
unsigned long flags;

if (unlikely(!size || size > PCPU_MIN_UNIT_SIZE || align > PAGE_SIZE)) {
@@ -1064,14 +1088,26 @@ static void *pcpu_alloc(size_t size, size_t align, bool reserved)
/* serve reserved allocations from the reserved chunk if available */
if (reserved && pcpu_reserved_chunk) {
chunk = pcpu_reserved_chunk;
- if (size > chunk->contig_hint ||
- pcpu_extend_area_map(chunk, &flags) < 0) {
- err = "failed to extend area map of reserved chunk";
+
+ if (size > chunk->contig_hint) {
+ err = "alloc from reserved chunk failed";
goto fail_unlock;
}
+
+ while ((new_alloc = pcpu_need_to_extend(chunk))) {
+ spin_unlock_irqrestore(&pcpu_lock, flags);
+ if (pcpu_extend_area_map(chunk, new_alloc) < 0) {
+ err = "failed to extend area map of "
+ "reserved chunk";
+ goto fail_unlock_mutex;
+ }
+ spin_lock_irqsave(&pcpu_lock, flags);
+ }
+
off = pcpu_alloc_area(chunk, size, align);
if (off >= 0)
goto area_found;
+
err = "alloc from reserved chunk failed";
goto fail_unlock;
}
@@ -1083,14 +1119,20 @@ restart:
if (size > chunk->contig_hint)
continue;

- switch (pcpu_extend_area_map(chunk, &flags)) {
- case 0:
- break;
- case 1:
- goto restart; /* pcpu_lock dropped, restart */
- default:
- err = "failed to extend area map";
- goto fail_unlock;
+ new_alloc = pcpu_need_to_extend(chunk);
+ if (new_alloc) {
+ spin_unlock_irqrestore(&pcpu_lock, flags);
+ if (pcpu_extend_area_map(chunk,
+ new_alloc) < 0) {
+ err = "failed to extend area map";
+ goto fail_unlock_mutex;
+ }
+ spin_lock_irqsave(&pcpu_lock, flags);
+ /*
+ * pcpu_lock has been dropped, need to
+ * restart cpu_slot list walking.
+ */
+ goto restart;
}

off = pcpu_alloc_area(chunk, size, align);
--
1.6.4.2

2009-11-11 11:32:19

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

* Tejun Heo <[email protected]> wrote:

> Hello,
>
> Linus Torvalds wrote:
> > I want the cleanup because I think that the code sucks from a "future
> > proofing" and readability standpoint, but I really don't mind one way or
> > the other whether you want to finally do that one "return 1" correctly for
> > one commit, only to then fix it to not do that three-way test of a single
> > function in the next one.
> >
> > So whatever works - as long as the end result both looks sane and doesn't
> > have the bug we clearly have now.
>
> I was mostly worrying about introducing unrelated bug while changing.
> Anyways, one patch it is. I'll route it through tip as suggested by
> Ingo in a few hours.

Btw., i'd suggest you keep it in your percpu tree as usual and send it
to Linus directly - i offered testing for the cleanup patch (and can
pull your patch for such a purpose), it doesnt 'have' to go via -tip.

Ingo

2009-11-11 12:22:02

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Hello, Ingo.

Ingo Molnar wrote:
>> I was mostly worrying about introducing unrelated bug while changing.
>> Anyways, one patch it is. I'll route it through tip as suggested by
>> Ingo in a few hours.
>
> Btw., i'd suggest you keep it in your percpu tree as usual and send it
> to Linus directly - i offered testing for the cleanup patch (and can
> pull your patch for such a purpose), it doesnt 'have' to go via -tip.

Can you please then pull from the following tree for testing? I'll
push it to Linus after a couple of days if nothing explodes.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git for-linus

Thanks.

--
tejun

2009-11-11 19:27:50

[permalink] [raw]

Subject: Re: [PATCH percpu#for-linus] percpu: restructure pcpu_extend_area_map() to fix bugs and improve readability

On Wed, 11 Nov 2009, Tejun Heo wrote:
>
> I've also put this patch on percpu#for-linus but routing this through
> tip would be safer. Went through usual battery of tests and also
> verified both bugs are fixed. Ingo, can you please put this into tip?

Looks good to me. Ack.

Linus

2009-11-11 19:58:01

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

* Tejun Heo <[email protected]> wrote:

> Hello, Ingo.
>
> Ingo Molnar wrote:
> >> I was mostly worrying about introducing unrelated bug while changing.
> >> Anyways, one patch it is. I'll route it through tip as suggested by
> >> Ingo in a few hours.
> >
> > Btw., i'd suggest you keep it in your percpu tree as usual and send it
> > to Linus directly - i offered testing for the cleanup patch (and can
> > pull your patch for such a purpose), it doesnt 'have' to go via -tip.
>
> Can you please then pull from the following tree for testing? I'll
> push it to Linus after a couple of days if nothing explodes.
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git for-linus
>
> Thanks.

Sure - pulled it into tip:master for testing earlier today and after a
few hours of it's looking good so far in x86 runtime tests. I also did
cross-build testing to a dozen non-x86 architectures and it was fine
there too.

btw., there's some 80-cols checkpatch warning artifacts in the commit:

+ if (pcpu_extend_area_map(chunk, new_alloc) < 0) {
+ err = "failed to extend area map of "
+ "reserved chunk";

which suggest that the logic here is perhaps nested a bit too deep. It
could be improved by moving the reserved allocation branch of
pcpu_alloc():

if (reserved && pcpu_reserved_chunk) {

into a helper inline function, something like __pcpu_alloc_reserved().

It's a rare special case anyway. It could be changed to return with the
pcpu_lock always taken, so the above branch would look like this:

if (unlikely(reserved)) {
off = __pcpu_alloc_reserved(&chunk, size, align, &err);
if (off < 0)
goto fail_unlock;
goto area_found;
}

Which is a cleaner flow IMO, and which simplifes pcpu_alloc().

And the error string should be:

err = "failed to extend area map of reserved chunk";

(Ignore the checkpatch complaint.)

What do you think?

Ingo

2009-11-12 10:10:13

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Hello, Ingo.

11/12/2009 04:57 AM, Ingo Molnar wrote:
> Sure - pulled it into tip:master for testing earlier today and after a
> few hours of it's looking good so far in x86 runtime tests. I also did
> cross-build testing to a dozen non-x86 architectures and it was fine
> there too.

Great.

> btw., there's some 80-cols checkpatch warning artifacts in the commit:
>
> + if (pcpu_extend_area_map(chunk, new_alloc) < 0) {
> + err = "failed to extend area map of "
> + "reserved chunk";
>
> which suggest that the logic here is perhaps nested a bit too deep. It
> could be improved by moving the reserved allocation branch of
> pcpu_alloc():

Strange, although the line break isn't the prettiest thing, the only
checkpatch problem I can see is the following.

> scripts/checkpatch.pl 0001-percpu-restructure-pcpu_extend_area_map-to-fix-bugs-.patch
ERROR: trailing whitespace
#80: FILE: mm/percpu.c:382:
+^Ireturn new_alloc;^I$

total: 1 errors, 0 warnings, 179 lines checked

0001-percpu-restructure-pcpu_extend_area_map-to-fix-bugs-.patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

The patch adds a trailing tab. I'll fix that up (I usually catch
these while using quilt but this one didn't go through quilt and I
forgot to run checkpatch).

> if (reserved && pcpu_reserved_chunk) {
>
> into a helper inline function, something like __pcpu_alloc_reserved().
>
> It's a rare special case anyway. It could be changed to return with the
> pcpu_lock always taken, so the above branch would look like this:
>
> if (unlikely(reserved)) {
> off = __pcpu_alloc_reserved(&chunk, size, align, &err);
> if (off < 0)
> goto fail_unlock;
> goto area_found;
> }
>
> Which is a cleaner flow IMO, and which simplifes pcpu_alloc().

Hmmm... The thing is that the nesting isn't that deep there and
breaking string in the middle is something we do quite often. What
checkpatch warning did you see?

Thanks.

--
tejun

2009-11-12 10:37:25

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

* Tejun Heo <[email protected]> wrote:

> Hmmm... The thing is that the nesting isn't that deep there and
> breaking string in the middle is something we do quite often. What
> checkpatch warning did you see?

( i did not run checkpatch over your commit - i just assumed that the
ugliness was a checkpatch artifact. )

Breaking strings mid-sentence is something we try not to do. (If you
know about places that do it 'quite often' then those places need fixing
too.)

The biggest problem with it s that it breaks git grep. If someone sees
this message in the log:

PERCPU: allocation failed, size=1024 align=32, failed to extend area map of reserved chunk

and types the obvious:

git grep 'failed to extend area map of reserved chunk'

the git-grep comes up empty because the string was needlessly broken in
mid-sentence. Which is a confusing result and which causes people to
waste time trying to figure out where the message came from.

The other messages in this function are fine btw, for example:

git grep 'failed to populate'

will come up with the right place.

( There are also other reasons why we dont break strings mid-sentence -
it's also less readable to have it on two lines. )

Ingo

2009-11-12 10:57:16

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Hello, Ingo.

11/12/2009 07:36 PM, Ingo Molnar wrote:
>
> * Tejun Heo <[email protected]> wrote:
>
>> Hmmm... The thing is that the nesting isn't that deep there and
>> breaking string in the middle is something we do quite often. What
>> checkpatch warning did you see?
>
> ( i did not run checkpatch over your commit - i just assumed that the
> ugliness was a checkpatch artifact. )
>
> Breaking strings mid-sentence is something we try not to do. (If you
> know about places that do it 'quite often' then those places need fixing
> too.)

Oh... I do that all the time and I see a lot of them around too.

> the git-grep comes up empty because the string was needlessly broken in
> mid-sentence. Which is a confusing result and which causes people to
> waste time trying to figure out where the message came from.
>
> The other messages in this function are fine btw, for example:
>
> git grep 'failed to populate'
>
> will come up with the right place.

While I agree this is a valid reason, I really don't think we should
be restructuring whole code to accomodate long strings on single line.
I think a better way would be to teach grepping tool to match those
broken lines. It shouldn't be too difficult to put this into ack[1]
and maybe we can have git-ack (that's a bad name for git tho). I'll
ask ack author nicely.

> ( There are also other reasons why we dont break strings mid-sentence -
> it's also less readable to have it on two lines. )

This really depends on personal tastes. When trying to use long
string literals, there are several choices.

1. Use broken strings.

printk("blah blah blah blah "
"blah blah blah blah\n");

2. Push it into new line and unindent it.

printk(
"blah blah blah blah blah blah blah blah\n");

3. Restructure code so that the literal ends up in outer block.

printk("blah blah blah blah blah blah blah blah\n");

I prefer the first choice. The third would be nice if it's trivial to
do but I don't think it should dictate the code structure. The second
one, I don't know. Some people like that and grep will be happy with
it but it just seems very disturbing to my eyes.

Thanks.

--
tejun

[1] http://betterthangrep.com/

2009-11-12 11:07:55

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

* Tejun Heo <[email protected]> wrote:

> > if (reserved && pcpu_reserved_chunk) {
> >
> > into a helper inline function, something like __pcpu_alloc_reserved().
> >
> > It's a rare special case anyway. It could be changed to return with the
> > pcpu_lock always taken, so the above branch would look like this:
> >
> > if (unlikely(reserved)) {
> > off = __pcpu_alloc_reserved(&chunk, size, align, &err);
> > if (off < 0)
> > goto fail_unlock;
> > goto area_found;
> > }
> >
> > Which is a cleaner flow IMO, and which simplifes pcpu_alloc().
>
> Hmmm... The thing is that the nesting isn't that deep there [...]

Well, the pcpu_alloc() function is 115 lines which is a bit long. It
does 2-3 things while a function should try to do one thing.

Putting the reserved allocation into a separate function also makes the
'main' path of logic more visible and obstructed less by rare details.

The indentation i pinpointed is 4 levels deep:

err = "failed to extend area map of "
"reserved chunk";

which is a bit too much IMO - the code starts in the middle of the
screen, there's barely any space to do anything meaningful.

But there's other line wrap artifacts as well further down:

if (pcpu_extend_area_map(chunk,
new_alloc) < 0) {

But ... there's no hard rules here and i've seen functions where 4
levels of indentation were just ok. Anyway, i just gave you my opinion,
and i'm definitely more on the nitpicky side of the code quality
equilibrium, YMMV.

Ingo

2009-11-12 11:25:56

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

* Tejun Heo <[email protected]> wrote:

> This really depends on personal tastes. When trying to use long
> string literals, there are several choices.
>
> 1. Use broken strings.
>
> printk("blah blah blah blah "
> "blah blah blah blah\n");
>
> 2. Push it into new line and unindent it.
>
> printk(
> "blah blah blah blah blah blah blah blah\n");
>
> 3. Restructure code so that the literal ends up in outer block.
>
> printk("blah blah blah blah blah blah blah blah\n");
>
> I prefer the first choice. The third would be nice if it's trivial to
> do but I don't think it should dictate the code structure. The second
> one, I don't know. Some people like that and grep will be happy with
> it but it just seems very disturbing to my eyes.

My preferred choice is:

4. Change "blah blah blah blah blah blah blah blah\n" to "blah\n" -
the user really does not win much from the repetition.

Seriously, if you _ever_ get into a 'hm, the string is too long'
situation, you should ask yourself two fundamental questions:

a) Is the user really interested in this small novel?

b) Is this a side-effect of a bloated function having too deep
indentation?

IMHO, in the specific case at hand, both a) and b) apply:

err = "failed to extend area map of "
"reserved chunk";

the indentation is a bit too deep, and the message is too chatty - the
output itself is:

PERCPU: allocation failed, size=1024 align=32, failed to extend area map of reserved chunk

A better/shorter message would be:

percpu: 1024 bytes allocation failed: could not extend reserved chunk area map

This formulation is a bit shorter and i doubt the align parameter really
needs printing - it's almost always the same and other context will tell
us what it is anyway.

Ingo

2009-11-12 11:28:52

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Hello, Ingo.

11/12/2009 08:07 PM, Ingo Molnar wrote:
> Well, the pcpu_alloc() function is 115 lines which is a bit long. It
> does 2-3 things while a function should try to do one thing.

I agree for low level / utility functions but for top level functions
which direct the flow of the whole logic, I usually prefer to put them
flat. To me, that way things seem less obfuscated.

> Putting the reserved allocation into a separate function also makes the
> 'main' path of logic more visible and obstructed less by rare details.
>
> The indentation i pinpointed is 4 levels deep:
>
> err = "failed to extend area map of "
> "reserved chunk";
>
> which is a bit too much IMO - the code starts in the middle of the
> screen, there's barely any space to do anything meaningful.

Well, all that's there is error exit. Surrounding code segment is,

if (pcpu_extend_area_map(chunk, new_alloc) < 0) {
err = "failed to extend area map of "
"reserved chunk";
goto fail_unlock_mutex;
}

So, we might as well just do

err = "failed to extend area map of reserved chunk";
if (pcpu_extend_area_map(chunk, new_alloc) < 0)
goto fail_unlock_mutex;

> But there's other line wrap artifacts as well further down:
>
> if (pcpu_extend_area_map(chunk,
> new_alloc) < 0) {

This one is uglier and one level deeper than the previous one. The
resulting nesting was one of the reasons why I factored out
pcpu_extend_area_map() as a whole and switched on the return value but
that obfuscated locking. Although it nests quite a bit, I don't think
the loop there is too bad. It shows what it does pretty well.

> But ... there's no hard rules here and i've seen functions where 4
> levels of indentation were just ok. Anyway, i just gave you my opinion,
> and i'm definitely more on the nitpicky side of the code quality
> equilibrium, YMMV.

Indentation and code style are actually something I end up spending
quite some time on and I did think about the second one. Factoring
out without hiding locking is a bit difficult but if I rename
new_alloc to new_len, I can fit that thing onto a single line but that
would probably require renaming matching local variable in
pcpu_extend_area_map() which will end up generating unnecessary amount
of diff obfuscating the real change. At that point, I just thought we
could live with one slightly ugly line break.

So, I don't know. Pros and cons on these things depend too much on
personal tastes (and even mood at the time of writing) to form uniform
standard to follow.

Thanks.

--
tejun

2009-11-12 14:24:07

by Oliver Neukum

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Am Donnerstag, 12. November 2009 11:58:19 schrieb Tejun Heo:

> While I agree this is a valid reason, I really don't think we should
> be restructuring whole code to accomodate long strings on single line.
> I think a better way would be to teach grepping tool to match those
> broken lines. It shouldn't be too difficult to put this into ack[1]
> and maybe we can have git-ack (that's a bad name for git tho). I'll
> ask ack author nicely.

There's a point where following style guidelines turns into a fetish.
Plain simple "grep" should work.

Regards
Oliver

2009-11-12 15:18:24

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

On Thu, 12 Nov 2009, Tejun Heo wrote:
>
> 11/12/2009 07:36 PM, Ingo Molnar wrote:
> >
> > Breaking strings mid-sentence is something we try not to do. (If you
> > know about places that do it 'quite often' then those places need fixing
> > too.)
>
> Oh... I do that all the time and I see a lot of them around too.

I hate them. I do greps for error messages, and it's annoying as hell if
it's hard to find.

'checkpatch' is the major reason for them, but I think we've fixed
checkpath long ago to not warn about long lines if they are due to a long
string.

Strings should basically be broken up only at '\n' characters, so

printk("This is a made-up example.\n"
"Ok like this\n");

is fine, because you can expect to grep for "made-up example", but not
over the newline.

Linus

2009-11-12 15:32:15

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

Hello,

11/13/2009 12:17 AM, Linus Torvalds wrote:
> On Thu, 12 Nov 2009, Tejun Heo wrote:
>> 11/12/2009 07:36 PM, Ingo Molnar wrote:
>>> Breaking strings mid-sentence is something we try not to do. (If you
>>> know about places that do it 'quite often' then those places need fixing
>>> too.)
>>
>> Oh... I do that all the time and I see a lot of them around too.
>
> I hate them. I do greps for error messages, and it's annoying as hell if
> it's hard to find.
>
> 'checkpatch' is the major reason for them, but I think we've fixed
> checkpath long ago to not warn about long lines if they are due to a long
> string.
>
> Strings should basically be broken up only at '\n' characters, so
>
> printk("This is a made-up example.\n"
> "Ok like this\n");
>
> is fine, because you can expect to grep for "made-up example", but not
> over the newline.

If the consensus is to allow long string literals to overrun 80 column
limit, I have no objection. It makes code slightly more difficult to
read for some people but well we can't make everyone happy.

Thanks.

--
tejun

2009-11-12 15:45:16

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

11/13/2009 12:30 AM, Tejun Heo wrote:
>> 'checkpatch' is the major reason for them, but I think we've fixed
>> checkpath long ago to not warn about long lines if they are due to a long
>> string.

It still warns... :-(

WARNING: line over 80 characters
#11: FILE: mm/percpu.c:1100:
+ err = "failed to extend area map of reserved chunk";

--
tejun

2009-11-12 15:53:25

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

On Fri, 13 Nov 2009, Tejun Heo wrote:

> 11/13/2009 12:30 AM, Tejun Heo wrote:
> >> 'checkpatch' is the major reason for them, but I think we've fixed
> >> checkpath long ago to not warn about long lines if they are due to a long
> >> string.
>
> It still warns... :-(
>
> WARNING: line over 80 characters
> #11: FILE: mm/percpu.c:1100:
> + err = "failed to extend area map of reserved chunk";

Ok, just ignore it. The 80-column warning has always been sh*t anyway.

It may be that the "long line" warning is suppressed only for actual
'printk()' cases, but it may also be that it looks at the intendation and
only suppresses it for low levels of indents. I'm allergic to perl, so I'm
not even going to look to try to find out the details.

Linus

2009-11-12 17:04:52

by Andres Baldrich

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

El jue 12-nov-09, Linus Torvalds <[email protected]> escribi?:
> On Thu, 12 Nov 2009, Tejun Heo wrote:
> >
> > 11/12/2009 07:36 PM, Ingo Molnar wrote:
> > >
> > > Breaking strings mid-sentence is something we try
> not to do. (If you
> > > know about places that do it 'quite often' then
> those places need fixing
> > > too.)
> >
> > Oh... I do that all the time and I see a lot of them
> around too.
>
> I hate them. I do greps for error messages, and it's
> annoying as hell if
> it's hard to find.
>
> 'checkpatch' is the major reason for them, but I think
> we've fixed
> checkpath long ago to not warn about long lines if they are
> due to a long
> string.
>
> Strings should basically be broken up only at '\n'
> characters, so
>
> ??? printk("This is a made-up example.\n"
> ??? ??? "Ok like this\n");
>
> is fine, because you can expect to grep for "made-up
> example", but not
> over the newline.
>
> ??? ??? ???
> ??? Linus

(Kernel)/Documentation/CodingStyle
line 83:
"Statements longer than 80 columns will be broken into sensible chunks.
Descendants are always substantially shorter than the parent and are placed
substantially to the right. The same applies to function headers with a long
argument list. Long strings are as well broken into shorter strings. The
only exception to this is where exceeding 80 columns significantly increases
readability and does not hide information.

void fun(int a, int b, int c)
{
if (condition)
printk(KERN_WARNING "Warning this is a long printk with "
"3 parameters a: %u b: %u "
"c: %u \n", a, b, c);
else
next_statement;
}"

Perhaps it should be changed?
I read the list's faq, it asked useful code or "diff" outputs for proposed changes, but i am an *absolute* beginner with < 6 months of college. 'Never' used FOSS before.
I hope it was good imput.
Andr?s Baldrich <[email protected]>

Yahoo! Cocina

Encontra las mejores recetas con Yahoo! Cocina.

http://ar.mujer.yahoo.com/cocina/

2009-11-12 17:19:16

[permalink] [raw]

Subject: Re: [GIT PULL] percpu fixes for 2.6.32-rc6

On Thu, 12 Nov 2009, Andres Baldrich wrote:
>
> (Kernel)/Documentation/CodingStyle
> line 83:

A lot of people have added code to CodingStyle. That doesn't make it
final. For example, that 80-column thing never existed in my original
coding style, for a reason.

I'm really inclined to just remove the stupid thing entirely both from
coding-style and from checkpatch.

80 columns do not matter. What matters is:
- indentation
- complex expressions and statements

and those two issues _together_ means that 80+ columns should be damn
rare, but the 80 columns itself is not at all that important.

Much more important than 80 columns should be the general guideline that a
"terminal window" may be as small as 80x24. But notice how 80 is just a
small part of that limitation - the 24 is as important as the 80. We have
a guideline that functions should fit on a screenful or two, ie we should
generally aim for functions to be <50 lines long.

And the 80-column thing is EXACTLY THE SAME THING. We should remember that
people may read the code using a roughly 80x24 screen size, but the same
way that nobody sane thinks that "24" is some hard limit on number of
lines, why do people suddenly think that "80" is a hard limit on the
number of columns?

Linus

2009-11-12 18:05:13