2010-04-01 06:31:17

by H. Peter Anvin

[permalink] [raw]
Subject: Re: start_kernel(): bug: interrupts were enabled early

On 03/31/2010 03:26 PM, Andrew Morton wrote:
>
> Not by adding overhead to every single down_read()/down_write() just to
> fix a once-off startup problem - that's taking laziness way too far.
>

How much overhead is this on non-x86 architectures (keep in mind x86
doesn't use this?)

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.


2010-04-01 06:35:53

by Andrew Morton

[permalink] [raw]
Subject: Re: start_kernel(): bug: interrupts were enabled early

On Wed, 31 Mar 2010 23:26:52 -0700 "H. Peter Anvin" <[email protected]> wrote:

> On 03/31/2010 03:26 PM, Andrew Morton wrote:
> >
> > Not by adding overhead to every single down_read()/down_write() just to
> > fix a once-off startup problem - that's taking laziness way too far.
> >
>
> How much overhead is this on non-x86 architectures (keep in mind x86
> doesn't use this?)
>

Just a few instructions, I guess. But we can do it with zero.

And from a design POV, pretending that down_read()/down_write() can be
called with interrupts disabled is daft - they cannot! Why muck up the
usual code paths with this startup-specific hack?

2010-04-01 06:50:08

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: start_kernel(): bug: interrupts were enabled early

On Wed, 2010-03-31 at 23:33 -0400, Andrew Morton wrote:
> Just a few instructions, I guess. But we can do it with zero.
>
> And from a design POV, pretending that down_read()/down_write() can be
> called with interrupts disabled is daft - they cannot! Why muck up
> the
> usual code paths with this startup-specific hack?

Because we the problem of when interrupts are enabled for the first time
is a nasty one, and having entire layer of things not usable at the
right level of init because somewhere something might do an irq enable
due to calling generic code that down's a semaphore is a PITA.

Seriously, Andrew, I don't see a clean solution... Something -somewhere-
will have to be ugly.

Allocation is a pretty basic service that a lot of stuff expect
especially when booting.

We went through that discussion before when we moved the SLAB init
earlier during boot, because it makes no sense to have tons of code to
have to figure out what allocator to call depending on what phase of the
moon it's called from (especially when said code can also be called
later during boot, say for hotplug reasons).

So we moved sl*b init earlier, thus we ought to be able to also
kmem_cache_alloc() earlier. We -fixed- that problem already afaik.

Cheers,
Ben.

2010-04-01 07:50:47

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: start_kernel(): bug: interrupts were enabled early

On Wed, 2010-03-31 at 23:26 -0700, H. Peter Anvin wrote:
> On 03/31/2010 03:26 PM, Andrew Morton wrote:
> >
> > Not by adding overhead to every single down_read()/down_write() just to
> > fix a once-off startup problem - that's taking laziness way too far.
> >
>
> How much overhead is this on non-x86 architectures (keep in mind x86
> doesn't use this?)

None on powerpc, we use atomic ops and don't disable IRQs.

BTW. The same problem used to happen with mutex debug. Was this ever
fixed ?

Cheers,
Ben.

2010-04-01 16:00:57

by Christoph Lameter

[permalink] [raw]
Subject: Re: start_kernel(): bug: interrupts were enabled early

On Thu, 1 Apr 2010, David Howells wrote:

>
> Can we provide a kmem_cache_create_early()? One that takes no locks and gets
> cleaned up with the other __init stuff?

Sure. We can also check if we are early in boot in kmem_cache_create and
just not take the semaphores (which are useless if we are single
threaded).

2010-04-01 16:21:28

by H. Peter Anvin

[permalink] [raw]
Subject: Re: start_kernel(): bug: interrupts were enabled early

On 03/31/2010 11:48 PM, Benjamin Herrenschmidt wrote:
> On Wed, 2010-03-31 at 23:33 -0400, Andrew Morton wrote:
>> Just a few instructions, I guess. But we can do it with zero.
>>
>> And from a design POV, pretending that down_read()/down_write() can be
>> called with interrupts disabled is daft - they cannot! Why muck up
>> the
>> usual code paths with this startup-specific hack?
>
> Because we the problem of when interrupts are enabled for the first time
> is a nasty one, and having entire layer of things not usable at the
> right level of init because somewhere something might do an irq enable
> due to calling generic code that down's a semaphore is a PITA.
>
> Seriously, Andrew, I don't see a clean solution... Something -somewhere-
> will have to be ugly.
>
> Allocation is a pretty basic service that a lot of stuff expect
> especially when booting.
>
> We went through that discussion before when we moved the SLAB init
> earlier during boot, because it makes no sense to have tons of code to
> have to figure out what allocator to call depending on what phase of the
> moon it's called from (especially when said code can also be called
> later during boot, say for hotplug reasons).
>
> So we moved sl*b init earlier, thus we ought to be able to also
> kmem_cache_alloc() earlier. We -fixed- that problem already afaik.

I would like to point out that initialization is a particular subcase of
a more general rule:

- It is safe to call a semaphore/rwlock down with IRQ disabled *if and
only if* the caller can guarantee non-contention.

Initialization is an obvious subcase, but there might be others.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2010-04-01 16:40:33

by David Howells

[permalink] [raw]
Subject: Re: start_kernel(): bug: interrupts were enabled early


Can we provide a kmem_cache_create_early()? One that takes no locks and gets
cleaned up with the other __init stuff?

David

2010-04-01 23:02:30

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: start_kernel(): bug: interrupts were enabled early

On Thu, 2010-04-01 at 12:06 +0100, David Howells wrote:
> Can we provide a kmem_cache_create_early()? One that takes no locks and gets
> cleaned up with the other __init stuff?

Yuck. I hate having to expose more APIs. Also the problem with that is
means callers have to know. So we need to propagate up all call chains
etc... (ie, radix_tree_init_early(), etc...)

This is pretty much exactly the discussion we had when moving sl*b
early, and back then, the final word from Linus (heh, for once he agreed
with me :-) was that this made no sense.

We can bury logic inside kmem_cache_create() though, it's not -that- a
hot path.

Cheers,
Ben.