2002-11-13 23:55:25

by Christoph Hellwig

[permalink] [raw]
Subject: module mess in -CURRENT

Linus, Rusty,

what the hell is going on with the modules code in 2.5-CURRENT?

Rusty's monsterpatch breaks basically everything (and remember we're
in feature freeze!) instead of doing one thing at a time [1], and it
is doing three things that are absolutely separate issues.

We had an almost useable 2.5 and now exactly when we're feature freezing
and people are expected to test it we break everything?

Linus, please backout that patch until we a) have modutils that support
both the new and old code and b) support at least such basic features
as parsing modules.conf and supporting parameters.

Rusty, the next time please submit stuff one feature at a time instead
of a monster patch that is cool but breaks everything but looks cool.

The inkernel loader, generic boot-time option and your - umm - strange
idea of module unload race reduction are absolute separate things.

[1] e.g. kbuild2.5 was rejected due to the must change everything criteria.
not that I actually liked the kbuild2.5 design, but this is exactly the
same thing.


2002-11-14 00:52:33

by Linus Torvalds

[permalink] [raw]
Subject: Re: module mess in -CURRENT


On Thu, 14 Nov 2002, Christoph Hellwig wrote:
>
> Linus, please backout that patch until we a) have modutils that support
> both the new and old code and b) support at least such basic features
> as parsing modules.conf and supporting parameters.

Quite frankly, at this time a backout means that the thing doesn't go in
_at_all_.

It came in before the feature freeze, but I decided that instead of having
a totally hectic time I woul dmerge stuff that I got before the freeze at
my own leisure, but backing it out now would be basically saying it's not
going into 2.6.x. And I think it's worth it.

(There are some other patches I'm still thinking about, notably kprobes
and posix timers, but other than that my plate is fairly empty froma
feature standpoint. And the kexec stuff I want others to test, at least
now it's palatable to me).

People who find the current module situation difficult can just compile in
the stuff they need for now.

Linus

2002-11-14 00:58:33

by Christoph Hellwig

[permalink] [raw]
Subject: Re: module mess in -CURRENT

On Wed, Nov 13, 2002 at 04:59:35PM -0800, Linus Torvalds wrote:
>
> On Thu, 14 Nov 2002, Christoph Hellwig wrote:
> >
> > Linus, please backout that patch until we a) have modutils that support
> > both the new and old code and b) support at least such basic features
> > as parsing modules.conf and supporting parameters.
>
> Quite frankly, at this time a backout means that the thing doesn't go in
> _at_all_.

Probably. Or just the non-intrusive parts.

> It came in before the feature freeze, but I decided that instead of having
> a totally hectic time I woul dmerge stuff that I got before the freeze at
> my own leisure, but backing it out now would be basically saying it's not
> going into 2.6.x. And I think it's worth it.

I don't think it's a must have and absolutely don't think it's worth
breaking about everything at this stage. Please tell me why rusty can't
send a large number of non-intrusive patches that do one thing at a
time just like everyone else?

2002-11-14 01:54:54

by Alan

[permalink] [raw]
Subject: Re: module mess in -CURRENT

On Thu, 2002-11-14 at 00:59, Linus Torvalds wrote:
> People who find the current module situation difficult can just compile in
> the stuff they need for now.

That makes driver debugging almost impossible. It also makes building a
test kernel set for a lot of boxes impractical. The completely broken
unload stuff is going to be a real pig, PCMCIA only works modular and
doesn't work now the unloads are all broken. OTOH the module rewrite has
some nice features and a combo modutils is going to sort some of the
problem out fairly easily.

The biggest need though is documentation so people can actually fix all
the drivers for this stuff.




2002-11-14 02:25:40

by Linus Torvalds

[permalink] [raw]
Subject: Re: module mess in -CURRENT


On 14 Nov 2002, Alan Cox wrote:
>
> That makes driver debugging almost impossible. It also makes building a
> test kernel set for a lot of boxes impractical. The completely broken
> unload stuff is going to be a real pig, PCMCIA only works modular and
> doesn't work now the unloads are all broken.

Yeah, I forgot about the old 16-bit pcmcia crud. Ugh. At least 32-bit
cardbus works fine.

> The biggest need though is documentation so people can actually fix all
> the drivers for this stuff.

I think Al convinced Rusty that most drivers don't need to worry and that
Rusty was a bit over-eager (ie sound, much of char, all of block and fs
should all be handled by upper layers without the races)

I think Rusty has most of the pieces, but he's apparently flying around
the world right now ;)

Linus

2002-11-14 03:19:10

by Rusty Russell

[permalink] [raw]
Subject: Re: module mess in -CURRENT

In message <[email protected]> you write:
> Linus, Rusty,
>
> what the hell is going on with the modules code in 2.5-CURRENT?
>
> Rusty's monsterpatch breaks basically everything (and remember we're
> in feature freeze!) instead of doing one thing at a time [1], and it
> is doing three things that are absolutely separate issues.

<sigh>. I did do it one at a time. I replaced the module loading
code as stage I.

> We had an almost useable 2.5 and now exactly when we're feature freezing
> and people are expected to test it we break everything?
>
> Linus, please backout that patch until we a) have modutils that support
> both the new and old code

It works for me. This is becoming an FAQ, but I thought the code was
obvious: eg. insmod execvp "insmod.old" when it detects an older
kernel, and "make install" moves insmod to insmod.old etc. if
insmod.old doesn't already exist.

> and b) support at least such basic features
> as parsing modules.conf and supporting parameters.

I'm sorry if I'm feeding the patches to Linus too slowly for you, but
you know where to find them, I've posted the URL often enough. It
hasn't even hit a release yet, so I wasn't worried.

I implemented an in-kernel module loader, implemented the linking code
for 6 architectures, re-implemented all the module cruft cleanly on
top of it, and kept it uptodate through all the changes in Linus'
tree.

And you wonder why I held back fleshing out the implementation of
modprobe, until (if) the code was in the kernel?

> The inkernel loader, generic boot-time option and your - umm - strange
> idea of module unload race reduction are absolute separate things.

I rewrote module.c to make it an in-kernel linker so insmod was 20
lines. And you think I should have re-implemented races in the
loading and unloading stages too, so I could remove them in a future
patch?

What an odd concept!
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2002-11-14 03:32:00

by Rusty Russell

[permalink] [raw]
Subject: Re: module mess in -CURRENT

In message <[email protected]> you write:
> On Thu, 2002-11-14 at 00:59, Linus Torvalds wrote:
> > People who find the current module situation difficult can just compile in
> > the stuff they need for now.
>
> That makes driver debugging almost impossible. It also makes building a
> test kernel set for a lot of boxes impractical.

Sorry, I know I've been feeding Linus too slowly, but I've just flown
into Spain and I have a tutorial to deliver. I'm solving it my not
sleeping, but that doesn't scale.

> The completely broken unload stuff is going to be a real pig, PCMCIA
> only works modular and doesn't work now the unloads are all broken.

Agreed, that's what "rmmod --force" is for. Patch in the queue, I
promise. That gives more breathing room for fixing the "marked
unsafe" issue.

> OTOH the module rewrite has some nice features and a combo modutils
> is going to sort some of the problem out fairly easily.

Patches welcome, of course. I didn't do any work on the modutils once
they passed "backwards compat exec works, basic features work".

Trying to test both the entire stack of patches, and just the first
three I was sending to Linus, as every tree came out: well, you can
tell my testing wasn't thorough enough for .47.

> The biggest need though is documentation so people can actually fix all
> the drivers for this stuff.

I posted a document previously (again) and it recieved (valid) harsh
criticism for being opaque. I'll rework it. Basically, its an
expansion of the old try_inc_modcount to be a first-class citizen
(hence called try_module_get()). As previously, should be called
(successfully!) before calling through a function ptr which might be
in a module (ie. most code which exposes a "register_xxx" should use
it). Exceptions if that function cannot sleep (and can't be
preemped).

Why is PCMCIA broken?
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2002-11-14 04:02:28

by Rusty Russell

[permalink] [raw]
Subject: Re: module mess in -CURRENT

In message <[email protected]> you wri
te:
>
> > The biggest need though is documentation so people can actually fix all
> > the drivers for this stuff.
>
> I think Al convinced Rusty that most drivers don't need to worry and that
> Rusty was a bit over-eager (ie sound, much of char, all of block and fs
> should all be handled by upper layers without the races)

Not really. The Kernel Summit and an audit of drivers convinced me
that changing every driver simply wasn't feasible.

*You* convinced me not to break any driver source code: every time you
dropped my patches I went back and implemented another compat macro 8)

Cheers,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2002-11-14 10:12:34

by Andi Kleen

[permalink] [raw]
Subject: Re: module mess in -CURRENT

Linus Torvalds <[email protected]> writes:

> (There are some other patches I'm still thinking about, notably kprobes
> and posix timers, but other than that my plate is fairly empty froma
> feature standpoint. And the kexec stuff I want others to test, at least
> now it's palatable to me).

How about the nanosecond stat stuff? It is needed for reliable make.

If I sent you a patch would you still consider it? It is not that intrusive,
but needs straightforward editing in all file systems.

-Andi

2002-11-14 14:00:33

by Alan

[permalink] [raw]
Subject: Re: module mess in -CURRENT

On Thu, 2002-11-14 at 04:36, Rusty Russell wrote:
>
> Why is PCMCIA broken?

PCMCIA is broken without modules, not broken because of the change (well
it gets rather upset about unload fails but thats a seperate matter)

2002-11-14 17:33:57

by Andi Kleen

[permalink] [raw]
Subject: Re: module mess in -CURRENT

> Owens' kbuild-2.5 handled it a different way and didn't need exact
> timings. That is especially important since nanosecond time accuracy is

You may not believe it, but there are projects other than the kernel
that do use make too.

> impossible if you are handling a collection of machines doing the
> work. NTP is accurate, but not that accurate.

The patch does not actually implement nanosecond resolution, but
jiffies resolution (1ms on 2.5), which is easily reachable with NTP.

-Andi

2002-11-14 17:26:03

by John Alvord

[permalink] [raw]
Subject: Re: module mess in -CURRENT

On 14 Nov 2002, Andi Kleen wrote:

> Linus Torvalds <[email protected]> writes:
>
> > (There are some other patches I'm still thinking about, notably kprobes
> > and posix timers, but other than that my plate is fairly empty froma
> > feature standpoint. And the kexec stuff I want others to test, at least
> > now it's palatable to me).
>
> How about the nanosecond stat stuff? It is needed for reliable make.
>
> If I sent you a patch would you still consider it? It is not that intrusive,
> but needs straightforward editing in all file systems.
>
Owens' kbuild-2.5 handled it a different way and didn't need exact
timings. That is especially important since nanosecond time accuracy is
impossible if you are handling a collection of machines doing the
work. NTP is accurate, but not that accurate.

john

2002-11-14 17:54:51

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: module mess in -CURRENT

On Thu, Nov 14, 2002 at 06:40:49PM +0100, Andi Kleen wrote:
> > Owens' kbuild-2.5 handled it a different way and didn't need exact
> > timings. That is especially important since nanosecond time accuracy is
>
> You may not believe it, but there are projects other than the kernel
> that do use make too.
>
> > impossible if you are handling a collection of machines doing the
> > work. NTP is accurate, but not that accurate.
>
> The patch does not actually implement nanosecond resolution, but
> jiffies resolution (1ms on 2.5), which is easily reachable with NTP.

1msec still leave a reasonable window open IMHO. this problem would need
sequence numbers updated atomically to be solved correctly without
regard to the timing that may vary depending on the hardware IMHO (so
that it could still work on a 1Thz cpu even if it's not a pratical
matter for at least this decade).

Andrea

2002-11-14 18:11:33

by Andi Kleen

[permalink] [raw]
Subject: Re: module mess in -CURRENT

On Thu, Nov 14, 2002 at 07:01:17PM +0100, Andrea Arcangeli wrote:
> On Thu, Nov 14, 2002 at 06:40:49PM +0100, Andi Kleen wrote:
> > > Owens' kbuild-2.5 handled it a different way and didn't need exact
> > > timings. That is especially important since nanosecond time accuracy is
> >
> > You may not believe it, but there are projects other than the kernel
> > that do use make too.
> >
> > > impossible if you are handling a collection of machines doing the
> > > work. NTP is accurate, but not that accurate.
> >
> > The patch does not actually implement nanosecond resolution, but
> > jiffies resolution (1ms on 2.5), which is easily reachable with NTP.
>
> 1msec still leave a reasonable window open IMHO. this problem would need
> sequence numbers updated atomically to be solved correctly without
> regard to the timing that may vary depending on the hardware IMHO (so
> that it could still work on a 1Thz cpu even if it's not a pratical
> matter for at least this decade).
>

The API is ready for nanoseconds so when you feel the need you can
convert it to using do_gettimeofday()

-Andi

2002-11-14 18:48:09

by Roman Zippel

[permalink] [raw]
Subject: Re: module mess in -CURRENT

Hi,

On Wed, 13 Nov 2002, Linus Torvalds wrote:

> (There are some other patches I'm still thinking about, notably kprobes
> and posix timers, but other than that my plate is fairly empty froma
> feature standpoint. And the kexec stuff I want others to test, at least
> now it's palatable to me).

Linus, please also consider LTT. The benefits might not be immediately
visible, but it's great tool to the analyze system behaviour, which can
even be used by non kernel hackers. This means as soon normal users start
testing 2.5, they're not limited to fuzzy error messages, but they can
tell you what actually happens, when the system doesn't "feel" right.

bye, Roman

2002-11-15 00:20:56

by Jamie Lokier

[permalink] [raw]
Subject: Re: module mess in -CURRENT

Andrea Arcangeli wrote:
> 1msec still leave a reasonable window open IMHO. this problem would need
> sequence numbers updated atomically to be solved correctly without
> regard to the timing

I agree 100% - it would be nice to be correct instead of "usually
works".

Once you're talking about nanoseconds, you can have both: each time
you store an mtime, make sure the value is at least 1 nanosecond
greater than the previously stored mtime for any file in the
serialisation domain. If it is not, simply _wait_ for up to a
nanosecond until the value has advanced enough.

-- Jamie

2002-11-15 00:41:47

by Linus Torvalds

[permalink] [raw]
Subject: Re: module mess in -CURRENT


On Fri, 15 Nov 2002, Jamie Lokier wrote:
>
> Once you're talking about nanoseconds, you can have both: each time
> you store an mtime, make sure the value is at least 1 nanosecond
> greater than the previously stored mtime for any file in the
> serialisation domain. If it is not, simply _wait_ for up to a
> nanosecond until the value has advanced enough.

Note that the fields already _are_ nanoseconds, it's just not updated at a
nanosecond rate. We're updating xtime only at a rate of HZ, where the 1 ms
comes from in 2.5.x.

And doing a full "gettimeofday()" sounds just a bit too expensive for the
stuff that needs to just update an inode time field.

But the thing is, we don't actually care about the exact time, we only
care about it being monotonically increasing, so I suspect we could just
have something like

static unsigned long xtime_count;

static struct timespec current_time(void)
{
struct timespec val;
unsigned long extra;

read_lock_irq(&xtime_lock);
val = xtime;
extra = xtime_count;
xtime_count = extra+1;
read_unlock_irq(&xtime_lock);

val.nsec += count;
if (val.tv_nsec > 1000000000) {
val.tv_nsec -= 1000000000;
val.tv_sec ++;
}
return val;
}

and then have the timer clear "xtime_count" every time it updates it.

This obviously doesn't give us nanosecond resolution, but it _does_ give
us "unique" timestamps (assuming a system call takes longer than a
nanosecond, which is likely true for the next decade or so - after that we
can start worrying about whether the users might be outrunning the clock).

Linus

2002-11-15 03:31:34

by Andi Kleen

[permalink] [raw]
Subject: Re: module mess in -CURRENT

> and then have the timer clear "xtime_count" every time it updates it.

Problem is that you cannot easily synchronize such a monotonously increasing
timer in a network. But make needs synchronized times.

-Andi

2002-11-15 19:33:29

by kaih

[permalink] [raw]
Subject: Re: module mess in -CURRENT

[email protected] (Andi Kleen) wrote on 15.11.02 in <[email protected]>:

> > and then have the timer clear "xtime_count" every time it updates it.
>
> Problem is that you cannot easily synchronize such a monotonously increasing
> timer in a network. But make needs synchronized times.

That's really a make problem. It gets much worse when you count in times
going backwards because you restore a file from backup, or whatever.

What I'd really like make to do - but it can't with the current design -
is to note the exact time stamp of each dependency when creating a target,
and when reconsidering that target, finding out if any of those time
stamps have changed in any way (and, while we're at it, probably check the
size as well). *Changed since last time*, not younger than the target.

But of course to do that, you need a persistent repository for those time
stamps - which, I think, kbuild-Owen does.

If you think about the more tricky things to do with make, this is almost
always what you would need to make a solution much easier.

Take network time shift, for example. Once you no longer need a younger-
older relation, that time shift is actually completely irrelevant!

One of these days, when I have lots of time (as if!) ...

MfG Kai