2002-08-11 12:39:09

by Michel Eyckmans (MCE)

Subject: 2.5.31: modules don't work at all


After upgrading from 2.5.30 to 2.5.31, nothing related to modules
works for me. Insmod, rmmod, you name it. They all cause errors
along the lines of: "QM_SYMBOLS: Bad Address". Any suggestions?

This is with the very latest modutils (2.4.19). These work fine
with 2.5.30.

MCE


2002-08-12 00:55:45

by ryan.flanigan

Subject: Re: 2.5.31: modules don't work at all

>>>>> "Michel" == Michel Eyckmans (MCE) <[email protected]> writes:

Michel> After upgrading from 2.5.30 to 2.5.31, nothing related to
Michel> modules works for me. Insmod, rmmod, you name it. They all
Michel> cause errors along the line of: "QM_SYMBOLS: Bad
Michel> Address". Any suggestions?

is CONFIG_PREEMPT set?

2002-08-12 01:00:05

by Andrew Rodland

Subject: Re: 2.5.31: modules don't work at all

On Sun, 11 Aug 2002 14:41:36 +0200
"Michel Eyckmans (MCE)" <[email protected]> wrote:

>
> After upgrading from 2.5.30 to 2.5.31, nothing related to modules
> works for me. Insmod, rmmod, you name it. They all cause errors
> along the line of: "QM_SYMBOLS: Bad Address". Any suggestions?
>
> This is with the very latest modutils (2.4.19). These work fine
> with 2.5.30.

Ditto here.
Ryan: Yes, CONFIG_PREEMPT is set.

--hobbs

2002-08-12 01:12:48

by ryan.flanigan

Subject: Re: 2.5.31: modules don't work at all

>>>>> "Andrew" == Andrew Rodland <[email protected]> writes:

Andrew> On Sun, 11 Aug 2002 14:41:36 +0200 "Michel Eyckmans (MCE)"
Andrew> <[email protected]> wrote:

>> After upgrading from 2.5.30 to 2.5.31, nothing related to
>> modules works for me. Insmod, rmmod, you name it. They all
>> cause errors along the line of: "QM_SYMBOLS: Bad Address". Any
>> suggestions?

Andrew> Ditto here. Ryan: Yes, CONFIG_PREEMPT is set.

try "unsetting" it. (same problem on the 2.5.31 kernels where i had it set)

2002-08-12 02:30:08

by Adam J. Richter

Subject: Re: 2.5.31: modules don't work at all

Ryan Flanigan writes:
>>>>>> "Andrew" == Andrew Rodland <[email protected]> writes:
>
> Andrew> On Sun, 11 Aug 2002 14:41:36 +0200 "Michel Eyckmans (MCE)"
> Andrew> <[email protected]> wrote:
>
> >> After upgrading from 2.5.30 to 2.5.31, nothing related to
> >> modules works for me. Insmod, rmmod, you name it. They all
> >> cause errors along the line of: "QM_SYMBOLS: Bad Address". Any
> >> suggestions?
>
> Andrew> Ditto here. Ryan: Yes, CONFIG_PREEMPT is set.
>
>try "unsetting" it. (same problem on the 2.5.31 kernels where i had it set)

I am also experiencing modules not working with CONFIG_PREEMPT
set, and deactivating CONFIG_PREEMPT works around the problem for me too.

Ryan, thanks for suggesting that, as it would have taken me a
long time to narrow it down that far!

It would help avoid duplication of effort if you could indicate
how far along you are with this problem. If you or someone else has nailed
the problem and is preparing a patch, then there is no point in anyone
else trying to duplicate that debugging effort. On the other hand,
if you just noticed CONFIG_PREEMPT was the difference between your
configuration and that of someone else who was running 2.5.31 successfully
and are not actively debugging the problem, then I'll try to poke at it
some more.

I already know that the error that trips insmod occurs
in modules.c, line 831, when qm_symbols gets an error from copy_to_user():

        for (; i < mod->nsyms ; ++i, ++s, vals += 2) {
                len = strlen(s->name)+1;
                if (len > bufsize)
                        goto calc_space_needed;

here------>     if (copy_to_user(strings, s->name, len)
                    || __put_user(s->value, vals+0)
                    || __put_user(space, vals+1))
                        return -EFAULT;

                strings += len;
                bufsize -= len;
                space += len;
        }

The values of strings and s->name are similar in 2.5.30+preempt
(works) and 2.5.31+preempt (does not work). strings is 0x08______, and
s->name is 0xc0______.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

2002-08-12 02:54:18

by Skip Ford

Subject: Re: 2.5.31: modules don't work at all

Adam J. Richter wrote:
>
> I am also experiencing modules not working with CONFIG_PREEMPT
> set, and deactivating CONFIG_PREEMPT works around the problem for me too.
>
[snip]
>
> I already know that the error that trips insmod occurs
> in modules.c, line 831, when qm_symbols gets an error from copy_to_user():
>
> for (; i < mod->nsyms ; ++i, ++s, vals += 2) {
> len = strlen(s->name)+1;
> if (len > bufsize)
> goto calc_space_needed;
>
> here------> if (copy_to_user(strings, s->name, len)
> || __put_user(s->value, vals+0)
> || __put_user(space, vals+1))
> return -EFAULT;
>
> strings += len;
> bufsize -= len;
> space += len;
> }
>
> The values of strings and s->name are similar in 2.5.30+preempt
> (works) and 2.5.31+preempt (does not work). strings is 0x08______, and
> s->name is 0xc0______.

If I back out this change to arch/i386/mm/fault.c then modules
successfully load. I have no idea if backing it out causes other
problems though.

diff -Nru a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
--- a/arch/i386/mm/fault.c Sat Aug 10 18:42:20 2002
+++ b/arch/i386/mm/fault.c Sat Aug 10 18:42:20 2002
@@ -181,10 +181,10 @@
 	info.si_code = SEGV_MAPERR;

 	/*
-	 * If we're in an interrupt or have no user
-	 * context, we must not take the fault..
+	 * If we're in an interrupt, have no user context or are running in an
+	 * atomic region then we must not take the fault..
 	 */
-	if (in_interrupt() || !mm)
+	if (preempt_count() || !mm)
 		goto no_context;

 	down_read(&mm->mmap_sem);


--
Skip

2002-08-12 03:11:06

by ryan.flanigan

Subject: Re: 2.5.31: modules don't work at all

>>>>> "Adam" == Adam J Richter <[email protected]> writes:

Adam> Ryan, thanks for suggesting that, as it would have taken
Adam> me a long time to narrow it down that far!

np.

Adam> It would help avoid duplication of effort if you could
Adam> indicate how far along you are with this problem. If you or

just a "test" build w/ PREEMPT enabled is when i noticed it. it
was the only thing i changed in the .config. so ...
i just wanted to isolate it a bit more before posting.

Adam> someone else has nailed the problem and is preparing a
Adam> patch, then there is no point in anyone else trying to
Adam> duplicate that debugging effort. On the other hand, if you

i have only put in a few hours on the problem thus far, and plan
to continue tonight by looking to the "hold atomic kmaps across
generic_file_read" and "Forward port of get_user_pages() change
from 2.4" patch by Andrew Morton <[email protected]>. that's my best
guess thus far. others might think differently (and they're probably
right).

Adam> just noticed CONFIG_PREEMPT was the difference between your
Adam> configuration and that of someone else who was running
Adam> 2.5.31 successfully and are not actively debugging the
Adam> problem, then I'll try to poke at it some more.

please do. i'm still _slow_ when dealing with these things.

Adam> I already know that the error that trips insmod occurs
Adam> in modules.c, line 831, when qm_symbols gets an error from
Adam> copy_to_user():

agreed and thanks for the info!

2002-08-12 05:23:15

by Andrew Morton

Subject: Re: 2.5.31: modules don't work at all

Skip Ford wrote:
>
> ...
> > I already know that the error that trips insmod occurs
> > in modules.c, line 831, when qm_symbols gets an error from copy_to_user():
> >
> > for (; i < mod->nsyms ; ++i, ++s, vals += 2) {
> > len = strlen(s->name)+1;
> > if (len > bufsize)
> > goto calc_space_needed;
> >
> > here------> if (copy_to_user(strings, s->name, len)
> > || __put_user(s->value, vals+0)
> > || __put_user(space, vals+1))
> > return -EFAULT;
> >
> > strings += len;
> > bufsize -= len;
> > space += len;
> > }
> >
> > The values of strings and s->name are similar in 2.5.30+preempt
> > (works) and 2.5.31+preempt (does not work). strings is 0x08______, and
> > s->name is 0xc0______.
>
> If I back out this change to arch/i386/mm/fault.c then modules
> successfully load. I have no idea if backing it out causes other
> problems though.
>
> diff -Nru a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
> --- a/arch/i386/mm/fault.c Sat Aug 10 18:42:20 2002
> +++ b/arch/i386/mm/fault.c Sat Aug 10 18:42:20 2002
> @@ -181,10 +181,10 @@
> info.si_code = SEGV_MAPERR;
>
> /*
> - * If we're in an interrupt or have no user
> - * context, we must not take the fault..
> + * If we're in an interrupt, have no user context or are running in an
> + * atomic region then we must not take the fault..
> */
> - if (in_interrupt() || !mm)
> + if (preempt_count() || !mm)
> goto no_context;
>
> down_read(&mm->mmap_sem);
>

Yes, that's the problem. qm_symbols() is performing copy_to_user()
inside lock_kernel() and that's an "atomic copy_to_user()" in 2.5.31.
But only if preempt is selected. The copy_to_user() doesn't work.

There's nothing illegal about copy_to_user() inside lock_kernel().
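
The failure mode described above can be modelled in userspace. What
follows is only a rough sketch under stated assumptions: struct toy_task,
toy_lock_kernel(), toy_unlock_kernel() and the two fault_allowed_*()
helpers are hypothetical stand-ins, not kernel code, and interrupt state
is kept in its own field purely to keep the model small.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for per-task kernel state; not the real kernel structures. */
struct toy_task {
    int  preempt_count; /* raised by spin_lock() when CONFIG_PREEMPT=y */
    bool in_irq;        /* true while servicing an interrupt */
    bool has_mm;        /* false for kernel threads with no user context */
};

/* 2.5.30-style test: only interrupt context or a missing mm blocks the fault. */
static bool fault_allowed_2_5_30(const struct toy_task *t)
{
    return !(t->in_irq || !t->has_mm);
}

/* 2.5.31-style test: any non-zero preempt_count also blocks the fault. */
static bool fault_allowed_2_5_31(const struct toy_task *t)
{
    return !(t->in_irq || t->preempt_count != 0 || !t->has_mm);
}

/* lock_kernel() takes a spinlock, which bumps preempt_count under preempt. */
static void toy_lock_kernel(struct toy_task *t, bool config_preempt)
{
    if (config_preempt)
        t->preempt_count++;
}

static void toy_unlock_kernel(struct toy_task *t, bool config_preempt)
{
    if (config_preempt) {
        assert(t->preempt_count > 0);
        t->preempt_count--;
    }
}
```

In this model a task that has called toy_lock_kernel() with preempt
enabled passes the older test but fails the newer one, which matches the
-EFAULT that qm_symbols() hands back to insmod.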

Linus, we can back out the preempt_count() test in there and
perform the atomic copy_*_user via a current->flags bit, or
we can do something else?

2002-08-12 17:16:45

by Linus Torvalds

Subject: Re: 2.5.31: modules don't work at all


On Sun, 11 Aug 2002, Andrew Morton wrote:
>
> Yes, that's the problem. qm_symbols() is performing copy_to_user()
> inside lock_kernel() and that's an "atomic copy_to_user()" in 2.5.31.
> But only if preempt is selected. The copy_to_user() doesn't work.
>
> There's nothing illegal about copy_to_user() inside lock_kernel().
>
> Linus, we can back out the preempt_count() test in there and
> perform the atomic copy_*_user via a current->flags bit, or
> we can do something else?

Since I'm actually hoping that the kernel lock goes away some day, and I
don't want to pollute the stuff that I hope will _not_ go away, I'd prefer
a slightly different approach, namely make kernel_lock() special from a
preempt_count() angle.

In particular, we already "sort" the preemption count bits according to
just how atomic we are, and lock_kernel is certainly "less atomic" than a
spinlock. So the logical thing to do (I think) is to just make that more
explicit, and make lock_kernel use the low bit of preempt_count, and make
regular spinlocks do a "+= 2" instead of a "+= 1".

That way preempt_count() gives you a much better picture of what the state
of this process is (the name "preempt_count" really gives the wrong notion
these days, since it's really much more generic and is already used for
things that have little to do with preemption any more)
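
One way to picture that bit layout is the userspace sketch below. It is
an illustration only, under assumptions: every toy_* name and both
constants are hypothetical, and the real counter lives in per-thread
state rather than in a global.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BKL_BIT   1u /* hypothetical: bit 0 means "holds the kernel lock" */
#define TOY_LOCK_UNIT 2u /* hypothetical: every spinlock adds 2, not 1 */

static unsigned int toy_preempt_count;

static void toy_spin_lock(void)
{
    toy_preempt_count += TOY_LOCK_UNIT;
}

static void toy_spin_unlock(void)
{
    assert(toy_preempt_count >= TOY_LOCK_UNIT);
    toy_preempt_count -= TOY_LOCK_UNIT;
}

static void toy_bkl_enter(void) { toy_preempt_count |= TOY_BKL_BIT; }
static void toy_bkl_exit(void)  { toy_preempt_count &= ~TOY_BKL_BIT; }

/* Any non-zero count means "do not preempt this task". */
static bool toy_preemptible(void)
{
    return toy_preempt_count == 0;
}

/* Only the bits above the BKL bit forbid sleeping, and so taking a fault. */
static bool toy_may_fault(void)
{
    return (toy_preempt_count >> 1) == 0;
}
```

With this split, holding only the kernel lock leaves toy_may_fault()
true while toy_preemptible() is false; taking a spinlock on top makes
both false.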

Robert, mind looking into this?

Linus

2002-08-12 17:38:35

by Andrew Morton

Subject: Re: 2.5.31: modules don't work at all

Linus Torvalds wrote:
>
> On Sun, 11 Aug 2002, Andrew Morton wrote:
> >
> > Yes, that's the problem. qm_symbols() is performing copy_to_user()
> > inside lock_kernel() and that's an "atomic copy_to_user()" in 2.5.31.
> > But only if preempt is selected. The copy_to_user() doesn't work.
> >
> > There's nothing illegal about copy_to_user() inside lock_kernel().
> >
> > Linus, we can back out the preempt_count() test in there and
> > perform the atomic copy_*_user via a current->flags bit, or
> > we can do something else?
>
> Since I'm actually hoping that the kernel lock goes away some day, and I
> don't want to pollute the stuff that I hope will _not_ go away, I'd prefer
> a slightly different approach, namely make kernel_lock() special from a
> preempt_count() angle.
>
> In particular, we already "sort" the preemption count bits according to
> just how atomic we are, and lock_kernel is certainly "less atomic" than a
> spinlock. So the logical thing to do (I think) is to just make that more
> explicit, and make lock_kernel use the low bit of preempt_count, and make
> regular spinlocks do a "+= 2" instead of a "+= 1".

Gets tricky with nested lock_kernels.

We can do

if (preempt_count() - current->lock_depth)

To ignore the bkl contribution to preempt_count.

I think that's even usable in generic code, because all architectures
use lock_depth in the same way.

2002-08-12 20:28:19

by Linus Torvalds

Subject: Re: 2.5.31: modules don't work at all


On Mon, 12 Aug 2002, Andrew Morton wrote:
>
> Gets tricky with nested lock_kernels.

No, lock-kernel already only increments once, at the first lock_kernel. We
have a totally separate counter for the BKL depth, see <asm/smplock.h>
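
As a rough userspace model of that arrangement (a sketch only: the toy_*
names are hypothetical, and the increment merely stands in for the
spin_lock() on the kernel flag):

```c
#include <assert.h>

static int toy_lock_depth = -1; /* -1 means "kernel lock not held" */
static int toy_preempt_count;   /* stand-in for the per-thread counter */

static void toy_lock_kernel(void)
{
    /* Only the outermost acquisition touches the preempt count. */
    if (toy_lock_depth == -1)
        toy_preempt_count++;
    ++toy_lock_depth;
}

static void toy_unlock_kernel(void)
{
    assert(toy_lock_depth >= 0);
    /* Only the outermost release drops the preempt count again. */
    if (--toy_lock_depth < 0)
        toy_preempt_count--;
}
```

Nesting toy_lock_kernel() twice leaves toy_preempt_count at 1 and
toy_lock_depth at 1, so the nesting level is tracked by the depth counter
alone.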

Linus

2002-08-12 23:40:13

by Linus Torvalds

Subject: Re: 2.5.31: modules don't work at all


On Mon, 12 Aug 2002, Andrew Morton wrote:
> Linus Torvalds wrote:
> >
> > On Mon, 12 Aug 2002, Andrew Morton wrote:
> > >
> > > Gets tricky with nested lock_kernels.
> >
> > No, lock-kernel already only increments once, at the first lock_kernel. We
> > have a totally separate counter for the BKL depth, see <asm/smplock.h>
> >
>
> There are eighteen smplock.h's, all different. At least one (SuperH)
> hasn't been converted to preempt.

Note that this should all be trivial, and in fact I think everybody shares
the same smplock.h these days. The x86 smplock.h is 100% C - even though
it's not actually totally visible because we still have some old asm
routines visible that are just #ifdef'ed out.

In fact, I'll clean that up a bit to make it clearer.

> However, soldiering on leads us to some difficulties. You're proposing,
> effectively, that preempt_count gets shifted left one bit and that bit
> zero becomes "has done lock_kernel()".

Actually, on slight introspection I suspect the better answer is to make
the BKL bit somewhere higher up, since BKL is much less interesting than
most spinlocks, and getting increasingly more so. But yes.

Linus

2002-08-12 23:33:06

by Andrew Morton

Subject: Re: 2.5.31: modules don't work at all

Linus Torvalds wrote:
>
> On Mon, 12 Aug 2002, Andrew Morton wrote:
> >
> > Gets tricky with nested lock_kernels.
>
> No, lock-kernel already only increments once, at the first lock_kernel. We
> have a totally separate counter for the BKL depth, see <asm/smplock.h>
>

There are eighteen smplock.h's, all different. At least one (SuperH)
hasn't been converted to preempt. I'd rather like to decouple the kmap
optimisation work from that mini-quagmire. It's very easy to do, with

current->flags |= PF_ATOMIC;
__copy_to_user(...);
current->flags &= ~PF_ATOMIC;

and that all works fine.
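
For illustration, the flag-based variant can be modelled in userspace
roughly as follows. This is a sketch under assumptions, not the kernel
implementation: the value of TOY_PF_ATOMIC, struct toy_task and the
toy_* helpers are all hypothetical, and "page_present" stands in for
whether the copy would have to take a page fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PF_ATOMIC 0x1u /* hypothetical flag value */

struct toy_task {
    unsigned int flags;
};

/* The fault handler's decision: refuse to sleep while the flag is set. */
static bool toy_fault_allowed(const struct toy_task *t)
{
    return !(t->flags & TOY_PF_ATOMIC);
}

/*
 * Returns the number of bytes NOT copied, in the style of copy_to_user().
 * A missing page is faulted in (modelled as success) unless the task is
 * marked atomic, in which case the copy fails instead of sleeping.
 */
static size_t toy_copy(struct toy_task *t, char *dst, const char *src,
                       size_t len, bool page_present)
{
    assert(dst != NULL && src != NULL);
    if (!page_present && !toy_fault_allowed(t))
        return len;
    memcpy(dst, src, len);
    return 0;
}

/* Wrap the copy in the flag, as in the three-line snippet above. */
static size_t toy_atomic_copy(struct toy_task *t, char *dst, const char *src,
                              size_t len, bool page_present)
{
    size_t left;

    t->flags |= TOY_PF_ATOMIC;
    left = toy_copy(t, dst, src, len, page_present);
    t->flags &= ~TOY_PF_ATOMIC;
    return left;
}
```

The point of the model is that only callers who set the flag themselves
see the non-sleeping behaviour; an ordinary copy made under the kernel
lock is unaffected.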

However, soldiering on leads us to some difficulties. You're proposing,
effectively, that preempt_count gets shifted left one bit and that bit
zero becomes "has done lock_kernel()".

So bits 0-31 of preempt_count mean "may not be preempted" and "should
not preempt self". And bits 1-31 of preempt count mean "must not
explicitly call schedule".

Problem is, the semantics of this vary too much between preemptible
and non-preemptible kernels. Because non-preemptible kernels do
not increment preempt_count in spin_lock().

Maybe my penny hasn't dropped yet, but I tend to feel that the
semantics of my "may_schedule()" below are too fragile for it to
be part of the API.

(Untested, uncompiled code)


arch/i386/mm/fault.c | 2 -
include/asm-i386/smplock.h | 20 +++++++++++++-----
include/linux/preempt.h | 49 ++++++++++++++++++++++++++++++++-------------
3 files changed, 51 insertions, 20 deletions

--- 2.5.31/arch/i386/mm/fault.c~fix-faults Mon Aug 12 16:14:21 2002
+++ 2.5.31-akpm/arch/i386/mm/fault.c Mon Aug 12 16:14:21 2002
@@ -192,7 +192,7 @@ asmlinkage void do_page_fault(struct pt_
* If we're in an interrupt, have no user context or are running in an
* atomic region then we must not take the fault..
*/
- if (preempt_count() || !mm)
+ if (!may_schedule() || !mm)
goto no_context;

#ifdef CONFIG_X86_REMOTE_DEBUG
--- 2.5.31/include/linux/preempt.h~fix-faults Mon Aug 12 16:14:21 2002
+++ 2.5.31-akpm/include/linux/preempt.h Mon Aug 12 16:14:21 2002
@@ -5,29 +5,60 @@

#define preempt_count() (current_thread_info()->preempt_count)

+/*
+ * Bit zero of preempt_count means "holds lock_kernel".
+ * So a non-zero value in preempt_count() means "may not be preempted" and a
+ * non-zero value in bits 1-31 means "may not explicitly schedule".
+ */
+
#define inc_preempt_count() \
do { \
- preempt_count()++; \
+ preempt_count() += 2; \
} while (0)

#define dec_preempt_count() \
do { \
- preempt_count()--; \
+ preempt_count() -= 2; \
+} while (0)
+
+#define lock_kernel_enter() \
+do { \
+ preempt_count() |= 1; \
+} while (0)
+
+#define lock_kernel_exit() \
+do { \
+ preempt_count() &= ~1; \
} while (0)

+/*
+ * The semantics of this depend upon CONFIG_PREEMPT.
+ *
+ * With CONFIG_PREEMPT=y, may_schedule() returns false in irq context and
+ * inside spinlocks, and returns true inside lock_kernel().
+ *
+ * With CONFIG_PREEMPT=n, may_schedule() returns false in irq context, returns
+ * true inside spinlocks and returns true inside lock_kernel().
+ *
+ * But may_schedule() will also return false if the task has performed an
+ * explicit inc_preempt_count(), regardless of CONFIG_PREEMPT. Which is really
+ * the only situation in which may_schedule() is useful.
+ */
+#define may_schedule() (!(preempt_count() >> 1))
+
#ifdef CONFIG_PREEMPT

extern void preempt_schedule(void);

#define preempt_disable() \
do { \
- inc_preempt_count(); \
+ preempt_count() += 2; \
barrier(); \
} while (0)

#define preempt_enable_no_resched() \
do { \
- dec_preempt_count(); \
+ preempt_count() -= 2; \
barrier(); \
} while (0)

@@ -44,22 +75,12 @@ do { \
preempt_schedule(); \
} while (0)

-#define inc_preempt_count_non_preempt() do { } while (0)
-#define dec_preempt_count_non_preempt() do { } while (0)
-
#else

#define preempt_disable() do { } while (0)
#define preempt_enable_no_resched() do {} while(0)
#define preempt_enable() do { } while (0)
#define preempt_check_resched() do { } while (0)
-
-/*
- * Sometimes we want to increment the preempt count, but we know that it's
- * already incremented if the kernel is compiled for preemptibility.
- */
-#define inc_preempt_count_non_preempt() inc_preempt_count()
-#define dec_preempt_count_non_preempt() dec_preempt_count()

#endif

--- 2.5.31/include/asm-i386/smplock.h~fix-faults Mon Aug 12 16:14:21 2002
+++ 2.5.31-akpm/include/asm-i386/smplock.h Mon Aug 12 16:15:20 2002
@@ -25,8 +25,10 @@ extern spinlock_t kernel_flag;
*/
#define release_kernel_lock(task) \
do { \
- if (unlikely(task->lock_depth >= 0)) \
+ if (unlikely(task->lock_depth >= 0)) { \
spin_unlock(&kernel_flag); \
+ lock_kernel_exit(); \
+ } \
} while (0)

/*
@@ -34,8 +36,10 @@ do { \
*/
#define reacquire_kernel_lock(task) \
do { \
- if (unlikely(task->lock_depth >= 0)) \
+ if (unlikely(task->lock_depth >= 0)) { \
+ lock_kernel_enter(); \
spin_lock(&kernel_flag); \
+ } \
} while (0)


@@ -49,13 +53,17 @@ do { \
static __inline__ void lock_kernel(void)
{
#ifdef CONFIG_PREEMPT
- if (current->lock_depth == -1)
+ if (current->lock_depth == -1) {
+ lock_kernel_enter();
spin_lock(&kernel_flag);
+ }
++current->lock_depth;
#else
#if 1
- if (!++current->lock_depth)
+ if (!++current->lock_depth) {
+ lock_kernel_enter();
spin_lock(&kernel_flag);
+ }
#else
__asm__ __volatile__(
"incl %1\n\t"
@@ -73,8 +81,10 @@ static __inline__ void unlock_kernel(voi
if (current->lock_depth < 0)
BUG();
#if 1
- if (--current->lock_depth < 0)
+ if (--current->lock_depth < 0) {
spin_unlock(&kernel_flag);
+ lock_kernel_exit();
+ }
#else
__asm__ __volatile__(
"decl %1\n\t"


2002-08-13 00:06:07

by Andrew Rodland

Subject: Re: 2.5.31: modules don't work at all

On Sun, 11 Aug 2002 22:36:50 -0700
Andrew Morton <[email protected]> wrote:
> Skip Ford wrote:
> > If I back out this change to arch/i386/mm/fault.c then modules
> > successfully load. I have no idea if backing it out causes other
> > problems though.
> >
> > diff -Nru a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
> > --- a/arch/i386/mm/fault.c Sat Aug 10 18:42:20 2002
> > +++ b/arch/i386/mm/fault.c Sat Aug 10 18:42:20 2002
> > @@ -181,10 +181,10 @@
> > info.si_code = SEGV_MAPERR;
> >
> > /*
> > - * If we're in an interrupt or have no user
> > - * context, we must not take the fault..
> > + * If we're in an interrupt, have no user context or are running in an
> > + * atomic region then we must not take the fault..
> > */
> > - if (in_interrupt() || !mm)
> > + if (preempt_count() || !mm)
> > goto no_context;
> >
> > down_read(&mm->mmap_sem);
> >
>
> Yes, that's the problem. qm_symbols() is performing copy_to_user()
> inside lock_kernel() and that's an "atomic copy_to_user()" in 2.5.31.
> But only if preempt is selected. The copy_to_user() doesn't work.
>
> There's nothing illegal about copy_to_user() inside lock_kernel().

Does that mean that the above fix is a legal quick-fix and won't cause
things to fall apart, or does it mean that I shouldn't bother until the
next version?

2002-08-13 00:11:38

by Andrew Morton

Subject: Re: 2.5.31: modules don't work at all

Andrew Rodland wrote:
>
> > > - if (in_interrupt() || !mm)
> > > + if (preempt_count() || !mm)
> > > goto no_context;
> > >
> > > down_read(&mm->mmap_sem);
> > >
> >
> > Yes, that's the problem. qm_symbols() is performing copy_to_user()
> > inside lock_kernel() and that's an "atomic copy_to_user()" in 2.5.31.
> > But only if preempt is selected. The copy_to_user() doesn't work.
> >
> > There's nothing illegal about copy_to_user() inside lock_kernel().
>
> Does that mean that the above fix is a legal quick-fix and won't cause
> things to fall apart, or does it mean that I shouldn't bother until the
> next version?

That's a legal quick-fix. Or disable CONFIG_PREEMPT.

2002-08-13 00:19:54

by Skip Ford

Subject: Re: 2.5.31: modules don't work at all

Andrew Morton wrote:
>
> (Untested, uncompiled code)

I can't boot with this patch applied. I use a framebuffer and the
screen goes to graphics mode and then nothing...

If you need me to I can get rid of the graphics and see what happens.

--
Skip

2002-08-13 01:18:22

by Skip Ford

Subject: Re: 2.5.31: modules don't work at all

Andrew Morton wrote:
>
> (Untested, uncompiled code)

More information. With your patch I hit the BUG in kmem_cache_create
at slab.c line 673:

[inside kmem_cache_create]

        /*
         * Sanity checks... these are all serious usage bugs.
         */
        if ((!name) ||
                in_interrupt() ||
                (size < BYTES_PER_WORD) ||
                (size > (1<<MAX_OBJ_ORDER)*PAGE_SIZE) ||
                (dtor && !ctor) ||
                (offset < 0 || offset > size))
                        BUG();
--
Skip

2002-08-20 23:31:16

by Ed Tomlinson

Subject: Re: 2.5.31: modules don't work at all

Andrew Morton wrote:

> Skip Ford wrote:
>>
>> ...
>> > I already know that the error that trips insmod occurs
>> > in modules.c, line 831, when qm_symbols gets an error from
>> > copy_to_user():
>> >
>> > for (; i < mod->nsyms ; ++i, ++s, vals += 2) {
>> > len = strlen(s->name)+1;
>> > if (len > bufsize)
>> > goto calc_space_needed;
>> >
>> > here------> if (copy_to_user(strings, s->name, len)
>> > || __put_user(s->value, vals+0)
>> > || __put_user(space, vals+1))
>> > return -EFAULT;
>> >
>> > strings += len;
>> > bufsize -= len;
>> > space += len;
>> > }
>> >
>> > The values of strings and s->name are similar in 2.5.30+preempt
>> > (works) and 2.5.31+preempt (does not work). strings is 0x08______, and
>> > s->name is 0xc0______.
>>
>> If I back out this change to arch/i386/mm/fault.c then modules
>> successfully load. I have no idea if backing it out causes other
>> problems though.
>>
>> diff -Nru a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
>> --- a/arch/i386/mm/fault.c Sat Aug 10 18:42:20 2002
>> +++ b/arch/i386/mm/fault.c Sat Aug 10 18:42:20 2002
>> @@ -181,10 +181,10 @@
>> info.si_code = SEGV_MAPERR;
>>
>> /*
>> - * If we're in an interrupt or have no user
>> - * context, we must not take the fault..
>> + * If we're in an interrupt, have no user context or are running in an
>> + * atomic region then we must not take the fault..
>> */
>> - if (in_interrupt() || !mm)
>> + if (preempt_count() || !mm)
>> goto no_context;
>>
>> down_read(&mm->mmap_sem);
>>
>
> Yes, that's the problem. qm_symbols() is performing copy_to_user()
> inside lock_kernel() and that's an "atomic copy_to_user()" in 2.5.31.
> But only if preempt is selected. The copy_to_user() doesn't work.
>
> There's nothing illegal about copy_to_user() inside lock_kernel().
>
> Linus, we can back out the preempt_count() test in there and
> perform the atomic copy_*_user via a current->flags bit, or
> we can do something else?

I am still seeing this problem with today's bk current, which includes
the above 'fix'...

Turning off preempt now.

Ed Tomlinson