2004-11-05 13:16:04

by Amit Shah

Subject: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang

Hi Ingo,

I'm trying out the RT preempt patch on a P4 HT machine, I get the following
message:

e1000_xmit_frame+0x0/0x83b [e1000]

I got this message 5 times post-boot, and the system's not responsive anymore.

Here's the .config.

--
Amit Shah
http://amitshah.nav.to/


Attachments:
(No filename) (272.00 B)
.config (54.11 kB)

2004-11-05 13:45:54

by Ingo Molnar

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang


* Amit Shah <[email protected]> wrote:

> Hi Ingo,
>
> I'm trying out the RT preempt patch on a P4 HT machine, I get the following
> message:
>
> e1000_xmit_frame+0x0/0x83b [e1000]

hm, does this happen with -V0.7.13 too? (note that it's against
2.6.10-rc1-mm3, a newer -mm tree.)

Ingo

2004-11-05 14:42:47

by Norberto Bensa

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang

Ingo Molnar wrote:
> * Amit Shah <[email protected]> wrote:
> > I'm trying out the RT preempt patch on a P4 HT machine, I get the
> > following message:
> >
> hm, does this happen with -V0.7.13 too? (note that it's against
> 2.6.10-rc1-mm3, a newer -mm tree.)

But it doesn't -cleanly- apply.

Hunk #2 FAILED at 1545.
1 out of 2 hunks FAILED -- saving rejects to file mm/mmap.c.rej

Regards,
Norberto

2004-11-05 14:59:33

by K.R. Foley

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang

Norberto Bensa wrote:
> Ingo Molnar wrote:
>
>>* Amit Shah <[email protected]> wrote:
>>
>>>I'm trying out the RT preempt patch on a P4 HT machine, I get the
>>>following message:
>>>
>>
>>hm, does this happen with -V0.7.13 too? (note that it's against
>>2.6.10-rc1-mm3, a newer -mm tree.)
>
>
> But it doesn't -cleanly- apply.
>
> Hunk #2 FAILED at 1545.
> 1 out of 2 hunks FAILED -- saving rejects to file mm/mmap.c.rej
>
> Regards,
> Norberto

It looks to me like this fails because it is already in -mm3. Probably
can safely ignore this.

kr

2004-11-05 15:02:08

by Amit Shah

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang

On Friday 05 Nov 2004 20:12, Norberto Bensa wrote:
> Ingo Molnar wrote:
> > * Amit Shah <[email protected]> wrote:
> > > I'm trying out the RT preempt patch on a P4 HT machine, I get the
> > > following message:
> >
> > hm, does this happen with -V0.7.13 too? (note that it's against
> > 2.6.10-rc1-mm3, a newer -mm tree.)
>
> But it doesn't -cleanly- apply.
>
> Hunk #2 FAILED at 1545.
> 1 out of 2 hunks FAILED -- saving rejects to file mm/mmap.c.rej

Just ignore that hunk -- it was apparently a fix Ingo introduced for PML4,
but it has since been fixed elsewhere.

--
Amit Shah
http://amitshah.nav.to/

2004-11-05 15:26:47

by Amit Shah

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang

On Friday 05 Nov 2004 19:16, Ingo Molnar wrote:
> * Amit Shah <[email protected]> wrote:
> > Hi Ingo,
> >
> > I'm trying out the RT preempt patch on a P4 HT machine, I get the
> > following message:
> >
> > e1000_xmit_frame+0x0/0x83b [e1000]
>
> hm, does this happen with -V0.7.13 too? (note that it's against
> 2.6.10-rc1-mm3, a newer -mm tree.)

Okay, doesn't happen with -V0.7.13. Thanks!

>
> Ingo

Amit.

--
Amit Shah
http://amitshah.nav.to/

2004-11-05 18:14:00

by Adam Heath

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang

On Fri, 5 Nov 2004, Ingo Molnar wrote:

>
> * Amit Shah <[email protected]> wrote:
>
> > Hi Ingo,
> >
> > I'm trying out the RT preempt patch on a P4 HT machine, I get the following
> > message:
> >
> > e1000_xmit_frame+0x0/0x83b [e1000]
>
> hm, does this happen with -V0.7.13 too? (note that it's against
> 2.6.10-rc1-mm3, a newer -mm tree.)

adam@gradall:/home.local/adam/kernel/gradall/rt/tmp$ tar xf ../linux-2.6.10-rc1.tar
adam@gradall:/home.local/adam/kernel/gradall/rt/tmp$ mv linux-2.6.10-rc1/ linux-2.6.10-rc1-mm3-RT-V0.7.13
adam@gradall:/home.local/adam/kernel/gradall/rt/tmp$ cd linux-2.6.10-rc1-mm3-RT-V0.7.13/
adam@gradall:/home.local/adam/kernel/gradall/rt/tmp/linux-2.6.10-rc1-mm3-RT-V0.7.13$ patch -p1 < ../../2.6.10-rc1-mm3 >> ../patch.log 2>&1
adam@gradall:/home.local/adam/kernel/gradall/rt/tmp/linux-2.6.10-rc1-mm3-RT-V0.7.13$ patch -p1 --dry-run < ../../realtime-preempt-2.6.10-rc1-mm3-V0.7.13 >> ../patch.log 2>&1
adam@gradall:/home.local/adam/kernel/gradall/rt/tmp/linux-2.6.10-rc1-mm3-RT-V0.7.13$ grep FAILED ../patch.log
Hunk #2 FAILED at 1545.
1 out of 2 hunks FAILED -- saving rejects to file mm/mmap.c.rej
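
One way to check whether a rejected change is already present in a tree is a
reverse dry run: if the patch would apply cleanly in reverse, its changes are
already there. What follows is only a minimal sketch, assuming GNU patch; the
file demo.c and the patch fix.diff are made-up stand-ins, not the real -mm3 or
realtime-preempt patches, and the check covers a whole patch rather than a
single hunk.

```shell
#!/bin/sh
# Sketch: decide whether a patch is already applied by trying to reverse it.
# demo.c and fix.diff are hypothetical stand-ins for the real tree and patch.
set -e
work=$(mktemp -d)
cd "$work"
mkdir tree
printf 'line one\nline two\n' > tree/demo.c

# A tiny unified diff that changes "line two" to "line 2".
cat > fix.diff <<'EOF'
--- a/demo.c
+++ b/demo.c
@@ -1,2 +1,2 @@
 line one
-line two
+line 2
EOF

cd tree
# First application: this is the state "the fix is already in the tree".
patch -p1 < ../fix.diff > /dev/null

# If the reverse of the patch applies cleanly (dry run, nothing is modified),
# the change is already present and a forward reject can be expected.
if patch -p1 -R --dry-run < ../fix.diff > /dev/null 2>&1; then
    result="already applied"
else
    result="not applied"
fi
echo "$result"
```

Because --dry-run leaves the files untouched, the check can be repeated
safely before deciding whether to drop a hunk.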

2004-11-05 19:18:50

by Adam Heath

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang

On Fri, 5 Nov 2004, Adam Heath wrote:

> On Fri, 5 Nov 2004, Ingo Molnar wrote:
>
> >
> > * Amit Shah <[email protected]> wrote:
> >
> > > Hi Ingo,
> > >
> > > I'm trying out the RT preempt patch on a P4 HT machine, I get the following
> > > message:
> > >
> > > e1000_xmit_frame+0x0/0x83b [e1000]
> >
> > hm, does this happen with -V0.7.13 too? (note that it's against
> > 2.6.10-rc1-mm3, a newer -mm tree.)
>
> adam@gradall:/home.local/adam/kernel/gradall/rt/tmp$ tar xf ../linux-2.6.10-rc1.tar
> adam@gradall:/home.local/adam/kernel/gradall/rt/tmp$ mv linux-2.6.10-rc1/ linux-2.6.10-rc1-mm3-RT-V0.7.13
> adam@gradall:/home.local/adam/kernel/gradall/rt/tmp$ cd linux-2.6.10-rc1-mm3-RT-V0.7.13/
> adam@gradall:/home.local/adam/kernel/gradall/rt/tmp/linux-2.6.10-rc1-mm3-RT-V0.7.13$ patch -p1 < ../../2.6.10-rc1-mm3 >> ../patch.log 2>&1
> adam@gradall:/home.local/adam/kernel/gradall/rt/tmp/linux-2.6.10-rc1-mm3-RT-V0.7.13$ patch -p1 --dry-run < ../../realtime-preempt-2.6.10-rc1-mm3-V0.7.13 >> ../patch.log 2>&1
> adam@gradall:/home.local/adam/kernel/gradall/rt/tmp/linux-2.6.10-rc1-mm3-RT-V0.7.13$ grep FAILED ../patch.log
> Hunk #2 FAILED at 1545.
> 1 out of 2 hunks FAILED -- saving rejects to file mm/mmap.c.rej

After removing that hunk (it appears to already be in mm3) and rebooting, I
get a BUG: kjournald held a lock at exit. I'll reboot again later and write
it down (need to do work stuff right now).

In the meantime, my config is attached.

PS: For all those using Debian, there is now a patch in kernel-package that
allows for uppercase version strings. Read the changelog (patch courtesy
of yours truly).


Attachments:
config-2.6.10-rc1-mm3-rt-v0.7.13 (32.19 kB)

2004-11-06 07:23:15

by Ingo Molnar

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang


* Adam Heath <[email protected]> wrote:

> Hunk #2 FAILED at 1545.
> 1 out of 2 hunks FAILED -- saving rejects to file mm/mmap.c.rej

ok, i fixed this in -V0.7.14. (but you can safely ignore the reject as
well.)

Ingo

2004-11-06 08:46:19

by Amit Shah

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang

Hi Ingo,

On Friday 05 Nov 2004 19:16, Ingo Molnar wrote:
> * Amit Shah <[email protected]> wrote:
> > Hi Ingo,
> >
> > I'm trying out the RT preempt patch on a P4 HT machine, I get the
> > following message:
> >
> > e1000_xmit_frame+0x0/0x83b [e1000]
>
> hm, does this happen with -V0.7.13 too? (note that it's against
> 2.6.10-rc1-mm3, a newer -mm tree.)

I had left the machine running overnight; I got a few BUGs and some spinlock
hold counts.

The message mentioned above about the e1000 xmit frame also keeps appearing,
but does not result in hangs.

I've uploaded the /var/log/messages file to

http://amitshah.nav.to/kernel/messages-rt-0.7.13.txt

Please take a look.

>
> Ingo

Amit.
--
Amit Shah
http://amitshah.nav.to/

2004-11-06 12:06:22

by Ingo Molnar

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang


* Amit Shah <[email protected]> wrote:

> I had left the machine running overnight; I got a few BUGs and some
> spinlock hold counts.
>
> The message mentioned above about the e1000 xmit frame also keeps
> appearing, but does not result in hangs.
>
> I've uploaded the /var/log/messages file to
>
> http://amitshah.nav.to/kernel/messages-rt-0.7.13.txt
>
> Please take a look.

found the bug(s), the e1000 driver disabled interrupts on
PREEMPT_REALTIME too, and the debug-message printout had a bug as well.
Found a similar problem in the tg3 driver too. Could you check out
-V0.7.15 that i've just uploaded to:

http://redhat.com/~mingo/realtime-preempt/

does this work any better? [you'll still get the e100 message but that
is harmless.]

Ingo

2004-11-06 15:26:47

by Amit Shah

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang

On Saturday 06 Nov 2004 17:35, Ingo Molnar wrote:
> found the bug(s), the e1000 driver disabled interrupts on
> PREEMPT_REALTIME too, and the debug-message printout had a bug as well.
> Found a similar problem in the tg3 driver too. Could you check out
> -V0.7.15 that i've just uploaded to:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> does this work any better? [you'll still get the e100 message but that
> is harmless.]

Yes; solved. (I'm using -18). Thanks.

However, I get this:

IRQ#18 thread RT prio: 46.
ifconfig/4106: BUG in enable_irq at kernel/irq/manage.c:112
[<c0107fc0>] dump_stack+0x1e/0x22 (20)
[<c0142a9b>] enable_irq+0xeb/0xf0 (52)
[<f892b290>] e100_up+0x111/0x20c [e100] (48)
[<f892c4d7>] e100_open+0x2c/0x71 [e100] (32)
[<c027747d>] dev_open+0x78/0x86 (28)
[<c0278aed>] dev_change_flags+0x56/0x127 (36)
[<c02b4538>] devinet_ioctl+0x242/0x5bc (104)
[<c02b66aa>] inet_ioctl+0x5a/0x9a (28)
[<c026e2a4>] sock_ioctl+0xbb/0x21f (32)
[<c01777bf>] sys_ioctl+0xdc/0x241 (44)
[<c0107153>] syscall_call+0x7/0xb (-8124)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c02c81b8>] .... _raw_spin_lock_irqsave+0x1d/0x7a
.....[<c01429d8>] .. ( <= enable_irq+0x28/0xf0)
.. [<c013b8c8>] .... print_traces+0x18/0x4c
.....[<c0107fc0>] .. ( <= dump_stack+0x1e/0x22)

IRQ#17 thread RT prio: 45.
IRQ#4 thread RT prio: 44.
IRQ#3 thread RT prio: 43.
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
IRQ#8 thread RT prio: 42.

But as you say, this should be harmless.

>
> Ingo

Amit.
--
Amit Shah
http://amitshah.nav.to/

2004-11-06 15:29:10

by Ingo Molnar

Subject: Re: RT-preempt-2.6.10-rc1-mm2-V0.7.11 hang


* Amit Shah <[email protected]> wrote:

> IRQ#18 thread RT prio: 46.
> ifconfig/4106: BUG in enable_irq at kernel/irq/manage.c:112
> [<c0107fc0>] dump_stack+0x1e/0x22 (20)
> [<c0142a9b>] enable_irq+0xeb/0xf0 (52)
> [<f892b290>] e100_up+0x111/0x20c [e100] (48)
> [<f892c4d7>] e100_open+0x2c/0x71 [e100] (32)

> But as you say, this should be harmless.

correct. It also happens with the vanilla -mm kernel.

Ingo