2005-10-19 17:08:19

by Chris Boot

Subject: Reiser4 lockups (no oops)

Hi all,

I've been using Reiser4 on a couple of filesystems to give it a shot,
and although it has been working fine for a while I've noticed that
newer versions of the patch cause lockups on my machine. It all started
when I upgraded to the reiser4-for-2.6.13-1.patch.gz from
reiser4-for-2.6.12-3.patch.gz, which works fine. I've also tested a
vanilla 2.6.14-rc4-mm1 which has the same symptoms.

I don't get any OOPSes or BUGs or anything, not on my screen nor on my
serial console (although I'm not sure I have this working right--I only
seem to get kernel boot messages). Machine replies to pings but I can't
SSH, and the watchdog doesn't kick in (hangcheck or w83627hf_wdt) so I
only notice it's crashed when I wake up in the morning. The crashes seem
most likely to occur in periods of heavy I/O -- large Samba file
transfers or updatedb do the trick.

Anything I can do to try and track down the issue?

Thanks,
Chris

--
Chris Boot
[email protected]
http://www.bootc.net/


2005-10-20 13:17:29

by Jens Axboe

Subject: Re: Reiser4 lockups (no oops)

On Wed, Oct 19 2005, Chris Boot wrote:
> I don't get any OOPSes or BUGs or anything, not on my screen nor on my
> serial console (although I'm not sure I have this working right--I only
> seem to get kernel boot messages). Machine replies to pings but I can't

Easy fix for that is probably to kill klogd on the machine. Test with eg
loading/unloading of loop, that prints a message when it loads.

--
Jens Axboe

2005-10-20 15:34:27

by Chris Boot

Subject: Re: Reiser4 lockups (no oops)

Quoting Jens Axboe <[email protected]>:

> On Wed, Oct 19 2005, Chris Boot wrote:
>> I don't get any OOPSes or BUGs or anything, not on my screen nor on my
>> serial console (although I'm not sure I have this working right--I only
>> seem to get kernel boot messages). Machine replies to pings but I can't
>
> Easy fix for that is probably to kill klogd on the machine. Test with eg
> loading/unloading of loop, that prints a message when it loads.

I'd love to, but the machine is locked solid and won't turn on the display or
switch TTYs or anything. Anyway, I've applied reiser4-fix-livelock.patch from
ftp.namesys.org and so far so good (over night).

I see there's now a reiser4-fix-livelock-2.patch, anybody know the
differences?

Chris

--
Chris Boot
[email protected]
http://www.bootc.net/

2005-10-20 16:20:23

by Jens Axboe

Subject: Re: Reiser4 lockups (no oops)

On Thu, Oct 20 2005, Chris Boot wrote:
> Quoting Jens Axboe <[email protected]>:
>
> >On Wed, Oct 19 2005, Chris Boot wrote:
> >>I don't get any OOPSes or BUGs or anything, not on my screen nor on my
> >>serial console (although I'm not sure I have this working right--I only
> >>seem to get kernel boot messages). Machine replies to pings but I can't
> >
> >Easy fix for that is probably to kill klogd on the machine. Test with eg
> >loading/unloading of loop, that prints a message when it loads.
>
> I'd love to, but the machine is locked solid and won't turn on the
> display or switch TTYs or anything. Anyway, I've applied
> reiser4-fix-livelock.patch from ftp.namesys.org and so far so good
> (over night).

I mean _before_ the crash, to make sure the messages get out :-)

--
Jens Axboe

2005-10-20 16:27:26

by Chris Boot

Subject: Re: Reiser4 lockups (no oops)

Jens Axboe wrote:

>On Thu, Oct 20 2005, Chris Boot wrote:
>>Quoting Jens Axboe <[email protected]>:
>>
>>>On Wed, Oct 19 2005, Chris Boot wrote:
>>>>I don't get any OOPSes or BUGs or anything, not on my screen nor on my
>>>>serial console (although I'm not sure I have this working right--I only
>>>>seem to get kernel boot messages). Machine replies to pings but I can't
>>>
>>>Easy fix for that is probably to kill klogd on the machine. Test with eg
>>>loading/unloading of loop, that prints a message when it loads.
>>
>>I'd love to, but the machine is locked solid and won't turn on the
>>display or switch TTYs or anything. Anyway, I've applied
>>reiser4-fix-livelock.patch from ftp.namesys.org and so far so good
>>(over night).
>
>I mean _before_ the crash, to make sure the messages get out :-)
>
Oh! Hehe, now I get you. However, I'm using metalog for logging, and
modprobe loop doesn't give any output. What's interesting is that serial
console logging dies long before metalog is started, just after my swap
is added in fact. I'm using Gentoo.

Any ideas?

Cheers,
Chris

--
Chris Boot
[email protected]
http://www.bootc.net/

2005-10-20 18:32:11

by Nikita Danilov

Subject: Re: Reiser4 lockups (no oops)

Chris Boot writes:

[...]

> Oh! Hehe, now I get you. However, I'm using metalog for logging, and
> modprobe loop doesn't give any output. What's interesting is that serial
> console logging dies long before metalog is started, just after my swap
> is added in fact. I'm using Gentoo.
>
> Any ideas?

What

cat /proc/sys/kernel/printk

shows after a boot?

>
> Cheers,
> Chris

Nikita.

2005-10-20 18:37:36

by Mattia Dongili

Subject: Re: Reiser4 lockups (no oops)

On Thu, Oct 20, 2005 at 04:34:25PM +0100, Chris Boot wrote:
> Quoting Jens Axboe <[email protected]>:
>
> >On Wed, Oct 19 2005, Chris Boot wrote:
> >>I don't get any OOPSes or BUGs or anything, not on my screen nor on my
> >>serial console (although I'm not sure I have this working right--I only
> >>seem to get kernel boot messages). Machine replies to pings but I can't
> >
> >Easy fix for that is probably to kill klogd on the machine. Test with eg
> >loading/unloading of loop, that prints a message when it loads.
>
> I'd love to, but the machine is locked solid and won't turn on the
> display or switch TTYs or anything. Anyway, I've applied
> reiser4-fix-livelock.patch from ftp.namesys.org and so far so good
> (over night).

Aah, nice! Those also fix the apt-get freeze I've been seeing for the
last few -mm kernels.

I also managed to get a trace of apt-get freezing by means of SysRq+P,
but I stupidly forgot the fs was read-only, so I didn't dump dmesg :P
I can easily reproduce it if anyone is interested.

> I see there's now a reiser4-fix-livelock-2.patch, anybody know the
> differences?

don't know, I'd try the -2 patches also, they seem a different version
of the same fix/cleanup.

--
mattia
:wq!

2005-10-20 18:44:54

by Chris Boot

Subject: Re: Reiser4 lockups (no oops)

Nikita Danilov wrote:

>Chris Boot writes:
>
>[...]
>
> > Oh! Hehe, now I get you. However, I'm using metalog for logging, and
> > modprobe loop doesn't give any output. What's interesting is that serial
> > console logging dies long before metalog is started, just after my swap
> > is added in fact. I'm using Gentoo.
> >
> > Any ideas?
>
>What
>
>cat /proc/sys/kernel/printk
>
>shows after a boot?
>
> >
> > Cheers,
> > Chris
>
>Nikita.
>
>
Hi there,

It shows:

arcadia ~ # cat /proc/sys/kernel/printk
1 4 1 7

Cheers,
Chris

--
Chris Boot
[email protected]
http://www.bootc.net/

2005-10-20 21:02:00

by Nikita Danilov

Subject: Re: Reiser4 lockups (no oops)

Chris Boot writes:
> Nikita Danilov wrote:
>
> >Chris Boot writes:
> >
> >[...]
> >
> > > Oh! Hehe, now I get you. However, I'm using metalog for logging, and
> > > modprobe loop doesn't give any output. What's interesting is that serial
> > > console logging dies long before metalog is started, just after my swap
> > > is added in fact. I'm using Gentoo.
> > >
> > > Any ideas?
> >
> >What
> >
> >cat /proc/sys/kernel/printk
> >
> >shows after a boot?
> >
> > >
> > > Cheers,
> > > Chris
> >
> >Nikita.
> >
> >
> Hi there,
>
> It shows:
>
> arcadia ~ # cat /proc/sys/kernel/printk
> 1 4 1 7

That's why nothing is printed on the console. Try

echo 8 4 4 8 > /proc/sys/kernel/printk

>
> Cheers,
> Chris

Nikita.
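
[Editorial note, not part of the original message.] The four numbers in
/proc/sys/kernel/printk are, in order, the current console loglevel, the
default loglevel for messages without an explicit level, the minimum
console loglevel, and the boot-time default console loglevel. A message
reaches the console only if its level is numerically lower than the
current console loglevel, so a first field of 1 lets through nothing but
KERN_EMERG (level 0). The sketch below illustrates that rule for the two
settings seen in this thread. The helper names (parse_printk,
visible_levels) are made up for the illustration and are not a kernel or
library API.

```python
from typing import List, NamedTuple

# Kernel log levels, index == numeric level (0 is most severe).
KERN_LEVELS = [
    "KERN_EMERG", "KERN_ALERT", "KERN_CRIT", "KERN_ERR",
    "KERN_WARNING", "KERN_NOTICE", "KERN_INFO", "KERN_DEBUG",
]


class PrintkLevels(NamedTuple):
    """The four fields of /proc/sys/kernel/printk, in file order."""
    console_loglevel: int
    default_message_loglevel: int
    minimum_console_loglevel: int
    default_console_loglevel: int


def parse_printk(text: str) -> PrintkLevels:
    """Parse the whitespace-separated contents of /proc/sys/kernel/printk."""
    fields = text.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {text!r}")
    return PrintkLevels(*(int(f) for f in fields))


def visible_levels(levels: PrintkLevels) -> List[str]:
    """Names of the log levels that would be printed on the console.

    A message is shown only when its level is strictly lower than
    console_loglevel.
    """
    return [
        name
        for level, name in enumerate(KERN_LEVELS)
        if level < levels.console_loglevel
    ]


# The setting reported in the thread: only emergencies reach the console.
print(visible_levels(parse_printk("1 4 1 7")))        # ['KERN_EMERG']

# The suggested setting: every level, including KERN_DEBUG, is shown.
print(len(visible_levels(parse_printk("8 4 4 8"))))   # 8
```

On a live system the same check could be run against the real file, for
example parse_printk(open("/proc/sys/kernel/printk").read()).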

2005-10-20 22:06:05

by Chris Boot

Subject: Re: Reiser4 lockups (no oops)

Nikita Danilov wrote:
> Chris Boot writes:
> > Nikita Danilov wrote:
> >
> > >Chris Boot writes:
> > >
> > >[...]
> > >
> > > > Oh! Hehe, now I get you. However, I'm using metalog for logging, and
> > > > modprobe loop doesn't give any output. What's interesting is that serial
> > > > console logging dies long before metalog is started, just after my swap
> > > > is added in fact. I'm using Gentoo.
> > > >
> > > > Any ideas?
> > >
> > >What
> > >
> > >cat /proc/sys/kernel/printk
> > >
> > >shows after a boot?
> > >
> > > >
> > > > Cheers,
> > > > Chris
> > >
> > >Nikita.
> > >
> > >
> > Hi there,
> >
> > It shows:
> >
> > arcadia ~ # cat /proc/sys/kernel/printk
> > 1 4 1 7
>
> That's why nothing is printed on the console. Try
>
> echo 8 4 4 8 > /proc/sys/kernel/printk

Cheers! That did the trick.

Chris

--
Chris Boot
[email protected]
http://www.bootc.net/