2005-10-07 13:13:36

by Michael Tokarev

Subject: kernel freeze (not even an OOPS) on remount-ro+umount when using quotas

This is something that has bitten me quite successfully
in the last few days... ;)

To make a long story short:

# mke2fs -j /dev/hda6
# mount -o usrquota /dev/hda6 /mnt
# cp -a /home /mnt # to make some files to work with
# quotacheck -uc /mnt
# quotaon /mnt
# mount -o remount,ro # this is the important step!
# ls -l /mnt /mnt/home # to do "something" (also important)
# umount /mnt

At this time (attempting to umount the read-only filesystem with quotas
enabled), the machine freezes without any messages on the console. No
OOPS, no response, no nothing - until a hard reboot (power cycle).
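
The steps above can be collected into a small script. This is only a sketch and not part of the original report: the names `repro_steps`, `DEV`, `MNT` and `RUN` are made up for illustration, and the remount line spells out the device and mount point. By default it only prints the commands, since really running them wipes the partition and may hang the machine.

```shell
#!/bin/sh
# Sketch of the reproduction sequence. DEV, MNT, RUN and repro_steps are
# illustrative names, not part of the original report.
DEV=${DEV:-/dev/hda6}   # scratch partition; mke2fs destroys its contents
MNT=${MNT:-/mnt}

# Print the command sequence, one command per line.
repro_steps() {
    cat <<EOF
mke2fs -j $DEV
mount -o usrquota $DEV $MNT
cp -a /home $MNT
quotacheck -uc $MNT
quotaon $MNT
mount -o remount,ro $DEV $MNT
ls -l $MNT $MNT/home
umount $MNT
EOF
}

# Dry run by default; set RUN=1 (as root) to really execute the steps.
if [ "${RUN:-0}" = 1 ]; then
    repro_steps | sh -ex
else
    repro_steps
fi
```

Running it with RUN=1 on a throwaway partition should stop at the first failing command; if the bug triggers, the final umount is where the machine is expected to hang.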

This happens on 2.6.11, 2.6.12 and 2.6.13 kernels -- i.e., with the "current"
kernel release.

According to the temperature sensors on my test laptop and the CPU fan
behaviour, the kernel goes into some infinite loop at this point, because
the fan starts spinning a few seconds after the freeze.

This happens with both quota_v1 and quota_v2.

Note that it isn't 100% reproducible - sometimes it umounts
ok, sometimes (rarely) there's no need to do that pre-final
'ls -l', and sometimes more "work" is needed around remount-ro
to trigger the freeze.

The filesystem is ext3.

Any hints on the way to debug the problem?

Thanks.

/mjt


2005-10-07 13:30:18

by Michael Tokarev

Subject: Re: kernel freeze (not even an OOPS) on remount-ro+umount when using quotas

Michael Tokarev wrote:
> This is something that has bitten me quite successfully
> in the last few days... ;)
>
> To make a long story short:
>
> # mke2fs -j /dev/hda6
> # mount -o usrquota /dev/hda6 /mnt
> # cp -a /home /mnt # to make some files to work with
> # quotacheck -uc /mnt
> # quotaon /mnt
> # mount -o remount,ro # this is the important step!

# mount -o remount,ro /dev/hda6 /mnt
of course... ;)

> # ls -l /mnt /mnt/home # to do "something" (also important)
> # umount /mnt
>
> At this time (attempting to umount the read-only filesystem with quotas
> enabled), the machine freezes without any messages on the console. No
[...]

2005-10-07 14:21:39

by Steven Rostedt

Subject: Re: kernel freeze (not even an OOPS) on remount-ro+umount when using quotas


On Fri, 7 Oct 2005, Michael Tokarev wrote:

> This is something that has bitten me quite successfully
> in the last few days... ;)
>
> To make a long story short:
>
> # mke2fs -j /dev/hda6
> # mount -o usrquota /dev/hda6 /mnt
> # cp -a /home /mnt # to make some files to work with
> # quotacheck -uc /mnt
> # quotaon /mnt
> # mount -o remount,ro # this is the important step!
> # ls -l /mnt /mnt/home # to do "something" (also important)
> # umount /mnt
>
> At this time (attempting to umount the read-only filesystem with quotas
> enabled), the machine freezes without any messages on the console. No
> OOPS, no response, no nothing - until a hard reboot (power cycle).
>
> This happens on 2.6.11, 2.6.12 and 2.6.13 kernels -- i.e., with the "current"
> kernel release.
>

I just tried this on 2.6.13.1 and was not able to reproduce your hangup.
Have you tried turning on the nmi watchdog with "nmi_watchdog=2 lapic"?

If this blocks interrupts while it spins, you might be able to see what's
happening. Also if interrupts are not blocked, try out sysrq-t and
friends.
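
Before reproducing, it may be worth checking that these debugging aids are actually active. A minimal sketch, assuming the usual /proc locations; the helper names `has_boot_param` and `sysrq_enabled` are made up for illustration:

```shell
#!/bin/sh
# Sketch: check that the debugging aids are active before reproducing.
# has_boot_param and sysrq_enabled are illustrative helper names.

# Succeed if the kernel command line (file $2) contains a parameter named $1,
# either bare ("lapic") or with a value ("nmi_watchdog=2").
has_boot_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -q -e "^$1\$" -e "^$1="
}

# Succeed if the magic SysRq key is not disabled (sysctl value is non-zero).
sysrq_enabled() {
    [ "$(cat "${1:-/proc/sys/kernel/sysrq}")" != 0 ]
}

for p in nmi_watchdog lapic; do
    has_boot_param "$p" || echo "boot parameter '$p' not set"
done
sysrq_enabled || echo "SysRq is off; try: echo 1 > /proc/sys/kernel/sysrq"
```

Both helpers take an optional file argument so they can be tried against a saved copy of the command line instead of the live system.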

-- Steve

2005-10-07 16:37:50

by Michael Tokarev

Subject: Re: kernel freeze (not even an OOPS) on remount-ro+umount when using quotas

Steven Rostedt wrote:
> On Fri, 7 Oct 2005, Michael Tokarev wrote:
>
>
>>This is something that has bitten me quite successfully
>>in the last few days... ;)
>>
>>To make a long story short:
>>
>> # mke2fs -j /dev/hda6
>> # mount -o usrquota /dev/hda6 /mnt
>> # cp -a /home /mnt # to make some files to work with
>> # quotacheck -uc /mnt
>> # quotaon /mnt

Looks like it's more reproducible when there's some writing
going on at this point - after enabling the quotas and before
remounting it read-only. Maybe there's some unwritten quota
data left in memory at the remount, or something like that...

>> # mount -o remount,ro # this is the important step!
>> # ls -l /mnt /mnt/home # to do "something" (also important)
>> # umount /mnt
>>
>>At this time (attempting to umount the read-only filesystem with quotas
>>enabled), the machine freezes without any messages on the console. No
>>OOPS, no response, no nothing - until a hard reboot (power cycle).
>>
>>This happens on 2.6.11, 2.6.12 and 2.6.13 kernels -- i.e., with the "current"
>>kernel release.
>
> I just tried this on 2.6.13.1 and was not able to reproduce your hangup.

I'm able to reproduce it on almost any of my machines. Tried on several
production machines first ;) And on at least two test machines.
Now I'm at home and my home PC also shows this bug (2.6.13.1 vanilla).

> Have you tried turning on the nmi watchdog with "nmi_watchdog=2 lapic"?

nmi_watchdog makes no visible difference. Lapic is already enabled, at
least on this machine (BTW, the same behaviour happens on SMP and UP
machines, with and without hyperthreading enabled).

> If this blocks interrupts while it spins, you might be able to see what's
> happening. Also if interrupts are not blocked, try out sysrq-t and
> friends.

And hee-hoo, sysrq works! Strange I haven't noticed it before - I think
I tried it on the laptop, maybe I pressed some wrong button...

Now, as I don't have another PC here @home, only this machine and an ADSL
router (small mips-based device which is also running linux), and I will
not have access to any other machine till Monday... I'll try netconsole
to the router. Damn, why does Shift+PgUp not work as it did in 2.4?? :(
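
For reference, netconsole takes its whole setup as one parameter string of the form [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]. A small sketch of putting that string together; the helper name `netconsole_arg` is made up, and every port, address and interface name below is a placeholder, not the real setup:

```shell
#!/bin/sh
# Sketch: build the netconsole= parameter string.
# Format: [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# netconsole_arg is an illustrative helper; all addresses are placeholders.
netconsole_arg() {
    # $1 src-port  $2 src-ip  $3 dev  $4 tgt-port  $5 tgt-ip  $6 tgt-mac
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example: log from 192.168.1.10 (eth0) to UDP port 6666 on 192.168.1.1.
netconsole_arg 6665 192.168.1.10 eth0 6666 192.168.1.1 00:11:22:33:44:55
```

The resulting string can be given on the kernel command line or as an argument when loading the netconsole module. Messages go out over UDP, so nothing guarantees that the receiving side gets all of them.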

/mjt

2005-10-07 19:12:17

by Michael Tokarev

Subject: Re: kernel freeze (not even an OOPS) on remount-ro+umount when using quotas

Michael Tokarev wrote:
> Steven Rostedt wrote:
>
>> On Fri, 7 Oct 2005, Michael Tokarev wrote:
>>
>>
>>> This is something that has bitten me quite successfully
>>> in the last few days... ;)
>>>
>>> To make a long story short:
>>>
>>> # mke2fs -j /dev/hda6
>>> # mount -o usrquota /dev/hda6 /mnt
>>> # cp -a /home /mnt # to make some files to work with
>>> # quotacheck -uc /mnt
>>> # quotaon /mnt
>
> Looks like it's more reproducible when there's some writing
> going on at this point - after enabling the quotas and before
> remounting it read-only. Maybe there's some unwritten quota
> data left in memory at the remount, or something like that...

Yes it is:
# quotaon /mnt; touch /mnt/file; mount -o remount,ro /mnt; umount /mnt
and voila, instant freeze.

>>> # mount -o remount,ro # this is the important step!
>>> # ls -l /mnt /mnt/home # to do "something" (also important)
>>> # umount /mnt
>>>
>>> At this time (attempting to umount the read-only filesystem with quotas
>>> enabled), the machine freezes without any messages on the console. No
>>> OOPS, no response, no nothing - until a hard reboot (power cycle).
[...]
> And hee-hoo, sysrq works! Strange I haven't noticed it before - I think
> I tried it on the laptop, maybe I pressed some wrong button...
>
> Now, as I don't have another PC here @home, only this machine and an ADSL
> router (small mips-based device which is also running linux), and I will
> not have access to any other machine till Monday... I'll try netconsole
> to the router. Damn, why does Shift+PgUp not work as it did in 2.4?? :(

Nope, my ADSL router is too slow to accept printks from netconsole, or
my PC is too fast (which it isn't at all - it's a 900MHz VIA C3 system) --
the sysrq+t output captured by the router (a simple recvfrom()+write(tmpfs) loop)
is *very* incomplete, showing only about 50 lines for all the tasks running...
The device is a 150MHz mips-el, Texas Instruments AR7 (avalanche/sangam) board.

Any suggestions on how to improve the logging? :)

But with the above sequence of commands, it looks like the problem is pretty
easy to reproduce...

/mjt