2003-03-08 18:54:32

by walt

Subject: 2.4.21-pre5-ac2: kernel oops with "swapoff -a"

Hi Alan,

When I do "swapoff -a" I still see the kernel oops that began with -pre4-ac7
and has propagated to every 'ac' kernel since then.

It's a "null-pointer dereference" oops which does not crash the system --
I still don't understand how that is possible.

Am I really the only person having this problem? The oops is 100% reproducible
so it's hard to believe no one else is seeing it. It happens on all three of
the machines I try it on, so it doesn't seem to be hardware-specific.

Plain 2.4.21-pre5 does NOT show this problem, so it seems to be a patch that
was specifically introduced in -pre4-ac7 and I don't know enough to narrow
it any further than that. I'm not an accomplished kernel debugger so I
can't offer much more info than that, but I'd like to help if you can give
me some hints what kind of information you might need to find the problem.


2003-03-08 21:23:15

by Alan

Subject: Re: 2.4.21-pre5-ac2: kernel oops with "swapoff -a"

On Sat, 2003-03-08 at 11:07, walt wrote:
> When I do "swapoff -a" I still see the kernel oops that began with -pre4-ac7
> and has propagated to every 'ac' kernel since then.

Yes. There is a nasty bug in the original 2.4 code (and maybe 2.5).
There is a fix in the -ac tree but the fix has a different bug it seems.

> Plain 2.4.21-pre5 does NOT show this problem, so it seems to be a patch that
> was specifically introduced in -pre4-ac7 and I don't know enough to narrow
> it any further than that. I'm not an accomplished kernel debugger so I
> can't offer much more info than that, but I'd like to help if you can give
> me some hints what kind of information you might need to find the problem.

The patch is staying in -ac until I find out why you hit it. I've had no
other reports so far, but it may just be the way your system is calling it.

Can you send me an strace swapoff -a ?

2003-03-08 21:37:06

by John Bradford

Subject: Re: 2.4.21-pre5-ac2: kernel oops with "swapoff -a"

> > Plain 2.4.21-pre5 does NOT show this problem, so it seems to be a
> > patch that was specifically introduced in -pre4-ac7 and I don't
> > know enough to narrow it any further than that. I'm not an
> > accomplished kernel debugger so I can't offer much more info than
> > that, but I'd like to help if you can give me some hints what kind
> > of information you might need to find the problem.
>
> The patch is staying in -ac until I find out why you hit it. I've had no
> other reports so far, but it may just be the way your system is calling it.

Just a thought - maybe he is using type 0 swap space? That could
explain the lack of other reports...

John.

2003-03-09 00:38:54

by walt

Subject: Re: 2.4.21-pre5-ac2: kernel oops with "swapoff -a"

Alan Cox wrote:
> On Sat, 2003-03-08 at 11:07, walt wrote:
>
>>When I do "swapoff -a" I still see the kernel oops that began with -pre4-ac7
>>and has propagated to every 'ac' kernel since then.

> Can you send me an strace swapoff -a ?

# strace swapon -a
execve("/sbin/swapon", ["swapon", "-a"], [/* 36 vars */]) = 0
uname({sys="Linux", node="k9.localnet", ...}) = 0
brk(0) = 0x804aa08
open("/etc/ld.so.preload", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
close(3) = 0
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=49489, ...}) = 0
mmap2(NULL, 49489, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40014000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0_\1\000"..., 1024) = 1024
fstat64(3, {st_mode=S_IFREG|0755, st_size=1411261, ...}) = 0
mmap2(NULL, 1236740, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40021000
mprotect(0x40146000, 36612, PROT_NONE) = 0
mmap2(0x40146000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x124) = 0x40146000
mmap2(0x4014b000, 16132, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4014b000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4014f000
munmap(0x40014000, 49489) = 0
brk(0) = 0x804aa08
brk(0x804ba08) = 0x804ba08
brk(0x804c000) = 0x804c000
open("/proc/swaps", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000
read(3, "Filename\t\t\tType\t\tSize\tUsed\tPrior"..., 1024) = 36
read(3, "", 1024) = 0
close(3) = 0
munmap(0x40014000, 4096) = 0
open("/etc/fstab", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=1205, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000
read(3, "# Copyright 1999-2002 Gentoo Tec"..., 4096) = 1205
stat64("/dev/hda10", {st_mode=S_IFBLK|0600, st_rdev=makedev(3, 10), ...}) = 0
swapon("/dev/hda10") = 0
read(3, "", 4096) = 0
_exit(0) = ?

========================================================================

# strace swapoff -a
execve("/sbin/swapoff", ["swapoff", "-a"], [/* 36 vars */]) = 0
uname({sys="Linux", node="k9.localnet", ...}) = 0
brk(0) = 0x804aa08
open("/etc/ld.so.preload", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
close(3) = 0
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=49489, ...}) = 0
mmap2(NULL, 49489, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40014000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0_\1\000"..., 1024) = 1024
fstat64(3, {st_mode=S_IFREG|0755, st_size=1411261, ...}) = 0
mmap2(NULL, 1236740, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40021000
mprotect(0x40146000, 36612, PROT_NONE) = 0
mmap2(0x40146000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x124) = 0x40146000
mmap2(0x4014b000, 16132, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4014b000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4014f000
munmap(0x40014000, 49489) = 0
brk(0) = 0x804aa08
brk(0x804ba08) = 0x804ba08
brk(0x804c000) = 0x804c000
open("/proc/swaps", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000
read(3, "Filename\t\t\tType\t\tSize\tUsed\tPrior"..., 1024) = 99
read(3, "", 1024) = 0
close(3) = 0
munmap(0x40014000, 4096) = 0
swapoff("/dev/ide/host0/bus0/target0/lun0/part10") = 0
open("/etc/fstab", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=1205, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000
read(3, "# Copyright 1999-2002 Gentoo Tec"..., 4096) = 1205
swapoff("/dev/hda10") = -1 EINVAL (Invalid argument)
read(3, "", 4096) = 0
_exit(0) = ?
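
The two traces above suggest the order in which "swapoff -a" issues its calls: one swapoff() per entry read from /proc/swaps, then one per swap entry in /etc/fstab. That would explain why the second call, on /dev/hda10, returns EINVAL: the same partition was already turned off under its devfs name. Below is a minimal Python sketch of that ordering. It is inferred from the trace only and is not taken from the swapoff source; the function names are hypothetical.

```python
def swap_paths_from_proc_swaps(text):
    """Return the names listed in /proc/swaps-style text."""
    lines = text.splitlines()
    # The first line is the column header (Filename, Type, Size, Used, Priority).
    return [line.split()[0] for line in lines[1:] if line.strip()]


def swap_paths_from_fstab(text):
    """Return the device field of every fstab entry whose type is 'swap'."""
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # fstab fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "swap":
            paths.append(fields[0])
    return paths


def planned_swapoff_calls(proc_swaps_text, fstab_text):
    """Order seen in the trace: /proc/swaps entries first, then fstab entries."""
    return (swap_paths_from_proc_swaps(proc_swaps_text)
            + swap_paths_from_fstab(fstab_text))
```

Fed the contents of /proc/swaps and /etc/fstab from the affected machine, this should list the devfs name first and /dev/hda10 second, matching the trace.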

2003-03-09 02:38:16

by walt

Subject: Re: 2.4.21-pre5-ac2: kernel oops with "swapoff -a"

walt wrote:
> Alan Cox wrote:
>
>> On Sat, 2003-03-08 at 11:07, walt wrote:
>>
>>> When I do "swapoff -a" I still see the kernel oops that began with -pre4-ac7
>>> and has propagated to every 'ac' kernel since then.
>
>
>> Can you send me an strace swapoff -a ?


> swapoff("/dev/hda10") = -1 EINVAL (Invalid argument)
> read(3, "", 4096) = 0
> _exit(0) = ?


On further investigation I find that "swapoff <anyPartition>" will produce
the same oops and segfault in /sbin/swapoff, whereas if I supply a totally
bogus argument like 'swapoff xyz' I get an appropriate error message
instead of the oops:

swapoff("xyz") = -1 ENOENT (No such file or directory)
write(2, "swapoff: xyz: No such file or di"..., 40swapoff: xyz: No such file or directory
) = 40
_exit(-1) = ?
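
To check whether the oops depends on /sbin/swapoff or only on the swapoff() system call, the call can be issued directly. This is a minimal sketch, assuming Linux with glibc and Python's ctypes. try_swapoff and explain are hypothetical helper names, and the errno meanings follow the swapoff(2) manual page. It needs root to get past EPERM, and on an affected kernel it may well trigger the same oops.

```python
import ctypes
import errno
import os

# Meanings as documented in swapoff(2).
MEANINGS = {
    "ENOENT": "path does not exist",
    "EINVAL": "path exists but is not an active swap area",
    "EPERM": "caller lacks CAP_SYS_ADMIN",
}


def try_swapoff(path):
    """Call swapoff(2) on path; return (return value, errno name or None)."""
    libc = ctypes.CDLL(None, use_errno=True)  # assumes a libc exporting swapoff
    rc = libc.swapoff(os.fsencode(path))
    if rc == 0:
        return rc, None
    return rc, errno.errorcode.get(ctypes.get_errno(), "UNKNOWN")


def explain(name):
    """Turn an errno name from try_swapoff into a short description."""
    if name is None:
        return "swap area deactivated"
    return MEANINGS.get(name, "unexpected error " + name)
```

Calling try_swapoff("xyz") as root should report ENOENT, as in the trace above, while a path that exists but is not active swap should report EINVAL.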