2000-12-19 13:00:19

by Zdenek Kabelac

Subject: Oops with 2.4.0-test13pre3 - swapoff

Hi

This is the oops I got when rebooting after some heavy disk activity on
my SMP system:

Written by hand:

kernel BUG swap_state.c:78!
-- invalid operand: 0000
EIP: 0010:[<c01e20fd>]
Using defaults from ksymoops -t elf32-i386 -a i386
Stack: c0206c16 c0206e2f 0000004e
Call Trace: [<c0206c16>] [<c0206e2f>] [<c012e1a5>] [<c012e1ce>] [<c0130d0d>]
[<c0130ddc>] [<c012eb5d>] [<c012ed24>] [<c01328d4>] [<c0108ef3>]
Code: 0f 0b 83 c4 0c 8b 43 18 f6 c4 02 74 07 8b 43 18 a8 01 75 16

>>EIP; c01e20fd <unix_stream_sendmsg+225/308> <=====
Trace; c0206c16 <tvecs+2f1e/c8bc>
Trace; c0206e2f <tvecs+3137/c8bc>
Trace; c012e1a5 <delete_from_swap_cache_nolock+5d/74>
Trace; c012e1ce <delete_from_swap_cache+12/5c>
Trace; c0130d0d <shmem_unuse_inode+89/120>
Trace; c0130ddc <shmem_unuse+38/4c>
Trace; c012eb5d <try_to_unuse+f5/170>
Trace; c012ed24 <sys_swapoff+14c/2b0>
Trace; c01328d4 <sys_read+bc/c4>
Trace; c0108ef3 <system_call+33/38>
Code; c01e20fd <unix_stream_sendmsg+225/308>

00000000 <_EIP>:
Code; c01e20fd <unix_stream_sendmsg+225/308> <=====
0: 0f 0b ud2a <=====
Code; c01e20ff <unix_stream_sendmsg+227/308>
2: 83 c4 0c add $0xc,%esp
Code; c01e2102 <unix_stream_sendmsg+22a/308>
5: 8b 43 18 mov 0x18(%ebx),%eax
Code; c01e2105 <unix_stream_sendmsg+22d/308>
8: f6 c4 02 test $0x2,%ah
Code; c01e2108 <unix_stream_sendmsg+230/308>
b: 74 07 je 14 <_EIP+0x14> c01e2111 <unix_stream_sendmsg+239/308>
Code; c01e210a <unix_stream_sendmsg+232/308>
d: 8b 43 18 mov 0x18(%ebx),%eax
Code; c01e210d <unix_stream_sendmsg+235/308>
10: a8 01 test $0x1,%al
Code; c01e210f <unix_stream_sendmsg+237/308>
12: 75 16 jne 2a <_EIP+0x2a> c01e2127 <unix_stream_sendmsg+24f/308>
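
A note for anyone decoding the hand-copied trace above. This assumes the i386 BUG() of this era printed its "kernel BUG" message through printk (format string, file name, line number) and then executed the two-byte 0f 0b instruction. If so, the three words on the "Stack:" line are plausibly those printk arguments, and the "add $0xc,%esp" after the trap would be popping them. Under that assumption the third word should be the line number from the BUG message. A minimal C sketch of the check; the helper name parse_stack_word is hypothetical and not part of any tool:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert one hexadecimal word, as printed on an
 * oops "Stack:" line, into its numeric value.
 */
static unsigned long parse_stack_word(const char *hex)
{
	assert(hex != NULL);
	return strtoul(hex, NULL, 16);
}

/*
 * Return 1 if the given stack word equals the line number reported in
 * the "kernel BUG file:line!" message, 0 otherwise.
 */
static int stack_word_matches_line(const char *hex, unsigned long line)
{
	return parse_stack_word(hex) == line;
}
```

Calling stack_word_matches_line("0000004e", 78) returns 1, since 0x4e is 78, the line reported for swap_state.c. That suggests the hand copy of the stack is consistent with the BUG message even though the EIP symbol lookup looks off.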


There are three types of people in the world:
those who can count, and those who can't.
Zdenek Kabelac http://i.am/kabi/ [email protected] {debian.org; fi.muni.cz}


2000-12-20 15:15:14

by Douglas Gilbert

Subject: Re: Oops with 2.4.0-test13pre3 - swapoff

Zdenek Kabelac wrote:
> This is oops I've got when rebooting after some heavy disk activity on
> my SMP system:
>
> Written by hand:
>
> kernel BUG swap_state.c:78!
[snip]

Same here during a halt of a RH 6.2 based K6-2 500 MHz
UP machine running lk240t13p3. The machine had been on
for a while and had built a kernel amongst other things.

Lead up was:
$ halt
.....
Sending all processes the KILL signal [OK]
Turning off swap VM: __lru_cache_del, found unknown page ?!
kernel BUG at swap_state.c:78
....

Doug Gilbert


2000-12-20 16:16:30

by Zdenek Kabelac

Subject: Re: Oops with 2.4.0-test13pre3 - swapoff

> Zdenek Kabelac wrote:
> > This is oops I've got when rebooting after some heavy disk activity on
> > my SMP system:
> >
> > Written by hand:
> >
> > kernel BUG swap_state.c:78!
> [snip]
>
> Same here during a halt of a RH 6.2 based K6-2 500 MHz
> UP machine running lk240t13p3. The machine had been on
> for a while and had built a kernel amongst other things.
>

I'll just add that my machine had been up for only a few
minutes (maybe 10) but had been doing heavy copying: several
600MB files between partitions.

So maybe the memory thrashing problem is still not fully fixed?


--
There are three types of people in the world:
those who can count, and those who can't.
Zdenek Kabelac http://i.am/kabi/ [email protected] {debian.org; fi.muni.cz}

2000-12-21 19:26:06

by Marcelo Tosatti

Subject: Re: Oops with 2.4.0-test13pre3 - swapoff



On Wed, 20 Dec 2000, Zdenek Kabelac wrote:

> > Zdenek Kabelac wrote:
> > > This is oops I've got when rebooting after some heavy disk activity on
> > > my SMP system:
> > >
> > > Written by hand:
> > >
> > > kernel BUG swap_state.c:78!
> > [snip]
> >
> > Same here during a halt of a RH 6.2 based K6-2 500 MHz
> > UP machine running lk240t13p3. The machine had been on
> > for a while and had built a kernel amongst other things.
> >
>
> I'll just append that my machine has been up for just several
> minutes (maybe 10) but has been doing heavy copying - several
> 600MB files between some partitions.
>
> So maybe the problem with memory thrashing is still not fully fixed ???

The bug seems to be in the new shm code.

Christoph is already looking at it and should have a fix soon.

2000-12-22 11:04:53

by Christoph Rohland

Subject: Re: Oops with 2.4.0-test13pre3 - swapoff

Marcelo Tosatti <[email protected]> writes:

> Christoph is already looking at it and should have a fix soon.

Here it comes against 13-pre4 ...

We must not call delete_from_swap_cache here; it has already been
called in try_to_unuse.

There is still a race when we page in a page that has
just been freed in try_to_unuse. I am not sure how to fix this.

Also, the add_to_page_cache call forgot the offset :-(

Greetings
Christoph

--- 4-13-4/mm/shmem.c Fri Dec 22 10:05:38 2000
+++ m4-13-4/mm/shmem.c Fri Dec 22 10:47:11 2000
@@ -761,14 +761,16 @@
 	swp_entry_t **base, **ptr;
 	unsigned long idx;
 	int offset;
+	struct shmem_inode_info *info = &inode->u.shmem_i;
 
 	idx = 0;
-	if ((offset = shmem_clear_swp (entry, inode->u.shmem_i.i_direct, SHMEM_NR_DIRECT)) >= 0)
+	spin_lock (&info->lock);
+	if ((offset = shmem_clear_swp (entry,info->i_direct, SHMEM_NR_DIRECT)) >= 0)
 		goto found;
 
 	idx = SHMEM_NR_DIRECT;
-	if (!(base = inode->u.shmem_i.i_indirect))
-		return 0;
+	if (!(base = info->i_indirect))
+		goto out;
 
 	for (ptr = base; ptr < base + ENTRIES_PER_PAGE; ptr++) {
 		if (*ptr &&
@@ -776,16 +778,16 @@
 			goto found;
 		idx += ENTRIES_PER_PAGE;
 	}
+out:
+	spin_unlock (&info->lock);
 	return 0;
 found:
-	delete_from_swap_cache (page);
-	add_to_page_cache (page, inode->i_mapping, idx);
+	add_to_page_cache (page, inode->i_mapping, offset + idx);
 	SetPageDirty (page);
 	SetPageUptodate (page);
 	UnlockPage (page);
-	spin_lock (&inode->u.shmem_i.lock);
-	inode->u.shmem_i.swapped--;
-	spin_unlock (&inode->u.shmem_i.lock);
+	info->swapped--;
+	spin_unlock (&info->lock);
 	return 1;
 }
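
To illustrate the "forgot the offset" part of the patch, here is a small user-space model of the lookup, not the kernel code. As the diff shows, idx only counts the slots in the tables already skipped, and offset is the position inside the table where the entry was found, so the page index has to be their sum. The table sizes, the find_entry helper (a stand-in for shmem_clear_swp that does not clear anything) and the flat array of indirect pages are all assumptions made for this sketch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, kept small for illustration; not the kernel's values. */
#define NR_DIRECT        4
#define ENTRIES_PER_PAGE 8
#define NR_INDIRECT      3

/* Return the position of entry within tbl[0..n), or -1 if it is absent. */
static int find_entry(unsigned long entry, const unsigned long *tbl, int n)
{
	assert(tbl != NULL);
	for (int i = 0; i < n; i++)
		if (tbl[i] == entry)
			return i;
	return -1;
}

/*
 * Search the direct slots first, then each indirect page in turn.
 * idx counts the slots skipped so far; the result is idx + offset.
 * Returns -1 if the entry is not found anywhere.
 */
static long page_index_of(unsigned long entry,
			  const unsigned long *direct,
			  const unsigned long **indirect)
{
	long idx = 0;
	int offset;

	offset = find_entry(entry, direct, NR_DIRECT);
	if (offset >= 0)
		return idx + offset;

	idx = NR_DIRECT;
	for (int p = 0; p < NR_INDIRECT; p++) {
		if (indirect[p]) {
			offset = find_entry(entry, indirect[p], ENTRIES_PER_PAGE);
			if (offset >= 0)
				return idx + offset;
		}
		/* Skip a whole page of slots, whether or not it is allocated. */
		idx += ENTRIES_PER_PAGE;
	}
	return -1;
}
```

With these toy sizes, an entry sitting in slot 5 of the second indirect page maps to index 4 + 8 + 5 = 17. Passing idx alone, as the code did before the patch, would give 12, the first slot of that page.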