2003-08-06 10:24:48

by Lenar Lõhmus

Subject: bad scheduling and a BUG between them

Hi,

With kernel 2.6.0-test2-mm3+A3-O12.2int I found this in the logs:

bad: scheduling while atomic!
Call Trace:
[<c011c87d>] schedule+0x56d/0x580
[<c0144b33>] unmap_page_range+0x43/0x70
[<c0144d15>] unmap_vmas+0x1b5/0x210
[<c014893b>] exit_mmap+0x7b/0x190
[<c011e399>] mmput+0x79/0xf0
[<c0122082>] do_exit+0x122/0x3f0
[<c010b7c0>] do_invalid_op+0x0/0xd0
[<c010b4d9>] die+0xf9/0x100
[<c010b889>] do_invalid_op+0xc9/0xd0
[<c01436ba>] kunmap_high+0x1a/0xa0
[<c02a0f9f>] error_code+0x2f/0x38
[<c01436ba>] kunmap_high+0x1a/0xa0
[<c01a6040>] reiserfs_unprepare_pages+0x30/0x70
[<c01a72b5>] reiserfs_file_write+0x4e5/0x595
[<c011aa31>] do_page_fault+0x251/0x454
[<c011c941>] __wake_up_common+0x31/0x60
[<c01a6dd0>] reiserfs_file_write+0x0/0x595
[<c0153e38>] vfs_write+0xb8/0x130
[<c0153f62>] sys_write+0x42/0x70
[<c02a0593>] syscall_call+0x7/0xb

and:

bad: scheduling while atomic!
Call Trace:
[<c011c87d>] schedule+0x56d/0x580
[<c01376a4>] __remove_from_page_cache+0x24/0x80
[<c0140db8>] __pagevec_release+0x28/0x40
[<c0141334>] truncate_inode_pages+0xc4/0x2b0
[<c016e392>] iput+0x62/0x80
[<c016b0e2>] dput+0x22/0x260
[<c01561c1>] invalidate_inode_buffers+0x11/0x70
[<c016e18b>] generic_delete_inode+0x13b/0x150
[<c016e392>] iput+0x62/0x80
[<c016b1dc>] dput+0x11c/0x260
[<c0154ce5>] __fput+0xb5/0x120
[<c0153362>] filp_close+0x82/0xb0
[<c01214f4>] put_files_struct+0x54/0xc0
[<c01220bd>] do_exit+0x15d/0x3f0
[<c010b7c0>] do_invalid_op+0x0/0xd0
[<c010b4d9>] die+0xf9/0x100
[<c010b889>] do_invalid_op+0xc9/0xd0
[<c01436ba>] kunmap_high+0x1a/0xa0
[<c02a0f9f>] error_code+0x2f/0x38
[<c01436ba>] kunmap_high+0x1a/0xa0
[<c01a6040>] reiserfs_unprepare_pages+0x30/0x70
[<c01a72b5>] reiserfs_file_write+0x4e5/0x595
[<c011aa31>] do_page_fault+0x251/0x454
[<c011c941>] __wake_up_common+0x31/0x60
[<c01a6dd0>] reiserfs_file_write+0x0/0x595
[<c0153e38>] vfs_write+0xb8/0x130
[<c0153f62>] sys_write+0x42/0x70
[<c02a0593>] syscall_call+0x7/0xb

There are many of those, and one kernel BUG report in the middle of them:

Error (regular_file): read_ksyms stat /proc/ksyms failed
ksymoops: No such file or directory
No modules in ksyms, skipping objects
No ksyms, skipping lsmod
kernel BUG at mm/highmem.c:178!
invalid operand: 0000 [#2]
CPU: 0
EIP: 0060:[<c01436ba>] Not tainted VLI
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00210246
eax: 00000000 ebx: c19be060 ecx: c0394c90 edx: c0394c98
esi: 00000000 edi: e7e03ee4 ebp: 00000001 esp: e7e03e7c
ds: 007b es: 007b ss: 0068
Stack: c19be060 c19be060 c01a6040 c19be060 00000000 00000001 00000001 e3cfb678
c01a72b5 e7e03ee4 00000001 00000000 00000001 00001000 e7e03ee4 00000001
00000000 00001000 00000020 00000000 00001000 e3cfb6e4 ffffffe4 00000000
Call Trace:
[<c01a6040>] reiserfs_unprepare_pages+0x30/0x70
[<c01a72b5>] reiserfs_file_write+0x4e5/0x595
[<c025fda0>] ip_rcv+0x330/0x450
[<c011aa31>] do_page_fault+0x251/0x454
[<c024bbbc>] skb_release_data+0x8c/0xa0
[<c024bbf4>] kfree_skbmem+0x24/0x30
[<c024bc82>] __kfree_skb+0x82/0x100
[<c01a6dd0>] reiserfs_file_write+0x0/0x595
[<c0153e38>] vfs_write+0xb8/0x130
[<c0153f62>] sys_write+0x42/0x70
[<c02a0593>] syscall_call+0x7/0xb
Code: 00 04 00 00 8b 0d a0 4b 39 c0 e9 ad fe ff ff 89 f6 53 ba 00 e0 ff ff


>>EIP; c01436ba <kunmap_high+1a/a0> <=====

>>ebx; c19be060 <_end+160e328/3fc4d2c8>
>>ecx; c0394c90 <page_address_htable+d0/400>
>>edx; c0394c98 <page_address_htable+d8/400>
>>edi; e7e03ee4 <_end+27a541ac/3fc4d2c8>
>>esp; e7e03e7c <_end+27a54144/3fc4d2c8>

Trace; c01a6040 <reiserfs_unprepare_pages+30/70>
Trace; c01a72b5 <reiserfs_file_write+4e5/595>
Trace; c025fda0 <ip_rcv+330/450>
Trace; c011aa31 <do_page_fault+251/454>
Trace; c024bbbc <skb_release_data+8c/a0>
Trace; c024bbf4 <kfree_skbmem+24/30>
Trace; c024bc82 <__kfree_skb+82/100>
Trace; c01a6dd0 <reiserfs_file_write+0/595>
Trace; c0153e38 <vfs_write+b8/130>
Trace; c0153f62 <sys_write+42/70>
Trace; c02a0593 <syscall_call+7/b>

Code; c01436ba <kunmap_high+1a/a0>
00000000 <_EIP>:
Code; c01436ba <kunmap_high+1a/a0> <=====
0: 00 04 00 add %al,(%eax,%eax,1) <=====
Code; c01436bd <kunmap_high+1d/a0>
3: 00 8b 0d a0 4b 39 add %cl,0x394ba00d(%ebx)
Code; c01436c3 <kunmap_high+23/a0>
9: c0 e9 ad shr $0xad,%cl
Code; c01436c6 <kunmap_high+26/a0>
c: fe (bad)
Code; c01436c7 <kunmap_high+27/a0>
d: ff (bad)
Code; c01436c8 <kunmap_high+28/a0>
e: ff 89 f6 53 ba 00 decl 0xba53f6(%ecx)
Code; c01436ce <kunmap_high+2e/a0>
14: e0 ff loopne 15 <_EIP+0x15>
Code; c01436d0 <kunmap_high+30/a0>
16: ff 00 incl (%eax)

Hardware: Athlon XP 2500+ on an nForce2 board with 1GB of memory.

l.


Attachments:
(No filename) (4.69 kB)
.config (23.75 kB)

2003-08-06 15:17:12

by Oleg Drokin

Subject: Re: bad scheduling and a BUG between them

Hello!

On Wed, Aug 06, 2003 at 01:24:05PM +0300, Lenar Lõhmus wrote:

> with kernel-2.6.0-test2-mm3+A3-O12.2int i found this in logs:
> bad: scheduling while atomic!
> Call Trace:
> [<c011c87d>] schedule+0x56d/0x580
> [<c0144b33>] unmap_page_range+0x43/0x70
> [<c0144d15>] unmap_vmas+0x1b5/0x210
> [<c014893b>] exit_mmap+0x7b/0x190
> [<c011e399>] mmput+0x79/0xf0
> [<c0122082>] do_exit+0x122/0x3f0
> [<c010b7c0>] do_invalid_op+0x0/0xd0
> [<c010b4d9>] die+0xf9/0x100
> [<c010b889>] do_invalid_op+0xc9/0xd0
> [<c01436ba>] kunmap_high+0x1a/0xa0
> [<c02a0f9f>] error_code+0x2f/0x38
> [<c01436ba>] kunmap_high+0x1a/0xa0
> [<c01a6040>] reiserfs_unprepare_pages+0x30/0x70
> [<c01a72b5>] reiserfs_file_write+0x4e5/0x595
> [<c011aa31>] do_page_fault+0x251/0x454
> [<c011c941>] __wake_up_common+0x31/0x60
> [<c01a6dd0>] reiserfs_file_write+0x0/0x595
> [<c0153e38>] vfs_write+0xb8/0x130
> [<c0153f62>] sys_write+0x42/0x70
> [<c02a0593>] syscall_call+0x7/0xb

I am really unsure who is at fault here. At least reiserfs does not hold any
locks at this point, not even the BKL.

> and one kernel BUG report in the middle of them:
> kernel BUG at mm/highmem.c:178!
> invalid operand: 0000 [#2]
> Call Trace:
> [<c01a6040>] reiserfs_unprepare_pages+0x30/0x70
> [<c01a72b5>] reiserfs_file_write+0x4e5/0x595

Hm, this one is real.
Try the patch below.
I wonder how you were able to hit this reiserfs_unprepare_pages() code path at all.
Were there any messages prior to the bug? Or was there an out-of-space situation?

Thanks for the report.

Bye,
Oleg

===== fs/reiserfs/file.c 1.20 vs edited =====
--- 1.20/fs/reiserfs/file.c Wed Jun 4 11:50:34 2003
+++ edited/fs/reiserfs/file.c Wed Aug 6 19:11:01 2003
@@ -555,7 +555,6 @@
struct page *page = prepared_pages[i];

try_to_free_buffers(page);
- kunmap(page);
unlock_page(page);
page_cache_release(page);
}
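
For readers following the thread, the idea behind the patch can be sketched
in plain userspace C. This is only a toy model, not kernel code: struct
toy_page, toy_kmap(), toy_kunmap() and the two unprepare_* helpers are
hypothetical names invented for illustration. It assumes that the pages
reaching the cleanup path have no outstanding kmap(), which is what the
BUG in kunmap_high() and the removal of the kunmap() call suggest, but the
thread does not state this explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page: tracks outstanding map calls only. */
struct toy_page {
	int map_count;
};

/* Take one mapping reference; must be paired with one toy_kunmap(). */
static void toy_kmap(struct toy_page *page)
{
	assert(page != NULL);
	page->map_count++;
}

/*
 * Drop one mapping reference. Returns false where this model treats the
 * call as unbalanced (the real kernel would hit a BUG() instead).
 */
static bool toy_kunmap(struct toy_page *page)
{
	assert(page != NULL);
	if (page->map_count == 0)
		return false;
	page->map_count--;
	return true;
}

/* Cleanup as before the patch: unmaps every page unconditionally. */
static int unprepare_before_patch(struct toy_page **pages, size_t n)
{
	int unbalanced = 0;

	for (size_t i = 0; i < n; i++)
		if (!toy_kunmap(pages[i]))
			unbalanced++;
	return unbalanced;	/* number of calls that would have BUG()ed */
}

/* Cleanup as after the patch: the mapping state is left untouched. */
static int unprepare_after_patch(struct toy_page **pages, size_t n)
{
	(void)pages;
	(void)n;
	return 0;
}
```

With two pages that were never mapped, unprepare_before_patch() reports two
unbalanced unmaps, while unprepare_after_patch() reports none. A page that
was mapped once with toy_kmap() can be unmapped exactly once.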