2008-08-14 07:12:59

by Ian Campbell

[permalink] [raw]
Subject: kernel BUG at lib/radix-tree.c:473!

Jeremy first noticed this
http://marc.info/?l=linux-kernel&m=121783008503477&w=2

[ 3.132333] ------------[ cut here ]------------
[ 3.132343] kernel BUG at /home/ijc/development/kernel/2.6.git/lib/radix-tree.c:473!
[ 3.132348] invalid opcode: 0000 [#1] SMP
[ 3.132352] Modules linked in:
[ 3.132356]
[ 3.132363] Pid: 580, comm: debconf Tainted: G W (2.6.26 #27)
[ 3.132368] EIP: 0061:[<c01f5279>] EFLAGS: 00010002 CPU: 0
[ 3.132375] EIP is at radix_tree_tag_set+0x1d/0x9f
[ 3.132379] EAX: c203af30 EBX: c261b8c0 ECX: 00000000 EDX: 00000001
[ 3.132383] ESI: 00000000 EDI: 00000001 EBP: c7977ce8 ESP: c7977cc8
[ 3.132387] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[ 3.132392] Process debconf (pid: 580, ti=c7976000 task=c538a240 task.ti=c7976000)
[ 3.132396] Stack: fffede22 00000000 00000001 c203af30 c203af2c c261b8c0 c203af2c 00000001
[ 3.132406] c7977cfc c01a1570 c261b8c0 c203af2c c2563000 c7977d0c c01a1a24 c261b8c0
[ 3.132416] 00000001 c7977d1c c01682fc c261b8c0 00000001 c7977d2c c0169185 c261b8c0
[ 3.132425] Call Trace:
[ 3.132428] [<c01a1570>] ? __set_page_dirty+0xdf/0x11f
[ 3.132434] [<c01a1a24>] ? __set_page_dirty_buffers+0x68/0x6c
[ 3.132441] [<c01682fc>] ? set_page_dirty+0x34/0x94
[ 3.132446] [<c0169185>] ? set_page_dirty_balance+0xe/0x3c
[ 3.132452] [<c016eec6>] ? __do_fault+0x35d/0x37e
[ 3.132458] [<c01707a4>] ? handle_mm_fault+0x45d/0x9c9
[ 3.132463] [<c01939a0>] ? __d_lookup+0xb7/0xeb
[ 3.132469] [<c0180d18>] ? kfree+0x81/0x88
[ 3.132474] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
[ 3.132481] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
[ 3.132487] [<c011b5ee>] ? do_page_fault+0x3be/0x8d0
[ 3.132493] [<c02061ef>] ? fb_ioctl+0x1a2/0x2de
[ 3.132499] [<c011ad1c>] ? pvclock_clocksource_read+0x48/0xa3
[ 3.132506] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
[ 3.132512] [<c013e02b>] ? hrtimer_start+0x12a/0x144
[ 3.132519] [<c0104682>] ? xen_mc_flush+0x123/0x160
[ 3.132525] [<c0104699>] ? xen_mc_flush+0x13a/0x160
[ 3.136027] [<c010436c>] ? xen_leave_lazy+0x12/0x14
[ 3.136027] [<c01075bc>] ? __switch_to+0xec/0x126
[ 3.136027] [<c012755c>] ? finish_task_switch+0x32/0xa5
[ 3.136027] [<c02de601>] ? schedule+0x6cc/0x735
[ 3.136027] [<c018e377>] ? vfs_ioctl+0x57/0x69
[ 3.136027] [<c018e636>] ? sys_ioctl+0x50/0x5a
[ 3.136027] [<c011b230>] ? do_page_fault+0x0/0x8d0
[ 3.136027] [<c02dfd3a>] ? error_code+0x72/0x78
[ 3.136027] =======================
[ 3.136027] Code: b4 89 42 04 83 c4 50 89 d8 5b 5e 5f 5d c3 55 89 e5 57 56 53 83 ec 14 89 45 ec 89 55 e8 89 4d e4 8b 30 3b 14 b5 88 52 3a c0 76 04 <0f> 0b eb fe 8b 45 ec 8b 4d e4 8b 58 08 6b c6 06 c1 e1 03
[ 3.136027] EIP: [<c01f5279>] radix_tree_tag_set+0x1d/0x9f SS:ESP 0069:c7977cc8
[ 3.136027] ---[ end trace 991579adcab01bbf ]---

I've bisected it down to:
commit 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
Author: Hugh Dickins <[email protected]>
Date: Mon Jul 28 15:46:19 2008 -0700

tmpfs: fix kernel BUG in shmem_delete_inode

SuSE's insserve initscript ordering program hits kernel BUG at mm/shmem.c:814
on 2.6.26. It's using posix_fadvise on directories, and the shmem_readpage
method added in 2.6.23 is letting POSIX_FADV_WILLNEED allocate useless pages
to a tmpfs directory, incrementing i_blocks count but never decrementing it.

Fix this by assigning shmem_aops (pointing to readpage and writepage and
set_page_dirty) only when it's needed, on a regular file or a long symlink.

Many thanks to Kel for outstanding bugreport and steps to reproduce it.

Reported-by: Kel Modderman <[email protected]>
Tested-by: Kel Modderman <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Cc: <[email protected]> [2.6.25.x, 2.6.26.x]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Reverting this patch from current Linus tree
(b635acec48bcaa9183fcbf4e3955616b0d4119b5) causes the problem to go
away. I haven't yet seen the link between the backtrace and this
changeset though.

Ian.
--
Ian Campbell

Preserve wildlife -- pickle a squirrel today!


Attachments:
signature.asc (197.00 B)
This is a digitally signed message part

2008-08-14 10:41:45

by Peter Zijlstra

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Thu, 2008-08-14 at 08:02 +0100, Ian Campbell wrote:
> Jeremy first noticed this
> http://marc.info/?l=linux-kernel&m=121783008503477&w=2
>
> [ 3.132333] ------------[ cut here ]------------
> [ 3.132343] kernel BUG at /home/ijc/development/kernel/2.6.git/lib/radix-tree.c:473!
> [ 3.132348] invalid opcode: 0000 [#1] SMP
> [ 3.132352] Modules linked in:
> [ 3.132356]
> [ 3.132363] Pid: 580, comm: debconf Tainted: G W (2.6.26 #27)
> [ 3.132368] EIP: 0061:[<c01f5279>] EFLAGS: 00010002 CPU: 0
> [ 3.132375] EIP is at radix_tree_tag_set+0x1d/0x9f
> [ 3.132379] EAX: c203af30 EBX: c261b8c0 ECX: 00000000 EDX: 00000001
> [ 3.132383] ESI: 00000000 EDI: 00000001 EBP: c7977ce8 ESP: c7977cc8
> [ 3.132387] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
> [ 3.132392] Process debconf (pid: 580, ti=c7976000 task=c538a240 task.ti=c7976000)
> [ 3.132396] Stack: fffede22 00000000 00000001 c203af30 c203af2c c261b8c0 c203af2c 00000001
> [ 3.132406] c7977cfc c01a1570 c261b8c0 c203af2c c2563000 c7977d0c c01a1a24 c261b8c0
> [ 3.132416] 00000001 c7977d1c c01682fc c261b8c0 00000001 c7977d2c c0169185 c261b8c0
> [ 3.132425] Call Trace:
> [ 3.132428] [<c01a1570>] ? __set_page_dirty+0xdf/0x11f
> [ 3.132434] [<c01a1a24>] ? __set_page_dirty_buffers+0x68/0x6c
> [ 3.132441] [<c01682fc>] ? set_page_dirty+0x34/0x94
> [ 3.132446] [<c0169185>] ? set_page_dirty_balance+0xe/0x3c
> [ 3.132452] [<c016eec6>] ? __do_fault+0x35d/0x37e
> [ 3.132458] [<c01707a4>] ? handle_mm_fault+0x45d/0x9c9
> [ 3.132463] [<c01939a0>] ? __d_lookup+0xb7/0xeb
> [ 3.132469] [<c0180d18>] ? kfree+0x81/0x88
> [ 3.132474] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
> [ 3.132481] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
> [ 3.132487] [<c011b5ee>] ? do_page_fault+0x3be/0x8d0
> [ 3.132493] [<c02061ef>] ? fb_ioctl+0x1a2/0x2de
> [ 3.132499] [<c011ad1c>] ? pvclock_clocksource_read+0x48/0xa3
> [ 3.132506] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
> [ 3.132512] [<c013e02b>] ? hrtimer_start+0x12a/0x144
> [ 3.132519] [<c0104682>] ? xen_mc_flush+0x123/0x160
> [ 3.132525] [<c0104699>] ? xen_mc_flush+0x13a/0x160
> [ 3.136027] [<c010436c>] ? xen_leave_lazy+0x12/0x14
> [ 3.136027] [<c01075bc>] ? __switch_to+0xec/0x126
> [ 3.136027] [<c012755c>] ? finish_task_switch+0x32/0xa5
> [ 3.136027] [<c02de601>] ? schedule+0x6cc/0x735
> [ 3.136027] [<c018e377>] ? vfs_ioctl+0x57/0x69
> [ 3.136027] [<c018e636>] ? sys_ioctl+0x50/0x5a
> [ 3.136027] [<c011b230>] ? do_page_fault+0x0/0x8d0
> [ 3.136027] [<c02dfd3a>] ? error_code+0x72/0x78
> [ 3.136027] =======================
> [ 3.136027] Code: b4 89 42 04 83 c4 50 89 d8 5b 5e 5f 5d c3 55 89 e5 57 56 53 83 ec 14 89 45 ec 89 55 e8 89 4d e4 8b 30 3b 14 b5 88 52 3a c0 76 04 <0f> 0b eb fe 8b 45 ec 8b 4d e4 8b 58 08 6b c6 06 c1 e1 03
> [ 3.136027] EIP: [<c01f5279>] radix_tree_tag_set+0x1d/0x9f SS:ESP 0069:c7977cc8
> [ 3.136027] ---[ end trace 991579adcab01bbf ]---

This looks like we try to dirty a page that's not in the page-cache.

> I've bisected it down to:
> commit 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
> Author: Hugh Dickins <[email protected]>
> Date: Mon Jul 28 15:46:19 2008 -0700
>
> tmpfs: fix kernel BUG in shmem_delete_inode
>
> SuSE's insserve initscript ordering program hits kernel BUG at mm/shmem..c:814
> on 2.6.26. It's using posix_fadvise on directories, and the shmem_readpage
> method added in 2.6.23 is letting POSIX_FADV_WILLNEED allocate useless pages
> to a tmpfs directory, incrementing i_blocks count but never decrementing it.
>
> Fix this by assigning shmem_aops (pointing to readpage and writepage and
> set_page_dirty) only when it's needed, on a regular file or a long symlink.
>
> Many thanks to Kel for outstanding bugreport and steps to reproduce it.
>
> Reported-by: Kel Modderman <[email protected]>
> Tested-by: Kel Modderman <[email protected]>
> Signed-off-by: Hugh Dickins <[email protected]>
> Cc: <[email protected]> [2.6.25.x, 2.6.26.x]
> Signed-off-by: Andrew Morton <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
>
> Reverting this patch from current Linus tree
> (b635acec48bcaa9183fcbf4e3955616b0d4119b5) causes the problem to go
> away. I haven't yet seen the link between the backtrace and this
> changeset though.
>
> Ian.

2008-08-14 13:07:28

by Hugh Dickins

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Thu, 14 Aug 2008, Ian Campbell wrote:
> Jeremy first noticed this
> http://marc.info/?l=linux-kernel&m=121783008503477&w=2
>
> [ 3.132333] ------------[ cut here ]------------
> [ 3.132343] kernel BUG at /home/ijc/development/kernel/2.6.git/lib/radix-tree.c:473!
> [ 3.132348] invalid opcode: 0000 [#1] SMP
> [ 3.132352] Modules linked in:
> [ 3.132356]
> [ 3.132363] Pid: 580, comm: debconf Tainted: G W (2.6.26 #27)
> [ 3.132368] EIP: 0061:[<c01f5279>] EFLAGS: 00010002 CPU: 0
> [ 3.132375] EIP is at radix_tree_tag_set+0x1d/0x9f
> [ 3.132379] EAX: c203af30 EBX: c261b8c0 ECX: 00000000 EDX: 00000001
> [ 3.132383] ESI: 00000000 EDI: 00000001 EBP: c7977ce8 ESP: c7977cc8
> [ 3.132387] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
> [ 3.132392] Process debconf (pid: 580, ti=c7976000 task=c538a240 task.ti=c7976000)
> [ 3.132396] Stack: fffede22 00000000 00000001 c203af30 c203af2c c261b8c0 c203af2c 00000001
> [ 3.132406] c7977cfc c01a1570 c261b8c0 c203af2c c2563000 c7977d0c c01a1a24 c261b8c0
> [ 3.132416] 00000001 c7977d1c c01682fc c261b8c0 00000001 c7977d2c c0169185 c261b8c0
> [ 3.132425] Call Trace:
> [ 3.132428] [<c01a1570>] ? __set_page_dirty+0xdf/0x11f
> [ 3.132434] [<c01a1a24>] ? __set_page_dirty_buffers+0x68/0x6c
> [ 3.132441] [<c01682fc>] ? set_page_dirty+0x34/0x94
> [ 3.132446] [<c0169185>] ? set_page_dirty_balance+0xe/0x3c
> [ 3.132452] [<c016eec6>] ? __do_fault+0x35d/0x37e
> [ 3.132458] [<c01707a4>] ? handle_mm_fault+0x45d/0x9c9
> [ 3.132463] [<c01939a0>] ? __d_lookup+0xb7/0xeb
> [ 3.132469] [<c0180d18>] ? kfree+0x81/0x88
> [ 3.132474] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
> [ 3.132481] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
> [ 3.132487] [<c011b5ee>] ? do_page_fault+0x3be/0x8d0
> [ 3.132493] [<c02061ef>] ? fb_ioctl+0x1a2/0x2de
> [ 3.132499] [<c011ad1c>] ? pvclock_clocksource_read+0x48/0xa3
> [ 3.132506] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
> [ 3.132512] [<c013e02b>] ? hrtimer_start+0x12a/0x144
> [ 3.132519] [<c0104682>] ? xen_mc_flush+0x123/0x160
> [ 3.132525] [<c0104699>] ? xen_mc_flush+0x13a/0x160
> [ 3.136027] [<c010436c>] ? xen_leave_lazy+0x12/0x14
> [ 3.136027] [<c01075bc>] ? __switch_to+0xec/0x126
> [ 3.136027] [<c012755c>] ? finish_task_switch+0x32/0xa5
> [ 3.136027] [<c02de601>] ? schedule+0x6cc/0x735
> [ 3.136027] [<c018e377>] ? vfs_ioctl+0x57/0x69
> [ 3.136027] [<c018e636>] ? sys_ioctl+0x50/0x5a
> [ 3.136027] [<c011b230>] ? do_page_fault+0x0/0x8d0
> [ 3.136027] [<c02dfd3a>] ? error_code+0x72/0x78
> [ 3.136027] =======================
> [ 3.136027] Code: b4 89 42 04 83 c4 50 89 d8 5b 5e 5f 5d c3 55 89 e5 57 56 53 83 ec 14 89 45 ec 89 55 e8 89 4d e4 8b 30 3b 14 b5 88 52 3a c0 76 04 <0f> 0b eb fe 8b 45 ec 8b 4d e4 8b 58 08 6b c6 06 c1 e1 03
> [ 3.136027] EIP: [<c01f5279>] radix_tree_tag_set+0x1d/0x9f SS:ESP 0069:c7977cc8
> [ 3.136027] ---[ end trace 991579adcab01bbf ]---
>
> I've bisected it down to:
> commit 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
> Author: Hugh Dickins <[email protected]>
> Date: Mon Jul 28 15:46:19 2008 -0700
>
> tmpfs: fix kernel BUG in shmem_delete_inode
>
> Reverting this patch from current Linus tree
> (b635acec48bcaa9183fcbf4e3955616b0d4119b5) causes the problem to go
> away. I haven't yet seen the link between the backtrace and this
> changeset though.

Nor I! Thanks a lot for doing the bisection, but all I can say so far
is that I'm utterly flummoxed. (And I do wonder if it's a pvfb bug
which has previously been masked; but that's premature, we can't
say until we understand how it got here at all.)

There's a lot of "?" entries in your backtrace, Jeremy's ones
look clearer: CONFIG_FRAME_POINTER=y ought to improve yours.

In both cases it's handling a page fault: I'm curious as to what kind
of vma this fault is occurring on. Could you devise a way of getting
us /proc/<pid>/maps output, together with the faulting address, when
it hits one of these BUGs? Or should I try to put together a patch
for that?

Hugh

2008-08-14 15:04:28

by Ian Campbell

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Thu, 2008-08-14 at 14:06 +0100, Hugh Dickins wrote:
> On Thu, 14 Aug 2008, Ian Campbell wrote:
> > Jeremy first noticed this
> > http://marc.info/?l=linux-kernel&m=121783008503477&w=2
> >
> > [ 3.132333] ------------[ cut here ]------------
> > [ 3.132343] kernel BUG at /home/ijc/development/kernel/2.6.git/lib/radix-tree.c:473!
> > [ 3.132348] invalid opcode: 0000 [#1] SMP
> > [ 3.132352] Modules linked in:
> > [ 3.132356]
> > [ 3.132363] Pid: 580, comm: debconf Tainted: G W (2.6.26 #27)
> > [ 3.132368] EIP: 0061:[<c01f5279>] EFLAGS: 00010002 CPU: 0
> > [ 3.132375] EIP is at radix_tree_tag_set+0x1d/0x9f
> > [ 3.132379] EAX: c203af30 EBX: c261b8c0 ECX: 00000000 EDX: 00000001
> > [ 3.132383] ESI: 00000000 EDI: 00000001 EBP: c7977ce8 ESP: c7977cc8
> > [ 3.132387] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
> > [ 3.132392] Process debconf (pid: 580, ti=c7976000 task=c538a240 task.ti=c7976000)
> > [ 3.132396] Stack: fffede22 00000000 00000001 c203af30 c203af2c c261b8c0 c203af2c 00000001
> > [ 3.132406] c7977cfc c01a1570 c261b8c0 c203af2c c2563000 c7977d0c c01a1a24 c261b8c0
> > [ 3.132416] 00000001 c7977d1c c01682fc c261b8c0 00000001 c7977d2c c0169185 c261b8c0
> > [ 3.132425] Call Trace:
> > [ 3.132428] [<c01a1570>] ? __set_page_dirty+0xdf/0x11f
> > [ 3.132434] [<c01a1a24>] ? __set_page_dirty_buffers+0x68/0x6c
> > [ 3.132441] [<c01682fc>] ? set_page_dirty+0x34/0x94
> > [ 3.132446] [<c0169185>] ? set_page_dirty_balance+0xe/0x3c
> > [ 3.132452] [<c016eec6>] ? __do_fault+0x35d/0x37e
> > [ 3.132458] [<c01707a4>] ? handle_mm_fault+0x45d/0x9c9
> > [ 3.132463] [<c01939a0>] ? __d_lookup+0xb7/0xeb
> > [ 3.132469] [<c0180d18>] ? kfree+0x81/0x88
> > [ 3.132474] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
> > [ 3.132481] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
> > [ 3.132487] [<c011b5ee>] ? do_page_fault+0x3be/0x8d0
> > [ 3.132493] [<c02061ef>] ? fb_ioctl+0x1a2/0x2de
> > [ 3.132499] [<c011ad1c>] ? pvclock_clocksource_read+0x48/0xa3
> > [ 3.132506] [<c02dfaf7>] ? _spin_unlock_irqrestore+0x19/0x1f
> > [ 3.132512] [<c013e02b>] ? hrtimer_start+0x12a/0x144
> > [ 3.132519] [<c0104682>] ? xen_mc_flush+0x123/0x160
> > [ 3.132525] [<c0104699>] ? xen_mc_flush+0x13a/0x160
> > [ 3.136027] [<c010436c>] ? xen_leave_lazy+0x12/0x14
> > [ 3.136027] [<c01075bc>] ? __switch_to+0xec/0x126
> > [ 3.136027] [<c012755c>] ? finish_task_switch+0x32/0xa5
> > [ 3.136027] [<c02de601>] ? schedule+0x6cc/0x735
> > [ 3.136027] [<c018e377>] ? vfs_ioctl+0x57/0x69
> > [ 3.136027] [<c018e636>] ? sys_ioctl+0x50/0x5a
> > [ 3.136027] [<c011b230>] ? do_page_fault+0x0/0x8d0
> > [ 3.136027] [<c02dfd3a>] ? error_code+0x72/0x78
> > [ 3.136027] =======================
> > [ 3.136027] Code: b4 89 42 04 83 c4 50 89 d8 5b 5e 5f 5d c3 55 89 e5 57 56 53 83 ec 14 89 45 ec 89 55 e8 89 4d e4 8b 30 3b 14 b5 88 52 3a c0 76 04 <0f> 0b eb fe 8b 45 ec 8b 4d e4 8b 58 08 6b c6 06 c1 e1 03
> > [ 3.136027] EIP: [<c01f5279>] radix_tree_tag_set+0x1d/0x9f SS:ESP 0069:c7977cc8
> > [ 3.136027] ---[ end trace 991579adcab01bbf ]---
> >
> > I've bisected it down to:
> > commit 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
> > Author: Hugh Dickins <[email protected]>
> > Date: Mon Jul 28 15:46:19 2008 -0700
> >
> > tmpfs: fix kernel BUG in shmem_delete_inode
> >
> > Reverting this patch from current Linus tree
> > (b635acec48bcaa9183fcbf4e3955616b0d4119b5) causes the problem to go
> > away. I haven't yet seen the link between the backtrace and this
> > changeset though.
>
> Nor I! Thanks a lot for doing the bisection, but all I can say so far
> is that I'm utterly flummoxed. (And I do wonder if it's a pvfb bug
> which has previously been masked; but that's premature, we can't
> say until we understand how it got here at all.)

I've added Markus, the current PVFB maintainer.

> There's a lot of "?" entries in your backtrace, Jeremy's ones
> look clearer: CONFIG_FRAME_POINTER=y ought to improve yours.

That was with CONFIG_FRAME_POINTER=y / CONFIG_DEBUG_KERNEL=y /
CONFIG_DEBUG_INFO=y. I wonder why it is so crappy...

> In both cases it's handling a page fault: I'm curious as to what kind
> of vma this fault is occurring on. Could you devise a way of getting
> us /proc/<pid>/maps output, together with the faulting address, when
> it hits one of these BUGs? Or should I try to put together a patch
> for that?

I'll have a go but I'm just about to go away for a long weekend so it
might be next week unless I get to it tonight. Any pointers on how to go
about it would be appreciated though...

Ian.

--
Ian Campbell
Current Noise: Enslaved - Intro: "Green Reflection"

The problem with people who have no vices is that generally you can
be pretty sure they're going to have some pretty annoying virtues.
-- Elizabeth Taylor

2008-08-14 17:39:14

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

Hugh Dickins wrote:
> In both cases it's handling a page fault: I'm curious as to what kind
> of vma this fault is occurring on. Could you devise a way of getting
> us /proc/<pid>/maps output, together with the faulting address, when
> it hits one of these BUGs? Or should I try to put together a patch
> for that?
>

Irritatingly, it looks like the i386 register dump doesn't include cr2.
My 64-bit dump has cr2 as 00007fb12cb2a000 which is a good solid
usermode address. I'll run X under strace to see if that gives a clue
to what it corresponds to.

J

2008-08-14 17:49:27

by Hugh Dickins

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Thu, 14 Aug 2008, Ian Campbell wrote:
> On Thu, 2008-08-14 at 14:06 +0100, Hugh Dickins wrote:
> > On Thu, 14 Aug 2008, Ian Campbell wrote:
> > > Jeremy first noticed this
> > > http://marc.info/?l=linux-kernel&m=121783008503477&w=2
> > >
> > > I've bisected it down to:
> > > commit 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
> > > Author: Hugh Dickins <[email protected]>
> > > Date: Mon Jul 28 15:46:19 2008 -0700
> > >
> > > tmpfs: fix kernel BUG in shmem_delete_inode
>
> > In both cases it's handling a page fault: I'm curious as to what kind
> > of vma this fault is occurring on. Could you devise a way of getting
> > us /proc/<pid>/maps output, together with the faulting address, when
> > it hits one of these BUGs? Or should I try to put together a patch
> > for that?
>
> I'll have a go but I'm just about to go away for a long weekend so it
> might be next week unless I get to it tonight. Any pointers on how to go
> about it would be appreciated though...

No, have a good break and come back to it next week. Whatever I suggest
in a rush will at first give you more oopses than you started out with,
back and forth, then it'll give us some info but not enough, back and
forth... no. I'd rather spend the time trying to work out a hypothesis
from my end: it's always easier to devise an info patch once you have
a hypothesis to test, right now it makes no sense to me at all.

My starting point will be, prior to my tmpfs patch the faulting page
had a .set_page_dirty that kept it out of trouble, with that change
of ops now it hasn't; but what kind of page it actually is, a shmem
page (because the ops change affected it) that isn't a shmem page
(because it doesn't have the ops), I don't grasp yet.

Hugh

2008-08-14 19:33:51

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

Hugh Dickins wrote:
> In both cases it's handling a page fault: I'm curious as to what kind
> of vma this fault is occurring on. Could you devise a way of getting
> us /proc/<pid>/maps output, together with the faulting address, when
> it hits one of these BUGs? Or should I try to put together a patch
> for that?

It's a /dev/fb0 mapping:

open("/dev/fb0", O_RDWR) = 8
...
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_SHARED, 8, 0) = 0x7fed69a08000

The fault is 1 page into this mapping:

WARNING: at /home/jeremy/hg/xen/paravirt/linux/fs/buffer.c:711 __set_page_dirty+0x7e/0x113()
Modules linked in: xen_fbfront fb fb_sys_fops sysimgblt sysfillrect xen_netfront syscopyarea xen_kbdfront xen_blkfront
Pid: 1357, comm: X Not tainted 2.6.27-rc2-tip #337

Call Trace:
[<ffffffff802346be>] warn_on_slowpath+0x5d/0x84
[<ffffffff8024a55d>] ? hrtimer_start+0x118/0x13a
[<ffffffff8024a728>] ? ktime_get_ts+0x4e/0x53
[<ffffffff8022c590>] ? hrtick_start_fair+0x11b/0x152
[<ffffffff803ce686>] ? _spin_lock_irqsave+0x2a/0x33
[<ffffffff8023d87e>] ? lock_timer_base+0x2b/0x4f
[<ffffffff803ce6ac>] ? _spin_unlock_irqrestore+0x1d/0x21
[<ffffffff802af7d3>] __set_page_dirty+0x7e/0x113
[<ffffffff802b19e1>] __set_page_dirty_buffers+0x84/0x89
[<ffffffff8026c876>] set_page_dirty+0x46/0xb2
[<ffffffff8026d9a1>] set_page_dirty_balance+0x17/0x4d
[<ffffffff80274817>] __do_fault+0x494/0x4ae
[<ffffffff803cccff>] ? thread_return+0x50/0xc3
[<ffffffff802765b3>] handle_mm_fault+0x4a1/0x967
[<ffffffff803ce640>] ? _spin_lock_irq+0x16/0x19
[<ffffffff8023f8c6>] ? ptrace_stop+0x138/0x15d
[<ffffffff803ce6ac>] ? _spin_unlock_irqrestore+0x1d/0x21
[<ffffffff80224f06>] do_page_fault+0x5d0/0x9fa
[<ffffffff803ce97a>] error_exit+0x0/0x70

---[ end trace a9dc8a283a23b787 ]---
------------[ cut here ]------------
kernel BUG at /home/jeremy/hg/xen/paravirt/linux/lib/radix-tree.c:473!
invalid opcode: 0000 [1] SMP
CPU 0
Modules linked in: xen_fbfront fb fb_sys_fops sysimgblt sysfillrect xen_netfront syscopyarea xen_kbdfront xen_blkfront
Pid: 1357, comm: X Tainted: G W 2.6.27-rc2-tip #337
RIP: e030:[<ffffffff80301cb4>] [<ffffffff80301cb4>] radix_tree_tag_set+0x17/0x9b
RSP: e02b:ffff88000b049cc0 EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88000e47ec08
RBP: ffff88000b049cc8 R08: 2000000000000000 R09: 0000000000000000
R10: 00007fed69a08c80 R11: ffff880001425a90 R12: ffff88000e47ec00
R13: 0000000000000001 R14: ffff88000136c150 R15: ffff88000ef4e700
FS: 00007fed6b4f9780(0000) GS:ffffffff804b6ec0(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fed69a09000 CR3: 000000000cbc4000 CR4: 0000000000002620
^^^^^^^^^^^^^^^^
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process X (pid: 1357, threadinfo ffff88000b048000, task ffff88000cb2ecc0)
Stack: ffff880001425a90 ffff88000b049cf8 ffffffff802af838 ffff88000b049ce8
ffff880001425a90 ffff88000e47ec00 ffff880001425a90 ffff88000b049d18
ffffffff802b19e1 ffff880001425a90 0000000000000001 ffff88000b049d48
Call Trace:
[<ffffffff802af838>] __set_page_dirty+0xe3/0x113
[<ffffffff802b19e1>] __set_page_dirty_buffers+0x84/0x89
[<ffffffff8026c876>] set_page_dirty+0x46/0xb2
[<ffffffff8026d9a1>] set_page_dirty_balance+0x17/0x4d
[<ffffffff80274817>] __do_fault+0x494/0x4ae
[<ffffffff803cccff>] ? thread_return+0x50/0xc3
[<ffffffff802765b3>] handle_mm_fault+0x4a1/0x967
[<ffffffff803ce640>] ? _spin_lock_irq+0x16/0x19
[<ffffffff803ce6ac>] ? _spin_unlock_irqrestore+0x1d/0x21
[<ffffffff80224f06>] do_page_fault+0x5d0/0x9fa
[<ffffffff803ce97a>] error_exit+0x0/0x70


Code: ff eb 02 31 c0 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f c9 c3 55 44 8b 0f 48 89 e5 44 89 c8 53 48 3b 34 c5 60 f0 4b 80 89 d3 76 04 <0f> 0b eb fe 41 6b c1 06 4c 8b 57 08 44 8d 58 fa 49 83 e2 fe 89
RIP [<ffffffff80301cb4>] radix_tree_tag_set+0x17/0x9b
RSP <ffff88000b049cc0>

2008-08-14 21:04:40

by Hugh Dickins

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Thu, 14 Aug 2008, Jeremy Fitzhardinge wrote:
> Hugh Dickins wrote:
> > In both cases it's handling a page fault: I'm curious as to what kind
> > of vma this fault is occurring on. Could you devise a way of getting
> > us /proc/<pid>/maps output, together with the faulting address, when
> > it hits one of these BUGs? Or should I try to put together a patch
> > for that?
>
> It's a /dev/fb0 mapping:
>
> open("/dev/fb0", O_RDWR) = 8
> ...
> mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_SHARED, 8, 0) = 0x7fed69a08000
>
> The fault is 1 page into this mapping:
>
> WARNING: at /home/jeremy/hg/xen/paravirt/linux/fs/buffer.c:711
> __set_page_dirty+0x7e/0x113()
> kernel BUG at /home/jeremy/hg/xen/paravirt/linux/lib/radix-tree.c:473!
> radix_tree_tag_set+0x17/0x9b
> CR2: 00007fed69a09000 CR3: 000000000cbc4000 CR4: 0000000000002620
> ^^^^^^^^^^^^^^^^
> Process X (pid: 1357, threadinfo ffff88000b048000, task ffff88000cb2ecc0)

Brilliant, thanks a lot, Jeremy. That fits, I'd been inching towards
forming the thought that it was likely to involve a block or char device
(rather than a directory, which is what had prompted the patch).

I'd thought about them when making the patch, but quickly decided that
a device node may live in a tmpfs (and usually does with udev), but
redirects off to somewhere else entirely.

If I open /dev/sda and mmap it, then I don't expect to see pages of
shmem, I expect to see pages from my disk. Though if I open /dev/zero
and mmap it, that character device does happen to be the one which
comes back and delivers pages of shmem.

Now if I open /dev/fb0 here and mmap it as you did, and try to write
to it through those pages, I see nothing bad happening: I don't know
for sure what pages it's making available to me, but I hope they're
pages belonging to that driver.

tmpfs doesn't associate its shmem_file_operations with a device node,
so there wouldn't be a way to mmap it, unless the device driver gives
the struct file its own file_operations, including an .mmap method.

It looks like your fb driver is providing a backing_dev_info which
tells vma_wants_writenotify that it wants mapping_cap_account_dirty:
hmm, I suppose the default one would do that, though shmem provided
one which says not. But not providing any address_space_operations
with a .set_page_dirty which would keep it out of trouble.

Before my patch, the device node happened to stay pointing to
shmem_aops, whose set_page_dirty was safe; now it's getting
default behaviour, and hitting these problems.

As you can see, I'm still groping towards the right answer.
The driver probably needs to provide its own backing_dev_info
(or point to a suitable default), and its own address_space_ops,
and perhaps more (there should be examples elsewhere). But whether
it is actually wrong, or whether I was wrong to mess it up, I've
not yet decided.

An additional useful input would be: what happens if you replace
that /dev/fb0 by a symlink /dev/fb0 pointing to an fb0 device node in
one of your disk filesystems? I rather expect that to cause the same
trouble, which would argue that the driver is wrong and shmem right.

Hugh

2008-08-14 22:05:40

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

Hugh Dickins wrote:
> As you can see, I'm still groping towards the right answer.
> The driver probably needs to provide its own backing_dev_info
> (or point to a suitable default), and its own address_space_ops,
> and perhaps more (there should be examples elsewhere). But whether
> it is actually wrong, or whether I was wrong to mess it up, I've
> not yet decided.
>

My understanding is that the driver is doing something a bit clever: it
uses the page dirty flags to determine which parts of the framebuffer
have been written to, and uses that information to minimize the amount
of stuff that needs to be copied out. The writes to the pages are not
expected to generate actual page faults.

But I haven't really looked at it closely, and I'm not at all familiar
with the vm at this layer. I'm not sure how it actually allocates the
framebuffer memory for example (vmalloc? incrementally on faults?).
I'm hoping Markus will leap in, since wrote this stuff. Or, gasp, I'll
read the code myself.

> An additional useful input would be: what happens if you replace
> that /dev/fb0 by a symlink /dev/fb0 pointing to an fb0 device node in
> one of your disk filesystems? I rather expect that to cause the same
> trouble, which would argue that the driver is wrong and shmem right.
>

I don't follow. Do you mean make /dev/fb0 a plain file on a filesystem?
Or make it a disk device node? Something else?

J

2008-08-14 23:14:31

by Johannes Weiner

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

Hi,

Jeremy Fitzhardinge <[email protected]> writes:

> Hugh Dickins wrote:
>> As you can see, I'm still groping towards the right answer.
>> The driver probably needs to provide its own backing_dev_info
>> (or point to a suitable default), and its own address_space_ops,
>> and perhaps more (there should be examples elsewhere). But whether
>> it is actually wrong, or whether I was wrong to mess it up, I've
>> not yet decided.
>>
>
> My understanding is that the driver is doing something a bit clever:
> it uses the page dirty flags to determine which parts of the
> framebuffer have been written to, and uses that information to
> minimize the amount of stuff that needs to be copied out. The writes
> to the pages are not expected to generate actual page faults.
>
> But I haven't really looked at it closely, and I'm not at all familiar
> with the vm at this layer. I'm not sure how it actually allocates the
> framebuffer memory for example (vmalloc? incrementally on faults?).
> I'm hoping Markus will leap in, since wrote this stuff. Or, gasp,
> I'll read the code myself.
>
>> An additional useful input would be: what happens if you replace
>> that /dev/fb0 by a symlink /dev/fb0 pointing to an fb0 device node in
>> one of your disk filesystems? I rather expect that to cause the same
>> trouble, which would argue that the driver is wrong and shmem right.
>>
>
> I don't follow. Do you mean make /dev/fb0 a plain file on a
> filesystem? Or make it a disk device node? Something else?

Creating a device node on a different filesystem to see if the driver
only worked with the safe shmem set_page_dirty and now breaks due to
exposure to the generic version. Or if the driver works with the
generic version through other mappings and the shmem code screws it up
somewhere else.

Hannes

2008-08-15 00:01:20

by Hugh Dickins

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Fri, 15 Aug 2008, Johannes Weiner wrote:
> Jeremy Fitzhardinge <[email protected]> writes:
> > Hugh Dickins wrote:
> >
> >> An additional useful input would be: what happens if you replace
> >> that /dev/fb0 by a symlink /dev/fb0 pointing to an fb0 device node in
> >> one of your disk filesystems? I rather expect that to cause the same
> >> trouble, which would argue that the driver is wrong and shmem right.
> >
> > I don't follow. Do you mean make /dev/fb0 a plain file on a
> > filesystem? Or make it a disk device node? Something else?
>
> Creating a device node on a different filesystem to see if the driver
> only worked with the safe shmem set_page_dirty and now breaks due to
> exposure to the generic version. Or if the driver works with the
> generic version through other mappings and the shmem code screws it up
> somewhere else.

Yes, that's it. I think it was ext2 I referred to, when I worried
about this when making the change to tmpfs; and my reading of it
was that ext2 left a device node's a_ops unset, as I was changing
tmpfs to do. (Looking at it again, ext2 doesn't even specify its
.set_page_dirty, so even if it had assigned an a_ops, it wouldn't
have avoided the default behaviour.) But I'd like to hear what
actually happens in practice, rather than relying on my reading.

Hugh

2008-08-15 12:24:00

by Markus Armbruster

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

Jeremy Fitzhardinge <[email protected]> writes:

> Hugh Dickins wrote:
>> As you can see, I'm still groping towards the right answer.
>> The driver probably needs to provide its own backing_dev_info
>> (or point to a suitable default), and its own address_space_ops,
>> and perhaps more (there should be examples elsewhere). But whether
>> it is actually wrong, or whether I was wrong to mess it up, I've
>> not yet decided.
>>
>
> My understanding is that the driver is doing something a bit clever:
> it uses the page dirty flags to determine which parts of the
> framebuffer have been written to, and uses that information to
> minimize the amount of stuff that needs to be copied out. The writes

Yes.

> to the pages are not expected to generate actual page faults.
>
> But I haven't really looked at it closely, and I'm not at all familiar
> with the vm at this layer. I'm not sure how it actually allocates the
> framebuffer memory for example (vmalloc? incrementally on faults?).

vmalloc()

> I'm hoping Markus will leap in, since wrote this stuff. Or, gasp,
> I'll read the code myself.

The actual cleverness is in fb_defio[*], which was written by Jaya
Kumar (cc'ed). I merely ripped out the old, somewhat racy cleverness
I inherited from Anthony Liguori (which you can still admire in Xen's
2.6.18 kernel), and switched over to use fb_defio instead. Because
one instance of clever code is enough.

My understanding of fb_defio's inner workings is rather limited I
fear. I'm just using it.

Jaya, could you help?

[...]


[*] Documentation/fb/deferred_io.txt drivers/video/fb_defio.c

2008-08-17 03:55:15

by zhang wenjie

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

> On Thu, 14 Aug 2008, Ian Campbell wrote:
> > Jeremy first noticed this
> > http://marc.info/?l=linux-kernel&m=121783008503477&w=2 <http://marc.info/?l=linux-kernel&m=121783008503477&w=2>

I have counted the same problem when i mmap the /dev/fb0 and memset
it to 0( the fb driver use deferred_io and when i do not use deferred_io
it works well) .This bug also showed int linux2.6.26 and linux2.6.25.
does this bug solved?

function fb_deferred_io_fault called<3>function fb_deferred_io_mkwrite called<3>function fb_deferred_io_fault called<3>function fb_deferred_io_mkwrite called<3>function fb_deferred_io_fault called<3>function fb_deferred_io_mkwrite called<2>kernel BUG at lib/radix-tree.c:474!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c3ef8000
[00000000] *pgd=0bd25031, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#2]
Modules linked in: etrackfb_new sony_prs_505
CPU: 0 Tainted: G D W (2.6.26-rc9-00056-g99b0915-dirty #1)
PC is at __bug+0x20/0x2c
LR is at log_wait+0x0/0x8
pc : [<c002118c>] lr : [<c024f1f8>] psr: 20000093
sp : c3ed7d9c ip : c3ed7ce0 fp : c3ed7da8
r10: 00000002 r9 : 00000000 r8 : 40137000
r7 : 00000000 r6 : c3811294 r5 : 00000000 r4 : 00000000
r3 : 00000000 r2 : c024f214 r1 : 0001233d r0 : 00000027
Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
Control: c000717f Table: 0bef8000 DAC: 00000015
Process framebuff.ko (pid: 215, stack limit = 0xc3ed6260)
Stack: (0xc3ed7d9c to 0xc3ed8000)
7d80: c3ed7dd8
7da0: c3ed7dac c01069d4 c002117c c3ed7db8 00000000 c0310b40 c3811290 00000000
7dc0: 40137000 c3d2c900 00000000 c3ed7df4 c3ed7ddc c00a639c c010692c c0310b40
7de0: 00000001 c3d1339c c3ed7e04 c3ed7df8 c00a65f8 c00a6248 c3ed7e1c c3ed7e08
7e00: c00660d8 c00a6554 c0310b40 00000001 c3ed7e34 c3ed7e20 c0066ba4 c006609c
7e20: 0bd5a0ff c0310b40 c3ed7e7c c3ed7e38 c006d684 c0066b9c c3ef9000 00000001
7e40: 00000001 00000001 00000002 40137000 c0310b40 c3ef8000 00000000 c3d20f00
7e60: c3d1339c 40137000 00000800 00001000 c3ed7ecc c3ed7e80 c006eb64 c006d318
7e80: 00000002 00000001 00000000 c02ffe60 c3d2c900 c3ed7eb4 000004dc c3ef9000
7ea0: c004c368 ffffffff c3d1339c c3d20f00 c3d2c938 c3d2c900 c3ed7fb0 40137000
7ec0: c3ed7f04 c3ed7ed0 c0022dc0 c006e8b8 c02ffe60 c3ef8000 00000817 ffffffff
7ee0: c024e630 00000817 c3ed7fb0 40137000 00000000 4012f000 c3ed7fac c3ed7f08
7f00: c001d1dc c0022ce8 00000000 000000fb c3d2c900 c3d30680 00000224 c3ef9000
7f20: 00076000 ffffffff 00000200 c3d136e4 c3ed7f4c c3ed7f40 c004f998 c0107a20
7f40: c3ed7f84 c3ed7f50 c0022e50 c004f998 00075300 c3ed7f70 00000000 ffffffff
7f60: 00001000 000086c0 00000001 00008520 00000000 4012f000 c3ed7f9c c3ed7f88
7f80: c0022f98 c0022ce8 ffffffff ffffffff 000086b4 000086c0 00000001 00008520
7fa0: 00000000 c3ed7fb0 c001d9c0 c001d1b0 40135000 00000000 000732f8 40137000
7fc0: beb2bed4 000086b4 000086c0 00000001 00008520 00000000 4012f000 beb2bea8
7fe0: 40089810 beb2bd6c 00008674 40089838 20000010 ffffffff ffffffff ffffffff
Backtrace:
[<c002116c>] (__bug+0x0/0x2c) from [<c01069d4>] (radix_tree_tag_set+0xb8/0xfc)
[<c010691c>] (radix_tree_tag_set+0x0/0xfc) from [<c00a639c>] (__set_page_dirty+0x164/0x198)
[<c00a6238>] (__set_page_dirty+0x0/0x198) from [<c00a65f8>] (__set_page_dirty_buffers+0xb4/0xc4)
r6:c3d1339c r5:00000001 r4:c0310b40
[<c00a6544>] (__set_page_dirty_buffers+0x0/0xc4) from [<c00660d8>] (set_page_dirty+0x4c/0xd0)
[<c006608c>] (set_page_dirty+0x0/0xd0) from [<c0066ba4>] (set_page_dirty_balance+0x18/0x5c)
r5:00000001 r4:c0310b40
[<c0066b8c>] (set_page_dirty_balance+0x0/0x5c) from [<c006d684>] (__do_fault+0x37c/0x3a4)
r5:c0310b40 r4:0bd5a0ff
[<c006d308>] (__do_fault+0x0/0x3a4) from [<c006eb64>] (handle_mm_fault+0x2bc/0x5f0)
[<c006e8a8>] (handle_mm_fault+0x0/0x5f0) from [<c0022dc0>] (do_page_fault+0xe8/0x224)
[<c0022cd8>] (do_page_fault+0x0/0x224) from [<c001d1dc>] (do_DataAbort+0x3c/0xa0)
[<c001d1a0>] (do_DataAbort+0x0/0xa0) from [<c001d9c0>] (ret_from_exception+0x0/0x10)
Exception stack(0xc3ed7fb0 to 0xc3ed7ff8)
7fa0: 40135000 00000000 000732f8 40137000
7fc0: beb2bed4 000086b4 000086c0 00000001 00008520 00000000 4012f000 beb2bea8
7fe0: 40089810 beb2bd6c 00008674 40089838 20000010 ffffffff
r8:00008520 r7:00000001 r6:000086c0 r5:000086b4 r4:ffffffff
Code: e1a01000 e59f000c eb006243 e3a03000 (e5833000)
---[ end trace 4cc49dda9c86eac6 ]---
Segmentation fault


2008-08-17 12:09:18

by Jaya Kumar

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Thu, Aug 14, 2008 at 6:48 PM, Markus Armbruster <[email protected]> wrote:
> Jeremy Fitzhardinge <[email protected]> writes:
>
>> Hugh Dickins wrote:
>>> As you can see, I'm still groping towards the right answer.
>>> The driver probably needs to provide its own backing_dev_info
>>> (or point to a suitable default), and its own address_space_ops,
>>> and perhaps more (there should be examples elsewhere). But whether
>>> it is actually wrong, or whether I was wrong to mess it up, I've
>>> not yet decided.
>>>
>>
>> My understanding is that the driver is doing something a bit clever:
>> it uses the page dirty flags to determine which parts of the
>> framebuffer have been written to, and uses that information to
>> minimize the amount of stuff that needs to be copied out. The writes
>
> Yes.
>
>> to the pages are not expected to generate actual page faults.
>>
>> But I haven't really looked at it closely, and I'm not at all familiar
>> with the vm at this layer. I'm not sure how it actually allocates the
>> framebuffer memory for example (vmalloc? incrementally on faults?).
>
> vmalloc()
>
>> I'm hoping Markus will leap in, since wrote this stuff. Or, gasp,
>> I'll read the code myself.
>
> The actual cleverness is in fb_defio[*], which was written by Jaya
> Kumar (cc'ed). I merely ripped out the old, somewhat racy cleverness
> I inherited from Anthony Liguori (which you can still admire in Xen's
> 2.6.18 kernel), and switched over to use fb_defio instead. Because
> one instance of clever code is enough.
>
> My understanding of fb_defio's inner workings is rather limited I
> fear. I'm just using it.
>
> Jaya, could you help?
>

I will try my best. Ok, I read through the thread. My current
understanding is as follows:

- Jeremy observed this issue when starting Xorg with Xen pvfb on 2.6.27-rc1
- Ian bisected it to 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
- Peter pointed out from the trace we may be dirtying a page not in
the page cache
- Hugh mentioned prior to the bisected patch maybe the faulting page
had a .set_page_dirty that was ok but now it doesn't.
- Jeremy pointed out that the fault is at 1 page in to the /dev/fb0 mapping
- Hugh mentioned:
" The driver probably needs to provide its own backing_dev_info
(or point to a suitable default), and its own address_space_ops,
and perhaps more (there should be examples elsewhere). But whether
it is actually wrong, or whether I was wrong to mess it up, I've
not yet decided. "

In defio, the page mapping is provided through the vm_file that got
setup during mmap.
page->mapping = vma->vm_file->f_mapping;

I haven't figured how setting inode->i_mapping->a_ops is affecting
this. I will pull tip and test with metronomefb and see if I can
reproduce the issue when starting Xfbdev on that and start debugging
from there.

Thanks,
jaya

2008-08-17 14:05:18

by zhang wenjie

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

Jaya Kumar wrote:
> On Thu, Aug 14, 2008 at 6:48 PM, Markus Armbruster <[email protected]> wrote:
>> Jeremy Fitzhardinge <[email protected]> writes:
>>
>>> Hugh Dickins wrote:
>>>> As you can see, I'm still groping towards the right answer.
>>>> The driver probably needs to provide its own backing_dev_info
>>>> (or point to a suitable default), and its own address_space_ops,
>>>> and perhaps more (there should be examples elsewhere). But whether
>>>> it is actually wrong, or whether I was wrong to mess it up, I've
>>>> not yet decided.
>>>>
>>> My understanding is that the driver is doing something a bit clever:
>>> it uses the page dirty flags to determine which parts of the
>>> framebuffer have been written to, and uses that information to
>>> minimize the amount of stuff that needs to be copied out. The writes
>> Yes.
>>
>>> to the pages are not expected to generate actual page faults.
>>>
>>> But I haven't really looked at it closely, and I'm not at all familiar
>>> with the vm at this layer. I'm not sure how it actually allocates the
>>> framebuffer memory for example (vmalloc? incrementally on faults?).
>> vmalloc()
>>
>>> I'm hoping Markus will leap in, since wrote this stuff. Or, gasp,
>>> I'll read the code myself.
>> The actual cleverness is in fb_defio[*], which was written by Jaya
>> Kumar (cc'ed). I merely ripped out the old, somewhat racy cleverness
>> I inherited from Anthony Liguori (which you can still admire in Xen's
>> 2.6.18 kernel), and switched over to use fb_defio instead. Because
>> one instance of clever code is enough.
>>
>> My understanding of fb_defio's inner workings is rather limited I
>> fear. I'm just using it.
>>
>> Jaya, could you help?
>>
>
> I will try my best. Ok, I read through the thread. My current
> understanding is as follows:
>
> - Jeremy observed this issue when starting Xorg with Xen pvfb on 2.6.27-rc1
> - Ian bisected it to 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
> - Peter pointed out from the trace we may be dirtying a page not in
> the page cache
> - Hugh mentioned prior to the bisected patch maybe the faulting page
> had a .set_page_dirty that was ok but now it doesn't.
> - Jeremy pointed out that the fault is at 1 page in to the /dev/fb0 mapping
> - Hugh mentioned:
> " The driver probably needs to provide its own backing_dev_info
> (or point to a suitable default), and its own address_space_ops,
> and perhaps more (there should be examples elsewhere). But whether
> it is actually wrong, or whether I was wrong to mess it up, I've
> not yet decided. "
>
> In defio, the page mapping is provided through the vm_file that got
> setup during mmap.
> page->mapping = vma->vm_file->f_mapping;
>
> I haven't figured how setting inode->i_mapping->a_ops is affecting
> this. I will pull tip and test with metronomefb and see if I can
> reproduce the issue when starting Xfbdev on that and start debugging
> from there.
>
> Thanks,
> jaya

I have counted the same problem when i mmap the /dev/fb0 and memset it
to 0(the fb driver use deferred_io and when i do not use deferred_io it
works well) .This bug also showed int linux2.6.26 and linux2.6.25. and i
set some printk in the radix_tree_tag_set and fb_deferred_io_fault.

radix_tree_tag_set: height is 0
radix_tree_tag_set: index is 0
radix_tree_tag_set: radix_tree_maxindex(height) is 0
radix_tree_tag_set: height is 0
radix_tree_tag_set: index is 0
radix_tree_tag_set: radix_tree_maxindex(height) is 0
radix_tree_tag_set: height is 0
radix_tree_tag_set: index is 0
radix_tree_tag_set: radix_tree_maxindex(height) is 0
radix_tree_tag_set: height is 0
radix_tree_tag_set: index is 0
radix_tree_tag_set: radix_tree_maxindex(height) is 0
mmap address :0x40135000fb_deferred_io_fault, enter

fb_deferred_io_fault, leave
fb_deferred_io_mkwrite, enter
fb_deferred_io_mkwrite, leave
------------[ cut here ]------------
WARNING: at fs/buffer.c:711 __set_page_dirty+0xbc/0x18c()
Modules linked in: etrackfb_new sony_prs_505
[<c0024198>] (dump_stack+0x0/0x14) from [<c003bf40>]
(warn_on_slowpath+0x4c/0x84)
[<c003bef4>] (warn_on_slowpath+0x0/0x84) from [<c00a403c>]
(__set_page_dirty+0xbc/0x18c)
r6:c38114b0 r5:c0319b80 r4:c0272114
[<c00a3f80>] (__set_page_dirty+0x0/0x18c) from [<c00a433c>]
(__set_page_dirty_buffers+0xbc/0xd0)
r6:c3d01738 r5:00000001 r4:c0319b80
[<c00a4280>] (__set_page_dirty_buffers+0x0/0xd0) from [<c0069dc4>]
(set_page_dirty+0x54/0xdc)
[<c0069d70>] (set_page_dirty+0x0/0xdc) from [<c006a8a4>]
(set_page_dirty_balance+0x18/0x64)
r5:00000001 r4:c0319b80
[<c006a88c>] (set_page_dirty_balance+0x0/0x64) from [<c0071524>]
(__do_fault+0x3b8/0x3f0)
r5:c0319b80 r4:0bd5c0ff
[<c007116c>] (__do_fault+0x0/0x3f0) from [<c0072acc>]
(handle_mm_fault+0x2a8/0x3bc)
[<c0072824>] (handle_mm_fault+0x0/0x3bc) from [<c0025be0>]
(do_page_fault+0xe8/0x224)
[<c0025af8>] (do_page_fault+0x0/0x224) from [<c00201dc>]
(do_DataAbort+0x3c/0xa0)
[<c00201a0>] (do_DataAbort+0x0/0xa0) from [<c00209c0>]
(ret_from_exception+0x0/0x10)
Exception stack(0xc3edffb0 to 0xc3edfff8)
ffa0: 40135000 ffffffff 000752f8
40135000
ffc0: becdced4 000086b8 000086c4 00000001 00008520 00000000 4012f000
becdcea8
ffe0: 40089810 becdcd6c 00008670 40089838 20000010 ffffffff

r8:00008520 r7:00000001 r6:000086c4 r5:000086b8 r4:ffffffff
---[ end trace 7cf699b159b0c732 ]---
radix_tree_tag_set: height is 0
radix_tree_tag_set: index is 0
radix_tree_tag_set: radix_tree_maxindex(height) is 0
fb_deferred_io_fault, enter
fb_deferred_io_fault, leave
fb_deferred_io_mkwrite, enter
fb_deferred_io_mkwrite, leave
radix_tree_tag_set: height is 0
radix_tree_tag_set: index is 1
radix_tree_tag_set: radix_tree_maxindex(height) is 0
kernel BUG at lib/radix-tree.c:477!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c3d3c000
[00000000] *pgd=0bd53031, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1]
Modules linked in: etrackfb_new sony_prs_505
CPU: 0 Tainted: G W (2.6.26-00011-g15bc467-dirty #1)
PC is at __bug+0x20/0x2c
LR is at log_wait+0x0/0x8
pc : [<c002418c>] lr : [<c0259200>] psr: 60000093
sp : c3edfda8 ip : c0259200 fp : c3edfdb4
r10: 00000000 r9 : c3edb780 r8 : c38114b4
r7 : 00000001 r6 : c38114b0 r5 : 00000000 r4 : c027ac68
r3 : 00000000 r2 : 00000001 r1 : 00000001 r0 : 00000027
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
Control: c000717f Table: 0bd3c000 DAC: 00000015
Process framebuff.ko (pid: 215, stack limit = 0xc3ede260)
Stack: (0xc3edfda8 to 0xc3ee0000)
fda0: c3edfde4 c3edfdb8 c010fce4 c002417c 00000000
00000000
fdc0: c0319b60 c38114b0 00000000 40136000 c3edb780 00000000 c3edfe00
c3edfde8
fde0: c00a40d8 c010fc1c c0319b60 00000001 c3d01738 c3edfe10 c3edfe04
c00a433c
fe00: c00a3f90 c3edfe28 c3edfe14 c0069dc4 c00a4290 c0319b60 00000001
c3edfe40
fe20: c3edfe2c c006a8a4 c0069d80 0bd5b0ff c0319b60 c3edfe88 c3edfe44
c0071524
fe40: c006a89c 00000001 c3d3d000 00000001 00000001 00000001 40136000
c0319b60
fe60: 00000000 00001000 00000800 c3d01738 40136000 000004d8 c3edb780
c3edfecc
fe80: c3edfe8c c0072acc c007117c 00000001 00000001 00000000 c3d3d000
c3d01738
fea0: c3c5f800 ffffffff c3d01738 c3c5f800 c3edb7b8 c3edb780 c3edffb0
40136000
fec0: c3edff04 c3edfed0 c0025be0 c0072834 c0151c64 c014c444 00000817
ffffffff
fee0: c0258638 00000817 c3edffb0 40136000 00000000 4012f000 c3edffac
c3edff08
ff00: c00201dc c0025b08 00000083 00008520 00000083 4012f000 c3edff44
c025cb4c
ff20: 00000083 c3c12600 00000000 00008520 c3ede000 4012f000 c3edff60
c3edff48
ff40: c005fe38 c005ef6c 00000000 c025cb4c 00000084 c3edff7c c3edff64
c0028bb4
ff60: c005fd28 c025b17c 0000000d c027abb8 c3edff8c c3edff80 c0028c38
c0028b84
ff80: c3edffac c3edff90 c0020048 ffffffff 000086b8 000086c4 00000001
00008520
ffa0: 00000000 c3edffb0 c00209c0 c00201b0 40135000 ffffffff 000742f8
40136000
ffc0: becdced4 000086b8 000086c4 00000001 00008520 00000000 4012f000
becdcea8
ffe0: 40089810 becdcd6c 00008670 40089838 20000010 ffffffff ffffffff
ffffffff
Backtrace:
[<c002416c>] (__bug+0x0/0x2c) from [<c010fce4>]
(radix_tree_tag_set+0xd8/0x12c)
[<c010fc0c>] (radix_tree_tag_set+0x0/0x12c) from [<c00a40d8>]
(__set_page_dirty+0x158/0x18c)
[<c00a3f80>] (__set_page_dirty+0x0/0x18c) from [<c00a433c>]
(__set_page_dirty_buffers+0xbc/0xd0)
r6:c3d01738 r5:00000001 r4:c0319b60
[<c00a4280>] (__set_page_dirty_buffers+0x0/0xd0) from [<c0069dc4>]
(set_page_dirty+0x54/0xdc)
[<c0069d70>] (set_page_dirty+0x0/0xdc) from [<c006a8a4>]
(set_page_dirty_balance+0x18/0x64)
r5:00000001 r4:c0319b60
[<c006a88c>] (set_page_dirty_balance+0x0/0x64) from [<c0071524>]
(__do_fault+0x3b8/0x3f0)
r5:c0319b60 r4:0bd5b0ff
[<c007116c>] (__do_fault+0x0/0x3f0) from [<c0072acc>]
(handle_mm_fault+0x2a8/0x3bc)
[<c0072824>] (handle_mm_fault+0x0/0x3bc) from [<c0025be0>]
(do_page_fault+0xe8/0x224)
[<c0025af8>] (do_page_fault+0x0/0x224) from [<c00201dc>]
(do_DataAbort+0x3c/0xa0)
[<c00201a0>] (do_DataAbort+0x0/0xa0) from [<c00209c0>]
(ret_from_exception+0x0/0x10)
Exception stack(0xc3edffb0 to 0xc3edfff8)
ffa0: 40135000 ffffffff 000742f8
40136000
ffc0: becdced4 000086b8 000086c4 00000001 00008520 00000000 4012f000
becdcea8
ffe0: 40089810 becdcd6c 00008670 40089838 20000010 ffffffff

r8:00008520 r7:00000001 r6:000086c4 r5:000086b8 r4:ffffffff
Code: e1a01000 e59f000c eb0061f3 e3a03000 (e5833000)
---[ end trace 7cf699b159b0c732 ]---
Thanks
Wenjie

2008-08-17 16:19:59

by Ian Campbell

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Fri, 2008-08-15 at 01:00 +0100, Hugh Dickins wrote:
> On Fri, 15 Aug 2008, Johannes Weiner wrote:
> > Jeremy Fitzhardinge <[email protected]> writes:
> > > Hugh Dickins wrote:
> > >
> > >> An additional useful input would be: what happens if you replace
> > >> that /dev/fb0 by a symlink /dev/fb0 pointing to an fb0 device node in
> > >> one of your disk filesystems? I rather expect that to cause the same
> > >> trouble, which would argue that the driver is wrong and shmem right.
> > >
> > > I don't follow. Do you mean make /dev/fb0 a plain file on a
> > > filesystem? Or make it a disk device node? Something else?
> >
> > Creating a device node on a different filesystem to see if the driver
> > only worked with the safe shmem set_page_dirty and now breaks due to
> > exposure to the generic version. Or if the driver works with the
> > generic version through other mappings and the shmem code screws it up
> > somewhere else.
>
> Yes, that's it. I think it was ext2 I referred to, when I worried
> about this when making the change to tmpfs; and my reading of it
> was that ext2 left a device node's a_ops unset, as I was changing
> tmpfs to do. (Looking at it again, ext2 doesn't even specify its
> .set_page_dirty, so even if it had assigned an a_ops, it wouldn't
> have avoided the default behaviour.) But I'd like to hear what
> actually happens in practice, rather than relying on my reading.

Creating /tmp/fb0 on an ext3 filesystem gives the same behaviour.

FWIW the patch below apparently makes it work for me, but I'm not going
to pretend I follow what's going on, why or what else it breaks ;-)

Ian.

diff --git a/drivers/video/fb_defio.c b/drivers/video/fb_defio.c
index 59df132..8414646 100644
--- a/drivers/video/fb_defio.c
+++ b/drivers/video/fb_defio.c
@@ -24,6 +24,10 @@
#include <linux/rmap.h>
#include <linux/pagemap.h>

+static const struct address_space_operations fb_defio_aops = {
+ .set_page_dirty = __set_page_dirty_no_writeback,
+};
+
/* this is to find and return the vmalloc-ed fb pages */
static int fb_deferred_io_fault(struct vm_area_struct *vma,
struct vm_fault *vmf)
@@ -119,6 +123,7 @@ static int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma)
vma->vm_ops = &fb_deferred_io_vm_ops;
vma->vm_flags |= ( VM_IO | VM_RESERVED | VM_DONTEXPAND );
vma->vm_private_data = info;
+ vma->vm_file->f_mapping->a_ops = &fb_defio_aops;
return 0;
}


--
Ian Campbell

Wharbat darbid yarbou sarbay?


Attachments:
signature.asc (197.00 B)
This is a digitally signed message part

2008-08-18 01:33:10

by Nick Piggin

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Monday 18 August 2008 02:19, Ian Campbell wrote:
> On Fri, 2008-08-15 at 01:00 +0100, Hugh Dickins wrote:
> > On Fri, 15 Aug 2008, Johannes Weiner wrote:
> > > Jeremy Fitzhardinge <[email protected]> writes:
> > > > Hugh Dickins wrote:
> > > >> An additional useful input would be: what happens if you replace
> > > >> that /dev/fb0 by a symlink /dev/fb0 pointing to an fb0 device node
> > > >> in one of your disk filesystems? I rather expect that to cause the
> > > >> same trouble, which would argue that the driver is wrong and shmem
> > > >> right.
> > > >
> > > > I don't follow. Do you mean make /dev/fb0 a plain file on a
> > > > filesystem? Or make it a disk device node? Something else?
> > >
> > > Creating a device node on a different filesystem to see if the driver
> > > only worked with the safe shmem set_page_dirty and now breaks due to
> > > exposure to the generic version. Or if the driver works with the
> > > generic version through other mappings and the shmem code screws it up
> > > somewhere else.
> >
> > Yes, that's it. I think it was ext2 I referred to, when I worried
> > about this when making the change to tmpfs; and my reading of it
> > was that ext2 left a device node's a_ops unset, as I was changing
> > tmpfs to do. (Looking at it again, ext2 doesn't even specify its
> > .set_page_dirty, so even if it had assigned an a_ops, it wouldn't
> > have avoided the default behaviour.) But I'd like to hear what
> > actually happens in practice, rather than relying on my reading.
>
> Creating /tmp/fb0 on an ext3 filesystem gives the same behaviour.
>
> FWIW the patch below apparently makes it work for me, but I'm not going
> to pretend I follow what's going on, why or what else it breaks ;-)

I think Iwould prefer fs_defio to use its own set_page_dirty
function. __set_page_dirty_no_writeback is supposed to be used
on pagecache, by filesystems.

2008-08-18 07:54:55

by Ian Campbell

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Mon, 2008-08-18 at 11:32 +1000, Nick Piggin wrote:
> On Monday 18 August 2008 02:19, Ian Campbell wrote:
> > FWIW the patch below apparently makes it work for me, but I'm not going
> > to pretend I follow what's going on, why or what else it breaks ;-)
>
> I think Iwould prefer fs_defio to use its own set_page_dirty
> function. __set_page_dirty_no_writeback is supposed to be used
> on pagecache, by filesystems.

fbdefio: add set_page_dirty handler to deferred IO FB

Fixes kernel BUG at lib/radix-tree.c:473.

Previously the handler was incidentally provided by tmpfs but this was
removed with:

commit 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
Author: Hugh Dickins <[email protected]>
Date: Mon Jul 28 15:46:19 2008 -0700

tmpfs: fix kernel BUG in shmem_delete_inode

relying on this behaviour was incorrect in any case and the BUG
also appeared when the device node was on an ext3 filesystem.

Signed-off-by: Ian Campbell <[email protected]>
Cc: Jaya Kumar <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Jeremy Fitzhardinge <[email protected]>
Cc: Kel Modderman <[email protected]>
Cc: Markus Armbruster <[email protected]>
Cc: Peter Zijlstra <[email protected]>
---
drivers/video/fb_defio.c | 12 ++++++++++++
1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/video/fb_defio.c b/drivers/video/fb_defio.c
index 59df132..214bb7c 100644
--- a/drivers/video/fb_defio.c
+++ b/drivers/video/fb_defio.c
@@ -114,11 +114,23 @@ static struct vm_operations_struct
fb_deferred_io_vm_ops = {
.page_mkwrite = fb_deferred_io_mkwrite,
};

+static int fb_deferred_io_set_page_dirty(struct page *page)
+{
+ if (!PageDirty(page))
+ SetPageDirty(page);
+ return 0;
+}
+
+static const struct address_space_operations fb_deferred_io_aops = {
+ .set_page_dirty = fb_deferred_io_set_page_dirty,
+};
+
static int fb_deferred_io_mmap(struct fb_info *info, struct
vm_area_struct *vma)
{
vma->vm_ops = &fb_deferred_io_vm_ops;
vma->vm_flags |= ( VM_IO | VM_RESERVED | VM_DONTEXPAND );
vma->vm_private_data = info;
+ vma->vm_file->f_mapping->a_ops = &fb_deferred_io_aops;
return 0;
}

--
1.5.6.3

--
Ian Campbell

How many Unix hacks does it take to change a light bulb?
Let's see, can you use a shell script for that or does it need a C program?


Attachments:
signature.asc (197.00 B)
This is a digitally signed message part

2008-08-18 08:05:13

by Peter Zijlstra

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Mon, 2008-08-18 at 08:54 +0100, Ian Campbell wrote:
> On Mon, 2008-08-18 at 11:32 +1000, Nick Piggin wrote:
> > On Monday 18 August 2008 02:19, Ian Campbell wrote:
> > > FWIW the patch below apparently makes it work for me, but I'm not going
> > > to pretend I follow what's going on, why or what else it breaks ;-)
> >
> > I think Iwould prefer fs_defio to use its own set_page_dirty
> > function. __set_page_dirty_no_writeback is supposed to be used
> > on pagecache, by filesystems.
>
> fbdefio: add set_page_dirty handler to deferred IO FB
>
> Fixes kernel BUG at lib/radix-tree.c:473.
>
> Previously the handler was incidentally provided by tmpfs but this was
> removed with:
>
> commit 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
> Author: Hugh Dickins <[email protected]>
> Date: Mon Jul 28 15:46:19 2008 -0700
>
> tmpfs: fix kernel BUG in shmem_delete_inode
>
> relying on this behaviour was incorrect in any case and the BUG
> also appeared when the device node was on an ext3 filesystem.
>
> Signed-off-by: Ian Campbell <[email protected]>
> Cc: Jaya Kumar <[email protected]>
> Cc: Nick Piggin <[email protected]>
> Cc: Hugh Dickins <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Jeremy Fitzhardinge <[email protected]>
> Cc: Kel Modderman <[email protected]>
> Cc: Markus Armbruster <[email protected]>

Acked-by: Peter Zijlstra <[email protected]>

> ---
> drivers/video/fb_defio.c | 12 ++++++++++++
> 1 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/video/fb_defio.c b/drivers/video/fb_defio.c
> index 59df132..214bb7c 100644
> --- a/drivers/video/fb_defio.c
> +++ b/drivers/video/fb_defio.c
> @@ -114,11 +114,23 @@ static struct vm_operations_struct
> fb_deferred_io_vm_ops = {
> .page_mkwrite = fb_deferred_io_mkwrite,
> };
>
> +static int fb_deferred_io_set_page_dirty(struct page *page)
> +{
> + if (!PageDirty(page))
> + SetPageDirty(page);
> + return 0;
> +}
> +
> +static const struct address_space_operations fb_deferred_io_aops = {
> + .set_page_dirty = fb_deferred_io_set_page_dirty,
> +};
> +
> static int fb_deferred_io_mmap(struct fb_info *info, struct
> vm_area_struct *vma)
> {
> vma->vm_ops = &fb_deferred_io_vm_ops;
> vma->vm_flags |= ( VM_IO | VM_RESERVED | VM_DONTEXPAND );
> vma->vm_private_data = info;
> + vma->vm_file->f_mapping->a_ops = &fb_deferred_io_aops;
> return 0;
> }
>
> --
> 1.5.6.3
>

2008-08-18 08:06:12

by Nick Piggin

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Monday 18 August 2008 17:54, Ian Campbell wrote:
> On Mon, 2008-08-18 at 11:32 +1000, Nick Piggin wrote:
> > On Monday 18 August 2008 02:19, Ian Campbell wrote:
> > > FWIW the patch below apparently makes it work for me, but I'm not going
> > > to pretend I follow what's going on, why or what else it breaks ;-)
> >
> > I think Iwould prefer fs_defio to use its own set_page_dirty
> > function. __set_page_dirty_no_writeback is supposed to be used
> > on pagecache, by filesystems.
>
> fbdefio: add set_page_dirty handler to deferred IO FB
>
> Fixes kernel BUG at lib/radix-tree.c:473.
>
> Previously the handler was incidentally provided by tmpfs but this was
> removed with:
>
> commit 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
> Author: Hugh Dickins <[email protected]>
> Date: Mon Jul 28 15:46:19 2008 -0700
>
> tmpfs: fix kernel BUG in shmem_delete_inode
>
> relying on this behaviour was incorrect in any case and the BUG
> also appeared when the device node was on an ext3 filesystem.
>
> Signed-off-by: Ian Campbell <[email protected]>
> Cc: Jaya Kumar <[email protected]>
> Cc: Nick Piggin <[email protected]>
> Cc: Hugh Dickins <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Jeremy Fitzhardinge <[email protected]>
> Cc: Kel Modderman <[email protected]>
> Cc: Markus Armbruster <[email protected]>
> Cc: Peter Zijlstra <[email protected]>

Thanks,
Acked-by: Nick Piggin <[email protected]>

> ---
> drivers/video/fb_defio.c | 12 ++++++++++++
> 1 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/video/fb_defio.c b/drivers/video/fb_defio.c
> index 59df132..214bb7c 100644
> --- a/drivers/video/fb_defio.c
> +++ b/drivers/video/fb_defio.c
> @@ -114,11 +114,23 @@ static struct vm_operations_struct
> fb_deferred_io_vm_ops = {
> .page_mkwrite = fb_deferred_io_mkwrite,
> };
>
> +static int fb_deferred_io_set_page_dirty(struct page *page)
> +{
> + if (!PageDirty(page))
> + SetPageDirty(page);
> + return 0;
> +}
> +
> +static const struct address_space_operations fb_deferred_io_aops = {
> + .set_page_dirty = fb_deferred_io_set_page_dirty,
> +};
> +
> static int fb_deferred_io_mmap(struct fb_info *info, struct
> vm_area_struct *vma)
> {
> vma->vm_ops = &fb_deferred_io_vm_ops;
> vma->vm_flags |= ( VM_IO | VM_RESERVED | VM_DONTEXPAND );
> vma->vm_private_data = info;
> + vma->vm_file->f_mapping->a_ops = &fb_deferred_io_aops;
> return 0;
> }
>
> --
> 1.5.6.3

2008-08-18 08:22:31

by Jaya Kumar

[permalink] [raw]
Subject: Re: kernel BUG at lib/radix-tree.c:473!

On Mon, Aug 18, 2008 at 3:54 AM, Ian Campbell <[email protected]> wrote:
> On Mon, 2008-08-18 at 11:32 +1000, Nick Piggin wrote:
>> On Monday 18 August 2008 02:19, Ian Campbell wrote:
>> > FWIW the patch below apparently makes it work for me, but I'm not going
>> > to pretend I follow what's going on, why or what else it breaks ;-)
>>
>> I think Iwould prefer fs_defio to use its own set_page_dirty
>> function. __set_page_dirty_no_writeback is supposed to be used
>> on pagecache, by filesystems.
>
> fbdefio: add set_page_dirty handler to deferred IO FB
>
> Fixes kernel BUG at lib/radix-tree.c:473.
>
> Previously the handler was incidentally provided by tmpfs but this was
> removed with:
>
> commit 14fcc23fdc78e9d32372553ccf21758a9bd56fa1
> Author: Hugh Dickins <[email protected]>
> Date: Mon Jul 28 15:46:19 2008 -0700
>
> tmpfs: fix kernel BUG in shmem_delete_inode
>
> relying on this behaviour was incorrect in any case and the BUG
> also appeared when the device node was on an ext3 filesystem.
>
> Signed-off-by: Ian Campbell <[email protected]>
> Cc: Jaya Kumar <[email protected]>
> Cc: Nick Piggin <[email protected]>
> Cc: Hugh Dickins <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Jeremy Fitzhardinge <[email protected]>
> Cc: Kel Modderman <[email protected]>
> Cc: Markus Armbruster <[email protected]>
> Cc: Peter Zijlstra <[email protected]>

Acked-by: Jaya Kumar <[email protected]>

Thanks,
jaya