From: Peter Staubach
Subject: Re: 2.6.24.3 kernel BUG at fs/nfs/pagelist.c:82
Date: Fri, 11 Apr 2008 16:10:42 -0400
Message-ID: <47FFC5C2.1070009@redhat.com>
References: <20080319094942.GA7627@janus>
	<1206017233.8465.7.camel@heimdal.trondhjem.org>
	<20080320125716.GA20071@janus>
	<20080410115433.GA29211@janus>
	<1207944436.14621.6.camel@heimdal.trondhjem.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Cc: Frank van Maarseveen, linux-nfs@vger.kernel.org
To: Trond Myklebust
Return-path:
Received: from mx1.redhat.com ([66.187.233.31]:49999 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1758568AbYDKULB (ORCPT ); Fri, 11 Apr 2008 16:11:01 -0400
In-Reply-To: <1207944436.14621.6.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

Trond Myklebust wrote:
> On Thu, 2008-04-10 at 13:54 +0200, Frank van Maarseveen wrote:
>
>> FYI,
>>
>> On Thu, Mar 20, 2008 at 01:57:16PM +0100, Frank van Maarseveen wrote:
>>
>>> On Thu, Mar 20, 2008 at 08:47:13AM -0400, Trond Myklebust wrote:
>>>
>>>> On Wed, 2008-03-19 at 10:49 +0100, Frank van Maarseveen wrote:
>>>>
>>>>> FYI,
>>>>>
>>>>> 2.6.24.3 wrote:
>>>>>
>>>>>> kernel BUG at fs/nfs/pagelist.c:82!
>>>>>>
>>>>> BUG_ON(PagePrivate(page));
>>>>>
>>>>>
>>>>>> invalid opcode: 0000 [#1] SMP
>>>>>> Modules linked in: vmnetfilter vmnet(P) vmmon(P) vmthrottle
>>>>>>
>>>>> In addition, there are some NFS patches for handling >16 groups and
>>>>> selectively disabling attribute caching so it's not a clean kernel.
>>>>>
>>>>>
>>>>>> Pid: 4575, comm: tail Tainted: P (2.6.24.3-x177 #1)
>>>>>>
>>> ^^^^
>>>
>>>>>> EIP: 0060:[] EFLAGS: 00010202 CPU: 1
>>>>>> EIP is at nfs_create_request+0xf4/0x100
>>>>>> EAX: 80000821 EBX: e31a5300 ECX: 00000000 EDX: c1f0712c
>>>>>> ESI: c1f0712c EDI: e31a5338 EBP: e56dfd90 ESP: e56dfd74
>>>>>> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
>>>>>> Process tail (pid: 4575, ti=e56de000 task=d4b65500 task.ti=e56de000)
>>>>>> Stack: 00000000 f669ad20 cac3c168 e7330cb0 00000000 00000000 cac3c168 e56dfdc8
>>>>>> c01fded5 00000000 000000a4 039cffff 000000a4 c1f0712c cac3c168 e7330cb0
>>>>>> e56dfdb4 e56dfdb4 ffffff8c c1f0712c e7330cb0 e56dfdf0 c01fe8ce e56dfddc
>>>>>> Call Trace:
>>>>>> [] show_trace_log_lvl+0x1a/0x30
>>>>>> [] show_stack_log_lvl+0x9a/0xc0
>>>>>> [] show_registers+0xc8/0x1d0
>>>>>> [] die+0x10c/0x230
>>>>>> [] do_trap+0x91/0xd0
>>>>>> [] do_invalid_op+0x89/0xa0
>>>>>> [] error_code+0x72/0x80
>>>>>> [] nfs_readpage_async+0xb5/0x1b0
>>>>>> [] nfs_readpage+0xae/0x120
>>>>>> [] do_generic_mapping_read+0xe8/0x440
>>>>>> [] generic_file_aio_read+0x160/0x190
>>>>>> [] nfs_file_read+0x97/0xe0
>>>>>> [] do_sync_read+0xc7/0x120
>>>>>> [] vfs_read+0x84/0x130
>>>>>> [] sys_read+0x3d/0x70
>>>>>> [] syscall_call+0x7/0xb
>>>>>> =======================
>>>>>> Code: 02 75 0a e8 4f dc 3b 00 e9 4a ff ff ff 83 c4 10 b8 00 fe ff ff 5b 5e 5f 5d c3 8b 56 0c e9 7a ff ff ff 0f 0b eb fe 90 0f 0b eb fe <0f> 0b eb fe 90 8d b4 26 00 00 00 00 55 89 e5 53 83 ec 04 89 c3
>>>>>> EIP: [] nfs_create_request+0xf4/0x100 SS:ESP 0068:e56dfd74
>>>>>> ---[ end trace 0ef921372ea6410b ]---
>>>>>>
>>>>> The machine is a quad Xeon with 4GB ram with CONFIG_HIGHMEM64G=y
>>>>>
>>>> Would that be on a file that was open for read and write, or is it
>>>> possible that some other process was writing to the same file? If so,
>>>> then it might be a bug in nfs_wb_page().
>>>>
>>> Yes, I'm quite sure it was a "tail -f" on a logfile which gets
>>> continuously appended to by another process.
>>> So, one process reads it
>>> while another one writes to it through different descriptors/struct file.
>>>
>> The problem occurred again on a different box under exactly the same
>> userland conditions yielding exactly the same stack trace. Kernels are
>> identical but no vmware modules this time.
>>
>
> Just a quick question: how does your > 16 groups patch behave when it is
> denied a write with an EACCES error? I've got a feeling that this may be
> due to the page getting redirtied and the RPC call retried. If so, then
> the following patch may help.
>
> Cheers
> Trond
> --------------------------------------------------------------------------
> From: Trond Myklebust
> Date: Fri, 11 Apr 2008 16:03:54 -0400
> Subject: NFS: Fix nfs_wb_page() to always exit with an error or a clean page
>
> It is possible for nfs_wb_page() to sometimes exit with 0 return value, yet
> the page is left in a dirty state.
> For instance in the case where the server rebooted, and the COMMIT request
> failed, then all the previously "clean" pages which were cached by the
> server, but were not guaranteed to have been written out to disk,
> have to be redirtied and resent to the server.
> The fix is to have nfs_wb_page_priority() check that the page is clean
> before it exits...
>
>

Is this a problem if the page is mmap'd into a process address space
which is constantly touching it?

    Thanx...

       ps

> This fixes a condition that triggers the BUG_ON(PagePrivate(page)) in
> nfs_create_request() when we're in the nfs_readpage() path.
>
> Also eliminate a redundant BUG_ON(!PageLocked(page)) while we're at it. It
> turns out that clear_page_dirty_for_io() has the exact same test.
>
> Signed-off-by: Trond Myklebust
> ---
>
>  fs/nfs/write.c |   24 +++++++++++++-----------
>  1 files changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index ce40cad..3a2e821 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -1493,18 +1493,20 @@ static int nfs_wb_page_priority(struct inode *inode, struct page *page,
>  	};
>  	int ret;
>
> -	BUG_ON(!PageLocked(page));
> -	if (clear_page_dirty_for_io(page)) {
> -		ret = nfs_writepage_locked(page, &wbc);
> +	do {
> +		if (clear_page_dirty_for_io(page)) {
> +			ret = nfs_writepage_locked(page, &wbc);
> +			if (ret < 0)
> +				goto out_error;
> +		} else if (!PagePrivate(page))
> +			break;
> +		ret = nfs_sync_mapping_wait(page->mapping, &wbc, how);
>  		if (ret < 0)
> -			goto out;
> -	}
> -	if (!PagePrivate(page))
> -		return 0;
> -	ret = nfs_sync_mapping_wait(page->mapping, &wbc, how);
> -	if (ret >= 0)
> -		return 0;
> -out:
> +			goto out_error;
> +		/* Is the test for PageDirty() really needed here? */
> +	} while (PagePrivate(page) || PageDirty(page));
> +	return 0;
> +out_error:
>  	__mark_inode_dirty(inode, I_DIRTY_PAGES);
>  	return ret;
>  }
>