Return-Path:
Received: from mail-qg0-f52.google.com ([209.85.192.52]:32977 "EHLO mail-qg0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751065AbbJJLT2 (ORCPT ); Sat, 10 Oct 2015 07:19:28 -0400
Received: by qgew37 with SMTP id w37so30264173qge.0 for ; Sat, 10 Oct 2015 04:19:27 -0700 (PDT)
Date: Sat, 10 Oct 2015 07:19:23 -0400
From: Jeff Layton
To: "J. Bruce Fields"
Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Al Viro
Subject: Re: [PATCH v5 00/20] nfsd: open file caching
Message-ID: <20151010071923.435ce037@synchrony.poochiereds.net>
In-Reply-To: <20151008180400.GB496@fieldses.org>
References: <1444042962-6947-1-git-send-email-jeff.layton@primarydata.com> <20151008164225.GA496@fieldses.org> <20151008125529.3f30308e@synchrony.poochiereds.net> <20151008180400.GB496@fieldses.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Thu, 8 Oct 2015 14:04:00 -0400 "J. Bruce Fields" wrote:

> On Thu, Oct 08, 2015 at 12:55:29PM -0400, Jeff Layton wrote:
> > My bad...it needs this patch. I'll roll this into the set before the
> > next posting.
>
> Oh, good, thanks.
>
> Also, just seen on the server side--not sure what was going on at the
> time.
>
> There were a ton of these:
>
> Oct 08 12:35:07 f21-1.fieldses.org kernel: ------------[ cut here ]------------
> Oct 08 12:35:07 f21-1.fieldses.org kernel: WARNING: CPU: 1 PID: 584 at lib/list_debug.c:59 __list_del_entry+0x9e/0xc0()
> Oct 08 12:35:07 f21-1.fieldses.org kernel: list_del corruption. prev->next should be ffff88004cb23f80, but was b6a7e8df8948e4eb
> Oct 08 12:35:07 f21-1.fieldses.org kernel: Modules linked in: rpcsec_gss_krb5 nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc
> Oct 08 12:35:07 f21-1.fieldses.org kernel: CPU: 1 PID: 584 Comm: fsnotify_mark Not tainted 4.3.0-rc3-14186-g7619b8e #322
> Oct 08 12:35:07 f21-1.fieldses.org kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
> Oct 08 12:35:07 f21-1.fieldses.org kernel: ffffffff81f62683 ffff880071af3d50 ffffffff8160540c ffff880071af3d98
> Oct 08 12:35:07 f21-1.fieldses.org kernel: ffff880071af3d88 ffffffff81077692 ffff88004cb23f80 ffffffff8109c160
> Oct 08 12:35:07 f21-1.fieldses.org kernel: ffff880071af3e08 ffff880071af3e30 ffff88004cb23f70 ffff880071af3de8
> Oct 08 12:35:07 f21-1.fieldses.org kernel: Call Trace:
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] dump_stack+0x4e/0x82
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] warn_slowpath_common+0x82/0xc0
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] ? sort_range+0x20/0x30
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] warn_slowpath_fmt+0x4c/0x50
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] __list_del_entry+0x9e/0xc0
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] fsnotify_mark_destroy+0x95/0x140
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] ? wait_woken+0x90/0x90
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] ? fsnotify_put_mark+0x30/0x30
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] kthread+0xef/0x110
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] ? _raw_spin_unlock_irq+0x2c/0x50
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] ? kthread_create_on_node+0x200/0x200
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] ret_from_fork+0x3f/0x70
> Oct 08 12:35:07 f21-1.fieldses.org kernel: [] ? kthread_create_on_node+0x200/0x200
> Oct 08 12:35:07 f21-1.fieldses.org kernel: ---[ end trace 687abd8552e06b32 ]---
>

Thanks for the bug report!
I think I understand the problem now: It's in the way this patchset
embeds a fsnotify_mark inside the nfsd_file. The way
fsnotify_destroy_mark works sort of requires that it be freed
separately since it wants to traverse these objects under a srcu read
lock.

The rest of the stack traces are probably collateral damage from that
mem corruption.

I think I'll have to change the code to allocate the fsnotify_mark
objects separately. It may also be better to have just one mark per
inode and have each nfsd_file take a reference to the mark. I'll need
to stare at the code a bit longer to see what makes the most sense.

-- 
Jeff Layton
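
The second idea in the message above (one mark per inode, with each
nfsd_file holding a reference instead of embedding the mark) can be
illustrated with a small userspace sketch. Everything here is
hypothetical: the sketch_* names are stand-ins, not the kernel's
fsnotify_mark or nfsd_file, the refcount is a plain int with no locking,
and there is no SRCU. It only shows the lifetime rule that freeing the
file object no longer frees the mark's memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a separately allocated, refcounted mark. */
struct sketch_mark {
	int refcount;           /* holders of this mark (single-threaded sketch) */
	unsigned long inode_id; /* illustrative key: one mark per inode */
};

/* Hypothetical stand-in for a cached open file. */
struct sketch_file {
	struct sketch_mark *mark; /* pointer to a shared mark, not an embedded struct */
};

static int sketch_marks_live; /* marks allocated and not yet freed */

/* Allocate a mark; the caller owns the initial reference. */
static struct sketch_mark *sketch_mark_alloc(unsigned long inode_id)
{
	struct sketch_mark *m = malloc(sizeof(*m));

	if (!m)
		return NULL;
	m->refcount = 1;
	m->inode_id = inode_id;
	sketch_marks_live++;
	return m;
}

static struct sketch_mark *sketch_mark_get(struct sketch_mark *m)
{
	m->refcount++;
	return m;
}

/* Drop one reference; the mark is freed only when the last one goes. */
static void sketch_mark_put(struct sketch_mark *m)
{
	assert(m->refcount > 0);
	if (--m->refcount == 0) {
		sketch_marks_live--;
		free(m);
	}
}

/* Each file takes its own reference to the shared mark. */
static struct sketch_file *sketch_file_open(struct sketch_mark *m)
{
	struct sketch_file *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	f->mark = sketch_mark_get(m);
	return f;
}

static void sketch_file_close(struct sketch_file *f)
{
	struct sketch_mark *m = f->mark;

	free(f);            /* the file's memory goes away first ... */
	sketch_mark_put(m); /* ... the mark survives while others hold it */
}
```

With two files opened against one mark, closing either file leaves the
mark allocated; it is released only after the last put. In the kernel
the final free would additionally have to wait for readers, which this
sketch does not model.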