Subject: Re: strange put_rpccred() handling
From: Trond Myklebust
To: OGAWA Hirofumi
Cc: "J. Bruce Fields" , linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org
In-Reply-To: <87ei5sppsx.fsf@devron.myhome.or.jp>
References: <87ei5sppsx.fsf@devron.myhome.or.jp>
Organization: NetApp Inc
Date: Sun, 27 Mar 2011 17:33:46 +0200
Message-ID: <1301240026.22136.27.camel@lade.trondhjem.org>

On Mon, 2011-03-28 at 00:05 +0900, OGAWA Hirofumi wrote:
> Hi,
>
> BUG: atomic_dec_and_test(): -1: atomic counter underflow at:
> Pid: 2827, comm: mount.nfs Not tainted 2.6.38 #1
> Call Trace:
> [] ? put_rpccred+0x44/0x14e [sunrpc]
> [] ? rpc_ping+0x4e/0x58 [sunrpc]
> [] ? rpc_create+0x481/0x4fc [sunrpc]
> [] ? rpcauth_lookup_credcache+0xab/0x22d [sunrpc]
> [] ? nfs_create_rpc_client+0xa6/0xeb [nfs]
> [] ? nfs4_set_client+0xc2/0x1f9 [nfs]
> [] ? nfs4_create_server+0xf2/0x2a6 [nfs]
> [] ? nfs4_remote_mount+0x4e/0x14a [nfs]
> [] ? vfs_kern_mount+0x6e/0x133
> [] ? nfs_do_root_mount+0x76/0x95 [nfs]
> [] ? nfs4_try_mount+0x56/0xaf [nfs]
> [] ? nfs_get_sb+0x435/0x73c [nfs]
> [] ? vfs_kern_mount+0x99/0x133
> [] ? do_kern_mount+0x48/0xd8
> [] ? do_mount+0x6da/0x741
> [] ? sys_mount+0x83/0xc0
> [] ? system_call_fastpath+0x16/0x1b
>
> This is not oops, and debug code is not in vanilla. This debug code is
> simple - detects atomic_dec_and_test() underflow.
>
> Well, so, I think this is real bug of nfs codes somewhere. With some
> review, the code
>
>     rpc_call_sync()
>         rpc_run_task
>             rpc_execute()
>                 __rpc_execute()
>                     rpc_release_task()
>                         rpc_release_resources_task()
>                             put_rpccred()          <= release cred
>         rpc_put_task
>             rpc_do_put_task()
>                 rpc_release_resources_task()
>                     put_rpccred()                  <= release cred again
>
> seems to be release cred unintendedly.
>
> static void rpc_release_resources_task(struct rpc_task *task)
> {
> 	if (task->tk_rqstp)
> 		xprt_release(task);
> 	if (task->tk_msg.rpc_cred) {
> 		put_rpccred(task->tk_msg.rpc_cred);
> 		task->tk_msg.rpc_cred = NULL;
> 	}
> 	rpc_task_release_client(task);
> }
>
> The above change may fix the problem though, I don't know the codes what
> want to do actually. And I guess this is not right fix, because the path
> is looks strange - on early stage, __rpc_execute() calls
> rpc_release_task() explicitly.

Argh! You are completely correct. The intention was that
rpc_release_resources_task() should be able to be called more than once,
which means that we have to set task->tk_msg.rpc_cred to NULL after
freeing the cred.

I've no idea why I missed that when I cleaned up that code. Thanks for
debugging this!

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
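
For anyone following the thread, the "safe to call more than once" release
pattern described in the reply can be sketched in user-space C. This is only
an illustration under stated assumptions: struct cred, struct task,
put_cred(), release_resources() and demo_double_release() are hypothetical
stand-ins and are not the sunrpc types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the sunrpc types. */
struct cred {
	int refcount;
};

struct task {
	struct cred *cred;	/* plays the role of task->tk_msg.rpc_cred */
};

/* Drop one reference; going below zero is the underflow the debug check caught. */
static void put_cred(struct cred *c)
{
	assert(c->refcount > 0);
	c->refcount--;
}

/* Safe to call more than once: the pointer is cleared right after the put. */
static void release_resources(struct task *t)
{
	if (t->cred) {
		put_cred(t->cred);
		t->cred = NULL;
	}
}

/* Mimics the two release paths in the call chain; returns the count left. */
static int demo_double_release(void)
{
	struct cred c = { .refcount = 2 };	/* one ref held by the task, one by the owner */
	struct task t = { .cred = &c };

	release_resources(&t);	/* first path drops the task's reference */
	release_resources(&t);	/* second path finds NULL and does nothing */
	return c.refcount;
}
```

Without the NULL assignment, the second call would drop the owner's reference
as well, which is the kind of over-release the underflow check reported.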