From: "Talpey, Thomas" Subject: [PATCH RFC 1/1] NLM GRANTED callback race Date: Wed, 17 Oct 2007 09:36:46 -0400 Message-ID: Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=====================_175444434==_" To: nfs@lists.sourceforge.net Return-path: Received: from sc8-sf-mx2-b.sourceforge.net ([10.3.1.92] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1Ii96S-000897-F2 for nfs@lists.sourceforge.net; Wed, 17 Oct 2007 06:38:36 -0700 Received: from mx2.netapp.com ([216.240.18.37]) by mail.sourceforge.net with esmtp (Exim 4.44) id 1Ii96X-0000Fo-IV for nfs@lists.sourceforge.net; Wed, 17 Oct 2007 06:38:33 -0700 Received: from svlexrs02.hq.netapp.com (svlexrs02.corp.netapp.com [10.57.156.154]) by smtp2.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id l9HDbR24020393 for ; Wed, 17 Oct 2007 06:37:27 -0700 (PDT) List-Id: "Discussion of NFS under Linux development, interoperability, and testing." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: nfs-bounces@lists.sourceforge.net Errors-To: nfs-bounces@lists.sourceforge.net --=====================_175444434==_ Content-Type: text/plain; charset="us-ascii" I've discovered a serious and somewhat interesting issue in the NLM client that's been present for quite a while. Basically, if an NLM server GRANTED callback arrives for a lock the client already holds, the client *rejects* the lock. This leads to incorrect behavior (since the server may give the lock to another process), but it also may lead to deadlock in the affected client when the next process is on the same machine; it means two processes are contending in the local locking code. We have seen this in the field recently. I have a proposed fix (patch below), which adds further checking to the NLM callback to scan for held locks, if the callback does not find a blocked lock in the client's nlm_blocked list. If found, the client will *accept* the lock instead of rejecting it. We're testing the patch at high scale currently, an earlier version of the patch (for kernel 2.6.15) works well at extreme levels. To see this, you need to have NLM lock contention, and enough network traffic to cause UDP NLM_GRANTED callbacks to be lost in the network. Typically, the client will retry the NLM_LOCK rpc after 30 seconds when this occurs. These retries often succeed, but there are still the old GRANTED callbacks floating around, either at the server or in the network itself. We can duplicate this in minutes at high load, even with server callback retry. Comments? Tom. ===== --=====================_175444434==_ Content-Type: text/plain; charset="us-ascii" Content-Disposition: attachment; filename="04-fix-nlm-late-granted.dif" Author: Tom Talpey NLM: close race condition in GRANTED callback Don't reply with nlm_lck_denied to a late GRANTED callback for a lock which we already hold. These callbacks can be delayed in the network, and also may cross the client's NLMCLNT_POLL_TIMEOUT retry. Signed-off-by: Tom Talpey --- fs/lockd/clntlock.c | 3 ++ fs/lockd/host.c | 52 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/lockd/lockd.h | 2 + 3 files changed, 57 insertions(+) Index: linux-2.6.22.10/fs/lockd/clntlock.c =================================================================== --- linux-2.6.22.10.orig/fs/lockd/clntlock.c +++ linux-2.6.22.10/fs/lockd/clntlock.c @@ -135,6 +135,9 @@ __be32 nlmclnt_grant(const struct sockad wake_up(&block->b_wait); res = nlm_granted; } + /* Do NOT deny a lock which we already hold */ + if (res != nlm_granted && nlmclnt_check_lock(addr, lock)) + res = nlm_granted; return res; } Index: linux-2.6.22.10/include/linux/lockd/lockd.h =================================================================== --- linux-2.6.22.10.orig/include/linux/lockd/lockd.h +++ linux-2.6.22.10/include/linux/lockd/lockd.h @@ -174,6 +174,8 @@ void nlmclnt_next_cookie(struct nlm_c */ struct nlm_host * nlmclnt_lookup_host(const struct sockaddr_in *, int, int, const char *, int); struct nlm_host * nlmsvc_lookup_host(struct svc_rqst *, const char *, int); +int nlmclnt_check_lock(const struct sockaddr_in *, + const struct nlm_lock *); struct rpc_clnt * nlm_bind_host(struct nlm_host *); void nlm_rebind_host(struct nlm_host *); struct nlm_host * nlm_get_host(struct nlm_host *); Index: linux-2.6.22.10/fs/lockd/host.c =================================================================== --- linux-2.6.22.10.orig/fs/lockd/host.c +++ linux-2.6.22.10/fs/lockd/host.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -197,6 +198,57 @@ nlmsvc_lookup_host(struct svc_rqst *rqst } /* + * Find an existing client NLM lock, used for validating old GRANTED callbacks. + */ + +static inline int +nlm_lock_match(const struct nlm_lock *lock, const struct file_lock *fl) +{ + return lock->fl.fl_start == fl->fl_start && + lock->fl.fl_end == fl->fl_end && + lock->svid == fl->fl_u.nfs_fl.owner->pid && + 0 == nfs_compare_fh(&lock->fh, + NFS_FH(fl->fl_file->f_path.dentry->d_inode)); +} + +int +nlmclnt_check_lock(const struct sockaddr_in *sin, const struct nlm_lock *lock) +{ + struct hlist_head *chain; + struct hlist_node *pos; + struct nlm_host *host; + struct file_lock *fl; + int hash; + + dprintk("lockd: nlmclnt_check_lock(%u.%u.%u.%u)\n", + NIPQUAD(sin->sin_addr.s_addr)); + + hash = NLM_ADDRHASH(sin->sin_addr.s_addr); + + /* Lock hash table */ + mutex_lock(&nlm_host_mutex); + + /* We must check all nlm_host objects for a peer, because the + * callback does not provide protocol, version or other context. + * When each peer is found, we scan for any matching granted lock. + * Also check for a granted lock in the process of being reclaimed. */ + chain = &nlm_hosts[hash]; + hlist_for_each_entry(host, pos, chain, h_hash) { + if (!nlm_cmp_addr(&host->h_addr, sin)) + continue; + list_for_each_entry(fl, &host->h_granted, fl_u.nfs_fl.list) + if (nlm_lock_match(lock, fl)) + goto found; + list_for_each_entry(fl, &host->h_reclaim, fl_u.nfs_fl.list) + if (nlm_lock_match(lock, fl)) + goto found; + } + fl = NULL; +found: + mutex_unlock(&nlm_host_mutex); + return fl != NULL; +} +/* * Create the NLM RPC client for an NLM peer */ struct rpc_clnt * --=====================_175444434==_ Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ --=====================_175444434==_ Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs --=====================_175444434==_--