From: NeilBrown
To: Jeff Layton, Alexander Viro
Cc: "J. Bruce Fields", Martin Wilck, linux-fsdevel@vger.kernel.org, Frank Filz, linux-kernel@vger.kernel.org
Date: Mon, 05 Nov 2018 12:30:48 +1100
Subject: [PATCH 10/12] fs/locks: create a tree of dependent requests.
Message-ID: <154138144796.31651.14201944346371750178.stgit@noble>
In-Reply-To: <154138128401.31651.1381177427603557514.stgit@noble>
References: <154138128401.31651.1381177427603557514.stgit@noble>
X-Mailing-List: linux-kernel@vger.kernel.org

When we find an existing lock which conflicts with a request,
and the request wants to wait, we currently add the request
to a list.  When the lock is removed, the whole list is woken.
This can cause the thundering-herd problem.

To reduce the problem, we make use of the (new) fact that a pending
request can itself have a list of blocked requests.  When we find a
conflict, we look through the existing blocked requests.
If any one of them blocks the new request, the new request is attached
below that request; otherwise it is added to the list of blocked
requests, which are now known to be mutually non-conflicting.

This way, when the lock is released, only a set of non-conflicting
locks will be woken; the rest can stay asleep.
If the lock request cannot be granted and the request needs to be
requeued, all the other requests it blocks will then be woken.

Reported-and-tested-by: Martin Wilck
Signed-off-by: NeilBrown
---
 fs/locks.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 802d5853acd5..1b0eac6b2918 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -715,11 +715,25 @@ static void locks_delete_block(struct file_lock *waiter)
  * fl_blocked list itself is protected by the blocked_lock_lock, but by ensuring
  * that the flc_lock is also held on insertions we can avoid taking the
  * blocked_lock_lock in some cases when we see that the fl_blocked list is empty.
+ *
+ * Rather than just adding to the list, we check for conflicts with any existing
+ * waiters, and add beneath any waiter that blocks the new waiter.
+ * Thus wakeups don't happen until needed.
  */
 static void __locks_insert_block(struct file_lock *blocker,
-					struct file_lock *waiter)
+				 struct file_lock *waiter,
+				 bool conflict(struct file_lock *,
+					       struct file_lock *))
 {
+	struct file_lock *fl;
 	BUG_ON(!list_empty(&waiter->fl_block));
+
+new_blocker:
+	list_for_each_entry(fl, &blocker->fl_blocked, fl_block)
+		if (conflict(fl, waiter)) {
+			blocker = fl;
+			goto new_blocker;
+		}
 	waiter->fl_blocker = blocker;
 	list_add_tail(&waiter->fl_block, &blocker->fl_blocked);
 	if (IS_POSIX(blocker) && !IS_OFDLCK(blocker))
@@ -734,10 +748,12 @@ static void __locks_insert_block(struct file_lock *blocker,
 
 /* Must be called with flc_lock held. */
 static void locks_insert_block(struct file_lock *blocker,
-					struct file_lock *waiter)
+			       struct file_lock *waiter,
+			       bool conflict(struct file_lock *,
+					     struct file_lock *))
 {
 	spin_lock(&blocked_lock_lock);
-	__locks_insert_block(blocker, waiter);
+	__locks_insert_block(blocker, waiter, conflict);
 	spin_unlock(&blocked_lock_lock);
 }
 
@@ -996,7 +1012,7 @@ static int flock_lock_inode(struct inode *inode, struct file_lock *request)
 		if (!(request->fl_flags & FL_SLEEP))
 			goto out;
 		error = FILE_LOCK_DEFERRED;
-		locks_insert_block(fl, request);
+		locks_insert_block(fl, request, flock_locks_conflict);
 		goto out;
 	}
 	if (request->fl_flags & FL_ACCESS)
@@ -1071,7 +1087,8 @@ static int posix_lock_inode(struct inode *inode, struct file_lock *request,
 			spin_lock(&blocked_lock_lock);
 			if (likely(!posix_locks_deadlock(request, fl))) {
 				error = FILE_LOCK_DEFERRED;
-				__locks_insert_block(fl, request);
+				__locks_insert_block(fl, request,
+						     posix_locks_conflict);
 			}
 			spin_unlock(&blocked_lock_lock);
 			goto out;
@@ -1542,7 +1559,7 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 		break_time -= jiffies;
 	if (break_time == 0)
 		break_time++;
-	locks_insert_block(fl, new_fl);
+	locks_insert_block(fl, new_fl, leases_conflict);
 	trace_break_lease_block(inode, new_fl);
 	spin_unlock(&ctx->flc_lock);
 	percpu_up_read_preempt_enable(&file_rwsem);
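
For anyone who wants to play with the insertion logic outside the kernel,
here is a minimal user-space sketch of the same descent.  It is only an
illustration under stated assumptions: struct toy_lock, toy_conflict() and
toy_insert_block() are hypothetical names that do not exist in fs/locks.c,
the child list is a plain singly linked list rather than a list_head, and
there is no locking at all.  The conflict rule (overlapping ranges, at
least one side exclusive) is a simplification, not posix_locks_conflict().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file_lock (not a kernel type). */
struct toy_lock {
	long start, end;		/* inclusive byte range */
	bool exclusive;			/* true for a write lock */
	struct toy_lock *blocker;	/* parent in the tree, like fl_blocker */
	struct toy_lock *blocked;	/* first child, like the fl_blocked list */
	struct toy_lock *next;		/* next sibling, like fl_block */
};

typedef bool (*conflict_fn)(const struct toy_lock *, const struct toy_lock *);

/* Simplified rule: ranges overlap and at least one lock is exclusive. */
static bool toy_conflict(const struct toy_lock *a, const struct toy_lock *b)
{
	if (!a->exclusive && !b->exclusive)
		return false;
	return a->start <= b->end && b->start <= a->end;
}

/*
 * Attach waiter somewhere below blocker.  If an existing child of the
 * current blocker conflicts with waiter, descend into that child and
 * rescan its children; otherwise append waiter to the child list.
 */
static void toy_insert_block(struct toy_lock *blocker,
			     struct toy_lock *waiter, conflict_fn conflict)
{
	struct toy_lock *fl, **tail;

	assert(waiter->blocker == NULL);	/* must not already be queued */

new_blocker:
	for (fl = blocker->blocked; fl; fl = fl->next)
		if (conflict(fl, waiter)) {
			blocker = fl;
			goto new_blocker;
		}

	waiter->blocker = blocker;
	/* Append at the tail so siblings keep their arrival order. */
	for (tail = &blocker->blocked; *tail; tail = &(*tail)->next)
		;
	*tail = waiter;
}
```

With a held exclusive lock on 0-99, two readers on 0-9 and 5-20 become
siblings directly under it, while a writer on byte 8 arriving afterwards is
placed under the first reader.  A second writer on byte 8 then lands under
the first writer, so the children of any one node never conflict with each
other.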