From: NeilBrown
To: Jeff Layton, Alexander Viro
Cc: "J. Bruce Fields", Martin Wilck, linux-fsdevel@vger.kernel.org,
    Frank Filz, linux-kernel@vger.kernel.org
Date: Tue, 14 Aug 2018 13:56:51 +1000
Subject: [PATCH 0/5 v2] locks: avoid thundering-herd wake-ups
Message-ID: <153421852728.24426.2111161640156686201.stgit@noble>
X-Mailing-List: linux-kernel@vger.kernel.org

V2, which added wake_non_conflicts(), was more broken than V1 - as
Bruce explained, there is no transitivity in the blocking relation
between locks.
So this series takes a simpler approach.
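
To illustrate the missing transitivity, here is a minimal sketch in C.
It assumes a simplified byte-range lock model: struct range_lock,
locks_conflict_sketch() and non_transitive_example() are hypothetical
names made up for this example, not the kernel's struct file_lock or
the *_conflict() helpers in fs/locks.c.

```c
#include <stdbool.h>

/* Hypothetical, simplified byte-range lock (not struct file_lock). */
struct range_lock {
	long start;      /* first byte covered */
	long end;        /* last byte covered, inclusive */
	bool exclusive;  /* true for a write lock, false for a read lock */
};

/*
 * Assumed conflict rule for this sketch: two locks conflict when their
 * ranges overlap and at least one of them is exclusive.
 */
bool locks_conflict_sketch(const struct range_lock *a,
			   const struct range_lock *b)
{
	bool overlap = a->start <= b->end && b->start <= a->end;

	return overlap && (a->exclusive || b->exclusive);
}

/*
 * A blocks B and B blocks C, yet A does not block C, because the
 * ranges of A and C do not overlap.  Returns true when that holds.
 */
bool non_transitive_example(void)
{
	struct range_lock a = { 0, 9, true };
	struct range_lock b = { 5, 14, true };
	struct range_lock c = { 10, 19, true };

	return locks_conflict_sketch(&a, &b) &&
	       locks_conflict_sketch(&b, &c) &&
	       !locks_conflict_sketch(&a, &c);
}
```

Under this model a waiter C queued beneath B need not conflict with
B's blocker A, so no conclusion about C can be drawn from the A/B and
B/C relations alone.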
It still attaches waiters beneath other waiters as necessary to ensure
that:
 - a waiter is blocked by its parent (fl->fl_blocker) and all further
   ancestors, and
 - the waiters on the fl_blocked list are mutually non-conflicting.

When a lock (the root of a tree of requests) is released, only its
immediate children (fl_blocked) are woken.
When any lock is woken (either because its fl_blocker was released or
due to a signal or similar) it will either:
 - be granted
 - be aborted
 - be re-queued beneath some other lock.

In the first case the tree of blocked locks is moved across to the
newly created lock, and the invariants still hold.
In the other two cases, the tree of blocked waiters is detached and
all of them are woken.

Note that this series has not received much testing yet.

Original description:
If you have a many-core machine, and have many threads all wanting to
briefly lock a given file (udev is known to do this), you can get quite
poor performance.

When one thread releases a lock, it wakes up all other threads that
are waiting (classic thundering-herd) - one will get the lock and the
others go back to sleep.
When you have few cores, this is not very noticeable: by the time the
4th or 5th thread gets enough CPU time to try to claim the lock, the
earlier threads have claimed it, done what was needed, and released.
With 50+ cores, the contention can easily be measured.

This patchset creates a tree of pending lock requests in which siblings
don't conflict and each lock request does conflict with its parent.
When a lock is released, only requests which don't conflict with each
other are woken.

Testing shows that lock-acquisitions-per-second is now fairly stable
even as the number of contending processes goes to 1000.  Without this
patch, locks-per-second drops off steeply after a few tens of
processes.

There is a small cost to this extra complexity.
At 20 processes running a particular test on 72 cores, lock
acquisitions per second drop from 1.8 million to 1.4 million with
this patch.
For 100 processes, this patch still provides 1.4 million, while
without this patch there are about 700,000.

NeilBrown

---

NeilBrown (5):
      fs/locks: rename some lists and pointers.
      fs/locks: split out __locks_wake_up_blocks().
      fs/locks: allow a lock request to block other requests.
      fs/locks: change all *_conflict() functions to return bool.
      fs/locks: create a tree of dependent requests.


 fs/cifs/file.c                  |    2 -
 fs/locks.c                      |  156 ++++++++++++++++++++++++++-------------
 include/linux/fs.h              |    7 +-
 include/trace/events/filelock.h |   16 ++--
 4 files changed, 119 insertions(+), 62 deletions(-)

--
Signature