Subject: Re: [PATCH 0/4] locks: avoid thundering-herd wake-ups
From: Jeff Layton
To: NeilBrown, Alexander Viro
Cc: "J.
 Bruce Fields", linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Martin Wilck
Date: Wed, 08 Aug 2018 12:47:22 -0400
In-Reply-To: <153369219467.12605.13472423449508444601.stgit@noble>

On Wed, 2018-08-08 at 11:51 +1000, NeilBrown wrote:
> If you have a many-core machine, and have many threads all wanting to
> briefly lock a given file (udev is known to do this), you can get quite
> poor performance.
>
> When one thread releases a lock, it wakes up all other threads that
> are waiting (classic thundering-herd) - one will get the lock and the
> others go to sleep.
> When you have few cores, this is not very noticeable: by the time the
> 4th or 5th thread gets enough CPU time to try to claim the lock, the
> earlier threads have claimed it, done what was needed, and released.
> With 50+ cores, the contention can easily be measured.
>
> This patchset creates a tree of pending lock requests in which siblings
> don't conflict and each lock request does conflict with its parent.
> When a lock is released, only requests which don't conflict with each
> other are woken.
>
> Testing shows that lock-acquisitions-per-second is now fairly stable even
> as the number of contending processes goes to 1000.  Without this patch,
> locks-per-second drops off steeply after a few tens of processes.
>
> There is a small cost to this extra complexity.
> At 20 processes running a particular test on 72 cores, the lock
> acquisitions per second drop from 1.8 million to 1.4 million with
> this patch.  For 100 processes, this patch still provides 1.4 million
> while without this patch there are about 700,000.
>
> NeilBrown
>
> ---
>
> NeilBrown (4):
>       fs/locks: rename some lists and pointers.
>       fs/locks: allow a lock request to block other requests.
>       fs/locks: change all *_conflict() functions to return bool.
>       fs/locks: create a tree of dependent requests.
>
>
>  fs/cifs/file.c                  |    2 -
>  fs/locks.c                      |  142 +++++++++++++++++++++++++--------------
>  include/linux/fs.h              |    5 +
>  include/trace/events/filelock.h |   16 ++--
>  4 files changed, 103 insertions(+), 62 deletions(-)
>

Nice work! I looked over this and I think it looks good.

I made an attempt to fix this issue several years ago, but my method
sucked as it ended up penalizing the unlocking task too much. This is
much cleaner and should scale well overall, I think.

I'll put this in -next soon and we can aim for merge in v4.20.

Thanks,
-- 
Jeff Layton
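
For readers following the thread, the wake-up scheme described in the
quoted cover letter can be illustrated with a small user-space model.
This is only a sketch, not the code from fs/locks.c: the Request class,
conflicts(), insert_waiter() and release() are hypothetical names made
up for the illustration, lock ownership is ignored (every request is
assumed to come from a different owner), and ranges are inclusive byte
offsets.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Request:
    """Hypothetical stand-in for a lock request (not struct file_lock)."""
    name: str
    start: int          # first byte of the range, inclusive
    end: int            # last byte of the range, inclusive
    exclusive: bool     # True for a write lock, False for a read lock
    blocked: List["Request"] = field(default_factory=list)


def conflicts(a: Request, b: Request) -> bool:
    """Two requests conflict if their ranges overlap and either is exclusive."""
    overlap = a.start <= b.end and b.start <= a.end
    return overlap and (a.exclusive or b.exclusive)


def insert_waiter(blocker: Request, waiter: Request) -> None:
    """Queue `waiter` somewhere beneath `blocker`, which it conflicts with.

    If the waiter conflicts with an existing child, descend into that
    child; otherwise become a new child. Siblings therefore never
    conflict, and every node conflicts with its parent.
    """
    while True:
        for child in blocker.blocked:
            if conflicts(child, waiter):
                blocker = child
                break
        else:
            blocker.blocked.append(waiter)
            return


def release(holder: Request) -> List[Request]:
    """Return only the direct children; their subtrees stay asleep."""
    woken, holder.blocked = holder.blocked, []
    return woken


# One write lock over bytes 0-99 is held; five requests queue up behind it.
holder = Request("holder", 0, 99, exclusive=True)
w1 = Request("w1", 0, 9, exclusive=True)
w2 = Request("w2", 10, 19, exclusive=True)
w3 = Request("w3", 0, 9, exclusive=True)    # conflicts with w1
r1 = Request("r1", 50, 59, exclusive=False)
r2 = Request("r2", 50, 59, exclusive=False)  # readers can share a range
for req in (w1, w2, w3, r1, r2):
    insert_waiter(holder, req)

woken = release(holder)
print([r.name for r in woken])
print([r.name for r in w1.blocked])
```

In this model, releasing the held lock wakes w1, w2, r1 and r2, which
can all be granted at once, while w3 stays queued behind w1 and is only
woken when w1 is released in turn. A plain wait list would have woken
all five and sent one of them straight back to sleep.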