Date: Sat, 11 Aug 2018 08:35:26 -0400
From: "J. Bruce Fields"
To: Jeff Layton
Cc: NeilBrown, Alexander Viro, Martin Wilck, linux-fsdevel@vger.kernel.org,
    Frank Filz, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/5 - V2] locks: avoid thundering-herd wake-ups
Message-ID: <20180811123526.GB15848@fieldses.org>
References: <153378012255.1220.6754153662007899557.stgit@noble>
    <20180809173245.GM23873@fieldses.org>
    <87lg9frxyc.fsf@notabene.neil.brown.name>
    <20180810002922.GA3915@fieldses.org>
    <871sb7rnul.fsf@notabene.neil.brown.name>
    <20180810025251.GO23873@fieldses.org>
    <87y3derjut.fsf@notabene.neil.brown.name>
    <20180810154742.GE7906@fieldses.org>
    <0f198c62b057ab7d796746144d458835a6c7433e.camel@kernel.org>
In-Reply-To: <0f198c62b057ab7d796746144d458835a6c7433e.camel@kernel.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Aug 11, 2018 at 07:56:25AM -0400, Jeff Layton wrote:
> FWIW, I did a bit of testing with lockperf tests that I had written on
> an earlier rework of this code:
>
> https://git.samba.org/jlayton/linux.git/?p=jlayton/lockperf.git;a=summary
>
>
> The posix01 and flock01 tests in there show about a 10x speedup with
> this set in place.
>
> I think something closer to Neil's design will end up being what we want
> here. Consider the relatively common case where you have a whole-file
> POSIX write lock held with a bunch of different waiters blocked on it
> (all whole file write locks with different owners):
>
> With Neil's patches, you will just wake up a single waiter when the
> blocked lock is released, as they would all be in a long chain of
> waiters.

Right, but you still need to walk the whole tree to make sure that it's
the only one you need to wake. The tree structure means that you know
all the other locks have non-overlapping ranges, but it doesn't tell you
the lock owners.
Maybe there's some reasonable way to rule out the
shared-lockowner case more quickly too. I haven't thought about that
much.

> If you keep all the locks in a single list, you'll either have to:
>
> a) wake up all the waiters on the list when the lock comes free: no lock
> is held at that point so none of them will conflict.
>
> ...or...
>
> b) keep track of what waiters have already been awoken, and compare any
> further candidate for waking against the current set of held locks and
> any lock requests by waiters that you just woke.

Instead of keeping track of *every* waiter that you've woken, you could
keep track of some subset. Worst case that just means waking more
processes than you need to, which is wasteful but correct.

In the common case that you give, that subset could just be "the first
waiter you wake". You'd get the same result.

The every-waiter-a-whole-file-write-lock case is pretty easy. To
benefit from the tree you need a case where some of the waiters overlap
and some don't. Might be worth it, sure.

--b.

> b seems more expensive as you have to walk over a larger set of locks
> on every change.
> --
> Jeff Layton
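
To make the "keep track of some subset" idea above concrete, here is a
rough userspace sketch of the flat-list variant. It is not kernel code
and not taken from Neil's patches: struct lock_req, locks_conflict(),
wake_waiters() and TRACK_CAP are all made-up names, the owner is a plain
int, and it assumes no other locks remain held once the blocking lock is
released. With max_tracked == 0 it behaves like option (a) and wakes
everyone; with max_tracked == 1 it remembers only the first waiter it
wakes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified lock request; not the kernel's struct file_lock. */
enum lock_type { LOCK_READ, LOCK_WRITE };

struct lock_req {
    int owner;            /* stand-in for the real lock owner */
    enum lock_type type;
    long start;           /* first byte covered */
    long end;             /* last byte covered, inclusive */
};

/*
 * Two requests conflict when they have different owners, their ranges
 * overlap, and at least one of them is a write lock.
 */
static bool locks_conflict(const struct lock_req *a, const struct lock_req *b)
{
    if (a->owner == b->owner)
        return false;
    if (a->end < b->start || b->end < a->start)
        return false;
    return a->type == LOCK_WRITE || b->type == LOCK_WRITE;
}

#define TRACK_CAP 16      /* arbitrary upper bound for this sketch */

/*
 * Walk a flat list of waiters after the blocking lock has gone away.
 * Remember at most max_tracked of the waiters woken so far and skip any
 * later waiter that conflicts with a remembered one.  Waiters that are
 * woken but not remembered can let extra wake-ups through: wasteful,
 * but still correct.  Sets woken[i] per waiter, returns how many woke.
 */
static size_t wake_waiters(const struct lock_req *waiters, size_t n,
                           size_t max_tracked, bool *woken)
{
    const struct lock_req *tracked[TRACK_CAP];
    size_t ntracked = 0, nwoken = 0;

    assert(woken != NULL);
    if (max_tracked > TRACK_CAP)
        max_tracked = TRACK_CAP;

    for (size_t i = 0; i < n; i++) {
        bool blocked = false;

        for (size_t j = 0; j < ntracked; j++) {
            if (locks_conflict(&waiters[i], tracked[j])) {
                blocked = true;
                break;
            }
        }
        woken[i] = !blocked;
        if (blocked)
            continue;
        nwoken++;
        if (ntracked < max_tracked)
            tracked[ntracked++] = &waiters[i];
    }
    return nwoken;
}
```

For the every-waiter-a-whole-file-write-lock case, tracking one waiter
is enough to wake only the first. With two waiters on bytes 0-9 and two
on bytes 20-29 (all different owners), tracking one waiter wakes three
of the four, while tracking every woken waiter wakes two. That gap is
the wasted work a per-range structure would be trying to save.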