Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp386442imu; Thu, 8 Nov 2018 22:24:54 -0800 (PST) X-Google-Smtp-Source: AJdET5eSwdqxxkYqZU6IUQB/lJs2wtAIta8UQjvNEZCuYMnAmtg1cEGlQ4kx+EBXaqAbUHUqMxp7 X-Received: by 2002:a17:902:4324:: with SMTP id i33-v6mr7797696pld.253.1541744694050; Thu, 08 Nov 2018 22:24:54 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1541744694; cv=none; d=google.com; s=arc-20160816; b=kTkA6/2VuX6smAk3yRa+wyzVuP/K0AZorZ4n3J1Y6UEo0wflZF34TlGPSnmBPNjh9i pD/FIQ9N5RpN9GngPBoqDz091elGH9o06i/gZX/ACxtmffrDwXHuMSIVHmRE8H2sr0bY 20SKUKfuWSNCL+G1SyRr5KpB5FmCz3yNZbX2MjLjk3JY5erf2SdpItSH0UW8a+bqfPMy typuJSCbyGD2ITWoki9tXBpr0KN+vBA02mGuAmQ+U4sJQFpeE9yXCOt9szx+b8UEnh3V lkmtridMXCF25+AExvldYpEoLKSIVWyPD93+8XOV8k7NojMbp59fCMkbB+voh5lcfk3G EcoA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:references :in-reply-to:subject:cc:date:to:from; bh=y8Hq+Fh7xKdXsU+gvxmcRhTmW6LjwlQvCIgobMiYzU4=; b=1G8b9iAI/tEimzSowqAFdXl9X2DlAmFsAGf5qeot+Or1OAXvHk/M9U3laLOM+F4xGy OBGNFHiNxbeX/w2ruXCfNDBvDMgW1CuHwh9CBGtrIuVg9l3tDAyviVfWSXhqLibdQeOS xq6nXTg/bNpwpWxymm12u7jjanNNxjolZwgLQ0K/aEdwNtWHvaOBK8yfyKozJKJjAxJX Sq55oy2Suqcy/x+3m8iXk+/EWfwGoPuVhxPqZD3B3H9Jbzm02i7ZrmCd3wD97byt1CUN qXgK2Rr7JPZ0IuVRFHLvG9ylk339KZl3YqKv4JVOS+xDQ4yDmihZ3o/mOSy7tp6KvHlj t/rQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x2-v6si6321388plo.259.2018.11.08.22.24.37; Thu, 08 Nov 2018 22:24:54 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727736AbeKIQDT (ORCPT + 99 others); Fri, 9 Nov 2018 11:03:19 -0500 Received: from mx2.suse.de ([195.135.220.15]:55824 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727549AbeKIQDT (ORCPT ); Fri, 9 Nov 2018 11:03:19 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 53D67AF3E; Fri, 9 Nov 2018 06:24:13 +0000 (UTC) From: NeilBrown To: "J. Bruce Fields" Date: Fri, 09 Nov 2018 17:24:04 +1100 Cc: Jeff Layton , Alexander Viro , Martin Wilck , linux-fsdevel@vger.kernel.org, Frank Filz , linux-kernel@vger.kernel.org Subject: Re: [PATCH 10/12] fs/locks: create a tree of dependent requests. In-Reply-To: <20181109030913.GA8831@fieldses.org> References: <154138128401.31651.1381177427603557514.stgit@noble> <154138144796.31651.14201944346371750178.stgit@noble> <20181108213030.GF6090@fieldses.org> <87k1lnw0ec.fsf@notabene.neil.brown.name> <20181109030913.GA8831@fieldses.org> Message-ID: <87bm6ywyyj.fsf@notabene.neil.brown.name> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Thu, Nov 08 2018, J. Bruce Fields wrote: > On Fri, Nov 09, 2018 at 11:38:19AM +1100, NeilBrown wrote: >> On Thu, Nov 08 2018, J. Bruce Fields wrote: >>=20 >> > On Mon, Nov 05, 2018 at 12:30:48PM +1100, NeilBrown wrote: >> >> When we find an existing lock which conflicts with a request, >> >> and the request wants to wait, we currently add the request >> >> to a list. When the lock is removed, the whole list is woken. >> >> This can cause the thundering-herd problem. >> >> To reduce the problem, we make use of the (new) fact that >> >> a pending request can itself have a list of blocked requests. >> >> When we find a conflict, we look through the existing blocked request= s. >> >> If any one of them blocks the new request, the new request is attached >> >> below that request, otherwise it is added to the list of blocked >> >> requests, which are now known to be mutually non-conflicting. >> >>=20 >> >> This way, when the lock is released, only a set of non-conflicting >> >> locks will be woken, the rest can stay asleep. >> >> If the lock request cannot be granted and the request needs to be >> >> requeued, all the other requests it blocks will then be woken >> > >> > So, to make sure I understand: the tree of blocking locks only ever has >> > three levels (the active lock, the locks blocking on it, and their >> > children?) >>=20 >> Not correct. >> Blocks is only vertical, never horizontal. Siblings never block each >> other. >> So one process hold a lock on a byte, and 27 other process want a lock >> on that byte, then there will be 28 levels in a narrow tree - it is >> effectively a queue. >> Branching (via siblings) only happens when a child conflict with only >> part of the lock held by the parent. >> So if one process locks 32K, then two other processes request locks on >> the 2 16K halves, then 4 processes request locks on the 8K quarters, and >> so-on, then you could end up with 32767 processes in a binary tree, with >> half of them all waiting on different individual bytes. > > Maybe I should actually read the code carefully instead of just skimming > the changelog and jumping to conclusions. > > I think this is correct, but I wish we had an actual written-out > argument that it's correct, because intuition isn't a great guide for > posix file locks. > > Maybe: > > Waiting and applied locks are all kept in trees whose properties are: > > - the root of a tree may be an applied or unapplied lock. > - every other node in the tree is an unapplied lock that > conflicts with every ancestor of that node. > > Every such tree begins life as an unapplied singleton which obviously > satisfies the above properties. > > The only ways we modify trees preserve these properties: > > 1. We may add a new child, but only after first verifying that it > conflicts with all of its ancestors. > 2. We may remove the root of a tree, creating a new singleton > tree from the root and N new trees rooted in the immediate > children. > 3. If the root of a tree is not currently an applied lock, we may > apply it (if possible). > 4. We may upgrade the root of the tree (either extend its range, > or upgrade its entire range from read to write). > > When an applied lock is modified in a way that reduces or downgrades any > part of its range, we remove all its children (2 above). > > For each of those child trees: if the root of the tree applies, we do so > (3). If it doesn't, it must conflict with some applied lock. We remove > all of its children (2), and add it is a new leaf to the tree rooted in > the applied lock (1). We then repeat the process recursively with those > children. > Thanks pretty thorough - and even looks correct. I'll re-reading some time when it isn't late, and maybe make it into a comment in the code. I agree, this sort of documentation can be quite helpful. Thanks, NeilBrown --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEG8Yp69OQ2HB7X0l6Oeye3VZigbkFAlvlKAUACgkQOeye3VZi gbnnRg/8DK+Zv73yAzgSJvPbwxgXatREw2sbd8OQ10lTp4fEISzSUXb8fNs/AwiJ j/ZKu7XV3gLmep7IA1GIR2piBPwuLeDvHsxVkIc3tQdtaGMTcqtTbRdYcsfvVT1u 77lV4AOnI2wXey1b03gFmEu2ZMCMe+Kk69nvTY6vnkRiCq+VZUbpKFHkHB1vf9DN gpYnZnd2nR6FVYfh5SkjxhuzecZJWwnkA7GCO7oHShaeAu9bZXBwKivQNhJadQse 0Y6oNexj4gS5t4zvu3ZGEnait7QeeSOocq8Me6QiItqt0k0EBCiXfhnoVlthjxv9 29Q3IkPQQ7JfZKkn8FVgdvzG3j031BQbhvZ5HoSKLwkFptpAe9xYxbUhJwjvNBJy HYHGGPdFy5oqbjcTTqt31YQUa+7xaYN4ObMkRt/6InjUpMIc1FjlcKuprdW7J9Tx AfAYAmAt+jQys9RX81fktLzgX7Fv26sGT1h7NaPwhHfHKprDr05eYk2XU+T1OfQC U3Qu3HAQ+ziKtk96dnsZufITY3pDIZBPS7IPfUCGHpgqGyVjik4Kem2V8twrgvcQ E//XruPKNvf37eP9pIdjKv1f2uSg7vV1husdfmReGJOtlTlM/BgnfgAIuooMYAhY DIzYGFL9gtQdOQc8Bv2cmueKcrq77RXLuDfKaP8zM/Zz20jXIQ0= =0W5C -----END PGP SIGNATURE----- --=-=-=--