Subject: Re: [PATCH 0/5 - V2] locks: avoid thundering-herd wake-ups
From: Jeff Layton
To: "J. Bruce Fields"
Cc: NeilBrown, Alexander Viro, Martin Wilck, linux-fsdevel@vger.kernel.org, Frank Filz, linux-kernel@vger.kernel.org
Date: Sat, 11 Aug 2018 09:15:25 -0400
Message-ID: <68c41d01959f2a8e3bb42483935865ff21e4af75.camel@kernel.org>
In-Reply-To: <20180811122150.GA15848@fieldses.org>
References: <153378012255.1220.6754153662007899557.stgit@noble> <20180809173245.GM23873@fieldses.org> <87lg9frxyc.fsf@notabene.neil.brown.name> <20180810002922.GA3915@fieldses.org> <20180811122150.GA15848@fieldses.org>

On Sat, 2018-08-11 at 08:21 -0400, J. Bruce Fields wrote:
> On Sat, Aug 11, 2018 at 07:51:13AM -0400, Jeff Layton wrote:
> > On Thu, 2018-08-09 at 20:29 -0400, J. Bruce Fields wrote:
> > > On Fri, Aug 10, 2018 at 08:12:43AM +1000, NeilBrown wrote:
> > > > On Thu, Aug 09 2018, J. Bruce Fields wrote:
> > > > >
> > > > > I think there's also a problem with multiple tasks sharing the
> > > > > same lock owner.
> > > > >
> > > > > So, all locks are exclusive locks for the same range. We have
> > > > > four tasks. Tasks 1 and 4 share the same owner, the others'
> > > > > owners are distinct.
> > > > >
> > > > > - Task 1 gets a lock.
> > > > > - Task 2 gets a conflicting lock.
> > > > > - Task 3 gets another conflicting lock. So now the tree is
> > > > >   3->2->1.
> > > > > - Task 1's lock is released.
> > > > > - Before task 2 is scheduled, task 4 acquires a new lock.
> > > > > - Task 2 waits on task 4's lock, we now have
> > > > >   3->2->4.
> > > > >
> > > > > Task 3 shouldn't be waiting--the lock it's requesting has the
> > > > > same owner as the lock task 4 holds--but we fail to wake up
> > > > > task 3.
> > > >
> > > > So task 1 and task 4 are threads in the one process - OK.
> > > > Tasks 2 and 3 are threads in two other processes.
> > > >
> > > > So 2 and 3 conflict with either 1 or 4 equally - why should
> > > > task 3 be woken?
> > > >
> > > > I suspect you got the numbers bit mixed up,
> > >
> > > Whoops.
> > >
> > > > but in any case, the "conflict()" function that is passed around
> > > > takes ownership into account when assessing if one lock conflicts
> > > > with another.
> > >
> > > Right, I know, but let me try again:
> > >
> > > All locks are exclusive locks for the same range. Only tasks 3 and 4
> > > share the same owner.
> > >
> > > - Task 1 gets a lock.
> > > - Task 2 requests a conflicting lock, so we have 2->1.
> > > - Task 3 requests a conflicting lock, so we have 3->2->1.
> > > - Task 1 unlocks. We wake up task 2, but it isn't scheduled yet.
> > > - Task 4 gets a new lock.
> > > - Task 2 runs, discovers the conflict, and waits. Now we have:
> > >   3->2->4.
> > >
> > > There is no conflict between the lock 3 requested and the lock 4
> > > holds, but 3 is not woken up.
> > >
> > > This is another version of the first problem: there's information we
> > > need (the owners of the waiting locks in the tree) that we can't
> > > determine just from looking at the root of the tree.
> > >
> > > I'm not sure what to do about that.
> >
> > Is this still a problem in the v2 set?
> >
> > wake_non_conflicts walks the whole tree of requests that were blocked
> > on it,
>
> Not in the FL_TRANSITIVE_CONFLICT case, which is the case here.
>

Got it. Yeah, I'm not sure that the idea of FL_TRANSITIVE_CONFLICT really
works. I think you could fix this by just getting rid of that and folding
it into the FL_CONFLICT case.

In more complex situations (like the one you describe), you may end up
cycling through several blocked requests before you hit one that can get
the lock. It may slow things down some for those cases. In the more common
locking scenarios (whole file locks, different owners), I think this will
be much more efficient by avoiding so many wakeups.

-- 
Jeff Layton