In-Reply-To: <20180502224533.GW30522@ZenIV.linux.org.uk>
References: <20180502222635.1862-1-mszeredi@redhat.com> <20180502224533.GW30522@ZenIV.linux.org.uk>
From: Miklos Szeredi
Date: Thu, 3 May 2018 09:44:34 +0200
Subject: Re: [PATCH] dcache: fix quadratic behavior with parallel shrinkers
To: Al Viro
Cc: Miklos Szeredi, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org

On Thu, May 3, 2018 at 12:45 AM, Al Viro wrote:
> On Thu, May 03, 2018 at 12:26:35AM +0200, Miklos Szeredi wrote:
>> When multiple shrinkers are operating on a directory containing many
>> dentries, it takes much longer than if only one shrinker is operating on
>> the directory.
>>
>> Call the shrinker instances A and B, which shrink DIR containing NUM
>> dentries.
>>
>> Assume A wins the race for locking DIR's d_lock, then it goes onto moving
>> all unlinked dentries to its dispose list.  When it's done, then B will
>> scan the directory once again, but will find that all dentries are already
>> being shrunk, so it will have an empty dispose list.  Both A and B will
>> have found NUM dentries (data.found == NUM).
>>
>> Now comes the interesting part: A will proceed to shrink the dispose list
>> by killing individual dentries and decrementing the refcount of the parent
>> (which is DIR).  NB: decrementing DIR's refcount will block if DIR's d_lock
>> is held.
>> B will shrink a zero size list and then immediately restart
>> scanning the directory, where it will lock DIR's d_lock, scan the remaining
>> dentries and find no dentry to dispose.
>>
>> So that results in B doing the directory scan over and over again, holding
>> d_lock of DIR, while A is waiting for a chance to decrement refcount of DIR
>> and making very slow progress because of this.  B is wasting time and
>> holding up progress of A at the same time.
>>
>> Proposed fix is to check this situation in B (found some dentries, but
>> all are being shrunk already) and just sleep for some time, before retrying
>> the scan.  The sleep is proportional to the number of found dentries.
>
> The thing is, the majority of massive shrink_dcache_parent() can be killed.
> Let's do that first and see if anything else is really needed.
>
> As it is, rmdir() and rename() are ridiculously bad - they should only call
> shrink_dcache_parent() after successful ->rmdir() or ->rename().  Sure,
> there are other places where we do large shrink_dcache_parent() runs,
> but those won't trigger in parallel on the same tree.

I think we can also hit this with the lru pruner (prune_dcache_sb(),
shrink_dcache_sb()) running in parallel with shrink_dcache_parent().
Although shrink_dcache_sb() looks better in this regard, since it
will only hold up to 1024 dentries in the dispose list.

I'm open to a better solution, but keep in mind that it will also need
backporting.

Thanks,
Miklos
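
For illustration only, the rescan cost described in the quoted patch
description can be put into a small user-space toy model.  This is not
kernel code and not the patch itself.  It assumes that A kills exactly one
dentry per tick, that B never finds anything to dispose, and that the
backoff is half a tick per found dentry.  It also ignores the d_lock
contention that slows A down, so it only counts how many dentries B walks.
The name rescan_cost() and the factor of one half are hypothetical.

```python
def rescan_cost(num_dentries, backoff=False):
    """Count the dentries shrinker B walks while shrinker A disposes of them.

    Toy model (assumption): A kills exactly one dentry per tick, and every
    dentry B sees is already on A's dispose list, so B never disposes any.
    """
    remaining = num_dentries  # dentries still present under DIR
    visited = 0               # total dentries B has walked over
    scans = 0                 # number of directory scans B has done
    while remaining > 0:
        found = remaining     # B scans DIR and finds all of them busy
        visited += found
        scans += 1
        # Without backoff B retries on the next tick.  With backoff it
        # sleeps for a number of ticks proportional to what it found
        # (the factor 1/2 is an arbitrary choice for this sketch).
        ticks = max(1, found // 2) if backoff else 1
        remaining = max(0, remaining - ticks)  # A makes progress meanwhile
    return visited, scans


if __name__ == "__main__":
    for n in (1000, 10000):
        print(n, rescan_cost(n), rescan_cost(n, backoff=True))
```

In this model the immediate-retry loop walks NUM * (NUM + 1) / 2 dentries
in total, which is the quadratic behavior from the subject line, while the
proportional sleep keeps the total close to linear in NUM.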