Date: Mon, 25 Jun 2018 16:14:34 +0200
From: Michal Hocko
To: Mikulas Patocka
Cc: jing xia, Mike Snitzer, agk@redhat.com, dm-devel@redhat.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: dm bufio: Reduce dm_bufio_lock contention
Message-ID: <20180625141434.GO28965@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 25-06-18 09:53:34, Mikulas Patocka wrote:
> y
>
> On Mon, 25 Jun 2018, Michal Hocko wrote:
>
> > On Fri 22-06-18 14:57:10, Mikulas Patocka wrote:
> > >
> > >
> > > On Fri, 22 Jun 2018, Michal Hocko wrote:
> > >
> > > > On Fri 22-06-18 08:52:09, Mikulas Patocka wrote:
> > > > >
> > > > >
> > > > > On Fri, 22 Jun 2018, Michal Hocko wrote:
> > > > >
> > > > > > On Fri 22-06-18 11:01:51, Michal Hocko wrote:
> > > > > > > On Thu 21-06-18 21:17:24, Mikulas Patocka wrote:
> > > > > > [...]
> > > > > > > > What about this patch? If __GFP_NORETRY is set and __GFP_FS is not set
> > > > > > > > (i.e. the request comes from a block device driver or a filesystem),
> > > > > > > > we should not sleep.
> > > > > > >
> > > > > > > Why? How are you going to audit all the callers that the behavior makes
> > > > > > > sense and moreover how are you going to ensure that future usage will
> > > > > > > still make sense? The more subtle side effects gfp flags have the harder
> > > > > > > they are to maintain.
> > > > > >
> > > > > > So just as an exercise. Try to explain the above semantic to users. We
> > > > > > currently have the following.
> > > > > >
> > > > > >  * __GFP_NORETRY: The VM implementation will try only very lightweight
> > > > > >  * memory direct reclaim to get some memory under memory pressure (thus
> > > > > >  * it can sleep). It will avoid disruptive actions like OOM killer. The
> > > > > >  * caller must handle the failure which is quite likely to happen under
> > > > > >  * heavy memory pressure. The flag is suitable when failure can easily be
> > > > > >  * handled at small cost, such as reduced throughput
> > > > > >
> > > > > >  * __GFP_FS can call down to the low-level FS. Clearing the flag avoids the
> > > > > >  * allocator recursing into the filesystem which might already be holding
> > > > > >  * locks.
> > > > > >
> > > > > > So how are you going to explain gfp & (__GFP_NORETRY | ~__GFP_FS)? What
> > > > > > is the actual semantic without explaining the whole reclaim or force
> > > > > > users to look into the code to understand that?
> > > > > > What about GFP_NOIO | __GFP_NORETRY? What does "should not sleep"
> > > > > > mean there? Do all shrinkers have to follow that as well?
> > > > >
> > > > > My reasoning was that there is broken code that uses __GFP_NORETRY and
> > > > > assumes that it can't fail - so conditioning the change on !__GFP_FS would
> > > > > minimize the disruption to the broken code.
> > > > >
> > > > > Anyway - if you want to test only on __GFP_NORETRY (and fix those 16
> > > > > broken cases that assume that __GFP_NORETRY can't fail), I'm OK with that.
> > > >
> > > > As I've already said, this is a subtle change which is really hard to
> > > > reason about. Throttling on congestion has its meaning and reason. Look
> > > > at why we are doing that in the first place. You cannot simply say this
> > >
> > > So - explain why throttling is needed. You support throttling, I don't, so
> > > you have to explain it :)
> > >
> > > > is ok based on your specific usecase. We do have means to achieve that.
> > > > It is explicit and thus it will be applied only where it makes sense.
> > > > You keep repeating that implicit behavior change for everybody is
> > > > better.
> > >
> > > I don't want to change it for everybody. I want to change it for block
> > > device drivers. I don't care what you do with non-block drivers.
> >
> > Well, it is usually the onus of the patch submitter to justify any change.
> > But let me be nice to you, for once. This throttling is triggered only
> > if all the pages we have encountered during the reclaim attempt are
> > dirty and that means that we are rushing through the LRU list quicker
> > than flushers are able to clean. If we didn't throttle we could hit
> > stronger reclaim priorities (aka scan more to reclaim memory) and
> > reclaim more pages as a result.
>
> And the throttling in dm-bufio prevents kswapd from making forward
> progress, causing this situation...

Which is what we have PF_LESS_THROTTLE for.
Geez, do we have to go in circles like that? Are you even listening?

[...]

> And so what do you want to do to prevent block drivers from sleeping?

Use the existing means we have.
-- 
Michal Hocko
SUSE Labs
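
The mask test debated in the thread can be made concrete with a small userspace sketch. The quoted expression gfp & (__GFP_NORETRY | ~__GFP_FS) reads as shorthand for "__GFP_NORETRY set and __GFP_FS clear"; taken literally as C, it is true for nearly any non-zero mask. The DEMO_* bit values and all function names below are hypothetical and chosen only for illustration. The real flags are defined in include/linux/gfp.h, and this is not the code from the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, for illustration only. The real flag
 * definitions live in include/linux/gfp.h and vary by kernel version. */
typedef unsigned int gfp_t;
#define DEMO_GFP_FS      0x01u
#define DEMO_GFP_NORETRY 0x02u
#define DEMO_GFP_IO      0x04u

/* Literal reading of "gfp & (NORETRY | ~FS)": non-zero whenever any
 * bit other than FS is set, so it does not isolate the intended case. */
bool literal_test(gfp_t gfp)
{
    return (gfp & (DEMO_GFP_NORETRY | ~DEMO_GFP_FS)) != 0;
}

/* Presumed intent: NORETRY is set and FS is clear. Mask out both bits
 * and compare against NORETRY alone. */
bool noretry_without_fs(gfp_t gfp)
{
    return (gfp & (DEMO_GFP_NORETRY | DEMO_GFP_FS)) == DEMO_GFP_NORETRY;
}

/* Exercises both helpers on a few masks; aborts if an expectation fails. */
void demo_self_check(void)
{
    /* The precise test accepts only "NORETRY without FS". */
    assert(noretry_without_fs(DEMO_GFP_NORETRY));
    assert(noretry_without_fs(DEMO_GFP_NORETRY | DEMO_GFP_IO));
    assert(!noretry_without_fs(DEMO_GFP_NORETRY | DEMO_GFP_FS));
    assert(!noretry_without_fs(DEMO_GFP_IO));

    /* The literal expression also accepts masks it should reject. */
    assert(literal_test(DEMO_GFP_IO));
    assert(literal_test(DEMO_GFP_NORETRY | DEMO_GFP_FS));
    assert(!literal_test(DEMO_GFP_FS));
}
```

Calling demo_self_check() from a main() runs the cases. Even with the test written precisely, a caller still has to know what the combination implies for reclaim, which is the maintainability objection raised in the thread.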