Date: Tue, 24 Apr 2018 13:25:42 -0600
From: Michal Hocko
To: "Theodore Y. Ts'o"
Cc: LKML, Artem Bityutskiy, Richard Weinberger, David Woodhouse,
 Brian Norris, Boris Brezillon, Marek Vasut, Cyrille Pitchen,
 Andreas Dilger, Steven Whitehouse, Bob Peterson, Trond Myklebust,
 Anna Schumaker, Adrian Hunter, Philippe Ombredanne, Kate Stewart,
 Mikulas Patocka, linux-mtd@lists.infradead.org,
 linux-ext4@vger.kernel.org, cluster-devel@redhat.com,
 linux-nfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: vmalloc with GFP_NOFS
Message-ID: <20180424192542.GS17484@dhcp22.suse.cz>
References: <20180424162712.GL17484@dhcp22.suse.cz>
 <20180424183536.GF30619@thunk.org>
In-Reply-To: <20180424183536.GF30619@thunk.org>

On Tue 24-04-18 14:35:36, Theodore Ts'o wrote:
> On Tue, Apr 24, 2018 at 10:27:12AM -0600, Michal Hocko wrote:
> > fs/ext4/xattr.c
> >
> > What to do about this? Well, there are two things. Firstly, it would be
> > really great to double check whether the GFP_NOFS is really needed. I
> > cannot judge that because I am not familiar with the code.
>
> *Most* of the time it's not needed, but there are times when it is.
> We could be more smart about sending down GFP_NOFS only when it is
> needed.

Well, the primary idea is that you do not have to. All you care about is
to use the scope api where it matters + a comment describing the reclaim
recursion context (e.g. this lock will be held in the reclaim path here
and there).

> If we are sending too many GFP_NOFS allocations such that
> it's causing heartburn, we could fix this. (xattr commands are rare
> enough that I didn't think it was worth it to modulate the GFP flags
> for this particular case, but we could make it be smarter if it would
> help.)

Well, the vmalloc is actually a correctness issue rather than a
heartburn...

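To make the correctness point concrete, here is a toy model in plain
userspace C. It is not kernel code: the flag values, task_flags,
fake_page_alloc() and fake_vmalloc() are invented for illustration, and
only the memalloc_nofs_{save,restore} names and their save/restore
pattern follow the real scope api. The model assumes, as was the case
for vmalloc at the time of this thread, that some allocations deep
inside the allocator are hardcoded to GFP_KERNEL no matter what mask
the caller passes.

```c
/* Userspace toy model, NOT kernel code. All values are illustrative. */
#include <assert.h>
#include <stdbool.h>

#define GFP_BIT_IO       0x1u                 /* stands in for __GFP_IO */
#define GFP_BIT_FS       0x2u                 /* stands in for __GFP_FS */
#define GFP_KERNEL       (GFP_BIT_IO | GFP_BIT_FS)
#define GFP_NOFS         (GFP_BIT_IO)
#define PF_MEMALLOC_NOFS 0x1u

static unsigned int task_flags;               /* stands in for current->flags */
static bool fs_reclaim_entered;               /* true if reclaim could call into the fs */

/* Open a NOFS scope and return the previous state so scopes can nest. */
static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

/* Close the scope by putting back whatever state was saved. */
static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Every allocation in the model ends up here. */
static void fake_page_alloc(unsigned int gfp)
{
	if (task_flags & PF_MEMALLOC_NOFS)
		gfp &= ~GFP_BIT_FS;           /* the scope overrides the caller's mask */
	if (gfp & GFP_BIT_FS)
		fs_reclaim_entered = true;    /* recursion into the fs is possible */
}

/* vmalloc-like helper with one hardcoded GFP_KERNEL allocation inside. */
static void fake_vmalloc(unsigned int gfp)
{
	fake_page_alloc(gfp);                 /* honours the caller's mask */
	fake_page_alloc(GFP_KERNEL);          /* ignores the caller's mask */
}

/* Passing GFP_NOFS as a flag: the internal allocation is not covered. */
static bool alloc_with_flag_only(void)
{
	fs_reclaim_entered = false;
	fake_vmalloc(GFP_NOFS);
	return fs_reclaim_entered;
}

/* Using the scope: every allocation inside it is covered. */
static bool alloc_within_scope(void)
{
	unsigned int saved;

	fs_reclaim_entered = false;
	saved = memalloc_nofs_save();         /* e.g. after taking a lock reclaim also needs */
	fake_vmalloc(GFP_KERNEL);
	memalloc_nofs_restore(saved);
	return fs_reclaim_entered;
}
```

In this model alloc_with_flag_only() reports that the fs could be
re-entered even though the caller asked for GFP_NOFS, while
alloc_within_scope() does not, because the scope also covers the
hardcoded allocation.
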
> > If the use is really valid then we have a way to do the vmalloc
> > allocation properly. We have memalloc_nofs_{save,restore} scope api. How
> > does that work? You simply call memalloc_nofs_save when the reclaim
> > recursion critical section starts (e.g. when you take a lock which is
> > then used in the reclaim path - e.g. shrinker) and memalloc_nofs_restore
> > when the critical section ends. _All_ allocations within that scope
> > will get GFP_NOFS semantics automagically. If you are not sure about the
> > scope itself then the easiest workaround is to wrap the vmalloc itself
> > with a big fat comment that this should be revisited.
>
> This is something we could do in ext4. It hadn't been high priority,
> because we've been rather overloaded.

Well, ext/jbd already has scopes defined for the transaction context so
anything down that road can be converted to GFP_KERNEL (well, unless the
same code path is shared outside of the transaction context and still
requires protection). It would be really great to identify other
contexts and slowly move away from the explicit GFP_NOFS. Are you aware
of other contexts?

> As a suggestion, could you take
> documentation about how to convert to the memalloc_nofs_{save,restore}
> scope api (which I think you've written about in e-mails at length
> before), and put that into a file in Documentation/core-api?

I can.

> The question I was trying to figure out which triggered the above
> request is how/whether to gradually convert to that scope API. Is it
> safe to add the memalloc_nofs_{save,restore} to code and keep the
> GFP_NOFS flags until we're sure we got it all right, for all of the
> code paths, and then drop the GFP_NOFS?

The first stage is to define and document those scopes. I have provided
a debugging patch [1] in the past that would dump_stack when seeing an
explicit GFP_NOFS from a scope, which could help to eliminate existing
users.

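The kind of check described above can be modelled in a few lines. This
is a userspace sketch only, not the patch from [1]: the flag values,
task_flags, fake_alloc() and the redundant_nofs_hits counter are
hypothetical, and the counter stands in for the dump_stack call. It
also shows that, in this model, keeping an explicit GFP_NOFS inside a
scope changes nothing about the effective mask.

```c
/* Userspace toy model, NOT the debugging patch. All values are illustrative. */
#include <assert.h>

#define GFP_BIT_IO       0x1u                 /* stands in for __GFP_IO */
#define GFP_BIT_FS       0x2u                 /* stands in for __GFP_FS */
#define GFP_KERNEL       (GFP_BIT_IO | GFP_BIT_FS)
#define GFP_NOFS         (GFP_BIT_IO)
#define PF_MEMALLOC_NOFS 0x1u

static unsigned int task_flags;               /* stands in for current->flags */
static unsigned int redundant_nofs_hits;      /* stands in for dump_stack() reports */

static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Returns the mask the allocation would effectively run with. */
static unsigned int fake_alloc(unsigned int gfp)
{
	if (task_flags & PF_MEMALLOC_NOFS) {
		/* Explicit NOFS inside a scope: a candidate for GFP_KERNEL. */
		if (!(gfp & GFP_BIT_FS))
			redundant_nofs_hits++;
		gfp &= ~GFP_BIT_FS;           /* the scope applies either way */
	}
	return gfp;
}
```

Inside a scope both fake_alloc(GFP_NOFS) and fake_alloc(GFP_KERNEL)
come back as GFP_NOFS, and only the first one is counted. Outside of
any scope an explicit GFP_NOFS is left alone and not counted.
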
[1] http://lkml.kernel.org/r/20170106141845.24362-1-mhocko@kernel.org
-- 
Michal Hocko
SUSE Labs