Date: Tue, 24 Apr 2018 10:55:32 -0600
From: Michal Hocko
To: Mikulas Patocka
Cc: LKML, Artem Bityutskiy, Richard Weinberger, David Woodhouse,
	Brian Norris, Boris Brezillon, Marek Vasut, Cyrille Pitchen,
	Theodore Ts'o, Andreas Dilger, Steven Whitehouse, Bob Peterson,
	Trond Myklebust, Anna Schumaker, Adrian Hunter,
	Philippe Ombredanne, Kate Stewart, linux-mtd@lists.infradead.org,
	linux-ext4@vger.kernel.org, cluster-devel@redhat.com,
	linux-nfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: vmalloc with GFP_NOFS
Message-ID: <20180424165532.GO17484@dhcp22.suse.cz>
References: <20180424162712.GL17484@dhcp22.suse.cz>
User-Agent: Mutt/1.9.4 (2018-02-28)
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 24-04-18 12:46:55, Mikulas
Patocka wrote:
>
>
> On Tue, 24 Apr 2018, Michal Hocko wrote:
>
> > Hi,
> > it seems that we still have a few vmalloc users who perform GFP_NOFS
> > allocation:
> > drivers/mtd/ubi/io.c
> > fs/ext4/xattr.c
> > fs/gfs2/dir.c
> > fs/gfs2/quota.c
> > fs/nfs/blocklayout/extent_tree.c
> > fs/ubifs/debug.c
> > fs/ubifs/lprops.c
> > fs/ubifs/lpt_commit.c
> > fs/ubifs/orphan.c
> >
> > Unfortunately vmalloc doesn't support GFP_NOFS semantics properly,
> > because we have hardcoded GFP_KERNEL allocations deep inside the
> > vmalloc layers. That means that if GFP_NOFS really protects from
> > recursion into the fs deadlocks, then the vmalloc call is broken.
> >
> > What to do about this? Well, there are two things. Firstly, it would be
> > really great to double check whether the GFP_NOFS is really needed. I
> > cannot judge that because I am not familiar with the code. It would be
> > great if the respective maintainers could check (hopefully
> > get_maintainer.sh pointed me to all the relevant ones). If there is no
> > reclaim recursion issue, then simply use the standard vmalloc (aka a
> > GFP_KERNEL request).
> >
> > If the use is really valid, then we have a way to do the vmalloc
> > allocation properly: the memalloc_nofs_{save,restore} scope API. How
> > does that work? You simply call memalloc_nofs_save when the reclaim
> > recursion critical section starts (e.g. when you take a lock which is
> > then used in the reclaim path, e.g. in a shrinker) and
> > memalloc_nofs_restore when the critical section ends. _All_ allocations
> > within that scope will get GFP_NOFS semantics automagically. If you are
> > not sure about the scope itself, then the easiest workaround is to wrap
> > the vmalloc itself, with a big fat comment that this should be
> > revisited.
> >
> > Does that sound like something that can be done in a reasonable time?
> > I have tried to bring this up in the past, but our speed is glacial,
> > and there are attempts to do hacks like checking for abusers inside
> > vmalloc, which is just too ugly to live.
> >
> > Please do not hesitate to get back to me if something is not clear.
> >
> > Thanks!
> > --
> > Michal Hocko
> > SUSE Labs
>
> I made a patch that adds memalloc_noio/fs_save around these calls a year
> ago: http://lkml.iu.edu/hypermail/linux/kernel/1707.0/01376.html

Yeah, and that is the wrong approach. Let's try to fix this properly
this time. As the above outlines, the worst case we could end up with
mid-term would be to wrap the vmalloc calls in the scope API, with a
TODO. But I am pretty sure the respective maintainers can come up with
a better solution. I am definitely willing to help here.
--
Michal Hocko
SUSE Labs
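
[Editor's note: the scope API pattern described above, applied around a
vmalloc call, looks roughly like the sketch below. The filesystem
context (`struct fs_info`, the `fs_lock` mutex, the function name) is
hypothetical and only illustrates the shape; it is not taken from any
of the listed users. Kernel-only code, not buildable in userspace.]

```c
#include <linux/sched/mm.h>	/* memalloc_nofs_save/restore */
#include <linux/vmalloc.h>
#include <linux/mutex.h>

/* Hypothetical fs context: fs_lock is also taken from the fs shrinker,
 * so any allocation made while holding it must not recurse into fs
 * reclaim, or we can deadlock. */
struct fs_info {
	struct mutex fs_lock;
};

static void *fs_alloc_table(struct fs_info *fi, size_t size)
{
	unsigned int nofs_flags;
	void *buf;

	mutex_lock(&fi->fs_lock);

	/* Everything between save and restore implicitly gets GFP_NOFS
	 * semantics, including the GFP_KERNEL allocations done deep
	 * inside vmalloc itself -- which plain vmalloc(GFP_NOFS-ish)
	 * cannot guarantee. */
	nofs_flags = memalloc_nofs_save();
	buf = vmalloc(size);
	memalloc_nofs_restore(nofs_flags);

	mutex_unlock(&fi->fs_lock);
	return buf;
}
```

Ideally the save/restore pair sits at the point where the reclaim
recursion critical section actually begins (the lock acquisition), not
immediately around the allocation; wrapping only the vmalloc is the
"big fat comment, revisit later" fallback mentioned above.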