Subject: Re: vmalloc with GFP_NOFS
From: Steven Whitehouse
To: Michal Hocko, LKML
Cc: Artem Bityutskiy, Richard Weinberger, David Woodhouse, Brian Norris,
    Boris Brezillon, Marek Vasut, Cyrille Pitchen, Theodore Ts'o,
    Andreas Dilger, Bob Peterson, Trond Myklebust, Anna Schumaker,
    Adrian Hunter, Philippe Ombredanne, Kate Stewart, Mikulas Patocka,
    linux-mtd@lists.infradead.org, linux-ext4@vger.kernel.org,
    cluster-devel@redhat.com, linux-nfs@vger.kernel.org, linux-mm@kvack.org
Date: Tue, 24 Apr 2018 20:26:23 +0100
In-Reply-To: <20180424162712.GL17484@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 24/04/18 17:27, Michal Hocko wrote:
> Hi,
> it seems that we still have a few vmalloc users who perform GFP_NOFS
> allocations:
> drivers/mtd/ubi/io.c
> fs/ext4/xattr.c
> fs/gfs2/dir.c
> fs/gfs2/quota.c
> fs/nfs/blocklayout/extent_tree.c
> fs/ubifs/debug.c
> fs/ubifs/lprops.c
> fs/ubifs/lpt_commit.c
> fs/ubifs/orphan.c
>
> Unfortunately vmalloc doesn't support GFP_NOFS semantics properly
> because we have hardcoded GFP_KERNEL allocations deep inside the
> vmalloc layers. That means that if GFP_NOFS really protects against
> deadlocks from recursion into the fs, then the vmalloc call is broken.
>
> What to do about this? Well, there are two things. Firstly, it would be
> really great to double check whether the GFP_NOFS is really needed. I
> cannot judge that because I am not familiar with the code. It would be
> great if the respective maintainers could check (hopefully
> get_maintainer.pl pointed me to all the relevant ones). If there is no
> reclaim recursion issue then simply use the standard vmalloc (aka a
> GFP_KERNEL request).

For GFS2, and I suspect for other filesystems too, it is really needed.
We don't want to enter reclaim while holding filesystem locks.

> If the use is really valid then we have a way to do the vmalloc
> allocation properly. We have the memalloc_nofs_{save,restore} scope API.
> How does that work? You simply call memalloc_nofs_save when the reclaim
> recursion critical section starts (e.g. when you take a lock which is
> then used in the reclaim path - e.g. a shrinker) and
> memalloc_nofs_restore when the critical section ends. _All_ allocations
> within that scope will get GFP_NOFS semantics automagically. If you are
> not sure about the scope itself then the easiest workaround is to wrap
> the vmalloc itself, with a big fat comment that this should be
> revisited.
>
> Does that sound like something that can be done in a reasonable time?
> I have tried to bring this up in the past but our speed is glacial and
> there are attempts to do hacks like checking for abusers inside the
> vmalloc, which is just too ugly to live.
>
> Please do not hesitate to get back to me if something is not clear.
>
> Thanks!

It would be good to fix this; it has been known as an issue for a long
time. We might well be able to make use of the new API though. It might
be as simple as adding the calls when we get & release glocks, but I'd
have to check the code to be sure.

Steve.
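
To make the scope API discussed above concrete, here is a small userspace model of how I understand memalloc_nofs_{save,restore} to behave. It is only a sketch, not kernel code: every model_* name and the flag values are made up for illustration, and a plain static variable stands in for the per-task flag the kernel keeps. It shows why a hardcoded GFP_KERNEL request deep inside an allocator still ends up without the FS bit while a scope is active, and why the saved value lets scopes nest.

```c
/* Userspace model of the scoped-NOFS idea. NOT kernel code: all model_*
 * names and flag values below are hypothetical, chosen for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_GFP_FS     0x1u                    /* may recurse into the fs */
#define MODEL_GFP_KERNEL (0x2u | MODEL_GFP_FS)   /* "normal" request */

/* Stands in for the per-task NOFS flag (per task in the kernel). */
static bool model_task_nofs;

/* Enter a NOFS scope; the previous state is returned so scopes can nest. */
bool model_nofs_save(void)
{
    bool old = model_task_nofs;
    model_task_nofs = true;
    return old;
}

/* Leave a NOFS scope by putting back whatever was there before. */
void model_nofs_restore(bool old)
{
    model_task_nofs = old;
}

/* What the allocator does with a caller's mask: inside a scope the FS bit
 * is stripped, whatever the caller asked for. */
unsigned int model_effective_gfp(unsigned int gfp)
{
    return model_task_nofs ? (gfp & ~MODEL_GFP_FS) : gfp;
}

/* Stand-in for a vmalloc-like path with a hardcoded GFP_KERNEL request
 * deep inside. Returns true if that request could recurse into the fs. */
bool model_deep_alloc_may_enter_fs(void)
{
    return (model_effective_gfp(MODEL_GFP_KERNEL) & MODEL_GFP_FS) != 0;
}

/* Walk through one critical section, e.g. around a lock that is also
 * taken from the reclaim path. */
void model_demo(void)
{
    assert(model_deep_alloc_may_enter_fs());    /* outside any scope */

    bool saved = model_nofs_save();             /* critical section starts */
    assert(!model_deep_alloc_may_enter_fs());   /* FS bit masked */
    model_nofs_restore(saved);                  /* critical section ends */

    assert(model_deep_alloc_may_enter_fs());
    printf("nofs scope model: ok\n");
}
```

Whether the get/release glock paths are the right place for the real calls is exactly the part that still needs checking; the model only covers the masking and nesting behaviour.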