Date: Wed, 9 May 2018 08:13:51 -0700
From: "Darrick J. Wong"
To: Michal Hocko
Cc: "Theodore Y. Ts'o", LKML, Artem Bityutskiy, Richard Weinberger,
    David Woodhouse, Brian Norris, Boris Brezillon, Marek Vasut,
    Cyrille Pitchen, Andreas Dilger, Steven Whitehouse, Bob Peterson,
    Trond Myklebust, Anna Schumaker, Adrian Hunter, Philippe Ombredanne,
    Kate Stewart, Mikulas Patocka, linux-mtd@lists.infradead.org,
    linux-ext4@vger.kernel.org, cluster-devel@redhat.com,
    linux-nfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: vmalloc with GFP_NOFS
Message-ID: <20180509151351.GA4111@magnolia>
References: <20180424162712.GL17484@dhcp22.suse.cz>
    <20180424183536.GF30619@thunk.org>
    <20180424192542.GS17484@dhcp22.suse.cz>
    <20180509134222.GU32366@dhcp22.suse.cz>
In-Reply-To: <20180509134222.GU32366@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 09, 2018
at 03:42:22PM +0200, Michal Hocko wrote:
> On Tue 24-04-18 13:25:42, Michal Hocko wrote:
> [...]
> > > As a suggestion, could you take
> > > documentation about how to convert to the memalloc_nofs_{save,restore}
> > > scope api (which I think you've written about e-mails at length
> > > before), and put that into a file in Documentation/core-api?
> >
> > I can.
>
> Does something like the below sound reasonable/helpful?
> ---
> =================================
> GFP masks used from FS/IO context
> =================================
>
> :Date: Mapy, 2018
> :Author: Michal Hocko
>
> Introduction
> ============
>
> FS resp. IO submitting code paths have to be careful when allocating

Not sure what 'FS resp. IO' means here -- 'FS and IO' ?

(Or is this one of those things where this looks like plain English text
but in reality it's some sort of markup that I'm not so familiar with?)

Confused because I've seen 'resp.' used as shorthand for
'responsible'...

> memory to prevent from potential recursion deadlocks caused by direct
> memory reclaim calling back into the FS/IO path and block on already
> held resources (e.g. locks). Traditional way to avoid this problem

'The traditional way to avoid this deadlock problem...'

> is to clear __GFP_FS resp. __GFP_IO (note the later implies clearing
> the first as well) in the gfp mask when calling an allocator. GFP_NOFS
> resp. GFP_NOIO can be used as shortcut.
>
> This has been the traditional way to avoid deadlocks since ages. It

I think this sentence is a little redundant with the previous sentence,
you could chop it out and join this paragraph to the one before it.

> turned out though that above approach has led to abuses when the restricted
> gfp mask is used "just in case" without a deeper consideration which leads
> to problems because an excessive use of GFP_NOFS/GFP_NOIO can lead to
> memory over-reclaim or other memory reclaim issues.
>
> New API
> =======
>
> Since 4.12 we do have a generic scope API for both NOFS and NOIO context
> ``memalloc_nofs_save``, ``memalloc_nofs_restore`` resp. ``memalloc_noio_save``,
> ``memalloc_noio_restore`` which allow to mark a scope to be a critical
> section from the memory reclaim recursion into FS/IO POV. Any allocation
> from that scope will inherently drop __GFP_FS resp. __GFP_IO from the given
> mask so no memory allocation can recurse back in the FS/IO.
>
> FS/IO code then simply calls the appropriate save function right at
> the layer where a lock taken from the reclaim context (e.g. shrinker)
> is taken and the corresponding restore function when the lock is
> released. All that ideally along with an explanation what is the reclaim
> context for easier maintenance.
>
> What about __vmalloc(GFP_NOFS)
> ==============================
>
> vmalloc doesn't support GFP_NOFS semantic because there are hardcoded
> GFP_KERNEL allocations deep inside the allocator which are quit non-trivial

...which are quite non-trivial...

> to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
> almost always a bug. The good news is that the NOFS/NOIO semantic can be
> achieved by the scope api.
>
> In the ideal world, upper layers should already mark dangerous contexts
> and so no special care is required and vmalloc should be called without
> any problems. Sometimes if the context is not really clear or there are
> layering violations then the recommended way around that is to wrap ``vmalloc``
> by the scope API with a comment explaining the problem.

Otherwise looks ok to me based on my understanding of how all this is
supposed to work...

Reviewed-by: Darrick J. Wong

--D

> --
> Michal Hocko
> SUSE Labs
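
For anyone following the thread who wants to see the scope idea from the
quoted draft in concrete terms, here is a minimal userspace sketch. It is
not kernel code: every model_* function and MODEL_* constant below is a
hypothetical stand-in, and the flag values are made up for illustration.
It only models the behaviour the draft describes, i.e. that a save call
marks the task, any allocation inside the scope has the FS bit dropped
from its mask, and restore puts back whatever state the caller saw, so
nested scopes compose.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's gfp bits; values are arbitrary. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_KERNEL  (MODEL_GFP_IO | MODEL_GFP_FS)

/* Hypothetical stand-in for a per-task "NOFS scope active" flag. */
#define MODEL_PF_NOFS     0x1u

/* Models the per-task flags word; a real kernel keeps this per task. */
static unsigned int model_task_flags;

/* Enter a NOFS scope; returns the previous state so scopes can nest. */
static unsigned int model_nofs_save(void)
{
    unsigned int old = model_task_flags & MODEL_PF_NOFS;

    model_task_flags |= MODEL_PF_NOFS;
    return old;
}

/* Leave a NOFS scope by restoring the state returned by the matching save. */
static void model_nofs_restore(unsigned int old)
{
    model_task_flags = (model_task_flags & ~MODEL_PF_NOFS) | old;
}

/* What an allocator in this model would do with the caller's mask. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
    if (model_task_flags & MODEL_PF_NOFS)
        gfp &= ~MODEL_GFP_FS;   /* inside the scope: never recurse into FS */
    return gfp;
}

/*
 * Walks through the pattern from the draft: take the scope where the
 * reclaim-sensitive lock would be taken, allocate with a plain mask,
 * and drop the scope where the lock would be released.
 */
static void model_demo(void)
{
    unsigned int outer, inner;

    /* Outside any scope the mask is passed through untouched. */
    assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_KERNEL);

    outer = model_nofs_save();          /* e.g. where the lock is taken */
    assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_IO);

    inner = model_nofs_save();          /* a nested scope in a lower layer */
    model_nofs_restore(inner);          /* must not end the outer scope */
    assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_IO);

    model_nofs_restore(outer);          /* e.g. where the lock is released */
    assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_KERNEL);
}
```

In the kernel itself the equivalent pattern would presumably be to call
memalloc_nofs_save() before a plain vmalloc() and memalloc_nofs_restore()
afterwards, with a comment naming the reclaim context being protected, as
the last paragraph of the draft recommends.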