Date: Wed, 9 May 2018 15:42:22 +0200
From: Michal Hocko
To: "Theodore Y.
 Ts'o"
Cc: LKML, Artem Bityutskiy, Richard Weinberger, David Woodhouse,
 Brian Norris, Boris Brezillon, Marek Vasut, Cyrille Pitchen,
 Andreas Dilger, Steven Whitehouse, Bob Peterson, Trond Myklebust,
 Anna Schumaker, Adrian Hunter, Philippe Ombredanne, Kate Stewart,
 Mikulas Patocka, linux-mtd@lists.infradead.org,
 linux-ext4@vger.kernel.org, cluster-devel@redhat.com,
 linux-nfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: vmalloc with GFP_NOFS
Message-ID: <20180509134222.GU32366@dhcp22.suse.cz>
In-Reply-To: <20180424192542.GS17484@dhcp22.suse.cz>

On Tue 24-04-18 13:25:42, Michal Hocko wrote:
[...]
> > As a suggestion, could you take
> > documentation about how to convert to the memalloc_nofs_{save,restore}
> > scope api (which I think you've written about e-mails at length
> > before), and put that into a file in Documentation/core-api?
>
> I can.

Does something like the below sound reasonable/helpful?
---
=================================
GFP masks used from FS/IO context
=================================

:Date: May, 2018
:Author: Michal Hocko

Introduction
============

Code paths in the filesystem and IO stacks must be careful when
allocating memory, to prevent recursion deadlocks caused by direct
memory reclaim calling back into the FS or IO paths and blocking on
already held resources (e.g. locks).

The traditional way to avoid this problem is to clear __GFP_FS or
__GFP_IO, respectively, in the gfp mask when calling an allocator (note
that clearing the latter implies clearing the former as well). GFP_NOFS
and GFP_NOIO can be used as shortcuts.
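
To make the mask relationship concrete, here is a small userspace
sketch (not kernel code). The MODEL_* names, the helper functions and
the bit values are made up for illustration; the real definitions live
in the kernel's gfp headers and their values differ between versions.
It only assumes what is said above: GFP_NOFS is GFP_KERNEL without the
FS bit, and GFP_NOIO drops both the IO and the FS bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only, NOT the kernel's real values. */
#define MODEL_GFP_IO_BIT   0x1u  /* models __GFP_IO: reclaim may start IO */
#define MODEL_GFP_FS_BIT   0x2u  /* models __GFP_FS: reclaim may call into the FS */
#define MODEL_GFP_RECLAIM  0x4u  /* models "reclaim is allowed at all" */

/* Shortcuts, modelled after GFP_KERNEL, GFP_NOFS and GFP_NOIO. */
#define MODEL_GFP_KERNEL (MODEL_GFP_RECLAIM | MODEL_GFP_IO_BIT | MODEL_GFP_FS_BIT)
#define MODEL_GFP_NOFS   (MODEL_GFP_RECLAIM | MODEL_GFP_IO_BIT)
#define MODEL_GFP_NOIO   (MODEL_GFP_RECLAIM)

/* Hypothetical helper: may reclaim recurse into filesystem code? */
bool model_reclaim_may_enter_fs(unsigned int gfp)
{
	return (gfp & MODEL_GFP_FS_BIT) != 0;
}

/* Hypothetical helper: may reclaim submit IO? */
bool model_reclaim_may_do_io(unsigned int gfp)
{
	return (gfp & MODEL_GFP_IO_BIT) != 0;
}
```

With these definitions a NOFS mask still permits IO but not FS
recursion, while a NOIO mask permits neither.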

It turned out, though, that the above approach has led to abuse: the
restricted gfp mask is used "just in case", without deeper
consideration. This causes problems because excessive use of
GFP_NOFS/GFP_NOIO can lead to memory over-reclaim or other memory
reclaim issues.

New API
=======

Since 4.12 we have a generic scope API for both NOFS and NOIO contexts:
``memalloc_nofs_save``, ``memalloc_nofs_restore`` and
``memalloc_noio_save``, ``memalloc_noio_restore`` respectively. They
mark a scope as a critical section from the point of view of memory
reclaim recursing into FS/IO. Any allocation from that scope will
implicitly drop __GFP_FS or __GFP_IO, respectively, from the given
mask, so no memory allocation can recurse back into the FS/IO.

FS/IO code then simply calls the appropriate save function right at the
layer where it takes a lock that is also taken from the reclaim context
(e.g. a shrinker), and the corresponding restore function when the lock
is released. Ideally, all of this comes with a comment explaining what
the reclaim context is, for easier maintenance.

What about __vmalloc(GFP_NOFS)
==============================

vmalloc doesn't support GFP_NOFS semantics because there are hardcoded
GFP_KERNEL allocations deep inside the allocator which are quite
non-trivial to fix up. That means that calling ``vmalloc`` with
GFP_NOFS/GFP_NOIO is almost always a bug. The good news is that the
NOFS/NOIO semantics can be achieved with the scope API.

In an ideal world, upper layers should already mark dangerous contexts,
so no special care is required and vmalloc can be called without any
problems. If the context is not really clear, or there are layering
violations, the recommended workaround is to wrap ``vmalloc`` in the
scope API, with a comment explaining the problem.
-- 
Michal Hocko
SUSE Labs
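
As an illustration of the scope API and the vmalloc wrapping described
above, here is a userspace model (not kernel code). Every model_* name
and MODEL_* value is hypothetical. It assumes the scope is tracked as a
per-task flag that the save function sets and returns the previous
state of, and that the allocator masks the FS bit out of every request
while that flag is set. That is my reading of the text, not a copy of
the kernel implementation.

```c
#include <assert.h>

/* Illustrative bit values only, NOT the kernel's real values. */
#define MODEL_GFP_IO_BIT   0x1u
#define MODEL_GFP_FS_BIT   0x2u
#define MODEL_GFP_RECLAIM  0x4u
#define MODEL_GFP_KERNEL (MODEL_GFP_RECLAIM | MODEL_GFP_IO_BIT | MODEL_GFP_FS_BIT)

/* Models a per-task "inside a NOFS scope" flag. */
#define MODEL_PF_MEMALLOC_NOFS 0x1u

/* Stands in for the current task's flags (per task in the kernel). */
static unsigned int model_task_flags;

/* Enter a NOFS scope; returns the previous state so scopes can nest. */
unsigned int model_memalloc_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope by putting back the state that save returned. */
void model_memalloc_nofs_restore(unsigned int old)
{
	assert((old & ~MODEL_PF_MEMALLOC_NOFS) == 0);
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* What the allocator would actually honour for a requested mask. */
unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_MEMALLOC_NOFS)
		gfp &= ~MODEL_GFP_FS_BIT;
	return gfp;
}

/*
 * Models an allocator with a hardcoded GFP_KERNEL request deep inside,
 * as the text describes for vmalloc. Returns the effective mask.
 */
unsigned int model_vmalloc_like_alloc(void)
{
	return model_effective_gfp(MODEL_GFP_KERNEL);
}
```

In real kernel code the same shape would be
``flags = memalloc_nofs_save();`` followed by the ``vmalloc`` call and
then ``memalloc_nofs_restore(flags);``, with a comment naming the lock
or reclaim context that makes the scope necessary. Returning the old
state from save is what lets an inner restore leave an outer scope in
effect.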