Date: Thu, 9 May 2024 11:17:54 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
To: Barry Song <21cnbao@gmail.com>
Cc: hailong.liu@oppo.com, akpm@linux-foundation.org, urezki@gmail.com,
 hch@infradead.org, lstoakes@gmail.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, xiang@kernel.org, chao@kernel.org, Oven
References: <20240508125808.28882-1-hailong.liu@oppo.com>
 <20d782ad-c059-4029-9c75-0ef278c98d81@linux.alibaba.com>
From: Gao Xiang

On 2024/5/9 11:09, Barry Song wrote:
> On Thu, May 9, 2024 at 2:39 PM Gao Xiang wrote:
>>
>> Hi,
>>
>> On 2024/5/9 10:20, Barry Song wrote:
>>> On Thu, May 9, 2024 at 12:58 AM wrote:
>>>>
>>>> From: "Hailong.Liu"
>>>>
>>>> Commit a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc")
>>>> includes support for __GFP_NOFAIL, but it presents a conflict with
>>>> commit dd544141b9eb ("vmalloc: back off when the current task is
>>>> OOM-killed").
>>>> A possible scenario is as follows:
>>>>
>>>> process-a
>>>> kvcalloc(n, m, GFP_KERNEL | __GFP_NOFAIL)
>>>>     __vmalloc_node_range()
>>>>         __vmalloc_area_node()
>>>>             vm_area_alloc_pages()
>>>>             --> oom-killer sends SIGKILL to process-a
>>>>             if (fatal_signal_pending(current)) break;
>>>> --> return NULL;
>>>>
>>>> To fix this, do not check fatal_signal_pending() in vm_area_alloc_pages()
>>>> if __GFP_NOFAIL is set.
>>>>
>>>> Reported-by: Oven
>>>> Signed-off-by: Hailong.Liu
>>>> ---
>>>>  mm/vmalloc.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>>> index 6641be0ca80b..2f359d08bf8d 100644
>>>> --- a/mm/vmalloc.c
>>>> +++ b/mm/vmalloc.c
>>>> @@ -3560,7 +3560,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>>>>
>>>>         /* High-order pages or fallback path if "bulk" fails. */
>>>>         while (nr_allocated < nr_pages) {
>>>> -               if (fatal_signal_pending(current))
>>>> +               if (!(gfp & __GFP_NOFAIL) && fatal_signal_pending(current))
>>>>                         break;
>>>
>>> why not !nofail ?
>>>
>>> This seems a correct fix, but it undermines the assumption made in
>>> commit dd544141b9eb
>>> ("vmalloc: back off when the current task is OOM-killed")
>>>
>>> "
>>> This may trigger some hidden problems, when caller does not handle
>>> vmalloc failures, or when rollback after failed vmalloc calls own
>>> vmallocs inside. However all of these scenarios are incorrect: vmalloc
>>> does not guarantee successful allocation, it has never been called with
>>> __GFP_NOFAIL and therefore either should not be used for any rollbacks or
>>> should handle such errors correctly and not lead to critical failures.
>>> "
>>>
>>> If a significant kvmalloc operation is performed with the NOFAIL flag, it risks
>>> reverting the fix intended to address the OOM-killer issue in commit
>>> dd544141b9eb.
>>> Should we indeed permit the NOFAIL flag for large kvmalloc allocations?
>>
>> Just from my perspective, I don't really care about kmalloc, vmalloc
>> or kvmalloc (__GFP_NOFAIL).
>> I don't even care if it returns three
>> order-0 pages or a high-order page. I would just like to have a
>> virtually contiguous buffer (even if it works slowly) with __GFP_NOFAIL.
>>
>> Because in some cases, writing fallback code may be tough, and it is hard
>> to test whether such a fallback path is correct since it only triggers in
>> extreme workloads, and such buffers are only used for a very short lifetime.
>> Also see other FS discussion of __GFP_NOFAIL, e.g.
>> https://lore.kernel.org/all/ZcUQfzfQ9R8X0s47@tiehlicka/
>>
>> In the worst cases, it usually just needs < 5 order-0 pages (for many
>> cases it only needs one page), but with kmalloc it will trigger a WARN
>> if it comes to a > order-1 allocation, as I mentioned before.
>>
>> With my limited understanding I don't see why there could be any problem
>> with kvmalloc(__GFP_NOFAIL), since it is no different from
>> kmalloc(__GFP_NOFAIL) with an order-0 allocation.
>
> I completely understand that you're not concerned about the origin of
> the memory, such as whether it's made up entirely of order-0 pages.
> However, in the event that someone else allocates a large amount of
> memory, like several megabytes, with the NOFAIL flag, commit dd544141b9eb
> aims to halt the allocation before it succeeds if the allocating process
> is targeted for termination by the OOM killer.
>
> With the current patch, we miss the opportunity for early allocation
> termination.
> However, if the size of the kvmalloc() is small, as in your case, I
> believe it should be perfectly fine. But do we have any way to prevent
> large allocations with NOFAIL?

I think preventing large virtually contiguous allocations built from
order-0 pages should be done by the callers (e.g. EROFS) rather than by
the memory subsystem, since that totally sounds like a caller bug.
It doesn't make sense to me to block getting a virtually contiguous
buffer which only needs more than two (e.g. three or five) order-0 pages
in extreme cases.
Yes, I need to prevent truly insane allocations, but that is a hard
on-disk limitation of the filesystem itself.

Thanks,
Gao Xiang

>
>>
>>
>> Thanks,
>> Gao Xiang
>
> Thanks
> Barry
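
To make the two points in this thread concrete, here is a small userspace
sketch in C. It is only a model under stated assumptions, not kernel code:
the MODEL_* flag values, the "kill_after" parameter that stands in for
fatal_signal_pending(current), and the 5-page cap are all hypothetical
names and numbers chosen for illustration. It mimics the fallback loop of
vm_area_alloc_pages() before and after the proposed change, and shows a
caller bounding its own __GFP_NOFAIL request size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfp flags; the values are arbitrary. */
#define MODEL_GFP_KERNEL 0x1u
#define MODEL_GFP_NOFAIL 0x2u

/* Hypothetical caller-side cap, e.g. derived from an on-disk format limit. */
#define MODEL_MAX_BUF_PAGES 5

/*
 * Model of the order-0 fallback loop. A fatal signal is treated as pending
 * once kill_after pages have been allocated. Returns the number of pages
 * allocated; fewer than nr_pages corresponds to vmalloc returning NULL.
 */
static size_t model_alloc_pages(unsigned int gfp, size_t nr_pages,
                                size_t kill_after, bool patched)
{
    size_t nr_allocated = 0;

    while (nr_allocated < nr_pages) {
        bool fatal_pending = nr_allocated >= kill_after;
        bool nofail = (gfp & MODEL_GFP_NOFAIL) != 0;

        /* Unpatched: always back off. Patched: back off only if !nofail. */
        if (fatal_pending && !(patched && nofail))
            break;
        nr_allocated++;
    }
    return nr_allocated;
}

/* Caller-side sanity check: an oversized NOFAIL request is a caller bug. */
static bool model_request_is_sane(size_t nr_pages)
{
    return nr_pages >= 1 && nr_pages <= MODEL_MAX_BUF_PAGES;
}

/* A caller that validates its size before asking for a NOFAIL buffer. */
static size_t model_nofail_alloc(size_t nr_pages, size_t kill_after)
{
    assert(model_request_is_sane(nr_pages));
    return model_alloc_pages(MODEL_GFP_KERNEL | MODEL_GFP_NOFAIL,
                             nr_pages, kill_after, true);
}
```

In this model, an unpatched NOFAIL request for 5 pages that is "killed"
after 2 pages stops short, while the patched loop runs to completion.
The size check sits in the caller, which matches the argument above that
bounding the request is the caller's job rather than the allocator's.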