Subject: Re: [External] Re: [PATCH v15 4/8] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page
To: Michal Hocko, Muchun Song
Cc: Jonathan Corbet, Mike Kravetz, Thomas Gleixner, mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, Peter Zijlstra, viro@zeniv.linux.org.uk, Andrew Morton, paulmck@kernel.org, mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com, Randy Dunlap, oneukum@suse.com, anshuman.khandual@arm.com, jroedel@suse.de, Mina Almasry, David Rientjes, Matthew Wilcox, Oscar Salvador, "Song Bao Hua (Barry Song)", HORIGUCHI NAOYA(堀口 直也), Joao Martins, Xiongchun duan, linux-doc@vger.kernel.org, LKML, Linux Memory Management List, linux-fsdevel
From: David Hildenbrand
Organization: Red Hat GmbH
Message-ID: <4f8664fb-0d65-b7d6-39d6-2ce5fc86623a@redhat.com>
Date: Tue, 16 Feb 2021 09:13:09 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

On 15.02.21 20:02, Michal Hocko wrote:
> On Tue 16-02-21 01:48:29, Muchun Song wrote:
>> On Tue, Feb 16, 2021 at 12:28 AM Michal Hocko wrote:
>>>
>>> On Mon 15-02-21 23:36:49, Muchun Song wrote:
>>> [...]
>>>>> There shouldn't be any real reason why the memory allocation for
>>>>> vmemmaps, or handling vmemmap in general, has to be done from within the
>>>>> hugetlb lock and therefore requiring a non-sleeping semantic.
>>>>> All that can be deferred to a more relaxed context. If you want to make a
>>>>
>>>> Yeah, you are right. We can put the hugetlb freeing routine onto a
>>>> workqueue, just like I did in a previous version (before v13) of the
>>>> patch. I will pick those patches up again.
>>>
>>> I haven't seen your v13 and will unlikely have time to revisit that
>>> version. I just wanted to point out that the actual allocation doesn't
>>> have to happen under the spinlock. There are multiple ways to go
>>> around that: dropping the lock is one of them; preallocation before
>>> the spinlock is taken is another. A WQ is certainly an option, but I
>>> would take it as a last resort when the other paths are not feasible.
>>>
>>
>> "Dropping the lock" and "preallocation before the spinlock" both
>> require that put_page only run in non-atomic context. I am not sure
>> whether a page can be put somewhere under an atomic context, e.g.
>> during compaction. I am not an expert on this.
>
> Then do due research or ask for help from the MM community. Do not
> just try to go around harder problems and somehow duct-tape a
> solution. I am sorry for sounding harsh here, but this is a repetitive
> pattern.
>
> Now to the merit. put_page can indeed be called from all sorts of
> contexts, and it might indeed be impossible to guarantee that hugetlb
> pages are never freed from an atomic context. Requiring that would
> be even harder to maintain long term. There are ways around that, I
> believe, though.
>
> The simplest one that I can think of right now would be using
> in_atomic() rather than in_task() for the check in free_huge_page.
> IIRC, recent changes would allow in_atomic() to be reliable also on
> !PREEMPT kernels (via the RCU tree; not sure where this stands right
> now). That would make __free_huge_page always run in a non-atomic
> context, which sounds like an easy enough solution.
> Another way would be to keep a pool of ready pages to use in case a
> GFP_NOWAIT allocation fails, and to have means to keep that pool
> replenished when needed. Would it be feasible to reuse parts of the
> freed page in the worst case?

As already discussed, this is only possible when the huge page does not
reside on ZONE_MOVABLE/CMA. In addition, we could then never again form
a huge page at that memory location.

-- 
Thanks,

David / dhildenb