Date: Sat, 21 May 2022 09:36:33 -0700
From: Minchan Kim
To: David Hildenbrand
Cc: Mike Kravetz, John Hubbard, Andrew Morton, syzbot,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, llvm@lists.linux.dev,
 nathan@kernel.org, ndesaulniers@google.com, syzkaller-bugs@googlegroups.com,
 trix@redhat.com, Matthew Wilcox, Stephen Rothwell
Subject: Re: [syzbot] WARNING in follow_hugetlb_page
References: <6d281052-485c-5e17-4f1c-ef5689831450@oracle.com> <0be9132d-a928-9ebe-a9cf-6d140b907d59@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, May 21, 2022 at 05:51:58PM +0200, David Hildenbrand wrote:
> On 21.05.22 17:24, Minchan Kim wrote:
> > On Fri, May 20, 2022 at 05:04:22PM -0700, Mike Kravetz wrote:
> >> On 5/20/22 16:43, Minchan Kim wrote:
> >>> On Fri, May 20, 2022 at 04:31:31PM -0700, Mike Kravetz wrote:
> >>>> On 5/20/22 15:56, John Hubbard wrote:
> >>>>> On 5/20/22 15:19, Minchan Kim wrote:
> >>>>>> The memory offline would be an issue so we shouldn't allow pinning of any
> >>>>>> pages in *movable zone*.
> >>>>>>
> >>>>>> Isn't alloc_contig_range just best effort? Then, it wouldn't be a big
> >>>>>> problem to allow pinning on those area. The matter is what target range
> >>>>>> on alloc_contig_range is backed by CMA or movable zone and usecases.
> >>>>>>
> >>>>>> IOW, movable zone should be never allowed. But CMA case, if pages
> >>>>>> are used by normal process memory instead of hugeTLB, we shouldn't
> >>>>>> allow longterm pinning since someone can claim those memory suddenly.
> >>>>>> However, we are fine to allow longterm pinning if the CMA memory
> >>>>>> already claimed and mapped at userspace(hugeTLB case IIUC).
> >>>>>>
> >>>>>
> >>>>> From Mike's comments and yours, plus a rather quick reading of some
> >>>>> CMA-related code in mm/hugetlb.c (free_gigantic_page(),
> >>>>> alloc_gigantic_pages()), the following seems true:
> >>>>>
> >>>>> a) hugetlbfs can allocate pages *from* CMA, via cma_alloc()
> >>>>>
> >>>>> b) while hugetlbfs is using those CMA-allocated pages, it is debatable
> >>>>> whether those pages should be allowed to be long term pinned. That's
> >>>>> because there are two cases:
> >>>>>
> >>>>>     Case 1: pages are longterm pinned, then released, all while
> >>>>>             owned by hugetlbfs. No problem.
> >>>>>
> >>>>>     Case 2: pages are longterm pinned, but then hugetlbfs releases the
> >>>>>             pages entirely (via unmounting hugetlbfs, I presume). In
> >>>>>             this case, we now have CMA page that are long-term pinned,
> >>>>>             and that's the state we want to avoid.
> >>>>
> >>>> I do not think case 2 can happen. A hugetlb page can only be changed back
> >>>> to 'normal' (buddy) pages when ref count goes to zero.
> >>>>
> >>>> It should also be noted that hugetlb code sets up the CMA area from which
> >>>> hugetlb pages can be allocated. This area is never unreserved/freed.
> >>>>
> >>>> I do not think there is a reason to disallow long term pinning of hugetlb
> >>>> pages allocated from THE hugetlb CMA area.
>
> Hm. We primarily use CMA for gigantic pages only IIRC. Ordinary huge
> pages come via the buddy.
>
> Assume we allocated a (movable) 2MiB huge page ordinarily via the buddy
> and it ended up on that CMA area by pure luck (as it's movable). If we'd
> allow to pin it long-term, allocating a gigantic page from the
> designated CMA area would fail.

If we allow a longterm pin on a hugetlb page that came via the buddy,
it should be migrated out of CMA by check_and_migrate_movable_pages
before the longterm pin, IIUC. If so, why would allocating a gigantic
page from the designated CMA area fail?

>
> So we'd want to allow long-term pinning a gigantic page but we'd not
> want to allow long-term pinning an ordinary huge page. We'd want to
> migrate the latter away.

Sure. A gigantic page has already been claimed from CMA, so no future
user can claim that memory again and it's fine to allow a longterm pin.
An ordinary huge page shouldn't be allowed, since the CMA owner could
claim the memory some day.

>
>
> The general rules are:
>
> ZONE_MOVABLE: nobody is allowed to place unmovable allocations there; it
> could prevent memory offlining/unplug.
>
> CMA: nobody *but the designated owner* is allowed to place unmovable
> memory there; it could prevent the actual owner to allocate contiguous
> memory.

I am confused about what "designated owner" and "actual owner" mean in
your context. Here is what I understand of the issue, based on your
explanation.

HugeTLB allocates its pages in two ways:

1. alloc_pages(GFP_MOVABLE)

   It could allocate the hugetlb page from a CMA area, but a longterm pin
   should migrate it out of CMA before pinning, so allowing the pin on
   the page is no problem, and the current code works like that
   (check_and_migrate_movable_pages).

2. cma_alloc

   cma_alloc is used only for *gigantic pages*, and hugetlbfs is the very
   owner of the page. IOW, once hugetlbfs has succeeded in allocating the
   gigantic page via cma_alloc, no other owner can claim the page any
   longer, so it's fine to allow longterm pinning of the gigantic page.
   However, the current code doesn't work like that due to
   is_pinnable_page.

IOW, hugetlbfs needs a way to distinguish whether the page owner is
hugetlbfs or not.

Are we on the same page?
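
To make the distinction between the two allocation paths concrete, here
is a rough userspace sketch. It is not kernel code. struct page_model,
its fields, and both functions are hypothetical names made up for
illustration. current_pinnable_model() only reflects how this thread
describes is_pinnable_page (assumed to refuse any page in ZONE_MOVABLE
or in a CMA area), and proposed_pinnable_model() adds the ownership
check argued for in case 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the page state discussed in this thread.
 * None of these names exist in the kernel. */
struct page_model {
	bool in_zone_movable;   /* page lives in ZONE_MOVABLE */
	bool in_cma_area;       /* page lies inside a CMA area */
	bool owned_by_cma_user; /* area's owner got it via cma_alloc() */
};

/* Assumed current behaviour: any ZONE_MOVABLE or CMA page is refused,
 * regardless of who allocated it. */
bool current_pinnable_model(const struct page_model *p)
{
	assert(p != NULL);
	return !(p->in_zone_movable || p->in_cma_area);
}

/* Behaviour argued for above: a CMA page already claimed by its owner
 * (the gigantic hugetlb case) may be pinned longterm; a movable page
 * that merely landed in the CMA area must be migrated out first. */
bool proposed_pinnable_model(const struct page_model *p)
{
	assert(p != NULL);
	if (p->in_zone_movable)
		return false;
	if (p->in_cma_area)
		return p->owned_by_cma_user;
	return true;
}
```

In this model a gigantic page obtained through cma_alloc
({false, true, true}) is refused by current_pinnable_model() but
accepted by proposed_pinnable_model(), while a buddy-allocated huge page
that happened to land in CMA ({false, true, false}) is refused by both
and would have to be migrated before the pin.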