From: Qi Zheng
To: akpm@linux-foundation.org, rppt@kernel.org, david@redhat.com, vbabka@suse.cz, mhocko@suse.com
Cc: willy@infradead.org, mgorman@techsingularity.net, mingo@kernel.org, aneesh.kumar@linux.ibm.com, ying.huang@intel.com, hannes@cmpxchg.org, osalvador@suse.de, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Qi Zheng
Subject: [PATCH v4 1/2] mm: page_alloc: skip memoryless nodes entirely
Date: Fri, 20 Oct
 2023 19:04:45 +0800
Message-Id: <7300fc00a057eefeb9a68c8ad28171c3f0ce66ce.1697799303.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

In find_next_best_node(), we skip memoryless nodes when building the
zonelists of other normal nodes (N_NORMAL), but we do not skip a
memoryless node itself when building its own zonelist. As a result, the
memoryless node is still traversed at runtime.

For example, say we have node0 and node1, where node0 is a memoryless
node. The fallback order of node0 and node1 is then as follows:

[    0.153005] Fallback order for Node 0: 0 1
[    0.153564] Fallback order for Node 1: 1

After this patch, we skip memoryless node0 entirely, and the fallback
order of node0 and node1 becomes:

[    0.155236] Fallback order for Node 0: 1
[    0.155806] Fallback order for Node 1: 1

So node0 becomes completely invisible, which reduces runtime overhead.

This also means we never try to allocate pages from memoryless node0, so
the panic mentioned in [1] is fixed as well. Even though that problem has
already been solved by dropping the NODE_MIN_SIZE constraint in x86 [2],
it is better to fix it in the core MM as well.

[1]. https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
[2]. https://lore.kernel.org/all/20231017062215.171670-1-rppt@kernel.org/

Signed-off-by: Qi Zheng
Acked-by: David Hildenbrand
Acked-by: Ingo Molnar
---
 mm/page_alloc.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ee392a324802..1f852929709f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5052,8 +5052,11 @@ int find_next_best_node(int node, nodemask_t *used_node_mask)
 	int min_val = INT_MAX;
 	int best_node = NUMA_NO_NODE;
 
-	/* Use the local node if we haven't already */
-	if (!node_isset(node, *used_node_mask)) {
+	/*
+	 * Use the local node if we haven't already, but for memoryless local
+	 * node, we should skip it and fall back to other nodes.
+	 */
+	if (!node_isset(node, *used_node_mask) && node_state(node, N_MEMORY)) {
 		node_set(node, *used_node_mask);
 		return node;
 	}
-- 
2.30.2
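
For readers who want to see the effect of the added check outside the kernel, here is a minimal userspace sketch. It is not the kernel implementation: the toy_* names, the two-node topology, the distance table and the bitmask standing in for nodemask_t are all made up for illustration, and the real function's distance penalties are left out. It only models the two-node example from the changelog, with and without the N_MEMORY-style test on the local node.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_MAX_NODES 2
#define TOY_NO_NODE   (-1)

/* Hypothetical toy topology: node 0 is memoryless, node 1 has memory. */
static const bool toy_has_memory[TOY_MAX_NODES] = { false, true };
static const int toy_distance[TOY_MAX_NODES][TOY_MAX_NODES] = {
	{ 10, 20 },
	{ 20, 10 },
};

/*
 * Simplified stand-in for find_next_best_node(). 'used' is a bitmask of
 * nodes already placed in the fallback list. When 'skip_memoryless_local'
 * is true, the local node is returned only if it has memory, mirroring
 * the added node_state(node, N_MEMORY) check.
 */
static int toy_next_best_node(int node, unsigned int *used,
			      bool skip_memoryless_local)
{
	int best_node = TOY_NO_NODE;
	int min_val = INT_MAX;

	assert(node >= 0 && node < TOY_MAX_NODES);

	/* Use the local node first, subject to the optional memory check. */
	if (!(*used & (1u << node)) &&
	    (!skip_memoryless_local || toy_has_memory[node])) {
		*used |= 1u << node;
		return node;
	}

	/* Otherwise pick the nearest unused node that has memory. */
	for (int n = 0; n < TOY_MAX_NODES; n++) {
		if (!toy_has_memory[n] || (*used & (1u << n)))
			continue;
		if (toy_distance[node][n] < min_val) {
			min_val = toy_distance[node][n];
			best_node = n;
		}
	}

	if (best_node != TOY_NO_NODE)
		*used |= 1u << best_node;
	return best_node;
}

/* Fill 'order' with the fallback list for 'node'; return its length. */
static int toy_build_fallback(int node, bool skip_memoryless_local,
			      int order[TOY_MAX_NODES])
{
	unsigned int used = 0;
	int len = 0;
	int next;

	while (len < TOY_MAX_NODES &&
	       (next = toy_next_best_node(node, &used,
					  skip_memoryless_local)) != TOY_NO_NODE)
		order[len++] = next;
	return len;
}

#ifdef TOY_DEMO_MAIN
/* Optional demo: build with -DTOY_DEMO_MAIN to print both orderings. */
int main(void)
{
	for (int skip = 0; skip <= 1; skip++) {
		for (int node = 0; node < TOY_MAX_NODES; node++) {
			int order[TOY_MAX_NODES];
			int len = toy_build_fallback(node, skip, order);

			printf("%s Fallback order for Node %d:",
			       skip ? "after: " : "before:", node);
			for (int i = 0; i < len; i++)
				printf(" %d", order[i]);
			printf("\n");
		}
	}
	return 0;
}
#endif
```

In this toy model, calling toy_build_fallback(0, false, order) yields the list 0 1, and toy_build_fallback(0, true, order) yields just 1, which matches the before/after "Fallback order for Node 0" lines quoted above. Node 1 is unaffected either way.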