From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Geetika Moolchandani,
 Srikar Dronamraju, Valentin Schneider, "Peter Zijlstra (Intel)",
 Sasha Levin
Subject: [PATCH 5.14 028/334] sched/topology: Skip updating masks for non-online nodes
Date: Mon, 13 Sep 2021 15:11:22 +0200
Message-Id: <20210913131114.357767890@linuxfoundation.org>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20210913131113.390368911@linuxfoundation.org>
References: <20210913131113.390368911@linuxfoundation.org>
User-Agent:
quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Valentin Schneider

[ Upstream commit 0083242c93759dde353a963a90cb351c5c283379 ]

The scheduler currently expects NUMA node distances to be stable from
init onwards, and as a consequence builds the related data structures
once-and-for-all at init (see sched_init_numa()).

Unfortunately, on some architectures node distance is unreliable for
offline nodes and may very well change upon onlining.

Skip over offline nodes during sched_init_numa(). Track nodes that have
been onlined at least once, and trigger a build of a node's NUMA masks
when it is first onlined post-init.

Reported-by: Geetika Moolchandani
Signed-off-by: Srikar Dronamraju
Signed-off-by: Valentin Schneider
Signed-off-by: Peter Zijlstra (Intel)
Link: https://lkml.kernel.org/r/20210818074333.48645-1-srikar@linux.vnet.ibm.com
Signed-off-by: Sasha Levin
---
 kernel/sched/topology.c | 65 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b77ad49dc14f..4e8698e62f07 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1482,6 +1482,8 @@ int sched_max_numa_distance;
 static int			*sched_domains_numa_distance;
 static struct cpumask		***sched_domains_numa_masks;
 int __read_mostly		node_reclaim_distance = RECLAIM_DISTANCE;
+
+static unsigned long __read_mostly *sched_numa_onlined_nodes;
 #endif
 
 /*
@@ -1833,6 +1835,16 @@ void sched_init_numa(void)
 			sched_domains_numa_masks[i][j] = mask;
 
 			for_each_node(k) {
+				/*
+				 * Distance information can be unreliable for
+				 * offline nodes, defer building the node
+				 * masks to its bringup.
+				 * This relies on all unique distance values
+				 * still being visible at init time.
+				 */
+				if (!node_online(j))
+					continue;
+
 				if (sched_debug() && (node_distance(j, k) != node_distance(k, j)))
 					sched_numa_warn("Node-distance not symmetric");
 
@@ -1886,6 +1898,53 @@ void sched_init_numa(void)
 	sched_max_numa_distance = sched_domains_numa_distance[nr_levels - 1];
 
 	init_numa_topology_type();
+
+	sched_numa_onlined_nodes = bitmap_alloc(nr_node_ids, GFP_KERNEL);
+	if (!sched_numa_onlined_nodes)
+		return;
+
+	bitmap_zero(sched_numa_onlined_nodes, nr_node_ids);
+	for_each_online_node(i)
+		bitmap_set(sched_numa_onlined_nodes, i, 1);
+}
+
+static void __sched_domains_numa_masks_set(unsigned int node)
+{
+	int i, j;
+
+	/*
+	 * NUMA masks are not built for offline nodes in sched_init_numa().
+	 * Thus, when a CPU of a never-onlined-before node gets plugged in,
+	 * adding that new CPU to the right NUMA masks is not sufficient: the
+	 * masks of that CPU's node must also be updated.
+	 */
+	if (test_bit(node, sched_numa_onlined_nodes))
+		return;
+
+	bitmap_set(sched_numa_onlined_nodes, node, 1);
+
+	for (i = 0; i < sched_domains_numa_levels; i++) {
+		for (j = 0; j < nr_node_ids; j++) {
+			if (!node_online(j) || node == j)
+				continue;
+
+			if (node_distance(j, node) > sched_domains_numa_distance[i])
+				continue;
+
+			/* Add remote nodes in our masks */
+			cpumask_or(sched_domains_numa_masks[i][node],
+				   sched_domains_numa_masks[i][node],
+				   sched_domains_numa_masks[0][j]);
+		}
+	}
+
+	/*
+	 * A new node has been brought up, potentially changing the topology
+	 * classification.
+	 *
+	 * Note that this is racy vs any use of sched_numa_topology_type :/
+	 */
+	init_numa_topology_type();
 }
 
 void sched_domains_numa_masks_set(unsigned int cpu)
@@ -1893,8 +1952,14 @@ void sched_domains_numa_masks_set(unsigned int cpu)
 	int node = cpu_to_node(cpu);
 	int i, j;
 
+	__sched_domains_numa_masks_set(node);
+
 	for (i = 0; i < sched_domains_numa_levels; i++) {
 		for (j = 0; j < nr_node_ids; j++) {
+			if (!node_online(j))
+				continue;
+
+			/* Set ourselves in the remote node's masks */
 			if (node_distance(j, node) <= sched_domains_numa_distance[i])
 				cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
 		}
-- 
2.30.2
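
For readers who want to poke at the idea outside the kernel, here is a rough
userspace sketch of the three steps the changelog describes: skip offline
nodes at init, remember which nodes have ever been online, and build a node's
masks on its first bring-up. It is not the kernel code. The 3-node distance
table, the one-CPU-per-node layout, the plain integer bitmasks and every
sketch_* name are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Everything below is hypothetical: a 3-node toy model, one CPU per node. */
#define NR_NODES  3
#define NR_LEVELS 3

/* Assumed symmetric distance table (10 = local). */
static const int node_dist[NR_NODES][NR_NODES] = {
	{ 10, 20, 40 },
	{ 20, 10, 40 },
	{ 40, 40, 10 },
};
/* Unique distances, ascending; level 0 is the local distance. */
static const int level_dist[NR_LEVELS] = { 10, 20, 40 };

bool node_is_online[NR_NODES];
uint32_t onlined_once;               /* bit n: node n has been online */
uint32_t masks[NR_LEVELS][NR_NODES]; /* bit c: CPU c is in the mask */

/* CPUs of a node; an offline node contributes no CPUs. */
static uint32_t cpus_of_node(int node)
{
	return node_is_online[node] ? UINT32_C(1) << node : 0;
}

/* Init-time build: rows of offline nodes are left empty. */
void sketch_init(const bool *initial_online)
{
	int i, j, k;

	onlined_once = 0;
	for (j = 0; j < NR_NODES; j++) {
		node_is_online[j] = initial_online[j];
		if (node_is_online[j])
			onlined_once |= UINT32_C(1) << j;
	}

	for (i = 0; i < NR_LEVELS; i++) {
		for (j = 0; j < NR_NODES; j++) {
			masks[i][j] = 0;
			if (!node_is_online[j])
				continue;	/* deferred to bring-up */
			for (k = 0; k < NR_NODES; k++) {
				if (node_dist[j][k] <= level_dist[i])
					masks[i][j] |= cpus_of_node(k);
			}
		}
	}
}

/* First bring-up of a node: pull already-online neighbours into its row. */
static void sketch_node_first_online(int node)
{
	int i, j;

	if (onlined_once & (UINT32_C(1) << node))
		return;
	onlined_once |= UINT32_C(1) << node;

	for (i = 0; i < NR_LEVELS; i++) {
		for (j = 0; j < NR_NODES; j++) {
			if (!node_is_online[j] || j == node)
				continue;
			if (node_dist[j][node] <= level_dist[i])
				masks[i][node] |= masks[0][j];
		}
	}
}

/* CPU bring-up; in this toy model CPU c lives on node c. */
void sketch_cpu_online(int cpu)
{
	int node = cpu, i, j;

	assert(cpu >= 0 && cpu < NR_NODES);
	node_is_online[node] = true;
	sketch_node_first_online(node);

	/* Set the new CPU in the masks of every online node in range. */
	for (i = 0; i < NR_LEVELS; i++) {
		for (j = 0; j < NR_NODES; j++) {
			if (!node_is_online[j])
				continue;
			if (node_dist[j][node] <= level_dist[i])
				masks[i][j] |= UINT32_C(1) << cpu;
		}
	}
}
```

With nodes 0 and 1 online at init and node 2 brought up afterwards, the row
for node 2 starts out empty and only gets populated by sketch_cpu_online(2),
which is the deferral the patch is after. A second call for the same CPU is
harmless because the onlined_once bit short-circuits the first-online step.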