Subject: Re: [FIX PATCH 2/2] mm/page_alloc: Use accumulated load when building node fallback list
To: "Ramakrishnan, Krupa", "Rao, Bharata Bhasker", linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, kamezawa.hiroyu@jp.fujitsu.com, lee.schermerhorn@hp.com, mgorman@suse.de, "Srinivasan, Sadagopan"
References: <20210830121603.1081-1-bharata@amd.com> <20210830121603.1081-3-bharata@amd.com> <13dab5ac-03a3-e9b3-ff12-f819f7711569@arm.com>
From: Anshuman Khandual
Date: Fri, 3 Sep 2021 09:31:39 +0530

On 8/31/21 8:56 PM, Ramakrishnan, Krupa wrote:
> [AMD Official Use Only]
>
> The bandwidth is limited by underutilization of the cross-socket links,
> not by the latency. Hotspotting on one node will not engage all hardware
> resources, given our routing protocol, which results in the lower
> bandwidth. Distributing equally across nodes 0 and 1 will yield the best
> results as it stresses the full system capabilities.

Makes sense. Nonetheless, this patch clearly solves a problem.

>
> Thanks
> Krupa Ramakrishnan
>
> -----Original Message-----
> From: Anshuman Khandual
> Sent: 31 August, 2021 4:58
> To: Rao, Bharata Bhasker; linux-mm@kvack.org; linux-kernel@vger.kernel.org
> Cc: akpm@linux-foundation.org; kamezawa.hiroyu@jp.fujitsu.com;
>     lee.schermerhorn@hp.com; mgorman@suse.de; Ramakrishnan, Krupa;
>     Srinivasan, Sadagopan
> Subject: Re: [FIX PATCH 2/2] mm/page_alloc: Use accumulated load when
>     building node fallback list
>
> [CAUTION: External Email]
>
> On 8/30/21 5:46 PM, Bharata B Rao wrote:
>> As an example, consider a 4 node system with the following distance
>> matrix.
>>
>> Node    0    1    2    3
>> -------------------------
>>   0    10   12   32   32
>>   1    12   10   32   32
>>   2    32   32   10   12
>>   3    32   32   12   10
>>
>> For this case, the node fallback list gets built like this:
>>
>> Node  Fallback list
>> ---------------------
>>   0   0 1 2 3
>>   1   1 0 3 2
>>   2   2 3 0 1
>>   3   3 2 0 1  <-- Unexpected fallback order
>>
>> In the fallback list for nodes 2 and 3, the nodes 0 and 1 appear in
>> the same order, which results in more allocations getting satisfied
>> from node 0 compared to node 1.
>>
>> The effect of this on remote memory bandwidth, as seen by the stream
>> benchmark, is shown below:
>>
>> Case 1: Bandwidth from cores on nodes 2 & 3 to memory on nodes 0 & 1
>>         (numactl -m 0,1 ./stream_lowOverhead ... --cores )
>> Case 2: Bandwidth from cores on nodes 0 & 1 to memory on nodes 2 & 3
>>         (numactl -m 2,3 ./stream_lowOverhead ... --cores )
>>
>> ----------------------------------------
>>            BANDWIDTH (MB/s)
>> TEST        Case 1       Case 2
>> ----------------------------------------
>> COPY       57479.6     110791.8
>> SCALE      55372.9     105685.9
>> ADD        50460.6      96734.2
>> TRIADD     50397.6      97119.1
>> ----------------------------------------
>>
>> The bandwidth drop in Case 1 occurs because most of the allocations
>> get satisfied by node 0, as it appears first in the fallback order for
>> both nodes 2 and 3.
>
> I am wondering what causes this performance drop here? Would not the
> memory access latency be similar between {2, 3} ---> { 0 } and
> {2, 3} ---> { 1 }, given that both these nodes {0, 1} are at the same
> distance, i.e. 32, from {2, 3} in the above distance matrix? Even if the
> preferred node order changes from { 0 } to { 1 } for the accessing
> node { 3 }, it should not change the latency as such.
>
> Or is the performance drop here instead caused by excessive allocation
> on node { 0 }, resulting in page allocation latency?
>
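For reference, below is a minimal userspace sketch (not the kernel code
itself), loosely modelled on build_zonelists()/find_next_best_node(): it
greedily picks the nearest unused node and uses node_load as a tie-breaker
between equidistant candidates. The scaling factor and the omitted CPU
penalty are simplifications assumed for illustration; the only point of
interest is assigning node_load (old behaviour) versus accumulating it
(this patch). With the distance matrix quoted above, the assigned variant
reproduces node 3's unexpected "3 2 0 1" order, while the accumulated
variant yields "3 2 1 0".

/*
 * fallback_sketch.c - hypothetical userspace model, not kernel code.
 *
 *   gcc -O2 -o fallback fallback_sketch.c && ./fallback
 */
#include <stdio.h>
#include <stdbool.h>
#include <limits.h>

#define NR_NODES 4

/* Distance matrix from the patch description. */
static const int dist[NR_NODES][NR_NODES] = {
	{ 10, 12, 32, 32 },
	{ 12, 10, 32, 32 },
	{ 32, 32, 10, 12 },
	{ 32, 32, 12, 10 },
};

static int node_load[NR_NODES];

/* Pick the nearest unused node; node_load breaks ties between
 * equidistant candidates, with a slight preference for the next
 * (higher-numbered) node. */
static int next_best_node(int node, bool used[NR_NODES])
{
	int n, val, best = -1, min_val = INT_MAX;

	for (n = 0; n < NR_NODES; n++) {
		if (used[n])
			continue;
		val = (dist[node][n] + (n < node)) * NR_NODES * NR_NODES
		      + node_load[n];
		if (val < min_val) {
			min_val = val;
			best = n;
		}
	}
	if (best >= 0)
		used[best] = true;
	return best;
}

static void build_fallback_lists(bool accumulate)
{
	int n, local;

	for (n = 0; n < NR_NODES; n++)
		node_load[n] = 0;

	printf("node_load %s:\n",
	       accumulate ? "accumulated (fixed)" : "assigned (old)");
	for (local = 0; local < NR_NODES; local++) {
		bool used[NR_NODES] = { false };
		int prev = local, load = NR_NODES, node;

		used[local] = true;
		load--;
		printf("  node %d: %d", local, local);
		while ((node = next_best_node(local, used)) >= 0) {
			/* Penalise the first node of each new distance
			 * group so later nodes round-robin across it. */
			if (dist[local][node] != dist[local][prev]) {
				if (accumulate)
					node_load[node] += load; /* this patch */
				else
					node_load[node] = load;  /* old code */
			}
			printf(" %d", node);
			prev = node;
			load--;
		}
		printf("\n");
	}
}

int main(void)
{
	build_fallback_lists(false); /* node 3 gets 3 2 0 1 */
	build_fallback_lists(true);  /* node 3 gets 3 2 1 0 */
	return 0;
}

Running it prints the same per-node fallback lists as the table quoted
above for the "assigned" case, and the round-robin "3 2 1 0" order for
node 3 once the load penalty is accumulated.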