Date: Fri, 6 Nov 2020 07:55:03 -0800
From: Ben Widawsky
To: "Huang, Ying"
Cc: Mel Gorman, Peter Zijlstra,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton, Ingo Molnar,
    Rik van Riel, Johannes Weiner, "Matthew Wilcox (Oracle)", Dave Hansen,
    Andi Kleen, Michal Hocko, David Rientjes
Subject: Re: [PATCH -V2 2/2] autonuma: Migrate on fault among multiple bound nodes
Message-ID: <20201106155503.nkwuxr5mkneggzl7@intel.com>
In-Reply-To: <87v9ejosec.fsf@yhuang-dev.intel.com>
References: <20201028023411.15045-1-ying.huang@intel.com>
 <20201028023411.15045-3-ying.huang@intel.com>
 <20201102111717.GB3306@suse.de>
 <87eel9wumd.fsf@yhuang-dev.intel.com>
 <20201105112523.GQ3306@suse.de>
 <87v9ejosec.fsf@yhuang-dev.intel.com>

On 20-11-06 15:28:59, Huang, Ying wrote:
> Mel Gorman writes:
>
> > On Wed, Nov 04, 2020 at 01:36:58PM +0800, Huang, Ying wrote:
> >> But from another point of view, I suggest removing the constraints of
> >> MPOL_F_MOF in the future. If the overhead of AutoNUMA isn't acceptable,
> >> why not just disable AutoNUMA globally via the sysctl knob?
> >>
> >
> > Because it's a double-edged sword. NUMA Balancing can make a workload
> > faster while still incurring more overhead than it should -- particularly
> > when threads are involved rescanning the same or unrelated regions.
> > Global disabling should really only happen when the running application
> > is the only application on the machine and has full NUMA awareness.
>
> Got it. So NUMA Balancing may in general benefit some workloads but
> hurt other workloads on the same machine, so we need a method to
> enable/disable NUMA Balancing per workload. Previously, this was
> done via the explicit NUMA policy: if an explicit NUMA policy is
> specified, NUMA Balancing is disabled for the memory region or the
> thread. That disabling can be reverted for a memory region via
> MPOL_MF_LAZY. It appears that we still lack an MPOL_MF_LAZY equivalent
> for the thread.
>
> >> > It might still end up being better but I was not aware of a
> >> > *realistic* workload that binds to multiple nodes
> >> > deliberately. Generally I expect if an application is binding, it's
> >> > binding to one local node.
> >>
> >> Yes. It's not a popular configuration for now. But on a memory
> >> tiering system with both DRAM and PMEM, the DRAM and PMEM in one socket
> >> will become 2 NUMA nodes. To avoid too much cross-socket memory
> >> access, while taking advantage of both the DRAM and the PMEM, the
> >> workload can be bound to 2 NUMA nodes (DRAM and PMEM).
> >>
> >
> > Ok, that may lead to unpredictable performance as it'll have variable
> > performance with limited control of the "important" applications that
> > should use DRAM over PMEM. That's a long road but the step is not
> > incompatible with the long-term goal.
>
> Yes. Ben Widawsky is working on a patchset to make it possible to
> prefer the remote DRAM instead of the local PMEM, as follows:
>
> https://lore.kernel.org/linux-mm/20200630212517.308045-1-ben.widawsky@intel.com/
>
> Best Regards,
> Huang, Ying
>

A rebased version was posted here:
https://lore.kernel.org/linux-mm/20201030190238.306764-1-ben.widawsky@intel.com/

Thanks.
Ben
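
For readers following the thread, a minimal userspace sketch of the two-node
binding Ying describes might look like the snippet below. It is not part of
the patch under discussion, and the node numbers (0 for local DRAM, 2 for
local PMEM) are assumptions for illustration only; the real topology should
be checked with "numactl --hardware".

/*
 * Illustrative only: bind an anonymous region to two NUMA nodes, e.g. the
 * DRAM node and the PMEM node of one socket.  Node numbers are assumed.
 * Build with: gcc bind2nodes.c -lnuma
 */
#include <numaif.h>		/* mbind(), MPOL_BIND, MPOL_MF_MOVE */
#include <sys/mman.h>
#include <stdio.h>

int main(void)
{
	size_t len = 64UL << 20;	/* 64 MiB */
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Assumed topology: node 0 = local DRAM, node 2 = local PMEM. */
	unsigned long nodemask = (1UL << 0) | (1UL << 2);

	/*
	 * MPOL_BIND restricts allocations to the two nodes; MPOL_MF_MOVE
	 * migrates any pages already faulted in.  Whether NUMA balancing
	 * may later migrate hot pages between the two bound nodes is
	 * exactly what this patch changes.
	 */
	if (mbind(buf, len, MPOL_BIND, &nodemask,
		  8 * sizeof(nodemask), MPOL_MF_MOVE)) {
		perror("mbind");
		return 1;
	}

	/* ... hand the region to the worker threads ... */
	munmap(buf, len);
	return 0;
}

The global knob mentioned earlier in the thread is
/proc/sys/kernel/numa_balancing; writing 0 or 1 there toggles NUMA balancing
system-wide, which is the blunt instrument Mel argues against for mixed
workloads.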