From: "Huang, Ying"
To: Ravi Jonnalagadda
Subject: Re: [RFC PATCH 0/2] mm: mempolicy: Multi-tier interleaving
Date: Thu, 28 Sep 2023 14:14:32 +0800
Message-ID: <87v8burfhz.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <20230927095002.10245-1-ravis.opensrc@micron.com>

Hi, Ravi,

Thanks for the patch!

Ravi Jonnalagadda writes:

> From: Ravi Shankar
>
> Hello,
>
> The current interleave policy operates by interleaving page requests
> among nodes defined in the memory policy. To accommodate the
> introduction of memory tiers for various memory types (e.g., DDR, CXL,
> HBM, PMEM, etc.), a mechanism is needed for interleaving page requests
> across these memory types or tiers.

Why do we need to interleave page allocation among memory tiers?  I
think you need to make this more explicit.  I guess it is to increase
the maximum memory bandwidth available to workloads?

> This can be achieved by implementing an interleaving method that
> considers the tier weights.
> The tier weight will determine the proportion of nodes to select from
> those specified in the memory policy.
> A tier weight can be assigned to each memory type within the system.

What is the problem with the original interleaving?  I think you need
to make that explicit too.

> Hasan Al Maruf had put forth a proposal for interleaving between two
> tiers, namely the top tier and the low tier. However, this patch was
> not adopted due to constraints on the number of available tiers.
>
> https://lore.kernel.org/linux-mm/YqD0%2FtzFwXvJ1gK6@cmpxchg.org/T/
>
> New proposed changes:
>
> 1. Introduce a sysfs entry to allow setting the interleave weight for each
>    memory tier.
> 2. Each tier with a default weight of 1, indicating a standard 1:1
>    proportion.
> 3. Distribute the weight of that tier in a uniform manner across all nodes.
> 4. Modifications to the existing interleaving algorithm to support the
>    implementation of multi-tier interleaving based on tier-weights.
>
> This is in line with Huang, Ying's presentation in lpc22, 16th slide in
> https://lpc.events/event/16/contributions/1209/attachments/1042/1995/\
> Live%20In%20a%20World%20With%20Multiple%20Memory%20Types.pdf

Thanks for referring to the original work on this.
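
To make the proposed changes above concrete, here is a minimal
user-space sketch of one way weighted interleaving could behave.  It is
only an illustration under assumptions, not the code in the patch: the
helper names (node_weights, weighted_interleave), the integer split of
a tier's weight across its nodes, and the "weight consecutive pages per
node" round-robin are all hypothetical.  The node and tier layout
(node 0 in tier 4, node 1 in tier 22, weights 85 and 15) is taken from
the usage example in the cover letter.

```python
from collections import Counter
from itertools import islice


def node_weights(tier_weight, tier_nodes):
    """Split each tier's weight evenly across the nodes in that tier.

    Assumption: an integer split, with a floor of 1 so that no node in
    the policy is starved.  The real patch may divide differently.
    """
    weights = {}
    for tier, nodes in tier_nodes.items():
        share = max(1, tier_weight[tier] // len(nodes))
        for node in nodes:
            weights[node] = share
    return weights


def weighted_interleave(weights):
    """Yield node ids forever: weights[node] consecutive pages per node."""
    while True:
        for node in sorted(weights):
            for _ in range(weights[node]):
                yield node


# Hypothetical layout: node 0 is DDR (tier 4), node 1 is CXL (tier 22).
tier_weight = {4: 85, 22: 15}
tier_nodes = {4: [0], 22: [1]}

weights = node_weights(tier_weight, tier_nodes)

# Simulate 1000 page allocations and count how many land on each node.
pages = Counter(islice(weighted_interleave(weights), 1000))
print(pages[0], pages[1])  # prints "850 150"
```

With all tier weights left at the default of 1, the same loop
degenerates to the existing 1:1 round-robin between nodes, which may be
a useful baseline when describing what problem the weights solve.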

> Observed a significant increase (165%) in bandwidth utilization
> with the newly proposed multi-tier interleaving compared to the
> traditional 1:1 interleaving approach between DDR and CXL tier nodes,
> where 85% of the bandwidth is allocated to DDR tier and 15% to CXL
> tier with MLC -w2 option.

It appears that "mlc" isn't open source software.  It would be better
to test with open source software and, even better, with more practical
workloads instead of a memory bandwidth/latency measurement tool.

> Usage Example:
>
> 1. Set weights for DDR (tier4) and CXL (tier22) tiers.
> echo 85 > /sys/devices/virtual/memory_tiering/memory_tier4/interleave_weight
> echo 15 > /sys/devices/virtual/memory_tiering/memory_tier22/interleave_weight
>
> 2. Interleave between DDR (tier4, node-0) and CXL (tier22, node-1) using numactl
> numactl -i0,1 mlc --loaded_latency W2
>
> Srinivasulu Thanneeru (2):
>   memory tier: Introduce sysfs for tier interleave weights.
>   mm: mempolicy: Interleave policy for tiered memory nodes
>
>  include/linux/memory-tiers.h |  27 ++++++++-
>  include/linux/sched.h        |   2 +
>  mm/memory-tiers.c            |  67 +++++++++++++++-------
>  mm/mempolicy.c               | 107 +++++++++++++++++++++++++++++++++--
>  4 files changed, 174 insertions(+), 29 deletions(-)

--
Best Regards,
Huang, Ying