From: Keith Busch
To: linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org, linux-mm@kvack.org
Cc: Greg Kroah-Hartman, Rafael Wysocki, Dave Hansen, Dan Williams, Keith Busch
Subject: [PATCHv2 09/12] node: Add memory caching attributes
Date: Mon, 10 Dec 2018 18:03:07 -0700
Message-Id: <20181211010310.8551-10-keith.busch@intel.com>
In-Reply-To: <20181211010310.8551-1-keith.busch@intel.com>
References: <20181211010310.8551-1-keith.busch@intel.com>

System memory may
have side caches to help improve access speed. While the system-provided
cache is transparent to the software accessing these memory ranges,
applications can optimize their own access based on cache attributes.

Provide a new API for the kernel to register these memory side caches
under the memory node that provides them.

The kernel's sysfs representation is modeled on the CPU cacheinfo
attributes, as seen in /sys/devices/system/cpu/cpuX/cache/. Unlike CPU
cacheinfo, though, the node cache level is reported from the view of the
memory: a higher number is nearer to the CPU, while lower levels are
closer to the backing memory. Also unlike CPU caches, the system is
assumed to handle flushing any dirty cached memory to the last level of
memory on a power failure if the range is persistent memory.

The exported attributes are the cache size, the line size, the
associativity, the level, and the write policy.

Signed-off-by: Keith Busch
---
 drivers/base/node.c  | 140 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/node.h |  23 +++++++++
 2 files changed, 163 insertions(+)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 768612c06c56..54184424ca7f 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -141,6 +142,143 @@ void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs)
 		pr_info("failed to add performance attribute group to node %d\n",
 			nid);
 }
+
+struct node_cache_info {
+	struct device dev;
+	struct list_head node;
+	struct node_cache_attrs cache_attrs;
+};
+#define to_cache_info(device) container_of(device, struct node_cache_info, dev)
+
+#define CACHE_ATTR(name, fmt)						\
+static ssize_t name##_show(struct device *dev,				\
+			   struct device_attribute *attr,		\
+			   char *buf)					\
+{									\
+	return sprintf(buf, fmt "\n", to_cache_info(dev)->cache_attrs.name); \
+}									\
+DEVICE_ATTR_RO(name);
+
+CACHE_ATTR(size, "%lld")
+CACHE_ATTR(level, "%d")
+CACHE_ATTR(line_size, "%d")
+CACHE_ATTR(associativity, "%d")
+CACHE_ATTR(write_policy, "%d")
+
+static struct attribute *cache_attrs[] = {
+	&dev_attr_level.attr,
+	&dev_attr_associativity.attr,
+	&dev_attr_size.attr,
+	&dev_attr_line_size.attr,
+	&dev_attr_write_policy.attr,
+	NULL,
+};
+
+const struct attribute_group node_cache_attrs_group = {
+	.attrs = cache_attrs,
+};
+
+const struct attribute_group *node_cache_attrs_groups[] = {
+	&node_cache_attrs_group,
+	NULL,
+};
+
+static void node_release(struct device *dev)
+{
+	kfree(dev);
+}
+
+static void node_cache_release(struct device *dev)
+{
+	struct node_cache_info *info = to_cache_info(dev);
+	kfree(info);
+}
+
+static void node_init_cache_dev(struct node *node)
+{
+	struct device *dev;
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	if (!dev)
+		return;
+
+	dev->parent = &node->dev;
+	dev->release = node_release;
+	dev_set_name(dev, "side_cache");
+
+	if (device_register(dev)) {
+		kfree(dev);
+		return;
+	}
+	pm_runtime_no_callbacks(dev);
+	node->cache_dev = dev;
+}
+
+void node_add_cache(unsigned int nid, struct node_cache_attrs *cache_attrs)
+{
+	struct node_cache_info *info;
+	struct device *dev;
+	struct node *node;
+
+	if (!node_online(nid) || !node_devices[nid])
+		return;
+
+	node = node_devices[nid];
+	list_for_each_entry(info, &node->cache_attrs, node) {
+		if (info->cache_attrs.level == cache_attrs->level) {
+			dev_warn(&node->dev,
+				"attempt to add duplicate cache level:%d\n",
+				cache_attrs->level);
+			return;
+		}
+	}
+
+	if (!node->cache_dev)
+		node_init_cache_dev(node);
+	if (!node->cache_dev)
+		return;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return;
+
+	dev = &info->dev;
+	dev->parent = node->cache_dev;
+	dev->release = node_cache_release;
+	dev->groups = node_cache_attrs_groups;
+	dev_set_name(dev, "index%d", cache_attrs->level);
+	info->cache_attrs = *cache_attrs;
+	if (device_register(dev)) {
+		dev_warn(&node->dev, "failed to add cache level:%d\n",
+			 cache_attrs->level);
+		kfree(info);
+		return;
+	}
+	pm_runtime_no_callbacks(dev);
+	list_add_tail(&info->node, &node->cache_attrs);
+}
+
+static void node_remove_caches(struct node *node)
+{
+	struct node_cache_info *info, *next;
+
+	if (!node->cache_dev)
+		return;
+
+	list_for_each_entry_safe(info, next, &node->cache_attrs, node) {
+		list_del(&info->node);
+		device_unregister(&info->dev);
+	}
+	device_unregister(node->cache_dev);
+}
+
+static void node_init_caches(unsigned int nid)
+{
+	INIT_LIST_HEAD(&node_devices[nid]->cache_attrs);
+}
+#else
+static void node_init_caches(unsigned int nid) { }
+static void node_remove_caches(struct node *node) { }
 #endif
 
 #define K(x) ((x) << (PAGE_SHIFT - 10))
@@ -389,6 +527,7 @@ static void node_device_release(struct device *dev)
 	 */
 	flush_work(&node->node_work);
 #endif
+	node_remove_caches(node);
 	kfree(node);
 }
 
@@ -711,6 +850,7 @@ int __register_one_node(int nid)
 
 	/* initialize work queue for memory hot plug */
 	init_node_hugetlb_work(nid);
+	node_init_caches(nid);
 
 	return error;
 }
diff --git a/include/linux/node.h b/include/linux/node.h
index 71abaf0d4f4b..897e04e99e80 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -36,6 +36,27 @@ struct node_hmem_attrs {
 	unsigned int write_latency;
 };
 void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs);
+
+enum cache_associativity {
+	NODE_CACHE_DIRECT_MAP,
+	NODE_CACHE_INDEXED,
+	NODE_CACHE_OTHER,
+};
+
+enum cache_write_policy {
+	NODE_CACHE_WRITE_BACK,
+	NODE_CACHE_WRITE_THROUGH,
+	NODE_CACHE_WRITE_OTHER,
+};
+
+struct node_cache_attrs {
+	enum cache_associativity associativity;
+	enum cache_write_policy write_policy;
+	u64 size;
+	u16 line_size;
+	u8 level;
+};
+void node_add_cache(unsigned int nid, struct node_cache_attrs *cache_attrs);
 #endif
 
 struct node {
@@ -48,6 +69,8 @@ struct node {
 #endif
 #ifdef CONFIG_HMEM_REPORTING
 	struct node_hmem_attrs hmem_attrs;
+	struct list_head cache_attrs;
+	struct device *cache_dev;
 #endif
 };
 
-- 
2.14.4
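
For anyone wanting to try the new attributes from user space, here is a minimal sketch of how the resulting files might be located and decoded. It is not part of the patch. It assumes node devices appear under /sys/devices/system/node/nodeN, and that the "side_cache" and "indexN" directory names and the enum orderings match this version of the series; all of these could change in later revisions. The helper names (side_cache_attr_known, side_cache_attr_path, side_cache_read_attr and the two *_name functions) are hypothetical and exist only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: attribute names taken from the cache_attrs[] array in this patch. */
static int side_cache_attr_known(const char *attr)
{
	static const char *const names[] = {
		"level", "associativity", "size", "line_size", "write_policy",
	};
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (strcmp(attr, names[i]) == 0)
			return 1;
	return 0;
}

/*
 * Build the sysfs path of one attribute. The layout is an assumption
 * derived from the dev_set_name() calls in this patch.
 * Returns 0 on success, -1 on an unknown attribute or a too-small buffer.
 */
static int side_cache_attr_path(char *buf, size_t len, unsigned int nid,
				unsigned int level, const char *attr)
{
	int n;

	if (!side_cache_attr_known(attr))
		return -1;
	n = snprintf(buf, len,
		     "/sys/devices/system/node/node%u/side_cache/index%u/%s",
		     nid, level, attr);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read one attribute as an integer. Returns 0 on success, -1 on any failure. */
static int side_cache_read_attr(unsigned int nid, unsigned int level,
				const char *attr, long long *val)
{
	char path[128];
	FILE *f;
	int ok;

	if (side_cache_attr_path(path, sizeof(path), nid, level, attr))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%lld", val) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Numeric values assumed to follow enum cache_write_policy as posted here. */
static const char *side_cache_write_policy_name(long long v)
{
	switch (v) {
	case 0: return "write-back";
	case 1: return "write-through";
	default: return "other";
	}
}

/* Numeric values assumed to follow enum cache_associativity as posted here. */
static const char *side_cache_associativity_name(long long v)
{
	switch (v) {
	case 0: return "direct-mapped";
	case 1: return "indexed";
	default: return "other";
	}
}
```

A caller would typically read "write_policy" for a given node and level with side_cache_read_attr() and pass the value to side_cache_write_policy_name(). Because the attributes are exported as raw enum values, any such decoder has to track the kernel's enum ordering by hand.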