Date: Tue, 11 Dec 2018 09:55:18 -0700
From: Keith Busch
To: Dan Williams
Cc: Linux Kernel Mailing List, Linux ACPI, Linux MM, Greg KH, "Rafael J.
Wysocki", "Hansen, Dave"
Subject: Re: [PATCHv2 02/12] acpi/hmat: Parse and report heterogeneous memory
Message-ID: <20181211165518.GB8101@localhost.localdomain>
References: <20181211010310.8551-1-keith.busch@intel.com> <20181211010310.8551-3-keith.busch@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Dec 10, 2018 at 10:03:40PM -0800, Dan Williams wrote:
> I have a use case to detect the presence of a memory-side-cache early
> at init time [1]. To me this means that hmat_init() needs to happen as
> a part of acpi_numa_init(). Subsequently I think that also means that
> the sysfs portion needs to be broken out to its own init path that can
> probably run at module_init() priority.
>
> Perhaps we should split this patch set into two? The table parsing
> with an in-kernel user is a bit easier to reason about and can go in
> first. Towards that end can I steal / reflow patches 1 & 2 into the
> memory randomization series? Other ideas how to handle this?
>
> [1]: https://lkml.org/lkml/2018/10/12/309

To that end, will something like the following work for you? This just
needs to happen after patch 1.

---
diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index f5e09c39ff22..03ef3c8ba4ea 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -40,6 +40,8 @@ static int pxm_to_node_map[MAX_PXM_DOMAINS]
 static int node_to_pxm_map[MAX_NUMNODES]
 			= { [0 ... MAX_NUMNODES - 1] = PXM_INVAL };

+static unsigned long node_side_cached[BITS_TO_LONGS(MAX_PXM_DOMAINS)];
+
 unsigned char acpi_srat_revision __initdata;
 int acpi_numa __initdata;

@@ -262,6 +264,7 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
 	u64 start, end;
 	u32 hotpluggable;
 	int node, pxm;
+	bool side_cached;

 	if (srat_disabled())
 		goto out_err;
@@ -308,6 +311,11 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
 		pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
 			(unsigned long long)start, (unsigned long long)end - 1);

+	side_cached = test_bit(pxm, node_side_cached);
+	if (side_cached && memblock_mark_sidecached(start, ma->length))
+		pr_warn("SRAT: Failed to mark side cached range [mem %#010Lx-%#010Lx] in memblock\n",
+			(unsigned long long)start, (unsigned long long)end - 1);
+
 	max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));

 	return 0;
@@ -411,6 +419,19 @@ acpi_parse_memory_affinity(union acpi_subtable_headers * header,
 	return 0;
 }

+static int __init
+acpi_parse_cache(union acpi_subtable_headers *header, const unsigned long end)
+{
+	struct acpi_hmat_cache *cache = (void *)header;
+	u32 attrs;
+
+	attrs = cache->cache_attributes;
+	if (((attrs & ACPI_HMAT_CACHE_ASSOCIATIVITY) >> 8) ==
+						ACPI_HMAT_CA_DIRECT_MAPPED)
+		set_bit(cache->memory_PD, node_side_cached);
+	return 0;
+}
+
 static int __init acpi_parse_srat(struct acpi_table_header *table)
 {
 	struct acpi_table_srat *srat = (struct acpi_table_srat *)table;
@@ -422,6 +443,11 @@ static int __init acpi_parse_srat(struct acpi_table_header *table)
 	return 0;
 }

+static __init int acpi_parse_hmat(struct acpi_table_header *table)
+{
+	return 0;
+}
+
 static int __init
 acpi_table_parse_srat(enum acpi_srat_type id,
 		      acpi_tbl_entry_handler handler, unsigned int max_entries)
@@ -460,6 +486,16 @@ int __init acpi_numa_init(void)
 					sizeof(struct acpi_table_srat),
 					srat_proc, ARRAY_SIZE(srat_proc), 0);

+		if (!acpi_table_parse(ACPI_SIG_HMAT, acpi_parse_hmat)) {
+			struct acpi_subtable_proc hmat_proc;
+
+			memset(&hmat_proc, 0, sizeof(hmat_proc));
+			hmat_proc.handler = acpi_parse_cache;
+			hmat_proc.id = ACPI_HMAT_TYPE_CACHE;
+			acpi_table_parse_entries_array(ACPI_SIG_HMAT,
+						sizeof(struct acpi_table_hmat),
+						&hmat_proc, 1, 0);
+		}
 		cnt = acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
 					    acpi_parse_memory_affinity, 0);
 	}
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index aee299a6aa76..a24c918a4496 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -44,6 +44,7 @@ enum memblock_flags {
 	MEMBLOCK_HOTPLUG	= 0x1,	/* hotpluggable region */
 	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
 	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
+	MEMBLOCK_SIDECACHED	= 0x8,	/* System side caches memory access */
 };

 /**
@@ -130,6 +131,7 @@ int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
 int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
 int memblock_mark_nomap(phys_addr_t base, phys_addr_t size);
 int memblock_clear_nomap(phys_addr_t base, phys_addr_t size);
+int memblock_mark_sidecached(phys_addr_t base, phys_addr_t size);
 enum memblock_flags choose_memblock_flags(void);

 unsigned long memblock_free_all(void);
@@ -227,6 +229,11 @@ static inline bool memblock_is_nomap(struct memblock_region *m)
 	return m->flags & MEMBLOCK_NOMAP;
 }

+static inline bool memblock_is_sidecached(struct memblock_region *m)
+{
+	return m->flags & MEMBLOCK_SIDECACHED;
+}
+
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
 			    unsigned long *end_pfn);
diff --git a/mm/memblock.c b/mm/memblock.c
index 9a2d5ae81ae1..827b709afdcd 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -865,6 +865,11 @@ int __init_memblock memblock_mark_hotplug(phys_addr_t base, phys_addr_t size)
 	return memblock_setclr_flag(base, size, 1, MEMBLOCK_HOTPLUG);
 }

+int __init_memblock memblock_mark_sidecached(phys_addr_t base, phys_addr_t size)
+{
+	return memblock_setclr_flag(base, size, 1, MEMBLOCK_SIDECACHED);
+}
+
 /**
  * memblock_clear_hotplug - Clear flag MEMBLOCK_HOTPLUG for a specified region.
  * @base: the base phys addr of the region
--
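
For anyone who wants to poke at the decode step outside the kernel, here is a
rough userspace sketch of what acpi_parse_cache() and the later test_bit()
lookup do together. Everything prefixed sketch_/SKETCH_ is a hypothetical
stand-in, not a kernel symbol. The mask, shift, direct-mapped value and the
domain limit are assumed values chosen to mirror the patch, so check them
against the ACPICA headers before relying on them. The range check on the
proximity domain is an addition of the sketch; the patch above does not have
one.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; the kernel gets these from ACPICA. */
#define SKETCH_MAX_PXM_DOMAINS   256u
#define SKETCH_CACHE_ASSOC_MASK  0x00000F00u	/* associativity field */
#define SKETCH_CACHE_ASSOC_SHIFT 8
#define SKETCH_CA_DIRECT_MAPPED  1u

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n)	\
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* One bit per proximity domain, like node_side_cached[] in the patch. */
unsigned long sketch_side_cached[SKETCH_BITS_TO_LONGS(SKETCH_MAX_PXM_DOMAINS)];

/* True if the attribute word describes a direct-mapped cache. */
bool sketch_is_direct_mapped(uint32_t attrs)
{
	return ((attrs & SKETCH_CACHE_ASSOC_MASK) >> SKETCH_CACHE_ASSOC_SHIFT) ==
		SKETCH_CA_DIRECT_MAPPED;
}

/*
 * Record a cache entry: set the domain's bit only for direct-mapped caches.
 * Returns true if the bit was set. The range check is extra to the patch.
 */
bool sketch_note_cache(uint32_t memory_pd, uint32_t attrs)
{
	if (memory_pd >= SKETCH_MAX_PXM_DOMAINS)
		return false;
	if (!sketch_is_direct_mapped(attrs))
		return false;
	sketch_side_cached[memory_pd / SKETCH_BITS_PER_LONG] |=
		1UL << (memory_pd % SKETCH_BITS_PER_LONG);
	return true;
}

/* Lookup used when a memory range for this domain is registered later. */
bool sketch_is_side_cached(uint32_t pxm)
{
	if (pxm >= SKETCH_MAX_PXM_DOMAINS)
		return false;
	return (sketch_side_cached[pxm / SKETCH_BITS_PER_LONG] >>
		(pxm % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

Calling sketch_note_cache(3, 0x00000111) sets the bit for domain 3, and a
later sketch_is_side_cached(3) reports it. That is the same ordering the
patch relies on: the HMAT cache entries have to be parsed before the SRAT
memory affinity entries are walked.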