From: Daniel Kiper
To: linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org, xen-devel@lists.xenproject.org
Cc: ard.biesheuvel@linaro.org, boris.ostrovsky@oracle.com, bp@alien8.de, corbet@lwn.net, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, eric.snowberg@oracle.com, hpa@zytor.com, jgross@suse.com, konrad.wilk@oracle.com, mingo@redhat.com, ross.philipson@oracle.com, tglx@linutronix.de
Subject: [PATCH v3 3/3] x86/boot: Introduce the setup_indirect
Date: Wed, 9 Oct 2019 12:53:58 +0200
Message-Id: <20191009105358.32256-4-daniel.kiper@oracle.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20191009105358.32256-1-daniel.kiper@oracle.com>
References: <20191009105358.32256-1-daniel.kiper@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The setup_data is a bit awkward to use for extremely large data objects,
both because the setup_data header has to be adjacent to the data object
and because it has a 32-bit length field. However, it is important that
intermediate stages of the boot process have a way to identify which
chunks of memory are occupied by kernel data.

Thus we introduce a uniform way to specify such indirect data: the
setup_indirect struct and the SETUP_INDIRECT type.

Suggested-by: H. Peter Anvin
Signed-off-by: Daniel Kiper
Acked-by: Konrad Rzeszutek Wilk
Reviewed-by: Ross Philipson
---
v3 - suggestions/fixes:
   - add setup_indirect mapping/KASLR avoidance/etc. code
     (suggested by H. Peter Anvin),
   - SETUP_INDIRECT now sets the most significant bit; this way
     it is possible to differentiate regular setup_data and
     setup_indirect objects in the debugfs filesystem.

v2 - suggestions/fixes:
   - add setup_indirect usage example
     (suggested by Eric Snowberg and Ross Philipson).
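
Every call site touched by this patch applies the same rule when it meets
a node: if the outer type is SETUP_INDIRECT and the embedded
setup_indirect type is not SETUP_INDIRECT itself, the payload lives at
setup_indirect.addr/len; otherwise it follows the setup_data header. The
user-space sketch below restates that rule in isolation. It is only an
illustration under stated assumptions, not code from the patch:
struct payload, union indirect_node, is_valid_indirect(),
resolve_payload() and fill_indirect_node() are hypothetical names, the
structs use <stdint.h> types in place of __u32/__u64, and the check of
sd->len against sizeof(struct setup_indirect) is an extra defensive
assumption that the patch itself does not make.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SETUP_NONE	0u
#define SETUP_E820_EXT	1u
#define SETUP_INDIRECT	(1u << 31)

/* User-space mirrors of the layouts described in the patch. */
struct setup_data {
	uint64_t next;
	uint32_t type;
	uint32_t len;
	uint8_t data[];
};

struct setup_indirect {
	uint32_t type;
	uint32_t reserved;	/* Reserved, must be set to zero. */
	uint64_t len;
	uint64_t addr;
};

static_assert(sizeof(struct setup_indirect) == 24, "unexpected padding");

/* Hypothetical helper type: where a node's payload really lives. */
struct payload {
	uint32_t type;
	uint64_t addr;
	uint64_t len;
};

/* Hypothetical storage: one header plus one embedded setup_indirect. */
union indirect_node {
	struct setup_data hdr;
	uint8_t raw[sizeof(struct setup_data) + sizeof(struct setup_indirect)];
};

/*
 * Same test as the hunks use: outer type is SETUP_INDIRECT and the inner
 * type is not SETUP_INDIRECT itself. The sd->len check is an assumption
 * added for this sketch only.
 */
static bool is_valid_indirect(const struct setup_data *sd,
			      struct setup_indirect *out)
{
	if (sd->type != SETUP_INDIRECT || sd->len < sizeof(*out))
		return false;

	memcpy(out, sd->data, sizeof(*out));
	return out->type != SETUP_INDIRECT;
}

/* sd_addr stands in for the physical address of the setup_data node. */
static struct payload resolve_payload(const struct setup_data *sd,
				      uint64_t sd_addr)
{
	struct setup_indirect ind;
	struct payload p;

	if (is_valid_indirect(sd, &ind)) {
		p.type = ind.type;
		p.addr = ind.addr;
		p.len = ind.len;
	} else {
		p.type = sd->type;
		p.addr = sd_addr + sizeof(*sd);
		p.len = sd->len;
	}

	return p;
}

/* Build a SETUP_INDIRECT node describing data of the given type. */
static void fill_indirect_node(union indirect_node *n, uint32_t type,
			       uint64_t addr, uint64_t len)
{
	struct setup_indirect ind = {
		.type = SETUP_INDIRECT | type,
		.reserved = 0,
		.len = len,
		.addr = addr,
	};

	n->hdr.next = 0;
	n->hdr.type = SETUP_INDIRECT;
	n->hdr.len = sizeof(ind);
	memcpy(n->hdr.data, &ind, sizeof(ind));
}
```

Note that a node built with SETUP_NONE as the inner type ends up with an
inner type equal to SETUP_INDIRECT, so resolve_payload() treats it as a
regular setup_data object; this mirrors the restriction on
SETUP_INDIRECT | SETUP_NONE described in the boot.rst hunk.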
---
 Documentation/x86/boot.rst            | 40 +++++++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/kaslr.c      | 12 +++++++++++
 arch/x86/include/uapi/asm/bootparam.h | 16 +++++++++++---
 arch/x86/kernel/e820.c                | 11 ++++++++++
 arch/x86/kernel/kdebugfs.c            | 20 ++++++++++++++----
 arch/x86/kernel/ksysfs.c              | 30 ++++++++++++++++++++------
 arch/x86/kernel/setup.c               |  4 ++++
 arch/x86/mm/ioremap.c                 | 11 ++++++++++
 8 files changed, 130 insertions(+), 14 deletions(-)

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index 4c536bc8816d..d6d03b00b594 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -827,6 +827,46 @@ Protocol:	2.09+
   sure to consider the case where the linked list already contains
   entries.
 
+  The setup_data is a bit awkward to use for extremely large data objects,
+  both because the setup_data header has to be adjacent to the data object
+  and because it has a 32-bit length field. However, it is important that
+  intermediate stages of the boot process have a way to identify which
+  chunks of memory are occupied by kernel data.
+
+  Thus the setup_indirect struct and the SETUP_INDIRECT type were introduced
+  in protocol 2.15.
+
+  struct setup_indirect {
+    __u32 type;
+    __u32 reserved;  /* Reserved, must be set to zero. */
+    __u64 len;
+    __u64 addr;
+  };
+
+  The type member is a SETUP_INDIRECT | SETUP_* type. However, it cannot be
+  SETUP_INDIRECT itself since making the setup_indirect a tree structure
+  could require a lot of stack space in something that needs to parse it
+  and stack space can be limited in boot contexts.
+
+  Let's give an example of how to point to SETUP_E820_EXT data using setup_indirect.
+  In this case setup_data and setup_indirect will look like this:
+
+  struct setup_data {
+    __u64 next = 0 or ;
+    __u32 type = SETUP_INDIRECT;
+    __u32 len = sizeof(setup_indirect);
+    __u8 data[sizeof(setup_indirect)] = struct setup_indirect {
+      __u32 type = SETUP_INDIRECT | SETUP_E820_EXT;
+      __u32 reserved = 0;
+      __u64 len = ;
+      __u64 addr = ;
+    }
+  }
+
+  Note: SETUP_INDIRECT | SETUP_NONE objects cannot be properly distinguished
+  from SETUP_INDIRECT itself. So, objects of this kind cannot be provided
+  by the bootloaders.
+
 ============	============
 Field name:	pref_address
 Type:		read (reloc)
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 2e53c056ba20..bb9bfef174ae 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -459,6 +459,18 @@ static bool mem_avoid_overlap(struct mem_vector *img,
 			is_overlapping = true;
 		}
 
+		if (ptr->type == SETUP_INDIRECT &&
+		    ((struct setup_indirect *)ptr->data)->type != SETUP_INDIRECT) {
+			avoid.start = ((struct setup_indirect *)ptr->data)->addr;
+			avoid.size = ((struct setup_indirect *)ptr->data)->len;
+
+			if (mem_overlaps(img, &avoid) && (avoid.start < earliest)) {
+				*overlap = avoid;
+				earliest = overlap->start;
+				is_overlapping = true;
+			}
+		}
+
 		ptr = (struct setup_data *)(unsigned long)ptr->next;
 	}
 
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index dbb41128e5a0..949066b5398a 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_X86_BOOTPARAM_H
 #define _ASM_X86_BOOTPARAM_H
 
-/* setup_data types */
+/* setup_data/setup_indirect types */
 #define SETUP_NONE			0
 #define SETUP_E820_EXT			1
 #define SETUP_DTB			2
@@ -11,8 +11,10 @@
 #define SETUP_APPLE_PROPERTIES		5
 #define SETUP_JAILHOUSE			6
 
-/* max(SETUP_*) */
-#define SETUP_TYPE_MAX			SETUP_JAILHOUSE
+#define SETUP_INDIRECT			(1<<31)
+
+/* SETUP_INDIRECT | max(SETUP_*) */
+#define SETUP_TYPE_MAX			(SETUP_INDIRECT | SETUP_JAILHOUSE)
 
 /* ram_size flags */
 #define RAMDISK_IMAGE_START_MASK	0x07FF
@@ -52,6 +54,14 @@ struct setup_data {
 	__u8 data[0];
 };
 
+/* extensible setup indirect data node */
+struct setup_indirect {
+	__u32 type;
+	__u32 reserved;  /* Reserved, must be set to zero. */
+	__u64 len;
+	__u64 addr;
+};
+
 struct setup_header {
 	__u8	setup_sects;
 	__u16	root_flags;
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 7da2bcd2b8eb..0bfe9a685b3b 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -999,6 +999,17 @@ void __init e820__reserve_setup_data(void)
 		data = early_memremap(pa_data, sizeof(*data));
 		e820__range_update(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
 		e820__range_update_kexec(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
+
+		if (data->type == SETUP_INDIRECT &&
+		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {
+			e820__range_update(((struct setup_indirect *)data->data)->addr,
+					   ((struct setup_indirect *)data->data)->len,
+					   E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
+			e820__range_update_kexec(((struct setup_indirect *)data->data)->addr,
+						 ((struct setup_indirect *)data->data)->len,
+						 E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
+		}
+
 		pa_data = data->next;
 		early_memunmap(data, sizeof(*data));
 	}
diff --git a/arch/x86/kernel/kdebugfs.c b/arch/x86/kernel/kdebugfs.c
index edaa30b20841..701a98300f86 100644
--- a/arch/x86/kernel/kdebugfs.c
+++ b/arch/x86/kernel/kdebugfs.c
@@ -44,7 +44,11 @@ static ssize_t setup_data_read(struct file *file, char __user *user_buf,
 	if (count > node->len - pos)
 		count = node->len - pos;
 
-	pa = node->paddr + sizeof(struct setup_data) + pos;
+	pa = node->paddr + pos;
+
+	if (!(node->type & SETUP_INDIRECT) || node->type == SETUP_INDIRECT)
+		pa += sizeof(struct setup_data);
+
 	p = memremap(pa, count, MEMREMAP_WB);
 	if (!p)
 		return -ENOMEM;
@@ -108,9 +112,17 @@ static int __init create_setup_data_nodes(struct dentry *parent)
 			goto err_dir;
 		}
 
-		node->paddr = pa_data;
-		node->type = data->type;
-		node->len = data->len;
+		if (data->type == SETUP_INDIRECT &&
+		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {
+			node->paddr = ((struct setup_indirect *)data->data)->addr;
+			node->type = ((struct setup_indirect *)data->data)->type;
+			node->len = ((struct setup_indirect *)data->data)->len;
+		} else {
+			node->paddr = pa_data;
+			node->type = data->type;
+			node->len = data->len;
+		}
+
 		create_setup_data_node(d, no, node);
 		pa_data = data->next;
 
diff --git a/arch/x86/kernel/ksysfs.c b/arch/x86/kernel/ksysfs.c
index 7969da939213..14ef8121aa53 100644
--- a/arch/x86/kernel/ksysfs.c
+++ b/arch/x86/kernel/ksysfs.c
@@ -100,7 +100,11 @@ static int __init get_setup_data_size(int nr, size_t *size)
 		if (!data)
 			return -ENOMEM;
 		if (nr == i) {
-			*size = data->len;
+			if (data->type == SETUP_INDIRECT &&
+			    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT)
+				*size = ((struct setup_indirect *)data->data)->len;
+			else
+				*size = data->len;
 			memunmap(data);
 			return 0;
 		}
@@ -130,7 +134,10 @@ static ssize_t type_show(struct kobject *kobj,
 	if (!data)
 		return -ENOMEM;
 
-	ret = sprintf(buf, "0x%x\n", data->type);
+	if (data->type == SETUP_INDIRECT)
+		ret = sprintf(buf, "0x%x\n", ((struct setup_indirect *)data->data)->type);
+	else
+		ret = sprintf(buf, "0x%x\n", data->type);
 	memunmap(data);
 	return ret;
 }
@@ -142,7 +149,7 @@ static ssize_t setup_data_data_read(struct file *fp,
 				    loff_t off, size_t count)
 {
 	int nr, ret = 0;
-	u64 paddr;
+	u64 paddr, len;
 	struct setup_data *data;
 	void *p;
 
@@ -157,19 +164,28 @@ static ssize_t setup_data_data_read(struct file *fp,
 	if (!data)
 		return -ENOMEM;
 
-	if (off > data->len) {
+	if (data->type == SETUP_INDIRECT &&
+	    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {
+		paddr = ((struct setup_indirect *)data->data)->addr;
+		len = ((struct setup_indirect *)data->data)->len;
+	} else {
+		paddr += sizeof(*data);
+		len = data->len;
+	}
+
+	if (off > len) {
 		ret = -EINVAL;
 		goto out;
 	}
 
-	if (count > data->len - off)
-		count = data->len - off;
+	if (count > len - off)
+		count = len - off;
 
 	if (!count)
 		goto out;
 
 	ret = count;
-	p = memremap(paddr + sizeof(*data), data->len, MEMREMAP_WB);
+	p = memremap(paddr, len, MEMREMAP_WB);
 	if (!p) {
 		ret = -ENOMEM;
 		goto out;
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 77ea96b794bd..4603702dbfc1 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -438,6 +438,10 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 	while (pa_data) {
 		data = early_memremap(pa_data, sizeof(*data));
 		memblock_reserve(pa_data, sizeof(*data) + data->len);
+		if (data->type == SETUP_INDIRECT &&
+		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT)
+			memblock_reserve(((struct setup_indirect *)data->data)->addr,
+					 ((struct setup_indirect *)data->data)->len);
 		pa_data = data->next;
 		early_memunmap(data, sizeof(*data));
 	}
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index a39dcdb5ae34..1ff9c2030b4f 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -626,6 +626,17 @@ static bool memremap_is_setup_data(resource_size_t phys_addr,
 		paddr_next = data->next;
 		len = data->len;
 
+		if ((phys_addr > paddr) && (phys_addr < (paddr + len))) {
+			memunmap(data);
+			return true;
+		}
+
+		if (data->type == SETUP_INDIRECT &&
+		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {
+			paddr = ((struct setup_indirect *)data->data)->addr;
+			len = ((struct setup_indirect *)data->data)->len;
+		}
+
 		memunmap(data);
 
 		if ((phys_addr > paddr) && (phys_addr < (paddr + len)))
-- 
2.11.0