From: Alexandre Chartre
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
    dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
    x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, konrad.wilk@oracle.com, jan.setjeeilers@oracle.com,
    liran.alon@oracle.com, junaids@google.com, graf@amazon.de,
    rppt@linux.vnet.ibm.com, kuzuno@gmail.com, mgross@linux.intel.com,
    alexandre.chartre@oracle.com
Subject: [RFC v4][PATCH part-2 02/13] mm/dpt: Track buffers allocated for a decorated page-table
Date: Mon, 4 May 2020 16:57:59 +0200
Message-Id: <20200504145810.11882-3-alexandre.chartre@oracle.com>
X-Mailer: git-send-email 2.18.2
In-Reply-To: <20200504145810.11882-1-alexandre.chartre@oracle.com>
References: <20200504145810.11882-1-alexandre.chartre@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Add functions to track buffers allocated for a decorated page-table.
A page-table can have direct references to the kernel page table, at
different levels (PGD, P4D, PUD, PMD). When freeing a page-table, we
should make sure that we free parts actually allocated for the decorated
page-table, and not parts of the kernel page table referenced from the
page-table. To do so, we will keep track of buffers when building the
page-table.

Signed-off-by: Alexandre Chartre
---
 arch/x86/include/asm/dpt.h | 21 ++++++++++
 arch/x86/mm/dpt.c          | 82 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)

diff --git a/arch/x86/include/asm/dpt.h b/arch/x86/include/asm/dpt.h
index 1da4d43d5e94..b9cba051ebf2 100644
--- a/arch/x86/include/asm/dpt.h
+++ b/arch/x86/include/asm/dpt.h
@@ -3,9 +3,18 @@
 #define ARCH_X86_MM_DPT_H
 
 #include
+#include
 
 #include
 
+enum page_table_level {
+	PGT_LEVEL_PTE,
+	PGT_LEVEL_PMD,
+	PGT_LEVEL_PUD,
+	PGT_LEVEL_P4D,
+	PGT_LEVEL_PGD
+};
+
 /*
  * A decorated page-table (dpt) encapsulates a native page-table (e.g.
  * a PGD) and maintain additional attributes related to this page-table.
@@ -15,6 +24,18 @@ struct dpt {
 	pgd_t			*pagetable;	/* the actual page-table */
 	unsigned int		alignment;	/* page-table alignment */
 
+	/*
+	 * A page-table can have direct references to another page-table,
+	 * at different levels (PGD, P4D, PUD, PMD). When freeing or
+	 * modifying a page-table, we should make sure that we free/modify
+	 * parts effectively allocated to the actual page-table, and not
+	 * parts of another page-table referenced from this page-table.
+	 *
+	 * To do so, the backend_pages XArray is used to keep track of pages
+	 * used for this page-table.
+	 */
+	struct xarray		backend_pages;		/* page-table pages */
+	unsigned long		backend_pages_count;	/* pages count */
 };
 
 extern struct dpt *dpt_create(unsigned int pgt_alignment);
diff --git a/arch/x86/mm/dpt.c b/arch/x86/mm/dpt.c
index 333e259c5b7f..6df2d4fde8ec 100644
--- a/arch/x86/mm/dpt.c
+++ b/arch/x86/mm/dpt.c
@@ -8,6 +8,80 @@
 
 #include
 
+/*
+ * Get the pointer to the beginning of a page table directory from a page
+ * table directory entry.
+ */
+#define DPT_BACKEND_PAGE_ALIGN(entry)		\
+	((typeof(entry))(((unsigned long)(entry)) & PAGE_MASK))
+
+/*
+ * Pages used to build a page-table are stored in the backend_pages XArray.
+ * Each entry in the array is a bitwise OR of the page address and the page
+ * table level (PTE, PMD, PUD, P4D) this page is used for in the page-table.
+ *
+ * As a page address is aligned with PAGE_SIZE, we have plenty of space
+ * for storing the page table level (which is a value between 0 and 4) in
+ * the low bits of the page address.
+ *
+ */
+
+#define DPT_BACKEND_PAGE_ENTRY(addr, level)	\
+	((typeof(addr))(((unsigned long)(addr)) | ((unsigned long)(level))))
+#define DPT_BACKEND_PAGE_ADDR(entry)		\
+	((void *)(((unsigned long)(entry)) & PAGE_MASK))
+#define DPT_BACKEND_PAGE_LEVEL(entry)		\
+	((enum page_table_level)(((unsigned long)(entry)) & ~PAGE_MASK))
+
+static int dpt_add_backend_page(struct dpt *dpt, void *addr,
+				enum page_table_level level)
+{
+	unsigned long index;
+	void *old_entry;
+
+	if ((!addr) || ((unsigned long)addr) & ~PAGE_MASK)
+		return -EINVAL;
+
+	lockdep_assert_held(&dpt->lock);
+	index = dpt->backend_pages_count;
+
+	old_entry = xa_store(&dpt->backend_pages, index,
+			     DPT_BACKEND_PAGE_ENTRY(addr, level),
+			     GFP_KERNEL);
+	if (xa_is_err(old_entry))
+		return xa_err(old_entry);
+	if (old_entry)
+		return -EBUSY;
+
+	dpt->backend_pages_count++;
+
+	return 0;
+}
+
+/*
+ * Check if an offset in the page-table is valid, i.e. check that the
+ * offset is on a page effectively belonging to the page-table.
+ */
+static bool dpt_valid_offset(struct dpt *dpt, void *offset)
+{
+	unsigned long index;
+	void *addr, *entry;
+	bool valid;
+
+	addr = DPT_BACKEND_PAGE_ALIGN(offset);
+	valid = false;
+
+	lockdep_assert_held(&dpt->lock);
+	xa_for_each(&dpt->backend_pages, index, entry) {
+		if (DPT_BACKEND_PAGE_ADDR(entry) == addr) {
+			valid = true;
+			break;
+		}
+	}
+
+	return valid;
+}
+
 /*
  * dpt_create - allocate a page-table and create a corresponding
  * decorated page-table. The page-table is allocated and aligned
@@ -41,6 +115,7 @@ struct dpt *dpt_create(unsigned int pgt_alignment)
 
 	dpt->alignment = pgt_alignment;
 	spin_lock_init(&dpt->lock);
+	xa_init(&dpt->backend_pages);
 
 	return dpt;
 }
@@ -50,10 +125,17 @@ void dpt_destroy(struct dpt *dpt)
 {
 	unsigned int pgt_alignment;
 	unsigned int alloc_order;
+	unsigned long index;
+	void *entry;
 
 	if (!dpt)
 		return;
 
+	if (dpt->backend_pages_count) {
+		xa_for_each(&dpt->backend_pages, index, entry)
+			free_page((unsigned long)DPT_BACKEND_PAGE_ADDR(entry));
+	}
+
 	if (dpt->pagetable) {
 		pgt_alignment = dpt->alignment;
 		alloc_order = round_up(PAGE_SIZE + pgt_alignment,
-- 
2.18.2
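
For readers who want to try the entry encoding outside the kernel, here
is a minimal userspace sketch of the idea behind DPT_BACKEND_PAGE_ENTRY,
DPT_BACKEND_PAGE_ADDR and DPT_BACKEND_PAGE_LEVEL, together with the
ownership check done by dpt_valid_offset(). It is not the patch's code.
It assumes 4 KiB pages, and every name in it other than the enum
(SKETCH_PAGE_SIZE, SKETCH_PAGE_MASK, struct tracker, entry_make,
entry_addr, entry_level, tracker_add, tracker_owns) is hypothetical. A
fixed-size array stands in for the XArray and there is no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

enum page_table_level {
	PGT_LEVEL_PTE,
	PGT_LEVEL_PMD,
	PGT_LEVEL_PUD,
	PGT_LEVEL_P4D,
	PGT_LEVEL_PGD
};

/* The largest level must fit in the bits below the page alignment. */
static_assert(PGT_LEVEL_PGD < SKETCH_PAGE_SIZE, "level must fit in low bits");

/* Pack a page-aligned address and a level into a single word. */
static uintptr_t entry_make(void *addr, enum page_table_level level)
{
	return (uintptr_t)addr | (uintptr_t)level;
}

/* Recover the page address by clearing the low bits. */
static void *entry_addr(uintptr_t entry)
{
	return (void *)(entry & SKETCH_PAGE_MASK);
}

/* Recover the level by keeping only the low bits. */
static enum page_table_level entry_level(uintptr_t entry)
{
	return (enum page_table_level)(entry & ~SKETCH_PAGE_MASK);
}

/* Hypothetical stand-in for the backend_pages XArray. */
#define MAX_TRACKED 8

struct tracker {
	uintptr_t entries[MAX_TRACKED];
	unsigned long count;
};

/* Record a page; reject NULL, unaligned addresses and a full tracker. */
static int tracker_add(struct tracker *t, void *addr,
		       enum page_table_level level)
{
	if (!addr || ((uintptr_t)addr & ~SKETCH_PAGE_MASK))
		return -1;
	if (t->count >= MAX_TRACKED)
		return -1;

	t->entries[t->count++] = entry_make(addr, level);
	return 0;
}

/* Return 1 if ptr lies on a tracked page, 0 otherwise. */
static int tracker_owns(const struct tracker *t, const void *ptr)
{
	uintptr_t page = (uintptr_t)ptr & SKETCH_PAGE_MASK;
	unsigned long i;

	for (i = 0; i < t->count; i++) {
		if ((uintptr_t)entry_addr(t->entries[i]) == page)
			return 1;
	}
	return 0;
}
```

A caller would allocate a page-aligned buffer (for instance with
aligned_alloc(SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE)), pass it to
tracker_add() with its level, and later use tracker_owns() to decide
whether a pointer belongs to a tracked page before freeing or modifying
it. As in the patch, the lookup is a linear scan over all tracked pages.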