2019-07-19 21:51:41

by Joerg Roedel

Subject: [PATCH 3/3] mm/vmalloc: Sync unmappings in vunmap_page_range()

From: Joerg Roedel <[email protected]>

On x86-32 with PTI enabled, parts of the kernel page-tables
are not shared between processes. This can cause mappings in
the vmalloc/ioremap area to persist in some page-tables
after the region is unmapped and released.

When the region is re-used the processes with the old
mappings do not fault in the new mappings but still access
the old ones.

This causes undefined behavior, in reality often data
corruption, kernel oopses and panics and even spontaneous
reboots.

Fix this problem by actively syncing unmaps in the
vmalloc/ioremap area to all page-tables in the system before
the regions can be re-used.

References: https://bugzilla.suse.com/show_bug.cgi?id=1118689
Reviewed-by: Dave Hansen <[email protected]>
Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <[email protected]>
---
mm/vmalloc.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 4fa8d84599b0..e0fc963acc41 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1258,6 +1258,12 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 	if (unlikely(valist == NULL))
 		return false;

+	/*
+	 * First make sure the mappings are removed from all page-tables
+	 * before they are freed.
+	 */
+	vmalloc_sync_all();
+
 	/*
 	 * TODO: to calculate a flush range without looping.
 	 * The list can be up to lazy_max_pages() elements.
@@ -3038,6 +3044,9 @@ EXPORT_SYMBOL(remap_vmalloc_range);
 /*
  * Implement a stub for vmalloc_sync_all() if the architecture chose not to
  * have one.
+ *
+ * The purpose of this function is to make sure the vmalloc area
+ * mappings are identical in all page-tables in the system.
  */
 void __weak vmalloc_sync_all(void)
 {
--
2.17.1


2019-07-22 08:27:00

by Jörg Rödel

Subject: Re: [PATCH 3/3] mm/vmalloc: Sync unmappings in vunmap_page_range()

Srewed up the subject :(, it needs to be

"mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()"

of course.

On Fri, Jul 19, 2019 at 08:46:52PM +0200, Joerg Roedel wrote:
> From: Joerg Roedel <[email protected]>
>
> On x86-32 with PTI enabled, parts of the kernel page-tables
> are not shared between processes. This can cause mappings in
> the vmalloc/ioremap area to persist in some page-tables
> after the region is unmapped and released.
>
> When the region is re-used the processes with the old
> mappings do not fault in the new mappings but still access
> the old ones.
>
> This causes undefined behavior, in reality often data
> corruption, kernel oopses and panics and even spontaneous
> reboots.
>
> Fix this problem by actively syncing unmaps in the
> vmalloc/ioremap area to all page-tables in the system before
> the regions can be re-used.
>
> References: https://bugzilla.suse.com/show_bug.cgi?id=1118689
> Reviewed-by: Dave Hansen <[email protected]>
> Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
> Signed-off-by: Joerg Roedel <[email protected]>
> ---
> mm/vmalloc.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 4fa8d84599b0..e0fc963acc41 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1258,6 +1258,12 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
>  	if (unlikely(valist == NULL))
>  		return false;
>
> +	/*
> +	 * First make sure the mappings are removed from all page-tables
> +	 * before they are freed.
> +	 */
> +	vmalloc_sync_all();
> +
>  	/*
>  	 * TODO: to calculate a flush range without looping.
>  	 * The list can be up to lazy_max_pages() elements.
> @@ -3038,6 +3044,9 @@ EXPORT_SYMBOL(remap_vmalloc_range);
>  /*
>   * Implement a stub for vmalloc_sync_all() if the architecture chose not to
>   * have one.
> + *
> + * The purpose of this function is to make sure the vmalloc area
> + * mappings are identical in all page-tables in the system.
>   */
>  void __weak vmalloc_sync_all(void)
>  {
> --
> 2.17.1

2019-07-22 08:37:33

by Joerg Roedel

Subject: Re: [PATCH 3/3] mm/vmalloc: Sync unmappings in vunmap_page_range()

On Mon, Jul 22, 2019 at 10:19:32AM +0200, Thomas Gleixner wrote:
> On Mon, 22 Jul 2019, Joerg Roedel wrote:
>
> > Srewed up the subject :(, it needs to be
>
> Un-Srewed it :)

Thanks a lot :)

2019-07-22 09:09:29

by Thomas Gleixner

Subject: Re: [PATCH 3/3] mm/vmalloc: Sync unmappings in vunmap_page_range()

On Mon, 22 Jul 2019, Joerg Roedel wrote:

> Srewed up the subject :(, it needs to be

Un-Srewed it :)
^^^^^^

Subject: [tip:x86/urgent] mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()

Commit-ID: 3f8fd02b1bf1d7ba964485a56f2f4b53ae88c167
Gitweb: https://git.kernel.org/tip/3f8fd02b1bf1d7ba964485a56f2f4b53ae88c167
Author: Joerg Roedel <[email protected]>
AuthorDate: Fri, 19 Jul 2019 20:46:52 +0200
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 22 Jul 2019 10:18:30 +0200

mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()

On x86-32 with PTI enabled, parts of the kernel page-tables are not shared
between processes. This can cause mappings in the vmalloc/ioremap area to
persist in some page-tables after the region is unmapped and released.

When the region is re-used the processes with the old mappings do not fault
in the new mappings but still access the old ones.

This causes undefined behavior, in reality often data corruption, kernel
oopses and panics and even spontaneous reboots.

Fix this problem by actively syncing unmaps in the vmalloc/ioremap area to
all page-tables in the system before the regions can be re-used.

References: https://bugzilla.suse.com/show_bug.cgi?id=1118689
Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

---
mm/vmalloc.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 4fa8d84599b0..e0fc963acc41 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1258,6 +1258,12 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 	if (unlikely(valist == NULL))
 		return false;

+	/*
+	 * First make sure the mappings are removed from all page-tables
+	 * before they are freed.
+	 */
+	vmalloc_sync_all();
+
 	/*
 	 * TODO: to calculate a flush range without looping.
 	 * The list can be up to lazy_max_pages() elements.
@@ -3038,6 +3044,9 @@ EXPORT_SYMBOL(remap_vmalloc_range);
 /*
  * Implement a stub for vmalloc_sync_all() if the architecture chose not to
  * have one.
+ *
+ * The purpose of this function is to make sure the vmalloc area
+ * mappings are identical in all page-tables in the system.
  */
 void __weak vmalloc_sync_all(void)
 {