2023-07-22 23:54:12

by Pasha Tatashin

[permalink] [raw]
Subject: [PATCH v2 0/3] page table check warn instead of panic

Changelog:
v2:
- Moved "Check writable zero page in page table check" to be last in
series so not to add more BUG_ON.
- Removed page_table_check=panic as was suggested by Mathew Wilcox
v1:
https://lore.kernel.org/linux-mm/[email protected]/

Page table check when detects errors panics the kernel, instead,
print warnings as it is more useful, and it was agreed the right
behaviour for kernel.

In case when panic is still preferred, there is panic_on_warn sysctl
option.

Pasha Tatashin (2):
mm/page_table_check: Do WARN_ON instead of BUG_ON
doc/vm: add information about page_table_check warn_on behavior

Rick Edgecombe (1):
mm/page_table_check: Check writable zero page in page table check

Documentation/mm/page_table_check.rst | 5 ++--
mm/page_table_check.c | 39 ++++++++++++++++-----------
2 files changed, 27 insertions(+), 17 deletions(-)

--
2.41.0.487.g6d72f3e995-goog



2023-07-22 23:54:31

by Pasha Tatashin

[permalink] [raw]
Subject: [PATCH v2 3/3] mm/page_table_check: Check writable zero page in page table check

From: Rick Edgecombe <[email protected]>

The zero page should remain all zero, so that it can be mapped as
read-only for read faults of memory that should be zeroed. If it is ever
mapped writable to userspace, it could become non-zero and so other apps
would unexpectedly get non-zero data. So the zero page should never be
mapped writable to userspace. Check for this condition in
page_table_check_set().

Signed-off-by: Rick Edgecombe <[email protected]>
Signed-off-by: Pasha Tatashin <[email protected]>
---
mm/page_table_check.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/page_table_check.c b/mm/page_table_check.c
index ad4447e999f8..db1ed36f7203 100644
--- a/mm/page_table_check.c
+++ b/mm/page_table_check.c
@@ -114,6 +114,8 @@ static void page_table_check_set(struct mm_struct *mm, unsigned long addr,
if (!pfn_valid(pfn))
return;

+ PAGE_TABLE_CHECK_WARN(is_zero_pfn(pfn) && rw);
+
page = pfn_to_page(pfn);
page_ext = page_ext_get(page);

--
2.41.0.487.g6d72f3e995-goog


2023-07-22 23:55:14

by Pasha Tatashin

[permalink] [raw]
Subject: [PATCH v2 1/3] mm/page_table_check: Do WARN_ON instead of BUG_ON

Currently, page_table_check when detects errors panics the kernel. Instead,
print a warning as it is more useful compared to unconditionally crashing
the machine.

However, once a warning is detected, the counting of page_table_check
becomes unbalanced, therefore, disable its activity until the next boot.

In case of where machine hardening requires a more secure environment, it
is still possible to crash machine on page_table_check errors via
panic_on_warn sysctl option.

Signed-off-by: Pasha Tatashin <[email protected]>
---
mm/page_table_check.c | 37 ++++++++++++++++++++++---------------
1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/mm/page_table_check.c b/mm/page_table_check.c
index 93ec7690a0d8..ad4447e999f8 100644
--- a/mm/page_table_check.c
+++ b/mm/page_table_check.c
@@ -22,6 +22,12 @@ static bool __page_table_check_enabled __initdata =
DEFINE_STATIC_KEY_TRUE(page_table_check_disabled);
EXPORT_SYMBOL(page_table_check_disabled);

+#define PAGE_TABLE_CHECK_WARN(v) \
+ do { \
+ if (WARN_ON_ONCE(v)) \
+ static_branch_enable(&page_table_check_disabled); \
+ } while (false)
+
static int __init early_page_table_check_param(char *buf)
{
return kstrtobool(buf, &__page_table_check_enabled);
@@ -50,7 +56,8 @@ struct page_ext_operations page_table_check_ops = {

static struct page_table_check *get_page_table_check(struct page_ext *page_ext)
{
- BUG_ON(!page_ext);
+ PAGE_TABLE_CHECK_WARN(!page_ext);
+
return (void *)(page_ext) + page_table_check_ops.offset;
}

@@ -72,18 +79,18 @@ static void page_table_check_clear(struct mm_struct *mm, unsigned long addr,
page = pfn_to_page(pfn);
page_ext = page_ext_get(page);

- BUG_ON(PageSlab(page));
+ PAGE_TABLE_CHECK_WARN(PageSlab(page));
anon = PageAnon(page);

for (i = 0; i < pgcnt; i++) {
struct page_table_check *ptc = get_page_table_check(page_ext);

if (anon) {
- BUG_ON(atomic_read(&ptc->file_map_count));
- BUG_ON(atomic_dec_return(&ptc->anon_map_count) < 0);
+ PAGE_TABLE_CHECK_WARN(atomic_read(&ptc->file_map_count));
+ PAGE_TABLE_CHECK_WARN(atomic_dec_return(&ptc->anon_map_count) < 0);
} else {
- BUG_ON(atomic_read(&ptc->anon_map_count));
- BUG_ON(atomic_dec_return(&ptc->file_map_count) < 0);
+ PAGE_TABLE_CHECK_WARN(atomic_read(&ptc->anon_map_count));
+ PAGE_TABLE_CHECK_WARN(atomic_dec_return(&ptc->file_map_count) < 0);
}
page_ext = page_ext_next(page_ext);
}
@@ -110,18 +117,18 @@ static void page_table_check_set(struct mm_struct *mm, unsigned long addr,
page = pfn_to_page(pfn);
page_ext = page_ext_get(page);

- BUG_ON(PageSlab(page));
+ PAGE_TABLE_CHECK_WARN(PageSlab(page));
anon = PageAnon(page);

for (i = 0; i < pgcnt; i++) {
struct page_table_check *ptc = get_page_table_check(page_ext);

if (anon) {
- BUG_ON(atomic_read(&ptc->file_map_count));
- BUG_ON(atomic_inc_return(&ptc->anon_map_count) > 1 && rw);
+ PAGE_TABLE_CHECK_WARN(atomic_read(&ptc->file_map_count));
+ PAGE_TABLE_CHECK_WARN(atomic_inc_return(&ptc->anon_map_count) > 1 && rw);
} else {
- BUG_ON(atomic_read(&ptc->anon_map_count));
- BUG_ON(atomic_inc_return(&ptc->file_map_count) < 0);
+ PAGE_TABLE_CHECK_WARN(atomic_read(&ptc->anon_map_count));
+ PAGE_TABLE_CHECK_WARN(atomic_inc_return(&ptc->file_map_count) < 0);
}
page_ext = page_ext_next(page_ext);
}
@@ -137,15 +144,15 @@ void __page_table_check_zero(struct page *page, unsigned int order)
struct page_ext *page_ext;
unsigned long i;

- BUG_ON(PageSlab(page));
+ PAGE_TABLE_CHECK_WARN(PageSlab(page));

page_ext = page_ext_get(page);
- BUG_ON(!page_ext);
+ PAGE_TABLE_CHECK_WARN(!page_ext);
for (i = 0; i < (1ul << order); i++) {
struct page_table_check *ptc = get_page_table_check(page_ext);

- BUG_ON(atomic_read(&ptc->anon_map_count));
- BUG_ON(atomic_read(&ptc->file_map_count));
+ PAGE_TABLE_CHECK_WARN(atomic_read(&ptc->anon_map_count));
+ PAGE_TABLE_CHECK_WARN(atomic_read(&ptc->file_map_count));
page_ext = page_ext_next(page_ext);
}
page_ext_put(page_ext);
--
2.41.0.487.g6d72f3e995-goog


2023-07-23 00:25:48

by Pasha Tatashin

[permalink] [raw]
Subject: [PATCH v2 2/3] doc/vm: add information about page_table_check warn_on behavior

The default behavior of page table check was changed from panicking
kernel to printing a warning.

Add a note how to still panic the kernel when error is detected.

Signed-off-by: Pasha Tatashin <[email protected]>
---
Documentation/mm/page_table_check.rst | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/mm/page_table_check.rst b/Documentation/mm/page_table_check.rst
index c12838ce6b8d..f534c80ee9c9 100644
--- a/Documentation/mm/page_table_check.rst
+++ b/Documentation/mm/page_table_check.rst
@@ -14,13 +14,14 @@ Page table check performs extra verifications at the time when new pages become
accessible from the userspace by getting their page table entries (PTEs PMDs
etc.) added into the table.

-In case of detected corruption, the kernel is crashed. There is a small
+In case of detected corruption, a warning is printed. There is a small
performance and memory overhead associated with the page table check. Therefore,
it is disabled by default, but can be optionally enabled on systems where the
extra hardening outweighs the performance costs. Also, because page table check
is synchronous, it can help with debugging double map memory corruption issues,
by crashing kernel at the time wrong mapping occurs instead of later which is
-often the case with memory corruptions bugs.
+often the case with memory corruptions bugs. In order to crash kernel sysctl
+panic_on_warn should be set to 1.

Double mapping detection logic
==============================
--
2.41.0.487.g6d72f3e995-goog


2023-07-23 00:46:55

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] doc/vm: add information about page_table_check warn_on behavior

Hi--

On 7/22/23 16:15, Pasha Tatashin wrote:
> The default behavior of page table check was changed from panicking
> kernel to printing a warning.
>
> Add a note how to still panic the kernel when error is detected.
>
> Signed-off-by: Pasha Tatashin<[email protected]>
> ---
> Documentation/mm/page_table_check.rst | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/mm/page_table_check.rst b/Documentation/mm/page_table_check.rst
> index c12838ce6b8d..f534c80ee9c9 100644
> --- a/Documentation/mm/page_table_check.rst
> +++ b/Documentation/mm/page_table_check.rst
> @@ -14,13 +14,14 @@ Page table check performs extra verifications at the time when new pages become
> accessible from the userspace by getting their page table entries (PTEs PMDs
> etc.) added into the table.
>
> -In case of detected corruption, the kernel is crashed. There is a small
> +In case of detected corruption, a warning is printed. There is a small
> performance and memory overhead associated with the page table check. Therefore,
> it is disabled by default, but can be optionally enabled on systems where the
> extra hardening outweighs the performance costs. Also, because page table check
> is synchronous, it can help with debugging double map memory corruption issues,
> by crashing kernel at the time wrong mapping occurs instead of later which is
> -often the case with memory corruptions bugs.
> +often the case with memory corruptions bugs. In order to crash kernel sysctl
> +panic_on_warn should be set to 1.

Better as:
In order to crash the kernel, the sysctl panic_on_warn should be set
to 1.


2023-07-23 04:04:54

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] mm/page_table_check: Do WARN_ON instead of BUG_ON

On Sat, Jul 22, 2023 at 11:15:06PM +0000, Pasha Tatashin wrote:
> static struct page_table_check *get_page_table_check(struct page_ext *page_ext)
> {
> - BUG_ON(!page_ext);
> + PAGE_TABLE_CHECK_WARN(!page_ext);
> +
> return (void *)(page_ext) + page_table_check_ops.offset;
> }

[...]

> @@ -137,15 +144,15 @@ void __page_table_check_zero(struct page *page, unsigned int order)
> struct page_ext *page_ext;
> unsigned long i;
>
> - BUG_ON(PageSlab(page));
> + PAGE_TABLE_CHECK_WARN(PageSlab(page));
>
> page_ext = page_ext_get(page);
> - BUG_ON(!page_ext);
> + PAGE_TABLE_CHECK_WARN(!page_ext);
> for (i = 0; i < (1ul << order); i++) {
> struct page_table_check *ptc = get_page_table_check(page_ext);

Seems like we're going to warn about !page_ext twice? Or more than
twice -- once per tail page?

But then we'll crash because page_ext was NULL and offset was small?

2023-07-23 04:06:07

by Pasha Tatashin

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] mm/page_table_check: Do WARN_ON instead of BUG_ON

On Sat, Jul 22, 2023 at 9:56 PM Matthew Wilcox <[email protected]> wrote:
>
> On Sat, Jul 22, 2023 at 11:15:06PM +0000, Pasha Tatashin wrote:
> > static struct page_table_check *get_page_table_check(struct page_ext *page_ext)
> > {
> > - BUG_ON(!page_ext);
> > + PAGE_TABLE_CHECK_WARN(!page_ext);
> > +
> > return (void *)(page_ext) + page_table_check_ops.offset;
> > }
>
> [...]
>
> > @@ -137,15 +144,15 @@ void __page_table_check_zero(struct page *page, unsigned int order)
> > struct page_ext *page_ext;
> > unsigned long i;
> >
> > - BUG_ON(PageSlab(page));
> > + PAGE_TABLE_CHECK_WARN(PageSlab(page));
> >
> > page_ext = page_ext_get(page);
> > - BUG_ON(!page_ext);
> > + PAGE_TABLE_CHECK_WARN(!page_ext);
> > for (i = 0; i < (1ul << order); i++) {
> > struct page_table_check *ptc = get_page_table_check(page_ext);
>
> Seems like we're going to warn about !page_ext twice? Or more than
> twice -- once per tail page?
>
> But then we'll crash because page_ext was NULL and offset was small?

Good catch, page_ext should not be NULL, yet I do not want to add
BUG_ON, let me fix this by warning and gracefully returning if
page_ext is NULL

Pasha

2023-07-23 04:33:48

by Pasha Tatashin

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] doc/vm: add information about page_table_check warn_on behavior

On Sat, Jul 22, 2023 at 7:59 PM Randy Dunlap <[email protected]> wrote:
>
> Hi--
>
> On 7/22/23 16:15, Pasha Tatashin wrote:
> > The default behavior of page table check was changed from panicking
> > kernel to printing a warning.
> >
> > Add a note how to still panic the kernel when error is detected.
> >
> > Signed-off-by: Pasha Tatashin<[email protected]>
> > ---
> > Documentation/mm/page_table_check.rst | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/mm/page_table_check.rst b/Documentation/mm/page_table_check.rst
> > index c12838ce6b8d..f534c80ee9c9 100644
> > --- a/Documentation/mm/page_table_check.rst
> > +++ b/Documentation/mm/page_table_check.rst
> > @@ -14,13 +14,14 @@ Page table check performs extra verifications at the time when new pages become
> > accessible from the userspace by getting their page table entries (PTEs PMDs
> > etc.) added into the table.
> >
> > -In case of detected corruption, the kernel is crashed. There is a small
> > +In case of detected corruption, a warning is printed. There is a small
> > performance and memory overhead associated with the page table check. Therefore,
> > it is disabled by default, but can be optionally enabled on systems where the
> > extra hardening outweighs the performance costs. Also, because page table check
> > is synchronous, it can help with debugging double map memory corruption issues,
> > by crashing kernel at the time wrong mapping occurs instead of later which is
> > -often the case with memory corruptions bugs.
> > +often the case with memory corruptions bugs. In order to crash kernel sysctl
> > +panic_on_warn should be set to 1.
>
> Better as:
> In order to crash the kernel, the sysctl panic_on_warn should be set
> to 1.

Will update in the next version.

Thanks,
Pasha

>