2022-01-24 19:48:48

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 00/39] kasan, vmalloc, arm64: add vmalloc tagging support for SW/HW_TAGS

From: Andrey Konovalov <[email protected]>

Hi,

This patchset adds vmalloc tagging support for SW_TAGS and HW_TAGS
KASAN modes.

The tree with patches is available here:

https://github.com/xairy/linux/tree/up-kasan-vmalloc-tags-v6

About half of the patches are cleanups I went for along the way. None of
them seem to be important enough to go through stable, so I decided
not to split them out into separate patches/series.

The patchset is partially based on an early version of the HW_TAGS
patchset by Vincenzo that had vmalloc support. Thus, I added a
Co-developed-by tag to a few patches.

SW_TAGS vmalloc tagging support is straightforward. It reuses all of
the generic KASAN machinery, but uses shadow memory to store tags
instead of magic values. Naturally, vmalloc tagging requires adding
a few kasan_reset_tag() annotations to the vmalloc code.
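
For illustration, the core of the SW_TAGS change boils down to the
unpoisoning hook assigning a random tag, marking the shadow of the
mapping with it, and returning the tagged pointer. This is a condensed
sketch of the SW_TAGS patch later in the series, not the complete code:

	void *__kasan_unpoison_vmalloc(const void *start, unsigned long size)
	{
		if (!is_vmalloc_or_module_addr(start))
			return (void *)start;

		/* Assign a random tag and embed it into the returned pointer. */
		start = set_tag(start, kasan_random_tag());
		/* Mark the shadow of [start, start + size) with that tag. */
		kasan_unpoison(start, size, false);
		return (void *)start;
	}

vmalloc() then hands the tagged pointer out to callers, so accesses
through it get checked against the shadow the same way as for slab and
page_alloc memory.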

HW_TAGS vmalloc tagging support stands out. HW_TAGS KASAN is based on
Arm MTE, which can only assign tags to physical memory. As a result,
HW_TAGS KASAN only tags vmalloc() allocations, which are backed by
page_alloc memory. It ignores vmap() and others.
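
Concretely, the resulting policy can be sketched like this (a rough
condensation for illustration only; in the actual patches the checks are
spread across __vmalloc_node_range() and the KASAN hooks, and the exact
names and placement differ):

	/*
	 * Sketch: only normal, non-executable vmalloc() (VM_ALLOC) mappings
	 * get tagged, since only those are backed by page_alloc memory that
	 * MTE can tag. vmap() and executable mappings keep the default tag.
	 */
	if (kasan_hw_tags_enabled() && (area->flags & VM_ALLOC) &&
	    pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
		area->addr = kasan_unpoison_vmalloc(area->addr, size);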

Thanks!

Changes in v5->v6:
- Rebased onto mainline/5.17-rc1.
- Drop unnecessary explicit checks for software KASAN modes from
should_skip_init().

Changes in v4->v5:
- Rebase onto fresh mm.
- Mention optimization intention in the comment for __GFP_ZEROTAGS.
- Replace "kasan: simplify kasan_init_hw_tags" with "kasan: clean up
feature flags for HW_TAGS mode".
- Use true as kasan_flag_vmalloc static key default.
- Cosmetic changes to __def_gfpflag_names_kasan and __GFP_BITS_SHIFT.

Changes in v3->v4:
- Rebase onto fresh mm.
- Rename KASAN_VMALLOC_NOEXEC to KASAN_VMALLOC_PROT_NORMAL.
- Compare prot with PAGE_KERNEL instead of using pgprot_nx() to
identify normal non-executable mappings.
- Rename arch_vmalloc_pgprot_modify() to arch_vmap_pgprot_tagged().
- Move checks from arch_vmap_pgprot_tagged() to __vmalloc_node_range()
as the same condition is used for other things in subsequent patches.
- Use proper kasan_hw_tags_enabled() checks instead of
IS_ENABLED(CONFIG_KASAN_HW_TAGS).
- Set __GFP_SKIP_KASAN_UNPOISON and __GFP_SKIP_ZERO flags instead of
resetting.
- Only define KASAN GFP flags when HW_TAGS KASAN is enabled.
- Move setting KASAN GFP flags to __vmalloc_node_range() and do it
only for normal non-executable mappings when HW_TAGS KASAN is enabled.
- Add new GFP flags to include/trace/events/mmflags.h.
- Don't forget to save tagged addr to vm_struct->addr for VM_ALLOC
so that find_vm_area(addr)->addr == addr for vmalloc().
- Reset pointer tag in change_memory_common().
- Add test checks for set_memory_*() on vmalloc() allocations.
- Minor patch descriptions and comments fixes.

Changes in v2->v3:
- Rebase onto mm.
- New patch: "kasan, arm64: reset pointer tags of vmapped stacks".
- New patch: "kasan, vmalloc: don't tag executable vmalloc allocations".
- New patch: "kasan, arm64: don't tag executable vmalloc allocations".
- Allowing enabling KASAN_VMALLOC with SW/HW_TAGS is moved to
"kasan: allow enabling KASAN_VMALLOC and SW/HW_TAGS", as this can only
be done once executable allocations are no longer tagged.
- Minor fixes, see patches for lists of changes.

Changes in v1->v2:
- Move memory init for vmalloc() into vmalloc code for HW_TAGS KASAN.
- Minor fixes and code reshuffling, see patches for lists of changes.

Acked-by: Marco Elver <[email protected]>

Andrey Konovalov (39):
kasan, page_alloc: deduplicate should_skip_kasan_poison
kasan, page_alloc: move tag_clear_highpage out of
kernel_init_free_pages
kasan, page_alloc: merge kasan_free_pages into free_pages_prepare
kasan, page_alloc: simplify kasan_poison_pages call site
kasan, page_alloc: init memory of skipped pages on free
kasan: drop skip_kasan_poison variable in free_pages_prepare
mm: clarify __GFP_ZEROTAGS comment
kasan: only apply __GFP_ZEROTAGS when memory is zeroed
kasan, page_alloc: refactor init checks in post_alloc_hook
kasan, page_alloc: merge kasan_alloc_pages into post_alloc_hook
kasan, page_alloc: combine tag_clear_highpage calls in post_alloc_hook
kasan, page_alloc: move SetPageSkipKASanPoison in post_alloc_hook
kasan, page_alloc: move kernel_init_free_pages in post_alloc_hook
kasan, page_alloc: rework kasan_unpoison_pages call site
kasan: clean up metadata byte definitions
kasan: define KASAN_VMALLOC_INVALID for SW_TAGS
kasan, x86, arm64, s390: rename functions for modules shadow
kasan, vmalloc: drop outdated VM_KASAN comment
kasan: reorder vmalloc hooks
kasan: add wrappers for vmalloc hooks
kasan, vmalloc: reset tags in vmalloc functions
kasan, fork: reset pointer tags of vmapped stacks
kasan, arm64: reset pointer tags of vmapped stacks
kasan, vmalloc: add vmalloc tagging for SW_TAGS
kasan, vmalloc, arm64: mark vmalloc mappings as pgprot_tagged
kasan, vmalloc: unpoison VM_ALLOC pages after mapping
kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS
kasan, page_alloc: allow skipping unpoisoning for HW_TAGS
kasan, page_alloc: allow skipping memory init for HW_TAGS
kasan, vmalloc: add vmalloc tagging for HW_TAGS
kasan, vmalloc: only tag normal vmalloc allocations
kasan, arm64: don't tag executable vmalloc allocations
kasan: mark kasan_arg_stacktrace as __initdata
kasan: clean up feature flags for HW_TAGS mode
kasan: add kasan.vmalloc command line flag
kasan: allow enabling KASAN_VMALLOC and SW/HW_TAGS
arm64: select KASAN_VMALLOC for SW/HW_TAGS modes
kasan: documentation updates
kasan: improve vmalloc tests

Documentation/dev-tools/kasan.rst | 17 ++-
arch/arm64/Kconfig | 2 +-
arch/arm64/include/asm/vmalloc.h | 6 +
arch/arm64/include/asm/vmap_stack.h | 5 +-
arch/arm64/kernel/module.c | 5 +-
arch/arm64/mm/pageattr.c | 2 +-
arch/arm64/net/bpf_jit_comp.c | 3 +-
arch/s390/kernel/module.c | 2 +-
arch/x86/kernel/module.c | 2 +-
include/linux/gfp.h | 35 +++--
include/linux/kasan.h | 97 +++++++++-----
include/linux/vmalloc.h | 18 +--
include/trace/events/mmflags.h | 14 +-
kernel/fork.c | 1 +
kernel/scs.c | 4 +-
lib/Kconfig.kasan | 20 +--
lib/test_kasan.c | 189 ++++++++++++++++++++++++++-
mm/kasan/common.c | 4 +-
mm/kasan/hw_tags.c | 193 ++++++++++++++++++++++------
mm/kasan/kasan.h | 18 ++-
mm/kasan/shadow.c | 63 +++++----
mm/page_alloc.c | 152 +++++++++++++++-------
mm/vmalloc.c | 99 +++++++++++---
23 files changed, 731 insertions(+), 220 deletions(-)

--
2.25.1


2022-01-24 19:48:56

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 06/39] kasan: drop skip_kasan_poison variable in free_pages_prepare

From: Andrey Konovalov <[email protected]>

skip_kasan_poison is only used in a single place.
Call should_skip_kasan_poison() directly for simplicity.

Signed-off-by: Andrey Konovalov <[email protected]>
Suggested-by: Marco Elver <[email protected]>

---

Changes v1->v2:
- Add this patch.
---
mm/page_alloc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f994fd68e3b1..8481420d2502 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1301,7 +1301,6 @@ static __always_inline bool free_pages_prepare(struct page *page,
unsigned int order, bool check_free, fpi_t fpi_flags)
{
int bad = 0;
- bool skip_kasan_poison = should_skip_kasan_poison(page, fpi_flags);
bool init = want_init_on_free();

VM_BUG_ON_PAGE(PageTail(page), page);
@@ -1375,7 +1374,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
* With hardware tag-based KASAN, memory tags must be set before the
* page becomes unavailable via debug_pagealloc or arch_free_page.
*/
- if (!skip_kasan_poison) {
+ if (!should_skip_kasan_poison(page, fpi_flags)) {
kasan_poison_pages(page, order, init);

/* Memory is already initialized if KASAN did it internally. */
--
2.25.1

2022-01-24 19:48:58

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 07/39] mm: clarify __GFP_ZEROTAGS comment

From: Andrey Konovalov <[email protected]>

__GFP_ZEROTAGS is intended as an optimization: if memory is zeroed during
allocation, it's possible to set memory tags at the same time with little
performance impact.

Clarify this intention of __GFP_ZEROTAGS in the comment.
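
For illustration, a hypothetical caller (not code from this patch) would
combine the flags as follows; as the next patch enforces, __GFP_ZEROTAGS
is meant to have no effect unless the memory itself is also being zeroed
(via __GFP_ZERO or init_on_alloc):

	/* Hypothetical example: zero the memory and its tags in one pass. */
	struct page *page = alloc_pages(GFP_KERNEL | __GFP_ZERO | __GFP_ZEROTAGS,
					order);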

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v4->v5:
- Mention optimization intention in the comment.
---
include/linux/gfp.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 80f63c862be5..581a1f47b8a2 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -232,8 +232,10 @@ struct vm_area_struct;
*
* %__GFP_ZERO returns a zeroed page on success.
*
- * %__GFP_ZEROTAGS returns a page with zeroed memory tags on success, if
- * __GFP_ZERO is set.
+ * %__GFP_ZEROTAGS zeroes memory tags at allocation time if the memory itself
+ * is being zeroed (either via __GFP_ZERO or via init_on_alloc). This flag is
+ * intended for optimization: setting memory tags at the same time as zeroing
+ * memory has minimal additional performance impact.
*
* %__GFP_SKIP_KASAN_POISON returns a page which does not need to be poisoned
* on deallocation. Typically used for userspace pages. Currently only has an
--
2.25.1

2022-01-24 19:48:59

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 08/39] kasan: only apply __GFP_ZEROTAGS when memory is zeroed

From: Andrey Konovalov <[email protected]>

__GFP_ZEROTAGS should only be effective if memory is being zeroed.
Currently, hardware tag-based KASAN violates this requirement.

Fix by including an initialization check along with checking for
__GFP_ZEROTAGS.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>
---
mm/kasan/hw_tags.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 0b8225add2e4..c643740b8599 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -199,11 +199,12 @@ void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
* page_alloc.c.
*/
bool init = !want_init_on_free() && want_init_on_alloc(flags);
+ bool init_tags = init && (flags & __GFP_ZEROTAGS);

if (flags & __GFP_SKIP_KASAN_POISON)
SetPageSkipKASanPoison(page);

- if (flags & __GFP_ZEROTAGS) {
+ if (init_tags) {
int i;

for (i = 0; i != 1 << order; ++i)
--
2.25.1

2022-01-24 19:49:00

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 11/39] kasan, page_alloc: combine tag_clear_highpage calls in post_alloc_hook

From: Andrey Konovalov <[email protected]>

Move tag_clear_highpage() loops out of the kasan_has_integrated_init()
clause as a code simplification.

This patch makes no functional changes.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>

---

Changes v2->v3:
- Update patch description.
---
mm/page_alloc.c | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index abed862d889d..b3959327e06c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2419,30 +2419,30 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 * KASAN unpoisoning and memory initialization code must be
* kept together to avoid discrepancies in behavior.
*/
+
+ /*
+ * If memory tags should be zeroed (which happens only when memory
+ * should be initialized as well).
+ */
+ if (init_tags) {
+ int i;
+
+ /* Initialize both memory and tags. */
+ for (i = 0; i != 1 << order; ++i)
+ tag_clear_highpage(page + i);
+
+ /* Note that memory is already initialized by the loop above. */
+ init = false;
+ }
if (kasan_has_integrated_init()) {
if (gfp_flags & __GFP_SKIP_KASAN_POISON)
SetPageSkipKASanPoison(page);

- if (init_tags) {
- int i;
-
- for (i = 0; i != 1 << order; ++i)
- tag_clear_highpage(page + i);
- } else {
+ if (!init_tags)
kasan_unpoison_pages(page, order, init);
- }
} else {
kasan_unpoison_pages(page, order, init);

- if (init_tags) {
- int i;
-
- for (i = 0; i < 1 << order; i++)
- tag_clear_highpage(page + i);
-
- init = false;
- }
-
if (init)
kernel_init_free_pages(page, 1 << order);
}
--
2.25.1

2022-01-24 19:49:01

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 09/39] kasan, page_alloc: refactor init checks in post_alloc_hook

From: Andrey Konovalov <[email protected]>

Separate code for zeroing memory from the code clearing tags in
post_alloc_hook().

This patch is not useful by itself but makes the simplifications in
the following patches easier to follow.

This patch makes no functional changes.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>

---

Changes v2->v3:
- Update patch description.
---
mm/page_alloc.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8481420d2502..868480d463c7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2420,19 +2420,21 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
kasan_alloc_pages(page, order, gfp_flags);
} else {
bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags);
+ bool init_tags = init && (gfp_flags & __GFP_ZEROTAGS);

kasan_unpoison_pages(page, order, init);

- if (init) {
- if (gfp_flags & __GFP_ZEROTAGS) {
- int i;
+ if (init_tags) {
+ int i;

- for (i = 0; i < 1 << order; i++)
- tag_clear_highpage(page + i);
- } else {
- kernel_init_free_pages(page, 1 << order);
- }
+ for (i = 0; i < 1 << order; i++)
+ tag_clear_highpage(page + i);
+
+ init = false;
}
+
+ if (init)
+ kernel_init_free_pages(page, 1 << order);
}

set_page_owner(page, order, gfp_flags);
--
2.25.1

2022-01-24 19:49:10

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 02/39] kasan, page_alloc: move tag_clear_highpage out of kernel_init_free_pages

From: Andrey Konovalov <[email protected]>

Currently, kernel_init_free_pages() serves two purposes: it either only
zeroes memory or zeroes both memory and memory tags via a different
code path. As this function has only two callers, each using only one
code path, this behaviour is confusing.

Pull the code that zeroes both memory and tags out of
kernel_init_free_pages().

As a result of this change, the code in free_pages_prepare() starts to
look complicated, but this is improved in the next few patches.
Those improvements are not integrated into this patch to make diffs
easier to read.

This patch makes no functional changes.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>

---

Changes v2->v3:
- Update patch description.
---
mm/page_alloc.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 25d4f9ad3525..012170b1c47a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1282,16 +1282,10 @@ static inline bool should_skip_kasan_poison(struct page *page, fpi_t fpi_flags)
PageSkipKASanPoison(page);
}

-static void kernel_init_free_pages(struct page *page, int numpages, bool zero_tags)
+static void kernel_init_free_pages(struct page *page, int numpages)
{
int i;

- if (zero_tags) {
- for (i = 0; i < numpages; i++)
- tag_clear_highpage(page + i);
- return;
- }
-
/* s390's use of memset() could override KASAN redzones. */
kasan_disable_current();
for (i = 0; i < numpages; i++) {
@@ -1387,7 +1381,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
bool init = want_init_on_free();

if (init)
- kernel_init_free_pages(page, 1 << order, false);
+ kernel_init_free_pages(page, 1 << order);
if (!skip_kasan_poison)
kasan_poison_pages(page, order, init);
}
@@ -2430,9 +2424,17 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags);

kasan_unpoison_pages(page, order, init);
- if (init)
- kernel_init_free_pages(page, 1 << order,
- gfp_flags & __GFP_ZEROTAGS);
+
+ if (init) {
+ if (gfp_flags & __GFP_ZEROTAGS) {
+ int i;
+
+ for (i = 0; i < 1 << order; i++)
+ tag_clear_highpage(page + i);
+ } else {
+ kernel_init_free_pages(page, 1 << order);
+ }
+ }
}

set_page_owner(page, order, gfp_flags);
--
2.25.1

2022-01-24 19:49:11

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 12/39] kasan, page_alloc: move SetPageSkipKASanPoison in post_alloc_hook

From: Andrey Konovalov <[email protected]>

Pull the SetPageSkipKASanPoison() call in post_alloc_hook() out of the
big if clause for better code readability. This also allows for more
simplifications in the following patches.

Also turn the kasan_has_integrated_init() check into the proper
kasan_hw_tags_enabled() one. These checks evaluate to the same value,
but logically skipping kasan poisoning has nothing to do with
integrated init.

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v3->v4:
- Use proper kasan_hw_tags_enabled() check instead of
IS_ENABLED(CONFIG_KASAN_HW_TAGS).
---
mm/page_alloc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b3959327e06c..c51d637cdab3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2435,9 +2435,6 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
init = false;
}
if (kasan_has_integrated_init()) {
- if (gfp_flags & __GFP_SKIP_KASAN_POISON)
- SetPageSkipKASanPoison(page);
-
if (!init_tags)
kasan_unpoison_pages(page, order, init);
} else {
@@ -2446,6 +2443,9 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
if (init)
kernel_init_free_pages(page, 1 << order);
}
+ /* Propagate __GFP_SKIP_KASAN_POISON to page flags. */
+ if (kasan_hw_tags_enabled() && (gfp_flags & __GFP_SKIP_KASAN_POISON))
+ SetPageSkipKASanPoison(page);

set_page_owner(page, order, gfp_flags);
page_table_check_alloc(page, order);
--
2.25.1

2022-01-24 19:49:12

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 14/39] kasan, page_alloc: rework kasan_unpoison_pages call site

From: Andrey Konovalov <[email protected]>

Rework the checks around kasan_unpoison_pages() call in
post_alloc_hook().

The logical condition for calling this function is:

- If a software KASAN mode is enabled, we need to mark shadow memory.
- Otherwise, HW_TAGS KASAN is enabled, and it only makes sense to
set tags if they haven't already been cleared by tag_clear_highpage(),
which is indicated by init_tags.

This patch concludes the changes for post_alloc_hook().

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v3->v4:
- Make the condition checks more explicit.
- Update patch description.
---
mm/page_alloc.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2784bd478942..3af38e323391 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2434,15 +2434,20 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
/* Note that memory is already initialized by the loop above. */
init = false;
}
- if (kasan_has_integrated_init()) {
- if (!init_tags) {
- kasan_unpoison_pages(page, order, init);
+ /*
+ * If either a software KASAN mode is enabled, or,
+ * in the case of hardware tag-based KASAN,
+ * if memory tags have not been cleared via tag_clear_highpage().
+ */
+ if (IS_ENABLED(CONFIG_KASAN_GENERIC) ||
+ IS_ENABLED(CONFIG_KASAN_SW_TAGS) ||
+ kasan_hw_tags_enabled() && !init_tags) {
+ /* Mark shadow memory or set memory tags. */
+ kasan_unpoison_pages(page, order, init);

- /* Note that memory is already initialized by KASAN. */
+ /* Note that memory is already initialized by KASAN. */
+ if (kasan_has_integrated_init())
init = false;
- }
- } else {
- kasan_unpoison_pages(page, order, init);
}
/* If memory is still not initialized, do it now. */
if (init)
--
2.25.1

2022-01-24 19:49:14

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 04/39] kasan, page_alloc: simplify kasan_poison_pages call site

From: Andrey Konovalov <[email protected]>

Simplify the code around calling kasan_poison_pages() in
free_pages_prepare().

This patch makes no functional changes.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>

---

Changes v1->v2:
- Don't reorder kasan_poison_pages() and free_pages_prepare().
---
mm/page_alloc.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e5f95c6ab0ac..60bc838a4d85 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1302,6 +1302,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
{
int bad = 0;
bool skip_kasan_poison = should_skip_kasan_poison(page, fpi_flags);
+ bool init = want_init_on_free();

VM_BUG_ON_PAGE(PageTail(page), page);

@@ -1374,19 +1375,10 @@ static __always_inline bool free_pages_prepare(struct page *page,
* With hardware tag-based KASAN, memory tags must be set before the
* page becomes unavailable via debug_pagealloc or arch_free_page.
*/
- if (kasan_has_integrated_init()) {
- bool init = want_init_on_free();
-
- if (!skip_kasan_poison)
- kasan_poison_pages(page, order, init);
- } else {
- bool init = want_init_on_free();
-
- if (init)
- kernel_init_free_pages(page, 1 << order);
- if (!skip_kasan_poison)
- kasan_poison_pages(page, order, init);
- }
+ if (init && !kasan_has_integrated_init())
+ kernel_init_free_pages(page, 1 << order);
+ if (!skip_kasan_poison)
+ kasan_poison_pages(page, order, init);

/*
* arch_free_page() can make the page's contents inaccessible. s390
--
2.25.1

2022-01-24 19:49:15

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 15/39] kasan: clean up metadata byte definitions

From: Andrey Konovalov <[email protected]>

Most of the metadata byte values are only used for Generic KASAN.

Remove KASAN_KMALLOC_FREETRACK definition for !CONFIG_KASAN_GENERIC
case, and put it along with other metadata values for the Generic
mode under a corresponding ifdef.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>
---
mm/kasan/kasan.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index c17fa8d26ffe..952cd6f9ca46 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -71,15 +71,16 @@ static inline bool kasan_sync_fault_possible(void)
#define KASAN_PAGE_REDZONE 0xFE /* redzone for kmalloc_large allocations */
#define KASAN_KMALLOC_REDZONE 0xFC /* redzone inside slub object */
#define KASAN_KMALLOC_FREE 0xFB /* object was freed (kmem_cache_free/kfree) */
-#define KASAN_KMALLOC_FREETRACK 0xFA /* object was freed and has free track set */
#else
#define KASAN_FREE_PAGE KASAN_TAG_INVALID
#define KASAN_PAGE_REDZONE KASAN_TAG_INVALID
#define KASAN_KMALLOC_REDZONE KASAN_TAG_INVALID
#define KASAN_KMALLOC_FREE KASAN_TAG_INVALID
-#define KASAN_KMALLOC_FREETRACK KASAN_TAG_INVALID
#endif

+#ifdef CONFIG_KASAN_GENERIC
+
+#define KASAN_KMALLOC_FREETRACK 0xFA /* object was freed and has free track set */
#define KASAN_GLOBAL_REDZONE 0xF9 /* redzone for global variable */
#define KASAN_VMALLOC_INVALID 0xF8 /* unallocated space in vmapped page */

@@ -110,6 +111,8 @@ static inline bool kasan_sync_fault_possible(void)
#define KASAN_ABI_VERSION 1
#endif

+#endif /* CONFIG_KASAN_GENERIC */
+
/* Metadata layout customization. */
#define META_BYTES_PER_BLOCK 1
#define META_BLOCKS_PER_ROW 16
--
2.25.1

2022-01-24 19:49:19

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 16/39] kasan: define KASAN_VMALLOC_INVALID for SW_TAGS

From: Andrey Konovalov <[email protected]>

In preparation for adding vmalloc support to SW_TAGS KASAN,
provide a KASAN_VMALLOC_INVALID definition for it.

HW_TAGS KASAN won't be using this value, as it falls back onto
page_alloc for poisoning freed vmalloc() memory.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>
---
mm/kasan/kasan.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 952cd6f9ca46..020f3e57a03f 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -71,18 +71,19 @@ static inline bool kasan_sync_fault_possible(void)
#define KASAN_PAGE_REDZONE 0xFE /* redzone for kmalloc_large allocations */
#define KASAN_KMALLOC_REDZONE 0xFC /* redzone inside slub object */
#define KASAN_KMALLOC_FREE 0xFB /* object was freed (kmem_cache_free/kfree) */
+#define KASAN_VMALLOC_INVALID 0xF8 /* unallocated space in vmapped page */
#else
#define KASAN_FREE_PAGE KASAN_TAG_INVALID
#define KASAN_PAGE_REDZONE KASAN_TAG_INVALID
#define KASAN_KMALLOC_REDZONE KASAN_TAG_INVALID
#define KASAN_KMALLOC_FREE KASAN_TAG_INVALID
+#define KASAN_VMALLOC_INVALID KASAN_TAG_INVALID /* only for SW_TAGS */
#endif

#ifdef CONFIG_KASAN_GENERIC

#define KASAN_KMALLOC_FREETRACK 0xFA /* object was freed and has free track set */
#define KASAN_GLOBAL_REDZONE 0xF9 /* redzone for global variable */
-#define KASAN_VMALLOC_INVALID 0xF8 /* unallocated space in vmapped page */

/*
* Stack redzone shadow values
--
2.25.1

2022-01-24 19:49:20

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 17/39] kasan, x86, arm64, s390: rename functions for modules shadow

From: Andrey Konovalov <[email protected]>

Rename kasan_free_shadow to kasan_free_module_shadow and
kasan_module_alloc to kasan_alloc_module_shadow.

These functions are used to allocate/free shadow memory for kernel
modules when KASAN_VMALLOC is not enabled. The new names better
reflect their purpose.

Also reword the comment next to their declaration to improve clarity.

Signed-off-by: Andrey Konovalov <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
---
arch/arm64/kernel/module.c | 2 +-
arch/s390/kernel/module.c | 2 +-
arch/x86/kernel/module.c | 2 +-
include/linux/kasan.h | 14 +++++++-------
mm/kasan/shadow.c | 4 ++--
mm/vmalloc.c | 2 +-
6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 309a27553c87..d3a1fa818348 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -58,7 +58,7 @@ void *module_alloc(unsigned long size)
PAGE_KERNEL, 0, NUMA_NO_NODE,
__builtin_return_address(0));

- if (p && (kasan_module_alloc(p, size, gfp_mask) < 0)) {
+ if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
vfree(p);
return NULL;
}
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index d52d85367bf7..b16bebd9a8b9 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -45,7 +45,7 @@ void *module_alloc(unsigned long size)
p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END,
gfp_mask, PAGE_KERNEL_EXEC, VM_DEFER_KMEMLEAK, NUMA_NO_NODE,
__builtin_return_address(0));
- if (p && (kasan_module_alloc(p, size, gfp_mask) < 0)) {
+ if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
vfree(p);
return NULL;
}
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 95fa745e310a..c9eb8aa3b7b8 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -78,7 +78,7 @@ void *module_alloc(unsigned long size)
MODULES_END, gfp_mask,
PAGE_KERNEL, VM_DEFER_KMEMLEAK, NUMA_NO_NODE,
__builtin_return_address(0));
- if (p && (kasan_module_alloc(p, size, gfp_mask) < 0)) {
+ if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
vfree(p);
return NULL;
}
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index b88ca6b97ba3..55f1d4edf6b5 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -454,17 +454,17 @@ static inline void kasan_populate_early_vm_area_shadow(void *start,
!defined(CONFIG_KASAN_VMALLOC)

/*
- * These functions provide a special case to support backing module
- * allocations with real shadow memory. With KASAN vmalloc, the special
- * case is unnecessary, as the work is handled in the generic case.
+ * These functions allocate and free shadow memory for kernel modules.
+ * They are only required when KASAN_VMALLOC is not supported, as otherwise
+ * shadow memory is allocated by the generic vmalloc handlers.
*/
-int kasan_module_alloc(void *addr, size_t size, gfp_t gfp_mask);
-void kasan_free_shadow(const struct vm_struct *vm);
+int kasan_alloc_module_shadow(void *addr, size_t size, gfp_t gfp_mask);
+void kasan_free_module_shadow(const struct vm_struct *vm);

#else /* (CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS) && !CONFIG_KASAN_VMALLOC */

-static inline int kasan_module_alloc(void *addr, size_t size, gfp_t gfp_mask) { return 0; }
-static inline void kasan_free_shadow(const struct vm_struct *vm) {}
+static inline int kasan_alloc_module_shadow(void *addr, size_t size, gfp_t gfp_mask) { return 0; }
+static inline void kasan_free_module_shadow(const struct vm_struct *vm) {}

#endif /* (CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS) && !CONFIG_KASAN_VMALLOC */

diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index 94136f84b449..e5c4393eb861 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -498,7 +498,7 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,

#else /* CONFIG_KASAN_VMALLOC */

-int kasan_module_alloc(void *addr, size_t size, gfp_t gfp_mask)
+int kasan_alloc_module_shadow(void *addr, size_t size, gfp_t gfp_mask)
{
void *ret;
size_t scaled_size;
@@ -534,7 +534,7 @@ int kasan_module_alloc(void *addr, size_t size, gfp_t gfp_mask)
return -ENOMEM;
}

-void kasan_free_shadow(const struct vm_struct *vm)
+void kasan_free_module_shadow(const struct vm_struct *vm)
{
if (vm->flags & VM_KASAN)
vfree(kasan_mem_to_shadow(vm->addr));
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 4165304d3547..b6712a25c996 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2526,7 +2526,7 @@ struct vm_struct *remove_vm_area(const void *addr)
va->vm = NULL;
spin_unlock(&vmap_area_lock);

- kasan_free_shadow(vm);
+ kasan_free_module_shadow(vm);
free_unmap_vmap_area(va);

return vm;
--
2.25.1

2022-01-24 19:49:33

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 19/39] kasan: reorder vmalloc hooks

From: Andrey Konovalov <[email protected]>

Group functions that [de]populate shadow memory for vmalloc.
Group functions that [un]poison memory for vmalloc.

This patch makes no functional changes but prepares KASAN code for
adding vmalloc support to HW_TAGS KASAN.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>
---
include/linux/kasan.h | 20 +++++++++-----------
mm/kasan/shadow.c | 43 ++++++++++++++++++++++---------------------
2 files changed, 31 insertions(+), 32 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 55f1d4edf6b5..46a63374c86f 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -418,34 +418,32 @@ static inline void kasan_init_hw_tags(void) { }

#ifdef CONFIG_KASAN_VMALLOC

+void kasan_populate_early_vm_area_shadow(void *start, unsigned long size);
int kasan_populate_vmalloc(unsigned long addr, unsigned long size);
-void kasan_poison_vmalloc(const void *start, unsigned long size);
-void kasan_unpoison_vmalloc(const void *start, unsigned long size);
void kasan_release_vmalloc(unsigned long start, unsigned long end,
unsigned long free_region_start,
unsigned long free_region_end);

-void kasan_populate_early_vm_area_shadow(void *start, unsigned long size);
+void kasan_unpoison_vmalloc(const void *start, unsigned long size);
+void kasan_poison_vmalloc(const void *start, unsigned long size);

#else /* CONFIG_KASAN_VMALLOC */

+static inline void kasan_populate_early_vm_area_shadow(void *start,
+ unsigned long size) { }
static inline int kasan_populate_vmalloc(unsigned long start,
unsigned long size)
{
return 0;
}
-
-static inline void kasan_poison_vmalloc(const void *start, unsigned long size)
-{ }
-static inline void kasan_unpoison_vmalloc(const void *start, unsigned long size)
-{ }
static inline void kasan_release_vmalloc(unsigned long start,
unsigned long end,
unsigned long free_region_start,
- unsigned long free_region_end) {}
+ unsigned long free_region_end) { }

-static inline void kasan_populate_early_vm_area_shadow(void *start,
- unsigned long size)
+static inline void kasan_unpoison_vmalloc(const void *start, unsigned long size)
+{ }
+static inline void kasan_poison_vmalloc(const void *start, unsigned long size)
{ }

#endif /* CONFIG_KASAN_VMALLOC */
diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index e5c4393eb861..bf7ab62fbfb9 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -345,27 +345,6 @@ int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
return 0;
}

-/*
- * Poison the shadow for a vmalloc region. Called as part of the
- * freeing process at the time the region is freed.
- */
-void kasan_poison_vmalloc(const void *start, unsigned long size)
-{
- if (!is_vmalloc_or_module_addr(start))
- return;
-
- size = round_up(size, KASAN_GRANULE_SIZE);
- kasan_poison(start, size, KASAN_VMALLOC_INVALID, false);
-}
-
-void kasan_unpoison_vmalloc(const void *start, unsigned long size)
-{
- if (!is_vmalloc_or_module_addr(start))
- return;
-
- kasan_unpoison(start, size, false);
-}
-
static int kasan_depopulate_vmalloc_pte(pte_t *ptep, unsigned long addr,
void *unused)
{
@@ -496,6 +475,28 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
}
}

+
+void kasan_unpoison_vmalloc(const void *start, unsigned long size)
+{
+ if (!is_vmalloc_or_module_addr(start))
+ return;
+
+ kasan_unpoison(start, size, false);
+}
+
+/*
+ * Poison the shadow for a vmalloc region. Called as part of the
+ * freeing process at the time the region is freed.
+ */
+void kasan_poison_vmalloc(const void *start, unsigned long size)
+{
+ if (!is_vmalloc_or_module_addr(start))
+ return;
+
+ size = round_up(size, KASAN_GRANULE_SIZE);
+ kasan_poison(start, size, KASAN_VMALLOC_INVALID, false);
+}
+
#else /* CONFIG_KASAN_VMALLOC */

int kasan_alloc_module_shadow(void *addr, size_t size, gfp_t gfp_mask)
--
2.25.1

2022-01-24 19:49:33

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 20/39] kasan: add wrappers for vmalloc hooks

From: Andrey Konovalov <[email protected]>

Add wrappers around functions that [un]poison memory for vmalloc
allocations. These functions will be used by HW_TAGS KASAN and
therefore need to be disabled when the kasan=off command line argument
is provided.

This patch makes no functional changes for software KASAN modes.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>
---
include/linux/kasan.h | 17 +++++++++++++++--
mm/kasan/shadow.c | 5 ++---
2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 46a63374c86f..da320069e7cf 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -424,8 +424,21 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
unsigned long free_region_start,
unsigned long free_region_end);

-void kasan_unpoison_vmalloc(const void *start, unsigned long size);
-void kasan_poison_vmalloc(const void *start, unsigned long size);
+void __kasan_unpoison_vmalloc(const void *start, unsigned long size);
+static __always_inline void kasan_unpoison_vmalloc(const void *start,
+ unsigned long size)
+{
+ if (kasan_enabled())
+ __kasan_unpoison_vmalloc(start, size);
+}
+
+void __kasan_poison_vmalloc(const void *start, unsigned long size);
+static __always_inline void kasan_poison_vmalloc(const void *start,
+ unsigned long size)
+{
+ if (kasan_enabled())
+ __kasan_poison_vmalloc(start, size);
+}

#else /* CONFIG_KASAN_VMALLOC */

diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index bf7ab62fbfb9..39d0b32ebf70 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -475,8 +475,7 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
}
}

-
-void kasan_unpoison_vmalloc(const void *start, unsigned long size)
+void __kasan_unpoison_vmalloc(const void *start, unsigned long size)
{
if (!is_vmalloc_or_module_addr(start))
return;
@@ -488,7 +487,7 @@ void kasan_unpoison_vmalloc(const void *start, unsigned long size)
* Poison the shadow for a vmalloc region. Called as part of the
* freeing process at the time the region is freed.
*/
-void kasan_poison_vmalloc(const void *start, unsigned long size)
+void __kasan_poison_vmalloc(const void *start, unsigned long size)
{
if (!is_vmalloc_or_module_addr(start))
return;
--
2.25.1

2022-01-24 19:49:35

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 21/39] kasan, vmalloc: reset tags in vmalloc functions

From: Andrey Konovalov <[email protected]>

In preparation for adding vmalloc support to SW/HW_TAGS KASAN,
reset pointer tags in functions that use pointer values in
range checks.

vread() is a special case here. Despite the untagging of the addr
pointer in its prologue, the accesses performed by vread() are checked.

Instead of accessing the virtual mappings through addr directly, vread()
recovers the physical address via page_address(vmalloc_to_page()) and
accesses that. And as page_address() recovers the pointer tag, the
accesses get checked.
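
Roughly, that access path looks like this (a simplified sketch of the
aligned_vread()-style copy, not the exact kernel code; buf, addr, and
length stand for vread()'s buffer, source pointer, and chunk size):

	/*
	 * The tag of addr is reset for the range checks, but the data is
	 * copied through the page: page_address() recovers a tagged
	 * pointer, so the memcpy() below is still checked by KASAN.
	 */
	struct page *p = vmalloc_to_page(addr);
	void *map = page_address(p);

	memcpy(buf, map + offset_in_page(addr), length);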

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v1->v2:
- Clarified the description of untagging in vread().
---
mm/vmalloc.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b6712a25c996..38bf3b418b81 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -74,7 +74,7 @@ static const bool vmap_allow_huge = false;

bool is_vmalloc_addr(const void *x)
{
- unsigned long addr = (unsigned long)x;
+ unsigned long addr = (unsigned long)kasan_reset_tag(x);

return addr >= VMALLOC_START && addr < VMALLOC_END;
}
@@ -632,7 +632,7 @@ int is_vmalloc_or_module_addr(const void *x)
* just put it in the vmalloc space.
*/
#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
- unsigned long addr = (unsigned long)x;
+ unsigned long addr = (unsigned long)kasan_reset_tag(x);
if (addr >= MODULES_VADDR && addr < MODULES_END)
return 1;
#endif
@@ -806,6 +806,8 @@ static struct vmap_area *find_vmap_area_exceed_addr(unsigned long addr)
struct vmap_area *va = NULL;
struct rb_node *n = vmap_area_root.rb_node;

+ addr = (unsigned long)kasan_reset_tag((void *)addr);
+
while (n) {
struct vmap_area *tmp;

@@ -827,6 +829,8 @@ static struct vmap_area *__find_vmap_area(unsigned long addr)
{
struct rb_node *n = vmap_area_root.rb_node;

+ addr = (unsigned long)kasan_reset_tag((void *)addr);
+
while (n) {
struct vmap_area *va;

@@ -2145,7 +2149,7 @@ EXPORT_SYMBOL_GPL(vm_unmap_aliases);
void vm_unmap_ram(const void *mem, unsigned int count)
{
unsigned long size = (unsigned long)count << PAGE_SHIFT;
- unsigned long addr = (unsigned long)mem;
+ unsigned long addr = (unsigned long)kasan_reset_tag(mem);
struct vmap_area *va;

might_sleep();
@@ -3404,6 +3408,8 @@ long vread(char *buf, char *addr, unsigned long count)
unsigned long buflen = count;
unsigned long n;

+ addr = kasan_reset_tag(addr);
+
/* Don't allow overflow */
if ((unsigned long) addr + count < count)
count = -(unsigned long) addr;
--
2.25.1

2022-01-24 19:49:37

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 10/39] kasan, page_alloc: merge kasan_alloc_pages into post_alloc_hook

From: Andrey Konovalov <[email protected]>

Currently, the code responsible for initializing and poisoning memory in
post_alloc_hook() is scattered across two locations: kasan_alloc_pages()
hook for HW_TAGS KASAN and post_alloc_hook() itself. This is confusing.

This and a few following patches combine the code from these two
locations. Along the way, these patches restructure the performed checks
step by step to make them easier to follow.

Replace the only caller of kasan_alloc_pages() with its implementation.

As kasan_has_integrated_init() is only true when CONFIG_KASAN_HW_TAGS
is enabled, moving the code causes no functional changes.

Also move the init and init_tags variable definitions out of the
kasan_has_integrated_init() clause in post_alloc_hook(), as they have
the same values regardless of what the if condition evaluates to.

This patch is not useful by itself but makes the simplifications in
the following patches easier to follow.

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v2->v3:
- Update patch description.
---
include/linux/kasan.h | 9 ---------
mm/kasan/common.c | 2 +-
mm/kasan/hw_tags.c | 22 ----------------------
mm/page_alloc.c | 20 +++++++++++++++-----
4 files changed, 16 insertions(+), 37 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index a8bfe9f157c9..b88ca6b97ba3 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -95,8 +95,6 @@ static inline bool kasan_hw_tags_enabled(void)
return kasan_enabled();
}

-void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags);
-
#else /* CONFIG_KASAN_HW_TAGS */

static inline bool kasan_enabled(void)
@@ -109,13 +107,6 @@ static inline bool kasan_hw_tags_enabled(void)
return false;
}

-static __always_inline void kasan_alloc_pages(struct page *page,
- unsigned int order, gfp_t flags)
-{
- /* Only available for integrated init. */
- BUILD_BUG();
-}
-
#endif /* CONFIG_KASAN_HW_TAGS */

static inline bool kasan_has_integrated_init(void)
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index a0082fad48b1..d9079ec11f31 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -538,7 +538,7 @@ void * __must_check __kasan_kmalloc_large(const void *ptr, size_t size,
return NULL;

/*
- * The object has already been unpoisoned by kasan_alloc_pages() for
+ * The object has already been unpoisoned by kasan_unpoison_pages() for
* alloc_pages() or by kasan_krealloc() for krealloc().
*/

diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index c643740b8599..76cf2b6229c7 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -192,28 +192,6 @@ void __init kasan_init_hw_tags(void)
kasan_stack_collection_enabled() ? "on" : "off");
}

-void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
-{
- /*
- * This condition should match the one in post_alloc_hook() in
- * page_alloc.c.
- */
- bool init = !want_init_on_free() && want_init_on_alloc(flags);
- bool init_tags = init && (flags & __GFP_ZEROTAGS);
-
- if (flags & __GFP_SKIP_KASAN_POISON)
- SetPageSkipKASanPoison(page);
-
- if (init_tags) {
- int i;
-
- for (i = 0; i != 1 << order; ++i)
- tag_clear_highpage(page + i);
- } else {
- kasan_unpoison_pages(page, order, init);
- }
-}
-
#if IS_ENABLED(CONFIG_KASAN_KUNIT_TEST)

void kasan_enable_tagging_sync(void)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 868480d463c7..abed862d889d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2398,6 +2398,9 @@ static bool check_new_pages(struct page *page, unsigned int order)
inline void post_alloc_hook(struct page *page, unsigned int order,
gfp_t gfp_flags)
{
+ bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags);
+ bool init_tags = init && (gfp_flags & __GFP_ZEROTAGS);
+
set_page_private(page, 0);
set_page_refcounted(page);

@@ -2413,15 +2416,22 @@ inline void post_alloc_hook(struct page *page, unsigned int order,

/*
* As memory initialization might be integrated into KASAN,
- * kasan_alloc_pages and kernel_init_free_pages must be
+ * KASAN unpoisoning and memory initialization code must be
* kept together to avoid discrepancies in behavior.
*/
if (kasan_has_integrated_init()) {
- kasan_alloc_pages(page, order, gfp_flags);
- } else {
- bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags);
- bool init_tags = init && (gfp_flags & __GFP_ZEROTAGS);
+ if (gfp_flags & __GFP_SKIP_KASAN_POISON)
+ SetPageSkipKASanPoison(page);
+
+ if (init_tags) {
+ int i;

+ for (i = 0; i != 1 << order; ++i)
+ tag_clear_highpage(page + i);
+ } else {
+ kasan_unpoison_pages(page, order, init);
+ }
+ } else {
kasan_unpoison_pages(page, order, init);

if (init_tags) {
--
2.25.1

2022-01-24 19:49:38

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 22/39] kasan, fork: reset pointer tags of vmapped stacks

From: Andrey Konovalov <[email protected]>

Once tag-based KASAN modes start tagging vmalloc() allocations,
kernel stacks start getting tagged if CONFIG_VMAP_STACK is enabled.

Reset the tag of kernel stack pointers after allocation in
alloc_thread_stack_node().

For SW_TAGS KASAN, when CONFIG_KASAN_STACK is enabled, the
instrumentation can't handle the SP register being tagged.

For HW_TAGS KASAN, there are no instrumentation-related issues. However,
the impact of having a tagged SP register needs to be properly evaluated,
so keep it non-tagged for now.

Note that the memory for the stack allocation still gets tagged to
catch vmalloc-into-stack out-of-bounds accesses.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>

---

Changes v2->v3:
- Update patch description.
---
kernel/fork.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index d75a528f7b21..57d624f05182 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -254,6 +254,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
* so cache the vm_struct.
*/
if (stack) {
+ stack = kasan_reset_tag(stack);
tsk->stack_vm_area = find_vm_area(stack);
tsk->stack = stack;
}
--
2.25.1

2022-01-24 19:49:40

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 13/39] kasan, page_alloc: move kernel_init_free_pages in post_alloc_hook

From: Andrey Konovalov <[email protected]>

Pull the kernel_init_free_pages() call in post_alloc_hook() out of the
big if clause for better code readability. This also allows for more
simplifications in the following patch.

This patch makes no functional changes.

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>
---
mm/page_alloc.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c51d637cdab3..2784bd478942 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2435,14 +2435,18 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
init = false;
}
if (kasan_has_integrated_init()) {
- if (!init_tags)
+ if (!init_tags) {
kasan_unpoison_pages(page, order, init);
+
+ /* Note that memory is already initialized by KASAN. */
+ init = false;
+ }
} else {
kasan_unpoison_pages(page, order, init);
-
- if (init)
- kernel_init_free_pages(page, 1 << order);
}
+ /* If memory is still not initialized, do it now. */
+ if (init)
+ kernel_init_free_pages(page, 1 << order);
/* Propagate __GFP_SKIP_KASAN_POISON to page flags. */
if (kasan_hw_tags_enabled() && (gfp_flags & __GFP_SKIP_KASAN_POISON))
SetPageSkipKASanPoison(page);
--
2.25.1

2022-01-24 19:49:42

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 05/39] kasan, page_alloc: init memory of skipped pages on free

From: Andrey Konovalov <[email protected]>

Since commit 7a3b83537188 ("kasan: use separate (un)poison implementation
for integrated init"), when all init, kasan_has_integrated_init(), and
skip_kasan_poison are true, free_pages_prepare() doesn't initialize
the page. This is wrong.

Fix it by remembering whether kasan_poison_pages() performed
initialization, and call kernel_init_free_pages() if it didn't.

Reordering kasan_poison_pages() and kernel_init_free_pages() is OK,
since kernel_init_free_pages() can handle poisoned memory.

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v2->v3:
- Drop Fixes tag, as the patch won't cleanly apply to older kernels
anyway. The commit is mentioned in the patch description.

Changes v1->v2:
- Reorder kasan_poison_pages() and free_pages_prepare() in this patch
instead of doing it in the previous one.
---
mm/page_alloc.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 60bc838a4d85..f994fd68e3b1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1375,11 +1375,16 @@ static __always_inline bool free_pages_prepare(struct page *page,
* With hardware tag-based KASAN, memory tags must be set before the
* page becomes unavailable via debug_pagealloc or arch_free_page.
*/
- if (init && !kasan_has_integrated_init())
- kernel_init_free_pages(page, 1 << order);
- if (!skip_kasan_poison)
+ if (!skip_kasan_poison) {
kasan_poison_pages(page, order, init);

+ /* Memory is already initialized if KASAN did it internally. */
+ if (kasan_has_integrated_init())
+ init = false;
+ }
+ if (init)
+ kernel_init_free_pages(page, 1 << order);
+
/*
* arch_free_page() can make the page's contents inaccessible. s390
* does this. So nothing which can access the page's contents should
--
2.25.1

2022-01-24 19:49:43

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 24/39] kasan, vmalloc: add vmalloc tagging for SW_TAGS

From: Andrey Konovalov <[email protected]>

Add vmalloc tagging support to SW_TAGS KASAN.

- __kasan_unpoison_vmalloc() now assigns a random pointer tag, poisons
the virtual mapping accordingly, and embeds the tag into the returned
pointer.

- __get_vm_area_node() (used by vmalloc() and vmap()) and
pcpu_get_vm_areas() save the tagged pointer into vm_struct->addr
(note: not into vmap_area->addr). This requires putting
kasan_unpoison_vmalloc() after setup_vmalloc_vm[_locked]();
otherwise the latter will overwrite the tagged pointer.
The tagged pointer is then naturally propagated to vmalloc()
and vmap().

- vm_map_ram() returns the tagged pointer directly.

As a result of this change, vm_struct->addr is now tagged.

Enabling KASAN_VMALLOC with SW_TAGS is not yet allowed.

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v2->v3:
- Drop accidentally added kasan_unpoison_vmalloc() argument for when
KASAN is off.
- Drop __must_check for kasan_unpoison_vmalloc(), as its result is
sometimes intentionally ignored.
- Move allowing enabling KASAN_VMALLOC with SW_TAGS into a separate
patch.
- Update patch description.

Changes v1->v2:
- Allow enabling KASAN_VMALLOC with SW_TAGS in this patch.
---
include/linux/kasan.h | 16 ++++++++++------
mm/kasan/shadow.c | 6 ++++--
mm/vmalloc.c | 14 ++++++++------
3 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index da320069e7cf..92c5dfa29a35 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -424,12 +424,13 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
unsigned long free_region_start,
unsigned long free_region_end);

-void __kasan_unpoison_vmalloc(const void *start, unsigned long size);
-static __always_inline void kasan_unpoison_vmalloc(const void *start,
- unsigned long size)
+void *__kasan_unpoison_vmalloc(const void *start, unsigned long size);
+static __always_inline void *kasan_unpoison_vmalloc(const void *start,
+ unsigned long size)
{
if (kasan_enabled())
- __kasan_unpoison_vmalloc(start, size);
+ return __kasan_unpoison_vmalloc(start, size);
+ return (void *)start;
}

void __kasan_poison_vmalloc(const void *start, unsigned long size);
@@ -454,8 +455,11 @@ static inline void kasan_release_vmalloc(unsigned long start,
unsigned long free_region_start,
unsigned long free_region_end) { }

-static inline void kasan_unpoison_vmalloc(const void *start, unsigned long size)
-{ }
+static inline void *kasan_unpoison_vmalloc(const void *start,
+ unsigned long size)
+{
+ return (void *)start;
+}
static inline void kasan_poison_vmalloc(const void *start, unsigned long size)
{ }

diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index 39d0b32ebf70..5a866f6663fc 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -475,12 +475,14 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
}
}

-void __kasan_unpoison_vmalloc(const void *start, unsigned long size)
+void *__kasan_unpoison_vmalloc(const void *start, unsigned long size)
{
if (!is_vmalloc_or_module_addr(start))
- return;
+ return (void *)start;

+ start = set_tag(start, kasan_random_tag());
kasan_unpoison(start, size, false);
+ return (void *)start;
}

/*
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 38bf3b418b81..15e1a4fdfe0b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2210,7 +2210,7 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node)
mem = (void *)addr;
}

- kasan_unpoison_vmalloc(mem, size);
+ mem = kasan_unpoison_vmalloc(mem, size);

if (vmap_pages_range(addr, addr + size, PAGE_KERNEL,
pages, PAGE_SHIFT) < 0) {
@@ -2443,10 +2443,10 @@ static struct vm_struct *__get_vm_area_node(unsigned long size,
return NULL;
}

- kasan_unpoison_vmalloc((void *)va->va_start, requested_size);
-
setup_vmalloc_vm(area, va, flags, caller);

+ area->addr = kasan_unpoison_vmalloc(area->addr, requested_size);
+
return area;
}

@@ -3795,9 +3795,6 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
for (area = 0; area < nr_vms; area++) {
if (kasan_populate_vmalloc(vas[area]->va_start, sizes[area]))
goto err_free_shadow;
-
- kasan_unpoison_vmalloc((void *)vas[area]->va_start,
- sizes[area]);
}

/* insert all vm's */
@@ -3810,6 +3807,11 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
}
spin_unlock(&vmap_area_lock);

+ /* mark allocated areas as accessible */
+ for (area = 0; area < nr_vms; area++)
+ vms[area]->addr = kasan_unpoison_vmalloc(vms[area]->addr,
+ vms[area]->size);
+
kfree(vas);
return vms;

--
2.25.1

2022-01-24 19:49:48

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 25/39] kasan, vmalloc, arm64: mark vmalloc mappings as pgprot_tagged

From: Andrey Konovalov <[email protected]>

HW_TAGS KASAN relies on ARM Memory Tagging Extension (MTE). With MTE,
a memory region must be mapped as MT_NORMAL_TAGGED to allow setting
memory tags via MTE-specific instructions.

Add proper protection bits to vmalloc() allocations. These allocations
are always backed by page_alloc pages, so the tags will actually be
set on the corresponding physical memory.

Signed-off-by: Andrey Konovalov <[email protected]>
Co-developed-by: Vincenzo Frascino <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>

---

Changes v3->v4:
- Rename arch_vmalloc_pgprot_modify() to arch_vmap_pgprot_tagged()
to be consistent with other arch vmalloc hooks.
- Move checks from arch_vmap_pgprot_tagged() to __vmalloc_node_range()
as the same condition is used for other things in subsequent patches.

Changes v2->v3:
- Update patch description.
---
arch/arm64/include/asm/vmalloc.h | 6 ++++++
include/linux/vmalloc.h | 7 +++++++
mm/vmalloc.c | 9 +++++++++
3 files changed, 22 insertions(+)

diff --git a/arch/arm64/include/asm/vmalloc.h b/arch/arm64/include/asm/vmalloc.h
index b9185503feae..38fafffe699f 100644
--- a/arch/arm64/include/asm/vmalloc.h
+++ b/arch/arm64/include/asm/vmalloc.h
@@ -25,4 +25,10 @@ static inline bool arch_vmap_pmd_supported(pgprot_t prot)

#endif

+#define arch_vmap_pgprot_tagged arch_vmap_pgprot_tagged
+static inline pgprot_t arch_vmap_pgprot_tagged(pgprot_t prot)
+{
+ return pgprot_tagged(prot);
+}
+
#endif /* _ASM_ARM64_VMALLOC_H */
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 87f8cfec50a0..7b879c77bec5 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -115,6 +115,13 @@ static inline int arch_vmap_pte_supported_shift(unsigned long size)
}
#endif

+#ifndef arch_vmap_pgprot_tagged
+static inline pgprot_t arch_vmap_pgprot_tagged(pgprot_t prot)
+{
+ return prot;
+}
+#endif
+
/*
* Highlevel APIs for driver use
*/
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 15e1a4fdfe0b..92e635b7490c 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3108,6 +3108,15 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
goto fail;
}

+ /*
+ * Modify protection bits to allow tagging.
+ * This must be done before mapping by __vmalloc_area_node().
+ */
+ if (kasan_hw_tags_enabled() &&
+ pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
+ prot = arch_vmap_pgprot_tagged(prot);
+
+ /* Allocate physical pages and map them into vmalloc space. */
addr = __vmalloc_area_node(area, gfp_mask, prot, shift, node);
if (!addr)
goto fail;
--
2.25.1

2022-01-24 19:50:30

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 27/39] kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS

From: Andrey Konovalov <[email protected]>

Only define the ___GFP_SKIP_KASAN_POISON flag when CONFIG_KASAN_HW_TAGS
is enabled.

This patch is not useful by itself, but it prepares the code for
the addition of new KASAN-specific GFP flags.

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v3->v4:
- This is a new patch.
---
include/linux/gfp.h | 8 +++++++-
include/trace/events/mmflags.h | 12 +++++++++---
2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 581a1f47b8a2..96f707931770 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -54,7 +54,11 @@ struct vm_area_struct;
#define ___GFP_THISNODE 0x200000u
#define ___GFP_ACCOUNT 0x400000u
#define ___GFP_ZEROTAGS 0x800000u
+#ifdef CONFIG_KASAN_HW_TAGS
#define ___GFP_SKIP_KASAN_POISON 0x1000000u
+#else
+#define ___GFP_SKIP_KASAN_POISON 0
+#endif
#ifdef CONFIG_LOCKDEP
#define ___GFP_NOLOCKDEP 0x2000000u
#else
@@ -251,7 +255,9 @@ struct vm_area_struct;
#define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)

/* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (24 + \
+ IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
+ IS_ENABLED(CONFIG_LOCKDEP))
#define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))

/**
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 116ed4d5d0f8..cb4520374e2c 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -49,12 +49,18 @@
{(unsigned long)__GFP_RECLAIM, "__GFP_RECLAIM"}, \
{(unsigned long)__GFP_DIRECT_RECLAIM, "__GFP_DIRECT_RECLAIM"},\
{(unsigned long)__GFP_KSWAPD_RECLAIM, "__GFP_KSWAPD_RECLAIM"},\
- {(unsigned long)__GFP_ZEROTAGS, "__GFP_ZEROTAGS"}, \
- {(unsigned long)__GFP_SKIP_KASAN_POISON,"__GFP_SKIP_KASAN_POISON"}\
+ {(unsigned long)__GFP_ZEROTAGS, "__GFP_ZEROTAGS"} \
+
+#ifdef CONFIG_KASAN_HW_TAGS
+#define __def_gfpflag_names_kasan \
+ , {(unsigned long)__GFP_SKIP_KASAN_POISON, "__GFP_SKIP_KASAN_POISON"}
+#else
+#define __def_gfpflag_names_kasan
+#endif

#define show_gfp_flags(flags) \
(flags) ? __print_flags(flags, "|", \
- __def_gfpflag_names \
+ __def_gfpflag_names __def_gfpflag_names_kasan \
) : "none"

#ifdef CONFIG_MMU
--
2.25.1

2022-01-24 19:50:30

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 26/39] kasan, vmalloc: unpoison VM_ALLOC pages after mapping

From: Andrey Konovalov <[email protected]>

Make KASAN unpoison vmalloc mappings after they have been mapped in,
when possible: for vmalloc() (identified via VM_ALLOC) and
vm_map_ram().

The reasons for this are:

- For vmalloc() and vm_map_ram(): pages don't get unpoisoned in case
mapping them fails.
- For vmalloc(): HW_TAGS KASAN needs pages to be mapped to set tags via
kasan_unpoison_vmalloc().

As a part of these changes, the return value of __vmalloc_node_range()
is changed to area->addr. This is a non-functional change, as
__vmalloc_area_node() returns area->addr anyway.
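
As an illustration, here is a minimal sketch (not part of the patch) of
the invariant that saving the unpoisoned address into vm_struct->addr
preserves for VM_ALLOC mappings:

    void *p = vmalloc(PAGE_SIZE);
    struct vm_struct *vm = find_vm_area(p);

    /* The address stored in the vm_struct matches the pointer returned
     * to the caller, including any KASAN tag. */
    WARN_ON(vm->addr != p);

    vfree(p);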

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>

---

Changes v3->v4:
- Don't forget to save tagged addr to vm_struct->addr for VM_ALLOC
so that find_vm_area(addr)->addr == addr for vmalloc().
- Reword comments.
- Update patch description.

Changes v2->v3:
- Update patch description.
---
mm/vmalloc.c | 30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 92e635b7490c..b65adac1cd80 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2210,14 +2210,15 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node)
mem = (void *)addr;
}

- mem = kasan_unpoison_vmalloc(mem, size);
-
if (vmap_pages_range(addr, addr + size, PAGE_KERNEL,
pages, PAGE_SHIFT) < 0) {
vm_unmap_ram(mem, count);
return NULL;
}

+ /* Mark the pages as accessible, now that they are mapped. */
+ mem = kasan_unpoison_vmalloc(mem, size);
+
return mem;
}
EXPORT_SYMBOL(vm_map_ram);
@@ -2445,7 +2446,14 @@ static struct vm_struct *__get_vm_area_node(unsigned long size,

setup_vmalloc_vm(area, va, flags, caller);

- area->addr = kasan_unpoison_vmalloc(area->addr, requested_size);
+ /*
+ * Mark pages for non-VM_ALLOC mappings as accessible. Do it now as a
+ * best-effort approach, as they can be mapped outside of vmalloc code.
+ * For VM_ALLOC mappings, the pages are marked as accessible after
+ * getting mapped in __vmalloc_node_range().
+ */
+ if (!(flags & VM_ALLOC))
+ area->addr = kasan_unpoison_vmalloc(area->addr, requested_size);

return area;
}
@@ -3055,7 +3063,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
const void *caller)
{
struct vm_struct *area;
- void *addr;
+ void *ret;
unsigned long real_size = size;
unsigned long real_align = align;
unsigned int shift = PAGE_SHIFT;
@@ -3117,10 +3125,13 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
prot = arch_vmap_pgprot_tagged(prot);

/* Allocate physical pages and map them into vmalloc space. */
- addr = __vmalloc_area_node(area, gfp_mask, prot, shift, node);
- if (!addr)
+ ret = __vmalloc_area_node(area, gfp_mask, prot, shift, node);
+ if (!ret)
goto fail;

+ /* Mark the pages as accessible, now that they are mapped. */
+ area->addr = kasan_unpoison_vmalloc(area->addr, real_size);
+
/*
* In this function, newly allocated vm_struct has VM_UNINITIALIZED
* flag. It means that vm_struct is not fully initialized.
@@ -3132,7 +3143,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
if (!(vm_flags & VM_DEFER_KMEMLEAK))
kmemleak_vmalloc(area, size, gfp_mask);

- return addr;
+ return area->addr;

fail:
if (shift > PAGE_SHIFT) {
@@ -3816,7 +3827,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
}
spin_unlock(&vmap_area_lock);

- /* mark allocated areas as accessible */
+ /*
+ * Mark allocated areas as accessible. Do it now as a best-effort
+ * approach, as they can be mapped outside of vmalloc code.
+ */
for (area = 0; area < nr_vms; area++)
vms[area]->addr = kasan_unpoison_vmalloc(vms[area]->addr,
vms[area]->size);
--
2.25.1

2022-01-24 19:50:30

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 31/39] kasan, vmalloc: only tag normal vmalloc allocations

From: Andrey Konovalov <[email protected]>

The kernel can use vmalloc() to allocate executable memory. The only supported way
to do that is via __vmalloc_node_range() with the executable bit set in
the prot argument. (vmap() resets the bit via pgprot_nx()).

Once tag-based KASAN modes start tagging vmalloc allocations, executing
code from such allocations will lead to the PC register getting a tag,
which is not tolerated by the kernel.

Only tag the allocations for normal kernel pages.
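
As a rough sketch (illustrative call sites, not part of the patch), only
mappings requested with PAGE_KERNEL protection end up tagged:

    /* Regular allocation: PAGE_KERNEL mapping, returned pointer is tagged. */
    void *data = vmalloc(PAGE_SIZE);

    /* Executable allocation: prot differs from PAGE_KERNEL, so the
     * allocation is not tagged and the returned pointer stays untagged. */
    void *text = __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
                                      GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
                                      NUMA_NO_NODE, __builtin_return_address(0));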

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v3->v4:
- Rename KASAN_VMALLOC_NOEXEC to KASAN_VMALLOC_PROT_NORMAL.
- Compare with PAGE_KERNEL instead of using pgprot_nx().
- Update patch description.

Changes v2->v3:
- Add this patch.
---
include/linux/kasan.h | 7 ++++---
mm/kasan/hw_tags.c | 7 +++++++
mm/kasan/shadow.c | 7 +++++++
mm/vmalloc.c | 49 +++++++++++++++++++++++++------------------
4 files changed, 47 insertions(+), 23 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 499f1573dba4..3593c95d1fa5 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -27,9 +27,10 @@ struct kunit_kasan_expectation {

typedef unsigned int __bitwise kasan_vmalloc_flags_t;

-#define KASAN_VMALLOC_NONE 0x00u
-#define KASAN_VMALLOC_INIT 0x01u
-#define KASAN_VMALLOC_VM_ALLOC 0x02u
+#define KASAN_VMALLOC_NONE 0x00u
+#define KASAN_VMALLOC_INIT 0x01u
+#define KASAN_VMALLOC_VM_ALLOC 0x02u
+#define KASAN_VMALLOC_PROT_NORMAL 0x04u

#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)

diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 21104fd51872..2e9378a4f07f 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -247,6 +247,13 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
if (!(flags & KASAN_VMALLOC_VM_ALLOC))
return (void *)start;

+ /*
+ * Don't tag executable memory.
+ * The kernel doesn't tolerate having the PC register tagged.
+ */
+ if (!(flags & KASAN_VMALLOC_PROT_NORMAL))
+ return (void *)start;
+
tag = kasan_random_tag();
start = set_tag(start, tag);

diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index b958babc8fed..7272e248db87 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -488,6 +488,13 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
if (!is_vmalloc_or_module_addr(start))
return (void *)start;

+ /*
+ * Don't tag executable memory.
+ * The kernel doesn't tolerate having the PC register tagged.
+ */
+ if (!(flags & KASAN_VMALLOC_PROT_NORMAL))
+ return (void *)start;
+
start = set_tag(start, kasan_random_tag());
kasan_unpoison(start, size, false);
return (void *)start;
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6dcdf815576b..375b53fd939f 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2221,7 +2221,7 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node)
* With hardware tag-based KASAN, marking is skipped for
* non-VM_ALLOC mappings, see __kasan_unpoison_vmalloc().
*/
- mem = kasan_unpoison_vmalloc(mem, size, KASAN_VMALLOC_NONE);
+ mem = kasan_unpoison_vmalloc(mem, size, KASAN_VMALLOC_PROT_NORMAL);

return mem;
}
@@ -2460,7 +2460,7 @@ static struct vm_struct *__get_vm_area_node(unsigned long size,
*/
if (!(flags & VM_ALLOC))
area->addr = kasan_unpoison_vmalloc(area->addr, requested_size,
- KASAN_VMALLOC_NONE);
+ KASAN_VMALLOC_PROT_NORMAL);

return area;
}
@@ -3071,7 +3071,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
{
struct vm_struct *area;
void *ret;
- kasan_vmalloc_flags_t kasan_flags;
+ kasan_vmalloc_flags_t kasan_flags = KASAN_VMALLOC_NONE;
unsigned long real_size = size;
unsigned long real_align = align;
unsigned int shift = PAGE_SHIFT;
@@ -3124,21 +3124,28 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
goto fail;
}

- /* Prepare arguments for __vmalloc_area_node(). */
- if (kasan_hw_tags_enabled() &&
- pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
- /*
- * Modify protection bits to allow tagging.
- * This must be done before mapping in __vmalloc_area_node().
- */
- prot = arch_vmap_pgprot_tagged(prot);
+ /*
+ * Prepare arguments for __vmalloc_area_node() and
+ * kasan_unpoison_vmalloc().
+ */
+ if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
+ if (kasan_hw_tags_enabled()) {
+ /*
+ * Modify protection bits to allow tagging.
+ * This must be done before mapping.
+ */
+ prot = arch_vmap_pgprot_tagged(prot);

- /*
- * Skip page_alloc poisoning and zeroing for physical pages
- * backing VM_ALLOC mapping. Memory is instead poisoned and
- * zeroed by kasan_unpoison_vmalloc().
- */
- gfp_mask |= __GFP_SKIP_KASAN_UNPOISON | __GFP_SKIP_ZERO;
+ /*
+ * Skip page_alloc poisoning and zeroing for physical
+ * pages backing VM_ALLOC mapping. Memory is instead
+ * poisoned and zeroed by kasan_unpoison_vmalloc().
+ */
+ gfp_mask |= __GFP_SKIP_KASAN_UNPOISON | __GFP_SKIP_ZERO;
+ }
+
+ /* Take note that the mapping is PAGE_KERNEL. */
+ kasan_flags |= KASAN_VMALLOC_PROT_NORMAL;
}

/* Allocate physical pages and map them into vmalloc space. */
@@ -3152,10 +3159,13 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
* (except for the should_skip_init() check) to make sure that memory
* is initialized under the same conditions regardless of the enabled
* KASAN mode.
+ * Tag-based KASAN modes only assign tags to normal non-executable
+ * allocations, see __kasan_unpoison_vmalloc().
*/
- kasan_flags = KASAN_VMALLOC_VM_ALLOC;
+ kasan_flags |= KASAN_VMALLOC_VM_ALLOC;
if (!want_init_on_free() && want_init_on_alloc(gfp_mask))
kasan_flags |= KASAN_VMALLOC_INIT;
+ /* KASAN_VMALLOC_PROT_NORMAL already set if required. */
area->addr = kasan_unpoison_vmalloc(area->addr, real_size, kasan_flags);

/*
@@ -3861,8 +3871,7 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
*/
for (area = 0; area < nr_vms; area++)
vms[area]->addr = kasan_unpoison_vmalloc(vms[area]->addr,
- vms[area]->size,
- KASAN_VMALLOC_NONE);
+ vms[area]->size, KASAN_VMALLOC_PROT_NORMAL);

kfree(vas);
return vms;
--
2.25.1

2022-01-24 19:50:30

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 34/39] kasan: clean up feature flags for HW_TAGS mode

From: Andrey Konovalov <[email protected]>

- Untie kasan_init_hw_tags() code from the default values of
kasan_arg_mode and kasan_arg_stacktrace.

- Move static_branch_enable(&kasan_flag_enabled) to the end of
kasan_init_hw_tags().

- Remove excessive comments in kasan_arg_mode switch.

- Add new comments.

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v4->v5:
- Add this patch.
---
mm/kasan/hw_tags.c | 38 +++++++++++++++++++++-----------------
mm/kasan/kasan.h | 2 +-
2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 6509809dd5d8..6a3146d1ccc5 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -42,16 +42,22 @@ static enum kasan_arg kasan_arg __ro_after_init;
static enum kasan_arg_mode kasan_arg_mode __ro_after_init;
static enum kasan_arg_stacktrace kasan_arg_stacktrace __initdata;

-/* Whether KASAN is enabled at all. */
+/*
+ * Whether KASAN is enabled at all.
+ * The value remains false until KASAN is initialized by kasan_init_hw_tags().
+ */
DEFINE_STATIC_KEY_FALSE(kasan_flag_enabled);
EXPORT_SYMBOL(kasan_flag_enabled);

-/* Whether the selected mode is synchronous/asynchronous/asymmetric.*/
+/*
+ * Whether the selected mode is synchronous, asynchronous, or asymmetric.
+ * Defaults to KASAN_MODE_SYNC.
+ */
enum kasan_mode kasan_mode __ro_after_init;
EXPORT_SYMBOL_GPL(kasan_mode);

/* Whether to collect alloc/free stack traces. */
-DEFINE_STATIC_KEY_FALSE(kasan_flag_stacktrace);
+DEFINE_STATIC_KEY_TRUE(kasan_flag_stacktrace);

/* kasan=off/on */
static int __init early_kasan_flag(char *arg)
@@ -127,7 +133,11 @@ void kasan_init_hw_tags_cpu(void)
* as this function is only called for MTE-capable hardware.
*/

- /* If KASAN is disabled via command line, don't initialize it. */
+ /*
+ * If KASAN is disabled via command line, don't initialize it.
+ * When this function is called, kasan_flag_enabled is not yet
+ * set by kasan_init_hw_tags(). Thus, check kasan_arg instead.
+ */
if (kasan_arg == KASAN_ARG_OFF)
return;

@@ -154,42 +164,36 @@ void __init kasan_init_hw_tags(void)
if (kasan_arg == KASAN_ARG_OFF)
return;

- /* Enable KASAN. */
- static_branch_enable(&kasan_flag_enabled);
-
switch (kasan_arg_mode) {
case KASAN_ARG_MODE_DEFAULT:
- /*
- * Default to sync mode.
- */
- fallthrough;
+ /* Default is specified by kasan_mode definition. */
+ break;
case KASAN_ARG_MODE_SYNC:
- /* Sync mode enabled. */
kasan_mode = KASAN_MODE_SYNC;
break;
case KASAN_ARG_MODE_ASYNC:
- /* Async mode enabled. */
kasan_mode = KASAN_MODE_ASYNC;
break;
case KASAN_ARG_MODE_ASYMM:
- /* Asymm mode enabled. */
kasan_mode = KASAN_MODE_ASYMM;
break;
}

switch (kasan_arg_stacktrace) {
case KASAN_ARG_STACKTRACE_DEFAULT:
- /* Default to enabling stack trace collection. */
- static_branch_enable(&kasan_flag_stacktrace);
+ /* Default is specified by kasan_flag_stacktrace definition. */
break;
case KASAN_ARG_STACKTRACE_OFF:
- /* Do nothing, kasan_flag_stacktrace keeps its default value. */
+ static_branch_disable(&kasan_flag_stacktrace);
break;
case KASAN_ARG_STACKTRACE_ON:
static_branch_enable(&kasan_flag_stacktrace);
break;
}

+ /* KASAN is now initialized, enable it. */
+ static_branch_enable(&kasan_flag_enabled);
+
pr_info("KernelAddressSanitizer initialized (hw-tags, mode=%s, stacktrace=%s)\n",
kasan_mode_info(),
kasan_stack_collection_enabled() ? "on" : "off");
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 020f3e57a03f..efda13a9ce6a 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -12,7 +12,7 @@
#include <linux/static_key.h>
#include "../slab.h"

-DECLARE_STATIC_KEY_FALSE(kasan_flag_stacktrace);
+DECLARE_STATIC_KEY_TRUE(kasan_flag_stacktrace);

enum kasan_mode {
KASAN_MODE_SYNC,
--
2.25.1

2022-01-24 19:50:30

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 30/39] kasan, vmalloc: add vmalloc tagging for HW_TAGS

From: Andrey Konovalov <[email protected]>

Add vmalloc tagging support to HW_TAGS KASAN.

The key difference between HW_TAGS and the other two KASAN modes
when it comes to vmalloc: HW_TAGS KASAN can only assign tags to
physical memory. The other two modes have shadow memory covering
every mapped virtual memory region.

Make __kasan_unpoison_vmalloc() for HW_TAGS KASAN:

- Skip non-VM_ALLOC mappings as HW_TAGS KASAN can only tag a single
mapping of normal physical memory; see the comment in the function.
- Generate a random tag, tag the returned pointer and the allocation,
and initialize the allocation at the same time.
- Propagate the tag into the page structs to allow accesses through
page_address(vmalloc_to_page()).

The rest of vmalloc-related KASAN hooks are not needed:

- The shadow-related ones are fully skipped.
- __kasan_poison_vmalloc() is kept as a no-op with a comment.

Unpoisoning and zeroing of physical pages that back vmalloc()
allocations are skipped via __GFP_SKIP_KASAN_UNPOISON and
__GFP_SKIP_ZERO: __kasan_unpoison_vmalloc() does that instead.

Enabling CONFIG_KASAN_VMALLOC with HW_TAGS is not yet allowed.
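
As a sketch of the resulting behavior (illustrative only), accesses
through the vmalloc mapping and through the linear mapping of a backing
page stay consistent, since the tag is propagated into the page structs:

    char *p = vmalloc(PAGE_SIZE);       /* pointer and memory get a random tag */
    char *lin = page_address(vmalloc_to_page(p));

    p[0] = 0;                           /* access through the vmalloc mapping */
    lin[0] = 0;                         /* access through the linear mapping */

    vfree(p);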

Signed-off-by: Andrey Konovalov <[email protected]>
Co-developed-by: Vincenzo Frascino <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>

---

Changes v3->v4:
- Fix comment style in __kasan_unpoison_vmalloc().
- Set __GFP_SKIP_KASAN_UNPOISON and __GFP_SKIP_ZERO flags instead of
resetting.
- Move setting KASAN GFP flags to __vmalloc_node_range() and do it
only for normal non-executable mapping when HW_TAGS KASAN is enabled.

Changes v2->v3:
- Switch kasan_unpoison_vmalloc() to using a single flags argument.
- Update kasan_unpoison_vmalloc() arguments in kernel/scs.c.
- Move allowing enabling KASAN_VMALLOC with SW_TAGS into a separate
patch.
- Minor comments fixes.
- Update patch description.

Changes v1->v2:
- Allow enabling CONFIG_KASAN_VMALLOC with HW_TAGS in this patch.
- Move memory init for page_alloc pages backing vmalloc() into
kasan_unpoison_vmalloc().
---
include/linux/kasan.h | 36 +++++++++++++++--
kernel/scs.c | 4 +-
mm/kasan/hw_tags.c | 92 +++++++++++++++++++++++++++++++++++++++++++
mm/kasan/shadow.c | 10 ++++-
mm/vmalloc.c | 51 ++++++++++++++++++------
5 files changed, 175 insertions(+), 18 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 92c5dfa29a35..499f1573dba4 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -25,6 +25,12 @@ struct kunit_kasan_expectation {

#endif

+typedef unsigned int __bitwise kasan_vmalloc_flags_t;
+
+#define KASAN_VMALLOC_NONE 0x00u
+#define KASAN_VMALLOC_INIT 0x01u
+#define KASAN_VMALLOC_VM_ALLOC 0x02u
+
#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)

#include <linux/pgtable.h>
@@ -418,18 +424,39 @@ static inline void kasan_init_hw_tags(void) { }

#ifdef CONFIG_KASAN_VMALLOC

+#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
+
void kasan_populate_early_vm_area_shadow(void *start, unsigned long size);
int kasan_populate_vmalloc(unsigned long addr, unsigned long size);
void kasan_release_vmalloc(unsigned long start, unsigned long end,
unsigned long free_region_start,
unsigned long free_region_end);

-void *__kasan_unpoison_vmalloc(const void *start, unsigned long size);
+#else /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
+
+static inline void kasan_populate_early_vm_area_shadow(void *start,
+ unsigned long size)
+{ }
+static inline int kasan_populate_vmalloc(unsigned long start,
+ unsigned long size)
+{
+ return 0;
+}
+static inline void kasan_release_vmalloc(unsigned long start,
+ unsigned long end,
+ unsigned long free_region_start,
+ unsigned long free_region_end) { }
+
+#endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
+
+void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
+ kasan_vmalloc_flags_t flags);
static __always_inline void *kasan_unpoison_vmalloc(const void *start,
- unsigned long size)
+ unsigned long size,
+ kasan_vmalloc_flags_t flags)
{
if (kasan_enabled())
- return __kasan_unpoison_vmalloc(start, size);
+ return __kasan_unpoison_vmalloc(start, size, flags);
return (void *)start;
}

@@ -456,7 +483,8 @@ static inline void kasan_release_vmalloc(unsigned long start,
unsigned long free_region_end) { }

static inline void *kasan_unpoison_vmalloc(const void *start,
- unsigned long size)
+ unsigned long size,
+ kasan_vmalloc_flags_t flags)
{
return (void *)start;
}
diff --git a/kernel/scs.c b/kernel/scs.c
index 579841be8864..b83bc9251f99 100644
--- a/kernel/scs.c
+++ b/kernel/scs.c
@@ -32,7 +32,7 @@ static void *__scs_alloc(int node)
for (i = 0; i < NR_CACHED_SCS; i++) {
s = this_cpu_xchg(scs_cache[i], NULL);
if (s) {
- kasan_unpoison_vmalloc(s, SCS_SIZE);
+ kasan_unpoison_vmalloc(s, SCS_SIZE, KASAN_VMALLOC_NONE);
memset(s, 0, SCS_SIZE);
return s;
}
@@ -78,7 +78,7 @@ void scs_free(void *s)
if (this_cpu_cmpxchg(scs_cache[i], 0, s) == NULL)
return;

- kasan_unpoison_vmalloc(s, SCS_SIZE);
+ kasan_unpoison_vmalloc(s, SCS_SIZE, KASAN_VMALLOC_NONE);
vfree_atomic(s);
}

diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 76cf2b6229c7..21104fd51872 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -192,6 +192,98 @@ void __init kasan_init_hw_tags(void)
kasan_stack_collection_enabled() ? "on" : "off");
}

+#ifdef CONFIG_KASAN_VMALLOC
+
+static void unpoison_vmalloc_pages(const void *addr, u8 tag)
+{
+ struct vm_struct *area;
+ int i;
+
+ /*
+ * As hardware tag-based KASAN only tags VM_ALLOC vmalloc allocations
+ * (see the comment in __kasan_unpoison_vmalloc), all of the pages
+ * should belong to a single area.
+ */
+ area = find_vm_area((void *)addr);
+ if (WARN_ON(!area))
+ return;
+
+ for (i = 0; i < area->nr_pages; i++) {
+ struct page *page = area->pages[i];
+
+ page_kasan_tag_set(page, tag);
+ }
+}
+
+void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
+ kasan_vmalloc_flags_t flags)
+{
+ u8 tag;
+ unsigned long redzone_start, redzone_size;
+
+ if (!is_vmalloc_or_module_addr(start))
+ return (void *)start;
+
+ /*
+ * Skip unpoisoning and assigning a pointer tag for non-VM_ALLOC
+ * mappings as:
+ *
+ * 1. Unlike the software KASAN modes, hardware tag-based KASAN only
+ * supports tagging physical memory. Therefore, it can only tag a
+ * single mapping of normal physical pages.
+ * 2. Hardware tag-based KASAN can only tag memory mapped with special
+ * mapping protection bits, see arch_vmap_pgprot_tagged().
+ * As non-VM_ALLOC mappings can be mapped outside of vmalloc code,
+ * providing these bits would require tracking all non-VM_ALLOC
+ * mappers.
+ *
+ * Thus, for VM_ALLOC mappings, hardware tag-based KASAN only tags
+ * the first virtual mapping, which is created by vmalloc().
+ * Tagging the page_alloc memory backing that vmalloc() allocation is
+ * skipped, see ___GFP_SKIP_KASAN_UNPOISON.
+ *
+ * For non-VM_ALLOC allocations, page_alloc memory is tagged as usual.
+ */
+ if (!(flags & KASAN_VMALLOC_VM_ALLOC))
+ return (void *)start;
+
+ tag = kasan_random_tag();
+ start = set_tag(start, tag);
+
+ /* Unpoison and initialize memory up to size. */
+ kasan_unpoison(start, size, flags & KASAN_VMALLOC_INIT);
+
+ /*
+ * Explicitly poison and initialize the in-page vmalloc() redzone.
+ * Unlike software KASAN modes, hardware tag-based KASAN doesn't
+ * unpoison memory when populating shadow for vmalloc() space.
+ */
+ redzone_start = round_up((unsigned long)start + size,
+ KASAN_GRANULE_SIZE);
+ redzone_size = round_up(redzone_start, PAGE_SIZE) - redzone_start;
+ kasan_poison((void *)redzone_start, redzone_size, KASAN_TAG_INVALID,
+ flags & KASAN_VMALLOC_INIT);
+
+ /*
+ * Set per-page tag flags to allow accessing physical memory for the
+ * vmalloc() mapping through page_address(vmalloc_to_page()).
+ */
+ unpoison_vmalloc_pages(start, tag);
+
+ return (void *)start;
+}
+
+void __kasan_poison_vmalloc(const void *start, unsigned long size)
+{
+ /*
+ * No tagging here.
+ * The physical pages backing the vmalloc() allocation are poisoned
+ * through the usual page_alloc paths.
+ */
+}
+
+#endif
+
#if IS_ENABLED(CONFIG_KASAN_KUNIT_TEST)

void kasan_enable_tagging_sync(void)
diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index 5a866f6663fc..b958babc8fed 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -475,8 +475,16 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
}
}

-void *__kasan_unpoison_vmalloc(const void *start, unsigned long size)
+void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
+ kasan_vmalloc_flags_t flags)
{
+ /*
+ * Software KASAN modes unpoison both VM_ALLOC and non-VM_ALLOC
+ * mappings, so the KASAN_VMALLOC_VM_ALLOC flag is ignored.
+ * Software KASAN modes can't optimize zeroing memory by combining it
+ * with setting memory tags, so the KASAN_VMALLOC_INIT flag is ignored.
+ */
+
if (!is_vmalloc_or_module_addr(start))
return (void *)start;

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b65adac1cd80..6dcdf815576b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2216,8 +2216,12 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node)
return NULL;
}

- /* Mark the pages as accessible, now that they are mapped. */
- mem = kasan_unpoison_vmalloc(mem, size);
+ /*
+ * Mark the pages as accessible, now that they are mapped.
+ * With hardware tag-based KASAN, marking is skipped for
+ * non-VM_ALLOC mappings, see __kasan_unpoison_vmalloc().
+ */
+ mem = kasan_unpoison_vmalloc(mem, size, KASAN_VMALLOC_NONE);

return mem;
}
@@ -2451,9 +2455,12 @@ static struct vm_struct *__get_vm_area_node(unsigned long size,
* best-effort approach, as they can be mapped outside of vmalloc code.
* For VM_ALLOC mappings, the pages are marked as accessible after
* getting mapped in __vmalloc_node_range().
+ * With hardware tag-based KASAN, marking is skipped for
+ * non-VM_ALLOC mappings, see __kasan_unpoison_vmalloc().
*/
if (!(flags & VM_ALLOC))
- area->addr = kasan_unpoison_vmalloc(area->addr, requested_size);
+ area->addr = kasan_unpoison_vmalloc(area->addr, requested_size,
+ KASAN_VMALLOC_NONE);

return area;
}
@@ -3064,6 +3071,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
{
struct vm_struct *area;
void *ret;
+ kasan_vmalloc_flags_t kasan_flags;
unsigned long real_size = size;
unsigned long real_align = align;
unsigned int shift = PAGE_SHIFT;
@@ -3116,21 +3124,39 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
goto fail;
}

- /*
- * Modify protection bits to allow tagging.
- * This must be done before mapping by __vmalloc_area_node().
- */
+ /* Prepare arguments for __vmalloc_area_node(). */
if (kasan_hw_tags_enabled() &&
- pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
+ pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) {
+ /*
+ * Modify protection bits to allow tagging.
+ * This must be done before mapping in __vmalloc_area_node().
+ */
prot = arch_vmap_pgprot_tagged(prot);

+ /*
+ * Skip page_alloc poisoning and zeroing for physical pages
+ * backing VM_ALLOC mapping. Memory is instead poisoned and
+ * zeroed by kasan_unpoison_vmalloc().
+ */
+ gfp_mask |= __GFP_SKIP_KASAN_UNPOISON | __GFP_SKIP_ZERO;
+ }
+
/* Allocate physical pages and map them into vmalloc space. */
ret = __vmalloc_area_node(area, gfp_mask, prot, shift, node);
if (!ret)
goto fail;

- /* Mark the pages as accessible, now that they are mapped. */
- area->addr = kasan_unpoison_vmalloc(area->addr, real_size);
+ /*
+ * Mark the pages as accessible, now that they are mapped.
+ * The init condition should match the one in post_alloc_hook()
+ * (except for the should_skip_init() check) to make sure that memory
+ * is initialized under the same conditions regardless of the enabled
+ * KASAN mode.
+ */
+ kasan_flags = KASAN_VMALLOC_VM_ALLOC;
+ if (!want_init_on_free() && want_init_on_alloc(gfp_mask))
+ kasan_flags |= KASAN_VMALLOC_INIT;
+ area->addr = kasan_unpoison_vmalloc(area->addr, real_size, kasan_flags);

/*
* In this function, newly allocated vm_struct has VM_UNINITIALIZED
@@ -3830,10 +3856,13 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
/*
* Mark allocated areas as accessible. Do it now as a best-effort
* approach, as they can be mapped outside of vmalloc code.
+ * With hardware tag-based KASAN, marking is skipped for
+ * non-VM_ALLOC mappings, see __kasan_unpoison_vmalloc().
*/
for (area = 0; area < nr_vms; area++)
vms[area]->addr = kasan_unpoison_vmalloc(vms[area]->addr,
- vms[area]->size);
+ vms[area]->size,
+ KASAN_VMALLOC_NONE);

kfree(vas);
return vms;
--
2.25.1

2022-01-24 19:50:31

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 29/39] kasan, page_alloc: allow skipping memory init for HW_TAGS

From: Andrey Konovalov <[email protected]>

Add a new GFP flag __GFP_SKIP_ZERO that allows skipping memory
initialization. The flag is only effective with HW_TAGS KASAN.

This flag will be used by vmalloc code for page_alloc allocations
backing vmalloc() mappings in a following patch. The reason to skip
memory initialization for these pages in page_alloc is that vmalloc
code will be initializing them instead.

With the current implementation, when __GFP_SKIP_ZERO is provided,
__GFP_ZEROTAGS is ignored. This doesn't matter, as these two flags are
never provided at the same time. However, if this is changed in the
future, this particular implementation detail can be changed as well.
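
For illustration, the intended use by vmalloc code in a following patch
looks roughly like this (a sketch with a made-up helper name, not the
exact call site):

    static struct page *alloc_vmalloc_backing_page(gfp_t gfp_mask)
    {
        /*
         * Skip zero-initialization in page_alloc: the memory will be
         * initialized by kasan_unpoison_vmalloc() instead.
         */
        gfp_mask |= __GFP_SKIP_ZERO;
        return alloc_pages(gfp_mask, 0);
    }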

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v5->v6:
- Drop unnecessary explicit checks for software KASAN modes from
should_skip_init().

Changes v4->v5:
- Cosmetic changes to __def_gfpflag_names_kasan and __GFP_BITS_SHIFT.

Changes v3->v4:
- Only define __GFP_SKIP_ZERO when CONFIG_KASAN_HW_TAGS is enabled.
- Add __GFP_SKIP_ZERO to include/trace/events/mmflags.h.
- Use proper kasan_hw_tags_enabled() check instead of
IS_ENABLED(CONFIG_KASAN_HW_TAGS). Also add explicit checks for
software modes.

Changes v2->v3:
- Update patch description.

Changes v1->v2:
- Add this patch.
---
include/linux/gfp.h | 18 +++++++++++-------
include/trace/events/mmflags.h | 1 +
mm/page_alloc.c | 13 ++++++++++++-
3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 7303d1064460..7797c915ce54 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -55,14 +55,16 @@ struct vm_area_struct;
#define ___GFP_ACCOUNT 0x400000u
#define ___GFP_ZEROTAGS 0x800000u
#ifdef CONFIG_KASAN_HW_TAGS
-#define ___GFP_SKIP_KASAN_UNPOISON 0x1000000u
-#define ___GFP_SKIP_KASAN_POISON 0x2000000u
+#define ___GFP_SKIP_ZERO 0x1000000u
+#define ___GFP_SKIP_KASAN_UNPOISON 0x2000000u
+#define ___GFP_SKIP_KASAN_POISON 0x4000000u
#else
+#define ___GFP_SKIP_ZERO 0
#define ___GFP_SKIP_KASAN_UNPOISON 0
#define ___GFP_SKIP_KASAN_POISON 0
#endif
#ifdef CONFIG_LOCKDEP
-#define ___GFP_NOLOCKDEP 0x4000000u
+#define ___GFP_NOLOCKDEP 0x8000000u
#else
#define ___GFP_NOLOCKDEP 0
#endif
@@ -239,9 +241,10 @@ struct vm_area_struct;
* %__GFP_ZERO returns a zeroed page on success.
*
* %__GFP_ZEROTAGS zeroes memory tags at allocation time if the memory itself
- * is being zeroed (either via __GFP_ZERO or via init_on_alloc). This flag is
- * intended for optimization: setting memory tags at the same time as zeroing
- * memory has minimal additional performace impact.
+ * is being zeroed (either via __GFP_ZERO or via init_on_alloc, provided that
+ * __GFP_SKIP_ZERO is not set). This flag is intended for optimization: setting
+ * memory tags at the same time as zeroing memory has minimal additional
+ * performace impact.
*
* %__GFP_SKIP_KASAN_UNPOISON makes KASAN skip unpoisoning on page allocation.
* Only effective in HW_TAGS mode.
@@ -253,6 +256,7 @@ struct vm_area_struct;
#define __GFP_COMP ((__force gfp_t)___GFP_COMP)
#define __GFP_ZERO ((__force gfp_t)___GFP_ZERO)
#define __GFP_ZEROTAGS ((__force gfp_t)___GFP_ZEROTAGS)
+#define __GFP_SKIP_ZERO ((__force gfp_t)___GFP_SKIP_ZERO)
#define __GFP_SKIP_KASAN_UNPOISON ((__force gfp_t)___GFP_SKIP_KASAN_UNPOISON)
#define __GFP_SKIP_KASAN_POISON ((__force gfp_t)___GFP_SKIP_KASAN_POISON)

@@ -261,7 +265,7 @@ struct vm_area_struct;

/* Room for N __GFP_FOO bits */
#define __GFP_BITS_SHIFT (24 + \
- 2 * IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
+ 3 * IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
IS_ENABLED(CONFIG_LOCKDEP))
#define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))

diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 134c45e62d91..6532119a6bf1 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -53,6 +53,7 @@

#ifdef CONFIG_KASAN_HW_TAGS
#define __def_gfpflag_names_kasan , \
+ {(unsigned long)__GFP_SKIP_ZERO, "__GFP_SKIP_ZERO"}, \
{(unsigned long)__GFP_SKIP_KASAN_POISON, "__GFP_SKIP_KASAN_POISON"}, \
{(unsigned long)__GFP_SKIP_KASAN_UNPOISON, "__GFP_SKIP_KASAN_UNPOISON"}
#else
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 94bfbc216ae9..368c6c5bf42a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2415,10 +2415,21 @@ static inline bool should_skip_kasan_unpoison(gfp_t flags, bool init_tags)
return init_tags || (flags & __GFP_SKIP_KASAN_UNPOISON);
}

+static inline bool should_skip_init(gfp_t flags)
+{
+ /* Don't skip, if hardware tag-based KASAN is not enabled. */
+ if (!kasan_hw_tags_enabled())
+ return false;
+
+ /* For hardware tag-based KASAN, skip if requested. */
+ return (flags & __GFP_SKIP_ZERO);
+}
+
inline void post_alloc_hook(struct page *page, unsigned int order,
gfp_t gfp_flags)
{
- bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags);
+ bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
+ !should_skip_init(gfp_flags);
bool init_tags = init && (gfp_flags & __GFP_ZEROTAGS);

set_page_private(page, 0);
--
2.25.1

2022-01-24 19:50:31

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 37/39] arm64: select KASAN_VMALLOC for SW/HW_TAGS modes

From: Andrey Konovalov <[email protected]>

Generic KASAN already selects KASAN_VMALLOC to allow VMAP_STACK to be
selected unconditionally, see commit acc3042d62cb9 ("arm64: Kconfig:
select KASAN_VMALLOC if KANSAN_GENERIC is enabled").

The same change is needed for SW_TAGS KASAN.

HW_TAGS KASAN does not require enabling KASAN_VMALLOC for VMAP_STACK;
they already work together as is. Still, selecting KASAN_VMALLOC makes
sense to keep vmalloc() always protected. In case any bugs in KASAN's
vmalloc() support are discovered, the kasan.vmalloc command line flag
can be used to disable vmalloc() checking.

Select KASAN_VMALLOC for all KASAN modes for arm64.

Signed-off-by: Andrey Konovalov <[email protected]>
Acked-by: Catalin Marinas <[email protected]>

---

Changes v2->v3:
- Update patch description.

Changes v1->v2:
- Split out this patch.
---
arch/arm64/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6978140edfa4..beefec5c7b90 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -205,7 +205,7 @@ config ARM64
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
- select KASAN_VMALLOC if KASAN_GENERIC
+ select KASAN_VMALLOC if KASAN
select MODULES_USE_ELF_RELA
select NEED_DMA_MAP_STATE
select NEED_SG_DMA_LENGTH
--
2.25.1

2022-01-24 19:50:31

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 38/39] kasan: documentation updates

From: Andrey Konovalov <[email protected]>

Update KASAN documentation:

- Bump Clang version requirement for HW_TAGS as ARM64_MTE depends on
AS_HAS_LSE_ATOMICS as of commit 2decad92f4731 ("arm64: mte: Ensure
TIF_MTE_ASYNC_FAULT is set atomically"), which requires Clang 12.
- Add description of the new kasan.vmalloc command line flag.
- Mention that SW_TAGS and HW_TAGS modes now support vmalloc tagging.
- Explicitly say that the "Shadow memory" section is only applicable
to software KASAN modes.
- Mention that shadow-based KASAN_VMALLOC is supported on arm64.

Signed-off-by: Andrey Konovalov <[email protected]>
---
Documentation/dev-tools/kasan.rst | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index 8089c559d339..7614a1fc30fa 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -30,7 +30,7 @@ Software tag-based KASAN mode is only supported in Clang.

The hardware KASAN mode (#3) relies on hardware to perform the checks but
still requires a compiler version that supports memory tagging instructions.
-This mode is supported in GCC 10+ and Clang 11+.
+This mode is supported in GCC 10+ and Clang 12+.

Both software KASAN modes work with SLUB and SLAB memory allocators,
while the hardware tag-based KASAN currently only supports SLUB.
@@ -206,6 +206,9 @@ additional boot parameters that allow disabling KASAN or controlling features:
Asymmetric mode: a bad access is detected synchronously on reads and
asynchronously on writes.

+- ``kasan.vmalloc=off`` or ``=on`` disables or enables tagging of vmalloc
+ allocations (default: ``on``).
+
- ``kasan.stacktrace=off`` or ``=on`` disables or enables alloc and free stack
traces collection (default: ``on``).

@@ -279,8 +282,8 @@ Software tag-based KASAN uses 0xFF as a match-all pointer tag (accesses through
pointers with the 0xFF pointer tag are not checked). The value 0xFE is currently
reserved to tag freed memory regions.

-Software tag-based KASAN currently only supports tagging of slab and page_alloc
-memory.
+Software tag-based KASAN currently only supports tagging of slab, page_alloc,
+and vmalloc memory.

Hardware tag-based KASAN
~~~~~~~~~~~~~~~~~~~~~~~~
@@ -303,8 +306,8 @@ Hardware tag-based KASAN uses 0xFF as a match-all pointer tag (accesses through
pointers with the 0xFF pointer tag are not checked). The value 0xFE is currently
reserved to tag freed memory regions.

-Hardware tag-based KASAN currently only supports tagging of slab and page_alloc
-memory.
+Hardware tag-based KASAN currently only supports tagging of slab, page_alloc,
+and VM_ALLOC-based vmalloc memory.

If the hardware does not support MTE (pre ARMv8.5), hardware tag-based KASAN
will not be enabled. In this case, all KASAN boot parameters are ignored.
@@ -319,6 +322,8 @@ checking gets disabled.
Shadow memory
-------------

+The contents of this section are only applicable to software KASAN modes.
+
The kernel maps memory in several different parts of the address space.
The range of kernel virtual addresses is large: there is not enough real
memory to support a real shadow region for every address that could be
@@ -349,7 +354,7 @@ CONFIG_KASAN_VMALLOC

With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the
cost of greater memory usage. Currently, this is supported on x86,
-riscv, s390, and powerpc.
+arm64, riscv, s390, and powerpc.

This works by hooking into vmalloc and vmap and dynamically
allocating real shadow memory to back the mappings.
--
2.25.1

2022-01-24 19:50:33

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 33/39] kasan: mark kasan_arg_stacktrace as __initdata

From: Andrey Konovalov <[email protected]>

As kasan_arg_stacktrace is only used in __init functions, mark it as
__initdata instead of __ro_after_init to allow it to be freed after boot.

The other enums for KASAN args are used in kasan_init_hw_tags_cpu(),
which is not marked as __init as a CPU can be hot-plugged after boot.
Clarify this in a comment.

Signed-off-by: Andrey Konovalov <[email protected]>
Suggested-by: Marco Elver <[email protected]>

---

Changes v1->v2:
- Add this patch.
---
mm/kasan/hw_tags.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 2e9378a4f07f..6509809dd5d8 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -40,7 +40,7 @@ enum kasan_arg_stacktrace {

static enum kasan_arg kasan_arg __ro_after_init;
static enum kasan_arg_mode kasan_arg_mode __ro_after_init;
-static enum kasan_arg_stacktrace kasan_arg_stacktrace __ro_after_init;
+static enum kasan_arg_stacktrace kasan_arg_stacktrace __initdata;

/* Whether KASAN is enabled at all. */
DEFINE_STATIC_KEY_FALSE(kasan_flag_enabled);
@@ -116,7 +116,10 @@ static inline const char *kasan_mode_info(void)
return "sync";
}

-/* kasan_init_hw_tags_cpu() is called for each CPU. */
+/*
+ * kasan_init_hw_tags_cpu() is called for each CPU.
+ * Not marked as __init as a CPU can be hot-plugged after boot.
+ */
void kasan_init_hw_tags_cpu(void)
{
/*
--
2.25.1

2022-01-24 19:50:34

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 36/39] kasan: allow enabling KASAN_VMALLOC and SW/HW_TAGS

From: Andrey Konovalov <[email protected]>

Allow enabling CONFIG_KASAN_VMALLOC with SW_TAGS and HW_TAGS KASAN
modes.

Also adjust CONFIG_KASAN_VMALLOC description:

- Mention HW_TAGS support.
- Remove unneeded internal details: they have no place in Kconfig
description and are already explained in the documentation.

Signed-off-by: Andrey Konovalov <[email protected]>
---
lib/Kconfig.kasan | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index 879757b6dd14..1f3e620188a2 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -178,17 +178,17 @@ config KASAN_TAGS_IDENTIFY
memory consumption.

config KASAN_VMALLOC
- bool "Back mappings in vmalloc space with real shadow memory"
- depends on KASAN_GENERIC && HAVE_ARCH_KASAN_VMALLOC
+ bool "Check accesses to vmalloc allocations"
+ depends on HAVE_ARCH_KASAN_VMALLOC
help
- By default, the shadow region for vmalloc space is the read-only
- zero page. This means that KASAN cannot detect errors involving
- vmalloc space.
-
- Enabling this option will hook in to vmap/vmalloc and back those
- mappings with real shadow memory allocated on demand. This allows
- for KASAN to detect more sorts of errors (and to support vmapped
- stacks), but at the cost of higher memory usage.
+ This mode makes KASAN check accesses to vmalloc allocations for
+ validity.
+
+ With software KASAN modes, checking is done for all types of vmalloc
+ allocations. Enabling this option leads to higher memory usage.
+
+ With hardware tag-based KASAN, only VM_ALLOC mappings are checked.
+ There is no additional memory usage.

config KASAN_KUNIT_TEST
tristate "KUnit-compatible tests of KASAN bug detection capabilities" if !KUNIT_ALL_TESTS
--
2.25.1

2022-01-24 19:50:34

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 35/39] kasan: add kasan.vmalloc command line flag

From: Andrey Konovalov <[email protected]>

Allow disabling vmalloc() tagging for HW_TAGS KASAN via a kasan.vmalloc
command line switch.

This is a fail-safe switch intended for production systems that enable
HW_TAGS KASAN. In case vmalloc() tagging ends up having an issue not
detected during testing but that manifests in production, kasan.vmalloc
allows turning vmalloc() tagging off while leaving page_alloc/slab
tagging on.
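
For example, a production system could boot with a (hypothetical) command
line like the following to keep slab/page_alloc tagging while turning
vmalloc() tagging off:

    kasan=on kasan.vmalloc=off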

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v4->v5:
- Use true as kasan_flag_vmalloc static key default.

Changes v1->v2:
- Mark kasan_arg_stacktrace as __initdata instead of __ro_after_init.
- Combine KASAN_ARG_VMALLOC_DEFAULT and KASAN_ARG_VMALLOC_ON switch
cases.
---
mm/kasan/hw_tags.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
mm/kasan/kasan.h | 6 ++++++
2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 6a3146d1ccc5..fad1887e54c0 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -32,6 +32,12 @@ enum kasan_arg_mode {
KASAN_ARG_MODE_ASYMM,
};

+enum kasan_arg_vmalloc {
+ KASAN_ARG_VMALLOC_DEFAULT,
+ KASAN_ARG_VMALLOC_OFF,
+ KASAN_ARG_VMALLOC_ON,
+};
+
enum kasan_arg_stacktrace {
KASAN_ARG_STACKTRACE_DEFAULT,
KASAN_ARG_STACKTRACE_OFF,
@@ -40,6 +46,7 @@ enum kasan_arg_stacktrace {

static enum kasan_arg kasan_arg __ro_after_init;
static enum kasan_arg_mode kasan_arg_mode __ro_after_init;
+static enum kasan_arg_vmalloc kasan_arg_vmalloc __initdata;
static enum kasan_arg_stacktrace kasan_arg_stacktrace __initdata;

/*
@@ -56,6 +63,9 @@ EXPORT_SYMBOL(kasan_flag_enabled);
enum kasan_mode kasan_mode __ro_after_init;
EXPORT_SYMBOL_GPL(kasan_mode);

+/* Whether to enable vmalloc tagging. */
+DEFINE_STATIC_KEY_TRUE(kasan_flag_vmalloc);
+
/* Whether to collect alloc/free stack traces. */
DEFINE_STATIC_KEY_TRUE(kasan_flag_stacktrace);

@@ -95,6 +105,23 @@ static int __init early_kasan_mode(char *arg)
}
early_param("kasan.mode", early_kasan_mode);

+/* kasan.vmalloc=off/on */
+static int __init early_kasan_flag_vmalloc(char *arg)
+{
+ if (!arg)
+ return -EINVAL;
+
+ if (!strcmp(arg, "off"))
+ kasan_arg_vmalloc = KASAN_ARG_VMALLOC_OFF;
+ else if (!strcmp(arg, "on"))
+ kasan_arg_vmalloc = KASAN_ARG_VMALLOC_ON;
+ else
+ return -EINVAL;
+
+ return 0;
+}
+early_param("kasan.vmalloc", early_kasan_flag_vmalloc);
+
/* kasan.stacktrace=off/on */
static int __init early_kasan_flag_stacktrace(char *arg)
{
@@ -179,6 +206,18 @@ void __init kasan_init_hw_tags(void)
break;
}

+ switch (kasan_arg_vmalloc) {
+ case KASAN_ARG_VMALLOC_DEFAULT:
+ /* Default is specified by kasan_flag_vmalloc definition. */
+ break;
+ case KASAN_ARG_VMALLOC_OFF:
+ static_branch_disable(&kasan_flag_vmalloc);
+ break;
+ case KASAN_ARG_VMALLOC_ON:
+ static_branch_enable(&kasan_flag_vmalloc);
+ break;
+ }
+
switch (kasan_arg_stacktrace) {
case KASAN_ARG_STACKTRACE_DEFAULT:
/* Default is specified by kasan_flag_stacktrace definition. */
@@ -194,8 +233,9 @@ void __init kasan_init_hw_tags(void)
/* KASAN is now initialized, enable it. */
static_branch_enable(&kasan_flag_enabled);

- pr_info("KernelAddressSanitizer initialized (hw-tags, mode=%s, stacktrace=%s)\n",
+ pr_info("KernelAddressSanitizer initialized (hw-tags, mode=%s, vmalloc=%s, stacktrace=%s)\n",
kasan_mode_info(),
+ kasan_vmalloc_enabled() ? "on" : "off",
kasan_stack_collection_enabled() ? "on" : "off");
}

@@ -228,6 +268,9 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
u8 tag;
unsigned long redzone_start, redzone_size;

+ if (!kasan_vmalloc_enabled())
+ return (void *)start;
+
if (!is_vmalloc_or_module_addr(start))
return (void *)start;

diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index efda13a9ce6a..4d67408e8407 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -12,6 +12,7 @@
#include <linux/static_key.h>
#include "../slab.h"

+DECLARE_STATIC_KEY_TRUE(kasan_flag_vmalloc);
DECLARE_STATIC_KEY_TRUE(kasan_flag_stacktrace);

enum kasan_mode {
@@ -22,6 +23,11 @@ enum kasan_mode {

extern enum kasan_mode kasan_mode __ro_after_init;

+static inline bool kasan_vmalloc_enabled(void)
+{
+ return static_branch_likely(&kasan_flag_vmalloc);
+}
+
static inline bool kasan_stack_collection_enabled(void)
{
return static_branch_unlikely(&kasan_flag_stacktrace);
--
2.25.1

2022-01-24 19:50:34

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 32/39] kasan, arm64: don't tag executable vmalloc allocations

From: Andrey Konovalov <[email protected]>

Besides asking vmalloc memory to be executable via the prot argument
of __vmalloc_node_range() (see the previous patch), the kernel can skip
that bit and instead mark memory as executable via set_memory_x().

Once tag-based KASAN modes start tagging vmalloc allocations, executing
code from such allocations will lead to the PC register getting a tag,
which is not tolerated by the kernel.

Generic kernel code typically allocates memory via module_alloc() if
it intends to mark memory as executable. (On arm64 module_alloc()
uses __vmalloc_node_range() without setting the executable bit).

Thus, reset pointer tags of pointers returned from module_alloc().

However, on arm64 there's an exception: the eBPF subsystem. Instead of
using module_alloc(), it uses vmalloc() (via bpf_jit_alloc_exec())
to allocate its JIT region.

Thus, reset pointer tags of pointers returned from bpf_jit_alloc_exec().

Resetting tags for these pointers results in untagged pointers being
passed to set_memory_x(). This causes conflicts in arithmetic checks
in change_memory_common(), as the vm_struct->addr pointer returned by
find_vm_area() is tagged.

Reset pointer tag of find_vm_area(addr)->addr in change_memory_common().

Signed-off-by: Andrey Konovalov <[email protected]>
Acked-by: Catalin Marinas <[email protected]>

---

Changes v3->v4:
- Reset pointer tag in change_memory_common().

Changes v2->v3:
- Add this patch.
---
arch/arm64/kernel/module.c | 3 ++-
arch/arm64/mm/pageattr.c | 2 +-
arch/arm64/net/bpf_jit_comp.c | 3 ++-
3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index d3a1fa818348..f2d4bb14bfab 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -63,7 +63,8 @@ void *module_alloc(unsigned long size)
return NULL;
}

- return p;
+ /* Memory is intended to be executable, reset the pointer tag. */
+ return kasan_reset_tag(p);
}

enum aarch64_reloc_op {
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index a3bacd79507a..64e985eaa52d 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -85,7 +85,7 @@ static int change_memory_common(unsigned long addr, int numpages,
*/
area = find_vm_area((void *)addr);
if (!area ||
- end > (unsigned long)area->addr + area->size ||
+ end > (unsigned long)kasan_reset_tag(area->addr) + area->size ||
!(area->flags & VM_ALLOC))
return -EINVAL;

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index e96d4d87291f..2198af06ae6a 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1150,7 +1150,8 @@ u64 bpf_jit_alloc_exec_limit(void)

void *bpf_jit_alloc_exec(unsigned long size)
{
- return vmalloc(size);
+ /* Memory is intended to be executable, reset the pointer tag. */
+ return kasan_reset_tag(vmalloc(size));
}

void bpf_jit_free_exec(void *addr)
--
2.25.1

2022-01-24 19:50:37

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 39/39] kasan: improve vmalloc tests

From: Andrey Konovalov <[email protected]>

Update the existing vmalloc_oob() test to account for the specifics
of the tag-based modes. Also add a few new checks and comments.

Add new vmalloc-related tests:

- vmalloc_helpers_tags() to check that exported vmalloc helpers can
handle tagged pointers.
- vmap_tags() to check that SW_TAGS mode properly tags vmap() mappings.
- vm_map_ram_tags() to check that SW_TAGS mode properly tags
vm_map_ram() mappings.
- vmalloc_percpu() to check that SW_TAGS mode tags regions allocated
for __alloc_percpu(). The tagging of per-cpu mappings is best-effort;
proper tagging is tracked in [1].

[1] https://bugzilla.kernel.org/show_bug.cgi?id=215019

Signed-off-by: Andrey Konovalov <[email protected]>
---
lib/test_kasan.c | 189 +++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 183 insertions(+), 6 deletions(-)

diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index 847cdbefab46..ae7b2e703f1b 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -19,6 +19,7 @@
#include <linux/uaccess.h>
#include <linux/io.h>
#include <linux/vmalloc.h>
+#include <linux/set_memory.h>

#include <asm/page.h>

@@ -1049,21 +1050,181 @@ static void kmalloc_double_kzfree(struct kunit *test)
KUNIT_EXPECT_KASAN_FAIL(test, kfree_sensitive(ptr));
}

+static void vmalloc_helpers_tags(struct kunit *test)
+{
+ void *ptr;
+ int rv;
+
+ /* This test is intended for tag-based modes. */
+ KASAN_TEST_NEEDS_CONFIG_OFF(test, CONFIG_KASAN_GENERIC);
+
+ KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_VMALLOC);
+
+ ptr = vmalloc(PAGE_SIZE);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+
+ /* Check that the returned pointer is tagged. */
+ KUNIT_EXPECT_GE(test, (u8)get_tag(ptr), (u8)KASAN_TAG_MIN);
+ KUNIT_EXPECT_LT(test, (u8)get_tag(ptr), (u8)KASAN_TAG_KERNEL);
+
+ /* Make sure exported vmalloc helpers handle tagged pointers. */
+ KUNIT_ASSERT_TRUE(test, is_vmalloc_addr(ptr));
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, vmalloc_to_page(ptr));
+
+ /* Make sure vmalloc'ed memory permissions can be changed. */
+ rv = set_memory_ro((unsigned long)ptr, 1);
+ KUNIT_ASSERT_GE(test, rv, 0);
+ rv = set_memory_rw((unsigned long)ptr, 1);
+ KUNIT_ASSERT_GE(test, rv, 0);
+
+ vfree(ptr);
+}
+
static void vmalloc_oob(struct kunit *test)
{
- void *area;
+ char *v_ptr, *p_ptr;
+ struct page *page;
+ size_t size = PAGE_SIZE / 2 - KASAN_GRANULE_SIZE - 5;

KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_VMALLOC);

+ v_ptr = vmalloc(size);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, v_ptr);
+
/*
- * We have to be careful not to hit the guard page.
+ * We have to be careful not to hit the guard page in vmalloc tests.
* The MMU will catch that and crash us.
*/
- area = vmalloc(3000);
- KUNIT_ASSERT_NOT_ERR_OR_NULL(test, area);

- KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)area)[3100]);
- vfree(area);
+ /* Make sure in-bounds accesses are valid. */
+ v_ptr[0] = 0;
+ v_ptr[size - 1] = 0;
+
+ /*
+ * An unaligned access past the requested vmalloc size.
+ * Only generic KASAN can precisely detect these.
+ */
+ if (IS_ENABLED(CONFIG_KASAN_GENERIC))
+ KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)v_ptr)[size]);
+
+ /* An aligned access into the first out-of-bounds granule. */
+ KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)v_ptr)[size + 5]);
+
+ /* Check that in-bounds accesses to the physical page are valid. */
+ page = vmalloc_to_page(v_ptr);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, page);
+ p_ptr = page_address(page);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, p_ptr);
+ p_ptr[0] = 0;
+
+ vfree(v_ptr);
+
+ /*
+ * We can't check for use-after-unmap bugs in this nor in the following
+ * vmalloc tests, as the page might be fully unmapped and accessing it
+ * will crash the kernel.
+ */
+}
+
+static void vmap_tags(struct kunit *test)
+{
+ char *p_ptr, *v_ptr;
+ struct page *p_page, *v_page;
+ size_t order = 1;
+
+ /*
+ * This test is specifically crafted for the software tag-based mode,
+ * the only tag-based mode that poisons vmap mappings.
+ */
+ KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_SW_TAGS);
+
+ KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_VMALLOC);
+
+ p_page = alloc_pages(GFP_KERNEL, order);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, p_page);
+ p_ptr = page_address(p_page);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, p_ptr);
+
+ v_ptr = vmap(&p_page, 1 << order, VM_MAP, PAGE_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, v_ptr);
+
+ /*
+ * We can't check for out-of-bounds bugs in this nor in the following
+ * vmalloc tests, as allocations have page granularity and accessing
+ * the guard page will crash the kernel.
+ */
+
+ KUNIT_EXPECT_GE(test, (u8)get_tag(v_ptr), (u8)KASAN_TAG_MIN);
+ KUNIT_EXPECT_LT(test, (u8)get_tag(v_ptr), (u8)KASAN_TAG_KERNEL);
+
+ /* Make sure that in-bounds accesses through both pointers work. */
+ *p_ptr = 0;
+ *v_ptr = 0;
+
+ /* Make sure vmalloc_to_page() correctly recovers the page pointer. */
+ v_page = vmalloc_to_page(v_ptr);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, v_page);
+ KUNIT_EXPECT_PTR_EQ(test, p_page, v_page);
+
+ vunmap(v_ptr);
+ free_pages((unsigned long)p_ptr, order);
+}
+
+static void vm_map_ram_tags(struct kunit *test)
+{
+ char *p_ptr, *v_ptr;
+ struct page *page;
+ size_t order = 1;
+
+ /*
+ * This test is specifically crafted for the software tag-based mode,
+ * the only tag-based mode that poisons vm_map_ram mappings.
+ */
+ KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_SW_TAGS);
+
+ page = alloc_pages(GFP_KERNEL, order);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, page);
+ p_ptr = page_address(page);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, p_ptr);
+
+ v_ptr = vm_map_ram(&page, 1 << order, -1);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, v_ptr);
+
+ KUNIT_EXPECT_GE(test, (u8)get_tag(v_ptr), (u8)KASAN_TAG_MIN);
+ KUNIT_EXPECT_LT(test, (u8)get_tag(v_ptr), (u8)KASAN_TAG_KERNEL);
+
+ /* Make sure that in-bounds accesses through both pointers work. */
+ *p_ptr = 0;
+ *v_ptr = 0;
+
+ vm_unmap_ram(v_ptr, 1 << order);
+ free_pages((unsigned long)p_ptr, order);
+}
+
+static void vmalloc_percpu(struct kunit *test)
+{
+ char __percpu *ptr;
+ int cpu;
+
+ /*
+ * This test is specifically crafted for the software tag-based mode,
+ * the only tag-based mode that poisons percpu mappings.
+ */
+ KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_SW_TAGS);
+
+ ptr = __alloc_percpu(PAGE_SIZE, PAGE_SIZE);
+
+ for_each_possible_cpu(cpu) {
+ char *c_ptr = per_cpu_ptr(ptr, cpu);
+
+ KUNIT_EXPECT_GE(test, (u8)get_tag(c_ptr), (u8)KASAN_TAG_MIN);
+ KUNIT_EXPECT_LT(test, (u8)get_tag(c_ptr), (u8)KASAN_TAG_KERNEL);
+
+ /* Make sure that in-bounds accesses don't crash the kernel. */
+ *c_ptr = 0;
+ }
+
+ free_percpu(ptr);
}

/*
@@ -1097,6 +1258,18 @@ static void match_all_not_assigned(struct kunit *test)
KUNIT_EXPECT_LT(test, (u8)get_tag(ptr), (u8)KASAN_TAG_KERNEL);
free_pages((unsigned long)ptr, order);
}
+
+ if (!IS_ENABLED(CONFIG_KASAN_VMALLOC))
+ return;
+
+ for (i = 0; i < 256; i++) {
+ size = (get_random_int() % 1024) + 1;
+ ptr = vmalloc(size);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+ KUNIT_EXPECT_GE(test, (u8)get_tag(ptr), (u8)KASAN_TAG_MIN);
+ KUNIT_EXPECT_LT(test, (u8)get_tag(ptr), (u8)KASAN_TAG_KERNEL);
+ vfree(ptr);
+ }
}

/* Check that 0xff works as a match-all pointer tag for tag-based modes. */
@@ -1202,7 +1375,11 @@ static struct kunit_case kasan_kunit_test_cases[] = {
KUNIT_CASE(kasan_bitops_generic),
KUNIT_CASE(kasan_bitops_tags),
KUNIT_CASE(kmalloc_double_kzfree),
+ KUNIT_CASE(vmalloc_helpers_tags),
KUNIT_CASE(vmalloc_oob),
+ KUNIT_CASE(vmap_tags),
+ KUNIT_CASE(vm_map_ram_tags),
+ KUNIT_CASE(vmalloc_percpu),
KUNIT_CASE(match_all_not_assigned),
KUNIT_CASE(match_all_ptr_tag),
KUNIT_CASE(match_all_mem_tag),
--
2.25.1

2022-01-24 19:50:34

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 28/39] kasan, page_alloc: allow skipping unpoisoning for HW_TAGS

From: Andrey Konovalov <[email protected]>

Add a new GFP flag __GFP_SKIP_KASAN_UNPOISON that allows skipping KASAN
unpoisoning for page_alloc allocations. The flag is only effective with
HW_TAGS KASAN.

This flag will be used by the vmalloc code for the page_alloc allocations
backing vmalloc() mappings in a following patch. The reason to skip
KASAN unpoisoning for these pages in page_alloc is that the vmalloc
code will be unpoisoning them instead.

Also reword the comment for __GFP_SKIP_KASAN_POISON.

Signed-off-by: Andrey Konovalov <[email protected]>

---

Changes v4->v5:
- Cosmetic changes to __def_gfpflag_names_kasan and __GFP_BITS_SHIFT.

Changes v3->v4:
- Only define __GFP_SKIP_KASAN_POISON when CONFIG_KASAN_HW_TAGS is
enabled.

Changes v2->v3:
- Update patch description.

Signed-off-by: Andrey Konovalov <[email protected]>
---
include/linux/gfp.h | 21 +++++++++++++--------
include/trace/events/mmflags.h | 5 +++--
mm/page_alloc.c | 31 ++++++++++++++++++++++---------
3 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 96f707931770..7303d1064460 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -55,12 +55,14 @@ struct vm_area_struct;
#define ___GFP_ACCOUNT 0x400000u
#define ___GFP_ZEROTAGS 0x800000u
#ifdef CONFIG_KASAN_HW_TAGS
-#define ___GFP_SKIP_KASAN_POISON 0x1000000u
+#define ___GFP_SKIP_KASAN_UNPOISON 0x1000000u
+#define ___GFP_SKIP_KASAN_POISON 0x2000000u
#else
+#define ___GFP_SKIP_KASAN_UNPOISON 0
#define ___GFP_SKIP_KASAN_POISON 0
#endif
#ifdef CONFIG_LOCKDEP
-#define ___GFP_NOLOCKDEP 0x2000000u
+#define ___GFP_NOLOCKDEP 0x4000000u
#else
#define ___GFP_NOLOCKDEP 0
#endif
@@ -241,22 +243,25 @@ struct vm_area_struct;
* intended for optimization: setting memory tags at the same time as zeroing
* memory has minimal additional performance impact.
*
- * %__GFP_SKIP_KASAN_POISON returns a page which does not need to be poisoned
- * on deallocation. Typically used for userspace pages. Currently only has an
- * effect in HW tags mode.
+ * %__GFP_SKIP_KASAN_UNPOISON makes KASAN skip unpoisoning on page allocation.
+ * Only effective in HW_TAGS mode.
+ *
+ * %__GFP_SKIP_KASAN_POISON makes KASAN skip poisoning on page deallocation.
+ * Typically, used for userspace pages. Only effective in HW_TAGS mode.
*/
#define __GFP_NOWARN ((__force gfp_t)___GFP_NOWARN)
#define __GFP_COMP ((__force gfp_t)___GFP_COMP)
#define __GFP_ZERO ((__force gfp_t)___GFP_ZERO)
#define __GFP_ZEROTAGS ((__force gfp_t)___GFP_ZEROTAGS)
-#define __GFP_SKIP_KASAN_POISON ((__force gfp_t)___GFP_SKIP_KASAN_POISON)
+#define __GFP_SKIP_KASAN_UNPOISON ((__force gfp_t)___GFP_SKIP_KASAN_UNPOISON)
+#define __GFP_SKIP_KASAN_POISON ((__force gfp_t)___GFP_SKIP_KASAN_POISON)

/* Disable lockdep for GFP context tracking */
#define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)

/* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (24 + \
- IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
+#define __GFP_BITS_SHIFT (24 + \
+ 2 * IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
IS_ENABLED(CONFIG_LOCKDEP))
#define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))

diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index cb4520374e2c..134c45e62d91 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -52,8 +52,9 @@
{(unsigned long)__GFP_ZEROTAGS, "__GFP_ZEROTAGS"} \

#ifdef CONFIG_KASAN_HW_TAGS
-#define __def_gfpflag_names_kasan \
- , {(unsigned long)__GFP_SKIP_KASAN_POISON, "__GFP_SKIP_KASAN_POISON"}
+#define __def_gfpflag_names_kasan , \
+ {(unsigned long)__GFP_SKIP_KASAN_POISON, "__GFP_SKIP_KASAN_POISON"}, \
+ {(unsigned long)__GFP_SKIP_KASAN_UNPOISON, "__GFP_SKIP_KASAN_UNPOISON"}
#else
#define __def_gfpflag_names_kasan
#endif
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3af38e323391..94bfbc216ae9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2395,6 +2395,26 @@ static bool check_new_pages(struct page *page, unsigned int order)
return false;
}

+static inline bool should_skip_kasan_unpoison(gfp_t flags, bool init_tags)
+{
+ /* Don't skip if a software KASAN mode is enabled. */
+ if (IS_ENABLED(CONFIG_KASAN_GENERIC) ||
+ IS_ENABLED(CONFIG_KASAN_SW_TAGS))
+ return false;
+
+ /* Skip, if hardware tag-based KASAN is not enabled. */
+ if (!kasan_hw_tags_enabled())
+ return true;
+
+ /*
+ * With hardware tag-based KASAN enabled, skip if either:
+ *
+ * 1. Memory tags have already been cleared via tag_clear_highpage().
+ * 2. Skipping has been requested via __GFP_SKIP_KASAN_UNPOISON.
+ */
+ return init_tags || (flags & __GFP_SKIP_KASAN_UNPOISON);
+}
+
inline void post_alloc_hook(struct page *page, unsigned int order,
gfp_t gfp_flags)
{
@@ -2434,15 +2454,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
/* Note that memory is already initialized by the loop above. */
init = false;
}
- /*
- * If either a software KASAN mode is enabled, or,
- * in the case of hardware tag-based KASAN,
- * if memory tags have not been cleared via tag_clear_highpage().
- */
- if (IS_ENABLED(CONFIG_KASAN_GENERIC) ||
- IS_ENABLED(CONFIG_KASAN_SW_TAGS) ||
- kasan_hw_tags_enabled() && !init_tags) {
- /* Mark shadow memory or set memory tags. */
+ if (!should_skip_kasan_unpoison(gfp_flags, init_tags)) {
+ /* Unpoison shadow memory or set memory tags. */
kasan_unpoison_pages(page, order, init);

/* Note that memory is already initialized by KASAN. */
--
2.25.1

2022-01-24 19:50:45

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH v6 00/39] kasan, vmalloc, arm64: add vmalloc tagging support for SW/HW_TAGS

On Mon, 24 Jan 2022 at 19:02, <[email protected]> wrote:
>
> From: Andrey Konovalov <[email protected]>
>
> Hi,
>
> This patchset adds vmalloc tagging support for SW_TAGS and HW_TAGS
> KASAN modes.
[...]
>
> Acked-by: Marco Elver <[email protected]>

FYI, my Ack may get lost here - on rebase you could apply it to all
patches to carry it forward. As-is, Andrew would still have to apply
it manually.

An Ack to the cover letter saves replying to each patch and thus
generates fewer emails, which I think is preferred.

My Ack is still valid, given v6 is mainly a rebase and I don't see any
major changes.

Thanks,
-- Marco

> Andrey Konovalov (39):
> kasan, page_alloc: deduplicate should_skip_kasan_poison
> kasan, page_alloc: move tag_clear_highpage out of
> kernel_init_free_pages
> kasan, page_alloc: merge kasan_free_pages into free_pages_prepare
> kasan, page_alloc: simplify kasan_poison_pages call site
> kasan, page_alloc: init memory of skipped pages on free
> kasan: drop skip_kasan_poison variable in free_pages_prepare
> mm: clarify __GFP_ZEROTAGS comment
> kasan: only apply __GFP_ZEROTAGS when memory is zeroed
> kasan, page_alloc: refactor init checks in post_alloc_hook
> kasan, page_alloc: merge kasan_alloc_pages into post_alloc_hook
> kasan, page_alloc: combine tag_clear_highpage calls in post_alloc_hook
> kasan, page_alloc: move SetPageSkipKASanPoison in post_alloc_hook
> kasan, page_alloc: move kernel_init_free_pages in post_alloc_hook
> kasan, page_alloc: rework kasan_unpoison_pages call site
> kasan: clean up metadata byte definitions
> kasan: define KASAN_VMALLOC_INVALID for SW_TAGS
> kasan, x86, arm64, s390: rename functions for modules shadow
> kasan, vmalloc: drop outdated VM_KASAN comment
> kasan: reorder vmalloc hooks
> kasan: add wrappers for vmalloc hooks
> kasan, vmalloc: reset tags in vmalloc functions
> kasan, fork: reset pointer tags of vmapped stacks
> kasan, arm64: reset pointer tags of vmapped stacks
> kasan, vmalloc: add vmalloc tagging for SW_TAGS
> kasan, vmalloc, arm64: mark vmalloc mappings as pgprot_tagged
> kasan, vmalloc: unpoison VM_ALLOC pages after mapping
> kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS
> kasan, page_alloc: allow skipping unpoisoning for HW_TAGS
> kasan, page_alloc: allow skipping memory init for HW_TAGS
> kasan, vmalloc: add vmalloc tagging for HW_TAGS
> kasan, vmalloc: only tag normal vmalloc allocations
> kasan, arm64: don't tag executable vmalloc allocations
> kasan: mark kasan_arg_stacktrace as __initdata
> kasan: clean up feature flags for HW_TAGS mode
> kasan: add kasan.vmalloc command line flag
> kasan: allow enabling KASAN_VMALLOC and SW/HW_TAGS
> arm64: select KASAN_VMALLOC for SW/HW_TAGS modes
> kasan: documentation updates
> kasan: improve vmalloc tests
>
> Documentation/dev-tools/kasan.rst | 17 ++-
> arch/arm64/Kconfig | 2 +-
> arch/arm64/include/asm/vmalloc.h | 6 +
> arch/arm64/include/asm/vmap_stack.h | 5 +-
> arch/arm64/kernel/module.c | 5 +-
> arch/arm64/mm/pageattr.c | 2 +-
> arch/arm64/net/bpf_jit_comp.c | 3 +-
> arch/s390/kernel/module.c | 2 +-
> arch/x86/kernel/module.c | 2 +-
> include/linux/gfp.h | 35 +++--
> include/linux/kasan.h | 97 +++++++++-----
> include/linux/vmalloc.h | 18 +--
> include/trace/events/mmflags.h | 14 +-
> kernel/fork.c | 1 +
> kernel/scs.c | 4 +-
> lib/Kconfig.kasan | 20 +--
> lib/test_kasan.c | 189 ++++++++++++++++++++++++++-
> mm/kasan/common.c | 4 +-
> mm/kasan/hw_tags.c | 193 ++++++++++++++++++++++------
> mm/kasan/kasan.h | 18 ++-
> mm/kasan/shadow.c | 63 +++++----
> mm/page_alloc.c | 152 +++++++++++++++-------
> mm/vmalloc.c | 99 +++++++++++---
> 23 files changed, 731 insertions(+), 220 deletions(-)
>
> --
> 2.25.1
>

2022-01-24 19:50:47

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 18/39] kasan, vmalloc: drop outdated VM_KASAN comment

From: Andrey Konovalov <[email protected]>

The comment about VM_KASAN in include/linux/vmalloc.h is outdated.
VM_KASAN is currently only used to mark vm_areas allocated for
kernel modules when CONFIG_KASAN_VMALLOC is disabled.

Drop the comment.
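
For context, a minimal sketch of the remaining VM_KASAN use with
CONFIG_KASAN_VMALLOC disabled, assuming the module-shadow helpers as
renamed earlier in the series (illustrative, not a quote of the actual
code):

	/* When shadow for a module mapping is allocated, mark the area: */
	if (shadow) {
		__memset(shadow, KASAN_SHADOW_INIT, shadow_size);
		find_vm_area(addr)->flags |= VM_KASAN;
	}

	/* ...and on module free, only marked areas get their shadow freed: */
	if (vm->flags & VM_KASAN)
		vfree(kasan_mem_to_shadow(vm->addr));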

Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>
---
include/linux/vmalloc.h | 11 -----------
1 file changed, 11 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 880227b9f044..87f8cfec50a0 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -35,17 +35,6 @@ struct notifier_block; /* in notifier.h */
#define VM_DEFER_KMEMLEAK 0
#endif

-/*
- * VM_KASAN is used slightly differently depending on CONFIG_KASAN_VMALLOC.
- *
- * If IS_ENABLED(CONFIG_KASAN_VMALLOC), VM_KASAN is set on a vm_struct after
- * shadow memory has been mapped. It's used to handle allocation errors so that
- * we don't try to poison shadow on free if it was never allocated.
- *
- * Otherwise, VM_KASAN is set for kasan_module_alloc() allocations and used to
- * determine which allocations need the module shadow freed.
- */
-
/* bits [20..32] reserved for arch specific ioremap internals */

/*
--
2.25.1

2022-01-24 19:50:48

by andrey.konovalov

[permalink] [raw]
Subject: [PATCH v6 23/39] kasan, arm64: reset pointer tags of vmapped stacks

From: Andrey Konovalov <[email protected]>

Once tag-based KASAN modes start tagging vmalloc() allocations,
kernel stacks start getting tagged if CONFIG_VMAP_STACK is enabled.

Reset the tag of kernel stack pointers after allocation in
arch_alloc_vmap_stack().

For SW_TAGS KASAN, when CONFIG_KASAN_STACK is enabled, the
instrumentation can't handle the SP register being tagged.

For HW_TAGS KASAN, there are no instrumentation-related issues. However,
the impact of having a tagged SP register needs to be properly evaluated,
so keep it non-tagged for now.

Note that the memory for the stack allocation still gets tagged to
catch vmalloc-into-stack out-of-bounds accesses.

Signed-off-by: Andrey Konovalov <[email protected]>
Acked-by: Catalin Marinas <[email protected]>

---

Changes v2->v3:
- Add this patch.
---
arch/arm64/include/asm/vmap_stack.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/vmap_stack.h b/arch/arm64/include/asm/vmap_stack.h
index 894e031b28d2..20873099c035 100644
--- a/arch/arm64/include/asm/vmap_stack.h
+++ b/arch/arm64/include/asm/vmap_stack.h
@@ -17,10 +17,13 @@
*/
static inline unsigned long *arch_alloc_vmap_stack(size_t stack_size, int node)
{
+ void *p;
+
BUILD_BUG_ON(!IS_ENABLED(CONFIG_VMAP_STACK));

- return __vmalloc_node(stack_size, THREAD_ALIGN, THREADINFO_GFP, node,
+ p = __vmalloc_node(stack_size, THREAD_ALIGN, THREADINFO_GFP, node,
__builtin_return_address(0));
+ return kasan_reset_tag(p);
}

#endif /* __ASM_VMAP_STACK_H */
--
2.25.1

2022-01-24 19:53:02

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v6 00/39] kasan, vmalloc, arm64: add vmalloc tagging support for SW/HW_TAGS

On Mon, Jan 24, 2022 at 7:09 PM Marco Elver <[email protected]> wrote:
>
> On Mon, 24 Jan 2022 at 19:02, <[email protected]> wrote:
> >
> > From: Andrey Konovalov <[email protected]>
> >
> > Hi,
> >
> > This patchset adds vmalloc tagging support for SW_TAGS and HW_TAGS
> > KASAN modes.
> [...]
> >
> > Acked-by: Marco Elver <[email protected]>
>
> FYI, my Ack may get lost here - on rebase you could apply it to all
> patches to carry it forward. As-is, Andrew would still have to apply
> it manually.

Sounds good, will do if there is a v7.

> An Ack to the cover letter saves replying to each patch and thus
> generating less emails, which I think is preferred.
>
> My Ack is still valid, given v6 is mainly a rebase and I don't see any
> major changes.

Thanks, Marco!

2022-03-09 00:47:02

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v6 31/39] kasan, vmalloc: only tag normal vmalloc allocations

On Tue, Mar 8, 2022 at 4:17 PM Vasily Gorbik <[email protected]> wrote:
>
> On Mon, Jan 24, 2022 at 07:05:05PM +0100, [email protected] wrote:
> > From: Andrey Konovalov <[email protected]>
> >
> > The kernel can use vmalloc to allocate executable memory. The only supported way
> > to do that is via __vmalloc_node_range() with the executable bit set in
> > the prot argument. (vmap() resets the bit via pgprot_nx()).
> >
> > Once tag-based KASAN modes start tagging vmalloc allocations, executing
> > code from such allocations will lead to the PC register getting a tag,
> > which is not tolerated by the kernel.
> >
> > Only tag the allocations for normal kernel pages.
> >
> > Signed-off-by: Andrey Konovalov <[email protected]>
>
> This breaks s390 and produces a huge amount of false positives.
> I haven't been testing linux-next with KASAN for a while; I now tried it with
> next-20220308 and bisected the false positives to this commit.
>
> Any idea what is going wrong here?

Hi Vasily,

Could you try the attached fix?

Thanks!


Attachments:
s390-kasan-vmalloc.fix (536.00 B)

2022-03-09 01:03:03

by Vasily Gorbik

[permalink] [raw]
Subject: Re: [PATCH v6 31/39] kasan, vmalloc: only tag normal vmalloc allocations

On Mon, Jan 24, 2022 at 07:05:05PM +0100, [email protected] wrote:
> From: Andrey Konovalov <[email protected]>
>
> The kernel can use vmalloc to allocate executable memory. The only supported way
> to do that is via __vmalloc_node_range() with the executable bit set in
> the prot argument. (vmap() resets the bit via pgprot_nx()).
>
> Once tag-based KASAN modes start tagging vmalloc allocations, executing
> code from such allocations will lead to the PC register getting a tag,
> which is not tolerated by the kernel.
>
> Only tag the allocations for normal kernel pages.
>
> Signed-off-by: Andrey Konovalov <[email protected]>

This breaks s390 and produces a huge amount of false positives.
I haven't been testing linux-next with KASAN for a while; I now tried it with
next-20220308 and bisected the false positives to this commit.

Any idea what is going wrong here?

I see 2 patterns:

[ 1.123723] BUG: KASAN: vmalloc-out-of-bounds in ftrace_plt_init+0xb8/0xe0
[ 1.123740] Write of size 8 at addr 001bffff80000000 by task swapper/0/1
[ 1.123745]
[ 1.123749] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc7-118520-ga20d77ce812a #142
[ 1.123755] Hardware name: IBM 8561 T01 701 (KVM/Linux)
[ 1.123758] Call Trace:
[ 1.123761] [<000000000218e5fe>] dump_stack_lvl+0xc6/0xf8
[ 1.123782] [<0000000002176cb4>] print_address_description.constprop.0+0x64/0x2f0
[ 1.123793] [<000000000086fd3e>] kasan_report+0x15e/0x1c8
[ 1.123802] [<0000000000870f5c>] kasan_check_range+0x174/0x1c0
[ 1.123808] [<0000000000871988>] memcpy+0x58/0x88
[ 1.123813] [<000000000342cea8>] ftrace_plt_init+0xb8/0xe0
[ 1.123819] [<0000000000101522>] do_one_initcall+0xc2/0x468
[ 1.123825] [<000000000341ffc6>] do_initcalls+0x1be/0x1e8
[ 1.123830] [<0000000003420504>] kernel_init_freeable+0x494/0x4e8
[ 1.123834] [<0000000002196556>] kernel_init+0x2e/0x180
[ 1.123838] [<000000000010625a>] __ret_from_fork+0x8a/0xe8
[ 1.123843] [<00000000021b557a>] ret_from_fork+0xa/0x40
[ 1.123852]
[ 1.123854]
[ 1.123856] Memory state around the buggy address:
[ 1.123861] 001bffff7fffff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1.123865] 001bffff7fffff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1.123868] >001bffff80000000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[ 1.123872] ^
[ 1.123874] 001bffff80000080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[ 1.123878] 001bffff80000100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8

$ cat /sys/kernel/debug/kernel_page_tables
---[ Modules Area Start ]---
0x001bffff80000000-0x001bffff80001000 4K PTE RO X
0x001bffff80001000-0x001bffff80002000 4K PTE I
0x001bffff80002000-0x001bffff80003000 4K PTE RO X
0x001bffff80003000-0x001bffff80004000 4K PTE I

[ 1.409146] BUG: KASAN: vmalloc-out-of-bounds in bpf_jit_binary_alloc+0x138/0x170
[ 1.409154] Write of size 4 at addr 001bffff80002000 by task systemd/1
[ 1.409158]
[ 1.409160] CPU: 0 PID: 1 Comm: systemd Tainted: G B W 5.17.0-rc7-118520-ga20d77ce812a #141
[ 1.409166] Hardware name: IBM 8561 T01 701 (KVM/Linux)
[ 1.409169] Call Trace:
[ 1.409171] [<000000000218e5fe>] dump_stack_lvl+0xc6/0xf8
[ 1.409176] [<0000000002176cb4>] print_address_description.constprop.0+0x64/0x2f0
[ 1.409183] [<000000000086fd3e>] kasan_report+0x15e/0x1c8
[ 1.409188] [<0000000000588860>] bpf_jit_binary_alloc+0x138/0x170
[ 1.409192] [<000000000019fa84>] bpf_int_jit_compile+0x814/0xca8
[ 1.409197] [<000000000058b60e>] bpf_prog_select_runtime+0x286/0x3e8
[ 1.409202] [<000000000059ac2e>] bpf_prog_load+0xe66/0x1a10
[ 1.409206] [<000000000059ebd4>] __sys_bpf+0x8bc/0x1088
[ 1.409211] [<000000000059f9e8>] __s390x_sys_bpf+0x98/0xc8
[ 1.409216] [<000000000010ce74>] do_syscall+0x22c/0x328
[ 1.409221] [<000000000219599c>] __do_syscall+0x94/0xf0
[ 1.409226] [<00000000021b5542>] system_call+0x82/0xb0
[ 1.409232]
[ 1.409234]
[ 1.409235] Memory state around the buggy address:
[ 1.409238] 001bffff80001f00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[ 1.409242] 001bffff80001f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[ 1.409246] >001bffff80002000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[ 1.409249] ^
[ 1.409251] 001bffff80002080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[ 1.409255] 001bffff80002100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8

$ git bisect log
git bisect start
# good: [ea4424be16887a37735d6550cfd0611528dbe5d9] Merge tag 'mtd/fixes-for-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
git bisect good ea4424be16887a37735d6550cfd0611528dbe5d9
# bad: [cb153b68ff91cbc434f3de70ac549e110543e1bb] Add linux-next specific files for 20220308
git bisect bad cb153b68ff91cbc434f3de70ac549e110543e1bb
# good: [1ce7aac49a7b73abbd691c6e6a1577a449d90bad] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect good 1ce7aac49a7b73abbd691c6e6a1577a449d90bad
# good: [08688e100b1b07ce178c1d3c6b9983e00cd85413] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
git bisect good 08688e100b1b07ce178c1d3c6b9983e00cd85413
# good: [82a204646439657e5c2f94da5cad7ba96de10414] Merge branch 'togreg' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git
git bisect good 82a204646439657e5c2f94da5cad7ba96de10414
# good: [ac82bf337c937458bf4f75985857bf3a68cd7c16] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git
git bisect good ac82bf337c937458bf4f75985857bf3a68cd7c16
# good: [a36f330518af9bd205451bedb4eb22a5245cf010] ipc/mqueue: use get_tree_nodev() in mqueue_get_tree()
git bisect good a36f330518af9bd205451bedb4eb22a5245cf010
# good: [339c1d0fb400ab3acd2da2d9990242f654689f6e] Merge branch 'for-next' of git://git.infradead.org/users/willy/pagecache.git
git bisect good 339c1d0fb400ab3acd2da2d9990242f654689f6e
# good: [b8a58fecbd4982211f528d405a9ded00ddc7d646] kasan: only apply __GFP_ZEROTAGS when memory is zeroed
git bisect good b8a58fecbd4982211f528d405a9ded00ddc7d646
# bad: [141e05389762bee5fb0eb54af9c4d5266ce11d26] kasan: drop addr check from describe_object_addr
git bisect bad 141e05389762bee5fb0eb54af9c4d5266ce11d26
# good: [97fedbc9a6bccd508c392b0e177380313dd9fcd2] kasan, page_alloc: allow skipping unpoisoning for HW_TAGS
git bisect good 97fedbc9a6bccd508c392b0e177380313dd9fcd2
# bad: [606c2ee3fabbf66594f39998be9b5a21c2bf5dff] arm64: select KASAN_VMALLOC for SW/HW_TAGS modes
git bisect bad 606c2ee3fabbf66594f39998be9b5a21c2bf5dff
# bad: [bd2c296805cff9572080bf56807c16d1dd382260] kasan, scs: support tagged vmalloc mappings
git bisect bad bd2c296805cff9572080bf56807c16d1dd382260
# good: [7b80fa947b3a3ee746115395d1c5f7157119b7d2] kasan, vmalloc: add vmalloc tagging for HW_TAGS
git bisect good 7b80fa947b3a3ee746115395d1c5f7157119b7d2
# bad: [f51c09448ea124622f8ebcfb41d06c809ee01bca] fix for "kasan, vmalloc: only tag normal vmalloc allocations"
git bisect bad f51c09448ea124622f8ebcfb41d06c809ee01bca
# bad: [a20d77ce812a3e11b3cf2cb4f411904bb5c6edaa] kasan, vmalloc: only tag normal vmalloc allocations
git bisect bad a20d77ce812a3e11b3cf2cb4f411904bb5c6edaa
# first bad commit: [a20d77ce812a3e11b3cf2cb4f411904bb5c6edaa] kasan, vmalloc: only tag normal vmalloc allocations

2022-03-09 01:06:00

by Vasily Gorbik

[permalink] [raw]
Subject: Re: [PATCH v6 31/39] kasan, vmalloc: only tag normal vmalloc allocations

On Tue, Mar 08, 2022 at 04:30:46PM +0100, Andrey Konovalov wrote:
> On Tue, Mar 8, 2022 at 4:17 PM Vasily Gorbik <[email protected]> wrote:
> >
> > On Mon, Jan 24, 2022 at 07:05:05PM +0100, [email protected] wrote:
> > > From: Andrey Konovalov <[email protected]>
> > >
> > > The kernel can use vmalloc to allocate executable memory. The only supported way
> > > to do that is via __vmalloc_node_range() with the executable bit set in
> > > the prot argument. (vmap() resets the bit via pgprot_nx()).
> > >
> > > Once tag-based KASAN modes start tagging vmalloc allocations, executing
> > > code from such allocations will lead to the PC register getting a tag,
> > > which is not tolerated by the kernel.
> > >
> > > Only tag the allocations for normal kernel pages.
> > >
> > > Signed-off-by: Andrey Konovalov <[email protected]>
> >
> > This breaks s390 and produces a huge amount of false positives.
> > I haven't been testing linux-next with KASAN for a while; I now tried it with
> > next-20220308 and bisected the false positives to this commit.
> >
> > Any idea what is going wrong here?
>
> Could you try the attached fix?

Wow, that was quick!
Yes, it fixes the issue for s390; the KASAN tests pass as well.
Thank you!

Subject: Re: [PATCH v6 27/39] kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS

On 2022-03-23 12:48:29 [+0100], Vlastimil Babka wrote:
> > +#ifdef CONFIG_KASAN_HW_TAGS
> > #define ___GFP_SKIP_KASAN_POISON 0x1000000u
> > +#else
> > +#define ___GFP_SKIP_KASAN_POISON 0
> > +#endif
> > #ifdef CONFIG_LOCKDEP
> > #define ___GFP_NOLOCKDEP 0x2000000u
> > #else
> > @@ -251,7 +255,9 @@ struct vm_area_struct;
> > #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
> >
> > /* Room for N __GFP_FOO bits */
> > -#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
> > +#define __GFP_BITS_SHIFT (24 + \
> > + IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
> > + IS_ENABLED(CONFIG_LOCKDEP))
>
> This breaks __GFP_NOLOCKDEP, see:
> https://lore.kernel.org/all/[email protected]/

This could work because ___GFP_NOLOCKDEP is still 0x2000000u. In
("kasan, page_alloc: allow skipping memory init for HW_TAGS")
https://lore.kernel.org/all/0d53efeff345de7d708e0baa0d8829167772521e.1643047180.git.andreyknvl@google.com/

This is replaced with 0x8000000u which breaks lockdep.
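
To make the breakage concrete, here is a tiny user-space sketch of the
bit arithmetic (stand-in values mirroring gfp.h for CONFIG_KASAN_HW_TAGS=n
and CONFIG_LOCKDEP=y once ___GFP_NOLOCKDEP has moved to 0x8000000u;
illustrative only):

#include <stdio.h>

#define GFP_BITS_SHIFT	(24 + 0 /* KASAN_HW_TAGS bits */ + 1 /* LOCKDEP */)
#define GFP_BITS_MASK	((1u << GFP_BITS_SHIFT) - 1)
#define GFP_NOLOCKDEP	0x8000000u	/* bit 27, outside a 25-bit mask */

int main(void)
{
	printf("mask=%#x nolockdep=%#x masked=%#x\n",
	       GFP_BITS_MASK, GFP_NOLOCKDEP, GFP_NOLOCKDEP & GFP_BITS_MASK);
	/* Prints masked=0, i.e. __GFP_NOLOCKDEP is silently dropped. */
	return 0;
}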

Sebastian

2022-03-24 07:37:54

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v6 27/39] kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS

On 3/23/22 14:36, Andrey Konovalov wrote:
> On Wed, Mar 23, 2022 at 2:02 PM Sebastian Andrzej Siewior
> <[email protected]> wrote:
>>
>> On 2022-03-23 12:48:29 [+0100], Vlastimil Babka wrote:
>>>> +#ifdef CONFIG_KASAN_HW_TAGS
>>>> #define ___GFP_SKIP_KASAN_POISON 0x1000000u
>>>> +#else
>>>> +#define ___GFP_SKIP_KASAN_POISON 0
>>>> +#endif
>>>> #ifdef CONFIG_LOCKDEP
>>>> #define ___GFP_NOLOCKDEP 0x2000000u
>>>> #else
>>>> @@ -251,7 +255,9 @@ struct vm_area_struct;
>>>> #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
>>>>
>>>> /* Room for N __GFP_FOO bits */
>>>> -#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
>>>> +#define __GFP_BITS_SHIFT (24 + \
>>>> + IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
>>>> + IS_ENABLED(CONFIG_LOCKDEP))
>>>
>>> This breaks __GFP_NOLOCKDEP, see:
>>> https://lore.kernel.org/all/[email protected]/
>>
>> This could work because ___GFP_NOLOCKDEP is still 0x2000000u. In
>> ("kasan, page_alloc: allow skipping memory init for HW_TAGS")
>> https://lore.kernel.org/all/0d53efeff345de7d708e0baa0d8829167772521e.1643047180.git.andreyknvl@google.com/
>>
>> This is replaced with 0x8000000u which breaks lockdep.
>>
>> Sebastian
>
> Hi Sebastian,
>
> Indeed, sorry for breaking lockdep. Thank you for the report!
>
> I wonder what's the proper fix for this. Perhaps, don't hide KASAN GFP
> bits under CONFIG_KASAN_HW_TAGS? And then do:
>
> #define __GFP_BITS_SHIFT (27 + IS_ENABLED(CONFIG_LOCKDEP))
>
> Vlastimil, Andrew do you have any preference?

I guess it's the simplest thing to do for now. For the future we can
still improve and handle all combinations of kasan/lockdep to occupy as
few bits as possible and set the shift/mask appropriately. Or consider
first if it's necessary anyway. I don't know if we really expect at any
point to start triggering the BUILD_BUG_ON() in radix_tree_init() and
then only some combination of configs will reduce the flags to a number
that works. Or is there anything else that depends on __GFP_BITS_SHIFT?
I mean if we don't expect to go this way, we can just define
__GFP_BITS_SHIFT as a constant that assumes all the config-dependent
flags to be defined (not zero).

> If my suggestion sounds good, Andrew, could you directly apply the
> changes? They are needed for these 3 patches:
>
> kasan, page_alloc: allow skipping memory init for HW_TAGS
> kasan, page_alloc: allow skipping unpoisoning for HW_TAGS
> kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS
>
> As these depend on each other, I can't send separate patches that can
> be folded for all 3.
>
> Thanks!

2022-03-24 11:31:17

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v6 27/39] kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS

On Wed, Mar 23, 2022 at 2:02 PM Sebastian Andrzej Siewior
<[email protected]> wrote:
>
> On 2022-03-23 12:48:29 [+0100], Vlastimil Babka wrote:
> > > +#ifdef CONFIG_KASAN_HW_TAGS
> > > #define ___GFP_SKIP_KASAN_POISON 0x1000000u
> > > +#else
> > > +#define ___GFP_SKIP_KASAN_POISON 0
> > > +#endif
> > > #ifdef CONFIG_LOCKDEP
> > > #define ___GFP_NOLOCKDEP 0x2000000u
> > > #else
> > > @@ -251,7 +255,9 @@ struct vm_area_struct;
> > > #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
> > >
> > > /* Room for N __GFP_FOO bits */
> > > -#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
> > > +#define __GFP_BITS_SHIFT (24 + \
> > > + IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
> > > + IS_ENABLED(CONFIG_LOCKDEP))
> >
> > This breaks __GFP_NOLOCKDEP, see:
> > https://lore.kernel.org/all/[email protected]/
>
> This could work because ___GFP_NOLOCKDEP is still 0x2000000u. In
> ("kasan, page_alloc: allow skipping memory init for HW_TAGS")
> https://lore.kernel.org/all/0d53efeff345de7d708e0baa0d8829167772521e.1643047180.git.andreyknvl@google.com/
>
> This is replaced with 0x8000000u which breaks lockdep.
>
> Sebastian

Hi Sebastian,

Indeed, sorry for breaking lockdep. Thank you for the report!

I wonder what's the proper fix for this. Perhaps, don't hide KASAN GFP
bits under CONFIG_KASAN_HW_TAGS? And then do:

#define __GFP_BITS_SHIFT (27 + IS_ENABLED(CONFIG_LOCKDEP))

Vlastimil, Andrew do you have any preference?
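
For what it's worth, a quick sanity check of that arithmetic (assuming
the three KASAN bits stay at 0x1000000u/0x2000000u/0x4000000u and the
lockdep bit follows them): with CONFIG_LOCKDEP=y, __GFP_BITS_SHIFT
becomes 27 + 1 = 28, so __GFP_BITS_MASK is (1 << 28) - 1 = 0xfffffff,
which covers ___GFP_NOLOCKDEP = 0x8000000u (bit 27) again.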

If my suggestion sounds good, Andrew, could you directly apply the
changes? They are needed for these 3 patches:

kasan, page_alloc: allow skipping memory init for HW_TAGS
kasan, page_alloc: allow skipping unpoisoning for HW_TAGS
kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS

As these depend on each other, I can't send separate patches that can
be folded for all 3.

Thanks!

2022-03-24 17:23:53

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v6 27/39] kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS

On 3/23/22 14:02, Sebastian Andrzej Siewior wrote:
> On 2022-03-23 12:48:29 [+0100], Vlastimil Babka wrote:
>>> +#ifdef CONFIG_KASAN_HW_TAGS
>>> #define ___GFP_SKIP_KASAN_POISON 0x1000000u
>>> +#else
>>> +#define ___GFP_SKIP_KASAN_POISON 0
>>> +#endif
>>> #ifdef CONFIG_LOCKDEP
>>> #define ___GFP_NOLOCKDEP 0x2000000u
>>> #else
>>> @@ -251,7 +255,9 @@ struct vm_area_struct;
>>> #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
>>>
>>> /* Room for N __GFP_FOO bits */
>>> -#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
>>> +#define __GFP_BITS_SHIFT (24 + \
>>> + IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
>>> + IS_ENABLED(CONFIG_LOCKDEP))
>>
>> This breaks __GFP_NOLOCKDEP, see:
>> https://lore.kernel.org/all/[email protected]/
>
> This could work because ___GFP_NOLOCKDEP is still 0x2000000u. In

Hm, but this patch already makes gfp_allowed_mask 0x1ffffff (thus
not covering 0x2000000u) when CONFIG_LOCKDEP is enabled and the KASAN
option is not? 0x8000000u is just even further away.

> ("kasan, page_alloc: allow skipping memory init for HW_TAGS")
> https://lore.kernel.org/all/0d53efeff345de7d708e0baa0d8829167772521e.1643047180.git.andreyknvl@google.com/
>
> This is replaced with 0x8000000u which breaks lockdep.
>
> Sebastian
>

2022-03-24 21:21:11

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v6 27/39] kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS

On Wed, Mar 23, 2022 at 02:57:30PM +0100, Vlastimil Babka wrote:
> I guess it's the simplest thing to do for now. For the future we can
> still improve and handle all combinations of kasan/lockdep to occupy as
> few bits as possible and set the shift/mask appropriately. Or consider
> first if it's necessary anyway. I don't know if we really expect at any
> point to start triggering the BUILD_BUG_ON() in radix_tree_init() and
> then only some combination of configs will reduce the flags to a number
> that works. Or is there anything else that depends on __GFP_BITS_SHIFT?

The correct long-term solution is to transition all the radix tree
users to the XArray, which has the GFP flags specified in the correct
place (ie at the call site) instead of embedding the GFP flags in the
data structure.

I've paused work on that while I work on folios; by my count there are
about 60 users left. What I really need is something which prevents
any attempt to add new users. Maybe that's a job for checkpatch.

2022-03-24 23:54:24

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v6 27/39] kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS

On 1/24/22 19:05, [email protected] wrote:
> From: Andrey Konovalov <[email protected]>
>
> Only define the ___GFP_SKIP_KASAN_POISON flag when CONFIG_KASAN_HW_TAGS
> is enabled.
>
> This patch is not useful by itself, but it prepares the code for
> additions of new KASAN-specific GFP flags.
>
> Signed-off-by: Andrey Konovalov <[email protected]>
>
> ---
>
> Changes v3->v4:
> - This is a new patch.
> ---
> include/linux/gfp.h | 8 +++++++-
> include/trace/events/mmflags.h | 12 +++++++++---
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 581a1f47b8a2..96f707931770 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -54,7 +54,11 @@ struct vm_area_struct;
> #define ___GFP_THISNODE 0x200000u
> #define ___GFP_ACCOUNT 0x400000u
> #define ___GFP_ZEROTAGS 0x800000u
> +#ifdef CONFIG_KASAN_HW_TAGS
> #define ___GFP_SKIP_KASAN_POISON 0x1000000u
> +#else
> +#define ___GFP_SKIP_KASAN_POISON 0
> +#endif
> #ifdef CONFIG_LOCKDEP
> #define ___GFP_NOLOCKDEP 0x2000000u
> #else
> @@ -251,7 +255,9 @@ struct vm_area_struct;
> #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
>
> /* Room for N __GFP_FOO bits */
> -#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
> +#define __GFP_BITS_SHIFT (24 + \
> + IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
> + IS_ENABLED(CONFIG_LOCKDEP))

This breaks __GFP_NOLOCKDEP, see:
https://lore.kernel.org/all/[email protected]/

> #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
>
> /**
> diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
> index 116ed4d5d0f8..cb4520374e2c 100644
> --- a/include/trace/events/mmflags.h
> +++ b/include/trace/events/mmflags.h
> @@ -49,12 +49,18 @@
> {(unsigned long)__GFP_RECLAIM, "__GFP_RECLAIM"}, \
> {(unsigned long)__GFP_DIRECT_RECLAIM, "__GFP_DIRECT_RECLAIM"},\
> {(unsigned long)__GFP_KSWAPD_RECLAIM, "__GFP_KSWAPD_RECLAIM"},\
> - {(unsigned long)__GFP_ZEROTAGS, "__GFP_ZEROTAGS"}, \
> - {(unsigned long)__GFP_SKIP_KASAN_POISON,"__GFP_SKIP_KASAN_POISON"}\
> + {(unsigned long)__GFP_ZEROTAGS, "__GFP_ZEROTAGS"} \
> +
> +#ifdef CONFIG_KASAN_HW_TAGS
> +#define __def_gfpflag_names_kasan \
> + , {(unsigned long)__GFP_SKIP_KASAN_POISON, "__GFP_SKIP_KASAN_POISON"}
> +#else
> +#define __def_gfpflag_names_kasan
> +#endif
>
> #define show_gfp_flags(flags) \
> (flags) ? __print_flags(flags, "|", \
> - __def_gfpflag_names \
> + __def_gfpflag_names __def_gfpflag_names_kasan \
> ) : "none"
>
> #ifdef CONFIG_MMU

2022-03-25 21:41:15

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v6 27/39] kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS

On Wed, 23 Mar 2022 14:36:29 +0100 Andrey Konovalov <[email protected]> wrote:

> If my suggestion sounds good, Andrew, could you directly apply the
> changes? They are needed for these 3 patches:
>
> kasan, page_alloc: allow skipping memory init for HW_TAGS
> kasan, page_alloc: allow skipping unpoisoning for HW_TAGS
> kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS
>
> As these depend on each other, I can't send separate patches that can
> be folded for all 3.

It's all upstream now, so please send along a fixup patch.

2022-05-02 18:02:53

by Qian Cai

[permalink] [raw]
Subject: Re: [PATCH v6 00/39] kasan, vmalloc, arm64: add vmalloc tagging support for SW/HW_TAGS

On Mon, Jan 24, 2022 at 07:02:08PM +0100, [email protected] wrote:
> From: Andrey Konovalov <[email protected]>
>
> Hi,
>
> This patchset adds vmalloc tagging support for SW_TAGS and HW_TAGS
> KASAN modes.
>
> The tree with patches is available here:
>
> https://github.com/xairy/linux/tree/up-kasan-vmalloc-tags-v6
>
> About half of patches are cleanups I went for along the way. None of
> them seem to be important enough to go through stable, so I decided
> not to split them out into separate patches/series.
>
> The patchset is partially based on an early version of the HW_TAGS
> patchset by Vincenzo that had vmalloc support. Thus, I added a
> Co-developed-by tag into a few patches.
>
> SW_TAGS vmalloc tagging support is straightforward. It reuses all of
> the generic KASAN machinery, but uses shadow memory to store tags
> instead of magic values. Naturally, vmalloc tagging requires adding
> a few kasan_reset_tag() annotations to the vmalloc code.
>
> HW_TAGS vmalloc tagging support stands out. HW_TAGS KASAN is based on
> Arm MTE, which can only assigns tags to physical memory. As a result,
> HW_TAGS KASAN only tags vmalloc() allocations, which are backed by
> page_alloc memory. It ignores vmap() and others.

I could use some help here. Ever since this series, our system has started
to trigger bad page state bugs from time to time. Any thoughts?

BUG: Bad page state in process systemd-udevd pfn:83ffffcd
page:fffffc20fdfff340 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x83ffffcd
flags: 0xbfffc0000001000(reserved|node=0|zone=2|lastcpupid=0xffff)
raw: 0bfffc0000001000 fffffc20fdfff348 fffffc20fdfff348 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner info is not present (never set?)
CPU: 76 PID: 1873 Comm: systemd-udevd Not tainted 5.18.0-rc4-next-20220428-dirty #67
Call trace:
dump_backtrace
show_stack
dump_stack_lvl
dump_stack
bad_page
free_pcp_prepare
free_unref_page
__free_pages
free_pages.part.0
free_pages
kasan_depopulate_vmalloc_pte
(inlined by) kasan_depopulate_vmalloc_pte at mm/kasan/shadow.c:361
apply_to_pte_range
apply_to_pmd_range
apply_to_pud_range
__apply_to_page_range
apply_to_existing_page_range
kasan_release_vmalloc
(inlined by) kasan_release_vmalloc at mm/kasan/shadow.c:469
__purge_vmap_area_lazy
purge_vmap_area_lazy
alloc_vmap_area
__get_vm_area_node.constprop.0
__vmalloc_node_range
module_alloc
move_module
layout_and_allocate
load_module
__do_sys_finit_module
__arm64_sys_finit_module
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync
Disabling lock debugging due to kernel taint
BUG: Bad page state in process systemd-udevd pfn:83ffffcc
page:fffffc20fdfff300 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x83ffffcc
flags: 0xbfffc0000001000(reserved|node=0|zone=2|lastcpupid=0xffff)
raw: 0bfffc0000001000 fffffc20fdfff308 fffffc20fdfff308 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner info is not present (never set?)
CPU: 76 PID: 1873 Comm: systemd-udevd Tainted: G B 5.18.0-rc4-next-20220428-dirty #67
Call trace:
dump_backtrace
show_stack
dump_stack_lvl
dump_stack
bad_page
free_pcp_prepare
free_unref_page
__free_pages
free_pages.part.0
free_pages
kasan_depopulate_vmalloc_pte
apply_to_pte_range
apply_to_pmd_range
apply_to_pud_range
__apply_to_page_range
apply_to_existing_page_range
kasan_release_vmalloc
__purge_vmap_area_lazy
purge_vmap_area_lazy
alloc_vmap_area
__get_vm_area_node.constprop.0
__vmalloc_node_range
module_alloc
move_module
layout_and_allocate
load_module
__do_sys_finit_module
__arm64_sys_finit_module
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync

2022-05-02 18:18:28

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v6 00/39] kasan, vmalloc, arm64: add vmalloc tagging support for SW/HW_TAGS

On Thu, Apr 28, 2022 at 4:14 PM Qian Cai <[email protected]> wrote:
>
> > SW_TAGS vmalloc tagging support is straightforward. It reuses all of
> > the generic KASAN machinery, but uses shadow memory to store tags
> > instead of magic values. Naturally, vmalloc tagging requires adding
> > a few kasan_reset_tag() annotations to the vmalloc code.
>
> I could use some help here. Ever since this series, our system has started
> to trigger bad page state bugs from time to time. Any thoughts?
>
> BUG: Bad page state in process systemd-udevd pfn:83ffffcd
> page:fffffc20fdfff340 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x83ffffcd
> flags: 0xbfffc0000001000(reserved|node=0|zone=2|lastcpupid=0xffff)
> raw: 0bfffc0000001000 fffffc20fdfff348 fffffc20fdfff348 0000000000000000
> raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> page_owner info is not present (never set?)

Hi Qian,

No ideas so far.

Looks like the page has the Reserved flag set when it's being freed.

Does this crash only happen with the SW_TAGS mode?

Does this crash only happen when loading modules?

Does your system have any hot-plugged memory?

Thanks!

2022-05-03 00:45:21

by Qian Cai

[permalink] [raw]
Subject: Re: [PATCH v6 00/39] kasan, vmalloc, arm64: add vmalloc tagging support for SW/HW_TAGS

On Thu, Apr 28, 2022 at 05:28:12PM +0200, Andrey Konovalov wrote:
> No ideas so far.
>
> Looks like the page has reserved tag set when it's being freed.
>
> Does this crash only happen with the SW_TAGS mode?

No, the system is running exclusively with CONFIG_KASAN_GENERIC=y

> Does this crash only happen when loading modules?

Yes. Here is another sligtly different path at the bottom.

> Does your system have any hot-plugged memory?

No.

BUG: Bad page state in process systemd-udevd pfn:403fc007c
page:fffffd00fd001f00 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x403fc007c
flags: 0x1bfffc0000001000(reserved|node=1|zone=2|lastcpupid=0xffff)
raw: 1bfffc0000001000 fffffd00fd001f08 fffffd00fd001f08 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
CPU: 101 PID: 2004 Comm: systemd-udevd Not tainted 5.17.0-rc8-next-20220317-dirty #39
Call trace:
dump_backtrace
show_stack
dump_stack_lvl
dump_stack
bad_page
free_pcp_prepare
free_pages_prepare at mm/page_alloc.c:1348
(inlined by) free_pcp_prepare at mm/page_alloc.c:1403
free_unref_page
__free_pages
free_pages.part.0
free_pages
kasan_depopulate_vmalloc_pte
(inlined by) kasan_depopulate_vmalloc_pte at mm/kasan/shadow.c:359
apply_to_pte_range
apply_to_pte_range at mm/memory.c:2547
apply_to_pmd_range
apply_to_pud_range
__apply_to_page_range
apply_to_existing_page_range
kasan_release_vmalloc
(inlined by) kasan_release_vmalloc at mm/kasan/shadow.c:469
__purge_vmap_area_lazy
_vm_unmap_aliases.part.0
__vunmap
__vfree
vfree
module_memfree
free_module
do_init_module
load_module
__do_sys_finit_module
__arm64_sys_finit_module
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync
Disabling lock debugging due to kernel taint
BUG: Bad page state in process systemd-udevd pfn:403fc007b
page:fffffd00fd001ec0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x403fc007b
flags: 0x1bfffc0000001000(reserved|node=1|zone=2|lastcpupid=0xffff)
raw: 1bfffc0000001000 fffffd00fd001ec8 fffffd00fd001ec8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
CPU: 101 PID: 2004 Comm: systemd-udevd Tainted: G B 5.17.0-rc8-next-20220317-dirty #39
Call trace:
dump_backtrace
show_stack
dump_stack_lvl
dump_stack
bad_page
free_pcp_prepare
free_unref_page
__free_pages
free_pages.part.0
free_pages
kasan_depopulate_vmalloc_pte
apply_to_pte_range
apply_to_pmd_range
apply_to_pud_range
__apply_to_page_range
apply_to_existing_page_range
kasan_release_vmalloc
__purge_vmap_area_lazy
_vm_unmap_aliases.part.0
__vunmap
__vfree
vfree
module_memfree
free_module
do_init_module
load_module
__do_sys_finit_module
__arm64_sys_finit_module
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync