kernel_init_free_pages() will use memset() on s390 to clear all pages
from kmalloc_order() which will override KASAN redzones because a
redzone was setup from the end of the allocation size to the end of the
last page. Silence it by not reporting it there. An example of the
report is,
BUG: KASAN: slab-out-of-bounds in __free_pages_ok
Write of size 4096 at addr 000000014beaa000
Call Trace:
show_stack+0x152/0x210
dump_stack+0x1f8/0x248
print_address_description.isra.13+0x5e/0x4d0
kasan_report+0x130/0x178
check_memory_region+0x190/0x218
memset+0x34/0x60
__free_pages_ok+0x894/0x12f0
kfree+0x4f2/0x5e0
unpack_to_rootfs+0x60e/0x650
populate_rootfs+0x56/0x358
do_one_initcall+0x1f4/0xa20
kernel_init_freeable+0x758/0x7e8
kernel_init+0x1c/0x170
ret_from_fork+0x24/0x28
Memory state around the buggy address:
000000014bea9f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000000014bea9f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>000000014beaa000: 03 fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
^
000000014beaa080: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
000000014beaa100: fe fe fe fe fe fe fe fe fe fe fe fe fe fe
Fixes: 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
Signed-off-by: Qian Cai <[email protected]>
---
mm/page_alloc.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 727751219003..9954973f89a3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1164,8 +1164,11 @@ static void kernel_init_free_pages(struct page *page, int numpages)
{
int i;
+ /* s390's use of memset() could override KASAN redzones. */
+ kasan_disable_current();
for (i = 0; i < numpages; i++)
clear_highpage(page + i);
+ kasan_enable_current();
}
static __always_inline bool free_pages_prepare(struct page *page,
--
2.21.0 (Apple Git-122.2)
On Wed, Jun 10, 2020 at 7:22 AM Qian Cai <[email protected]> wrote:
>
> kernel_init_free_pages() will use memset() on s390 to clear all pages
> from kmalloc_order() which will override KASAN redzones because a
> redzone was setup from the end of the allocation size to the end of the
> last page. Silence it by not reporting it there. An example of the
> report is,
Interesting. The reason why we did not hit it on x86_64 is because
clear_page is implemented in asm (arch/x86/lib/clear_page_64.S) and
thus is not instrumented. Arm64 probably does the same. However, on
s390 clear_page is defined to memset.
clear_[high]page are pretty extensively used in the kernel.
We can either do this, or make clear_page non instrumented on s390 as
well to match the existing implicit assumption. The benefit of the
current approach is that we can find some real use-after-free's and
maybe out-of-bounds on clear_page. The downside is that we may need
more of these annotations. Thoughts?
> BUG: KASAN: slab-out-of-bounds in __free_pages_ok
> Write of size 4096 at addr 000000014beaa000
> Call Trace:
> show_stack+0x152/0x210
> dump_stack+0x1f8/0x248
> print_address_description.isra.13+0x5e/0x4d0
> kasan_report+0x130/0x178
> check_memory_region+0x190/0x218
> memset+0x34/0x60
> __free_pages_ok+0x894/0x12f0
> kfree+0x4f2/0x5e0
> unpack_to_rootfs+0x60e/0x650
> populate_rootfs+0x56/0x358
> do_one_initcall+0x1f4/0xa20
> kernel_init_freeable+0x758/0x7e8
> kernel_init+0x1c/0x170
> ret_from_fork+0x24/0x28
> Memory state around the buggy address:
> 000000014bea9f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000014bea9f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >000000014beaa000: 03 fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> ^
> 000000014beaa080: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> 000000014beaa100: fe fe fe fe fe fe fe fe fe fe fe fe fe fe
>
> Fixes: 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
> Signed-off-by: Qian Cai <[email protected]>
> ---
> mm/page_alloc.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 727751219003..9954973f89a3 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1164,8 +1164,11 @@ static void kernel_init_free_pages(struct page *page, int numpages)
> {
> int i;
>
> + /* s390's use of memset() could override KASAN redzones. */
> + kasan_disable_current();
> for (i = 0; i < numpages; i++)
> clear_highpage(page + i);
> + kasan_enable_current();
> }
>
> static __always_inline bool free_pages_prepare(struct page *page,
> --
> 2.21.0 (Apple Git-122.2)
>
On Wed, Jun 10, 2020 at 7:55 AM Dmitry Vyukov <[email protected]> wrote:
>
> On Wed, Jun 10, 2020 at 7:22 AM Qian Cai <[email protected]> wrote:
> >
> > kernel_init_free_pages() will use memset() on s390 to clear all pages
> > from kmalloc_order() which will override KASAN redzones because a
> > redzone was setup from the end of the allocation size to the end of the
> > last page. Silence it by not reporting it there. An example of the
> > report is,
>
> Interesting. The reason why we did not hit it on x86_64 is because
> clear_page is implemented in asm (arch/x86/lib/clear_page_64.S) and
> thus is not instrumented. Arm64 probably does the same. However, on
> s390 clear_page is defined to memset.
Can we define it to __memset() instead?
__memset() is supposed to be ignored by KASAN, e.g. KASAN runtime uses
it in the places where we don't care about bugs.
On Wed, Jun 10, 2020 at 07:54:50AM +0200, Dmitry Vyukov wrote:
> On Wed, Jun 10, 2020 at 7:22 AM Qian Cai <[email protected]> wrote:
> >
> > kernel_init_free_pages() will use memset() on s390 to clear all pages
> > from kmalloc_order() which will override KASAN redzones because a
> > redzone was setup from the end of the allocation size to the end of the
> > last page. Silence it by not reporting it there. An example of the
> > report is,
>
> Interesting. The reason why we did not hit it on x86_64 is because
> clear_page is implemented in asm (arch/x86/lib/clear_page_64.S) and
> thus is not instrumented. Arm64 probably does the same. However, on
> s390 clear_page is defined to memset.
> clear_[high]page are pretty extensively used in the kernel.
> We can either do this, or make clear_page non instrumented on s390 as
> well to match the existing implicit assumption. The benefit of the
> current approach is that we can find some real use-after-free's and
> maybe out-of-bounds on clear_page. The downside is that we may need
> more of these annotations. Thoughts?
Since we had already done the same thing in poison_page(), I suppose we
could do the same here. Also, clear_page() has been used in many places
on s390, and it is not clear to me if those are all safe like this.
There might be more annotations required, so it probably up to s390
maintainers (CC'ed) if they prefer not instrumenting clear_page() like
other arches.
>
> > BUG: KASAN: slab-out-of-bounds in __free_pages_ok
> > Write of size 4096 at addr 000000014beaa000
> > Call Trace:
> > show_stack+0x152/0x210
> > dump_stack+0x1f8/0x248
> > print_address_description.isra.13+0x5e/0x4d0
> > kasan_report+0x130/0x178
> > check_memory_region+0x190/0x218
> > memset+0x34/0x60
> > __free_pages_ok+0x894/0x12f0
> > kfree+0x4f2/0x5e0
> > unpack_to_rootfs+0x60e/0x650
> > populate_rootfs+0x56/0x358
> > do_one_initcall+0x1f4/0xa20
> > kernel_init_freeable+0x758/0x7e8
> > kernel_init+0x1c/0x170
> > ret_from_fork+0x24/0x28
> > Memory state around the buggy address:
> > 000000014bea9f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 000000014bea9f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > >000000014beaa000: 03 fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > ^
> > 000000014beaa080: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> > 000000014beaa100: fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> >
> > Fixes: 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
> > Signed-off-by: Qian Cai <[email protected]>
> > ---
> > mm/page_alloc.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 727751219003..9954973f89a3 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1164,8 +1164,11 @@ static void kernel_init_free_pages(struct page *page, int numpages)
> > {
> > int i;
> >
> > + /* s390's use of memset() could override KASAN redzones. */
> > + kasan_disable_current();
> > for (i = 0; i < numpages; i++)
> > clear_highpage(page + i);
> > + kasan_enable_current();
> > }
> >
> > static __always_inline bool free_pages_prepare(struct page *page,
> > --
> > 2.21.0 (Apple Git-122.2)
> >
On Wed, Jun 10, 2020 at 01:02:04PM +0200, Alexander Potapenko wrote:
> On Wed, Jun 10, 2020 at 7:55 AM Dmitry Vyukov <[email protected]> wrote:
> >
> > On Wed, Jun 10, 2020 at 7:22 AM Qian Cai <[email protected]> wrote:
> > >
> > > kernel_init_free_pages() will use memset() on s390 to clear all pages
> > > from kmalloc_order() which will override KASAN redzones because a
> > > redzone was setup from the end of the allocation size to the end of the
> > > last page. Silence it by not reporting it there. An example of the
> > > report is,
> >
> > Interesting. The reason why we did not hit it on x86_64 is because
> > clear_page is implemented in asm (arch/x86/lib/clear_page_64.S) and
> > thus is not instrumented. Arm64 probably does the same. However, on
> > s390 clear_page is defined to memset.
>
> Can we define it to __memset() instead?
> __memset() is supposed to be ignored by KASAN, e.g. KASAN runtime uses
> it in the places where we don't care about bugs.
I suppose that could work if s390 maintains perfer this way.
On Wed, Jun 10, 2020 at 08:26:00AM -0400, Qian Cai wrote:
> On Wed, Jun 10, 2020 at 07:54:50AM +0200, Dmitry Vyukov wrote:
> > On Wed, Jun 10, 2020 at 7:22 AM Qian Cai <[email protected]> wrote:
> > >
> > > kernel_init_free_pages() will use memset() on s390 to clear all pages
> > > from kmalloc_order() which will override KASAN redzones because a
> > > redzone was setup from the end of the allocation size to the end of the
> > > last page. Silence it by not reporting it there. An example of the
> > > report is,
> >
> > Interesting. The reason why we did not hit it on x86_64 is because
> > clear_page is implemented in asm (arch/x86/lib/clear_page_64.S) and
> > thus is not instrumented. Arm64 probably does the same. However, on
> > s390 clear_page is defined to memset.
> > clear_[high]page are pretty extensively used in the kernel.
> > We can either do this, or make clear_page non instrumented on s390 as
> > well to match the existing implicit assumption. The benefit of the
> > current approach is that we can find some real use-after-free's and
> > maybe out-of-bounds on clear_page. The downside is that we may need
> > more of these annotations. Thoughts?
>
> Since we had already done the same thing in poison_page(), I suppose we
> could do the same here. Also, clear_page() has been used in many places
> on s390, and it is not clear to me if those are all safe like this.
>
> There might be more annotations required, so it probably up to s390
> maintainers (CC'ed) if they prefer not instrumenting clear_page() like
> other arches.
Vasily will look into this and come up with a proper solution for s390.
On Wed, Jun 10, 2020 at 08:26:00AM -0400, Qian Cai wrote:
> On Wed, Jun 10, 2020 at 07:54:50AM +0200, Dmitry Vyukov wrote:
> > On Wed, Jun 10, 2020 at 7:22 AM Qian Cai <[email protected]> wrote:
> > >
> > > kernel_init_free_pages() will use memset() on s390 to clear all pages
> > > from kmalloc_order() which will override KASAN redzones because a
> > > redzone was setup from the end of the allocation size to the end of the
> > > last page. Silence it by not reporting it there. An example of the
> > > report is,
> >
> > Interesting. The reason why we did not hit it on x86_64 is because
> > clear_page is implemented in asm (arch/x86/lib/clear_page_64.S) and
> > thus is not instrumented. Arm64 probably does the same. However, on
> > s390 clear_page is defined to memset.
> > clear_[high]page are pretty extensively used in the kernel.
> > We can either do this, or make clear_page non instrumented on s390 as
> > well to match the existing implicit assumption. The benefit of the
> > current approach is that we can find some real use-after-free's and
> > maybe out-of-bounds on clear_page. The downside is that we may need
> > more of these annotations. Thoughts?
>
> Since we had already done the same thing in poison_page(), I suppose we
> could do the same here. Also, clear_page() has been used in many places
> on s390, and it is not clear to me if those are all safe like this.
>
> There might be more annotations required, so it probably up to s390
> maintainers (CC'ed) if they prefer not instrumenting clear_page() like
> other arches.
>
Sorry for delay. I assume you tested it without CONFIG_JUMP_LABEL.
I had to fix couple of things before I was able to use init_on_alloc=1
and init_on_free=1 boot options on s390 to reproduce KASAN problem:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.8-rc3&id=998f5bbe3dbdab81c1cfb1aef7c3892f5d24f6c7
https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=fixes&id=95e61b1b5d6394b53d147c0fcbe2ae70fbe09446
https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=fixes&id=d6df52e9996dcc2062c3d9c9123288468bb95b52
Back to clear_page - we could certainly make it non-instrumented. But
it didn't cause any problems so far. And as Dmitry pointed out we
could potentially find additional bugs with it. So, I'm leaning
towards original solution proposed. For that you have my
Acked-by: Vasily Gorbik <[email protected]>
Tested-by: Vasily Gorbik <[email protected]>
Thank you for looking into this!
Andrew, would you pick this change up?
Thank you
Vasily