From: Barry Song <[email protected]>

We are passing a huge nr to __clear_young_dirty_ptes() right
now. While we should pass the number of pages, we are actually
passing CONT_PTE_SIZE. This is causing lots of crashes in
MADV_FREE, and the resulting oops varies every time.

Fixes: 89e86854fb0a ("mm/arm64: override clear_young_dirty_ptes() batch helper")
Cc: Lance Yang <[email protected]>
Cc: Barry Song <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jeff Xie <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Yin Fengwei <[email protected]>
Cc: Zach O'Keefe <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Barry Song <[email protected]>
---
 arch/arm64/mm/contpte.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index 9f9486de0004..a3edced29ac1 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -376,7 +376,7 @@ void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,
 	 * clearing access/dirty for the whole block.
 	 */
 	unsigned long start = addr;
-	unsigned long end = start + nr;
+	unsigned long end = start + nr * PAGE_SIZE;
 
 	if (pte_cont(__ptep_get(ptep + nr - 1)))
 		end = ALIGN(end, CONT_PTE_SIZE);
@@ -386,7 +386,7 @@ void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,
 		ptep = contpte_align_down(ptep);
 	}
 
-	__clear_young_dirty_ptes(vma, start, ptep, end - start, flags);
+	__clear_young_dirty_ptes(vma, start, ptep, (end - start) / PAGE_SIZE, flags);
 }
 EXPORT_SYMBOL_GPL(contpte_clear_young_dirty_ptes);
--
2.34.1
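
For readers following the arithmetic: the sketch below is a user-space model of
the start/end calculation touched by the patch, not the kernel code itself. It
assumes 4K pages and 16 PTEs per contiguous block (so CONT_PTE_SIZE is 64K),
and it assumes that start is aligned down when the first PTE is contiguous.
The sketch_* names and the first_is_cont/last_is_cont flags are hypothetical
stand-ins for the pte_cont() checks.

```c
#include <assert.h>

/* Assumed values for a 4K-page arm64 configuration; illustrative only. */
#define SKETCH_PAGE_SIZE     4096UL
#define SKETCH_CONT_PTES     16UL
#define SKETCH_CONT_PTE_SIZE (SKETCH_CONT_PTES * SKETCH_PAGE_SIZE)

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned long sketch_align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Round x down to a multiple of a; a must be a power of two. */
static unsigned long sketch_align_down(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/*
 * Model of the count handed to the inner helper before the fix:
 * a page count is added to a byte address, and a byte span is
 * returned where a page count is expected.
 */
unsigned long sketch_nr_before_fix(unsigned long addr, unsigned long nr,
				   int first_is_cont, int last_is_cont)
{
	unsigned long start = addr;
	unsigned long end = start + nr;		/* pages added to bytes */

	if (last_is_cont)
		end = sketch_align_up(end, SKETCH_CONT_PTE_SIZE);
	if (first_is_cont)
		start = sketch_align_down(start, SKETCH_CONT_PTE_SIZE);

	return end - start;			/* bytes, not pages */
}

/* Model of the count after the fix: bytes in, pages out. */
unsigned long sketch_nr_after_fix(unsigned long addr, unsigned long nr,
				  int first_is_cont, int last_is_cont)
{
	unsigned long start = addr;
	unsigned long end = start + nr * SKETCH_PAGE_SIZE;

	if (last_is_cont)
		end = sketch_align_up(end, SKETCH_CONT_PTE_SIZE);
	if (first_is_cont)
		start = sketch_align_down(start, SKETCH_CONT_PTE_SIZE);

	return (end - start) / SKETCH_PAGE_SIZE;
}
```

Under these assumptions, a 16-page range at 0x20000000 with both flags set
gives 65536 from sketch_nr_before_fix() and 16 from sketch_nr_after_fix().
When neither PTE is contiguous, the two versions happen to agree, which may
be why the problem only shows up with contiguous mappings.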
On 2024/5/24 08:54, Barry Song wrote:
> From: Barry Song <[email protected]>
>
> We are passing a huge nr to __clear_young_dirty_ptes() right
> now. While we should pass the number of pages, we are actually
> passing CONT_PTE_SIZE. This is causing lots of crashes in
> MADV_FREE, and the resulting oops varies every time.
>
> Fixes: 89e86854fb0a ("mm/arm64: override clear_young_dirty_ptes() batch helper")
> Cc: Lance Yang <[email protected]>
> Cc: Barry Song <[email protected]>
> Cc: Ryan Roberts <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Jeff Xie <[email protected]>
> Cc: Kefeng Wang <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Minchan Kim <[email protected]>
> Cc: Muchun Song <[email protected]>
> Cc: Peter Xu <[email protected]>
> Cc: Yang Shi <[email protected]>
> Cc: Yin Fengwei <[email protected]>
> Cc: Zach O'Keefe <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Signed-off-by: Barry Song <[email protected]>
Good catch. Feel free to add:
Reviewed-by: Baolin Wang <[email protected]>
Thanks a lot for reaching out, Barry!
On Fri, May 24, 2024 at 8:55 AM Barry Song <[email protected]> wrote:
>
> From: Barry Song <[email protected]>
>
> We are passing a huge nr to __clear_young_dirty_ptes() right
> now. While we should pass the number of pages, we are actually
Yes.
It's my mistake - sorry :(
> passing CONT_PTE_SIZE. This is causing lots of crashes in
> MADV_FREE, and the resulting oops varies every time.
>
> Fixes: 89e86854fb0a ("mm/arm64: override clear_young_dirty_ptes() batch helper")
> Cc: Lance Yang <[email protected]>
> Cc: Barry Song <[email protected]>
> Cc: Ryan Roberts <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Jeff Xie <[email protected]>
> Cc: Kefeng Wang <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Minchan Kim <[email protected]>
> Cc: Muchun Song <[email protected]>
> Cc: Peter Xu <[email protected]>
> Cc: Yang Shi <[email protected]>
> Cc: Yin Fengwei <[email protected]>
> Cc: Zach O'Keefe <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
LGTM.
Acked-by: Lance Yang <[email protected]>
On 24.05.24 02:54, Barry Song wrote:
> From: Barry Song <[email protected]>
>
> We are passing a huge nr to __clear_young_dirty_ptes() right
> now. While we should pass the number of pages, we are actually
> passing CONT_PTE_SIZE. This is causing lots of crashes in
> MADV_FREE, and the resulting oops varies every time.
>
> Fixes: 89e86854fb0a ("mm/arm64: override clear_young_dirty_ptes() batch helper")
> Cc: Lance Yang <[email protected]>
> Cc: Barry Song <[email protected]>
> Cc: Ryan Roberts <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Jeff Xie <[email protected]>
> Cc: Kefeng Wang <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Minchan Kim <[email protected]>
> Cc: Muchun Song <[email protected]>
> Cc: Peter Xu <[email protected]>
> Cc: Yang Shi <[email protected]>
> Cc: Yin Fengwei <[email protected]>
> Cc: Zach O'Keefe <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Signed-off-by: Barry Song <[email protected]>
> ---
LGTM
Acked-by: David Hildenbrand <[email protected]>
--
Cheers,
David / dhildenb
Good catch.
Acked-by: Chris Li <[email protected]>
Chris
On Fri, May 24, 2024 at 12:54:44PM +1200, Barry Song wrote:
> From: Barry Song <[email protected]>
>
> We are passing a huge nr to __clear_young_dirty_ptes() right
> now. While we should pass the number of pages, we are actually
> passing CONT_PTE_SIZE. This is causing lots of crashes in
> MADV_FREE, and the resulting oops varies every time.
>
> Fixes: 89e86854fb0a ("mm/arm64: override clear_young_dirty_ptes() batch helper")
I was seeing the same thing on v6.10-rc1 (syzkaller splat and reproducer
included at the end of the mail). The patch makes sense to me, and fixed the
splat in testing, so:
Reviewed-by: Mark Rutland <[email protected]>
Tested-by: Mark Rutland <[email protected]>
Since this only affects arm64 and is already in mainline, I assume the fix
should go via the arm64 tree even though the broken commit went via mm.
Mark.
---->8----
Syzkaller hit 'KASAN: use-after-free Read in contpte_clear_young_dirty_ptes' bug.
==================================================================
BUG: KASAN: use-after-free in __ptep_get arch/arm64/include/asm/pgtable.h:315 [inline]
BUG: KASAN: use-after-free in __clear_young_dirty_ptes arch/arm64/include/asm/pgtable.h:1309 [inline]
BUG: KASAN: use-after-free in contpte_clear_young_dirty_ptes+0x264/0x288 arch/arm64/mm/contpte.c:389
Read of size 8 at addr ffff000018c0d000 by task syz-executor392/193
CPU: 0 PID: 193 Comm: syz-executor392 Not tainted 6.10.0-rc1-00001-g30b7f99b25b6 #1
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x12c/0x1f8 arch/arm64/kernel/stacktrace.c:317
show_stack+0x34/0x50 arch/arm64/kernel/stacktrace.c:324
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x184/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0xf4/0x5b0 mm/kasan/report.c:488
kasan_report+0xc0/0x100 mm/kasan/report.c:601
__asan_report_load8_noabort+0x20/0x30 mm/kasan/report_generic.c:381
__ptep_get arch/arm64/include/asm/pgtable.h:315 [inline]
__clear_young_dirty_ptes arch/arm64/include/asm/pgtable.h:1309 [inline]
contpte_clear_young_dirty_ptes+0x264/0x288 arch/arm64/mm/contpte.c:389
clear_young_dirty_ptes arch/arm64/include/asm/pgtable.h:1715 [inline]
madvise_free_pte_range+0xa5c/0x16d8 mm/madvise.c:767
walk_pmd_range mm/pagewalk.c:143 [inline]
walk_pud_range mm/pagewalk.c:221 [inline]
walk_p4d_range mm/pagewalk.c:256 [inline]
walk_pgd_range+0xca4/0x1900 mm/pagewalk.c:293
__walk_page_range+0x4bc/0x5b8 mm/pagewalk.c:395
walk_page_range+0x4a4/0x840 mm/pagewalk.c:521
madvise_free_single_vma+0x3a0/0x798 mm/madvise.c:815
madvise_dontneed_free mm/madvise.c:929 [inline]
madvise_vma_behavior mm/madvise.c:1046 [inline]
madvise_walk_vmas mm/madvise.c:1268 [inline]
do_madvise+0x54c/0x2990 mm/madvise.c:1464
__do_sys_madvise mm/madvise.c:1481 [inline]
__se_sys_madvise mm/madvise.c:1479 [inline]
__arm64_sys_madvise+0x94/0xf8 mm/madvise.c:1479
__invoke_syscall arch/arm64/kernel/syscall.c:34 [inline]
invoke_syscall+0x8c/0x2e0 arch/arm64/kernel/syscall.c:48
el0_svc_common.constprop.0+0xec/0x2a8 arch/arm64/kernel/syscall.c:133
do_el0_svc+0x4c/0x70 arch/arm64/kernel/syscall.c:152
el0_svc+0x54/0x160 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x120/0x130 arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x1a4/0x1a8 arch/arm64/kernel/entry.S:598
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x58c0d
flags: 0x3fffe0000000000(node=0|zone=0|lastcpupid=0x1ffff)
raw: 03fffe0000000000 fffffdffc0630388 fffffdffc071cc48 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff000018c0cf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff000018c0cf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff000018c0d000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^
ffff000018c0d080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff000018c0d100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Syzkaller reproducer:
# {Threaded:false Repeat:false RepeatTimes:0 Procs:1 Slowdown:1 Sandbox: SandboxArg:0 Leak:false NetInjection:false NetDevices:false NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:false Swap:false UseTmpDir:false HandleSegv:false Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
madvise(&(0x7f0000ffd000/0x3000)=nil, 0x3000, 0x17)
mprotect(&(0x7f0000ffc000/0x4000)=nil, 0x4000, 0x0)
mprotect(&(0x7f0000800000/0x800000)=nil, 0x800000, 0x1)
madvise(&(0x7f0000400000/0xc00000)=nil, 0xc00000, 0x8)
C reproducer:
// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_madvise
#define __NR_madvise 233
#endif
#ifndef __NR_mmap
#define __NR_mmap 222
#endif
#ifndef __NR_mprotect
#define __NR_mprotect 226
#endif

int main(void)
{
  syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
  syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul, /*prot=PROT_WRITE|PROT_READ|PROT_EXEC*/7ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
  syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
  if (write(1, "executing program\n", sizeof("executing program\n") - 1)) {}
  syscall(__NR_madvise, /*addr=*/0x20ffd000ul, /*len=*/0x3000ul, /*advice=MADV_POPULATE_WRITE*/0x17ul);
  syscall(__NR_mprotect, /*addr=*/0x20ffc000ul, /*len=*/0x4000ul, /*prot=*/0ul);
  syscall(__NR_mprotect, /*addr=*/0x20800000ul, /*len=*/0x800000ul, /*prot=PROT_READ*/1ul);
  syscall(__NR_madvise, /*addr=*/0x20400000ul, /*len=*/0xc00000ul, /*advice=*/8ul);
  return 0;
}
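
The faulting address in the splat above, ffff000018c0d000, is page-aligned,
and the shadow dump shows the bytes just before it as accessible and the bytes
from that address on as poisoned. That is consistent with the __ptep_get()
loop walking off the end of a page-table page once nr is far too large. A
minimal sketch of that arithmetic, assuming 4K pages and 8-byte PTEs (512
entries per table); the sketch_* helpers are hypothetical and model only the
index arithmetic, not the kernel's page-table walk:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: 4K pages, 8-byte PTEs (matches the 8-byte read). */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PTE_BYTES 8UL

/* Number of PTE slots in one page-table page under these assumptions. */
unsigned long sketch_ptes_per_table(void)
{
	return SKETCH_PAGE_SIZE / SKETCH_PTE_BYTES;
}

/* Nonzero if kaddr is the first byte of a page. */
int sketch_is_page_start(uint64_t kaddr)
{
	return (kaddr & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/*
 * How many entries a walk of nr entries, starting at slot `first`,
 * would read beyond the end of the table. Returns 0 if the walk
 * stays inside the table.
 */
unsigned long sketch_entries_past_table(unsigned long first, unsigned long nr)
{
	unsigned long per_table = sketch_ptes_per_table();
	unsigned long room;

	assert(first < per_table);
	room = per_table - first;

	return nr <= room ? 0 : nr - room;
}
```

For example, a walk over the last contiguous block of a table (slot 496) stays
in bounds with nr = 16, but with nr = 65536 it would run 65520 entries past the
end, and the first out-of-bounds read lands on the first byte of the next page.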
On Tue, May 28, 2024 at 8:26 PM Mark Rutland <[email protected]> wrote:
>
> On Fri, May 24, 2024 at 12:54:44PM +1200, Barry Song wrote:
> > From: Barry Song <[email protected]>
> >
> > We are passing a huge nr to __clear_young_dirty_ptes() right
> > now. While we should pass the number of pages, we are actually
> > passing CONT_PTE_SIZE. This is causing lots of crashes in
> > MADV_FREE, and the resulting oops varies every time.
> >
> > Fixes: 89e86854fb0a ("mm/arm64: override clear_young_dirty_ptes() batch helper")
>
> I was seeing the same thing on v6.10-rc1 (syzkaller splat and reproducer
> included at the end of the mail). The patch makes sense to me, and fixed the
> splat in testing, so:
>
> Reviewed-by: Mark Rutland <[email protected]>
> Tested-by: Mark Rutland <[email protected]>
Thanks!
>
> Since this only affects arm64 and is already in mainline, I assume the fix
> should go via the arm64 tree even though the broken commit went via mm.
Either mm or arm64 is fine with me, but I noticed that Andrew has already
included it in mm-hotfixes-unstable. If it works, we may want to stick with
that. :-)
Thanks
Barry
On Tue, May 28, 2024 at 08:39:55PM +1200, Barry Song wrote:
> On Tue, May 28, 2024 at 8:26 PM Mark Rutland <[email protected]> wrote:
> > On Fri, May 24, 2024 at 12:54:44PM +1200, Barry Song wrote:
> > > From: Barry Song <[email protected]>
> > >
> > > We are passing a huge nr to __clear_young_dirty_ptes() right
> > > now. While we should pass the number of pages, we are actually
> > > passing CONT_PTE_SIZE. This is causing lots of crashes in
> > > MADV_FREE, and the resulting oops varies every time.
> > >
> > > Fixes: 89e86854fb0a ("mm/arm64: override clear_young_dirty_ptes() batch helper")
> >
> > I was seeing the same thing on v6.10-rc1 (syzkaller splat and reproducer
> > included at the end of the mail). The patch makes sense to me, and fixed the
> > splat in testing, so:
> >
> > Reviewed-by: Mark Rutland <[email protected]>
> > Tested-by: Mark Rutland <[email protected]>
>
> Thanks!
>
> > Since this only affects arm64 and is already in mainline, I assume the fix
> > should go via the arm64 tree even though the broken commit went via mm.
>
> Either mm or arm64 is fine with me, but I noticed that Andrew has already
> included it in mm-hotfixes-unstable. If it works, we may want to stick with
> that. :-)
Going via mm is also fine by me, I had just expected it'd be quicker to
go via arm64 (and evidently I was wrong there!). :)
Mark.
On Wed, May 29, 2024 at 03:59:13PM +0100, Mark Rutland wrote:
> On Tue, May 28, 2024 at 08:39:55PM +1200, Barry Song wrote:
> > On Tue, May 28, 2024 at 8:26 PM Mark Rutland <[email protected]> wrote:
> > > On Fri, May 24, 2024 at 12:54:44PM +1200, Barry Song wrote:
> > > > From: Barry Song <[email protected]>
> > > >
> > > > We are passing a huge nr to __clear_young_dirty_ptes() right
> > > > now. While we should pass the number of pages, we are actually
> > > > passing CONT_PTE_SIZE. This is causing lots of crashes in
> > > > MADV_FREE, and the resulting oops varies every time.
> > > >
> > > > Fixes: 89e86854fb0a ("mm/arm64: override clear_young_dirty_ptes() batch helper")
> > >
> > > I was seeing the same thing on v6.10-rc1 (syzkaller splat and reproducer
> > > included at the end of the mail). The patch makes sense to me, and fixed the
> > > splat in testing, so:
> > >
> > > Reviewed-by: Mark Rutland <[email protected]>
> > > Tested-by: Mark Rutland <[email protected]>
> >
> > Thanks!
> >
> > > Since this only affects arm64 and is already in mainline, I assume the fix
> > > should go via the arm64 tree even though the broken commit went via mm.
> >
> > Either mm or arm64 is fine with me, but I noticed that Andrew has already
> > included it in mm-hotfixes-unstable. If it works, we may want to stick with
> > that. :-)
>
> Going via mm is also fine by me, I had just expected it'd be quicker to
> go via arm64 (and evidently I was wrong there!). :)
Sorry, I was fishing! I'm happy for it to land via -mm.
Will