LinuxLists.cc - [syzbot] BUG: unable to handle kernel NULL pointer dereference in set_page

2022-08-25 16:29:31

Subject: [syzbot] BUG: unable to handle kernel NULL pointer dereference in set_page_dirty

Hello,

syzbot found the following issue on:

HEAD commit: a41a877bc12d Merge branch 'for-next/fixes' into for-kernelci
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=175def47080000
kernel config: https://syzkaller.appspot.com/x/.config?x=5cea15779c42821c
dashboard link: https://syzkaller.appspot.com/bug?extid=775a3440817f74fddb8c
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
userspace arch: arm64

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:
ESR = 0x0000000086000005
EC = 0x21: IABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
user pgtable: 4k pages, 48-bit VAs, pgdp=00000001249cc000
[0000000000000000] pgd=080000012ee65003, p4d=080000012ee65003, pud=0000000000000000
Internal error: Oops: 86000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 3044 Comm: syz-executor.0 Not tainted 6.0.0-rc2-syzkaller-16455-ga41a877bc12d #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/20/2022
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : 0x0
lr : folio_mark_dirty+0xbc/0x208 mm/page-writeback.c:2748
sp : ffff800012803830
x29: ffff800012803830 x28: ffff0000d02c8000 x27: 0000000000000009
x26: 0000000000000001 x25: 0000000000000a00 x24: 0000000000000080
x23: 0000000000000000 x22: ffff0000ef276c00 x21: 05ffc00000000007
x20: ffff0000f14b83b8 x19: fffffc00036409c0 x18: fffffffffffffff5
x17: ffff80000dd7a698 x16: ffff80000dbb8658 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: ff808000083e9814 x10: 0000000000000000 x9 : ffff8000083e9814
x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
x5 : ffff0000d9028000 x4 : ffff0000d5c31000 x3 : ffff0000d9027f80
x2 : fffffffffffffff0 x1 : fffffc00036409c0 x0 : ffff0000f14b83b8
Call trace:
0x0
set_page_dirty+0x38/0xbc mm/folio-compat.c:62
get_next_nat_page+0x198/0x300 fs/f2fs/node.c:154
__flush_nat_entry_set fs/f2fs/node.c:3005 [inline]
f2fs_flush_nat_entries+0x354/0x988 fs/f2fs/node.c:3109
f2fs_write_checkpoint+0x350/0x568 fs/f2fs/checkpoint.c:1667
f2fs_issue_checkpoint+0x1b0/0x234
f2fs_sync_fs+0x8c/0xc8 fs/f2fs/super.c:1651
sync_filesystem+0xe0/0x134 fs/sync.c:66
generic_shutdown_super+0x38/0x190 fs/super.c:474
kill_block_super+0x30/0x78 fs/super.c:1427
kill_f2fs_super+0x140/0x184 fs/f2fs/super.c:4544
deactivate_locked_super+0x70/0xd4 fs/super.c:332
deactivate_super+0xb8/0xbc fs/super.c:363
cleanup_mnt+0x1f8/0x234 fs/namespace.c:1186
__cleanup_mnt+0x20/0x30 fs/namespace.c:1193
task_work_run+0xc4/0x208 kernel/task_work.c:177
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
do_notify_resume+0x174/0x1d0 arch/arm64/kernel/signal.c:1127
prepare_exit_to_user_mode arch/arm64/kernel/entry-common.c:137 [inline]
exit_to_user_mode arch/arm64/kernel/entry-common.c:142 [inline]
el0_svc+0x9c/0x150 arch/arm64/kernel/entry-common.c:625
el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:642
el0t_64_sync+0x18c/0x190
Code: bad PC value
---[ end trace 0000000000000000 ]---

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

2022-08-26 02:08:19

by Andrew Morton

Subject: Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in set_page_dirty

(cc f2fs developers)


2022-08-26 19:14:54

by syzbot

Subject: Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in set_page_dirty

syzbot has found a reproducer for the following issue on:

HEAD commit: a41a877bc12d Merge branch 'for-next/fixes' into for-kernelci
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=104eb875080000
kernel config: https://syzkaller.appspot.com/x/.config?x=5cea15779c42821c
dashboard link: https://syzkaller.appspot.com/bug?extid=775a3440817f74fddb8c
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
userspace arch: arm64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15aebce7080000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=167b5e33080000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:
ESR = 0x0000000086000004
EC = 0x21: IABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
user pgtable: 4k pages, 48-bit VAs, pgdp=0000000109ee4000
[0000000000000000] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 86000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 3045 Comm: syz-executor330 Not tainted 6.0.0-rc2-syzkaller-16455-ga41a877bc12d #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : 0x0
lr : folio_mark_dirty+0xbc/0x208 mm/page-writeback.c:2748
sp : ffff800012783970
x29: ffff800012783970 x28: 0000000000000000 x27: ffff800012783b08
x26: 0000000000000001 x25: 0000000000000400 x24: 0000000000000001
x23: ffff0000c736e000 x22: 0000000000000045 x21: 05ffc00000000015
x20: ffff0000ca7403b8 x19: fffffc00032ec600 x18: 0000000000000181
x17: ffff80000c04d6bc x16: ffff80000dbb8658 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: ff808000083e9814 x10: 0000000000000000 x9 : ffff8000083e9814
x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
x5 : ffff0000cbb19000 x4 : ffff0000cb3d2000 x3 : ffff0000cbb18f80
x2 : fffffffffffffff0 x1 : fffffc00032ec600 x0 : ffff0000ca7403b8
Call trace:
0x0
set_page_dirty+0x38/0xbc mm/folio-compat.c:62
f2fs_update_meta_page+0x80/0xa8 fs/f2fs/segment.c:2369
do_checkpoint+0x794/0xea8 fs/f2fs/checkpoint.c:1522
f2fs_write_checkpoint+0x3b8/0x568 fs/f2fs/checkpoint.c:1679
f2fs_issue_checkpoint+0x1b0/0x234
f2fs_sync_fs+0x8c/0xc8 fs/f2fs/super.c:1651
sync_filesystem+0xe0/0x134 fs/sync.c:66
generic_shutdown_super+0x38/0x190 fs/super.c:474
kill_block_super+0x30/0x78 fs/super.c:1427
kill_f2fs_super+0x140/0x184 fs/f2fs/super.c:4544
deactivate_locked_super+0x70/0xd4 fs/super.c:332
deactivate_super+0xb8/0xbc fs/super.c:363
cleanup_mnt+0x1f8/0x234 fs/namespace.c:1186
__cleanup_mnt+0x20/0x30 fs/namespace.c:1193
task_work_run+0xc4/0x208 kernel/task_work.c:177
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
do_notify_resume+0x174/0x1d0 arch/arm64/kernel/signal.c:1127
prepare_exit_to_user_mode arch/arm64/kernel/entry-common.c:137 [inline]
exit_to_user_mode arch/arm64/kernel/entry-common.c:142 [inline]
el0_svc+0x9c/0x150 arch/arm64/kernel/entry-common.c:625
el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:642
el0t_64_sync+0x18c/0x190
Code: bad PC value
---[ end trace 0000000000000000 ]---

2022-08-29 18:14:14

by Jaegeuk Kim

Subject: Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in set_page_dirty

On 08/25, Andrew Morton wrote:
> (cc f2fs developers)
>
> On Thu, 25 Aug 2022 08:29:32 -0700 syzbot <[email protected]> wrote:
>
> > Call trace:
> > 0x0
> > set_page_dirty+0x38/0xbc mm/folio-compat.c:62

2363 void f2fs_update_meta_page(struct f2fs_sb_info *sbi,
2364 void *src, block_t blk_addr)
2365 {
2366 struct page *page = f2fs_grab_meta_page(sbi, blk_addr);

--> f2fs_grab_meta_page() gives a locked page by grab_cache_page().

2367
2368 memcpy(page_address(page), src, PAGE_SIZE);
2369 set_page_dirty(page);
2370 f2fs_put_page(page, 1);
2371 }

Is there a change in folio?


2022-08-29 18:59:48

by Jaegeuk Kim

Subject: Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in set_page_dirty

On 08/29, Matthew Wilcox wrote:
>
> Not directly, but there was a related change, 0af573780b0b which
> requires aops->set_page_dirty to be set; is that perhaps missing?
> I don't see one in the f2fs_compress_aops, for example.

Do you mean dirty_folio? I think all aops have it except the compressed one,
whose pages we never mark dirty.

>
> The other possibility is that it's a mapping that is missing an ->a_ops.
> Is that something f2fs ever does?

Hmm, no, I haven't seen this before, and we set aops when mounting the
file system. Ah, if this happens on the corrupted image, yeah, maybe.. I need
to check the error path in f2fs_fill_super.


2022-08-29 19:01:04

by Matthew Wilcox

Subject: Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in set_page_dirty

On Mon, Aug 29, 2022 at 10:52:57AM -0700, Jaegeuk Kim wrote:
>
> 2363 void f2fs_update_meta_page(struct f2fs_sb_info *sbi,
> 2364 void *src, block_t blk_addr)
> 2365 {
> 2366 struct page *page = f2fs_grab_meta_page(sbi, blk_addr);
>
> --> f2fs_grab_meta_page() gives a locked page by grab_cache_page().
>
> 2367
> 2368 memcpy(page_address(page), src, PAGE_SIZE);
> 2369 set_page_dirty(page);
> 2370 f2fs_put_page(page, 1);
> 2371 }
>
> Is there a change in folio?

Not directly, but there was a related change, 0af573780b0b which
requires aops->set_page_dirty to be set; is that perhaps missing?
I don't see one in the f2fs_compress_aops, for example.

The other possibility is that it's a mapping that is missing an ->a_ops.
Is that something f2fs ever does?

I only managed to narrow down the crash to the line:
return mapping->a_ops->dirty_folio(mapping, folio);
so either mapping->a_ops is NULL or mapping->a_ops->dirty_folio is
NULL. The reproducer was on ARM and ARM doesn't emit a 'Code:' line,
unlike x86.

2022-08-29 23:04:18

by Jaegeuk Kim

Subject: Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in set_page_dirty

On 08/29, Jaegeuk Kim wrote:
> On 08/29, Matthew Wilcox wrote:
> > Not directly, but there was a related change, 0af573780b0b which
> > requires aops->set_page_dirty to be set; is that perhaps missing?
> > I don't see one in the f2fs_compress_aops, for example.
>
> Do you mean dirty_folio? I think all aops have it except the compressed one,
> whose pages we never mark dirty.
>
> >
> > The other possibility is that it's a mapping that is missing an ->a_ops.
> > Is that something f2fs ever does?
>
> Hmm, no, I haven't seen this before, and we set aops when mounting the
> file system. Ah, if this happens on the corrupted image, yeah, maybe.. I need
> to check the error path in f2fs_fill_super.

Fixed by https://lore.kernel.org/linux-f2fs-devel/[email protected]/T/#u
