2022-10-30 11:26:10

by Wei Chen

[permalink] [raw]
Subject: INFO: task hung in fscrypt_ioctl_set_policy

Dear Linux Developer,

Recently when using our tool to fuzz kernel, the following crash was triggered:

HEAD commit: 64570fbc14f8 Linux 5.15-rc5
git tree: upstream
compiler: gcc 8.0.1
console output:
https://drive.google.com/file/d/1QcXGrdd3Li0ylo1WBfswWukWV__QZS4h/view?usp=share_link
kernel config: https://drive.google.com/file/d/1uDOeEYgJDcLiSOrx9W8v2bqZ6uOA_55t/view?usp=share_link

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: Wei Chen <[email protected]>

INFO: task syz-executor.0:29056 blocked for more than 143 seconds.
Not tainted 5.15.0-rc5 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.0 state:D stack:14632 pid:29056 ppid: 6574 flags:0x00000004
Call Trace:
__schedule+0x4a1/0x1720
schedule+0x36/0xe0
rwsem_down_write_slowpath+0x322/0x7a0
fscrypt_ioctl_set_policy+0x11f/0x2a0
__f2fs_ioctl+0x1a9f/0x5780
f2fs_ioctl+0x89/0x3a0
__x64_sys_ioctl+0xe8/0x140
do_syscall_64+0x34/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4692c9
RSP: 002b:00007f9fecafbc38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004692c9
RDX: 0000000020002900 RSI: 00000000800c6613 RDI: 0000000000000003
RBP: 000000000119c1d8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000119c1d4
R13: 0000000000000000 R14: 000000000119c1c8 R15: 00007fffbfabbc40

Showing all locks held in the system:
1 lock held by khungtaskd/29:
#0: ffffffff8641dee0 (rcu_read_lock){....}-{1:2}, at:
debug_show_all_locks+0x15/0x17a
1 lock held by kswapd0/269:
1 lock held by in:imklog/6194:
#0: ffff888100174ef0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x92/0xa0
1 lock held by systemd-udevd/16539:
3 locks held by syz-executor.0/29046:
2 locks held by syz-executor.0/29056:
#0: ffff88811912f460 (sb_writers#22){.+.+}-{0:0}, at:
fscrypt_ioctl_set_policy+0xfc/0x2a0
#1: ffff88810a98e3f8 (&type->i_mutex_dir_key#12){++++}-{3:3}, at:
fscrypt_ioctl_set_policy+0x11f/0x2a0

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 29 Comm: khungtaskd Not tainted 5.15.0-rc5 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack_lvl+0xcd/0x134
nmi_cpu_backtrace.cold.8+0xf3/0x118
nmi_trigger_cpumask_backtrace+0x18f/0x1c0
watchdog+0x9a0/0xb10
kthread+0x1a6/0x1e0
ret_from_fork+0x1f/0x30
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 2973 Comm: systemd-journal Not tainted 5.15.0-rc5 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
RIP: 0010:lock_is_held_type+0xff/0x140
Code: 10 00 00 ba ff ff ff ff 65 0f c1 15 cb 92 43 7b 83 fa 01 75 28
9c 58 f6 c4 02 75 3c 41 f7 c7 00 02 00 00 74 01 fb 48 83 c4 08 <44> 89
e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 45 31 ed eb ba 0f 0b 48
RSP: 0018:ffffc9000079bd80 EFLAGS: 00000292
RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffffff8641dea0 R08: 0000000000000000 R09: 0000000000000001
R10: ffffffff86462280 R11: 0000000000000000 R12: ffff8881050bd280
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000246
FS: 00007f0a317238c0(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0a2d3c2000 CR3: 0000000012f8f000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
___might_sleep+0x21/0x150
kmem_cache_alloc+0x279/0x310
getname_flags+0x5f/0x2c0
do_sys_openat2+0x3c8/0x6a0
do_sys_open+0x84/0xe0
do_syscall_64+0x34/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f0a30cb385d
Code: bb 20 00 00 75 10 b8 02 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31
c3 48 83 ec 08 e8 1e f6 ff ff 48 89 04 24 b8 02 00 00 00 0f 05 <48> 8b
3c 24 48 89 c2 e8 67 f6 ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007fff8f43b630 EFLAGS: 00000293 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007fff8f43b940 RCX: 00007f0a30cb385d
RDX: 00000000000001a0 RSI: 0000000000080042 RDI: 0000555919efe460
RBP: 000000000000000d R08: 000000000000ffc0 R09: 00000000ffffffff
R10: 0000000000000069 R11: 0000000000000293 R12: 00000000ffffffff
R13: 0000555919ef1040 R14: 00007fff8f43b900 R15: 0000555919efe280
----------------
Code disassembly (best guess):
0: 10 00 adc %al,(%rax)
2: 00 ba ff ff ff ff add %bh,-0x1(%rdx)
8: 65 0f c1 15 cb 92 43 xadd %edx,%gs:0x7b4392cb(%rip) # 0x7b4392db
f: 7b
10: 83 fa 01 cmp $0x1,%edx
13: 75 28 jne 0x3d
15: 9c pushfq
16: 58 pop %rax
17: f6 c4 02 test $0x2,%ah
1a: 75 3c jne 0x58
1c: 41 f7 c7 00 02 00 00 test $0x200,%r15d
23: 74 01 je 0x26
25: fb sti
26: 48 83 c4 08 add $0x8,%rsp
* 2a: 44 89 e8 mov %r13d,%eax <-- trapping instruction
2d: 5b pop %rbx
2e: 5d pop %rbp
2f: 41 5c pop %r12
31: 41 5d pop %r13
33: 41 5e pop %r14
35: 41 5f pop %r15
37: c3 retq
38: 45 31 ed xor %r13d,%r13d
3b: eb ba jmp 0xfffffff7
3d: 0f 0b ud2
3f: 48 rex.W

Best,
Wei


2022-10-31 14:48:49

by Wei Chen

[permalink] [raw]
Subject: Re: INFO: task hung in fscrypt_ioctl_set_policy

Dear Linux developers,

Here is the link to the reproducers.

C reproducer: https://drive.google.com/file/d/1mduYsYuoOKemH3qkvpDQwnAHAaaLUp0Y/view?usp=share_link
Syz reproducer:
https://drive.google.com/file/d/1mu-_w7dy_562vWRlQvTRbcBjG4_G7b2L/view?usp=share_link

The bug persists in the latest commit, v5.15.76 (4f5365f77018). I hope
it is helpful to you.

[ 1782.137186][ T30] INFO: task a.out:6910 blocked for more than 143 seconds.
[ 1782.139217][ T30] Not tainted 5.15.76 #5
[ 1782.140388][ T30] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1782.142524][ T30] task:a.out state:D stack:14296 pid:
6910 ppid: 6532 flags:0x00004004
[ 1782.144799][ T30] Call Trace:
[ 1782.145623][ T30] <TASK>
[ 1782.146316][ T30] __schedule+0x3e8/0x1850
[ 1782.152029][ T30] ? mark_held_locks+0x49/0x70
[ 1782.153533][ T30] ? mark_held_locks+0x10/0x70
[ 1782.154759][ T30] ? __down_write_common.part.14+0x31f/0x7b0
[ 1782.156159][ T30] schedule+0x4e/0xe0
[ 1782.158314][ T30] __down_write_common.part.14+0x324/0x7b0
[ 1782.159704][ T30] ? fscrypt_ioctl_set_policy+0xe0/0x200
[ 1782.161050][ T30] fscrypt_ioctl_set_policy+0xe0/0x200
[ 1782.162330][ T30] __f2fs_ioctl+0x9d6/0x45e0
[ 1782.163417][ T30] f2fs_ioctl+0x64/0x240
[ 1782.164404][ T30] ? __f2fs_ioctl+0x45e0/0x45e0
[ 1782.165554][ T30] __x64_sys_ioctl+0xb6/0x100
[ 1782.166662][ T30] do_syscall_64+0x34/0xb0
[ 1782.169947][ T30] entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 1782.171348][ T30] RIP: 0033:0x7f55690bb469
[ 1782.172385][ T30] RSP: 002b:00007f556976aef8 EFLAGS: 00000246
ORIG_RAX: 0000000000000010
[ 1782.174344][ T30] RAX: ffffffffffffffda RBX: 0000000000000000
RCX: 00007f55690bb469
[ 1782.176198][ T30] RDX: 0000000020002900 RSI: 00000000800c6613
RDI: 0000000000000003
[ 1782.179056][ T30] RBP: 00007f556976af20 R08: 0000000000000000
R09: 0000000000000000
[ 1782.180945][ T30] R10: 0000000000000000 R11: 0000000000000246
R12: 00007ffee37a15ce
[ 1782.182812][ T30] R13: 00007ffee37a15cf R14: 00007f556974b000
R15: 0000000000000003
[ 1782.184690][ T30] </TASK>
[ 1782.185445][ T30]
[ 1782.185445][ T30] Showing all locks held in the system:
[ 1782.189142][ T30] 1 lock held by khungtaskd/30:
[ 1782.190295][ T30] #0: ffffffff85a1dc60
(rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0xe/0x1a0
[ 1782.192615][ T30] 1 lock held by in:imklog/6238:
[ 1782.193768][ T30] #0: ffff88810a68f6f0
(&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x55/0x60
[ 1782.204980][ T30] 2 locks held by a.out/6900:
[ 1782.206094][ T30] 2 locks held by a.out/6910:
[ 1782.208314][ T30] #0: ffff888011f73460
(sb_writers#14){.+.+}-{0:0}, at: fscrypt_ioctl_set_policy+0xc6/0x200
[ 1782.210728][ T30] #1: ffff888013f49350
(&type->i_mutex_dir_key#8){+.+.}-{3:3}, at:
fscrypt_ioctl_set_policy+0xe0/0x200
[ 1782.213346][ T30]
[ 1782.213891][ T30] =============================================
[ 1782.213891][ T30]
[ 1782.215851][ T30] NMI backtrace for cpu 1
[ 1782.216913][ T30] CPU: 1 PID: 30 Comm: khungtaskd Not tainted 5.15.76 #5
[ 1782.218539][ T30] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
[ 1782.221460][ T30] Call Trace:
[ 1782.222289][ T30] <TASK>
[ 1782.223009][ T30] dump_stack_lvl+0x8d/0xcf
[ 1782.224108][ T30] nmi_cpu_backtrace+0x1e9/0x210
[ 1782.225309][ T30] ? lapic_can_unplug_cpu+0x90/0x90
[ 1782.226550][ T30] nmi_trigger_cpumask_backtrace+0x129/0x190
[ 1782.228048][ T30] watchdog+0x4e1/0x970
[ 1782.229093][ T30] ? hungtask_pm_notify+0x70/0x70
[ 1782.230337][ T30] kthread+0x178/0x1b0
[ 1782.231359][ T30] ? set_kthread_struct+0x50/0x50
[ 1782.232617][ T30] ret_from_fork+0x1f/0x30
[ 1782.233731][ T30] </TASK>
[ 1782.234606][ T30] Sending NMI from CPU 1 to CPUs 0:
[ 1782.235927][ C0] NMI backtrace for cpu 0
[ 1782.235933][ C0] CPU: 0 PID: 2974 Comm: systemd-journal Not
tainted 5.15.76 #5
[ 1782.235940][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
[ 1782.235946][ C0] RIP: 0010:__lock_acquire+0x840/0x1d90
[ 1782.235958][ C0] Code: d2 75 17 e8 e2 26 ff 00 85 c0 74 0e 8b 05
60 e4 e1 04 85 c0 0f 84 29 02 00 00 31 c0 48 83 78 40 00 75 02 0f 0b
0f b7 44 24 28 <8b> 4c 24 30 c1 e0 0d 66 0b 44 24 20 98 2b 44 24 18 33
44 24 10 89
[ 1782.235965][ C0] RSP: 0018:ffffc9000080bdb0 EFLAGS: 00000082
[ 1782.235971][ C0] RAX: 0000000000000000 RBX: ffff88801024dd60
RCX: 0000000000000008
[ 1782.235975][ C0] RDX: 0000000000000008 RSI: 0000000000000000
RDI: 0000000000000000
[ 1782.235979][ C0] RBP: 00000000000000a4 R08: 0000000000000000
R09: 0000000000000001
[ 1782.235983][ C0] R10: 0000000000000000 R11: 0000000000000000
R12: 0000000000000001
[ 1782.235986][ C0] R13: ffff88801024d340 R14: ffff88801024dd60
R15: ffffffffffffffff
[ 1782.235991][ C0] FS: 00007f88e2cda8c0(0000)
GS:ffff88807dc00000(0000) knlGS:0000000000000000
[ 1782.235999][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1782.236003][ C0] CR2: 00007f88e0039008 CR3: 0000000011fc2000
CR4: 00000000003506f0
[ 1782.236007][ C0] DR0: 0000000000000000 DR1: 0000000000000000
DR2: 0000000000000000
[ 1782.236011][ C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0
DR7: 0000000000000400
[ 1782.236015][ C0] Call Trace:
[ 1782.236017][ C0] <TASK>
[ 1782.236022][ C0] lock_acquire+0x294/0x350
[ 1782.236030][ C0] ? __context_tracking_exit+0x88/0x1f0
[ 1782.236040][ C0] ? __context_tracking_enter+0x100/0x210
[ 1782.236047][ C0] vtime_user_exit+0x50/0xe0
[ 1782.236056][ C0] ? __context_tracking_exit+0x88/0x1f0
[ 1782.236062][ C0] __context_tracking_exit+0x88/0x1f0
[ 1782.236069][ C0] syscall_enter_from_user_mode+0x4a/0x70
[ 1782.236077][ C0] do_syscall_64+0x15/0xb0
[ 1782.236083][ C0] entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 1782.236092][ C0] RIP: 0033:0x7f88e1f73f17
[ 1782.236098][ C0] Code: ff ff ff 48 8b 4d a0 0f b7 51 fe 48 8b 4d
a8 66 89 54 08 fe e9 1a ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 b8 27
00 00 00 0f 05 <c3> 0f 1f 84 00 00 00 00 00 b8 6e 00 00 00 0f 05 c3 0f
1f 84 00 00
[ 1782.236104][ C0] RSP: 002b:00007fff9ce26b28 EFLAGS: 00000202
ORIG_RAX: 0000000000000027
[ 1782.236110][ C0] RAX: ffffffffffffffda RBX: 000055d5d4d9d1e0
RCX: 00007f88e1f73f17
[ 1782.236114][ C0] RDX: 00000000ffffffff RSI: 00007fff9ce26bb0
RDI: 000055d5d4d9d1e0
[ 1782.236118][ C0] RBP: 0000000000000b9e R08: 00007fff9cedf080
R09: 00007fff9cedf080
[ 1782.236122][ C0] R10: 0000000000034d62 R11: 0000000000000202
R12: 00007fff9ce26bb0
[ 1782.236126][ C0] R13: 00007fff9ce26ba8 R14: 000055d5d3122958
R15: 0005ec5535de5711
[ 1782.236132][ C0] </TASK>
[ 1782.267071][ T30] Kernel panic - not syncing: hung_task: blocked tasks
[ 1782.301186][ T30] CPU: 1 PID: 30 Comm: khungtaskd Not tainted 5.15.76 #5
[ 1782.302914][ T30] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
[ 1782.305923][ T30] Call Trace:
[ 1782.306711][ T30] <TASK>
[ 1782.307482][ T30] dump_stack_lvl+0x8d/0xcf
[ 1782.308605][ T30] panic+0x165/0x3ed
[ 1782.309572][ T30] ? watchdog+0x4f8/0x970
[ 1782.310645][ T30] watchdog+0x504/0x970
[ 1782.311670][ T30] ? hungtask_pm_notify+0x70/0x70
[ 1782.312906][ T30] kthread+0x178/0x1b0
[ 1782.313914][ T30] ? set_kthread_struct+0x50/0x50
[ 1782.315142][ T30] ret_from_fork+0x1f/0x30
[ 1782.316235][ T30] </TASK>
[ 1782.317156][ T30] Kernel Offset: disabled
[ 1782.318271][ T30] Rebooting in 86400 seconds..

Best,
Wei


On Sun, 30 Oct 2022 at 18:40, Wei Chen <[email protected]> wrote:
>
> Dear Linux Developer,
>
> Recently when using our tool to fuzz kernel, the following crash was triggered:
>
> HEAD commit: 64570fbc14f8 Linux 5.15-rc5
> git tree: upstream
> compiler: gcc 8.0.1
> console output:
> https://drive.google.com/file/d/1QcXGrdd3Li0ylo1WBfswWukWV__QZS4h/view?usp=share_link
> kernel config: https://drive.google.com/file/d/1uDOeEYgJDcLiSOrx9W8v2bqZ6uOA_55t/view?usp=share_link
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: Wei Chen <[email protected]>
>
> INFO: task syz-executor.0:29056 blocked for more than 143 seconds.
> Not tainted 5.15.0-rc5 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz-executor.0 state:D stack:14632 pid:29056 ppid: 6574 flags:0x00000004
> Call Trace:
> __schedule+0x4a1/0x1720
> schedule+0x36/0xe0
> rwsem_down_write_slowpath+0x322/0x7a0
> fscrypt_ioctl_set_policy+0x11f/0x2a0
> __f2fs_ioctl+0x1a9f/0x5780
> f2fs_ioctl+0x89/0x3a0
> __x64_sys_ioctl+0xe8/0x140
> do_syscall_64+0x34/0xb0
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x4692c9
> RSP: 002b:00007f9fecafbc38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004692c9
> RDX: 0000000020002900 RSI: 00000000800c6613 RDI: 0000000000000003
> RBP: 000000000119c1d8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000119c1d4
> R13: 0000000000000000 R14: 000000000119c1c8 R15: 00007fffbfabbc40
>
> Showing all locks held in the system:
> 1 lock held by khungtaskd/29:
> #0: ffffffff8641dee0 (rcu_read_lock){....}-{1:2}, at:
> debug_show_all_locks+0x15/0x17a
> 1 lock held by kswapd0/269:
> 1 lock held by in:imklog/6194:
> #0: ffff888100174ef0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x92/0xa0
> 1 lock held by systemd-udevd/16539:
> 3 locks held by syz-executor.0/29046:
> 2 locks held by syz-executor.0/29056:
> #0: ffff88811912f460 (sb_writers#22){.+.+}-{0:0}, at:
> fscrypt_ioctl_set_policy+0xfc/0x2a0
> #1: ffff88810a98e3f8 (&type->i_mutex_dir_key#12){++++}-{3:3}, at:
> fscrypt_ioctl_set_policy+0x11f/0x2a0
>
> =============================================
>
> NMI backtrace for cpu 1
> CPU: 1 PID: 29 Comm: khungtaskd Not tainted 5.15.0-rc5 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
> Call Trace:
> dump_stack_lvl+0xcd/0x134
> nmi_cpu_backtrace.cold.8+0xf3/0x118
> nmi_trigger_cpumask_backtrace+0x18f/0x1c0
> watchdog+0x9a0/0xb10
> kthread+0x1a6/0x1e0
> ret_from_fork+0x1f/0x30
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0
> CPU: 0 PID: 2973 Comm: systemd-journal Not tainted 5.15.0-rc5 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
> RIP: 0010:lock_is_held_type+0xff/0x140
> Code: 10 00 00 ba ff ff ff ff 65 0f c1 15 cb 92 43 7b 83 fa 01 75 28
> 9c 58 f6 c4 02 75 3c 41 f7 c7 00 02 00 00 74 01 fb 48 83 c4 08 <44> 89
> e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 45 31 ed eb ba 0f 0b 48
> RSP: 0018:ffffc9000079bd80 EFLAGS: 00000292
> RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000001
> RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffffffff8641dea0 R08: 0000000000000000 R09: 0000000000000001
> R10: ffffffff86462280 R11: 0000000000000000 R12: ffff8881050bd280
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000246
> FS: 00007f0a317238c0(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f0a2d3c2000 CR3: 0000000012f8f000 CR4: 00000000003506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> ___might_sleep+0x21/0x150
> kmem_cache_alloc+0x279/0x310
> getname_flags+0x5f/0x2c0
> do_sys_openat2+0x3c8/0x6a0
> do_sys_open+0x84/0xe0
> do_syscall_64+0x34/0xb0
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f0a30cb385d
> Code: bb 20 00 00 75 10 b8 02 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31
> c3 48 83 ec 08 e8 1e f6 ff ff 48 89 04 24 b8 02 00 00 00 0f 05 <48> 8b
> 3c 24 48 89 c2 e8 67 f6 ff ff 48 89 d0 48 83 c4 08 48 3d 01
> RSP: 002b:00007fff8f43b630 EFLAGS: 00000293 ORIG_RAX: 0000000000000002
> RAX: ffffffffffffffda RBX: 00007fff8f43b940 RCX: 00007f0a30cb385d
> RDX: 00000000000001a0 RSI: 0000000000080042 RDI: 0000555919efe460
> RBP: 000000000000000d R08: 000000000000ffc0 R09: 00000000ffffffff
> R10: 0000000000000069 R11: 0000000000000293 R12: 00000000ffffffff
> R13: 0000555919ef1040 R14: 00007fff8f43b900 R15: 0000555919efe280
> ----------------
> Code disassembly (best guess):
> 0: 10 00 adc %al,(%rax)
> 2: 00 ba ff ff ff ff add %bh,-0x1(%rdx)
> 8: 65 0f c1 15 cb 92 43 xadd %edx,%gs:0x7b4392cb(%rip) # 0x7b4392db
> f: 7b
> 10: 83 fa 01 cmp $0x1,%edx
> 13: 75 28 jne 0x3d
> 15: 9c pushfq
> 16: 58 pop %rax
> 17: f6 c4 02 test $0x2,%ah
> 1a: 75 3c jne 0x58
> 1c: 41 f7 c7 00 02 00 00 test $0x200,%r15d
> 23: 74 01 je 0x26
> 25: fb sti
> 26: 48 83 c4 08 add $0x8,%rsp
> * 2a: 44 89 e8 mov %r13d,%eax <-- trapping instruction
> 2d: 5b pop %rbx
> 2e: 5d pop %rbp
> 2f: 41 5c pop %r12
> 31: 41 5d pop %r13
> 33: 41 5e pop %r14
> 35: 41 5f pop %r15
> 37: c3 retq
> 38: 45 31 ed xor %r13d,%r13d
> 3b: eb ba jmp 0xfffffff7
> 3d: 0f 0b ud2
> 3f: 48 rex.W
>
> Best,
> Wei

2022-11-02 06:58:05

by Eric Biggers

[permalink] [raw]
Subject: f2fs_empty_dir() can be extremely slow on malicious disk images

[+f2fs list and maintainers]
[changed subject from "INFO: task hung in fscrypt_ioctl_set_policy"]

On Mon, Oct 31, 2022 at 10:18:02PM +0800, Wei Chen wrote:
> Dear Linux developers,
>
> Here is the link to the reproducers.
>
> C reproducer: https://drive.google.com/file/d/1mduYsYuoOKemH3qkvpDQwnAHAaaLUp0Y/view?usp=share_link
> Syz reproducer:
> https://drive.google.com/file/d/1mu-_w7dy_562vWRlQvTRbcBjG4_G7b2L/view?usp=share_link
>
> The bug persists in the latest commit, v5.15.76 (4f5365f77018). I hope
> it is helpful to you.
>
> [ 1782.137186][ T30] INFO: task a.out:6910 blocked for more than 143 seconds.
> [ 1782.139217][ T30] Not tainted 5.15.76 #5
> [ 1782.140388][ T30] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1782.142524][ T30] task:a.out state:D stack:14296 pid:
> 6910 ppid: 6532 flags:0x00004004
> [ 1782.144799][ T30] Call Trace:
> [ 1782.145623][ T30] <TASK>
> [ 1782.146316][ T30] __schedule+0x3e8/0x1850
> [ 1782.152029][ T30] ? mark_held_locks+0x49/0x70
> [ 1782.153533][ T30] ? mark_held_locks+0x10/0x70
> [ 1782.154759][ T30] ? __down_write_common.part.14+0x31f/0x7b0
> [ 1782.156159][ T30] schedule+0x4e/0xe0
> [ 1782.158314][ T30] __down_write_common.part.14+0x324/0x7b0
> [ 1782.159704][ T30] ? fscrypt_ioctl_set_policy+0xe0/0x200
> [ 1782.161050][ T30] fscrypt_ioctl_set_policy+0xe0/0x200
> [ 1782.162330][ T30] __f2fs_ioctl+0x9d6/0x45e0
> [ 1782.163417][ T30] f2fs_ioctl+0x64/0x240
> [ 1782.164404][ T30] ? __f2fs_ioctl+0x45e0/0x45e0
> [ 1782.165554][ T30] __x64_sys_ioctl+0xb6/0x100
> [ 1782.166662][ T30] do_syscall_64+0x34/0xb0
> [ 1782.169947][ T30] entry_SYSCALL_64_after_hwframe+0x61/0xcb

Well, the quality of this bug report has a lot to be desired (not on upstream
kernel, reproducer is full of totally irrelevant stuff, not sent to the mailing
list of the filesystem whose disk image is being fuzzed, etc.). But what is
going on is that f2fs_empty_dir() doesn't consider the case of a directory with
an extremely large i_size on a malicious disk image.

Specifically, the reproducer mounts an f2fs image with a directory that has an
i_size of 14814520042850357248, then calls FS_IOC_SET_ENCRYPTION_POLICY on it.
That results in a call to f2fs_empty_dir() to check whether the directory is
empty. f2fs_empty_dir() then iterates through all 3616826182336513 blocks the
directory allegedly contains to check whether any contain anything. i_rwsem is
held during this, so anything else that tries to take it will hang.

I'll look into this more if needed, but Jaegeuk and Chao, do you happen to have
any ideas for how f2fs_empty_dir() should be fixed? Is there an easy way to
just iterate through the blocks that are actually allocated?

- Eric

2022-11-02 07:32:25

by Wei Chen

[permalink] [raw]
Subject: Re: f2fs_empty_dir() can be extremely slow on malicious disk images

Dear Eric,

Thank you for the email and advice. The bug persists in upstream Linux
6.0 4fe89d07dc.

[ 222.604894][ T23] INFO: task a.out:6867 blocked for more than 143 seconds.
[ 222.606536][ T23] Not tainted 6.0.0 #35
[ 222.606833][ T23] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 222.607369][ T23] task:a.out state:D stack:25312 pid:
6867 ppid: 6611 flags:0x00004004
[ 222.607948][ T23] Call Trace:
[ 222.608161][ T23] <TASK>
[ 222.608369][ T23] __schedule+0xbc5/0x1180
[ 222.608673][ T23] ? __sched_text_start+0x8/0x8
[ 222.608986][ T23] ? print_irqtrace_events+0x220/0x220
[ 222.609333][ T23] ? _raw_spin_lock_irq+0xba/0xf0
[ 222.609669][ T23] schedule+0xcb/0x190
[ 222.609949][ T23] rwsem_down_write_slowpath+0xf1c/0x1340
[ 222.610326][ T23] ? rwsem_down_read_slowpath+0x920/0x920
[ 222.610689][ T23] ? read_lock_is_recursive+0x10/0x10
[ 222.611031][ T23] ? read_lock_is_recursive+0x10/0x10
[ 222.611379][ T23] ? __lock_acquire+0x6080/0x6080
[ 222.611696][ T23] ? rcu_read_lock_any_held+0xb1/0x130
[ 222.612036][ T23] down_write+0x163/0x170
[ 222.612308][ T23] ? down_read_killable+0x80/0x80
[ 222.612639][ T23] ? __mnt_want_write+0x221/0x2a0
[ 222.612963][ T23] fscrypt_ioctl_set_policy+0x249/0x1500
[ 222.613318][ T23] ? arch_stack_walk+0x98/0xe0
[ 222.613629][ T23] ? fscrypt_policy_from_context+0x600/0x600
[ 222.614017][ T23] ? stack_trace_save+0x104/0x1e0
[ 222.614344][ T23] ? stack_trace_snprint+0xf0/0xf0
[ 222.614681][ T23] ? __stack_depot_save+0x33/0x490
[ 222.656775][ T23] ? kfree+0xda/0x350
[ 222.657043][ T23] ? kasan_set_track+0x4f/0x60
[ 222.657345][ T23] ? kasan_set_track+0x3d/0x60
[ 222.657642][ T23] ? kasan_set_free_info+0x1f/0x40
[ 222.657967][ T23] ? ____kasan_slab_free+0xd8/0x120
[ 222.658295][ T23] ? slab_free_freelist_hook+0x12e/0x1a0
[ 222.658645][ T23] ? tomoyo_path_number_perm+0x5be/0x790
[ 222.658992][ T23] ? security_file_ioctl+0x55/0xb0
[ 222.659326][ T23] ? __se_sys_ioctl+0x48/0x170
[ 222.659629][ T23] ? do_syscall_64+0x3d/0x90
[ 222.659925][ T23] __f2fs_ioctl+0x2ee5/0xf320
[ 222.660232][ T23] ? rcu_read_lock_sched_held+0x87/0x110
[ 222.660597][ T23] ? __bpf_trace_rcu_stall_warning+0x10/0x10
[ 222.660969][ T23] ? lockdep_hardirqs_on_prepare+0x428/0x790
[ 222.661357][ T23] ? do_vfs_ioctl+0xc35/0x29e0
[ 222.661671][ T23] ? __ia32_compat_sys_ioctl+0xb90/0xb90
[ 222.662035][ T23] ? __lock_acquire+0x6080/0x6080
[ 222.662369][ T23] ? f2fs_ioctl+0x1d0/0x1d0
[ 222.662652][ T23] ? slab_free_freelist_hook+0x12e/0x1a0
[ 222.663004][ T23] ? tomoyo_path_number_perm+0x5be/0x790
[ 222.663372][ T23] ? kfree+0xda/0x350
[ 222.663630][ T23] ? tomoyo_path_number_perm+0x627/0x790
[ 222.663992][ T23] ? tomoyo_check_path_acl+0x1c0/0x1c0
[ 222.664348][ T23] ? rcu_read_lock_sched_held+0x87/0x110
[ 222.664706][ T23] ? read_lock_is_recursive+0x10/0x10
[ 222.669813][ T23] ? f2fs_ioctl+0x130/0x1d0
[ 222.670117][ T23] ? f2fs_precache_extents+0x3c0/0x3c0
[ 222.670474][ T23] __se_sys_ioctl+0xfb/0x170
[ 222.670776][ T23] do_syscall_64+0x3d/0x90
[ 222.671063][ T23] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 222.671441][ T23] RIP: 0033:0x7fb0400e4469
[ 222.671736][ T23] RSP: 002b:00007fb040b45ef8 EFLAGS: 00000246
ORIG_RAX: 0000000000000010
[ 222.672288][ T23] RAX: ffffffffffffffda RBX: 0000000000000000
RCX: 00007fb0400e4469
[ 222.672786][ T23] RDX: 0000000020002900 RSI: 00000000800c6613
RDI: 0000000000000003
[ 222.673293][ T23] RBP: 00007fb040b45f20 R08: 0000000000000000
R09: 0000000000000000
[ 222.673798][ T23] R10: 0000000000000000 R11: 0000000000000246
R12: 00007ffc3dcf28be
[ 222.674303][ T23] R13: 00007ffc3dcf28bf R14: 00007fb040b26000
R15: 0000000000000003
[ 222.684859][ T23] </TASK>
[ 222.685083][ T23]
[ 222.685083][ T23] Showing all locks held in the system:
[ 222.685564][ T23] 1 lock held by rcu_tasks_kthre/13:
[ 222.685897][ T23] #0: ffffffff8cf22530
(rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at:
rcu_tasks_one_gp+0x30/0xbd0
[ 222.686559][ T23] 1 lock held by rcu_tasks_trace/14:
[ 222.686886][ T23] #0: ffffffff8cf22d30
(rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at:
rcu_tasks_one_gp+0x30/0xbd0
[ 222.687596][ T23] 1 lock held by khungtaskd/23:
[ 222.687895][ T23] #0: ffffffff8cf22360
(rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x0/0x30
[ 222.688519][ T23] 1 lock held by in:imklog/6313:
[ 222.688829][ T23] #0: ffff888021796368
(&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x23a/0x2d0
[ 222.689420][ T23] 3 locks held by a.out/6858:
[ 222.689716][ T23] 2 locks held by a.out/6867:
[ 222.697525][ T23] #0: ffff88802484c460
(sb_writers#14){.+.+}-{0:0}, at: mnt_want_write_file+0x5a/0x1f0
[ 222.698147][ T23] #1: ffff88802e539300
(&type->i_mutex_dir_key#8){+.+.}-{3:3}, at:
fscrypt_ioctl_set_policy+0x249/0x1500
[ 222.698876][ T23]
[ 222.699030][ T23] =============================================
[ 222.699030][ T23]
[ 222.699551][ T23] NMI backtrace for cpu 0
[ 222.699816][ T23] CPU: 0 PID: 23 Comm: khungtaskd Not tainted 6.0.0 #35
[ 222.700232][ T23] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 222.700852][ T23] Call Trace:
[ 222.701056][ T23] <TASK>
[ 222.701254][ T23] dump_stack_lvl+0x1b1/0x28e
[ 222.701550][ T23] ? io_alloc_page_table+0xfe/0xfe
[ 222.701876][ T23] ? panic+0x81b/0x81b
[ 222.702158][ T23] ? console_unlock+0x6bd/0x6f0
[ 222.702482][ T23] nmi_cpu_backtrace+0x40e/0x440
[ 222.702806][ T23] ? vprintk_emit+0x109/0x1e0
[ 222.703109][ T23] ? nmi_trigger_cpumask_backtrace+0x280/0x280
[ 222.703492][ T23] ? _printk+0xc0/0x100
[ 222.703759][ T23] ? panic+0x81b/0x81b
[ 222.704020][ T23] ? __wake_up_klogd+0xcc/0x100
[ 222.704353][ T23] ? panic+0x81b/0x81b
[ 222.704615][ T23] ? arch_trigger_cpumask_backtrace+0x10/0x10
[ 222.704998][ T23] nmi_trigger_cpumask_backtrace+0x16a/0x280
[ 222.705381][ T23] watchdog+0xcd5/0xd20
[ 222.705679][ T23] kthread+0x266/0x300
[ 222.705942][ T23] ? hungtask_pm_notify+0x50/0x50
[ 222.706256][ T23] ? kthread_blkcg+0xd0/0xd0
[ 222.706560][ T23] ret_from_fork+0x1f/0x30
[ 222.706857][ T23] </TASK>
[ 222.724848][ T23] Kernel panic - not syncing: hung_task: blocked tasks
[ 222.725269][ T23] CPU: 0 PID: 23 Comm: khungtaskd Not tainted 6.0.0 #35
[ 222.725691][ T23] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 222.726291][ T23] Call Trace:
[ 222.726500][ T23] <TASK>
[ 222.726684][ T23] dump_stack_lvl+0x1b1/0x28e
[ 222.726980][ T23] ? io_alloc_page_table+0xfe/0xfe
[ 222.727297][ T23] ? panic+0x81b/0x81b
[ 222.727558][ T23] panic+0x2c2/0x81b
[ 222.727803][ T23] ? schedule_preempt_disabled+0x20/0x20
[ 222.728151][ T23] ? nmi_trigger_cpumask_backtrace+0x206/0x280
[ 222.728532][ T23] ? asan.module_ctor+0x4/0x4
[ 222.728827][ T23] ? tick_nohz_tick_stopped+0x76/0xb0
[ 222.729158][ T23] ? arch_trigger_cpumask_backtrace+0x10/0x10
[ 222.729535][ T23] ? nmi_trigger_cpumask_backtrace+0x206/0x280
[ 222.729918][ T23] ? nmi_trigger_cpumask_backtrace+0x266/0x280
[ 222.730322][ T23] watchdog+0xd16/0xd20
[ 222.730591][ T23] kthread+0x266/0x300
[ 222.730862][ T23] ? hungtask_pm_notify+0x50/0x50
[ 222.731171][ T23] ? kthread_blkcg+0xd0/0xd0
[ 222.731458][ T23] ret_from_fork+0x1f/0x30
[ 222.731742][ T23] </TASK>
[ 222.732155][ T23] Kernel Offset: disabled
[ 222.732439][ T23] Rebooting in 86400 seconds..

Best,
Wei

On Wed, 2 Nov 2022 at 14:18, Eric Biggers <[email protected]> wrote:
>
> [+f2fs list and maintainers]
> [changed subject from "INFO: task hung in fscrypt_ioctl_set_policy"]
>
> On Mon, Oct 31, 2022 at 10:18:02PM +0800, Wei Chen wrote:
> > Dear Linux developers,
> >
> > Here is the link to the reproducers.
> >
> > C reproducer: https://drive.google.com/file/d/1mduYsYuoOKemH3qkvpDQwnAHAaaLUp0Y/view?usp=share_link
> > Syz reproducer:
> > https://drive.google.com/file/d/1mu-_w7dy_562vWRlQvTRbcBjG4_G7b2L/view?usp=share_link
> >
> > The bug persists in the latest commit, v5.15.76 (4f5365f77018). I hope
> > it is helpful to you.
> >
> > [ 1782.137186][ T30] INFO: task a.out:6910 blocked for more than 143 seconds.
> > [ 1782.139217][ T30] Not tainted 5.15.76 #5
> > [ 1782.140388][ T30] "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 1782.142524][ T30] task:a.out state:D stack:14296 pid:
> > 6910 ppid: 6532 flags:0x00004004
> > [ 1782.144799][ T30] Call Trace:
> > [ 1782.145623][ T30] <TASK>
> > [ 1782.146316][ T30] __schedule+0x3e8/0x1850
> > [ 1782.152029][ T30] ? mark_held_locks+0x49/0x70
> > [ 1782.153533][ T30] ? mark_held_locks+0x10/0x70
> > [ 1782.154759][ T30] ? __down_write_common.part.14+0x31f/0x7b0
> > [ 1782.156159][ T30] schedule+0x4e/0xe0
> > [ 1782.158314][ T30] __down_write_common.part.14+0x324/0x7b0
> > [ 1782.159704][ T30] ? fscrypt_ioctl_set_policy+0xe0/0x200
> > [ 1782.161050][ T30] fscrypt_ioctl_set_policy+0xe0/0x200
> > [ 1782.162330][ T30] __f2fs_ioctl+0x9d6/0x45e0
> > [ 1782.163417][ T30] f2fs_ioctl+0x64/0x240
> > [ 1782.164404][ T30] ? __f2fs_ioctl+0x45e0/0x45e0
> > [ 1782.165554][ T30] __x64_sys_ioctl+0xb6/0x100
> > [ 1782.166662][ T30] do_syscall_64+0x34/0xb0
> > [ 1782.169947][ T30] entry_SYSCALL_64_after_hwframe+0x61/0xcb
>
> Well, the quality of this bug report has a lot to be desired (not on upstream
> kernel, reproducer is full of totally irrelevant stuff, not sent to the mailing
> list of the filesystem whose disk image is being fuzzed, etc.). But what is
> going on is that f2fs_empty_dir() doesn't consider the case of a directory with
> an extremely large i_size on a malicious disk image.
>
> Specifically, the reproducer mounts an f2fs image with a directory that has an
> i_size of 14814520042850357248, then calls FS_IOC_SET_ENCRYPTION_POLICY on it.
> That results in a call to f2fs_empty_dir() to check whether the directory is
> empty. f2fs_empty_dir() then iterates through all 3616826182336513 blocks the
> directory allegedly contains to check whether any contain anything. i_rwsem is
> held during this, so anything else that tries to take it will hang.
>
> I'll look into this more if needed, but Jaegeuk and Chao, do you happen to have
> any ideas for how f2fs_empty_dir() should be fixed? Is there an easy way to
> just iterate through the blocks that are actually allocated?
>
> - Eric

2022-11-02 15:29:22

by Chao Yu

[permalink] [raw]
Subject: Re: f2fs_empty_dir() can be extremely slow on malicious disk images

On 2022/11/2 14:17, Eric Biggers wrote:
> [+f2fs list and maintainers]

Thanks for the forwarding.

> [changed subject from "INFO: task hung in fscrypt_ioctl_set_policy"]
>
> On Mon, Oct 31, 2022 at 10:18:02PM +0800, Wei Chen wrote:
>> Dear Linux developers,
>>
>> Here is the link to the reproducers.
>>
>> C reproducer: https://drive.google.com/file/d/1mduYsYuoOKemH3qkvpDQwnAHAaaLUp0Y/view?usp=share_link
>> Syz reproducer:
>> https://drive.google.com/file/d/1mu-_w7dy_562vWRlQvTRbcBjG4_G7b2L/view?usp=share_link
>>
>> The bug persists in the latest commit, v5.15.76 (4f5365f77018). I hope
>> it is helpful to you.
>>
>> [ 1782.137186][ T30] INFO: task a.out:6910 blocked for more than 143 seconds.
>> [ 1782.139217][ T30] Not tainted 5.15.76 #5
>> [ 1782.140388][ T30] "echo 0 >
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 1782.142524][ T30] task:a.out state:D stack:14296 pid:
>> 6910 ppid: 6532 flags:0x00004004
>> [ 1782.144799][ T30] Call Trace:
>> [ 1782.145623][ T30] <TASK>
>> [ 1782.146316][ T30] __schedule+0x3e8/0x1850
>> [ 1782.152029][ T30] ? mark_held_locks+0x49/0x70
>> [ 1782.153533][ T30] ? mark_held_locks+0x10/0x70
>> [ 1782.154759][ T30] ? __down_write_common.part.14+0x31f/0x7b0
>> [ 1782.156159][ T30] schedule+0x4e/0xe0
>> [ 1782.158314][ T30] __down_write_common.part.14+0x324/0x7b0
>> [ 1782.159704][ T30] ? fscrypt_ioctl_set_policy+0xe0/0x200
>> [ 1782.161050][ T30] fscrypt_ioctl_set_policy+0xe0/0x200
>> [ 1782.162330][ T30] __f2fs_ioctl+0x9d6/0x45e0
>> [ 1782.163417][ T30] f2fs_ioctl+0x64/0x240
>> [ 1782.164404][ T30] ? __f2fs_ioctl+0x45e0/0x45e0
>> [ 1782.165554][ T30] __x64_sys_ioctl+0xb6/0x100
>> [ 1782.166662][ T30] do_syscall_64+0x34/0xb0
>> [ 1782.169947][ T30] entry_SYSCALL_64_after_hwframe+0x61/0xcb
>
> Well, the quality of this bug report has a lot to be desired (not on upstream
> kernel, reproducer is full of totally irrelevant stuff, not sent to the mailing
> list of the filesystem whose disk image is being fuzzed, etc.). But what is
> going on is that f2fs_empty_dir() doesn't consider the case of a directory with
> an extremely large i_size on a malicious disk image.
>
> Specifically, the reproducer mounts an f2fs image with a directory that has an
> i_size of 14814520042850357248, then calls FS_IOC_SET_ENCRYPTION_POLICY on it.
> That results in a call to f2fs_empty_dir() to check whether the directory is
> empty. f2fs_empty_dir() then iterates through all 3616826182336513 blocks the
> directory allegedly contains to check whether any contain anything. i_rwsem is
> held during this, so anything else that tries to take it will hang.
>
> I'll look into this more if needed, but Jaegeuk and Chao, do you happen to have
> any ideas for how f2fs_empty_dir() should be fixed? Is there an easy way to
> just iterate through the blocks that are actually allocated?

I send this just for requesting comments, no test now.

Thoughts?

From 38ea5f172c47853536a9f70857e4438a69d16f39 Mon Sep 17 00:00:00 2001
From: Chao Yu <[email protected]>
Date: Wed, 2 Nov 2022 12:02:08 +0800
Subject: [RFC PATCH] f2fs: speed up f2fs_empty_dir()

Signed-off-by: Chao Yu <[email protected]>
---
fs/f2fs/dir.c | 48 ++++++++++++++++++++++++++----------------------
1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index 21960a899b6a..45f52b34ed1f 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -956,38 +956,42 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, struct page *page,

bool f2fs_empty_dir(struct inode *dir)
{
- unsigned long bidx;
struct page *dentry_page;
unsigned int bit_pos;
struct f2fs_dentry_block *dentry_blk;
- unsigned long nblock = dir_blocks(dir);
+ struct f2fs_map_blocks map;
+ int ret;

if (f2fs_has_inline_dentry(dir))
return f2fs_empty_inline_dir(dir);

- for (bidx = 0; bidx < nblock; bidx++) {
- dentry_page = f2fs_get_lock_data_page(dir, bidx, false);
- if (IS_ERR(dentry_page)) {
- if (PTR_ERR(dentry_page) == -ENOENT)
- continue;
- else
- return false;
- }
+ dentry_page = f2fs_get_lock_data_page(dir, 0, false);
+ if (IS_ERR(dentry_page)) {
+ if (PTR_ERR(dentry_page) == -ENOENT)
+ return true;
+ return false;
+ }

- dentry_blk = page_address(dentry_page);
- if (bidx == 0)
- bit_pos = 2;
- else
- bit_pos = 0;
- bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
- NR_DENTRY_IN_BLOCK,
- bit_pos);
+ dentry_blk = page_address(dentry_page);
+ bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
+ NR_DENTRY_IN_BLOCK, 2);
+ f2fs_put_page(dentry_page, 1);
+
+ if (bit_pos < NR_DENTRY_IN_BLOCK)
+ return false;

- f2fs_put_page(dentry_page, 1);
+ memset(&map, 0, sizeof(map));
+ map.m_lblk = 1;
+ map.m_len = dir_blocks(dir) - 1;
+ map.m_seg_type = NO_CHECK_TYPE;
+
+ ret = f2fs_map_blocks(dir, &map, 0, F2FS_GET_BLOCK_FIEMAP);
+ if (ret)
+ return false;
+
+ if (map.m_flags & F2FS_MAP_FLAGS)
+ return false;

- if (bit_pos < NR_DENTRY_IN_BLOCK)
- return false;
- }
return true;
}

--
2.36.1



>
> - Eric

2022-11-03 01:07:55

by Chao Yu

[permalink] [raw]
Subject: Re: [f2fs-dev] f2fs_empty_dir() can be extremely slow on malicious disk images

On 2022/11/2 23:12, Chao Yu wrote:
> On 2022/11/2 14:17, Eric Biggers wrote:
>> [+f2fs list and maintainers]
>
> Thanks for the forwarding.
>
>> [changed subject from "INFO: task hung in fscrypt_ioctl_set_policy"]
>>
>> On Mon, Oct 31, 2022 at 10:18:02PM +0800, Wei Chen wrote:
>>> Dear Linux developers,
>>>
>>> Here is the link to the reproducers.
>>>
>>> C reproducer: https://drive.google.com/file/d/1mduYsYuoOKemH3qkvpDQwnAHAaaLUp0Y/view?usp=share_link
>>> Syz reproducer:
>>> https://drive.google.com/file/d/1mu-_w7dy_562vWRlQvTRbcBjG4_G7b2L/view?usp=share_link
>>>
>>> The bug persists in the latest commit, v5.15.76 (4f5365f77018). I hope
>>> it is helpful to you.
>>>
>>> [ 1782.137186][ T30] INFO: task a.out:6910 blocked for more than 143 seconds.
>>> [ 1782.139217][ T30] Not tainted 5.15.76 #5
>>> [ 1782.140388][ T30] "echo 0 >
>>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> [ 1782.142524][ T30] task:a.out state:D stack:14296 pid:
>>> 6910 ppid: 6532 flags:0x00004004
>>> [ 1782.144799][ T30] Call Trace:
>>> [ 1782.145623][ T30] <TASK>
>>> [ 1782.146316][ T30] __schedule+0x3e8/0x1850
>>> [ 1782.152029][ T30] ? mark_held_locks+0x49/0x70
>>> [ 1782.153533][ T30] ? mark_held_locks+0x10/0x70
>>> [ 1782.154759][ T30] ? __down_write_common.part.14+0x31f/0x7b0
>>> [ 1782.156159][ T30] schedule+0x4e/0xe0
>>> [ 1782.158314][ T30] __down_write_common.part.14+0x324/0x7b0
>>> [ 1782.159704][ T30] ? fscrypt_ioctl_set_policy+0xe0/0x200
>>> [ 1782.161050][ T30] fscrypt_ioctl_set_policy+0xe0/0x200
>>> [ 1782.162330][ T30] __f2fs_ioctl+0x9d6/0x45e0
>>> [ 1782.163417][ T30] f2fs_ioctl+0x64/0x240
>>> [ 1782.164404][ T30] ? __f2fs_ioctl+0x45e0/0x45e0
>>> [ 1782.165554][ T30] __x64_sys_ioctl+0xb6/0x100
>>> [ 1782.166662][ T30] do_syscall_64+0x34/0xb0
>>> [ 1782.169947][ T30] entry_SYSCALL_64_after_hwframe+0x61/0xcb
>>
>> Well, the quality of this bug report has a lot to be desired (not on upstream
>> kernel, reproducer is full of totally irrelevant stuff, not sent to the mailing
>> list of the filesystem whose disk image is being fuzzed, etc.). But what is
>> going on is that f2fs_empty_dir() doesn't consider the case of a directory with
>> an extremely large i_size on a malicious disk image.
>>
>> Specifically, the reproducer mounts an f2fs image with a directory that has an
>> i_size of 14814520042850357248, then calls FS_IOC_SET_ENCRYPTION_POLICY on it.
>> That results in a call to f2fs_empty_dir() to check whether the directory is
>> empty. f2fs_empty_dir() then iterates through all 3616826182336513 blocks the
>> directory allegedly contains to check whether any contain anything. i_rwsem is
>> held during this, so anything else that tries to take it will hang.
>>
>> I'll look into this more if needed, but Jaegeuk and Chao, do you happen to have
>> any ideas for how f2fs_empty_dir() should be fixed? Is there an easy way to
>> just iterate through the blocks that are actually allocated?
>

Sorry, I mean:

From 07f662ca6bd2a0991961ea42932ce90f19e74624 Mon Sep 17 00:00:00 2001
From: Chao Yu <[email protected]>
Date: Wed, 2 Nov 2022 12:02:08 +0800
Subject: [RFC v2] f2fs: speed up f2fs_empty_dir()

Signed-off-by: Chao Yu <[email protected]>
---
fs/f2fs/dir.c | 48 ++++++++++++++++++++++++++++--------------------
1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index 21960a899b6a..569f7304e3e6 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -956,38 +956,46 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, struct page *page,

bool f2fs_empty_dir(struct inode *dir)
{
- unsigned long bidx;
struct page *dentry_page;
unsigned int bit_pos;
struct f2fs_dentry_block *dentry_blk;
- unsigned long nblock = dir_blocks(dir);
+ pgoff_t index;

if (f2fs_has_inline_dentry(dir))
return f2fs_empty_inline_dir(dir);

- for (bidx = 0; bidx < nblock; bidx++) {
- dentry_page = f2fs_get_lock_data_page(dir, bidx, false);
- if (IS_ERR(dentry_page)) {
- if (PTR_ERR(dentry_page) == -ENOENT)
- continue;
- else
- return false;
- }
+ dentry_page = f2fs_get_lock_data_page(dir, 0, false);
+ if (IS_ERR(dentry_page)) {
+ if (PTR_ERR(dentry_page) == -ENOENT)
+ return true;
+ return false;
+ }

- dentry_blk = page_address(dentry_page);
- if (bidx == 0)
- bit_pos = 2;
- else
- bit_pos = 0;
- bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
- NR_DENTRY_IN_BLOCK,
- bit_pos);
+ dentry_blk = page_address(dentry_page);
+ bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
+ NR_DENTRY_IN_BLOCK, 2);
+ f2fs_put_page(dentry_page, 1);
+
+ if (bit_pos < NR_DENTRY_IN_BLOCK)
+ return false;

- f2fs_put_page(dentry_page, 1);
+ for (index = 1; index < dir_blocks(dir);) {
+ struct dnode_of_data dn;
+ int err;

- if (bit_pos < NR_DENTRY_IN_BLOCK)
+ set_new_dnode(&dn, dir, NULL, NULL, 0);
+ err = f2fs_get_dnode_of_data(&dn, index, LOOKUP_NODE);
+ if (err && err != -ENOENT) {
return false;
+ } else if (err == -ENOENT) {
+ index = f2fs_get_next_page_offset(&dn, index);
+ continue;
+ }
+ f2fs_put_dnode(&dn);
+
+ return false;
}
+
return true;
}

--
2.36.1