2020-07-15 20:00:32

by syzbot

[permalink] [raw]
Subject: possible deadlock in uprobe_clear_state (2)

Hello,

syzbot found the following issue on:

HEAD commit: a581387e Merge tag 'io_uring-5.8-2020-07-10' of git://git...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=111a004f100000
kernel config: https://syzkaller.appspot.com/x/.config?x=66ad203c2bb6d8b
dashboard link: https://syzkaller.appspot.com/bug?extid=e5344baa319c9a96edec
compiler: gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

======================================================
WARNING: possible circular locking dependency detected
5.8.0-rc4-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor.1/4807 is trying to acquire lock:
ffffffff89c3a988 (delayed_uprobe_lock){+.+.}-{3:3}, at: uprobe_clear_state+0x47/0x3b0 kernel/events/uprobes.c:1550

but task is already holding lock:
ffffffff89c6c360 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4202 [inline]
ffffffff89c6c360 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4198 [inline]
ffffffff89c6c360 (fs_reclaim){+.+.}-{0:0}, at: __perform_reclaim mm/page_alloc.c:4227 [inline]
ffffffff89c6c360 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
ffffffff89c6c360 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x1554/0x2780 mm/page_alloc.c:4650

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire mm/page_alloc.c:4183 [inline]
fs_reclaim_acquire+0x2f/0x40 mm/page_alloc.c:4194
slab_pre_alloc_hook mm/slab.h:564 [inline]
slab_alloc mm/slab.c:3306 [inline]
kmem_cache_alloc_trace+0x29/0x2d0 mm/slab.c:3549
kmalloc include/linux/slab.h:555 [inline]
kzalloc include/linux/slab.h:669 [inline]
delayed_uprobe_add kernel/events/uprobes.c:304 [inline]
update_ref_ctr+0x4b6/0x700 kernel/events/uprobes.c:438
uprobe_write_opcode+0xe07/0x1650 kernel/events/uprobes.c:497
install_breakpoint kernel/events/uprobes.c:915 [inline]
install_breakpoint.isra.0+0x5a5/0x7c0 kernel/events/uprobes.c:897
uprobe_mmap+0x5d7/0x1050 kernel/events/uprobes.c:1394
mmap_region+0x5cf/0x1590 mm/mmap.c:1818
do_mmap+0xca8/0x1170 mm/mmap.c:1545
do_mmap_pgoff include/linux/mm.h:2596 [inline]
vm_mmap_pgoff+0x197/0x200 mm/util.c:506
ksys_mmap_pgoff+0x455/0x5a0 mm/mmap.c:1595
do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (delayed_uprobe_lock){+.+.}-{3:3}:
check_prev_add kernel/locking/lockdep.c:2496 [inline]
check_prevs_add kernel/locking/lockdep.c:2601 [inline]
validate_chain kernel/locking/lockdep.c:3218 [inline]
__lock_acquire+0x2acb/0x56e0 kernel/locking/lockdep.c:4380
lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:4959
__mutex_lock_common kernel/locking/mutex.c:956 [inline]
__mutex_lock+0x134/0x10d0 kernel/locking/mutex.c:1103
uprobe_clear_state+0x47/0x3b0 kernel/events/uprobes.c:1550
__mmput+0x73/0x470 kernel/fork.c:1089
mmput+0x53/0x60 kernel/fork.c:1114
binder_alloc_free_page+0x441/0xf90 drivers/android/binder_alloc.c:950
__list_lru_walk_one+0x178/0x5c0 mm/list_lru.c:222
list_lru_walk_one mm/list_lru.c:266 [inline]
list_lru_walk_node+0x67/0x2a0 mm/list_lru.c:295
list_lru_walk include/linux/list_lru.h:215 [inline]
binder_shrink_scan+0x123/0x190 drivers/android/binder_alloc.c:984
do_shrink_slab+0x3c6/0xab0 mm/vmscan.c:518
shrink_slab+0x16f/0x5c0 mm/vmscan.c:679
shrink_node_memcgs mm/vmscan.c:2658 [inline]
shrink_node+0x519/0x1b60 mm/vmscan.c:2770
shrink_zones mm/vmscan.c:2973 [inline]
do_try_to_free_pages+0x38b/0x1340 mm/vmscan.c:3026
try_to_free_pages+0x29a/0x8b0 mm/vmscan.c:3265
__perform_reclaim mm/page_alloc.c:4223 [inline]
__alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
__alloc_pages_slowpath.constprop.0+0x949/0x2780 mm/page_alloc.c:4650
__alloc_pages_nodemask+0x68f/0x930 mm/page_alloc.c:4863
alloc_pages_current+0x187/0x280 mm/mempolicy.c:2292
alloc_pages include/linux/gfp.h:545 [inline]
alloc_mmu_pages+0x7f/0x170 arch/x86/kvm/mmu/mmu.c:5671
kvm_mmu_create+0x3cb/0x560 arch/x86/kvm/mmu/mmu.c:5704
kvm_arch_vcpu_create+0x16d/0xb70 arch/x86/kvm/x86.c:9433
kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:3060 [inline]
kvm_vm_ioctl+0x1547/0x23c0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3621
vfs_ioctl fs/ioctl.c:48 [inline]
ksys_ioctl+0x11a/0x180 fs/ioctl.c:753
__do_sys_ioctl fs/ioctl.c:762 [inline]
__se_sys_ioctl fs/ioctl.c:760 [inline]
__x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:760
do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
entry_SYSCALL_64_after_hwframe+0x44/0xa9

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(delayed_uprobe_lock);
lock(fs_reclaim);
lock(delayed_uprobe_lock);

*** DEADLOCK ***

3 locks held by syz-executor.1/4807:
#0: ffffffff89c6c360 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4202 [inline]
#0: ffffffff89c6c360 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_release mm/page_alloc.c:4198 [inline]
#0: ffffffff89c6c360 (fs_reclaim){+.+.}-{0:0}, at: __perform_reclaim mm/page_alloc.c:4227 [inline]
#0: ffffffff89c6c360 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
#0: ffffffff89c6c360 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x1554/0x2780 mm/page_alloc.c:4650
#1: ffffffff89c46b90 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0xc7/0x5c0 mm/vmscan.c:669
#2: ffff88808ea481e0 (&alloc->mutex){+.+.}-{3:3}, at: binder_alloc_free_page+0x4f/0xf90 drivers/android/binder_alloc.c:923

stack backtrace:
CPU: 0 PID: 4807 Comm: syz-executor.1 Not tainted 5.8.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x18f/0x20d lib/dump_stack.c:118
check_noncircular+0x324/0x3e0 kernel/locking/lockdep.c:1827
check_prev_add kernel/locking/lockdep.c:2496 [inline]
check_prevs_add kernel/locking/lockdep.c:2601 [inline]
validate_chain kernel/locking/lockdep.c:3218 [inline]
__lock_acquire+0x2acb/0x56e0 kernel/locking/lockdep.c:4380
lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:4959
__mutex_lock_common kernel/locking/mutex.c:956 [inline]
__mutex_lock+0x134/0x10d0 kernel/locking/mutex.c:1103
uprobe_clear_state+0x47/0x3b0 kernel/events/uprobes.c:1550
__mmput+0x73/0x470 kernel/fork.c:1089
mmput+0x53/0x60 kernel/fork.c:1114
binder_alloc_free_page+0x441/0xf90 drivers/android/binder_alloc.c:950
__list_lru_walk_one+0x178/0x5c0 mm/list_lru.c:222
list_lru_walk_one mm/list_lru.c:266 [inline]
list_lru_walk_node+0x67/0x2a0 mm/list_lru.c:295
list_lru_walk include/linux/list_lru.h:215 [inline]
binder_shrink_scan+0x123/0x190 drivers/android/binder_alloc.c:984
do_shrink_slab+0x3c6/0xab0 mm/vmscan.c:518
shrink_slab+0x16f/0x5c0 mm/vmscan.c:679
shrink_node_memcgs mm/vmscan.c:2658 [inline]
shrink_node+0x519/0x1b60 mm/vmscan.c:2770
shrink_zones mm/vmscan.c:2973 [inline]
do_try_to_free_pages+0x38b/0x1340 mm/vmscan.c:3026
try_to_free_pages+0x29a/0x8b0 mm/vmscan.c:3265
__perform_reclaim mm/page_alloc.c:4223 [inline]
__alloc_pages_direct_reclaim mm/page_alloc.c:4244 [inline]
__alloc_pages_slowpath.constprop.0+0x949/0x2780 mm/page_alloc.c:4650
__alloc_pages_nodemask+0x68f/0x930 mm/page_alloc.c:4863
alloc_pages_current+0x187/0x280 mm/mempolicy.c:2292
alloc_pages include/linux/gfp.h:545 [inline]
alloc_mmu_pages+0x7f/0x170 arch/x86/kvm/mmu/mmu.c:5671
kvm_mmu_create+0x3cb/0x560 arch/x86/kvm/mmu/mmu.c:5704
kvm_arch_vcpu_create+0x16d/0xb70 arch/x86/kvm/x86.c:9433
kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:3060 [inline]
kvm_vm_ioctl+0x1547/0x23c0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3621
vfs_ioctl fs/ioctl.c:48 [inline]
ksys_ioctl+0x11a/0x180 fs/ioctl.c:753
__do_sys_ioctl fs/ioctl.c:762 [inline]
__se_sys_ioctl fs/ioctl.c:760 [inline]
__x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:760
do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45cba9
Code: Bad RIP value.
RSP: 002b:00007f45444f8c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000004e85c0 RCX: 000000000045cba9
RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 0000000000000004
RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000000003a3 R14: 00000000004c652f R15: 00007f45444f96d4


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


2020-07-15 23:40:02

by Tetsuo Handa

[permalink] [raw]
Subject: [PATCH] binder: Don't use mmput() from shrinker function.

syzbot is reporting that mmput() from shrinker function has a risk of
deadlock [1]. Don't start synchronous teardown of mm when called from
shrinker function.

[1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45

Reported-by: syzbot <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
---
drivers/android/binder_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 42c672f1584e..cbe6aa77d50d 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
trace_binder_unmap_user_end(alloc, index);
}
mmap_read_unlock(mm);
- mmput(mm);
+ mmput_async(mm);

trace_binder_unmap_kernel_start(alloc, index);

--
2.18.4


2020-07-16 08:35:47

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] binder: Don't use mmput() from shrinker function.

On Thu 16-07-20 08:36:52, Tetsuo Handa wrote:
> syzbot is reporting that mmput() from shrinker function has a risk of
> deadlock [1]. Don't start synchronous teardown of mm when called from
> shrinker function.

Please add the actual lock dependency to the changelog.

Anyway is this deadlock real? Mayve I have missed some details but the
call graph points to these two paths.
uprobe_mmap do_shrink_slab
uprobes_mmap_hash #lock
install_breakpoint binder_shrink_scan
set_swbp binder_alloc_free_page
uprobe_write_opcode __mmput
update_ref_ctr uprobe_clear_state
mutex_lock(&delayed_uprobe_lock) mutex_lock(&delayed_uprobe_lock);
allocation -> reclaim

But in order for this to happen the shrinker would have to do the last
put on the mm. But mm cannot go away from under uprobe_mmap so those two
paths cannot race with each other.

Unless I am missing something this is a false positive. I do not mind
using mmput_async from the shrinker as a workaround but the changelog
should be explicit about the fact.

> [1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45
>
> Reported-by: syzbot <[email protected]>
> Reported-by: syzbot <[email protected]>
> Signed-off-by: Tetsuo Handa <[email protected]>
> ---
> drivers/android/binder_alloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index 42c672f1584e..cbe6aa77d50d 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
> trace_binder_unmap_user_end(alloc, index);
> }
> mmap_read_unlock(mm);
> - mmput(mm);
> + mmput_async(mm);
>
> trace_binder_unmap_kernel_start(alloc, index);
>
> --
> 2.18.4
>

--
Michal Hocko
SUSE Labs

2020-07-16 13:42:12

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH] binder: Don't use mmput() from shrinker function.

On 2020/07/16 17:35, Michal Hocko wrote:
> On Thu 16-07-20 08:36:52, Tetsuo Handa wrote:
>> syzbot is reporting that mmput() from shrinker function has a risk of
>> deadlock [1]. Don't start synchronous teardown of mm when called from
>> shrinker function.
>
> Please add the actual lock dependency to the changelog.
>
> Anyway is this deadlock real? Mayve I have missed some details but the
> call graph points to these two paths.
> uprobe_mmap do_shrink_slab
> uprobes_mmap_hash #lock
> install_breakpoint binder_shrink_scan
> set_swbp binder_alloc_free_page
> uprobe_write_opcode __mmput
> update_ref_ctr uprobe_clear_state
> mutex_lock(&delayed_uprobe_lock) mutex_lock(&delayed_uprobe_lock);
> allocation -> reclaim
>

static int update_ref_ctr(struct uprobe *uprobe, struct mm_struct *mm, short d) {
mutex_lock(&delayed_uprobe_lock);
ret = delayed_uprobe_add(uprobe, mm1) {
du = kzalloc(sizeof(*du), GFP_KERNEL) {
do_shrink_slab() {
binder_shrink_scan() {
binder_alloc_free_page() {
mmget_not_zero(mm2);
mmput(mm2) {
__mmput(mm2) {
uprobe_clear_state(mm2) {
mutex_lock(&delayed_uprobe_lock);
delayed_uprobe_remove(NULL, mm2);
mutex_unlock(&delayed_uprobe_lock);
}
}
}
}
}
}
}
}
mutex_unlock(&delayed_uprobe_lock);
}

> But in order for this to happen the shrinker would have to do the last
> put on the mm. But mm cannot go away from under uprobe_mmap so those two
> paths cannot race with each other.

and mm1 != mm2 is possible, isn't it?

>
> Unless I am missing something this is a false positive. I do not mind
> using mmput_async from the shrinker as a workaround but the changelog
> should be explicit about the fact.
>

binder_alloc_free_page() is already using mmput_async() 14 lines later.
It just took 18 months to hit this race for the third time, for it is
quite difficult to let the owner of mm2 to call mmput(mm2) between
binder_alloc_free_page() calls mmget_not_zero(mm2) and mmput(mm2).

The reason I added you is to see whether we can do

void mmput(struct mm_struct *mm)
{
might_sleep();
+ /* Calling mmput() from shrinker context can deadlock. */
+ WARN_ON(current->flags & PF_MEMALLOC);

if (atomic_dec_and_test(&mm->mm_users))
__mmput(mm);
}

in order to catch this bug easier.

2020-07-16 13:55:56

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] binder: Don't use mmput() from shrinker function.

On Thu 16-07-20 22:41:14, Tetsuo Handa wrote:
> On 2020/07/16 17:35, Michal Hocko wrote:
[...]
> > But in order for this to happen the shrinker would have to do the last
> > put on the mm. But mm cannot go away from under uprobe_mmap so those two
> > paths cannot race with each other.
>
> and mm1 != mm2 is possible, isn't it?

OK, I have missed that information. You are right. Can you make this
into the changelog please?
--
Michal Hocko
SUSE Labs

2020-07-16 15:13:10

by Tetsuo Handa

[permalink] [raw]
Subject: [PATCH v2] binder: Don't use mmput() from shrinker function.

syzbot is reporting that mmput() from shrinker function has a risk of
deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.

Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
callback") replaced mmput() with mmput_async() in order to avoid sleeping
with spinlock held. But this patch replaces mmput() with mmput_async() in
order not to start __mmput() from shrinker context.

[1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45

Reported-by: syzbot <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
---
drivers/android/binder_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 42c672f1584e..cbe6aa77d50d 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
trace_binder_unmap_user_end(alloc, index);
}
mmap_read_unlock(mm);
- mmput(mm);
+ mmput_async(mm);

trace_binder_unmap_kernel_start(alloc, index);

--
2.18.4

2020-07-16 15:21:48

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v2] binder: Don't use mmput() from shrinker function.

On Fri 17-07-20 00:12:15, Tetsuo Handa wrote:
> syzbot is reporting that mmput() from shrinker function has a risk of
> deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
> kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
> uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.
>
> Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
> callback") replaced mmput() with mmput_async() in order to avoid sleeping
> with spinlock held. But this patch replaces mmput() with mmput_async() in
> order not to start __mmput() from shrinker context.
>
> [1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45
>
> Reported-by: syzbot <[email protected]>
> Reported-by: syzbot <[email protected]>
> Signed-off-by: Tetsuo Handa <[email protected]>

Reviewed-by: Michal Hocko <[email protected]>

Thanks!

> ---
> drivers/android/binder_alloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index 42c672f1584e..cbe6aa77d50d 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
> trace_binder_unmap_user_end(alloc, index);
> }
> mmap_read_unlock(mm);
> - mmput(mm);
> + mmput_async(mm);
>
> trace_binder_unmap_kernel_start(alloc, index);
>
> --
> 2.18.4

--
Michal Hocko
SUSE Labs

2020-07-16 16:31:06

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2] binder: Don't use mmput() from shrinker function.

On Thu, Jul 16, 2020 at 05:17:56PM +0200, Michal Hocko wrote:
> On Fri 17-07-20 00:12:15, Tetsuo Handa wrote:
> > syzbot is reporting that mmput() from shrinker function has a risk of
> > deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
> > kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
> > uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.
> >
> > Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
> > callback") replaced mmput() with mmput_async() in order to avoid sleeping
> > with spinlock held. But this patch replaces mmput() with mmput_async() in
> > order not to start __mmput() from shrinker context.
> >
> > [1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45
> >
> > Reported-by: syzbot <[email protected]>
> > Reported-by: syzbot <[email protected]>
> > Signed-off-by: Tetsuo Handa <[email protected]>
>
> Reviewed-by: Michal Hocko <[email protected]>

Thanks for the careful review Michal!
Does this need a Cc: stable?

Otherwise

Acked-by: Christian Brauner <[email protected]>

Thanks!
Christian

2020-07-16 23:19:12

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH v2] binder: Don't use mmput() from shrinker function.

On 2020/07/17 1:29, Christian Brauner wrote:
> Does this need a Cc: stable?

Up to someone who applies this patch. I think this race is hard to hit.

2020-07-16 23:55:15

by Todd Kjos

[permalink] [raw]
Subject: Re: [PATCH v2] binder: Don't use mmput() from shrinker function.

On Thu, Jul 16, 2020 at 8:18 AM Michal Hocko <[email protected]> wrote:
>
> On Fri 17-07-20 00:12:15, Tetsuo Handa wrote:
> > syzbot is reporting that mmput() from shrinker function has a risk of
> > deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
> > kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
> > uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.
> >
> > Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
> > callback") replaced mmput() with mmput_async() in order to avoid sleeping
> > with spinlock held. But this patch replaces mmput() with mmput_async() in
> > order not to start __mmput() from shrinker context.
> >
> > [1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45
> >
> > Reported-by: syzbot <[email protected]>
> > Reported-by: syzbot <[email protected]>
> > Signed-off-by: Tetsuo Handa <[email protected]>
>
> Reviewed-by: Michal Hocko <[email protected]>

Acked-by: Todd Kjos <[email protected]>

>
> Thanks!
>
> > ---
> > drivers/android/binder_alloc.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> > index 42c672f1584e..cbe6aa77d50d 100644
> > --- a/drivers/android/binder_alloc.c
> > +++ b/drivers/android/binder_alloc.c
> > @@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
> > trace_binder_unmap_user_end(alloc, index);
> > }
> > mmap_read_unlock(mm);
> > - mmput(mm);
> > + mmput_async(mm);
> >
> > trace_binder_unmap_kernel_start(alloc, index);
> >
> > --
> > 2.18.4
>
> --
> Michal Hocko
> SUSE Labs