2021-07-07 07:46:25

by Desmond Cheong Zhi Xi

[permalink] [raw]
Subject: [PATCH v3 0/2] fcntl: fix potential deadlocks

Hi,

Sorry for the delay between v1 and v2, there was an unrelated issue with Syzbot testing.

Syzbot reports a possible irq lock inversion dependency:
https://syzkaller.appspot.com/bug?id=923cfc6c6348963f99886a0176ef11dcc429547b

While investigating this error, I discovered that multiple similar lock inversion scenarios can occur. Hence, this series addresses potential deadlocks for two classes of locks, one in each patch:

1. Fix potential deadlocks for &fown_struct.lock

2. Fix potential deadlock for &fasync_struct.fa_lock

v2 -> v3:
- Removed WARN_ON_ONCE, keeping elaboration for why read_lock_irq is safe to use in the commit message. As suggested by Greg KH.

v1 -> v2:
- Added WARN_ON_ONCE(irqs_disabled()) before calls to read_lock_irq, and added elaboration in the commit message. As suggested by Jeff Layton.

Best wishes,
Desmond

Desmond Cheong Zhi Xi (2):
fcntl: fix potential deadlocks for &fown_struct.lock
fcntl: fix potential deadlock for &fasync_struct.fa_lock

fs/fcntl.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

--
2.25.1


2021-07-07 07:46:45

by Desmond Cheong Zhi Xi

[permalink] [raw]
Subject: [PATCH v3 1/2] fcntl: fix potential deadlocks for &fown_struct.lock

Syzbot reports a potential deadlock in do_fcntl:

========================================================
WARNING: possible irq lock inversion dependency detected
5.12.0-syzkaller #0 Not tainted
--------------------------------------------------------
syz-executor132/8391 just changed the state of lock:
ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: f_getown_ex fs/fcntl.c:211 [inline]
ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: do_fcntl+0x8b4/0x1200 fs/fcntl.c:395
but this lock was taken by another, HARDIRQ-safe lock in the past:
(&dev->event_lock){-...}-{2:2}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
Chain exists of:
&dev->event_lock --> &new->fa_lock --> &f->f_owner.lock

Possible interrupt unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&f->f_owner.lock);
local_irq_disable();
lock(&dev->event_lock);
lock(&new->fa_lock);
<Interrupt>
lock(&dev->event_lock);

*** DEADLOCK ***

This happens because there is a lock hierarchy of
&dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
from the following call chain:

input_inject_event():
spin_lock_irqsave(&dev->event_lock,...);
input_handle_event():
input_pass_values():
input_to_handler():
evdev_events():
evdev_pass_values():
spin_lock(&client->buffer_lock);
__pass_event():
kill_fasync():
kill_fasync_rcu():
read_lock(&fa->fa_lock);
send_sigio():
read_lock_irqsave(&fown->lock,...);

However, since &dev->event_lock is HARDIRQ-safe, interrupts have to be
disabled while grabbing &f->f_owner.lock, otherwise we invert the lock
hierarchy.

Hence, we replace calls to read_lock/read_unlock on &f->f_owner.lock,
with read_lock_irq/read_unlock_irq.

Here read_lock_irq/read_unlock_irq are safe to use because the
functions f_getown_ex and f_getowner_uids are only called from
do_fcntl, and f_getown is only called from do_fnctl and
sock_ioctl. do_fnctl itself is only called from syscalls.

For sock_ioctl, the chain is
compat_sock_ioctl():
compat_sock_ioctl_trans():
sock_ioctl()

Interrupts are not disabled on either path.

Reported-and-tested-by: [email protected]
Signed-off-by: Desmond Cheong Zhi Xi <[email protected]>
---
fs/fcntl.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index dfc72f15be7f..cf9e81dfa615 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -150,7 +150,8 @@ void f_delown(struct file *filp)
pid_t f_getown(struct file *filp)
{
pid_t pid = 0;
- read_lock(&filp->f_owner.lock);
+
+ read_lock_irq(&filp->f_owner.lock);
rcu_read_lock();
if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type)) {
pid = pid_vnr(filp->f_owner.pid);
@@ -158,7 +159,7 @@ pid_t f_getown(struct file *filp)
pid = -pid;
}
rcu_read_unlock();
- read_unlock(&filp->f_owner.lock);
+ read_unlock_irq(&filp->f_owner.lock);
return pid;
}

@@ -208,7 +209,7 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
struct f_owner_ex owner = {};
int ret = 0;

- read_lock(&filp->f_owner.lock);
+ read_lock_irq(&filp->f_owner.lock);
rcu_read_lock();
if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type))
owner.pid = pid_vnr(filp->f_owner.pid);
@@ -231,7 +232,7 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
ret = -EINVAL;
break;
}
- read_unlock(&filp->f_owner.lock);
+ read_unlock_irq(&filp->f_owner.lock);

if (!ret) {
ret = copy_to_user(owner_p, &owner, sizeof(owner));
@@ -249,10 +250,10 @@ static int f_getowner_uids(struct file *filp, unsigned long arg)
uid_t src[2];
int err;

- read_lock(&filp->f_owner.lock);
+ read_lock_irq(&filp->f_owner.lock);
src[0] = from_kuid(user_ns, filp->f_owner.uid);
src[1] = from_kuid(user_ns, filp->f_owner.euid);
- read_unlock(&filp->f_owner.lock);
+ read_unlock_irq(&filp->f_owner.lock);

err = put_user(src[0], &dst[0]);
err |= put_user(src[1], &dst[1]);
--
2.25.1

2021-07-07 07:47:05

by Desmond Cheong Zhi Xi

[permalink] [raw]
Subject: [PATCH v3 2/2] fcntl: fix potential deadlock for &fasync_struct.fa_lock

There is an existing lock hierarchy of
&dev->event_lock --> &fasync_struct.fa_lock --> &f->f_owner.lock
from the following call chain:

input_inject_event():
spin_lock_irqsave(&dev->event_lock,...);
input_handle_event():
input_pass_values():
input_to_handler():
evdev_events():
evdev_pass_values():
spin_lock(&client->buffer_lock);
__pass_event():
kill_fasync():
kill_fasync_rcu():
read_lock(&fa->fa_lock);
send_sigio():
read_lock_irqsave(&fown->lock,...);

&dev->event_lock is HARDIRQ-safe, so interrupts have to be disabled
while grabbing &fasync_struct.fa_lock, otherwise we invert the lock
hierarchy. However, since kill_fasync which calls kill_fasync_rcu is
an exported symbol, it may not necessarily be called with interrupts
disabled.

As kill_fasync_rcu may be called with interrupts disabled (for
example, in the call chain above), we replace calls to
read_lock/read_unlock on &fasync_struct.fa_lock in kill_fasync_rcu
with read_lock_irqsave/read_unlock_irqrestore.

Signed-off-by: Desmond Cheong Zhi Xi <[email protected]>
---
fs/fcntl.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index cf9e81dfa615..887db4918a89 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -1004,13 +1004,14 @@ static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
{
while (fa) {
struct fown_struct *fown;
+ unsigned long flags;

if (fa->magic != FASYNC_MAGIC) {
printk(KERN_ERR "kill_fasync: bad magic number in "
"fasync_struct!\n");
return;
}
- read_lock(&fa->fa_lock);
+ read_lock_irqsave(&fa->fa_lock, flags);
if (fa->fa_file) {
fown = &fa->fa_file->f_owner;
/* Don't send SIGURG to processes which have not set a
@@ -1019,7 +1020,7 @@ static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
if (!(sig == SIGURG && fown->signum == 0))
send_sigio(fown, fa->fa_fd, band);
}
- read_unlock(&fa->fa_lock);
+ read_unlock_irqrestore(&fa->fa_lock, flags);
fa = rcu_dereference(fa->fa_next);
}
}
--
2.25.1

2021-07-07 18:51:16

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 0/2] fcntl: fix potential deadlocks

On Wed, 2021-07-07 at 15:43 +0800, Desmond Cheong Zhi Xi wrote:
> Hi,
>
> Sorry for the delay between v1 and v2, there was an unrelated issue with Syzbot testing.
>
> Syzbot reports a possible irq lock inversion dependency:
> https://syzkaller.appspot.com/bug?id=923cfc6c6348963f99886a0176ef11dcc429547b
>
> While investigating this error, I discovered that multiple similar lock inversion scenarios can occur. Hence, this series addresses potential deadlocks for two classes of locks, one in each patch:
>
> 1. Fix potential deadlocks for &fown_struct.lock
>
> 2. Fix potential deadlock for &fasync_struct.fa_lock
>
> v2 -> v3:
> - Removed WARN_ON_ONCE, keeping elaboration for why read_lock_irq is safe to use in the commit message. As suggested by Greg KH.
>
> v1 -> v2:
> - Added WARN_ON_ONCE(irqs_disabled()) before calls to read_lock_irq, and added elaboration in the commit message. As suggested by Jeff Layton.
>
> Best wishes,
> Desmond
>
> Desmond Cheong Zhi Xi (2):
> fcntl: fix potential deadlocks for &fown_struct.lock
> fcntl: fix potential deadlock for &fasync_struct.fa_lock
>
> fs/fcntl.c | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>

Looks like these patches are identical to the v1 set, so I'm just going
to leave those in place since linux-next already has them. Let me know
if I've missed something though.

Thanks!
--
Jeff Layton <[email protected]>

2021-07-08 01:55:45

by Desmond Cheong Zhi Xi

[permalink] [raw]
Subject: Re: [PATCH v3 0/2] fcntl: fix potential deadlocks

On 8/7/21 1:06 am, Jeff Layton wrote:
> On Wed, 2021-07-07 at 15:43 +0800, Desmond Cheong Zhi Xi wrote:
>> Hi,
>>
>> Sorry for the delay between v1 and v2, there was an unrelated issue with Syzbot testing.
>>
>> Syzbot reports a possible irq lock inversion dependency:
>> https://syzkaller.appspot.com/bug?id=923cfc6c6348963f99886a0176ef11dcc429547b
>>
>> While investigating this error, I discovered that multiple similar lock inversion scenarios can occur. Hence, this series addresses potential deadlocks for two classes of locks, one in each patch:
>>
>> 1. Fix potential deadlocks for &fown_struct.lock
>>
>> 2. Fix potential deadlock for &fasync_struct.fa_lock
>>
>> v2 -> v3:
>> - Removed WARN_ON_ONCE, keeping elaboration for why read_lock_irq is safe to use in the commit message. As suggested by Greg KH.
>>
>> v1 -> v2:
>> - Added WARN_ON_ONCE(irqs_disabled()) before calls to read_lock_irq, and added elaboration in the commit message. As suggested by Jeff Layton.
>>
>> Best wishes,
>> Desmond
>>
>> Desmond Cheong Zhi Xi (2):
>> fcntl: fix potential deadlocks for &fown_struct.lock
>> fcntl: fix potential deadlock for &fasync_struct.fa_lock
>>
>> fs/fcntl.c | 18 ++++++++++--------
>> 1 file changed, 10 insertions(+), 8 deletions(-)
>>
>
> Looks like these patches are identical to the v1 set, so I'm just going
> to leave those in place since linux-next already has them. Let me know
> if I've missed something though.
>
> Thanks!
>

Yep, there's no change outside of the commit message. But I think after
the discussion and with config DEBUG_IRQFLAGS, that is fine.

Thanks again, Jeff!