2023-12-02 09:21:14

by Yu Kuai

Subject: Re: [PATCH next] trace/blktrace: fix task hung in blk_trace_ioctl

Hi,

On 2023/12/02 17:01, Edward Adam Davis wrote:
> The reproducer involves running test programs on multiple processors separately,
> in order to enter blkdev_ioctl() and ultimately reach blk_trace_ioctl() through
> two different paths, triggering an AA deadlock.
>
> CPU0                            CPU1
> ---                             ---
> mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
> mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
>
>
> The first path:
> blkdev_ioctl()->
> blk_trace_ioctl()->
> mutex_lock(&q->debugfs_mutex)
>
> The second path:
> blkdev_ioctl()->
> blkdev_common_ioctl()->
> blk_trace_ioctl()->
> mutex_lock(&q->debugfs_mutex)
I still don't understand how this AA deadlock is triggered. Is
'debugfs_mutex' already held before calling blk_trace_ioctl()?

>
> The solution I have proposed is to exit blk_trace_ioctl() to avoid AA locks if
> a task has already obtained debugfs_mutex.
>
> Fixes: 0d345996e4cb ("x86/kernel: increase kcov coverage under arch/x86/kernel folder")
> Reported-and-tested-by: [email protected]
> Signed-off-by: Edward Adam Davis <[email protected]>
> ---
> kernel/trace/blktrace.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 54ade89a1ad2..34e5bce42b1e 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -735,7 +735,8 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg)
>  	int ret, start = 0;
>  	char b[BDEVNAME_SIZE];
>
> -	mutex_lock(&q->debugfs_mutex);
> +	if (!mutex_trylock(&q->debugfs_mutex))
> +		return -EBUSY;

This is absolutely not a proper fix; a lot of use cases will fail after
this patch.
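
To illustrate the concern, here is a minimal userspace sketch. It uses
pthreads rather than the kernel mutex API, and demo_mutex,
second_caller() and trylock_under_contention() are hypothetical names
that stand in for q->debugfs_mutex and two ioctl callers. It only shows
that a trylock fails under ordinary contention, with no deadlock
involved at all:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for q->debugfs_mutex. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Second caller: mimics the patched code by giving up instead of waiting. */
static void *second_caller(void *out)
{
	int ret = pthread_mutex_trylock(&demo_mutex);

	if (ret == 0)
		pthread_mutex_unlock(&demo_mutex);
	*(int *)out = ret;
	return NULL;
}

/* Returns what the second caller saw while the first one held the lock. */
int trylock_under_contention(void)
{
	pthread_t t;
	int result = -1;
	int ret;

	pthread_mutex_lock(&demo_mutex);	/* first caller is mid-operation */
	ret = pthread_create(&t, NULL, second_caller, &result);
	assert(ret == 0);
	pthread_join(t, NULL);
	pthread_mutex_unlock(&demo_mutex);

	return result;
}
```

In this sketch the second caller gets EBUSY simply because another
thread legitimately holds the lock, whereas a plain lock would have
waited and then succeeded.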

Thanks,
Kuai

>
>  	switch (cmd) {
>  	case BLKTRACESETUP:
>


2023-12-02 22:08:07

by Steven Rostedt

Subject: Re: [PATCH next] trace/blktrace: fix task hung in blk_trace_ioctl

On Sat, 2 Dec 2023 17:19:25 +0800
Yu Kuai <[email protected]> wrote:

> Hi,
>
> On 2023/12/02 17:01, Edward Adam Davis wrote:
> > The reproducer involves running test programs on multiple processors separately,
> > in order to enter blkdev_ioctl() and ultimately reach blk_trace_ioctl() through
> > two different paths, triggering an AA deadlock.
> >
> > CPU0                            CPU1
> > ---                             ---
> > mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
> > mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
> >
> >
> > The first path:
> > blkdev_ioctl()->
> > blk_trace_ioctl()->
> > mutex_lock(&q->debugfs_mutex)
> >
> > The second path:
> > blkdev_ioctl()->
> > blkdev_common_ioctl()->
> > blk_trace_ioctl()->
> > mutex_lock(&q->debugfs_mutex)
> I still don't understand how this AA deadlock is triggered. Is
> 'debugfs_mutex' already held before calling blk_trace_ioctl()?

Right, I don't see where the mutex is taken twice. You don't need two
paths for an AA lock, you only need one.

>
> >
> > The solution I have proposed is to exit blk_trace_ioctl() to avoid AA locks if
> > a task has already obtained debugfs_mutex.
> >
> > Fixes: 0d345996e4cb ("x86/kernel: increase kcov coverage under arch/x86/kernel folder")

How does it fix the above? I don't see how the above is even related to this.

-- Steve

> > Reported-and-tested-by: [email protected]
> > Signed-off-by: Edward Adam Davis <[email protected]>
> > ---
> > kernel/trace/blktrace.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c

2023-12-03 11:51:47

by Pengfei Xu

Subject: Re: [PATCH next] trace/blktrace: fix task hung in blk_trace_ioctl

Hi,

On 2023-12-03 at 06:07:43 +0800, Steven Rostedt wrote:
> On Sat, 2 Dec 2023 17:19:25 +0800
> Yu Kuai <[email protected]> wrote:
>
> > Hi,
> >
> > On 2023/12/02 17:01, Edward Adam Davis wrote:
> > > The reproducer involves running test programs on multiple processors separately,
> > > in order to enter blkdev_ioctl() and ultimately reach blk_trace_ioctl() through
> > > two different paths, triggering an AA deadlock.
> > >
> > > CPU0                            CPU1
> > > ---                             ---
> > > mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
> > > mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
> > >
> > >
> > > The first path:
> > > blkdev_ioctl()->
> > > blk_trace_ioctl()->
> > > mutex_lock(&q->debugfs_mutex)
> > >
> > > The second path:
> > > blkdev_ioctl()->
> > > blkdev_common_ioctl()->
> > > blk_trace_ioctl()->
> > > mutex_lock(&q->debugfs_mutex)
> > I still don't understand how this AA deadlock is triggered. Is
> > 'debugfs_mutex' already held before calling blk_trace_ioctl()?
>
> Right, I don't see where the mutex is taken twice. You don't need two
> paths for an AA lock, you only need one.
>
> >
> > >
> > > The solution I have proposed is to exit blk_trace_ioctl() to avoid AA locks if
> > > a task has already obtained debugfs_mutex.
> > >
> > > Fixes: 0d345996e4cb ("x86/kernel: increase kcov coverage under arch/x86/kernel folder")
>
> How does it fix the above? I don't see how the above is even related to this.

I bisected this issue and the following fix information is more accurate:
"
Fixes: f2c2e717642c ("usb: gadget: add raw-gadget interface")
"

All the bisection info is available at: https://github.com/xupengfe/syzkaller_logs/tree/main/231203_140738_blk_trace_ioctl

Acked-by: Pengfei Xu <[email protected]>

Thanks!

>
> -- Steve
>
> > > Reported-and-tested-by: [email protected]
> > > Signed-off-by: Edward Adam Davis <[email protected]>
> > > ---
> > > kernel/trace/blktrace.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c