2022-04-07 08:15:05

by Max Filippov

Subject: Re: [PATCH 11/11] arch: xtensa: platforms: Fix deadlock in rs_close()

Hi Duoming,

On Wed, Apr 6, 2022 at 11:38 PM Duoming Zhou <[email protected]> wrote:
>
> There is a deadlock in rs_close(), which is shown
> below:
>
> (Thread 1) | (Thread 2)
> | rs_open()
> rs_close() | mod_timer()
> spin_lock_bh() //(1) | (wait a time)
> ... | rs_poll()
> del_timer_sync() | spin_lock() //(2)
> (wait timer to stop) | ...
>
> We hold timer_lock in position (1) of thread 1 and
> use del_timer_sync() to wait timer to stop, but timer handler
> also need timer_lock in position (2) of thread 2.
> As a result, rs_close() will block forever.

I agree with this.

> This patch extracts del_timer_sync() from the protection of
> spin_lock_bh(), which could let timer handler to obtain
> the needed lock.

Looking at the timer_lock I don't really understand what it protects.
It looks like it is not needed at all.

Also, I see that rs_poll rewinds the timer regardless of whether del_timer_sync
was called or not, which violates del_timer_sync requirements.

> Signed-off-by: Duoming Zhou <[email protected]>
> ---
> arch/xtensa/platforms/iss/console.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
> index 81d7c7e8f7e..d431b61ae3c 100644
> --- a/arch/xtensa/platforms/iss/console.c
> +++ b/arch/xtensa/platforms/iss/console.c
> @@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
> static void rs_close(struct tty_struct *tty, struct file * filp)
> {
> spin_lock_bh(&timer_lock);
> - if (tty->count == 1)
> + if (tty->count == 1) {
> + spin_unlock_bh(&timer_lock);
> del_timer_sync(&serial_timer);
> + }
> spin_unlock_bh(&timer_lock);

Now in case tty->count == 1 the timer_lock would be unlocked twice.
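
To make the imbalance concrete, here is a small user-space model of the
patched control flow. It is not kernel code: the lock is modelled as a plain
depth counter, and struct model_lock, model_lock_acquire(),
model_lock_release() and rs_close_patched_model() are names made up for this
sketch only.

```c
#include <assert.h>

/* Hypothetical stand-in for timer_lock: depth is 1 while held, 0 when free. */
struct model_lock {
	int depth;
	int underflows;	/* unlocks issued while the lock was not held */
};

static void model_lock_acquire(struct model_lock *l)
{
	l->depth++;
}

static void model_lock_release(struct model_lock *l)
{
	if (l->depth == 0)
		l->underflows++;	/* this is the double unlock */
	else
		l->depth--;
}

/*
 * Mirrors the control flow of rs_close() as changed by the patch and
 * returns how many unbalanced unlocks happened.
 */
static int rs_close_patched_model(int tty_count)
{
	struct model_lock timer_lock = { 0, 0 };

	assert(tty_count >= 1);	/* close is only reached on an open tty */

	model_lock_acquire(&timer_lock);
	if (tty_count == 1) {
		model_lock_release(&timer_lock);
		/* del_timer_sync(&serial_timer) would run here, unlocked */
	}
	model_lock_release(&timer_lock);	/* second unlock when tty_count == 1 */

	return timer_lock.underflows;
}
```

In this model rs_close_patched_model(1) reports one unbalanced unlock, while
any larger count reports none.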

--
Thanks.
-- Max


2022-04-07 20:40:20

by Duoming Zhou

Subject: Re: Re: [PATCH 11/11] arch: xtensa: platforms: Fix deadlock in rs_close()

Hello,

On Thu, 7 Apr 2022 12:42:31 +0300 Sergey Shtylyov wrote:

> > There is a deadlock in rs_close(), which is shown
> > below:
> >
> > (Thread 1) | (Thread 2)
> > | rs_open()
> > rs_close() | mod_timer()
> > spin_lock_bh() //(1) | (wait a time)
> > ... | rs_poll()
> > del_timer_sync() | spin_lock() //(2)
> > (wait timer to stop) | ...
> >
> > We hold timer_lock in position (1) of thread 1 and
> > use del_timer_sync() to wait timer to stop, but timer handler
> > also need timer_lock in position (2) of thread 2.
> > As a result, rs_close() will block forever.
> >
> > This patch extracts del_timer_sync() from the protection of
> > spin_lock_bh(), which could let timer handler to obtain
> > the needed lock.
> >
> > Signed-off-by: Duoming Zhou <[email protected]>
> > ---
> > arch/xtensa/platforms/iss/console.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
> > index 81d7c7e8f7e..d431b61ae3c 100644
> > --- a/arch/xtensa/platforms/iss/console.c
> > +++ b/arch/xtensa/platforms/iss/console.c
> > @@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
> > static void rs_close(struct tty_struct *tty, struct file * filp)
> > {
> > spin_lock_bh(&timer_lock);
> > - if (tty->count == 1)
> > + if (tty->count == 1) {
> > + spin_unlock_bh(&timer_lock);
> > del_timer_sync(&serial_timer);
> > + }
> > spin_unlock_bh(&timer_lock);
>
> Double unlock iff tty->count == 1?

Yes, thanks a lot for your time and advice. I found that there is no race
condition between rs_close() and rs_poll() (the timer handler), so I think we
could remove the timer_lock in rs_close(), rs_open() and rs_poll().

Best regards,
Duoming Zhou

2022-04-07 21:08:31

by Duoming Zhou

Subject: Re: Re: [PATCH 11/11] arch: xtensa: platforms: Fix deadlock in rs_close()

Hello,

On Thu, 7 Apr 2022 00:21:58 -0700 Max Filippov wrote:

> > There is a deadlock in rs_close(), which is shown
> > below:
> >
> > (Thread 1) | (Thread 2)
> > | rs_open()
> > rs_close() | mod_timer()
> > spin_lock_bh() //(1) | (wait a time)
> > ... | rs_poll()
> > del_timer_sync() | spin_lock() //(2)
> > (wait timer to stop) | ...
> >
> > We hold timer_lock in position (1) of thread 1 and
> > use del_timer_sync() to wait timer to stop, but timer handler
> > also need timer_lock in position (2) of thread 2.
> > As a result, rs_close() will block forever.
>
> I agree with this.
>
> > This patch extracts del_timer_sync() from the protection of
> > spin_lock_bh(), which could let timer handler to obtain
> > the needed lock.
>
> Looking at the timer_lock I don't really understand what it protects.
> It looks like it is not needed at all.

There is no race condition between rs_close() and rs_poll() (the timer handler),
so I think we could remove the timer_lock in rs_close(), rs_open() and rs_poll().

> Also, I see that rs_poll rewinds the timer regardless of whether del_timer_sync
> was called or not, which violates del_timer_sync requirements.

I wrote a kernel module to test whether del_timer_sync() could stop a timer whose
handler uses mod_timer() to rewind (re-arm) itself. The following is the result.

# insmod del_timer_sync.ko
[ 929.374405] my_timer will be create.
[ 929.374738] the jiffies is :4295595572
[ 930.411581] In my_timer_function
[ 930.411956] the jiffies is 4295596609
[ 935.466643] In my_timer_function
[ 935.467505] the jiffies is 4295601665
[ 940.586538] In my_timer_function
[ 940.586916] the jiffies is 4295606784
[ 945.706579] In my_timer_function
[ 945.706885] the jiffies is 4295611904

#
# rmmod del_timer_sync.ko
[ 948.507692] the del_timer_sync is :1
[ 948.507692]
#
#

The experiment shows that the timer handler is no longer invoked
after del_timer_sync() has been executed.

> > Signed-off-by: Duoming Zhou <[email protected]>
> > ---
> > arch/xtensa/platforms/iss/console.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
> > index 81d7c7e8f7e..d431b61ae3c 100644
> > --- a/arch/xtensa/platforms/iss/console.c
> > +++ b/arch/xtensa/platforms/iss/console.c
> > @@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
> > static void rs_close(struct tty_struct *tty, struct file * filp)
> > {
> > spin_lock_bh(&timer_lock);
> > - if (tty->count == 1)
> > + if (tty->count == 1) {
> > + spin_unlock_bh(&timer_lock);
> > del_timer_sync(&serial_timer);
> > + }
> > spin_unlock_bh(&timer_lock);
>
> Now in case tty->count == 1 the timer_lock would be unlocked twice.

I will remove the timer_lock in rs_close(), rs_open() and rs_poll().
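
A rough sketch of what rs_close() might look like without the lock is below.
So that it compiles on its own, user-space stubs stand in for the kernel types
and for del_timer_sync(); the stubs, the del_timer_sync_calls counter and
rs_close_demo() are assumptions for illustration only, not the actual follow-up
patch.

```c
#include <assert.h>

/*
 * User-space stand-ins so the sketch compiles alone; the real definitions
 * come from <linux/tty.h> and <linux/timer.h>.
 */
struct tty_struct { int count; };
struct file;
struct timer_list { int pending; };

static struct timer_list serial_timer = { 1 };
static int del_timer_sync_calls;	/* illustration only */

/* Stub: the real del_timer_sync() also waits for a running handler. */
static int del_timer_sync(struct timer_list *timer)
{
	int was_pending = timer->pending;

	timer->pending = 0;
	del_timer_sync_calls++;
	return was_pending;
}

/* rs_close() with timer_lock dropped: no lock is held while waiting. */
static void rs_close(struct tty_struct *tty, struct file *filp)
{
	(void)filp;
	if (tty->count == 1)
		del_timer_sync(&serial_timer);
}

/* Usage: only the last close stops the timer. */
static void rs_close_demo(void)
{
	struct tty_struct tty = { 2 };

	rs_close(&tty, 0);
	assert(del_timer_sync_calls == 0);

	tty.count = 1;
	rs_close(&tty, 0);
	assert(del_timer_sync_calls == 1);
}
```

Since nothing is locked around del_timer_sync() here, the handler has no lock
to wait on while rs_close() waits for it, and there is no unlock left to
duplicate.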

Thanks a lot for your time and advice!

Best regards,
Duoming Zhou