2023-03-26 08:54:06

by Dae R. Jeong

[permalink] [raw]
Subject: WARNING in isotp_tx_timer_handler and WARNING in print_tainted

Hi,

I am curious about the error handling logic in isotp_sendmsg() which
looks a bit unclear to me.

I was looking the `WARNING in isotp_tx_timer_handler` warning [1],
which was firstly addressed by a commit [2] but reoccured even after
the commit.
[1]: https://syzkaller.appspot.com/bug?id=4f492d593461a5e44d76dd9322e179d13191a8ef
[2]: c6adf659a8ba can: isotp: check CAN address family in isotp_bind()

I thought that the warning is caused by the concurrent execution of
two isotp_sendmsg() as described below (I'm not 100% sure though).

CPU1 CPU2
isotp_sendmsg() isotp_sendmsg()
----- -----
old_state = so->tx.state; // ISOTP_IDLE

cmpxchg(&so->tx.state, ISTOP_IDLE, ISOTP_SENDING) // success
...
so->tx.state = ISTOP_WAIT_FIRST_FC;
hrtimer_start(&so->txtimer);

cmpxchg(&so->tx.state, ISTOP_IDLE, ISOTP_SENDING) // failed
// if MSG_DONTWAIT is set in msg->msg_flags or
// a signal is delivered during wait_event_interruptible()
goto err_out;
err_out:
so->tx.state = old_state; // ISTOP_IDLE

isotp_tx_timer_handler()
-----
switch (so->tx.state) {
default:
WARN_ONCE();
}

Then, a commit [3] changed the logic of tx timer, and removed the
WARN_ONCE() statement. So I thought that the issue is completely
handled.
[3]: 4f027cba8216 can: isotp: split tx timer into transmission and timeout

But even after [3] is applied, I found a warning that seems related
occurred [4] (in the kernel commit: 478a351ce0d6).
[4]: https://syzkaller.appspot.com/bug?id=11d0e5f6fef53a0ea486bbd07ddd3cba66132150

So I wonder whether the `err_out` logic in isotp_sendmsg() is safe.
For me, it looks like isotp_sendmsg() can change so->tx.state to
ISTOP_IDLE at any time. It may not be a problem if all other locations
are aware of this. Is this an intended behavior?

Thank you in advance.


Best regards,
Dae R. Jeong


2023-03-26 11:20:48

by Oliver Hartkopp

[permalink] [raw]
Subject: Re: WARNING in isotp_tx_timer_handler and WARNING in print_tainted

Hi,

On 26.03.23 10:10, Dae R. Jeong wrote:
> Hi,
>
> I am curious about the error handling logic in isotp_sendmsg() which
> looks a bit unclear to me.
>
> I was looking the `WARNING in isotp_tx_timer_handler` warning [1],
> which was firstly addressed by a commit [2] but reoccured even after
> the commit.
> [1]: https://syzkaller.appspot.com/bug?id=4f492d593461a5e44d76dd9322e179d13191a8ef
> [2]: c6adf659a8ba can: isotp: check CAN address family in isotp_bind()
>
> I thought that the warning is caused by the concurrent execution of
> two isotp_sendmsg() as described below (I'm not 100% sure though).
>
> CPU1 CPU2
> isotp_sendmsg() isotp_sendmsg()
> ----- -----
> old_state = so->tx.state; // ISOTP_IDLE
>
> cmpxchg(&so->tx.state, ISTOP_IDLE, ISOTP_SENDING) // success
> ...
> so->tx.state = ISTOP_WAIT_FIRST_FC;
> hrtimer_start(&so->txtimer);
>
> cmpxchg(&so->tx.state, ISTOP_IDLE, ISOTP_SENDING) // failed
> // if MSG_DONTWAIT is set in msg->msg_flags or
> // a signal is delivered during wait_event_interruptible()
> goto err_out;
> err_out:
> so->tx.state = old_state; // ISTOP_IDLE
>
> isotp_tx_timer_handler()
> -----
> switch (so->tx.state) {
> default:
> WARN_ONCE();
> }
>
> Then, a commit [3] changed the logic of tx timer, and removed the
> WARN_ONCE() statement. So I thought that the issue is completely
> handled.
> [3]: 4f027cba8216 can: isotp: split tx timer into transmission and timeout
>
> But even after [3] is applied, I found a warning that seems related
> occurred [4] (in the kernel commit: 478a351ce0d6).
> [4]: https://syzkaller.appspot.com/bug?id=11d0e5f6fef53a0ea486bbd07ddd3cba66132150
>
> So I wonder whether the `err_out` logic in isotp_sendmsg() is safe.
> For me, it looks like isotp_sendmsg() can change so->tx.state to
> ISTOP_IDLE at any time. It may not be a problem if all other locations
> are aware of this. Is this an intended behavior?
>
> Thank you in advance.

Thank you for picking this up!

In fact I was not aware of the possibility of a concurrent execution of
isotp_sendmsg() and thought cmpxchg() would just make it ...

But looking at other *_sendmsg() implementations a lock_sock() seems to
be a common pattern to handle concurrent syscalls, see:

git grep -p lock_sock net | grep sendmsg

What do you think about adopting this to isotp_sendmsg()? See patch below.

Best regards,
Oliver

diff --git a/net/can/isotp.c b/net/can/isotp.c
index 9bc344851704..0b95c0df7a63 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -912,13 +912,12 @@ static enum hrtimer_restart
isotp_txfr_timer_handler(struct hrtimer *hrtimer)
isotp_send_cframe(so);

return HRTIMER_NORESTART;
}

-static int isotp_sendmsg(struct socket *sock, struct msghdr *msg,
size_t size)
+static int isotp_sendmsg_locked(struct sock *sk, struct msghdr *msg,
size_t size)
{
- struct sock *sk = sock->sk;
struct isotp_sock *so = isotp_sk(sk);
u32 old_state = so->tx.state;
struct sk_buff *skb;
struct net_device *dev;
struct canfd_frame *cf;
@@ -1091,10 +1090,22 @@ static int isotp_sendmsg(struct socket *sock,
struct msghdr *msg, size_t size)
wake_up_interruptible(&so->wait);

return err;
}

+static int isotp_sendmsg(struct socket *sock, struct msghdr *msg,
size_t size)
+{
+ struct sock *sk = sock->sk;
+ int ret;
+
+ lock_sock(sk);
+ ret = isotp_sendmsg_locked(sk, msg, size);
+ release_sock(sk);
+
+ return ret;
+}
+
static int isotp_recvmsg(struct socket *sock, struct msghdr *msg,
size_t size,
int flags)
{
struct sock *sk = sock->sk;
struct sk_buff *skb;



2023-03-26 11:59:57

by Dae R. Jeong

[permalink] [raw]
Subject: Re: WARNING in isotp_tx_timer_handler and WARNING in print_tainted

> diff --git a/net/can/isotp.c b/net/can/isotp.c
> index 9bc344851704..0b95c0df7a63 100644
> --- a/net/can/isotp.c
> +++ b/net/can/isotp.c
> @@ -912,13 +912,12 @@ static enum hrtimer_restart
> isotp_txfr_timer_handler(struct hrtimer *hrtimer)
> isotp_send_cframe(so);
>
> return HRTIMER_NORESTART;
> }
>
> -static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t
> size)
> +static int isotp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t
> size)
> {
> - struct sock *sk = sock->sk;
> struct isotp_sock *so = isotp_sk(sk);
> u32 old_state = so->tx.state;
> struct sk_buff *skb;
> struct net_device *dev;
> struct canfd_frame *cf;
> @@ -1091,10 +1090,22 @@ static int isotp_sendmsg(struct socket *sock, struct
> msghdr *msg, size_t size)
> wake_up_interruptible(&so->wait);
>
> return err;
> }
>
> +static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t
> size)
> +{
> + struct sock *sk = sock->sk;
> + int ret;
> +
> + lock_sock(sk);
> + ret = isotp_sendmsg_locked(sk, msg, size);
> + release_sock(sk);
> +
> + return ret;
> +}
> +
> static int isotp_recvmsg(struct socket *sock, struct msghdr *msg, size_t
> size,
> int flags)
> {
> struct sock *sk = sock->sk;
> struct sk_buff *skb;

Hi, Oliver.

It seems that the patch should address the scenario I was thinking
of. But using a lock is always scary for a newbie like me because of
the possibility of causing other problems, e.g., deadlock. If it does
not cause other problems, it looks good for me.

Or although I'm not sure about this, what about getting rid of
reverting so->tx.state to old_state?

I think the concurrent execution of isotp_sendmsg() would be
problematic when reverting so->tx.state to old_state after goto'ing
err_out. There are two locations of "goto err_out", and
iostp_sendmsg() does nothing to the socket before both of "goto
err_out". So after goto'ing err_out, it seems fine for me even if we
do not revert so->tx.state to old_state.

If I think correctly, this will make cmpxchg() work, and prevent the
problematic concurrent execution. Could you please check the patch
below?

Best regards,
Dae R. Jeong.

diff --git a/net/can/isotp.c b/net/can/isotp.c
index 9bc344851704..4630fad13803 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -918,7 +918,6 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
{
struct sock *sk = sock->sk;
struct isotp_sock *so = isotp_sk(sk);
- u32 old_state = so->tx.state;
struct sk_buff *skb;
struct net_device *dev;
struct canfd_frame *cf;
@@ -1084,9 +1083,8 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)

err_out_drop:
/* drop this PDU and unlock a potential wait queue */
- old_state = ISOTP_IDLE;
+ so->tx.state = ISOTP_IDLE;
err_out:
- so->tx.state = old_state;
if (so->tx.state == ISOTP_IDLE)
wake_up_interruptible(&so->wait);

2023-03-26 16:21:47

by Oliver Hartkopp

[permalink] [raw]
Subject: Re: WARNING in isotp_tx_timer_handler and WARNING in print_tainted

Hi Dae,

On 26.03.23 13:55, Dae R. Jeong wrote:
>> diff --git a/net/can/isotp.c b/net/can/isotp.c
>> index 9bc344851704..0b95c0df7a63 100644
>> --- a/net/can/isotp.c
>> +++ b/net/can/isotp.c
>> @@ -912,13 +912,12 @@ static enum hrtimer_restart
>> isotp_txfr_timer_handler(struct hrtimer *hrtimer)
>> isotp_send_cframe(so);
>>
>> return HRTIMER_NORESTART;
>> }
>>
>> -static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t
>> size)
>> +static int isotp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t
>> size)
>> {
>> - struct sock *sk = sock->sk;
>> struct isotp_sock *so = isotp_sk(sk);
>> u32 old_state = so->tx.state;
>> struct sk_buff *skb;
>> struct net_device *dev;
>> struct canfd_frame *cf;
>> @@ -1091,10 +1090,22 @@ static int isotp_sendmsg(struct socket *sock, struct
>> msghdr *msg, size_t size)
>> wake_up_interruptible(&so->wait);
>>
>> return err;
>> }
>>
>> +static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t
>> size)
>> +{
>> + struct sock *sk = sock->sk;
>> + int ret;
>> +
>> + lock_sock(sk);
>> + ret = isotp_sendmsg_locked(sk, msg, size);
>> + release_sock(sk);
>> +
>> + return ret;
>> +}
>> +
>> static int isotp_recvmsg(struct socket *sock, struct msghdr *msg, size_t
>> size,
>> int flags)
>> {
>> struct sock *sk = sock->sk;
>> struct sk_buff *skb;
>
> Hi, Oliver.
>
> It seems that the patch should address the scenario I was thinking
> of. But using a lock is always scary for a newbie like me because of
> the possibility of causing other problems, e.g., deadlock. If it does
> not cause other problems, it looks good for me.

Yes, I feel you!

We use lock_sock() also in the notifier which is called when someone
removes the CAN interface.

But the other cases for e.g. set_sockopt() and for sendmsg() seem to be
a common pattern to lock concurrent user space calls.

> Or although I'm not sure about this, what about getting rid of
> reverting so->tx.state to old_state?
>
> I think the concurrent execution of isotp_sendmsg() would be
> problematic when reverting so->tx.state to old_state after goto'ing
> err_out.
Your described case in the original post indeed shows that this might
lead to a problem.

> There are two locations of "goto err_out", and
> iostp_sendmsg() does nothing to the socket before both of "goto
> err_out". So after goto'ing err_out, it seems fine for me even if we
> do not revert so->tx.state to old_state.
>
> If I think correctly, this will make cmpxchg() work, and prevent the
> problematic concurrent execution. Could you please check the patch
> below?

Hm, interesting idea.

But in which state will so->tx.state be here:

/* wait for complete transmission of current pdu */
err = wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
if (err)
goto err_out;


Should we better set the tx.state in the error case?

if (err) {
so->tx.state = ISOTP_IDLE;
goto err_out;
}

Best regards,
Oliver

(..)

>
> diff --git a/net/can/isotp.c b/net/can/isotp.c
> index 9bc344851704..4630fad13803 100644
> --- a/net/can/isotp.c
> +++ b/net/can/isotp.c
> @@ -918,7 +918,6 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
> {
> struct sock *sk = sock->sk;
> struct isotp_sock *so = isotp_sk(sk);
> - u32 old_state = so->tx.state;
> struct sk_buff *skb;
> struct net_device *dev;
> struct canfd_frame *cf;
> @@ -1084,9 +1083,8 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
>
> err_out_drop:
> /* drop this PDU and unlock a potential wait queue */
> - old_state = ISOTP_IDLE;
> + so->tx.state = ISOTP_IDLE;
> err_out:
> - so->tx.state = old_state;
> if (so->tx.state == ISOTP_IDLE)
> wake_up_interruptible(&so->wait);
>

2023-03-27 01:59:35

by Dae R. Jeong

[permalink] [raw]
Subject: Re: WARNING in isotp_tx_timer_handler and WARNING in print_tainted

On Sun, Mar 26, 2023 at 06:17:17PM +0200, Oliver Hartkopp wrote:
> Hi Dae,
>
> On 26.03.23 13:55, Dae R. Jeong wrote:
> > > diff --git a/net/can/isotp.c b/net/can/isotp.c
> > > index 9bc344851704..0b95c0df7a63 100644
> > > --- a/net/can/isotp.c
> > > +++ b/net/can/isotp.c
> > > @@ -912,13 +912,12 @@ static enum hrtimer_restart
> > > isotp_txfr_timer_handler(struct hrtimer *hrtimer)
> > > isotp_send_cframe(so);
> > >
> > > return HRTIMER_NORESTART;
> > > }
> > >
> > > -static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t
> > > size)
> > > +static int isotp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t
> > > size)
> > > {
> > > - struct sock *sk = sock->sk;
> > > struct isotp_sock *so = isotp_sk(sk);
> > > u32 old_state = so->tx.state;
> > > struct sk_buff *skb;
> > > struct net_device *dev;
> > > struct canfd_frame *cf;
> > > @@ -1091,10 +1090,22 @@ static int isotp_sendmsg(struct socket *sock, struct
> > > msghdr *msg, size_t size)
> > > wake_up_interruptible(&so->wait);
> > >
> > > return err;
> > > }
> > >
> > > +static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t
> > > size)
> > > +{
> > > + struct sock *sk = sock->sk;
> > > + int ret;
> > > +
> > > + lock_sock(sk);
> > > + ret = isotp_sendmsg_locked(sk, msg, size);
> > > + release_sock(sk);
> > > +
> > > + return ret;
> > > +}
> > > +
> > > static int isotp_recvmsg(struct socket *sock, struct msghdr *msg, size_t
> > > size,
> > > int flags)
> > > {
> > > struct sock *sk = sock->sk;
> > > struct sk_buff *skb;
> >
> > Hi, Oliver.
> >
> > It seems that the patch should address the scenario I was thinking
> > of. But using a lock is always scary for a newbie like me because of
> > the possibility of causing other problems, e.g., deadlock. If it does
> > not cause other problems, it looks good for me.
>
> Yes, I feel you!
>
> We use lock_sock() also in the notifier which is called when someone removes
> the CAN interface.
>
> But the other cases for e.g. set_sockopt() and for sendmsg() seem to be a
> common pattern to lock concurrent user space calls.


Yes, I see lock_sock() is a common pattern in *_sendmsg(). One thing I
wonder is whether sleeping (i.e., wait_event_*) after lock_sock() is
safe or not, as lock_sock() seems to have mutex_lock() semantics.

Perhaps we may need to unlock - wait_event - lock in istop_sendmsg()?
If so, we also need to consider various possible thread interleaving
cases.


> > Or although I'm not sure about this, what about getting rid of
> > reverting so->tx.state to old_state?
> >
> > I think the concurrent execution of isotp_sendmsg() would be
> > problematic when reverting so->tx.state to old_state after goto'ing
> > err_out.
> Your described case in the original post indeed shows that this might lead
> to a problem.
>
> > There are two locations of "goto err_out", and
> > iostp_sendmsg() does nothing to the socket before both of "goto
> > err_out". So after goto'ing err_out, it seems fine for me even if we
> > do not revert so->tx.state to old_state.
> >
> > If I think correctly, this will make cmpxchg() work, and prevent the
> > problematic concurrent execution. Could you please check the patch
> > below?
>
> Hm, interesting idea.
>
> But in which state will so->tx.state be here:
>
> /* wait for complete transmission of current pdu */
> err = wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
> if (err)
> goto err_out;
>
>
> Should we better set the tx.state in the error case?
>
> if (err) {
> so->tx.state = ISOTP_IDLE;
> goto err_out;
> }
>
> Best regards,
> Oliver
>
> (..)

Hmm... my original thought was that 1) isotp_sendmsg() waiting the
event (so->tx.state == ISTOP_IDLE) does not touch anything related to
the socket as well as the sending process yet, so 2) this
isotp_sendmsg() does not need to change the tx.state if it returns an
error due to a signal. I'm not sure that we need to set tx.state in
this case. Do we still need to do it?


Best regards,
Dae R. Jeong.

2023-03-31 10:28:06

by Oliver Hartkopp

[permalink] [raw]
Subject: Re: WARNING in isotp_tx_timer_handler and WARNING in print_tainted

Hey all,

looking at the code simplification from Hillf I reworked some of the
cmpxchg() code and removed the old_state mechanism.

I posted a RFC patch here:

https://lore.kernel.org/linux-can/[email protected]/

Please comment on the patch whether you think if this could be an
improvement.

Many thanks,
Oliver

On 27.03.23 03:48, Hillf Danton wrote:
> On Sun, 26 Mar 2023 18:17:17 +0200 Oliver Hartkopp <[email protected]>
>> On 26.03.23 13:55, Dae R. Jeong wrote:
>>>
>>> If I think correctly, this will make cmpxchg() work, and prevent the
>>> problematic concurrent execution. Could you please check the patch
>>> below?
>>
>> Hm, interesting idea.
>>
>> But in which state will so->tx.state be here:
>>
>> /* wait for complete transmission of current pdu */
>> err = wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
>> if (err)
>> goto err_out;
>>
>>
>> Should we better set the tx.state in the error case?
>>
>> if (err) {
>> so->tx.state = ISOTP_IDLE;
>> goto err_out;
>> }
>>
>> Best regards,
>> Oliver
>
> Another 2c only if cmpxchg is preferred.
>
> +++ b/net/can/isotp.c
> @@ -932,19 +932,24 @@ static int isotp_sendmsg(struct socket *
> return -EADDRNOTAVAIL;
>
> /* we do not support multiple buffers - for now */
> - if (cmpxchg(&so->tx.state, ISOTP_IDLE, ISOTP_SENDING) != ISOTP_IDLE ||
> - wq_has_sleeper(&so->wait)) {
> - if (msg->msg_flags & MSG_DONTWAIT) {
> - err = -EAGAIN;
> - goto err_out;
> - }
> -
> + if (wq_has_sleeper(&so->wait)) {
> + if (msg->msg_flags & MSG_DONTWAIT)
> + return -EAGAIN;
> /* wait for complete transmission of current pdu */
> err = wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
> if (err)
> - goto err_out;
> + return err;
> + }
>
> - so->tx.state = ISOTP_SENDING;
> +again:
> + old_state = cmpxchg(&so->tx.state, ISOTP_IDLE, ISOTP_SENDING);
> + if (old_state != ISOTP_IDLE) {
> + if (msg->msg_flags & MSG_DONTWAIT)
> + return -EAGAIN;
> + err = wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
> + if (err)
> + return err;
> + goto again;
> }
>
> if (!size || size > MAX_MSG_LENGTH) {