From: Peilin Ye <[email protected]>
An O_NONBLOCK vsock_connect() request may try to reschedule
@connect_work. Consider the following vsock_connect() requests:
1. The 1st, non-blocking request schedules @connect_work, which will
expire after, say, 200 jiffies. Socket state is now SS_CONNECTING;
2. Later, the 2nd, blocking request gets interrupted by a signal after
5 jiffies while waiting for the connection to be established.
Socket state is back to SS_UNCONNECTED, and @connect_work will
expire after 100 jiffies;
3. Now, the 3rd, non-blocking request tries to schedule @connect_work
again, but @connect_work has already been scheduled, and will
expire in, say, 50 jiffies.
In this scenario, currently this 3rd request simply decreases the sock
reference count and returns. Instead, let it reschedules @connect_work
and resets the timeout back to @connect_timeout.
Signed-off-by: Peilin Ye <[email protected]>
---
Hi all,
This patch is RFC because it bases on Stefano's WIP fix [1] for a bug [2]
reported by syzbot, and it won't apply on current net-next. I think it
solves a separate issue.
Please advise, thanks!
Peilin Ye
[1] https://gitlab.com/sgarzarella/linux/-/commit/2d0f0b9cbbb30d58fdcbca7c1a857fd8f3110d61
[2] https://syzkaller.appspot.com/bug?id=cd9103dc63346d26acbbdbf5c6ba9bd74e48c860
net/vmw_vsock/af_vsock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 194d22291d8b..417e4ad17c03 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1395,7 +1395,7 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
/* If the timeout function is already scheduled, ungrab
* the socket refcount to not leave it unbalanced.
*/
- if (!schedule_delayed_work(&vsk->connect_work, timeout))
+ if (mod_delayed_work(system_wq, &vsk->connect_work, timeout))
sock_put(sk);
/* Skip ahead to preserve error code set above. */
--
2.20.1
On Wed, Aug 03, 2022 at 07:09:25PM -0700, Peilin Ye wrote:
>From: Peilin Ye <[email protected]>
>
>An O_NONBLOCK vsock_connect() request may try to reschedule
>@connect_work. Consider the following vsock_connect() requests:
>
> 1. The 1st, non-blocking request schedules @connect_work, which will
> expire after, say, 200 jiffies. Socket state is now SS_CONNECTING;
>
> 2. Later, the 2nd, blocking request gets interrupted by a signal after
> 5 jiffies while waiting for the connection to be established.
> Socket state is back to SS_UNCONNECTED, and @connect_work will
> expire after 100 jiffies;
>
> 3. Now, the 3rd, non-blocking request tries to schedule @connect_work
> again, but @connect_work has already been scheduled, and will
> expire in, say, 50 jiffies.
>
>In this scenario, currently this 3rd request simply decreases the sock
>reference count and returns. Instead, let it reschedules @connect_work
>and resets the timeout back to @connect_timeout.
>
>Signed-off-by: Peilin Ye <[email protected]>
>---
>Hi all,
>
>This patch is RFC because it bases on Stefano's WIP fix [1] for a bug
>[2]
>reported by syzbot, and it won't apply on current net-next. I think it
>solves a separate issue.
Nice, this is better!
Feel free to include my patch in this (inclunding also the Fixes tag and
maybe senidng to syzbot and including its tag as well).
The last thing I was trying to figure out before sending the patch was
whether to set sock->state = SS_UNCONNECTED in vsock_connect_timeout().
I think we should do that, otherwise a subsequent to connect() with
O_NONBLOCK set would keep returning -EALREADY, even though the timeout
has expired.
What do you think?
I don't think it changes anything for the bug raised by sysbot, so it
could be a separate patch.
Thanks,
Stefano
>
>Please advise, thanks!
>Peilin Ye
>
>[1] https://gitlab.com/sgarzarella/linux/-/commit/2d0f0b9cbbb30d58fdcbca7c1a857fd8f3110d61
>[2] https://syzkaller.appspot.com/bug?id=cd9103dc63346d26acbbdbf5c6ba9bd74e48c860
>
> net/vmw_vsock/af_vsock.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 194d22291d8b..417e4ad17c03 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1395,7 +1395,7 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> /* If the timeout function is already scheduled, ungrab
> * the socket refcount to not leave it unbalanced.
> */
>- if (!schedule_delayed_work(&vsk->connect_work, timeout))
>+ if (mod_delayed_work(system_wq, &vsk->connect_work, timeout))
> sock_put(sk);
>
> /* Skip ahead to preserve error code set above. */
>--
>2.20.1
>
Hi Stefano,
On Thu, Aug 04, 2022 at 08:59:23AM +0200, Stefano Garzarella wrote:
> The last thing I was trying to figure out before sending the patch was
> whether to set sock->state = SS_UNCONNECTED in vsock_connect_timeout().
>
> I think we should do that, otherwise a subsequent to connect() with
> O_NONBLOCK set would keep returning -EALREADY, even though the timeout has
> expired.
>
> What do you think?
Thanks for bringing this up, after thinking about sock->state, I have 3
thoughts:
1. I think the root cause of this memleak is, we keep @connect_work
pending, even after the 2nd, blocking request times out (or gets
interrupted) and sets sock->state back to SS_UNCONNECTED.
@connect_work is effectively no-op when sk->sk_state is
TCP_CLOS{E,ING} anyway, so why not we just cancel @connect_work when
blocking requests time out or get interrupted? Something like:
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index f04abf662ec6..62628af84164 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1402,6 +1402,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
lock_sock(sk);
if (signal_pending(current)) {
+ if (cancel_delayed_work(&vsk->connect_work))
+ sock_put(sk);
+
err = sock_intr_errno(timeout);
sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE;
sock->state = SS_UNCONNECTED;
@@ -1409,6 +1412,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
vsock_remove_connected(vsk);
goto out_wait;
} else if (timeout == 0) {
+ if (cancel_delayed_work(&vsk->connect_work))
+ sock_put(sk);
+
err = -ETIMEDOUT;
sk->sk_state = TCP_CLOSE;
sock->state = SS_UNCONNECTED;
Then no need to worry about rescheduling @connect_work, and the state
machine becomes more accurate. What do you think? I will ask syzbot
to test this.
2. About your suggestion of setting sock->state = SS_UNCONNECTED in
vsock_connect_timeout(), I think it makes sense. Are you going to
send a net-next patch for this?
3. After a TCP_SYN_SENT sock receives VIRTIO_VSOCK_OP_RESPONSE in
virtio_transport_recv_connecting(), why don't we cancel @connect_work?
Am I missing something?
Thanks,
Peilin Ye
On Thu, Aug 04, 2022 at 04:44:47PM -0700, Peilin Ye wrote:
>Hi Stefano,
>
>On Thu, Aug 04, 2022 at 08:59:23AM +0200, Stefano Garzarella wrote:
>> The last thing I was trying to figure out before sending the patch was
>> whether to set sock->state = SS_UNCONNECTED in vsock_connect_timeout().
>>
>> I think we should do that, otherwise a subsequent to connect() with
>> O_NONBLOCK set would keep returning -EALREADY, even though the timeout has
>> expired.
>>
>> What do you think?
>
>Thanks for bringing this up, after thinking about sock->state, I have 3
>thoughts:
>
>1. I think the root cause of this memleak is, we keep @connect_work
> pending, even after the 2nd, blocking request times out (or gets
> interrupted) and sets sock->state back to SS_UNCONNECTED.
>
> @connect_work is effectively no-op when sk->sk_state is
> TCP_CLOS{E,ING} anyway, so why not we just cancel @connect_work when
> blocking requests time out or get interrupted? Something like:
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index f04abf662ec6..62628af84164 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1402,6 +1402,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> lock_sock(sk);
>
> if (signal_pending(current)) {
>+ if (cancel_delayed_work(&vsk->connect_work))
>+ sock_put(sk);
>+
> err = sock_intr_errno(timeout);
> sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE;
> sock->state = SS_UNCONNECTED;
>@@ -1409,6 +1412,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> vsock_remove_connected(vsk);
> goto out_wait;
> } else if (timeout == 0) {
>+ if (cancel_delayed_work(&vsk->connect_work))
>+ sock_put(sk);
>+
> err = -ETIMEDOUT;
> sk->sk_state = TCP_CLOSE;
> sock->state = SS_UNCONNECTED;
>
> Then no need to worry about rescheduling @connect_work, and the state
> machine becomes more accurate. What do you think? I will ask syzbot
> to test this.
It could work, but should we set `sk->sk_err` and call sk_error_report()
to wake up thread waiting on poll()?
Maybe the previous version is simpler.
>
>2. About your suggestion of setting sock->state = SS_UNCONNECTED in
> vsock_connect_timeout(), I think it makes sense. Are you going to
> send a net-next patch for this?
If you have time, feel free to send it.
Since it is a fix, I believe you can use the "net" tree. (Also for this
patch).
Remember to put the "Fixes" tag that should be the same.
>
>3. After a TCP_SYN_SENT sock receives VIRTIO_VSOCK_OP_RESPONSE in
> virtio_transport_recv_connecting(), why don't we cancel
> @connect_work?
> Am I missing something?
Because when the timeout will fire, vsock_connect_timeout() will just
call sock_put() since sk->sk_state is changed.
Of course, we can cancel it if we want, but I think it's not worth it.
In the end, this rescheduling patch should solve all the problems.
Thanks,
Stefano
On Fri, Aug 05, 2022 at 02:42:39PM +0200, Stefano Garzarella wrote:
> On Thu, Aug 04, 2022 at 04:44:47PM -0700, Peilin Ye wrote:
> > 1. I think the root cause of this memleak is, we keep @connect_work
> > pending, even after the 2nd, blocking request times out (or gets
> > interrupted) and sets sock->state back to SS_UNCONNECTED.
> >
> > @connect_work is effectively no-op when sk->sk_state is
> > TCP_CLOS{E,ING} anyway, so why not we just cancel @connect_work when
> > blocking requests time out or get interrupted? Something like:
> >
> > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> > index f04abf662ec6..62628af84164 100644
> > --- a/net/vmw_vsock/af_vsock.c
> > +++ b/net/vmw_vsock/af_vsock.c
> > @@ -1402,6 +1402,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> > lock_sock(sk);
> >
> > if (signal_pending(current)) {
> > + if (cancel_delayed_work(&vsk->connect_work))
> > + sock_put(sk);
> > +
> > err = sock_intr_errno(timeout);
> > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE;
> > sock->state = SS_UNCONNECTED;
> > @@ -1409,6 +1412,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> > vsock_remove_connected(vsk);
> > goto out_wait;
> > } else if (timeout == 0) {
> > + if (cancel_delayed_work(&vsk->connect_work))
> > + sock_put(sk);
> > +
> > err = -ETIMEDOUT;
> > sk->sk_state = TCP_CLOSE;
> > sock->state = SS_UNCONNECTED;
> >
> > Then no need to worry about rescheduling @connect_work, and the state
> > machine becomes more accurate. What do you think? I will ask syzbot
> > to test this.
>
> It could work, but should we set `sk->sk_err` and call sk_error_report() to
> wake up thread waiting on poll()?
>
> Maybe the previous version is simpler.
Right, I forgot about sk_error_report(). Let us use the simpler version
then.
> > 2. About your suggestion of setting sock->state = SS_UNCONNECTED in
> > vsock_connect_timeout(), I think it makes sense. Are you going to
> > send a net-next patch for this?
>
> If you have time, feel free to send it.
>
> Since it is a fix, I believe you can use the "net" tree. (Also for this
> patch).
>
> Remember to put the "Fixes" tag that should be the same.
Sure, I will send them this week. Thanks!
Peilin Ye
From: Peilin Ye <[email protected]>
An O_NONBLOCK vsock_connect() request may try to reschedule
@connect_work. Imagine the following sequence of vsock_connect()
requests:
1. The 1st, non-blocking request schedules @connect_work, which will
expire after 200 jiffies. Socket state is now SS_CONNECTING;
2. Later, the 2nd, blocking request gets interrupted by a signal after
a few jiffies while waiting for the connection to be established.
Socket state is back to SS_UNCONNECTED, but @connect_work is still
pending, and will expire after 100 jiffies.
3. Now, the 3rd, non-blocking request tries to schedule @connect_work
again. Since @connect_work is already scheduled,
schedule_delayed_work() silently returns. sock_hold() is called
twice, but sock_put() will only be called once in
vsock_connect_timeout(), causing a memory leak reported by syzbot:
BUG: memory leak
unreferenced object 0xffff88810ea56a40 (size 1232):
comm "syz-executor756", pid 3604, jiffies 4294947681 (age 12.350s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
28 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 (..@............
backtrace:
[<ffffffff837c830e>] sk_prot_alloc+0x3e/0x1b0 net/core/sock.c:1930
[<ffffffff837cbe22>] sk_alloc+0x32/0x2e0 net/core/sock.c:1989
[<ffffffff842ccf68>] __vsock_create.constprop.0+0x38/0x320 net/vmw_vsock/af_vsock.c:734
[<ffffffff842ce8f1>] vsock_create+0xc1/0x2d0 net/vmw_vsock/af_vsock.c:2203
[<ffffffff837c0cbb>] __sock_create+0x1ab/0x2b0 net/socket.c:1468
[<ffffffff837c3acf>] sock_create net/socket.c:1519 [inline]
[<ffffffff837c3acf>] __sys_socket+0x6f/0x140 net/socket.c:1561
[<ffffffff837c3bba>] __do_sys_socket net/socket.c:1570 [inline]
[<ffffffff837c3bba>] __se_sys_socket net/socket.c:1568 [inline]
[<ffffffff837c3bba>] __x64_sys_socket+0x1a/0x20 net/socket.c:1568
[<ffffffff84512815>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<ffffffff84512815>] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
[<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae
<...>
Use mod_delayed_work() instead: if @connect_work is already scheduled,
reschedule it, and undo sock_hold() to keep the reference count
balanced.
Reported-and-tested-by: [email protected]
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Co-developed-by: Stefano Garzarella <[email protected]>
Signed-off-by: Stefano Garzarella <[email protected]>
Signed-off-by: Peilin Ye <[email protected]>
---
change since v1:
- merged with Stefano's patch [1]
[1] https://gitlab.com/sgarzarella/linux/-/commit/2d0f0b9cbbb30d58fdcbca7c1a857fd8f3110d61
Hi Stefano,
About the Fixes: tag, [2] introduced @connect_work, but all it did was
breaking @dwork into two and moving some INIT_DELAYED_WORK()'s, so I don't
think [2] introduced this memory leak?
Since [2] has already been backported to 4.9 and 4.14, I think we can
Fixes: commit d021c344051a ("VSOCK: Introduce VM Sockets"), too, to make
backporting easier?
[2] commit 455f05ecd2b2 ("vsock: split dwork to avoid reinitializations")
Thanks,
Peilin Ye
net/vmw_vsock/af_vsock.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index f04abf662ec6..fe14f6cbca22 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1391,7 +1391,13 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
* timeout fires.
*/
sock_hold(sk);
- schedule_delayed_work(&vsk->connect_work, timeout);
+
+ /* If the timeout function is already scheduled,
+ * reschedule it, then ungrab the socket refcount to
+ * keep it balanced.
+ */
+ if (mod_delayed_work(system_wq, &vsk->connect_work, timeout))
+ sock_put(sk);
/* Skip ahead to preserve error code set above. */
goto out_wait;
--
2.20.1
On Sun, Aug 07, 2022 at 02:00:11AM -0700, Peilin Ye wrote:
>From: Peilin Ye <[email protected]>
>
>An O_NONBLOCK vsock_connect() request may try to reschedule
>@connect_work. Imagine the following sequence of vsock_connect()
>requests:
>
> 1. The 1st, non-blocking request schedules @connect_work, which will
> expire after 200 jiffies. Socket state is now SS_CONNECTING;
>
> 2. Later, the 2nd, blocking request gets interrupted by a signal after
> a few jiffies while waiting for the connection to be established.
> Socket state is back to SS_UNCONNECTED, but @connect_work is still
> pending, and will expire after 100 jiffies.
>
> 3. Now, the 3rd, non-blocking request tries to schedule @connect_work
> again. Since @connect_work is already scheduled,
> schedule_delayed_work() silently returns. sock_hold() is called
> twice, but sock_put() will only be called once in
> vsock_connect_timeout(), causing a memory leak reported by syzbot:
>
> BUG: memory leak
> unreferenced object 0xffff88810ea56a40 (size 1232):
> comm "syz-executor756", pid 3604, jiffies 4294947681 (age 12.350s)
> hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 28 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 (..@............
> backtrace:
> [<ffffffff837c830e>] sk_prot_alloc+0x3e/0x1b0 net/core/sock.c:1930
> [<ffffffff837cbe22>] sk_alloc+0x32/0x2e0 net/core/sock.c:1989
> [<ffffffff842ccf68>] __vsock_create.constprop.0+0x38/0x320 net/vmw_vsock/af_vsock.c:734
> [<ffffffff842ce8f1>] vsock_create+0xc1/0x2d0 net/vmw_vsock/af_vsock.c:2203
> [<ffffffff837c0cbb>] __sock_create+0x1ab/0x2b0 net/socket.c:1468
> [<ffffffff837c3acf>] sock_create net/socket.c:1519 [inline]
> [<ffffffff837c3acf>] __sys_socket+0x6f/0x140 net/socket.c:1561
> [<ffffffff837c3bba>] __do_sys_socket net/socket.c:1570 [inline]
> [<ffffffff837c3bba>] __se_sys_socket net/socket.c:1568 [inline]
> [<ffffffff837c3bba>] __x64_sys_socket+0x1a/0x20 net/socket.c:1568
> [<ffffffff84512815>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> [<ffffffff84512815>] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
> [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae
> <...>
>
>Use mod_delayed_work() instead: if @connect_work is already scheduled,
>reschedule it, and undo sock_hold() to keep the reference count
>balanced.
>
>Reported-and-tested-by: [email protected]
>Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
>Co-developed-by: Stefano Garzarella <[email protected]>
>Signed-off-by: Stefano Garzarella <[email protected]>
>Signed-off-by: Peilin Ye <[email protected]>
>---
>change since v1:
> - merged with Stefano's patch [1]
>
>[1] https://gitlab.com/sgarzarella/linux/-/commit/2d0f0b9cbbb30d58fdcbca7c1a857fd8f3110d61
>
>Hi Stefano,
>
>About the Fixes: tag, [2] introduced @connect_work, but all it did was
>breaking @dwork into two and moving some INIT_DELAYED_WORK()'s, so I don't
>think [2] introduced this memory leak?
>
>Since [2] has already been backported to 4.9 and 4.14, I think we can
>Fixes: commit d021c344051a ("VSOCK: Introduce VM Sockets"), too, to make
>backporting easier?
Yep, I think it should be fine!
>
>[2] commit 455f05ecd2b2 ("vsock: split dwork to avoid reinitializations")
>
>Thanks,
>Peilin Ye
>
> net/vmw_vsock/af_vsock.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index f04abf662ec6..fe14f6cbca22 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1391,7 +1391,13 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> * timeout fires.
> */
> sock_hold(sk);
>- schedule_delayed_work(&vsk->connect_work, timeout);
>+
>+ /* If the timeout function is already scheduled,
>+ * reschedule it, then ungrab the socket refcount to
>+ * keep it balanced.
>+ */
>+ if (mod_delayed_work(system_wq, &vsk->connect_work, timeout))
^
Checkpatch warns here about line lenght.
If you have to re-send, please split it.
Anyway, the patch LGTM:
Reviewed-by: Stefano Garzarella <[email protected]>
Thanks,
Stefano
On Mon, Aug 08, 2022 at 09:55:33AM +0200, Stefano Garzarella wrote:
> On Sun, Aug 07, 2022 at 02:00:11AM -0700, Peilin Ye wrote:
> > net/vmw_vsock/af_vsock.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> > index f04abf662ec6..fe14f6cbca22 100644
> > --- a/net/vmw_vsock/af_vsock.c
> > +++ b/net/vmw_vsock/af_vsock.c
> > @@ -1391,7 +1391,13 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> > * timeout fires.
> > */
> > sock_hold(sk);
> > - schedule_delayed_work(&vsk->connect_work, timeout);
> > +
> > + /* If the timeout function is already scheduled,
> > + * reschedule it, then ungrab the socket refcount to
> > + * keep it balanced.
> > + */
> > + if (mod_delayed_work(system_wq, &vsk->connect_work, timeout))
> ^
> Checkpatch warns here about line lenght.
> If you have to re-send, please split it.
Oh, net-next HEAD's checkpatch --strict didn't complain, I didn't know
Patchwork checks 80 columns. I will send v3 soon.
> Anyway, the patch LGTM:
>
> Reviewed-by: Stefano Garzarella <[email protected]>
Thanks!
Peilin Ye
From: Peilin Ye <[email protected]>
An O_NONBLOCK vsock_connect() request may try to reschedule
@connect_work. Imagine the following sequence of vsock_connect()
requests:
1. The 1st, non-blocking request schedules @connect_work, which will
expire after 200 jiffies. Socket state is now SS_CONNECTING;
2. Later, the 2nd, blocking request gets interrupted by a signal after
a few jiffies while waiting for the connection to be established.
Socket state is back to SS_UNCONNECTED, but @connect_work is still
pending, and will expire after 100 jiffies.
3. Now, the 3rd, non-blocking request tries to schedule @connect_work
again. Since @connect_work is already scheduled,
schedule_delayed_work() silently returns. sock_hold() is called
twice, but sock_put() will only be called once in
vsock_connect_timeout(), causing a memory leak reported by syzbot:
BUG: memory leak
unreferenced object 0xffff88810ea56a40 (size 1232):
comm "syz-executor756", pid 3604, jiffies 4294947681 (age 12.350s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
28 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 (..@............
backtrace:
[<ffffffff837c830e>] sk_prot_alloc+0x3e/0x1b0 net/core/sock.c:1930
[<ffffffff837cbe22>] sk_alloc+0x32/0x2e0 net/core/sock.c:1989
[<ffffffff842ccf68>] __vsock_create.constprop.0+0x38/0x320 net/vmw_vsock/af_vsock.c:734
[<ffffffff842ce8f1>] vsock_create+0xc1/0x2d0 net/vmw_vsock/af_vsock.c:2203
[<ffffffff837c0cbb>] __sock_create+0x1ab/0x2b0 net/socket.c:1468
[<ffffffff837c3acf>] sock_create net/socket.c:1519 [inline]
[<ffffffff837c3acf>] __sys_socket+0x6f/0x140 net/socket.c:1561
[<ffffffff837c3bba>] __do_sys_socket net/socket.c:1570 [inline]
[<ffffffff837c3bba>] __se_sys_socket net/socket.c:1568 [inline]
[<ffffffff837c3bba>] __x64_sys_socket+0x1a/0x20 net/socket.c:1568
[<ffffffff84512815>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<ffffffff84512815>] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
[<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae
<...>
Use mod_delayed_work() instead: if @connect_work is already scheduled,
reschedule it, and undo sock_hold() to keep the reference count
balanced.
Reported-and-tested-by: [email protected]
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Co-developed-by: Stefano Garzarella <[email protected]>
Signed-off-by: Stefano Garzarella <[email protected]>
Reviewed-by: Stefano Garzarella <[email protected]>
Signed-off-by: Peilin Ye <[email protected]>
---
change since v2:
- wrapped long line (Stefano)
change since v1:
- merged with Stefano's patch [1]
[1] https://gitlab.com/sgarzarella/linux/-/commit/2d0f0b9cbbb30d58fdcbca7c1a857fd8f3110d61
net/vmw_vsock/af_vsock.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index f04abf662ec6..4d68681f5abe 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1391,7 +1391,14 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
* timeout fires.
*/
sock_hold(sk);
- schedule_delayed_work(&vsk->connect_work, timeout);
+
+ /* If the timeout function is already scheduled,
+ * reschedule it, then ungrab the socket refcount to
+ * keep it balanced.
+ */
+ if (mod_delayed_work(system_wq, &vsk->connect_work,
+ timeout))
+ sock_put(sk);
/* Skip ahead to preserve error code set above. */
goto out_wait;
--
2.20.1
Hello:
This series was applied to netdev/net.git (master)
by David S. Miller <[email protected]>:
On Mon, 8 Aug 2022 11:04:47 -0700 you wrote:
> From: Peilin Ye <[email protected]>
>
> An O_NONBLOCK vsock_connect() request may try to reschedule
> @connect_work. Imagine the following sequence of vsock_connect()
> requests:
>
> 1. The 1st, non-blocking request schedules @connect_work, which will
> expire after 200 jiffies. Socket state is now SS_CONNECTING;
>
> [...]
Here is the summary with links:
- [net,v3,1/2] vsock: Fix memory leak in vsock_connect()
https://git.kernel.org/netdev/net/c/7e97cfed9929
- [net,v3,2/2] vsock: Set socket state back to SS_UNCONNECTED in vsock_connect_timeout()
https://git.kernel.org/netdev/net/c/a3e7b29e3085
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html