2019-07-28 18:33:47

by Dexuan Cui

Subject: [PATCH net] hv_sock: Fix hang when a connection is closed


hvs_do_close_lock_held() may decrease the reference count to 0 and free the
sk struct completely, and then the following release_sock(sk) may hang.

Fixes: a9eeb998c28d ("hv_sock: Add support for delayed close")
Signed-off-by: Dexuan Cui <[email protected]>
Cc: [email protected]

---
With the proper kernel debugging options enabled, a warning appears first:

kworker/1:0/4467 is freeing memory ..., with a lock still held there!
stack backtrace:
Workqueue: events vmbus_onmessage_work [hv_vmbus]
Call Trace:
dump_stack+0x67/0x90
debug_check_no_locks_freed.cold.52+0x78/0x7d
slab_free_freelist_hook+0x85/0x140
kmem_cache_free+0xa5/0x380
__sk_destruct+0x150/0x260
hvs_close_connection+0x24/0x30 [hv_sock]
vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
process_one_work+0x241/0x600
worker_thread+0x3c/0x390
kthread+0x11b/0x140
ret_from_fork+0x24/0x30

and then the following release_sock(sk) can hang:

watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:4467]
...
irq event stamp: 62890
CPU: 1 PID: 4467 Comm: kworker/1:0 Tainted: G W 5.2.0+ #39
Workqueue: events vmbus_onmessage_work [hv_vmbus]
RIP: 0010:queued_spin_lock_slowpath+0x2b/0x1e0
...
Call Trace:
do_raw_spin_lock+0xab/0xb0
release_sock+0x19/0xb0
vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
process_one_work+0x241/0x600
worker_thread+0x3c/0x390
kthread+0x11b/0x140
ret_from_fork+0x24/0x30

net/vmw_vsock/hyperv_transport.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index f2084e3f7aa4..efbda8ef1eff 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -309,9 +309,16 @@ static void hvs_close_connection(struct vmbus_channel *chan)
 {
 	struct sock *sk = get_per_channel_state(chan);
 
+	/* Grab an extra reference since hvs_do_close_lock_held() may decrease
+	 * the reference count to 0 by calling sock_put(sk).
+	 */
+	sock_hold(sk);
+
 	lock_sock(sk);
 	hvs_do_close_lock_held(vsock_sk(sk), true);
 	release_sock(sk);
+
+	sock_put(sk);
 }
 
 static void hvs_open_connection(struct vmbus_channel *chan)
--
2.19.1



2019-07-29 20:25:40

by Dexuan Cui

Subject: RE: [PATCH net] hv_sock: Fix hang when a connection is closed

> From: Sunil Muthuswamy <[email protected]>
> Sent: Monday, July 29, 2019 10:21 AM
> > --- a/net/vmw_vsock/hyperv_transport.c
> > +++ b/net/vmw_vsock/hyperv_transport.c
> > @@ -309,9 +309,16 @@ static void hvs_close_connection(struct vmbus_channel *chan)
> > {
> > struct sock *sk = get_per_channel_state(chan);
> >
> > + /* Grab an extra reference since hvs_do_close_lock_held() may decrease
> > + * the reference count to 0 by calling sock_put(sk).
> > + */
> > + sock_hold(sk);
> > +
>
> To me, it seems like when 'hvs_close_connection' is called, there should always
> be an outstanding reference to the socket.

I agree. There *should* be, but it turns out there is a race condition:

For an established connection that is being closed by the guest, the refcnt is 4
at the end of hvs_release() (note: 'remove_sock' is false here):

1 for the initial value;
1 for the sk being in the bound list;
1 for the sk being in the connected list;
1 for the delayed close_work.

After hvs_release() finishes, __vsock_release() -> sock_put(sk) *may* decrease
the refcnt to 3.

Concurrently, hvs_close_connection() runs in another thread: it calls
vsock_remove_sock(), which decreases the refcnt by 2, and then calls sock_put(),
which decreases the refcnt to 0 and frees the sk. The following release_sock(sk)
may then hang due to a use-after-free.

In the above, after hvs_release() finishes, if hvs_close_connection() runs
before "__vsock_release() -> sock_put(sk)", there is no issue, because at the
beginning of hvs_close_connection() the refcnt is still 4.
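
To make the two interleavings concrete, here is a tiny userspace model (purely
illustrative, not kernel code; the refcount arithmetic is from the description
above, everything else is hypothetical):

#include <stdio.h>

int main(void)
{
	int refcnt;

	/* No-issue ordering: hvs_close_connection() wins the race. */
	refcnt = 4;	/* at the end of hvs_release() */
	refcnt -= 2;	/* hvs_close_connection() -> vsock_remove_sock() */
	refcnt -= 1;	/* hvs_do_close_lock_held() -> sock_put(sk) */
	/* refcnt == 1, so release_sock(sk) is safe ... */
	refcnt -= 1;	/* ... then __vsock_release() -> sock_put(sk) frees sk */
	printf("close_connection first: final refcnt = %d\n", refcnt);

	/* Bad ordering: __vsock_release() -> sock_put(sk) runs first. */
	refcnt = 4;
	refcnt -= 1;	/* __vsock_release() -> sock_put(sk) */
	refcnt -= 2;	/* hvs_close_connection() -> vsock_remove_sock() */
	refcnt -= 1;	/* hvs_do_close_lock_held() -> sock_put(sk): sk is freed */
	printf("sock_put first: refcnt = %d before release_sock(sk) "
	       "=> use-after-free\n", refcnt);

	return 0;
}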

So, this patch can work, but it's not the right fix.
Your suggestion is correct; here is the updated patch.
I'll give it more tests and send a v2.

--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -312,6 +312,11 @@ static void hvs_close_connection(struct vmbus_channel *chan)
 	lock_sock(sk);
 	hvs_do_close_lock_held(vsock_sk(sk), true);
 	release_sock(sk);
+
+	/* Release the refcnt for the channel that's opened in
+	 * hvs_open_connection().
+	 */
+	sock_put(sk);
 }
 
 static void hvs_open_connection(struct vmbus_channel *chan)
@@ -407,6 +412,9 @@ static void hvs_open_connection(struct vmbus_channel *chan)
 	}
 
 	set_per_channel_state(chan, conn_from_host ? new : sk);
+
+	/* This reference will be dropped by hvs_close_connection(). */
+	sock_hold(conn_from_host ? new : sk);
 	vmbus_set_chn_rescind_callback(chan, hvs_close_connection);
 
 	/* Set the pending send size to max packet size to always get


> The reference that is dropped by
> 'hvs_do_close_lock_held' is a legitimate reference that was taken by
> 'hvs_close_lock_held'.

Correct.

> Or, in other words, I think the right solution is to always maintain a
> reference to the socket until this routine is called, and drop it here.
> That can be done by taking the reference to the socket prior to
> 'vmbus_set_chn_rescind_callback(chan, hvs_close_connection)' and
> dropping that reference at the end of 'hvs_close_connection'.
>
> > lock_sock(sk);
> > hvs_do_close_lock_held(vsock_sk(sk), true);
> > release_sock(sk);
> > +
> > + sock_put(sk);
>
> Thanks for taking a look at this. We should queue this fix and the other
> hvsocket fixes
> for the stable branch.

I added a "Cc: [email protected]" tag so this patch will go to the
stable kernels automatically.

Your previous two fixes are in the v5.2.4 stable kernel, but not in the other
longterm stable kernels 4.19 and 4.14:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=v5.2.4&qt=author&q=Muthuswamy

I'll request them to be backported for 4.19 and 4.14.
I'll also request the patch "vsock: correct removal of socket from the list"
to be backported.

The other two "hv_sock: perf" patches are features rather than fixes, and
the stable kernel maintainers usually don't backport feature patches.

Thanks,
-- Dexuan

2019-07-29 21:59:17

by Sunil Muthuswamy

Subject: RE: [PATCH net] hv_sock: Fix hang when a connection is closed



> -----Original Message-----
> From: Dexuan Cui <[email protected]>
> Sent: Sunday, July 28, 2019 11:32 AM
> To: Sunil Muthuswamy <[email protected]>; David Miller <[email protected]>; [email protected]
> Cc: KY Srinivasan <[email protected]>; Haiyang Zhang <[email protected]>; Stephen Hemminger <[email protected]>;
> [email protected]; Michael Kelley <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]; vkuznets <[email protected]>;
> [email protected]
> Subject: [PATCH net] hv_sock: Fix hang when a connection is closed
>
> [...]
>
> diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
> index f2084e3f7aa4..efbda8ef1eff 100644
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -309,9 +309,16 @@ static void hvs_close_connection(struct vmbus_channel *chan)
> {
> struct sock *sk = get_per_channel_state(chan);
>
> + /* Grab an extra reference since hvs_do_close_lock_held() may decrease
> + * the reference count to 0 by calling sock_put(sk).
> + */
> + sock_hold(sk);
> +

To me, it seems like when 'hvs_close_connection' is called, there should always be
an outstanding reference to the socket. The reference that is dropped by
'hvs_do_close_lock_held' is a legitimate reference that was taken by
'hvs_close_lock_held'. Or, in other words, I think the right solution is to always
maintain a reference to the socket until this routine is called, and drop it here.
That can be done by taking the reference to the socket prior to
'vmbus_set_chn_rescind_callback(chan, hvs_close_connection)' and dropping that
reference at the end of 'hvs_close_connection'. (See the refcount sketch at the
end of this message.)

> lock_sock(sk);
> hvs_do_close_lock_held(vsock_sk(sk), true);
> release_sock(sk);
> +
> + sock_put(sk);
> }
>
> static void hvs_open_connection(struct vmbus_channel *chan)
> --
> 2.19.1

Thanks for taking a look at this. We should queue this fix and the other hvsocket fixes
for the stable branch.
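
By the accounting in Dexuan's reply above, a tiny userspace model (purely
illustrative, not kernel code; the numbers come from that reply, the rest is
hypothetical) shows why the suggested extra reference is enough: the channel
takes one more hold in hvs_open_connection() before the rescind callback is
registered, so the refcnt at the end of hvs_release() becomes 5, and even the
losing ordering keeps the sock alive through release_sock():

#include <stdio.h>

int main(void)
{
	int refcnt = 5;	/* the 4 refs from before, plus 1 held for the channel */

	refcnt -= 1;	/* __vsock_release() -> sock_put(sk) */
	refcnt -= 2;	/* hvs_close_connection() -> vsock_remove_sock() */
	refcnt -= 1;	/* hvs_do_close_lock_held() -> sock_put(sk) */
	/* refcnt == 1: release_sock(sk) is safe in either ordering */
	refcnt -= 1;	/* final sock_put(sk) at the end of hvs_close_connection() */
	printf("with the channel's hold: final refcnt = %d\n", refcnt);

	return 0;
}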