2017-12-04 00:17:32

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH v3] SUNRPC: Fix a race in the receive code path

We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot
race with the call to xprt_complete_rqst().

Reported-by: Chuck Lever <[email protected]>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=317
Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect..")
Cc: [email protected] # 4.14+
Reviewed-by: Chuck Lever <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/xprt.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 333b9d697ae5..5e4278e9ce37 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1024,6 +1024,7 @@ void xprt_transmit(struct rpc_task *task)
 	} else if (!req->rq_bytes_sent)
 		return;

+	req->rq_connect_cookie = xprt->connect_cookie;
 	req->rq_xtime = ktime_get();
 	status = xprt->ops->send_request(task);
 	trace_xprt_transmit(xprt, req->rq_xid, status);
@@ -1047,20 +1048,27 @@ void xprt_transmit(struct rpc_task *task)
 	xprt->stat.bklog_u += xprt->backlog.qlen;
 	xprt->stat.sending_u += xprt->sending.qlen;
 	xprt->stat.pending_u += xprt->pending.qlen;
+	spin_unlock_bh(&xprt->transport_lock);

-	/* Don't race with disconnect */
-	if (!xprt_connected(xprt))
-		task->tk_status = -ENOTCONN;
-	else {
+	if (rpc_reply_expected(task) && !READ_ONCE(req->rq_reply_bytes_recvd)) {
 		/*
-		 * Sleep on the pending queue since
-		 * we're expecting a reply.
+		 * Sleep on the pending queue if we're expecting a reply.
+		 * The spinlock ensures atomicity between the test of
+		 * req->rq_reply_bytes_recvd, and the call to rpc_sleep_on().
 		 */
-		if (!req->rq_reply_bytes_recvd && rpc_reply_expected(task))
+		spin_lock(&xprt->recv_lock);
+		if (!req->rq_reply_bytes_recvd) {
 			rpc_sleep_on(&xprt->pending, task, xprt_timer);
-		req->rq_connect_cookie = xprt->connect_cookie;
+			/*
+			 * Send an extra queue wakeup call if the
+			 * connection was dropped in case the call to
+			 * rpc_sleep_on() raced.
+			 */
+			if (!xprt_connected(xprt))
+				xprt_wake_pending_tasks(xprt, -ENOTCONN);
+		}
+		spin_unlock(&xprt->recv_lock);
 	}
-	spin_unlock_bh(&xprt->transport_lock);
 }

 static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task *task)
--
2.14.3
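

In the pre-patch code above, the test of req->rq_reply_bytes_recvd was made
while holding transport_lock, but xprt_complete_rqst() runs under the separate
recv_lock added by the commit named in the Fixes: tag, so nothing ordered the
test against the completion; the patch takes recv_lock around both the test
and rpc_sleep_on(). As a minimal userspace analogy of that check-then-sleep
discipline (pthreads; every name below is invented for illustration and none
of this is kernel code):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-ins for the request/transport state; not kernel code. */
struct fake_rqst {
	pthread_mutex_t recv_lock;	/* plays the role of xprt->recv_lock */
	pthread_cond_t pending;		/* plays the role of the pending queue */
	bool reply_received;		/* plays the role of rq_reply_bytes_recvd */
};

/* Transmit side: test and sleep under the same lock, as the patch does. */
static void fake_transmit_wait(struct fake_rqst *req)
{
	pthread_mutex_lock(&req->recv_lock);
	while (!req->reply_received)
		pthread_cond_wait(&req->pending, &req->recv_lock);
	pthread_mutex_unlock(&req->recv_lock);
}

/* Receive side: the analogue of xprt_complete_rqst(). */
static void *fake_complete(void *arg)
{
	struct fake_rqst *req = arg;

	pthread_mutex_lock(&req->recv_lock);
	req->reply_received = true;
	pthread_cond_signal(&req->pending);
	pthread_mutex_unlock(&req->recv_lock);
	return NULL;
}

int main(void)
{
	struct fake_rqst req = {
		.recv_lock = PTHREAD_MUTEX_INITIALIZER,
		.pending = PTHREAD_COND_INITIALIZER,
		.reply_received = false,
	};
	pthread_t rx;

	/* The "reply" may land before or after the waiter starts waiting;
	 * holding recv_lock across the test and the sleep means the wakeup
	 * cannot be lost either way. */
	pthread_create(&rx, NULL, fake_complete, &req);
	fake_transmit_wait(&req);
	pthread_join(rx, NULL);
	printf("reply observed, request completed\n");
	return 0;
}

If fake_complete() could run between the test and the wait without a shared
lock, the wakeup would be lost and the waiter would sleep until an external
timer rescued it, which is the symptom discussed later in this thread.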



2017-12-13 16:15:42

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path

Hi Trond-

> On Dec 3, 2017, at 7:17 PM, Trond Myklebust <[email protected]> wrote:
> 
> We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot
> race with the call to xprt_complete_rqst().
> 
> Reported-by: Chuck Lever <[email protected]>
> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=317
> Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect..")
> Cc: [email protected] # 4.14+
> Reviewed-by: Chuck Lever <[email protected]>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/xprt.c | 26 +++++++++++++++++---------
> 1 file changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 333b9d697ae5..5e4278e9ce37 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1024,6 +1024,7 @@ void xprt_transmit(struct rpc_task *task)
> 	} else if (!req->rq_bytes_sent)
> 		return;
> 
> +	req->rq_connect_cookie = xprt->connect_cookie;
> 	req->rq_xtime = ktime_get();
> 	status = xprt->ops->send_request(task);
> 	trace_xprt_transmit(xprt, req->rq_xid, status);
> @@ -1047,20 +1048,27 @@ void xprt_transmit(struct rpc_task *task)
> 	xprt->stat.bklog_u += xprt->backlog.qlen;
> 	xprt->stat.sending_u += xprt->sending.qlen;
> 	xprt->stat.pending_u += xprt->pending.qlen;
> +	spin_unlock_bh(&xprt->transport_lock);
> 
> -	/* Don't race with disconnect */
> -	if (!xprt_connected(xprt))
> -		task->tk_status = -ENOTCONN;
> -	else {
> +	if (rpc_reply_expected(task) && !READ_ONCE(req->rq_reply_bytes_recvd)) {
> 		/*
> -		 * Sleep on the pending queue since
> -		 * we're expecting a reply.
> +		 * Sleep on the pending queue if we're expecting a reply.
> +		 * The spinlock ensures atomicity between the test of
> +		 * req->rq_reply_bytes_recvd, and the call to rpc_sleep_on().
> 		 */
> -		if (!req->rq_reply_bytes_recvd && rpc_reply_expected(task))
> +		spin_lock(&xprt->recv_lock);
> +		if (!req->rq_reply_bytes_recvd) {
> 			rpc_sleep_on(&xprt->pending, task, xprt_timer);
> -		req->rq_connect_cookie = xprt->connect_cookie;
> +			/*
> +			 * Send an extra queue wakeup call if the
> +			 * connection was dropped in case the call to
> +			 * rpc_sleep_on() raced.
> +			 */
> +			if (!xprt_connected(xprt))
> +				xprt_wake_pending_tasks(xprt, -ENOTCONN);
> +		}
> +		spin_unlock(&xprt->recv_lock);
> 	}
> -	spin_unlock_bh(&xprt->transport_lock);
> }
> 
> static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task *task)
> -- 
> 2.14.3

I've run into a problem with this version of the patch (on v4.15-rc3).

With dbench on NFSv4.0 (on RDMA) I no longer see constant throughput
and a temporarily climbing latency followed by a recovery, but this
can happen on occasion:

releasing clients
16 584 322.55 MB/sec warmup 1 sec latency 807.121 ms
16 584 161.28 MB/sec warmup 2 sec latency 1807.175 ms
16 584 107.52 MB/sec warmup 3 sec latency 2807.198 ms
16 584 80.64 MB/sec warmup 4 sec latency 3807.217 ms
16 584 64.51 MB/sec warmup 5 sec latency 4807.235 ms
16 584 53.76 MB/sec warmup 6 sec latency 5807.251 ms
16 584 46.08 MB/sec warmup 7 sec latency 6807.269 ms
16 584 40.32 MB/sec warmup 8 sec latency 7807.285 ms
16 584 35.84 MB/sec warmup 9 sec latency 8807.303 ms
16 584 32.26 MB/sec warmup 10 sec latency 9807.318 ms
16 584 29.32 MB/sec warmup 11 sec latency 10807.334 ms
16 584 26.88 MB/sec warmup 12 sec latency 11807.355 ms
16 584 24.81 MB/sec warmup 13 sec latency 12807.372 ms
16 584 23.04 MB/sec warmup 14 sec latency 13807.387 ms
16 584 21.50 MB/sec warmup 15 sec latency 14807.406 ms
16 584 20.16 MB/sec warmup 16 sec latency 15807.423 ms

No recovery. The latency number keeps climbing, and throughput
drops.

"kernel: nfs: server klimt-ib not responding, still trying"
appears in the client's /var/log/messages.

Without this patch applied, the NFSv4.0 behavior is the same as I
reported with NFSv3: every once in a while, one RPC completion is
lost, but when the timer fires, the client notices the reply
actually did arrive and the RPC completes normally.

I'm looking into it.


--
Chuck Lever
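

The recovery Chuck describes in the unpatched case comes from the RPC timer:
when it fires, the timeout handler re-checks whether reply bytes were already
recorded for the request and, if so, completes the call instead of treating it
as a timeout. A toy model of that decision (field and function names are
invented; this is not the kernel's xprt_timer()):

#include <errno.h>
#include <stdio.h>

/* Toy model of the timer-side check; not the kernel's xprt_timer(). */
struct toy_rqst {
	int reply_bytes_recvd;	/* how much of the reply has been copied in */
	int task_status;	/* -ETIMEDOUT when the timer fires */
};

static void toy_timer_fired(struct toy_rqst *req)
{
	if (req->task_status != -ETIMEDOUT)
		return;
	if (req->reply_bytes_recvd)
		req->task_status = 0;	/* reply did arrive: complete normally */
	/* otherwise the reply really is missing and the task stays timed out */
}

int main(void)
{
	/* A completion was "lost", but the reply data is already here. */
	struct toy_rqst req = { .reply_bytes_recvd = 120, .task_status = -ETIMEDOUT };

	toy_timer_fired(&req);
	printf("status after timer: %d\n", req.task_status);	/* prints 0 */
	return 0;
}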




2017-12-13 23:43:13

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path


> On Dec 13, 2017, at 11:14 AM, Chuck Lever <[email protected]> wrote:
> 
> Hi Trond-
> 
>> On Dec 3, 2017, at 7:17 PM, Trond Myklebust <[email protected]> wrote:
>> 
>> We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot
>> race with the call to xprt_complete_rqst().
>> 
>> [...]
> 
> I've run into a problem with this version of the patch (on v4.15-rc3).
> 
> With dbench on NFSv4.0 (on RDMA) I no longer see constant throughput
> and a temporarily climbing latency followed by a recovery, but this
> can happen on occasion:
> 
> [...]
> 
> No recovery. The latency number keeps climbing, and throughput
> drops.
> 
> "kernel: nfs: server klimt-ib not responding, still trying"
> appears in the client's /var/log/messages.
> 
> Without this patch applied, the NFSv4.0 behavior is the same as I
> reported with NFSv3: every once in a while, one RPC completion is
> lost, but when the timer fires, the client notices the reply
> actually did arrive and the RPC completes normally.
> 
> I'm looking into it.

Root cause:

The transport is temporarily out of resources and returns -ENOBUFS
from ->send_request.

The FSM schedules a delay then calls call_transmit again, which
invokes xprt_prepare_transmit.

On NFSv3 mounts, xprt_prepare_transmit tries to acquire the
transport write lock.

On NFSv4 mounts, RPC_TASK_NO_RETRANS_TIMEOUT is set. This causes
xprt_prepare_transmit to see that the transport is connected and
rq_connect_cookie is the same as xprt->connect_cookie, so it
queues the task on ->pending and returns false.

The transport never sends the Call, but still waits for a Reply,
forever. This is not a bug in your patch, it is a pre-existing
issue.

There's a simple fix, which is to set

    rq_connect_cookie = xprt->connect_cookie - 1;

if RPC-over-RDMA header marshaling fails. This is a one-line fix,
appropriate for -rc and backport to stable (to at least v4.8).

A more robust fix might be to have ->send_request return -EAGAIN
instead, and let xprtrdma call xprt_write_space when the resource
exhaustion is relieved. That might be more involved, and thus is
future-facing.

I'm investigating both fixes.


--
Chuck Lever
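

A condensed model of the xprt_prepare_transmit() check Chuck describes may
help: with RPC_TASK_NO_RETRANS_TIMEOUT set, a connected transport, and
rq_connect_cookie equal to xprt->connect_cookie, the retry is parked on
->pending instead of being retransmitted. The sketch below is illustrative
only (simplified names, not the kernel source):

#include <stdbool.h>
#include <stdio.h>

/* Simplified model of the check described above; not the kernel source. */
struct toy_req {
	unsigned int connect_cookie;	/* rq_connect_cookie */
	bool no_retrans_timeout;	/* RPC_TASK_NO_RETRANS_TIMEOUT (NFSv4) */
};

struct toy_xprt {
	unsigned int connect_cookie;	/* xprt->connect_cookie */
	bool connected;
};

/* Returns true if the caller may (re)transmit, false if the task is parked
 * on the pending queue to wait for a reply instead. */
static bool toy_prepare_transmit(const struct toy_req *req,
				 const struct toy_xprt *xprt)
{
	if (req->no_retrans_timeout && xprt->connected &&
	    req->connect_cookie == xprt->connect_cookie)
		return false;	/* "already sent on this connection": wait */
	return true;
}

int main(void)
{
	struct toy_xprt xprt = { .connect_cookie = 7, .connected = true };
	/* The v3 patch stamps the cookie before ->send_request(), so a send
	 * that failed with -ENOBUFS still leaves the cookies equal... */
	struct toy_req req = { .connect_cookie = 7, .no_retrans_timeout = true };

	/* ...and the retry is parked waiting for a reply to a Call that was
	 * never put on the wire. */
	printf("retry may transmit? %s\n",
	       toy_prepare_transmit(&req, &xprt) ? "yes" : "no (stuck)");
	return 0;
}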




2017-12-14 01:13:16

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path


> On Dec 13, 2017, at 6:42 PM, Chuck Lever <[email protected]> wrote:
> 
>> On Dec 13, 2017, at 11:14 AM, Chuck Lever <[email protected]> wrote:
>> 
>> Hi Trond-
>> 
>>> On Dec 3, 2017, at 7:17 PM, Trond Myklebust <[email protected]> wrote:
>>> 
>>> We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot
>>> race with the call to xprt_complete_rqst().
>>> 
>>> [...]
>> 
>> I've run into a problem with this version of the patch (on v4.15-rc3).
>> 
>> [...]
>> 
>> Without this patch applied, the NFSv4.0 behavior is the same as I
>> reported with NFSv3: every once in a while, one RPC completion is
>> lost, but when the timer fires, the client notices the reply
>> actually did arrive and the RPC completes normally.
>> 
>> I'm looking into it.
> 
> Root cause:
> 
> The transport is temporarily out of resources and returns -ENOBUFS
> from ->send_request.
> 
> The FSM schedules a delay then calls call_transmit again, which
> invokes xprt_prepare_transmit.
> 
> On NFSv3 mounts, xprt_prepare_transmit tries to acquire the
> transport write lock.
> 
> On NFSv4 mounts, RPC_TASK_NO_RETRANS_TIMEOUT is set. This causes
> xprt_prepare_transmit to see that the transport is connected and
> rq_connect_cookie is the same as xprt->connect_cookie, so it
> queues the task on ->pending and returns false.
> 
> The transport never sends the Call, but still waits for a Reply,
> forever. This is not a bug in your patch, it is a pre-existing
> issue.

Sorry for one more reply...

I think this last statement is incorrect. It is a bug in your
patch after all.

I forgot that your patch moves the line that sets

    req->rq_connect_cookie = xprt->connect_cookie;

to before the call to ->send_request. But rq_connect_cookie
must be set iff ->send_request is successful. Moving this
line seems to fix the issue:

1027         req->rq_xtime = ktime_get();
1028         status = xprt->ops->send_request(task);
1029         trace_xprt_transmit(xprt, req->rq_xid, status);
1030         if (status != 0) {
1031                 task->tk_status = status;
1032                 return;
1033         }
1034         req->rq_connect_cookie = xprt->connect_cookie;    <<<<
1035         xprt_inject_disconnect(xprt);


> There's a simple fix, which is to set
> 
>     rq_connect_cookie = xprt->connect_cookie - 1;
> 
> if RPC-over-RDMA header marshaling fails. This is a one-line fix,
> appropriate for -rc and backport to stable (to at least v4.8).

This fix does exactly the same thing.


> A more robust fix might be to have ->send_request return -EAGAIN
> instead, and let xprtrdma call xprt_write_space when the resource
> exhaustion is relieved. That might be more involved, and thus is
> future-facing.
> 
> I'm investigating both fixes.


--
Chuck Lever
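

Both proposals in this message reduce to the same invariant: after an
unsuccessful send, rq_connect_cookie must not equal xprt->connect_cookie, so
the short-circuit shown in the previous sketch cannot fire. A toy illustration
(plain C, invented names, not kernel code):

#include <errno.h>
#include <stdio.h>

/* Toy illustration: after a failed send, both proposals leave the request's
 * cookie different from the transport's current cookie. */
int main(void)
{
	unsigned int xprt_cookie = 7;			/* xprt->connect_cookie */
	unsigned int rq_cookie = xprt_cookie - 1;	/* stamped on an older connection */
	int send_status = -ENOBUFS;			/* the send failed */

	/* Chuck's ordering: only stamp the current cookie on success. */
	if (send_status == 0)
		rq_cookie = xprt_cookie;

	/* The marshal-failure variant writes xprt_cookie - 1 explicitly,
	 * which produces exactly the same inequality. */
	printf("cookies equal after failed send? %s\n",
	       rq_cookie == xprt_cookie ? "yes (stuck)" : "no (retry retransmits)");
	return 0;
}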




2017-12-14 12:16:26

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path

On Wed, 2017-12-13 at 20:13 -0500, Chuck Lever wrote:
> > On Dec 13, 2017, at 6:42 PM, Chuck Lever <[email protected]> wrote:
> > 
> > [...]
> 
> Sorry for one more reply...
> 
> I think this last statement is incorrect. It is a bug in your
> patch after all.
> 
> I forgot that your patch moves the line that sets
> 
>     req->rq_connect_cookie = xprt->connect_cookie;
> 
> to before the call to ->send_request. But rq_connect_cookie
> must be set iff ->send_request is successful. Moving this
> line seems to fix the issue:
> 
> 1027         req->rq_xtime = ktime_get();
> 1028         status = xprt->ops->send_request(task);
> 1029         trace_xprt_transmit(xprt, req->rq_xid, status);
> 1030         if (status != 0) {
> 1031                 task->tk_status = status;
> 1032                 return;
> 1033         }
> 1034         req->rq_connect_cookie = xprt->connect_cookie;    <<<<
> 1035         xprt_inject_disconnect(xprt);
> 

Let's rather apply the attached fix (incremental to the original patch
and intended to be applied using 'git rebase --autosquash'). The reason
for wanting to set the connect cookie before we call send_request() is
to avoid races with disconnects that happen immediately after we sent
the data (but before we get round to setting the cookie).

8<--------------------------------------------------------
From 326c12727b179d5c2fd42f374a4c0e2d56ee90c5 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Thu, 14 Dec 2017 07:05:27 -0500
Subject: [PATCH] fixup! SUNRPC: Fix a race in the receive code path

---
 net/sunrpc/xprt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 5e4278e9ce37..2cf4cf21a878 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1021,10 +1021,11 @@ void xprt_transmit(struct rpc_task *task)
 			/* Turn off autodisconnect */
 			del_singleshot_timer_sync(&xprt->timer);
 		}
+		if (!req->rq_bytes_sent)
+			req->rq_connect_cookie = xprt->connect_cookie;
 	} else if (!req->rq_bytes_sent)
 		return;
 
-	req->rq_connect_cookie = xprt->connect_cookie;
 	req->rq_xtime = ktime_get();
 	status = xprt->ops->send_request(task);
 	trace_xprt_transmit(xprt, req->rq_xid, status);
-- 
2.14.3

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]


2017-12-14 15:49:28

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path


> On Dec 14, 2017, at 7:16 AM, Trond Myklebust <[email protected]> wrote:
> 
> On Wed, 2017-12-13 at 20:13 -0500, Chuck Lever wrote:
>> 
>> [...]
>> 
>> I forgot that your patch moves the line that sets
>> 
>>     req->rq_connect_cookie = xprt->connect_cookie;
>> 
>> to before the call to ->send_request. But rq_connect_cookie
>> must be set iff ->send_request is successful. Moving this
>> line seems to fix the issue:
>> 
>> [...]
> 
> Let's rather apply the attached fix (incremental to the original patch
> and intended to be applied using 'git rebase --autosquash'). The reason
> for wanting to set the connect cookie before we call send_request() is
> to avoid races with disconnects that happen immediately after we sent
> the data (but before we get round to setting the cookie).
> 
> 8<--------------------------------------------------------
> From 326c12727b179d5c2fd42f374a4c0e2d56ee90c5 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Thu, 14 Dec 2017 07:05:27 -0500
> Subject: [PATCH] fixup! SUNRPC: Fix a race in the receive code path
> 
> ---
> net/sunrpc/xprt.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 5e4278e9ce37..2cf4cf21a878 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1021,10 +1021,11 @@ void xprt_transmit(struct rpc_task *task)
> 			/* Turn off autodisconnect */
> 			del_singleshot_timer_sync(&xprt->timer);
> 		}
> +		if (!req->rq_bytes_sent)
> +			req->rq_connect_cookie = xprt->connect_cookie;
> 	} else if (!req->rq_bytes_sent)
> 		return;
> 
> -	req->rq_connect_cookie = xprt->connect_cookie;
> 	req->rq_xtime = ktime_get();
> 	status = xprt->ops->send_request(task);
> 	trace_xprt_transmit(xprt, req->rq_xid, status);
> -- 
> 2.14.3

Tested with a client that returns -ENOBUFS from ->send_request
with a little extra frequency.

This doesn't appear to address the problem. The transport
deadlocks as before when send_request() fails to send the RPC.

I imagine this issue could also impact write buffer exhaustion
on TCP.


--
Chuck Lever
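

For reproducing this class of bug, one low-tech approach is simply to make the
send path fail occasionally, in the spirit of the test described above; the
sketch below only illustrates that idea and is not Chuck's actual test patch:

#include <errno.h>
#include <stdio.h>

/* Illustration only: fail a send path with -ENOBUFS "with a little extra
 * frequency". */
static int maybe_inject_enobufs(void)
{
	static unsigned int calls;

	if (++calls % 100 == 0)		/* roughly 1% of sends fail */
		return -ENOBUFS;
	return 0;			/* otherwise let the real send run */
}

int main(void)
{
	int injected = 0;

	for (int i = 0; i < 1000; i++)
		if (maybe_inject_enobufs() == -ENOBUFS)
			injected++;
	printf("injected %d ENOBUFS failures in 1000 sends\n", injected);
	return 0;
}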




2017-12-14 19:03:31

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path

On Thu, 2017-12-14 at 10:49 -0500, Chuck Lever wrote:
> > On Dec 14, 2017, at 7:16 AM, Trond Myklebust <[email protected]> wrote:
> > 
> > [...]
> > 
> > Let's rather apply the attached fix (incremental to the original patch
> > and intended to be applied using 'git rebase --autosquash'). The reason
> > for wanting to set the connect cookie before we call send_request() is
> > to avoid races with disconnects that happen immediately after we sent
> > the data (but before we get round to setting the cookie).
> > 
> > [...]
> 
> Tested with a client that returns -ENOBUFS from ->send_request
> with a little extra frequency.
> 
> This doesn't appear to address the problem. The transport
> deadlocks as before when send_request() fails to send the RPC.

Does the RDMA code update the connect cookie when the connection
breaks? It looks to me as if it only does that when the connection is
re-established. We really want both.

> I imagine this issue could also impact write buffer exhaustion
> on TCP.

See net/sunrpc/xprtsock.c:xs_tcp_state_change()

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
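

Trond's point is that the socket transport bumps xprt->connect_cookie on both
connection establishment and connection loss, so a request stamped on a
connection that later drops can never look current again; he is asking whether
xprtrdma does the same. A toy model of that bookkeeping (illustrative only;
the real logic lives in the xs_tcp_state_change() handler he cites):

#include <stdbool.h>
#include <stdio.h>

/* Toy model of the connect-cookie bookkeeping described above; not the
 * kernel's TCP state-change handler. */
struct toy_xprt {
	unsigned int connect_cookie;
	bool connected;
};

static void toy_connection_up(struct toy_xprt *xprt)
{
	xprt->connect_cookie++;		/* new connection, new cookie */
	xprt->connected = true;
}

static void toy_connection_broken(struct toy_xprt *xprt)
{
	xprt->connect_cookie++;		/* bump on the way down too, so requests
					 * stamped on the old connection compare
					 * stale ("we really want both") */
	xprt->connected = false;
}

int main(void)
{
	struct toy_xprt xprt = { .connect_cookie = 0, .connected = false };
	unsigned int rq_cookie;

	toy_connection_up(&xprt);
	rq_cookie = xprt.connect_cookie;	/* request stamped here */
	toy_connection_broken(&xprt);
	toy_connection_up(&xprt);

	printf("request cookie %u vs transport cookie %u: %s\n",
	       rq_cookie, xprt.connect_cookie,
	       rq_cookie == xprt.connect_cookie ? "looks current" : "stale, must retransmit");
	return 0;
}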


2017-12-14 19:22:14

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path


> On Dec 14, 2017, at 2:03 PM, Trond Myklebust <[email protected]> =
wrote:
>=20
> On Thu, 2017-12-14 at 10:49 -0500, Chuck Lever wrote:
>>> On Dec 14, 2017, at 7:16 AM, Trond Myklebust <[email protected]
>>> om> wrote:
>>>=20
>>> On Wed, 2017-12-13 at 20:13 -0500, Chuck Lever wrote:
>>>>> On Dec 13, 2017, at 6:42 PM, Chuck Lever <[email protected]
>>>>> m>
>>>>> wrote:
>>>>>=20
>>>>>>=20
>>>>>> On Dec 13, 2017, at 11:14 AM, Chuck Lever <chuck.lever@oracle
>>>>>> .com
>>>>>>> wrote:
>>>>>>=20
>>>>>> Hi Trond-
>>>>>>=20
>>>>>>> On Dec 3, 2017, at 7:17 PM, Trond Myklebust <trond.myklebus
>>>>>>> t@pr
>>>>>>> imarydata.com> wrote:
>>>>>>>=20
>>>>>>> We must ensure that the call to rpc_sleep_on() in
>>>>>>> xprt_transmit() cannot
>>>>>>> race with the call to xprt_complete_rqst().
>>>>>>>=20
>>>>>>> Reported-by: Chuck Lever <[email protected]>
>>>>>>> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=3D317
>>>>>>> Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to
>>>>>>> protect..")
>>>>>>> Cc: [email protected] # 4.14+
>>>>>>> Reviewed-by: Chuck Lever <[email protected]>
>>>>>>> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata
>>>>>>> .com
>>>>>>>>=20
>>>>>>>=20
>>>>>>> ---
>>>>>>> net/sunrpc/xprt.c | 26 +++++++++++++++++---------
>>>>>>> 1 file changed, 17 insertions(+), 9 deletions(-)
>>>>>>>=20
>>>>>>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>>>>>>> index 333b9d697ae5..5e4278e9ce37 100644
>>>>>>> --- a/net/sunrpc/xprt.c
>>>>>>> +++ b/net/sunrpc/xprt.c
>>>>>>> @@ -1024,6 +1024,7 @@ void xprt_transmit(struct rpc_task
>>>>>>> *task)
>>>>>>> } else if (!req->rq_bytes_sent)
>>>>>>> return;
>>>>>>>=20
>>>>>>> + req->rq_connect_cookie =3D xprt->connect_cookie;
>>>>>>> req->rq_xtime =3D ktime_get();
>>>>>>> status =3D xprt->ops->send_request(task);
>>>>>>> trace_xprt_transmit(xprt, req->rq_xid, status);
>>>>>>> @@ -1047,20 +1048,27 @@ void xprt_transmit(struct rpc_task
>>>>>>> *task)
>>>>>>> xprt->stat.bklog_u +=3D xprt->backlog.qlen;
>>>>>>> xprt->stat.sending_u +=3D xprt->sending.qlen;
>>>>>>> xprt->stat.pending_u +=3D xprt->pending.qlen;
>>>>>>> + spin_unlock_bh(&xprt->transport_lock);
>>>>>>>=20
>>>>>>> - /* Don't race with disconnect */
>>>>>>> - if (!xprt_connected(xprt))
>>>>>>> - task->tk_status =3D -ENOTCONN;
>>>>>>> - else {
>>>>>>> + if (rpc_reply_expected(task) && !READ_ONCE(req-
>>>>>>>> rq_reply_bytes_recvd)) {
>>>>>>>=20
>>>>>>> /*
>>>>>>> - * Sleep on the pending queue since
>>>>>>> - * we're expecting a reply.
>>>>>>> + * Sleep on the pending queue if we're
>>>>>>> expecting a reply.
>>>>>>> + * The spinlock ensures atomicity between
>>>>>>> the
>>>>>>> test of
>>>>>>> + * req->rq_reply_bytes_recvd, and the call
>>>>>>> to
>>>>>>> rpc_sleep_on().
>>>>>>> */
>>>>>>> - if (!req->rq_reply_bytes_recvd &&
>>>>>>> rpc_reply_expected(task))
>>>>>>> + spin_lock(&xprt->recv_lock);
>>>>>>> + if (!req->rq_reply_bytes_recvd) {
>>>>>>> rpc_sleep_on(&xprt->pending, task,
>>>>>>> xprt_timer);
>>>>>>> - req->rq_connect_cookie =3D xprt-
>>>>>>>> connect_cookie;
>>>>>>> + /*
>>>>>>> + * Send an extra queue wakeup call
>>>>>>> if
>>>>>>> the
>>>>>>> + * connection was dropped in case
>>>>>>> the
>>>>>>> call to
>>>>>>> + * rpc_sleep_on() raced.
>>>>>>> + */
>>>>>>> + if (!xprt_connected(xprt))
>>>>>>> + xprt_wake_pending_tasks(xp
>>>>>>> rt,
>>>>>>> -ENOTCONN);
>>>>>>> + }
>>>>>>> + spin_unlock(&xprt->recv_lock);
>>>>>>> }
>>>>>>> - spin_unlock_bh(&xprt->transport_lock);
>>>>>>> }
>>>>>>>=20
>>>>>>> static void xprt_add_backlog(struct rpc_xprt *xprt, struct
>>>>>>> rpc_task *task)
>>>>>>> --=20
>>>>>>> 2.14.3
>>>>>>=20
>>>>>> I've run into a problem with this version of the patch (on v4.15-rc3).
>>>>>> 
>>>>>> With dbench on NFSv4.0 (on RDMA) I no longer see constant throughput
>>>>>> and a temporarily climbing latency followed by a recovery, but this
>>>>>> can happen on occasion:
>>>>>> 
>>>>>> releasing clients
>>>>>>   16       584   322.55 MB/sec  warmup   1 sec  latency 807.121 ms
>>>>>>   16       584   161.28 MB/sec  warmup   2 sec  latency 1807.175 ms
>>>>>>   16       584   107.52 MB/sec  warmup   3 sec  latency 2807.198 ms
>>>>>>   16       584    80.64 MB/sec  warmup   4 sec  latency 3807.217 ms
>>>>>>   16       584    64.51 MB/sec  warmup   5 sec  latency 4807.235 ms
>>>>>>   16       584    53.76 MB/sec  warmup   6 sec  latency 5807.251 ms
>>>>>>   16       584    46.08 MB/sec  warmup   7 sec  latency 6807.269 ms
>>>>>>   16       584    40.32 MB/sec  warmup   8 sec  latency 7807.285 ms
>>>>>>   16       584    35.84 MB/sec  warmup   9 sec  latency 8807.303 ms
>>>>>>   16       584    32.26 MB/sec  warmup  10 sec  latency 9807.318 ms
>>>>>>   16       584    29.32 MB/sec  warmup  11 sec  latency 10807.334 ms
>>>>>>   16       584    26.88 MB/sec  warmup  12 sec  latency 11807.355 ms
>>>>>>   16       584    24.81 MB/sec  warmup  13 sec  latency 12807.372 ms
>>>>>>   16       584    23.04 MB/sec  warmup  14 sec  latency 13807.387 ms
>>>>>>   16       584    21.50 MB/sec  warmup  15 sec  latency 14807.406 ms
>>>>>>   16       584    20.16 MB/sec  warmup  16 sec  latency 15807.423 ms
>>>>>> 
>>>>>> No recovery. The latency number keeps climbing, and throughput drops.
>>>>>> 
>>>>>> "kernel: nfs: server klimt-ib not responding, still trying"
>>>>>> appears in the client's /var/log/messages.
>>>>>> 
>>>>>> Without this patch applied, the NFSv4.0 behavior is the same as I
>>>>>> reported with NFSv3: every once in a while, one RPC completion is
>>>>>> lost, but when the timer fires, the client notices the reply
>>>>>> actually did arrive and the RPC completes normally.
>>>>>> 
>>>>>> I'm looking into it.
>>>>> 
>>>>> Root cause:
>>>>> 
>>>>> The transport is temporarily out of resources and returns -ENOBUFS
>>>>> from ->send_request.
>>>>> 
>>>>> The FSM schedules a delay then calls call_transmit again, which
>>>>> invokes xprt_prepare_transmit.
>>>>> 
>>>>> On NFSv3 mounts, xprt_prepare_transmit tries to acquire the
>>>>> transport write lock.
>>>>> 
>>>>> On NFSv4 mounts, RPC_TASK_NO_RETRANS_TIMEOUT is set. This causes
>>>>> xprt_prepare_transmit to see that the transport is connected and
>>>>> rq_connect_cookie is the same as xprt->connect_cookie, so it
>>>>> queues the task on ->pending and returns false.
>>>>> 
>>>>> The transport never sends the Call, but still waits for a Reply,
>>>>> forever. This is not a bug in your patch, it is a pre-existing
>>>>> issue.
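
For reference, the check in xprt_prepare_transmit() that parks the task is
roughly the following (paraphrased from memory as an illustration, with some
details trimmed; not a verbatim copy of the tree):

    /* inside xprt_prepare_transmit(), under xprt->transport_lock */
    if (!req->rq_bytes_sent) {
            if (req->rq_reply_bytes_recvd) {
                    task->tk_status = req->rq_reply_bytes_recvd;
                    goto out_unlock;        /* the reply is already here */
            }
            if ((task->tk_flags & RPC_TASK_NO_RETRANS_TIMEOUT) &&
                xprt_connected(xprt) &&
                req->rq_connect_cookie == xprt->connect_cookie) {
                    /* "already sent on this connection": just wait for the reply */
                    rpc_sleep_on(&xprt->pending, task, xprt_timer);
                    goto out_unlock;        /* returns false, nothing is sent */
            }
    }

If rq_connect_cookie already matches the live connection even though the Call
never went out on the wire, that second branch is what leaves the request
parked on ->pending with nothing to wake it.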
>>>> 
>>>> Sorry for one more reply...
>>>> 
>>>> I think this last statement is incorrect. It is a bug in your
>>>> patch after all.
>>>> 
>>>> I forgot that your patch moves the line that sets
>>>> 
>>>>     req->rq_connect_cookie = xprt->connect_cookie;
>>>> 
>>>> to before the call to ->send_request. But rq_connect_cookie
>>>> must be set iff ->send_request is successful. Moving this
>>>> line seems to fix the issue:
>>>> 
>>>> 1027         req->rq_xtime = ktime_get();
>>>> 1028         status = xprt->ops->send_request(task);
>>>> 1029         trace_xprt_transmit(xprt, req->rq_xid, status);
>>>> 1030         if (status != 0) {
>>>> 1031                 task->tk_status = status;
>>>> 1032                 return;
>>>> 1033         }
>>>> 1034         req->rq_connect_cookie = xprt->connect_cookie;  <<<<
>>>> 1035         xprt_inject_disconnect(xprt);
>>>> 
>>> 
>>> Let's rather apply the attached fix (incremental to the original patch
>>> and intended to be applied using 'git rebase --autosquash'). The reason
>>> for wanting to set the connect cookie before we call send_request() is
>>> to avoid races with disconnects that happen immediately after we sent
>>> the data (but before we get round to setting the cookie).
>>> 
>>> 8<--------------------------------------------------------
>>> From 326c12727b179d5c2fd42f374a4c0e2d56ee90c5 Mon Sep 17 00:00:00 2001
>>> From: Trond Myklebust <[email protected]>
>>> Date: Thu, 14 Dec 2017 07:05:27 -0500
>>> Subject: [PATCH] fixup! SUNRPC: Fix a race in the receive code path
>>> 
>>> ---
>>> net/sunrpc/xprt.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>>> index 5e4278e9ce37..2cf4cf21a878 100644
>>> --- a/net/sunrpc/xprt.c
>>> +++ b/net/sunrpc/xprt.c
>>> @@ -1021,10 +1021,11 @@ void xprt_transmit(struct rpc_task *task)
>>> 			/* Turn off autodisconnect */
>>> 			del_singleshot_timer_sync(&xprt->timer);
>>> 		}
>>> +		if (!req->rq_bytes_sent)
>>> +			req->rq_connect_cookie = xprt->connect_cookie;
>>> 	} else if (!req->rq_bytes_sent)
>>> 		return;
>>> 
>>> -	req->rq_connect_cookie = xprt->connect_cookie;
>>> 	req->rq_xtime = ktime_get();
>>> 	status = xprt->ops->send_request(task);
>>> 	trace_xprt_transmit(xprt, req->rq_xid, status);
>>> --
>>> 2.14.3
>> 
>> Tested with a client that returns -ENOBUFS from ->send_request
>> with a little extra frequency.
>> 
>> This doesn't appear to address the problem. The transport
>> deadlocks as before when send_request() fails to send the RPC.
> 
> Does the RDMA code update the connect cookie when the connection
> breaks? It looks to me as if it only does that when the connection is
> re-established. We really want both.
> 
>> I imagine this issue could also impact write buffer exhaustion
>> on TCP.
> 
> See net/sunrpc/xprtsock.c:xs_tcp_state_change()

xprtrdma manipulates the connect_cookie in its connect worker,
see rpcrdma_connect_worker. This was added by:

commit 575448bd36208f99fe0dd554a43518d798966740
Author: Tom Talpey <[email protected]>
AuthorDate: Thu Oct 9 15:00:40 2008 -0400
Commit: Trond Myklebust <[email protected]>
CommitDate: Fri Oct 10 15:10:36 2008 -0400

RPC/RDMA: suppress retransmit on RPC/RDMA clients.

Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
which is the equivalent of xs_tcp_state_change? (if so, why, so
I can compose a reasonable patch description)

It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
RDMA provides a distinction between "server disconnected"
and "client disconnected" although that probably does not
matter for this purpose.
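
Something along these lines, say (only a sketch of the shape of the change,
not a tested patch; the variable plumbing inside rpcrdma_conn_upcall is
hand-waved here):

    switch (event->event) {
    case RDMA_CM_EVENT_ESTABLISHED:
            ++xprt->connect_cookie;         /* new connection instance */
            /* ... existing connection-established handling ... */
            break;
    case RDMA_CM_EVENT_DISCONNECTED:
            ++xprt->connect_cookie;         /* the old instance is gone */
            /* ... existing disconnect handling ... */
            break;
    /* ... other cases unchanged ... */
    }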

But, why would the additional cookie update help? The transport
is not disconnecting before the deadlock.


--
Chuck Lever


2017-12-14 20:37:31

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path

On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
> > On Dec 14, 2017, at 2:03 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> > 
> > Does the RDMA code update the connect cookie when the connection
> > breaks? It looks to me as if it only does that when the connection is
> > re-established. We really want both.
> > 
> > > I imagine this issue could also impact write buffer exhaustion
> > > on TCP.
> > 
> > See net/sunrpc/xprtsock.c:xs_tcp_state_change()
> 
> xprtrdma manipulates the connect_cookie in its connect worker,
> see rpcrdma_connect_worker. This was added by:
> 
> commit 575448bd36208f99fe0dd554a43518d798966740
> Author:     Tom Talpey <talpey@netapp.com>
> AuthorDate: Thu Oct 9 15:00:40 2008 -0400
> Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
> CommitDate: Fri Oct 10 15:10:36 2008 -0400
> 
>     RPC/RDMA: suppress retransmit on RPC/RDMA clients.
> 
> Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
> which is the equivalent to xs_tcp_state_change? (if so, why, so
> I can compose a reasonable patch description)
> 
> It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
> RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
> RDMA provides a distinction between "server disconnected"
> and "client disconnected" although that probably does not
> matter for this purpose.
> 
> But, why would the additional cookie update help? The transport
> is not disconnecting before the deadlock.
> 

The connection cookie's purpose is twofold:

1) It tracks whether or not a request has been transmitted on the
current connection or not.

2) It ensures that when several requests with the same connection
cookie all call xprt_conditional_disconnect(), then that results in a
single disconnection event. To do so, it assumes that xprt_autoclose()
will change the cookie if the disconnection attempt is successful.

In TCP we do so in the xs_tcp_state_change(). If the RDMA transport can
guarantee that the call to xprt->ops->close(xprt) is always successful,
then you could do so there.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com


2017-12-14 21:00:10

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path


> On Dec 14, 2017, at 3:37 PM, Trond Myklebust <[email protected]> wrote:
> 
> On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
>>> On Dec 14, 2017, at 2:03 PM, Trond Myklebust <[email protected]> wrote:
>>> 
>>> Does the RDMA code update the connect cookie when the connection
>>> breaks? It looks to me as if it only does that when the connection is
>>> re-established. We really want both.
>>> 
>>>> I imagine this issue could also impact write buffer exhaustion
>>>> on TCP.
>>> 
>>> See net/sunrpc/xprtsock.c:xs_tcp_state_change()
>> 
>> xprtrdma manipulates the connect_cookie in its connect worker,
>> see rpcrdma_connect_worker. This was added by:
>> 
>> commit 575448bd36208f99fe0dd554a43518d798966740
>> Author: Tom Talpey <[email protected]>
>> AuthorDate: Thu Oct 9 15:00:40 2008 -0400
>> Commit: Trond Myklebust <[email protected]>
>> CommitDate: Fri Oct 10 15:10:36 2008 -0400
>> 
>> RPC/RDMA: suppress retransmit on RPC/RDMA clients.
>> 
>> Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
>> which is the equivalent to xs_tcp_state_change? (if so, why, so
>> I can compose a reasonable patch description)
>> 
>> It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
>> RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
>> RDMA provides a distinction between "server disconnected"
>> and "client disconnected" although that probably does not
>> matter for this purpose.
>> 
>> But, why would the additional cookie update help? The transport
>> is not disconnecting before the deadlock.
>> 
> 
> The connection cookie's purpose is twofold:
> 
> 1) It tracks whether or not a request has been transmitted on the
> current connection or not.

That's broken by setting the cookie unconditionally outside
the transport_lock, isn't it?


> 2) It ensures that when several requests with the same connection
> cookie all call xprt_conditional_disconnect(), then that results in a
> single disconnection event. To do so, it assumes that xprt_autoclose()
> will change the cookie if the disconnection attempt is successful.
> 
> In TCP we do so in the xs_tcp_state_change(). If the RDMA transport can
> guarantee that the call to xprt->ops->close(xprt) is always successful,
> then you could do so there.

I don't mind moving the cookie bump to rpcrdma_conn_upcall,
but I'm not sure I understand the locking requirements.

Currently, xprt_transmit sets the connect_cookie while holding
the transport_lock.

xprt_conditional_disconnect compares the cookie while holding
the transport_lock.
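
The comparison it does is roughly the following (a simplified sketch from
memory, not a verbatim copy of net/sunrpc/xprt.c):

    void xprt_conditional_disconnect(struct rpc_xprt *xprt, unsigned int cookie)
    {
            spin_lock_bh(&xprt->transport_lock);
            if (cookie != xprt->connect_cookie)
                    goto out;       /* stale cookie: a reconnect already happened */
            if (test_bit(XPRT_CLOSING, &xprt->state))
                    goto out;
            set_bit(XPRT_CLOSE_WAIT, &xprt->state);
            /* try to schedule an autoclose */
            if (test_and_set_bit(XPRT_LOCKED, &xprt->state) == 0)
                    queue_work(xprtiod_workqueue, &xprt->task_cleanup);
            xprt_wake_pending_tasks(xprt, -EAGAIN);
    out:
            spin_unlock_bh(&xprt->transport_lock);
    }

so a request that carries a stale cookie is simply ignored.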

For TCP, the transport_lock is held when bumping the cookie
in the ESTABLISHED case, but _not_ in the two CLOSE cases?

xprtrdma holds the transport_lock when bumping the cookie,
which it does in its connect worker. It has to hold the lock
because it skips the value 0. xprtrdma needs to guarantee
that an RPC is never transmitted on the same connection
twice (and maybe it could use rq_connect_cookie instead of
its own cookie).
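
The bump itself is only a couple of lines, roughly (again a sketch, not the
exact rpcrdma_connect_worker code):

    spin_lock_bh(&xprt->transport_lock);
    ++xprt->connect_cookie;
    if (xprt->connect_cookie == 0)          /* zero is reserved */
            ++xprt->connect_cookie;
    spin_unlock_bh(&xprt->transport_lock);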

xprt_reserve_init is holding the reserve_lock but not the
transport_lock when it grabs the cookie. Maybe it should
not be initializing the rqst's cookie there?

Seems to me that xprt_transmit needs to update the rqst's
cookie while holding the transport_lock, especially if
xprtrdma needs to skip a cookie value? I'm sure I'm missing
something.


--
Chuck Lever




2017-12-14 21:33:57

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path

On Thu, 2017-12-14 at 15:59 -0500, Chuck Lever wrote:
> > On Dec 14, 2017, at 3:37 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> > 
> > On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
> > > > On Dec 14, 2017, at 2:03 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> > > > 
> > > > Does the RDMA code update the connect cookie when the connection
> > > > breaks? It looks to me as if it only does that when the connection is
> > > > re-established. We really want both.
> > > > 
> > > > > I imagine this issue could also impact write buffer exhaustion
> > > > > on TCP.
> > > > 
> > > > See net/sunrpc/xprtsock.c:xs_tcp_state_change()
> > > 
> > > xprtrdma manipulates the connect_cookie in its connect worker,
> > > see rpcrdma_connect_worker. This was added by:
> > > 
> > > commit 575448bd36208f99fe0dd554a43518d798966740
> > > Author:     Tom Talpey <talpey@netapp.com>
> > > AuthorDate: Thu Oct 9 15:00:40 2008 -0400
> > > Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
> > > CommitDate: Fri Oct 10 15:10:36 2008 -0400
> > > 
> > >     RPC/RDMA: suppress retransmit on RPC/RDMA clients.
> > > 
> > > Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
> > > which is the equivalent to xs_tcp_state_change? (if so, why, so
> > > I can compose a reasonable patch description)
> > > 
> > > It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
> > > RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
> > > RDMA provides a distinction between "server disconnected"
> > > and "client disconnected" although that probably does not
> > > matter for this purpose.
> > > 
> > > But, why would the additional cookie update help? The transport
> > > is not disconnecting before the deadlock.
> > > 
> > 
> > The connection cookie's purpose is twofold:
> > 
> > 1) It tracks whether or not a request has been transmitted on the
> > current connection or not.
> 
> That's broken by setting the cookie unconditionally outside
> the transport_lock, isn't it?
> 
> 
> > 2) It ensures that when several requests with the same connection
> > cookie all call xprt_conditional_disconnect(), then that results in a
> > single disconnection event. To do so, it assumes that xprt_autoclose()
> > will change the cookie if the disconnection attempt is successful.
> > 
> > In TCP we do so in the xs_tcp_state_change(). If the RDMA transport can
> > guarantee that the call to xprt->ops->close(xprt) is always successful,
> > then you could do so there.
> 
> I don't mind moving the cookie bump to rpcrdma_conn_upcall,
> but I'm not sure I understand the locking requirements.
> 
> Currently, xprt_transmit sets the connect_cookie while holding
> the transport_lock.
> 
> xprt_conditional_disconnect compares the cookie while holding
> the transport_lock.
> 
> For TCP, the transport_lock is held when bumping the cookie
> in the ESTABLISHED case, but _not_ in the two CLOSE cases?

That should be OK. The networking layer should provide sufficient
serialisation that we don't have to worry about collisions.

> 
> xprtrdma holds the transport_lock when bumping the cookie,
> which it does in its connect worker. It has to hold the lock
> because it skips the value 0. xprtrdma needs to guarantee
> that an RPC is never transmitted on the same connection
> twice (and maybe it could use rq_connect_cookie instead of
> its own cookie).
> 
> xprt_reserve_init is holding the reserve_lock but not the
> transport_lock when it grabs the cookie. Maybe it should
> not be initializing the rqst's cookie there?
> 
> Seems to me that xprt_transmit needs to update the rqst's
> cookie while holding the transport_lock, especially if
> xprtrdma needs to skip a cookie value? I'm sure I'm missing
> something.
> 

It should be OK, given that the connection is a state machine.
However, I missed something that you said earlier about
xprt_prepare_transmit().

OK. How about the following fixup patch instead of the earlier one?

8<--------------------------------------------------------
From 21cdb2802d9d8b71553998e6be5aafeff0742142 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@primarydata.com>
Date: Thu, 14 Dec 2017 07:05:27 -0500
Subject: [PATCH] fixup! SUNRPC: Fix a race in the receive code path

---
 net/sunrpc/xprt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 5e4278e9ce37..33b74fd84051 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1001,6 +1001,7 @@ void xprt_transmit(struct rpc_task *task)
 {
 	struct rpc_rqst	*req = task->tk_rqstp;
 	struct rpc_xprt	*xprt = req->rq_xprt;
+	unsigned int connect_cookie;
 	int status, numreqs;
 
 	dprintk("RPC: %5u xprt_transmit(%u)\n", task->tk_pid, req->rq_slen);
@@ -1024,7 +1025,7 @@ void xprt_transmit(struct rpc_task *task)
 	} else if (!req->rq_bytes_sent)
 		return;
 
-	req->rq_connect_cookie = xprt->connect_cookie;
+	connect_cookie = xprt->connect_cookie;
 	req->rq_xtime = ktime_get();
 	status = xprt->ops->send_request(task);
 	trace_xprt_transmit(xprt, req->rq_xid, status);
@@ -1050,6 +1051,7 @@ void xprt_transmit(struct rpc_task *task)
 	xprt->stat.pending_u += xprt->pending.qlen;
 	spin_unlock_bh(&xprt->transport_lock);
 
+	req->rq_connect_cookie = connect_cookie;
 	if (rpc_reply_expected(task) && !READ_ONCE(req->rq_reply_bytes_recvd)) {
 		/*
 		 * Sleep on the pending queue if we're expecting a reply.
-- 
2.14.3

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com


2017-12-15 02:16:58

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path


> On Dec 14, 2017, at 4:33 PM, Trond Myklebust <[email protected]> wrote:
> 
> On Thu, 2017-12-14 at 15:59 -0500, Chuck Lever wrote:
>>> On Dec 14, 2017, at 3:37 PM, Trond Myklebust <[email protected]> wrote:
>>> 
>>> On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
>>>>> On Dec 14, 2017, at 2:03 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
>>>>> 
>>>>> Does the RDMA code update the connect cookie when the connection
>>>>> breaks? It looks to me as if it only does that when the connection is
>>>>> re-established. We really want both.
>>>>> 
>>>>>> I imagine this issue could also impact write buffer exhaustion
>>>>>> on TCP.
>>>>> 
>>>>> See net/sunrpc/xprtsock.c:xs_tcp_state_change()
>>>> 
>>>> xprtrdma manipulates the connect_cookie in its connect worker,
>>>> see rpcrdma_connect_worker. This was added by:
>>>> 
>>>> commit 575448bd36208f99fe0dd554a43518d798966740
>>>> Author: Tom Talpey <[email protected]>
>>>> AuthorDate: Thu Oct 9 15:00:40 2008 -0400
>>>> Commit: Trond Myklebust <[email protected]>
>>>> CommitDate: Fri Oct 10 15:10:36 2008 -0400
>>>> 
>>>> RPC/RDMA: suppress retransmit on RPC/RDMA clients.
>>>> 
>>>> Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
>>>> which is the equivalent to xs_tcp_state_change? (if so, why, so
>>>> I can compose a reasonable patch description)
>>>> 
>>>> It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
>>>> RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
>>>> RDMA provides a distinction between "server disconnected"
>>>> and "client disconnected" although that probably does not
>>>> matter for this purpose.
>>>> 
>>>> But, why would the additional cookie update help? The transport
>>>> is not disconnecting before the deadlock.
>>>> 
>>> 
>>> The connection cookie's purpose is twofold:
>>> 
>>> 1) It tracks whether or not a request has been transmitted on the
>>> current connection or not.
>> 
>> That's broken by setting the cookie unconditionally outside
>> the transport_lock, isn't it?
>> 
>> 
>>> 2) It ensures that when several requests with the same connection
>>> cookie all call xprt_conditional_disconnect(), then that results in a
>>> single disconnection event. To do so, it assumes that xprt_autoclose()
>>> will change the cookie if the disconnection attempt is successful.
>>> 
>>> In TCP we do so in the xs_tcp_state_change(). If the RDMA transport can
>>> guarantee that the call to xprt->ops->close(xprt) is always successful,
>>> then you could do so there.
>> 
>> I don't mind moving the cookie bump to rpcrdma_conn_upcall,
>> but I'm not sure I understand the locking requirements.
>> 
>> Currently, xprt_transmit sets the connect_cookie while holding
>> the transport_lock.
>> 
>> xprt_conditional_disconnect compares the cookie while holding
>> the transport_lock.
>> 
>> For TCP, the transport_lock is held when bumping the cookie
>> in the ESTABLISHED case, but _not_ in the two CLOSE cases?
> 
> That should be OK. The networking layer should provide sufficient
> serialisation that we don't have to worry about collisions.
> 
>> 
>> xprtrdma holds the transport_lock when bumping the cookie,
>> which it does in its connect worker. It has to hold the lock
>> because it skips the value 0. xprtrdma needs to guarantee
>> that an RPC is never transmitted on the same connection
>> twice (and maybe it could use rq_connect_cookie instead of
>> its own cookie).
>>=20
>> xprt_reserve_init is holding the reserve_lock but not the
>> transport_lock when it grabs the cookie. Maybe it should
>> not be initializing the rqst's cookie there?
>> 
>> Seems to me that xprt_transmit needs to update the rqst's
>> cookie while holding the transport_lock, especially if
>> xprtrdma needs to skip a cookie value? I'm sure I'm missing
>> something.
>> 
> 
> It should be OK, given that the connection is a state machine.
> However, I missed something that you said earlier about
> xprt_prepare_transmit().
>=20
> OK. How about the following fixup patch instead of the earlier one?
> 
> 8<---------------------------------------------------
> From 21cdb2802d9d8b71553998e6be5aafeff0742142 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Thu, 14 Dec 2017 07:05:27 -0500
> Subject: [PATCH] fixup! SUNRPC: Fix a race in the receive code path
> 
> ---
> net/sunrpc/xprt.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 5e4278e9ce37..33b74fd84051 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1001,6 +1001,7 @@ void xprt_transmit(struct rpc_task *task)
> {
> 	struct rpc_rqst	*req = task->tk_rqstp;
> 	struct rpc_xprt	*xprt = req->rq_xprt;
> +	unsigned int connect_cookie;
> 	int status, numreqs;
> 
> 	dprintk("RPC: %5u xprt_transmit(%u)\n", task->tk_pid, req->rq_slen);
> @@ -1024,7 +1025,7 @@ void xprt_transmit(struct rpc_task *task)
> 	} else if (!req->rq_bytes_sent)
> 		return;
> 
> -	req->rq_connect_cookie = xprt->connect_cookie;
> +	connect_cookie = xprt->connect_cookie;
> 	req->rq_xtime = ktime_get();
> 	status = xprt->ops->send_request(task);
> 	trace_xprt_transmit(xprt, req->rq_xid, status);
> @@ -1050,6 +1051,7 @@ void xprt_transmit(struct rpc_task *task)
> 	xprt->stat.pending_u += xprt->pending.qlen;
> 	spin_unlock_bh(&xprt->transport_lock);
> 
> +	req->rq_connect_cookie = connect_cookie;
> 	if (rpc_reply_expected(task) && !READ_ONCE(req->rq_reply_bytes_recvd)) {
> 		/*
> 		 * Sleep on the pending queue if we're expecting a reply.
> --
> 2.14.3

No problems here, passed basic testing with NFSv4.0 on a
client with extra send_request fault injection.
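
Fault injection of that sort can be as simple as something like this near
the top of ->send_request (an illustrative sketch only, not the exact hack
used here):

    /* Pretend the transport ran out of buffer space for ~1% of
     * sends, to exercise the -ENOBUFS retry path. */
    if ((prandom_u32() % 100) == 0)
            return -ENOBUFS;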

I hope we can get the recv race fix (as updated here) and
the queue-work-on patch [1] into v4.15-rc.


--
Chuck Lever

[1] https://marc.info/?l=linux-nfs&m=151241427912572&w=2