2017-12-03 18:50:18

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH] SUNRPC: Fix a race in the receive code path

We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot
race with the call to xprt_complete_rqst().

Reported-by: Chuck Lever <[email protected]>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=317
Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect..")
Cc: [email protected] # 4.14+
Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/xprt.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 333b9d697ae5..9d9092805696 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1035,6 +1035,7 @@ void xprt_transmit(struct rpc_task *task)

dprintk("RPC: %5u xmit complete\n", task->tk_pid);
task->tk_flags |= RPC_TASK_SENT;
+ spin_lock(&xprt->recv_lock);
spin_lock_bh(&xprt->transport_lock);

xprt->ops->set_retrans_timeout(task);
@@ -1061,6 +1062,7 @@ void xprt_transmit(struct rpc_task *task)
req->rq_connect_cookie = xprt->connect_cookie;
}
spin_unlock_bh(&xprt->transport_lock);
+ spin_unlock(&xprt->recv_lock);
}

static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task *task)
--
2.14.3



2017-12-03 18:54:30

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: Fix a race in the receive code path


> On Dec 3, 2017, at 1:50 PM, Trond Myklebust =
<[email protected]> wrote:
>=20
> We must ensure that the call to rpc_sleep_on() in xprt_transmit() =
cannot
> race with the call to xprt_complete_rqst().

:-( this will kill scalability, we might as well just go back
to the old locking scheme.

But I will give it a try.


> Reported-by: Chuck Lever <[email protected]>
> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=3D317
> Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect..")
> Cc: [email protected] # 4.14+
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/xprt.c | 2 ++
> 1 file changed, 2 insertions(+)
>=20
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 333b9d697ae5..9d9092805696 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1035,6 +1035,7 @@ void xprt_transmit(struct rpc_task *task)
>=20
> dprintk("RPC: %5u xmit complete\n", task->tk_pid);
> task->tk_flags |=3D RPC_TASK_SENT;
> + spin_lock(&xprt->recv_lock);
> spin_lock_bh(&xprt->transport_lock);
>=20
> xprt->ops->set_retrans_timeout(task);
> @@ -1061,6 +1062,7 @@ void xprt_transmit(struct rpc_task *task)
> req->rq_connect_cookie =3D xprt->connect_cookie;
> }
> spin_unlock_bh(&xprt->transport_lock);
> + spin_unlock(&xprt->recv_lock);
> }
>=20
> static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task =
*task)
> --=20
> 2.14.3
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" =
in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Chuck Lever
[email protected]




2017-12-03 20:12:52

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: Fix a race in the receive code path

T24gU3VuLCAyMDE3LTEyLTAzIGF0IDEzOjU0IC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g
PiBPbiBEZWMgMywgMjAxNywgYXQgMTo1MCBQTSwgVHJvbmQgTXlrbGVidXN0IDx0cm9uZC5teWts
ZWJ1c3RAcHJpbWFyDQo+ID4geWRhdGEuY29tPiB3cm90ZToNCj4gPiANCj4gPiBXZSBtdXN0IGVu
c3VyZSB0aGF0IHRoZSBjYWxsIHRvIHJwY19zbGVlcF9vbigpIGluIHhwcnRfdHJhbnNtaXQoKQ0K
PiA+IGNhbm5vdA0KPiA+IHJhY2Ugd2l0aCB0aGUgY2FsbCB0byB4cHJ0X2NvbXBsZXRlX3Jxc3Qo
KS4NCj4gDQo+IDotKCB0aGlzIHdpbGwga2lsbCBzY2FsYWJpbGl0eSwgd2UgbWlnaHQgYXMgd2Vs
bCBqdXN0IGdvIGJhY2sNCj4gdG8gdGhlIG9sZCBsb2NraW5nIHNjaGVtZS4NCg0KSXQgc2hvdWxk
bid0IG1ha2UgYSBodWdlIGRpZmZlcmVuY2UsIGJ1dCBJIGFncmVlIHRoYXQgd2UgZG8gd2FudCB0
byBnZXQNCnJpZCBvZiB0aGF0IHRyYW5zcG9ydCBsb2NrLg0KDQpIb3cgYWJvdXQgdGhlIGZvbGxv
d2luZywgdGhlbj8NCg0KODwtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCkZyb20gNmEwYzUwN2YxNjBkNTYyNGQ5MDQ5Mjgx
Y2Q5ZGZlMjIyYTg2NmYwNiBNb24gU2VwIDE3IDAwOjAwOjAwIDIwMDENCkZyb206IFRyb25kIE15
a2xlYnVzdCA8dHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRhLmNvbT4NCkRhdGU6IFN1biwgMyBE
ZWMgMjAxNyAxMzozNzoyNyAtMDUwMA0KU3ViamVjdDogW1BBVENIIHYyXSBTVU5SUEM6IEZpeCBh
IHJhY2UgaW4gdGhlIHJlY2VpdmUgY29kZSBwYXRoDQoNCldlIG11c3QgZW5zdXJlIHRoYXQgdGhl
IGNhbGwgdG8gcnBjX3NsZWVwX29uKCkgaW4geHBydF90cmFuc21pdCgpIGNhbm5vdA0KcmFjZSB3
aXRoIHRoZSBjYWxsIHRvIHhwcnRfY29tcGxldGVfcnFzdCgpLg0KDQpSZXBvcnRlZC1ieTogQ2h1
Y2sgTGV2ZXIgPGNodWNrLmxldmVyQG9yYWNsZS5jb20+DQpMaW5rOiBodHRwczovL2J1Z3ppbGxh
LmxpbnV4LW5mcy5vcmcvc2hvd19idWcuY2dpP2lkPTMxNw0KRml4ZXM6IGNlN2MyNTJhOGM3NCAo
IlNVTlJQQzogQWRkIGEgc2VwYXJhdGUgc3BpbmxvY2sgdG8gcHJvdGVjdC4uIikNCkNjOiBzdGFi
bGVAdmdlci5rZXJuZWwub3JnICMgNC4xNCsNClNpZ25lZC1vZmYtYnk6IFRyb25kIE15a2xlYnVz
dCA8dHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRhLmNvbT4NCi0tLQ0KIG5ldC9zdW5ycGMveHBy
dC5jIHwgMTggKysrKysrKysrKystLS0tLS0tDQogMSBmaWxlIGNoYW5nZWQsIDExIGluc2VydGlv
bnMoKyksIDcgZGVsZXRpb25zKC0pDQoNCmRpZmYgLS1naXQgYS9uZXQvc3VucnBjL3hwcnQuYyBi
L25ldC9zdW5ycGMveHBydC5jDQppbmRleCAzMzNiOWQ2OTdhZTUuLjM0ZjYxMzM4NTMxOSAxMDA2
NDQNCi0tLSBhL25ldC9zdW5ycGMveHBydC5jDQorKysgYi9uZXQvc3VucnBjL3hwcnQuYw0KQEAg
LTEwMjQsNiArMTAyNCw3IEBAIHZvaWQgeHBydF90cmFuc21pdChzdHJ1Y3QgcnBjX3Rhc2sgKnRh
c2spDQogCX0gZWxzZSBpZiAoIXJlcS0+cnFfYnl0ZXNfc2VudCkNCiAJCXJldHVybjsNCiANCisJ
cmVxLT5ycV9jb25uZWN0X2Nvb2tpZSA9IHhwcnQtPmNvbm5lY3RfY29va2llOw0KIAlyZXEtPnJx
X3h0aW1lID0ga3RpbWVfZ2V0KCk7DQogCXN0YXR1cyA9IHhwcnQtPm9wcy0+c2VuZF9yZXF1ZXN0
KHRhc2spOw0KIAl0cmFjZV94cHJ0X3RyYW5zbWl0KHhwcnQsIHJlcS0+cnFfeGlkLCBzdGF0dXMp
Ow0KQEAgLTEwNDgsMTkgKzEwNDksMjIgQEAgdm9pZCB4cHJ0X3RyYW5zbWl0KHN0cnVjdCBycGNf
dGFzayAqdGFzaykNCiAJeHBydC0+c3RhdC5zZW5kaW5nX3UgKz0geHBydC0+c2VuZGluZy5xbGVu
Ow0KIAl4cHJ0LT5zdGF0LnBlbmRpbmdfdSArPSB4cHJ0LT5wZW5kaW5nLnFsZW47DQogDQotCS8q
IERvbid0IHJhY2Ugd2l0aCBkaXNjb25uZWN0ICovDQotCWlmICgheHBydF9jb25uZWN0ZWQoeHBy
dCkpDQotCQl0YXNrLT50a19zdGF0dXMgPSAtRU5PVENPTk47DQotCWVsc2Ugew0KKwlzcGluX3Vu
bG9ja19iaCgmeHBydC0+dHJhbnNwb3J0X2xvY2spOw0KKw0KKwlpZiAoIVJFQURfT05DRShyZXEt
PnJxX3JlcGx5X2J5dGVzX3JlY3ZkKSAmJiBycGNfcmVwbHlfZXhwZWN0ZWQodGFzaykpIHsNCisJ
CXNwaW5fbG9jaygmeHBydC0+cmVjdl9sb2NrKTsNCiAJCS8qDQogCQkgKiBTbGVlcCBvbiB0aGUg
cGVuZGluZyBxdWV1ZSBzaW5jZQ0KIAkJICogd2UncmUgZXhwZWN0aW5nIGEgcmVwbHkuDQogCQkg
Ki8NCi0JCWlmICghcmVxLT5ycV9yZXBseV9ieXRlc19yZWN2ZCAmJiBycGNfcmVwbHlfZXhwZWN0
ZWQodGFzaykpDQorCQlpZiAoIXJlcS0+cnFfcmVwbHlfYnl0ZXNfcmVjdmQpIHsNCiAJCQlycGNf
c2xlZXBfb24oJnhwcnQtPnBlbmRpbmcsIHRhc2ssIHhwcnRfdGltZXIpOw0KLQkJcmVxLT5ycV9j
b25uZWN0X2Nvb2tpZSA9IHhwcnQtPmNvbm5lY3RfY29va2llOw0KKwkJCS8qIERlYWwgd2l0aCBk
aXNjb25uZWN0IHJhY2VzICovDQorCQkJaWYgKCF4cHJ0X2Nvbm5lY3RlZCh4cHJ0KSkNCisJCQkJ
eHBydF93YWtlX3BlbmRpbmdfdGFza3MoeHBydCwgLUVOT1RDT05OKTsNCisJCX0NCisJCXNwaW5f
dW5sb2NrKCZ4cHJ0LT5yZWN2X2xvY2spOw0KIAl9DQotCXNwaW5fdW5sb2NrX2JoKCZ4cHJ0LT50
cmFuc3BvcnRfbG9jayk7DQogfQ0KIA0KIHN0YXRpYyB2b2lkIHhwcnRfYWRkX2JhY2tsb2coc3Ry
dWN0IHJwY194cHJ0ICp4cHJ0LCBzdHJ1Y3QgcnBjX3Rhc2sgKnRhc2spDQotLSANCjIuMTQuMw0K
DQotLSANClRyb25kIE15a2xlYnVzdA0KTGludXggTkZTIGNsaWVudCBtYWludGFpbmVyLCBQcmlt
YXJ5RGF0YQ0KdHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRhLmNvbQ0K


2017-12-03 20:19:51

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: Fix a race in the receive code path


> On Dec 3, 2017, at 3:12 PM, Trond Myklebust <[email protected]> =
wrote:
>=20
> On Sun, 2017-12-03 at 13:54 -0500, Chuck Lever wrote:
>>> On Dec 3, 2017, at 1:50 PM, Trond Myklebust <trond.myklebust@primar
>>> ydata.com> wrote:
>>>=20
>>> We must ensure that the call to rpc_sleep_on() in xprt_transmit()
>>> cannot
>>> race with the call to xprt_complete_rqst().
>>=20
>> :-( this will kill scalability, we might as well just go back
>> to the old locking scheme.
>=20
> It shouldn't make a huge difference, but I agree that we do want to =
get
> rid of that transport lock.
>=20
> How about the following, then?

This looks better, I'll give it a try!

But touching the recv_lock in the transmit path is what I'd like
to avoid completely, if possible. I'm not clear on why the pending
waitqueue is unprotected. Doesn't it have a lock of its own?


> 8<------------------------------------------------------------------
> =46rom 6a0c507f160d5624d9049281cd9dfe222a866f06 Mon Sep 17 00:00:00 =
2001
> From: Trond Myklebust <[email protected]>
> Date: Sun, 3 Dec 2017 13:37:27 -0500
> Subject: [PATCH v2] SUNRPC: Fix a race in the receive code path
>=20
> We must ensure that the call to rpc_sleep_on() in xprt_transmit() =
cannot
> race with the call to xprt_complete_rqst().
>=20
> Reported-by: Chuck Lever <[email protected]>
> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=3D317
> Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect..")
> Cc: [email protected] # 4.14+
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/xprt.c | 18 +++++++++++-------
> 1 file changed, 11 insertions(+), 7 deletions(-)
>=20
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 333b9d697ae5..34f613385319 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1024,6 +1024,7 @@ void xprt_transmit(struct rpc_task *task)
> } else if (!req->rq_bytes_sent)
> return;
>=20
> + req->rq_connect_cookie =3D xprt->connect_cookie;
> req->rq_xtime =3D ktime_get();
> status =3D xprt->ops->send_request(task);
> trace_xprt_transmit(xprt, req->rq_xid, status);
> @@ -1048,19 +1049,22 @@ void xprt_transmit(struct rpc_task *task)
> xprt->stat.sending_u +=3D xprt->sending.qlen;
> xprt->stat.pending_u +=3D xprt->pending.qlen;
>=20
> - /* Don't race with disconnect */
> - if (!xprt_connected(xprt))
> - task->tk_status =3D -ENOTCONN;
> - else {
> + spin_unlock_bh(&xprt->transport_lock);
> +
> + if (!READ_ONCE(req->rq_reply_bytes_recvd) && =
rpc_reply_expected(task)) {
> + spin_lock(&xprt->recv_lock);
> /*
> * Sleep on the pending queue since
> * we're expecting a reply.
> */
> - if (!req->rq_reply_bytes_recvd && =
rpc_reply_expected(task))
> + if (!req->rq_reply_bytes_recvd) {
> rpc_sleep_on(&xprt->pending, task, xprt_timer);
> - req->rq_connect_cookie =3D xprt->connect_cookie;
> + /* Deal with disconnect races */
> + if (!xprt_connected(xprt))
> + xprt_wake_pending_tasks(xprt, =
-ENOTCONN);
> + }
> + spin_unlock(&xprt->recv_lock);
> }
> - spin_unlock_bh(&xprt->transport_lock);
> }
>=20
> static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task =
*task)
> --=20
> 2.14.3
>=20
> --=20
> Trond Myklebust
> Linux NFS client maintainer, PrimaryData
> [email protected]
> =
N=C2=8B=C2=A7=C4=93=C3=A6=C4=97r=C4=BC=C2=9By=C3=BA=C4=8D=C2=9A=C3=98b=C4=93=
X=C5=BD=C4=B7=C4=AE=C2=A7v=C3=98^=C2=96)=C3=9E=C5=A1{.n=C4=AE+=C2=89=C2=B7=
=C4=A8=C2=8A{=C4=85=C2=9D=C3=BB"=C2=9E=C3=98^n=C2=87r=C4=84=C3=B6=C4=B6z=C3=
=8B=1A=C2=81=C3=ABh=C2=99=C4=BB=C4=8D=C2=AD=C3=9A&=C4=92=C3=B8=1E=C5=AAG=C5=
=A6=C2=9D=C3=A9h=C5=AA=03(=C2=AD=C3=A9=C2=9A=C2=8E=C2=8A=C3=9D=C4=92j"=C2=9D=
=C3=BA=1A=C4=B7=1Bm=C2=A7=C4=B8=C3=AF=C2=81=C4=99=C3=A4z=C4=91=C3=9E=C2=96=
=C2=8A=C4=81=C3=BEf=C4=A2=C4=92=C2=B7h=C2=9A=C2=88=C2=A7~=C2=88m

--
Chuck Lever
[email protected]




2017-12-03 20:24:57

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: Fix a race in the receive code path

T24gU3VuLCAyMDE3LTEyLTAzIGF0IDE1OjE5IC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g
PiBPbiBEZWMgMywgMjAxNywgYXQgMzoxMiBQTSwgVHJvbmQgTXlrbGVidXN0IDx0cm9uZG15QHBy
aW1hcnlkYXRhLmNvDQo+ID4gbT4gd3JvdGU6DQo+ID4gDQo+ID4gT24gU3VuLCAyMDE3LTEyLTAz
IGF0IDEzOjU0IC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4gPiA+ID4gT24gRGVjIDMsIDIw
MTcsIGF0IDE6NTAgUE0sIFRyb25kIE15a2xlYnVzdCA8dHJvbmQubXlrbGVidXN0QHByDQo+ID4g
PiA+IGltYXINCj4gPiA+ID4geWRhdGEuY29tPiB3cm90ZToNCj4gPiA+ID4gDQo+ID4gPiA+IFdl
IG11c3QgZW5zdXJlIHRoYXQgdGhlIGNhbGwgdG8gcnBjX3NsZWVwX29uKCkgaW4NCj4gPiA+ID4g
eHBydF90cmFuc21pdCgpDQo+ID4gPiA+IGNhbm5vdA0KPiA+ID4gPiByYWNlIHdpdGggdGhlIGNh
bGwgdG8geHBydF9jb21wbGV0ZV9ycXN0KCkuDQo+ID4gPiANCj4gPiA+IDotKCB0aGlzIHdpbGwg
a2lsbCBzY2FsYWJpbGl0eSwgd2UgbWlnaHQgYXMgd2VsbCBqdXN0IGdvIGJhY2sNCj4gPiA+IHRv
IHRoZSBvbGQgbG9ja2luZyBzY2hlbWUuDQo+ID4gDQo+ID4gSXQgc2hvdWxkbid0IG1ha2UgYSBo
dWdlIGRpZmZlcmVuY2UsIGJ1dCBJIGFncmVlIHRoYXQgd2UgZG8gd2FudCB0bw0KPiA+IGdldA0K
PiA+IHJpZCBvZiB0aGF0IHRyYW5zcG9ydCBsb2NrLg0KPiA+IA0KPiA+IEhvdyBhYm91dCB0aGUg
Zm9sbG93aW5nLCB0aGVuPw0KPiANCj4gVGhpcyBsb29rcyBiZXR0ZXIsIEknbGwgZ2l2ZSBpdCBh
IHRyeSENCj4gDQo+IEJ1dCB0b3VjaGluZyB0aGUgcmVjdl9sb2NrIGluIHRoZSB0cmFuc21pdCBw
YXRoIGlzIHdoYXQgSSdkIGxpa2UNCj4gdG8gYXZvaWQgY29tcGxldGVseSwgaWYgcG9zc2libGUu
IEknbSBub3QgY2xlYXIgb24gd2h5IHRoZSBwZW5kaW5nDQo+IHdhaXRxdWV1ZSBpcyB1bnByb3Rl
Y3RlZC4gRG9lc24ndCBpdCBoYXZlIGEgbG9jayBvZiBpdHMgb3duPw0KDQpUaGUgcmVjdl9sb2Nr
IGVuc3VyZXMgdGhhdCB0aGUgY2hlY2sgb2YgcmVxLT5ycV9yZXBseV9ieXRlc19yZWN2ZCBpcw0K
YXRvbWljIHdpdGggdGhlIGNhbGwgdG8gcnBjX3NsZWVwX29uKCkuDQoNCi0tIA0KVHJvbmQgTXlr
bGVidXN0DQpMaW51eCBORlMgY2xpZW50IG1haW50YWluZXIsIFByaW1hcnlEYXRhDQp0cm9uZC5t
eWtsZWJ1c3RAcHJpbWFyeWRhdGEuY29tDQo=


2017-12-03 23:33:44

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: Fix a race in the receive code path


> On Dec 3, 2017, at 3:24 PM, Trond Myklebust <[email protected]> =
wrote:
>=20
> On Sun, 2017-12-03 at 15:19 -0500, Chuck Lever wrote:
>>> On Dec 3, 2017, at 3:12 PM, Trond Myklebust <[email protected]
>>> m> wrote:
>>>=20
>>> On Sun, 2017-12-03 at 13:54 -0500, Chuck Lever wrote:
>>>>> On Dec 3, 2017, at 1:50 PM, Trond Myklebust <trond.myklebust@pr
>>>>> imar
>>>>> ydata.com> wrote:
>>>>>=20
>>>>> We must ensure that the call to rpc_sleep_on() in
>>>>> xprt_transmit()
>>>>> cannot
>>>>> race with the call to xprt_complete_rqst().
>>>>=20
>>>> :-( this will kill scalability, we might as well just go back
>>>> to the old locking scheme.
>>>=20
>>> It shouldn't make a huge difference, but I agree that we do want to
>>> get
>>> rid of that transport lock.
>>>=20
>>> How about the following, then?
>>=20
>> This looks better, I'll give it a try!

After testing for several hours, I was not able to reproduce the
lost RPC completion symptom.

I've been concerned about the performance impact. 8-thread dbench
throughput regresses 2-3% on my 56Gb/s network. There will probably
not be any noticeable degradation on a slower fabric.

Tested-by: Chuck Lever <[email protected]>


>>=20
>> But touching the recv_lock in the transmit path is what I'd like
>> to avoid completely, if possible. I'm not clear on why the pending
>> waitqueue is unprotected. Doesn't it have a lock of its own?
>=20
> The recv_lock ensures that the check of req->rq_reply_bytes_recvd is
> atomic with the call to rpc_sleep_on().

Ah, makes sense. I agree it is an oversight in ce7c252a8c7 not to
have made this change at the same time.

> From: Trond Myklebust <[email protected]>
> Date: Sun, 3 Dec 2017 13:37:27 -0500
> Subject: [PATCH v2] SUNRPC: Fix a race in the receive code path
>=20
> We must ensure that the call to rpc_sleep_on() in xprt_transmit() =
cannot
> race with the call to xprt_complete_rqst().

IMHO the patch description could mention rq_reply_bytes_recvd, which
is the data that is protected by the recv_lock. More bread crumbs
for our future selves...


> @@ -1048,19 +1049,22 @@ void xprt_transmit(struct rpc_task *task)
> xprt->stat.sending_u +=3D xprt->sending.qlen;
> xprt->stat.pending_u +=3D xprt->pending.qlen;
>=20
> - /* Don't race with disconnect */
> - if (!xprt_connected(xprt))
> - task->tk_status =3D -ENOTCONN;
> - else {
> + spin_unlock_bh(&xprt->transport_lock);
> +
> + if (!READ_ONCE(req->rq_reply_bytes_recvd) && =
rpc_reply_expected(task)) {

I was wondering about this optimization. It seems to provide about a
1% benefit in dbench throughput results in my setup, though it is
certainly not a common case that the reply has arrived at this point.


> + spin_lock(&xprt->recv_lock);
> /*
> * Sleep on the pending queue since
> * we're expecting a reply.
> */
> - if (!req->rq_reply_bytes_recvd && =
rpc_reply_expected(task))
> + if (!req->rq_reply_bytes_recvd) {
> rpc_sleep_on(&xprt->pending, task, xprt_timer);
> - req->rq_connect_cookie =3D xprt->connect_cookie;
> + /* Deal with disconnect races */
> + if (!xprt_connected(xprt))
> + xprt_wake_pending_tasks(xprt, =
-ENOTCONN);

Here, the !xprt_connected test is no longer done unconditionally,
but only when the task has to be put to sleep. Is that a 100% safe
change?

I'm also wondering if this check can be omitted. Won't disconnect
wait until xprt_release_xprt ?

Otherwise if this check is still needed, the new comment could be
a little more explanatory :-)


> + }
> + spin_unlock(&xprt->recv_lock);
> }
> - spin_unlock_bh(&xprt->transport_lock);
> }


Reviewed-by: Chuck Lever <[email protected]>

And thanks for your quick response!


--
Chuck Lever




2017-12-04 00:00:11

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: Fix a race in the receive code path

T24gU3VuLCAyMDE3LTEyLTAzIGF0IDE4OjMzIC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g
PiBPbiBEZWMgMywgMjAxNywgYXQgMzoyNCBQTSwgVHJvbmQgTXlrbGVidXN0IDx0cm9uZG15QHBy
aW1hcnlkYXRhLmNvDQo+ID4gbT4gd3JvdGU6DQo+ID4gDQo+ID4gT24gU3VuLCAyMDE3LTEyLTAz
IGF0IDE1OjE5IC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4gPiA+ID4gT24gRGVjIDMsIDIw
MTcsIGF0IDM6MTIgUE0sIFRyb25kIE15a2xlYnVzdCA8dHJvbmRteUBwcmltYXJ5ZGF0DQo+ID4g
PiA+IGEuY28NCj4gPiA+ID4gbT4gd3JvdGU6DQo+ID4gPiA+IA0KPiA+ID4gPiBPbiBTdW4sIDIw
MTctMTItMDMgYXQgMTM6NTQgLTA1MDAsIENodWNrIExldmVyIHdyb3RlOg0KPiA+ID4gPiA+ID4g
T24gRGVjIDMsIDIwMTcsIGF0IDE6NTAgUE0sIFRyb25kIE15a2xlYnVzdCA8dHJvbmQubXlrbGVi
dXMNCj4gPiA+ID4gPiA+IHRAcHINCj4gPiA+ID4gPiA+IGltYXINCj4gPiA+ID4gPiA+IHlkYXRh
LmNvbT4gd3JvdGU6DQo+ID4gPiA+ID4gPiANCj4gPiA+ID4gPiA+IFdlIG11c3QgZW5zdXJlIHRo
YXQgdGhlIGNhbGwgdG8gcnBjX3NsZWVwX29uKCkgaW4NCj4gPiA+ID4gPiA+IHhwcnRfdHJhbnNt
aXQoKQ0KPiA+ID4gPiA+ID4gY2Fubm90DQo+ID4gPiA+ID4gPiByYWNlIHdpdGggdGhlIGNhbGwg
dG8geHBydF9jb21wbGV0ZV9ycXN0KCkuDQo+ID4gPiA+ID4gDQo+ID4gPiA+ID4gOi0oIHRoaXMg
d2lsbCBraWxsIHNjYWxhYmlsaXR5LCB3ZSBtaWdodCBhcyB3ZWxsIGp1c3QgZ28gYmFjaw0KPiA+
ID4gPiA+IHRvIHRoZSBvbGQgbG9ja2luZyBzY2hlbWUuDQo+ID4gPiA+IA0KPiA+ID4gPiBJdCBz
aG91bGRuJ3QgbWFrZSBhIGh1Z2UgZGlmZmVyZW5jZSwgYnV0IEkgYWdyZWUgdGhhdCB3ZSBkbw0K
PiA+ID4gPiB3YW50IHRvDQo+ID4gPiA+IGdldA0KPiA+ID4gPiByaWQgb2YgdGhhdCB0cmFuc3Bv
cnQgbG9jay4NCj4gPiA+ID4gDQo+ID4gPiA+IEhvdyBhYm91dCB0aGUgZm9sbG93aW5nLCB0aGVu
Pw0KPiA+ID4gDQo+ID4gPiBUaGlzIGxvb2tzIGJldHRlciwgSSdsbCBnaXZlIGl0IGEgdHJ5IQ0K
PiANCj4gQWZ0ZXIgdGVzdGluZyBmb3Igc2V2ZXJhbCBob3VycywgSSB3YXMgbm90IGFibGUgdG8g
cmVwcm9kdWNlIHRoZQ0KPiBsb3N0IFJQQyBjb21wbGV0aW9uIHN5bXB0b20uDQo+IA0KPiBJJ3Zl
IGJlZW4gY29uY2VybmVkIGFib3V0IHRoZSBwZXJmb3JtYW5jZSBpbXBhY3QuIDgtdGhyZWFkIGRi
ZW5jaA0KPiB0aHJvdWdocHV0IHJlZ3Jlc3NlcyAyLTMlIG9uIG15IDU2R2IvcyBuZXR3b3JrLiBU
aGVyZSB3aWxsIHByb2JhYmx5DQo+IG5vdCBiZSBhbnkgbm90aWNlYWJsZSBkZWdyYWRhdGlvbiBv
biBhIHNsb3dlciBmYWJyaWMuDQo+IA0KPiBUZXN0ZWQtYnk6IENodWNrIExldmVyIDxjaHVjay5s
ZXZlckBvcmFjbGUuY29tPg0KPiANCj4gDQo+ID4gPiANCj4gPiA+IEJ1dCB0b3VjaGluZyB0aGUg
cmVjdl9sb2NrIGluIHRoZSB0cmFuc21pdCBwYXRoIGlzIHdoYXQgSSdkIGxpa2UNCj4gPiA+IHRv
IGF2b2lkIGNvbXBsZXRlbHksIGlmIHBvc3NpYmxlLiBJJ20gbm90IGNsZWFyIG9uIHdoeSB0aGUN
Cj4gPiA+IHBlbmRpbmcNCj4gPiA+IHdhaXRxdWV1ZSBpcyB1bnByb3RlY3RlZC4gRG9lc24ndCBp
dCBoYXZlIGEgbG9jayBvZiBpdHMgb3duPw0KPiA+IA0KPiA+IFRoZSByZWN2X2xvY2sgZW5zdXJl
cyB0aGF0IHRoZSBjaGVjayBvZiByZXEtPnJxX3JlcGx5X2J5dGVzX3JlY3ZkDQo+ID4gaXMNCj4g
PiBhdG9taWMgd2l0aCB0aGUgY2FsbCB0byBycGNfc2xlZXBfb24oKS4NCj4gDQo+IEFoLCBtYWtl
cyBzZW5zZS4gSSBhZ3JlZSBpdCBpcyBhbiBvdmVyc2lnaHQgaW4gY2U3YzI1MmE4Yzcgbm90IHRv
DQo+IGhhdmUgbWFkZSB0aGlzIGNoYW5nZSBhdCB0aGUgc2FtZSB0aW1lLg0KPiANCj4gPiBGcm9t
OiBUcm9uZCBNeWtsZWJ1c3QgPHRyb25kLm15a2xlYnVzdEBwcmltYXJ5ZGF0YS5jb20+DQo+ID4g
RGF0ZTogU3VuLCAzIERlYyAyMDE3IDEzOjM3OjI3IC0wNTAwDQo+ID4gU3ViamVjdDogW1BBVENI
IHYyXSBTVU5SUEM6IEZpeCBhIHJhY2UgaW4gdGhlIHJlY2VpdmUgY29kZSBwYXRoDQo+ID4gDQo+
ID4gV2UgbXVzdCBlbnN1cmUgdGhhdCB0aGUgY2FsbCB0byBycGNfc2xlZXBfb24oKSBpbiB4cHJ0
X3RyYW5zbWl0KCkNCj4gPiBjYW5ub3QNCj4gPiByYWNlIHdpdGggdGhlIGNhbGwgdG8geHBydF9j
b21wbGV0ZV9ycXN0KCkuDQo+IA0KPiBJTUhPIHRoZSBwYXRjaCBkZXNjcmlwdGlvbiBjb3VsZCBt
ZW50aW9uIHJxX3JlcGx5X2J5dGVzX3JlY3ZkLCB3aGljaA0KPiBpcyB0aGUgZGF0YSB0aGF0IGlz
IHByb3RlY3RlZCBieSB0aGUgcmVjdl9sb2NrLiBNb3JlIGJyZWFkIGNydW1icw0KPiBmb3Igb3Vy
IGZ1dHVyZSBzZWx2ZXMuLi4NCj4gDQpJJ2xsIHB1dCBhIGNvbW1lbnQgaW4gdGhlIGNvZGUgaXRz
ZWxmLiBUaGF0IG1ha2VzIHRoZSBjb2RlIGVhc2llciB0bw0KcmV2aWV3Lg0KDQo+IA0KPiA+IEBA
IC0xMDQ4LDE5ICsxMDQ5LDIyIEBAIHZvaWQgeHBydF90cmFuc21pdChzdHJ1Y3QgcnBjX3Rhc2sg
KnRhc2spDQo+ID4gCXhwcnQtPnN0YXQuc2VuZGluZ191ICs9IHhwcnQtPnNlbmRpbmcucWxlbjsN
Cj4gPiAJeHBydC0+c3RhdC5wZW5kaW5nX3UgKz0geHBydC0+cGVuZGluZy5xbGVuOw0KPiA+IA0K
PiA+IC0JLyogRG9uJ3QgcmFjZSB3aXRoIGRpc2Nvbm5lY3QgKi8NCj4gPiAtCWlmICgheHBydF9j
b25uZWN0ZWQoeHBydCkpDQo+ID4gLQkJdGFzay0+dGtfc3RhdHVzID0gLUVOT1RDT05OOw0KPiA+
IC0JZWxzZSB7DQo+ID4gKwlzcGluX3VubG9ja19iaCgmeHBydC0+dHJhbnNwb3J0X2xvY2spOw0K
PiA+ICsNCj4gPiArCWlmICghUkVBRF9PTkNFKHJlcS0+cnFfcmVwbHlfYnl0ZXNfcmVjdmQpICYm
DQo+ID4gcnBjX3JlcGx5X2V4cGVjdGVkKHRhc2spKSB7DQo+IA0KPiBJIHdhcyB3b25kZXJpbmcg
YWJvdXQgdGhpcyBvcHRpbWl6YXRpb24uIEl0IHNlZW1zIHRvIHByb3ZpZGUgYWJvdXQgYQ0KPiAx
JSBiZW5lZml0IGluIGRiZW5jaCB0aHJvdWdocHV0IHJlc3VsdHMgaW4gbXkgc2V0dXAsIHRob3Vn
aCBpdCBpcw0KPiBjZXJ0YWlubHkgbm90IGEgY29tbW9uIGNhc2UgdGhhdCB0aGUgcmVwbHkgaGFz
IGFycml2ZWQgYXQgdGhpcyBwb2ludC4NCg0KSXQncyBub3QgdG9vIGxpa2VseSBhbiBvY2N1cnJl
bmNlIGZvciBtb3N0IHNldHVwcywgYnV0IGl0IGNvc3RzIGxpdHRsZQ0KdG8gZG8gdGhhdCBjaGVj
aywgYW5kIGl0IGRvZXMgbWVhbiB0aGF0IHdlIGNhbiBvcHRpbWlzZSBhd2F5IHRoZSBjYXNlDQpv
ZiBSUEMgY2FsbHMgdGhhdCBleHBlY3Qgbm8gcmVwbHkuDQoNCj4gPiArCQlzcGluX2xvY2soJnhw
cnQtPnJlY3ZfbG9jayk7DQo+ID4gCQkvKg0KPiA+IAkJICogU2xlZXAgb24gdGhlIHBlbmRpbmcg
cXVldWUgc2luY2UNCj4gPiAJCSAqIHdlJ3JlIGV4cGVjdGluZyBhIHJlcGx5Lg0KPiA+IAkJICov
DQo+ID4gLQkJaWYgKCFyZXEtPnJxX3JlcGx5X2J5dGVzX3JlY3ZkICYmDQo+ID4gcnBjX3JlcGx5
X2V4cGVjdGVkKHRhc2spKQ0KPiA+ICsJCWlmICghcmVxLT5ycV9yZXBseV9ieXRlc19yZWN2ZCkg
ew0KPiA+IAkJCXJwY19zbGVlcF9vbigmeHBydC0+cGVuZGluZywgdGFzaywgeHBydF90aW1lcik7
DQo+ID4gLQkJcmVxLT5ycV9jb25uZWN0X2Nvb2tpZSA9IHhwcnQtPmNvbm5lY3RfY29va2llOw0K
PiA+ICsJCQkvKiBEZWFsIHdpdGggZGlzY29ubmVjdCByYWNlcyAqLw0KPiA+ICsJCQlpZiAoIXhw
cnRfY29ubmVjdGVkKHhwcnQpKQ0KPiA+ICsJCQkJeHBydF93YWtlX3BlbmRpbmdfdGFza3MoeHBy
dCwNCj4gPiAtRU5PVENPTk4pOw0KPiANCj4gSGVyZSwgdGhlICF4cHJ0X2Nvbm5lY3RlZCB0ZXN0
IGlzIG5vIGxvbmdlciBkb25lIHVuY29uZGl0aW9uYWxseSwNCj4gYnV0IG9ubHkgd2hlbiB0aGUg
dGFzayBoYXMgdG8gYmUgcHV0IHRvIHNsZWVwLiBJcyB0aGF0IGEgMTAwJSBzYWZlDQo+IGNoYW5n
ZT8NCg0KWWVzLiBXZSBvbmx5IGNhcmUgYWJvdXQgdGhlIHJhY2UgY2FzZSBmb3IgdGhpcyBwYXJ0
aWN1bGFyIFJQQyBjYWxsLiBBbnkNCm90aGVyIFJQQ3MgdGhhdCBhcmUgYWxyZWFkeSB3YWl0aW5n
IG9uIHhwcnQtPnBlbmRpbmcgd2lsbCBiZSB3b2tlbiBieQ0KdGhlIHRyYW5zcG9ydCBsYXllciBp
dHNlbGYuDQoNCj4gSSdtIGFsc28gd29uZGVyaW5nIGlmIHRoaXMgY2hlY2sgY2FuIGJlIG9taXR0
ZWQuIFdvbid0IGRpc2Nvbm5lY3QNCj4gd2FpdCB1bnRpbCB4cHJ0X3JlbGVhc2VfeHBydCA/DQoN
CklmIHRoZXJlIGFyZSBubyBvdGhlciBSUEMgY2FsbHMgcGVuZGluZyB0byBkcml2ZSBhIHJlY29u
bmVjdCwgdGhlbiB3ZQ0KY2FuIGVuZCB1cCB3YWl0aW5nIGZvciB0aGlzIFJQQyBjYWxsIHRvIHRp
bWUgb3V0IGlmIHdlIHJhY2UgaGVyZS4NCg0KPiANCj4gT3RoZXJ3aXNlIGlmIHRoaXMgY2hlY2sg
aXMgc3RpbGwgbmVlZGVkLCB0aGUgbmV3IGNvbW1lbnQgY291bGQgYmUNCj4gYSBsaXR0bGUgbW9y
ZSBleHBsYW5hdG9yeSA6LSkNCj4gDQo+IA0KPiA+ICsJCX0NCj4gPiArCQlzcGluX3VubG9jaygm
eHBydC0+cmVjdl9sb2NrKTsNCj4gPiAJfQ0KPiA+IC0Jc3Bpbl91bmxvY2tfYmgoJnhwcnQtPnRy
YW5zcG9ydF9sb2NrKTsNCj4gPiB9DQo+IA0KPiANCj4gUmV2aWV3ZWQtYnk6IENodWNrIExldmVy
IDxjaHVjay5sZXZlckBvcmFjbGUuY29tPg0KPiANCj4gQW5kIHRoYW5rcyBmb3IgeW91ciBxdWlj
ayByZXNwb25zZSENCj4gDQo+IA0KPiAtLQ0KPiBDaHVjayBMZXZlcg0KPiANCj4gDQo+IA0KLS0g
DQpUcm9uZCBNeWtsZWJ1c3QNCkxpbnV4IE5GUyBjbGllbnQgbWFpbnRhaW5lciwgUHJpbWFyeURh
dGENCnRyb25kLm15a2xlYnVzdEBwcmltYXJ5ZGF0YS5jb20NCg==