Just happened again. Any thoughts about where I should start looking?
Mon Apr 15 11:01:40 EDT 2019
4k100test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1024
...
fio-3.1
Starting 12 processes
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
4k100test: Laying out IO file (1 file / 1024MiB)
fio: native_fallocate call failed: Operation not supported
fio: io_u error on file 4k100test.7.0: Invalid slot: read offset=938229760, buflen=4096
fio: pid=2653, err=57/file:io_u.c:1756, func=io_u error, error=Invalid slot
fio: io_u error on file 4k100test.10.0: Invalid slot: read offset=331800576, buflen=40963s]
fio: pid=2650, err=57/file:io_u.c:1756, func=io_u error, error=Invalid slot
Jobs: 10 (f=10): [r(2),X(1),r(2),X(1),r(6)][100.0%][r=617MiB/s,w=0KiB/s][r=158k,w=0 IOPS][eta 00m:00s]
4k100test: (groupid=0, jobs=12): err=57 (file:io_u.c:1756, func=io_u error, error=Invalid slot): pid=2648: Mon Apr 15 11:02:25 2019
read: IOPS=157k, BW=613MiB/s (643MB/s)(11.4GiB/19024msec)
slat (usec): min=2, max=9808, avg= 7.36, stdev=30.80
clat (usec): min=43, max=1236.1k, avg=74443.93, stdev=85310.03
lat (usec): min=134, max=1236.1k, avg=74451.53, stdev=85309.77
clat percentiles (usec):
| 1.00th=[ 807], 5.00th=[ 1287], 10.00th=[ 1647], 20.00th=[ 2376],
| 30.00th=[ 3490], 40.00th=[ 8160], 50.00th=[ 69731], 60.00th=[ 76022],
| 70.00th=[ 80217], 80.00th=[139461], 90.00th=[181404], 95.00th=[233833],
| 99.00th=[375391], 99.50th=[438305], 99.90th=[549454], 99.95th=[616563],
| 99.99th=[750781]
bw ( KiB/s): min=46941, max=70036, per=8.73%, avg=54819.99, stdev=4131.49, samples=430
iops : min=11735, max=17509, avg=13704.80, stdev=1032.88, samples=430
lat (usec) : 50=0.01%, 100=0.01%, 250=0.01%, 500=0.14%, 750=0.63%
lat (usec) : 1000=1.48%
lat (msec) : 2=12.84%, 4=17.70%, 10=7.87%, 20=0.77%, 50=0.21%
lat (msec) : 100=34.04%, 250=20.30%, 500=3.82%, 750=0.20%, 1000=0.01%
lat (msec) : 2000=0.01%
cpu : usr=2.04%, sys=7.06%, ctx=698855, majf=0, minf=6299
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwt: total=2986716,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1024
Run status group 0 (all jobs):
READ: bw=613MiB/s (643MB/s), 613MiB/s-613MiB/s (643MB/s-643MB/s), io=11.4GiB (12.2GB), run=19024-19024msec
--
Chuck Lever
Hi Chuck,
On Mon, 2019-04-15 at 11:04 -0400, Chuck Lever wrote:
> Just happened again. Any thoughts about where I should start looking?
>
> Mon Apr 15 11:01:40 EDT 2019
> 4k100test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B,
> (T) 4096B-4096B, ioengine=libaio, iodepth=1024
> ...
> fio-3.1
> Starting 12 processes
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> fio: io_u error on file 4k100test.7.0: Invalid slot: read
> offset=938229760, buflen=4096
Does the following patch fix the race?
8<--------------------------------------
From 4c8759eafad9bb7ea2626a53296e30618aeefcc7 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Mon, 15 Apr 2019 11:54:13 -0400
Subject: [PATCH] SUNRPC: Ignore queue transmission errors on successful
transmission
If a request transmission fails due to write space or slot unavailability
errors, but the queued task then gets transmitted before it has time to
process the error in call_transmit_status() or call_bc_transmit_status(),
we need to suppress the transmission error code to prevent it from leaking
out of the RPC layer.
Reported-by: Chuck Lever <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/clnt.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index fa900bb44cd5..369a2648dafc 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -2101,8 +2101,8 @@ call_transmit_status(struct rpc_task *task)
* test first.
*/
if (rpc_task_transmitted(task)) {
- if (task->tk_status == 0)
- xprt_request_wait_receive(task);
+ task->tk_status = 0;
+ xprt_request_wait_receive(task);
return;
}
@@ -2187,6 +2187,9 @@ call_bc_transmit_status(struct rpc_task *task)
{
struct rpc_rqst *req = task->tk_rqstp;
+ if (rpc_task_transmitted(task))
+ task->tk_status = 0;
+
dprint_status(task);
switch (task->tk_status) {
--
2.20.1
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
> On Apr 15, 2019, at 12:05 PM, Trond Myklebust <[email protected]> wrote:
>
> Hi Chuck,
>
>
> On Mon, 2019-04-15 at 11:04 -0400, Chuck Lever wrote:
>> Just happened again. Any thoughts about where I should start looking?
>>
>> Mon Apr 15 11:01:40 EDT 2019
>> 4k100test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B,
>> (T) 4096B-4096B, ioengine=libaio, iodepth=1024
>> ...
>> fio-3.1
>> Starting 12 processes
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> fio: io_u error on file 4k100test.7.0: Invalid slot: read
>> offset=938229760, buflen=4096
>
> Does the following patch fix the race?
>
> 8<--------------------------------------
> From 4c8759eafad9bb7ea2626a53296e30618aeefcc7 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Mon, 15 Apr 2019 11:54:13 -0400
> Subject: [PATCH] SUNRPC: Ignore queue transmission errors on successful
> transmission
>
> If a request transmission fails due to write space or slot unavailability
> errors, but the queued task then gets transmitted before it has time to
> process the error in call_transmit_status() or call_bc_transmit_status(),
> we need to suppress the transmission error code to prevent it from leaking
> out of the RPC layer.
>
> Reported-by: Chuck Lever <[email protected]>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/clnt.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index fa900bb44cd5..369a2648dafc 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -2101,8 +2101,8 @@ call_transmit_status(struct rpc_task *task)
> * test first.
> */
> if (rpc_task_transmitted(task)) {
> - if (task->tk_status == 0)
> - xprt_request_wait_receive(task);
> + task->tk_status = 0;
> + xprt_request_wait_receive(task);
> return;
> }
>
> @@ -2187,6 +2187,9 @@ call_bc_transmit_status(struct rpc_task *task)
> {
> struct rpc_rqst *req = task->tk_rqstp;
>
> + if (rpc_task_transmitted(task))
> + task->tk_status = 0;
> +
> dprint_status(task);
>
> switch (task->tk_status) {
I was about to try something like this.
I don't have a 100% reproducer. I will apply your patch and
wait for the problem to appear over the next few days.
--
Chuck Lever
On Mon, 2019-04-15 at 12:11 -0400, Chuck Lever wrote:
> >
> I was about to try something like this.
>
So, the reason for that approach is that I believe we can currently
leak ENOBUFS and possibly other errors in addition to EBADSLT through
the same race.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
> On Apr 15, 2019, at 12:05 PM, Trond Myklebust <[email protected]> wrote:
>
> Hi Chuck,
>
>
> On Mon, 2019-04-15 at 11:04 -0400, Chuck Lever wrote:
>> Just happened again. Any thoughts about where I should start looking?
>>
>> Mon Apr 15 11:01:40 EDT 2019
>> 4k100test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B,
>> (T) 4096B-4096B, ioengine=libaio, iodepth=1024
>> ...
>> fio-3.1
>> Starting 12 processes
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> 4k100test: Laying out IO file (1 file / 1024MiB)
>> fio: native_fallocate call failed: Operation not supported
>> fio: io_u error on file 4k100test.7.0: Invalid slot: read
>> offset=938229760, buflen=4096
>
> Does the following patch fix the race?
>
> 8<--------------------------------------
> From 4c8759eafad9bb7ea2626a53296e30618aeefcc7 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Mon, 15 Apr 2019 11:54:13 -0400
> Subject: [PATCH] SUNRPC: Ignore queue transmission errors on successful
> transmission
>
> If a request transmission fails due to write space or slot unavailability
> errors, but the queued task then gets transmitted before it has time to
> process the error in call_transmit_status() or call_bc_transmit_status(),
> we need to suppress the transmission error code to prevent it from leaking
> out of the RPC layer.
>
> Reported-by: Chuck Lever <[email protected]>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/clnt.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index fa900bb44cd5..369a2648dafc 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -2101,8 +2101,8 @@ call_transmit_status(struct rpc_task *task)
> * test first.
> */
> if (rpc_task_transmitted(task)) {
> - if (task->tk_status == 0)
> - xprt_request_wait_receive(task);
> + task->tk_status = 0;
> + xprt_request_wait_receive(task);
> return;
> }
>
> @@ -2187,6 +2187,9 @@ call_bc_transmit_status(struct rpc_task *task)
> {
> struct rpc_rqst *req = task->tk_rqstp;
>
> + if (rpc_task_transmitted(task))
> + task->tk_status = 0;
> +
> dprint_status(task);
>
> switch (task->tk_status) {
> --
> 2.20.1
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]
I haven't been able to reproduce the problem.
Tested-by: Chuck Lever <[email protected]>
--
Chuck Lever