Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp609495ybi; Fri, 31 May 2019 06:26:26 -0700 (PDT) X-Google-Smtp-Source: APXvYqx72yTVpMpgNkUtGGkmXi3QB/bPYVXnGS2xmbDJ8tnteA8d4qAszII2fjd1l8/pWNUpZ/Zs X-Received: by 2002:a17:902:6b03:: with SMTP id o3mr9473291plk.85.1559309186202; Fri, 31 May 2019 06:26:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1559309186; cv=none; d=google.com; s=arc-20160816; b=Vz1gEREZd3Zs7zJn/kRkVYt4k9W3sse049XJ2IA2zLSCBjXngutnEv/9F5e0HZyVJV Q2BcGeKfnCjylp/iU85kpZzmjaNX1UMwX+5AWaG75mhd6zX7ZcBXt430cDPx12xWTExW mmIrzDYL8ZvdnsW+RsLaXd0RDEsgTp352gPQjZQbhL43zDjekJEmjF3F3afMqyHL3TE9 2XvB6Li4/oRYsQIeLz6XX5D6JbU7jax+yq7aroyBhSJS4441E+uWEWf1R3Ut28iWCGK6 QKKlkm1i7au1zKOtGLdgkFx3G7O6gI2HCdeznPGut2v+3aRvdhPGswmaYuXmjwXABEvo iOeg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:to:references:message-id :content-transfer-encoding:cc:date:in-reply-to:from:subject :mime-version:dkim-signature; bh=UdQQq8ASrjTgUSxtshe8VsnndBb0QfrtynyMYUL+kFk=; b=Bd6x2OmW7p2UQ3tfMA5tSvrb/s1WS14uf0x2/7MsR15ONHs+CILYWnbEi75Li4qRV/ QiMQvwxJHFXXG8Vr4ufiTK6je6cGHT9xI5cvqh8GQmAiqQPU28O8NxcO30v5SX90gUVd 9O3LSJLrEYrtXvlmEB1ZyQ3pUY40Gu2v9YyS3hc4gTTABCk9frY6/GLtkMAAJG8WjqyU 9Yc4qN6JXaGQpGFHKb1OHD6cOz6wuXdsb5xipOtlYgNlfLZ+VO8eyEJa8Wyy+1rxxWQX PvXMrsvmrQwFZYlYrH2vlDXdenvwkjtyFuJO9jfcfEdX3Z0hceAVpO5gWmsb85EUaOh5 hwpA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2018-07-02 header.b=oHhXuA5Y; spf=pass (google.com: best guess record for domain of linux-nfs-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id w36si6271931pgl.540.2019.05.31.06.25.55; Fri, 31 May 2019 06:26:26 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-nfs-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2018-07-02 header.b=oHhXuA5Y; spf=pass (google.com: best guess record for domain of linux-nfs-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726512AbfEaNZx (ORCPT + 99 others); Fri, 31 May 2019 09:25:53 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:48742 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726037AbfEaNZx (ORCPT ); Fri, 31 May 2019 09:25:53 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4VDNIA0041124; Fri, 31 May 2019 13:25:19 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=content-type : mime-version : subject : from : in-reply-to : date : cc : content-transfer-encoding : message-id : references : to; s=corp-2018-07-02; bh=UdQQq8ASrjTgUSxtshe8VsnndBb0QfrtynyMYUL+kFk=; b=oHhXuA5Yh7LlqyJfgDZ+ynvBNaS0Ejq8zEUCNsXBVd4ADmF15a6Ia9G02FotG0Zl5/CE HAQPBybn47FLxrDw6aHvyil41C1IVLZjrqSbwe7qSvfYpPP1maAMmZXLIYg08H5mR/RK J/lMl4RPpQaRggKpa2ng5PAKzmsEMsbI/qkELGAt9aY8Yh2Wllh6129G+foFvIRdeq2S TKDF2bahAZo2IB20Xj0xG26Ei2LG8aXiC9ZsDfNlLIVnHtopOPsO1vW6/0lmxvIe6GzY 2lkqlrCyO7FxsxeM+Cdep3DnZjA8qIXLiDtsBaArA6jDfvqKACUlR6pYzhsEybrK052B yg== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by userp2130.oracle.com with ESMTP id 2spw4tx7cn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 31 May 2019 13:25:18 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4VDO3r3087358; Fri, 31 May 2019 13:25:18 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by aserp3030.oracle.com with ESMTP id 2su3y494eu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 31 May 2019 13:25:17 +0000 Received: from abhmp0019.oracle.com (abhmp0019.oracle.com [141.146.116.25]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x4VDPCN0013944; Fri, 31 May 2019 13:25:12 GMT Received: from anon-dhcp-171.1015granger.net (/68.61.232.219) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Fri, 31 May 2019 06:25:12 -0700 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\)) Subject: Re: [PATCH 3/3] SUNRPC: Count ops completing with tk_status < 0 From: Chuck Lever In-Reply-To: <20190530223314.GA25368@fieldses.org> Date: Fri, 31 May 2019 09:25:11 -0400 Cc: Dave Wysochanski , Linux NFS Mailing List Content-Transfer-Encoding: quoted-printable Message-Id: References: <20190523201351.12232-1-dwysocha@redhat.com> <20190523201351.12232-3-dwysocha@redhat.com> <20190530213857.GA24802@fieldses.org> <9B9F0C9B-C493-44F5-ABD1-6B2B4BAA2F08@oracle.com> <20190530223314.GA25368@fieldses.org> To: Bruce Fields X-Mailer: Apple Mail (2.3445.104.11) X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9273 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905310086 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9273 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905310086 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org > On May 30, 2019, at 6:33 PM, Bruce Fields = wrote: >=20 > On Thu, May 30, 2019 at 06:19:54PM -0400, Chuck Lever wrote: >>=20 >>=20 >>> On May 30, 2019, at 5:38 PM, bfields@fieldses.org wrote: >>>=20 >>> On Thu, May 23, 2019 at 04:13:50PM -0400, Dave Wysochanski wrote: >>>> We often see various error conditions with NFS4.x that show up with >>>> a very high operation count all completing with tk_status < 0 in a >>>> short period of time. Add a count to rpc_iostats to record on a >>>> per-op basis the ops that complete in this manner, which will >>>> enable lower overhead diagnostics. >>>=20 >>> Looks like a good idea to me. >>>=20 >>> It's too bad we can't distinguish the errors. (E.g. ESTALE on a = lookup >>> call is a lot more interesting than ENOENT.) But understood that >>> maintaining (number of possible errors) * (number of possible ops) >>> counters is probably overkill, so just counting the number of errors >>> seems like a good start. >>=20 >> We now have trace points that can do that too. >=20 > You mean, that can report every error (and its value)? Yes, the nfs_xdr_status trace point reports the error by value and = symbolic name. > I assume having these statistics in mountstats is still useful, = though. >=20 > --b. >=20 >>=20 >>=20 >>> --b. >>>=20 >>>>=20 >>>> Signed-off-by: Dave Wysochanski >>>> --- >>>> include/linux/sunrpc/metrics.h | 7 ++++++- >>>> net/sunrpc/stats.c | 8 ++++++-- >>>> 2 files changed, 12 insertions(+), 3 deletions(-) >>>>=20 >>>> diff --git a/include/linux/sunrpc/metrics.h = b/include/linux/sunrpc/metrics.h >>>> index 1b3751327575..0ee3f7052846 100644 >>>> --- a/include/linux/sunrpc/metrics.h >>>> +++ b/include/linux/sunrpc/metrics.h >>>> @@ -30,7 +30,7 @@ >>>> #include >>>> #include >>>>=20 >>>> -#define RPC_IOSTATS_VERS "1.0" >>>> +#define RPC_IOSTATS_VERS "1.1" >>>>=20 >>>> struct rpc_iostats { >>>> spinlock_t om_lock; >>>> @@ -66,6 +66,11 @@ struct rpc_iostats { >>>> ktime_t om_queue, /* queued for xmit */ >>>> om_rtt, /* RPC RTT */ >>>> om_execute; /* RPC execution */ >>>> + /* >>>> + * The count of operations that complete with tk_status < 0. >>>> + * These statuses usually indicate error conditions. >>>> + */ >>>> + unsigned long om_error_status; >>>> } ____cacheline_aligned; >>>>=20 >>>> struct rpc_task; >>>> diff --git a/net/sunrpc/stats.c b/net/sunrpc/stats.c >>>> index 8b2d3c58ffae..737414247ca7 100644 >>>> --- a/net/sunrpc/stats.c >>>> +++ b/net/sunrpc/stats.c >>>> @@ -176,6 +176,8 @@ void rpc_count_iostats_metrics(const struct = rpc_task *task, >>>>=20 >>>> execute =3D ktime_sub(now, task->tk_start); >>>> op_metrics->om_execute =3D ktime_add(op_metrics->om_execute, = execute); >>>> + if (task->tk_status < 0) >>>> + op_metrics->om_error_status++; >>>>=20 >>>> spin_unlock(&op_metrics->om_lock); >>>>=20 >>>> @@ -218,13 +220,14 @@ static void _add_rpc_iostats(struct = rpc_iostats *a, struct rpc_iostats *b) >>>> a->om_queue =3D ktime_add(a->om_queue, b->om_queue); >>>> a->om_rtt =3D ktime_add(a->om_rtt, b->om_rtt); >>>> a->om_execute =3D ktime_add(a->om_execute, b->om_execute); >>>> + a->om_error_status +=3D b->om_error_status; >>>> } >>>>=20 >>>> static void _print_rpc_iostats(struct seq_file *seq, struct = rpc_iostats *stats, >>>> int op, const struct rpc_procinfo *procs) >>>> { >>>> _print_name(seq, op, procs); >>>> - seq_printf(seq, "%lu %lu %lu %llu %llu %llu %llu %llu\n", >>>> + seq_printf(seq, "%lu %lu %lu %llu %llu %llu %llu %llu %lu\n", >>>> stats->om_ops, >>>> stats->om_ntrans, >>>> stats->om_timeouts, >>>> @@ -232,7 +235,8 @@ static void _print_rpc_iostats(struct seq_file = *seq, struct rpc_iostats *stats, >>>> stats->om_bytes_recv, >>>> ktime_to_ms(stats->om_queue), >>>> ktime_to_ms(stats->om_rtt), >>>> - ktime_to_ms(stats->om_execute)); >>>> + ktime_to_ms(stats->om_execute), >>>> + stats->om_error_status); >>>> } >>>>=20 >>>> void rpc_clnt_show_stats(struct seq_file *seq, struct rpc_clnt = *clnt) >>>> --=20 >>>> 2.20.1 >>=20 >> -- >> Chuck Lever -- Chuck Lever