From: "NeilBrown"
To: "Chuck Lever"
Cc: "Lorenzo Bianconi", linux-nfs@vger.kernel.org, lorenzo.bianconi@redhat.com, jlayton@kernel.org
Subject: Re: [PATCH v6 3/3] NFSD: add rpc_status entry in nfsd debug filesystem
In-reply-to:
References: <177cfd4fc640d9b406101761be24da07990ca81e.1691656474.git.lorenzo@kernel.org>
Date: Fri, 11 Aug 2023 06:24:39 +1000
Message-id: <169169907976.11073.6029761322750936330@noble.neil.brown.name>
On Fri, 11 Aug 2023, Chuck Lever wrote:
> On Thu, Aug 10, 2023 at 10:39:21AM +0200, Lorenzo Bianconi wrote:
> > Introduce rpc_status entry in nfsd debug filesystem in order to dump
> > pending RPC requests debugging information.
> >
> > Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=366
> > Reviewed-by: NeilBrown
> > Reviewed-by: Jeff Layton
> > Signed-off-by: Lorenzo Bianconi
> > ---
> >  fs/nfsd/nfs4proc.c         |   4 +-
> >  fs/nfsd/nfsctl.c           |   9 +++
> >  fs/nfsd/nfsd.h             |   7 ++
> >  fs/nfsd/nfssvc.c           | 140 +++++++++++++++++++++++++++++++++++++
> >  include/linux/sunrpc/svc.h |   1 +
> >  net/sunrpc/svc.c           |   2 +-
> >  6 files changed, 159 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index aa4f21f92deb..ff5a1dddc0ed 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -2496,8 +2496,6 @@ static inline void nfsd4_increment_op_stats(u32 opnum)
> >
> >  static const struct nfsd4_operation nfsd4_ops[];
> >
> > -static const char *nfsd4_op_name(unsigned opnum);
> > -
> >  /*
> >   * Enforce NFSv4.1 COMPOUND ordering rules:
> >   *
> > @@ -3627,7 +3625,7 @@ void warn_on_nonidempotent_op(struct nfsd4_op *op)
> >  	}
> >  }
> >
> > -static const char *nfsd4_op_name(unsigned opnum)
> > +const char *nfsd4_op_name(unsigned int opnum)
> >  {
> >  	if (opnum < ARRAY_SIZE(nfsd4_ops))
> >  		return nfsd4_ops[opnum].op_name;
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index dad88bff3b9e..83eb5c6d894e 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -47,6 +47,7 @@ enum {
> >  	NFSD_MaxBlkSize,
> >  	NFSD_MaxConnections,
> >  	NFSD_Filecache,
> > +	NFSD_Rpc_Status,
> >  	/*
> >  	 * The below MUST come last.
> >  	 * Otherwise we leave a hole in nfsd_files[]
> >  	 * with !CONFIG_NFSD_V4 and simple_fill_super() goes oops
> > @@ -195,6 +196,13 @@ static inline struct net *netns(struct file *file)
> >  	return file_inode(file)->i_sb->s_fs_info;
> >  }
> >
> > +static const struct file_operations nfsd_rpc_status_operations = {
> > +	.open		= nfsd_rpc_status_open,
> > +	.read		= seq_read,
> > +	.llseek		= seq_lseek,
> > +	.release	= nfsd_stats_release,
> > +};
> > +
> >  /*
> >   * write_unlock_ip - Release all locks used by a client
> >   *
> > @@ -1394,6 +1402,7 @@ static int nfsd_fill_super(struct super_block *sb, struct fs_context *fc)
> >  		[NFSD_MaxBlkSize] = {"max_block_size", &transaction_ops, S_IWUSR|S_IRUGO},
> >  		[NFSD_MaxConnections] = {"max_connections", &transaction_ops, S_IWUSR|S_IRUGO},
> >  		[NFSD_Filecache] = {"filecache", &nfsd_file_cache_stats_fops, S_IRUGO},
> > +		[NFSD_Rpc_Status] = {"rpc_status", &nfsd_rpc_status_operations, S_IRUGO},
> >  #ifdef CONFIG_NFSD_V4
> >  		[NFSD_Leasetime] = {"nfsv4leasetime", &transaction_ops, S_IWUSR|S_IRUSR},
> >  		[NFSD_Gracetime] = {"nfsv4gracetime", &transaction_ops, S_IWUSR|S_IRUSR},
> > diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> > index 55b9d85ed71b..3e8a47b93fd4 100644
> > --- a/fs/nfsd/nfsd.h
> > +++ b/fs/nfsd/nfsd.h
> > @@ -94,6 +94,7 @@ int nfsd_get_nrthreads(int n, int *, struct net *);
> >  int nfsd_set_nrthreads(int n, int *, struct net *);
> >  int nfsd_pool_stats_open(struct inode *, struct file *);
> >  int nfsd_stats_release(struct inode *, struct file *);
> > +int nfsd_rpc_status_open(struct inode *inode, struct file *file);
> >  void nfsd_shutdown_threads(struct net *net);
> >
> >  static inline void nfsd_put(struct net *net)
> > @@ -511,12 +512,18 @@ extern void nfsd4_ssc_init_umount_work(struct nfsd_net *nn);
> >
> >  extern void nfsd4_init_leases_net(struct nfsd_net *nn);
> >
> > +const char *nfsd4_op_name(unsigned int opnum);
> >  #else /* CONFIG_NFSD_V4 */
> >  static inline int
> >  nfsd4_is_junction(struct dentry *dentry)
> >  {
> >  	return 0;
> >  }
> >
> > +static inline const char *nfsd4_op_name(unsigned int opnum)
> > +{
> > +	return "unknown_operation";
> > +}
> > +
> >  static inline void nfsd4_init_leases_net(struct nfsd_net *nn) { };
> >
> >  #define register_cld_notifier() 0
> > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > index 460219030ce1..237be14d3a11 100644
> > --- a/fs/nfsd/nfssvc.c
> > +++ b/fs/nfsd/nfssvc.c
> > @@ -998,6 +998,15 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
> >  	if (!proc->pc_decode(rqstp, &rqstp->rq_arg_stream))
> >  		goto out_decode_err;
> >
> > +	/*
> > +	 * Release rq_status_counter setting it to an odd value after the rpc
> > +	 * request has been properly parsed. rq_status_counter is used to
> > +	 * notify the consumers if the rqstp fields are stable
> > +	 * (rq_status_counter is odd) or not meaningful (rq_status_counter
> > +	 * is even).
> > +	 */
> > +	smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter | 1);
> > +
> >  	rp = NULL;
> >  	switch (nfsd_cache_lookup(rqstp, &rp)) {
> >  	case RC_DOIT:
> > @@ -1015,6 +1024,12 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
> >  	if (!proc->pc_encode(rqstp, &rqstp->rq_res_stream))
> >  		goto out_encode_err;
> >
> > +	/*
> > +	 * Release rq_status_counter setting it to an even value after the rpc
> > +	 * request has been properly processed.
> > +	 */
> > +	smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter + 1);
> > +
> >  	nfsd_cache_update(rqstp, rp, rqstp->rq_cachetype, statp + 1);
> >  out_cached_reply:
> >  	return 1;
> > @@ -1101,3 +1116,128 @@ int nfsd_stats_release(struct inode *inode, struct file *file)
> >  	mutex_unlock(&nfsd_mutex);
> >  	return ret;
> >  }
> > +
> > +static int nfsd_rpc_status_show(struct seq_file *m, void *v)
> > +{
> > +	struct inode *inode = file_inode(m->file);
> > +	struct nfsd_net *nn = net_generic(inode->i_sb->s_fs_info, nfsd_net_id);
> > +	int i;
> > +
> > +	seq_puts(m, "# XID FLAGS VERS PROC TIMESTAMP SADDR SPORT DADDR DPORT COMPOUND_OPS\n");
> > +
> > +	rcu_read_lock();
> > +
> > +	for (i = 0; i < nn->nfsd_serv->sv_nrpools; i++) {
> > +		struct svc_rqst *rqstp;
> > +
> > +		list_for_each_entry_rcu(rqstp,
> > +				&nn->nfsd_serv->sv_pools[i].sp_all_threads,
> > +				rq_all) {
> > +			struct {
> > +				struct sockaddr daddr;
> > +				struct sockaddr saddr;
> > +				unsigned long rq_flags;
> > +				const char *pc_name;
> > +				ktime_t rq_stime;
> > +				__be32 rq_xid;
> > +				u32 rq_vers;
> > +				/* NFSv4 compound */
> > +				u32 opnum[NFSD_MAX_OPS_PER_COMPOUND];
> > +			} rqstp_info;
> > +			unsigned int status_counter;
> > +			char buf[RPC_MAX_ADDRBUFLEN];
> > +			int opcnt = 0;
> > +
> > +			/*
> > +			 * Acquire rq_status_counter before parsing the rqst
> > +			 * fields. rq_status_counter is set to an odd value in
> > +			 * order to notify the consumers the rqstp fields are
> > +			 * meaningful.
> > +			 */
> > +			status_counter = smp_load_acquire(&rqstp->rq_status_counter);
> > +			if (!(status_counter & 1))
> > +				continue;
> > +
> > +			rqstp_info.rq_xid = rqstp->rq_xid;
> > +			rqstp_info.rq_flags = rqstp->rq_flags;
> > +			rqstp_info.rq_vers = rqstp->rq_vers;
> > +			rqstp_info.pc_name = svc_proc_name(rqstp);
> > +			rqstp_info.rq_stime = rqstp->rq_stime;
> > +			memcpy(&rqstp_info.daddr, svc_daddr(rqstp),
> > +			       sizeof(struct sockaddr));
> > +			memcpy(&rqstp_info.saddr, svc_addr(rqstp),
> > +			       sizeof(struct sockaddr));
> > +
> > +#ifdef CONFIG_NFSD_V4
> > +			if (rqstp->rq_vers == NFS4_VERSION &&
> > +			    rqstp->rq_proc == NFSPROC4_COMPOUND) {
> > +				/* NFSv4 compound */
> > +				struct nfsd4_compoundargs *args = rqstp->rq_argp;
> > +				int j;
> > +
> > +				opcnt = args->opcnt;
> > +				for (j = 0; j < opcnt; j++) {
> > +					struct nfsd4_op *op = &args->ops[j];
> > +
> > +					rqstp_info.opnum[j] = op->opnum;
> > +				}
> > +			}
> > +#endif /* CONFIG_NFSD_V4 */
> > +
> > +			/*
> > +			 * Re-check rq_status_counter before reporting the rqst
> > +			 * fields to the user.
> > +			 */
> > +			if (smp_load_acquire(&rqstp->rq_status_counter) != status_counter)
> > +				continue;
> > +
> > +			seq_printf(m,
> > +				   "%04u %04ld NFSv%d %s %016lld",
> > +				   be32_to_cpu(rqstp_info.rq_xid),
>
> It's proper to display XIDs as 8-hexit hexadecimal values, as you
> did before. "0x%08x" is the correct format, as that matches the
> XID display format used by Wireshark and our tracepoints.
>
> > +				   rqstp_info.rq_flags,
>
> I didn't mean for you to change the flags format to decimal. I was
> trying to point out that the content of this field will need to be
> displayed symbolically if we care about an easy user experience.
>
> Let's stick with hex here. A clever user can read the bits directly
> from that. All others should have a tool that parses this field and
> prints the list of bits in it symbolically.
>
> > +				   rqstp_info.rq_vers,
> > +				   rqstp_info.pc_name,
> > +				   ktime_to_us(rqstp_info.rq_stime));
> > +			seq_printf(m, " %s",
> > +				   __svc_print_addr(&rqstp_info.saddr, buf,
> > +						    sizeof(buf), false));
> > +			seq_printf(m, " %s",
> > +				   __svc_print_addr(&rqstp_info.daddr, buf,
> > +						    sizeof(buf), false));
> > +			if (opcnt) {
> > +				int j;
> > +
> > +				seq_puts(m, " ");
> > +				for (j = 0; j < opcnt; j++)
> > +					seq_printf(m, "%s%s",
> > +						   nfsd4_op_name(rqstp_info.opnum[j]),
> > +						   j == opcnt - 1 ? "" : ":");
> > +			} else {
> > +				seq_puts(m, " -");
> > +			}
>
> This looks correct to me.
>
> I'm leaning towards moving this to a netlink API that can be
> extended over time to handle other stats and also act as an NFSD
> control plane, similar to other network subsystems.
>
> Any comments, complaints or rotten fruit from anyone?

I think netlink is the best way forward.  'cat' is nice, but not
necessary.  We have an established path for distributing tools for
working with nfsd, so we can easily get a suitable tool into the
hands of our users.

The only fruit I have relates to the name "rpc_status", and it is
probably over-ripe rather than rotten :-)
In the context of RPC, "status" means the success/failure result of a
request.  That is not what this file provides.  It is a list of
active requests.  So maybe "active_rpc".

One advantage of netlink is that the API is based on numbers, not names!

NeilBrown

>
> > +			seq_puts(m, "\n");
> > +		}
> > +	}
> > +
> > +	rcu_read_unlock();
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * nfsd_rpc_status_open - open routine for nfsd_rpc_status handler
> > + * @inode: entry inode pointer.
> > + * @file: entry file pointer.
> > + *
> > + * nfsd_rpc_status_open is the open routine for the nfsd_rpc_status procfs handler.
> > + * nfsd_rpc_status dumps info on pending RPC requests queued in the nfs server.
> > + */
> > +int nfsd_rpc_status_open(struct inode *inode, struct file *file)
> > +{
> > +	int ret = nfsd_stats_open(inode);
> > +
> > +	if (ret)
> > +		return ret;
> > +
> > +	return single_open(file, nfsd_rpc_status_show, inode->i_private);
> > +}
> > diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> > index 7838b37bcfa8..b49c0470b4fe 100644
> > --- a/include/linux/sunrpc/svc.h
> > +++ b/include/linux/sunrpc/svc.h
> > @@ -251,6 +251,7 @@ struct svc_rqst {
> >  						 * net namespace
> >  						 */
> >  	void			**rq_lease_breaker; /* The v4 client breaking a lease */
> > +	unsigned int		rq_status_counter; /* RPC processing counter */
> >  };
> >
> >  /* bits for rq_flags */
> > diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> > index af692bff44ab..83bee19df104 100644
> > --- a/net/sunrpc/svc.c
> > +++ b/net/sunrpc/svc.c
> > @@ -1656,7 +1656,7 @@ const char *svc_proc_name(const struct svc_rqst *rqstp)
> >  		return rqstp->rq_procinfo->pc_name;
> >  	return "unknown";
> >  }
> > -
> > +EXPORT_SYMBOL_GPL(svc_proc_name);
> >
> >  /**
> >   * svc_encode_result_payload - mark a range of bytes as a result payload
> > --
> > 2.41.0
> >
>
> --
> Chuck Lever
>