Subject: [PATCH v3 2/9] SUNRPC: Report when no service thread is available.
From: Chuck Lever
To: linux-nfs@vger.kernel.org
Cc: Chuck Lever, lorenzo@kernel.org, neilb@suse.de, jlayton@redhat.com, david@fromorbit.com
Date: Mon, 10 Jul 2023 12:42:07 -0400
Message-ID: <168900732701.7514.15763852101603746862.stgit@manet.1015granger.net>
In-Reply-To: <168900729243.7514.15141312295052254929.stgit@manet.1015granger.net>
References: <168900729243.7514.15141312295052254929.stgit@manet.1015granger.net>
User-Agent: StGit/1.5

From: Chuck Lever

Count and record thread pool starvation. Administrators can take
action by increasing the thread count or decreasing the workload.
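Once applied, starvation shows up in two places: the new "starved"
column in the pool stats file, and the new svc_pool_starved
tracepoint. A rough illustration of how an administrator might use
them (paths assume the nfsd service, a tracefs mount at
/sys/kernel/tracing, and the usual nfsd procfs files; the thread
count below is an example only):

    # Watch starvation events as they happen:
    echo 1 > /sys/kernel/tracing/events/sunrpc/svc_pool_starved/enable
    cat /sys/kernel/tracing/trace_pipe

    # Inspect the cumulative per-pool counter (last column):
    cat /proc/fs/nfsd/pool_stats

    # If pools are starving, raise the nfsd thread count:
    echo 32 > /proc/fs/nfsd/threads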
Signed-off-by: Chuck Lever
---
 include/linux/sunrpc/svc.h    |    5 +++-
 include/trace/events/sunrpc.h |   49 ++++++++++++++++++++++++++++++++++-------
 net/sunrpc/svc.c              |    9 +++++++-
 net/sunrpc/svc_xprt.c         |   22 ++++++++++--------
 4 files changed, 64 insertions(+), 21 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index dc2d90a655e2..fbfe6ea737c8 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -22,7 +22,6 @@
 #include

 /*
- *
  * RPC service thread pool.
  *
  * Pool of threads and temporary sockets. Generally there is only
@@ -42,6 +41,7 @@ struct svc_pool {
        struct percpu_counter   sp_sockets_queued;
        struct percpu_counter   sp_threads_woken;
        struct percpu_counter   sp_threads_timedout;
+       struct percpu_counter   sp_threads_starved;

 #define SP_TASK_PENDING         (0)    /* still work to do even if no
                                         * xprt is queued. */
@@ -427,7 +427,8 @@ int svc_register(const struct svc_serv *, struct net *, const int,
 void              svc_wake_up(struct svc_serv *);
 void              svc_reserve(struct svc_rqst *rqstp, int space);
-struct svc_rqst *svc_pool_wake_idle_thread(struct svc_pool *pool);
+struct svc_rqst *svc_pool_wake_idle_thread(struct svc_serv *serv,
+                                          struct svc_pool *pool);
 struct svc_pool *svc_pool_for_cpu(struct svc_serv *serv);
 char *            svc_print_addr(struct svc_rqst *, char *, size_t);
 const char *      svc_proc_name(const struct svc_rqst *rqstp);
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 43711753616a..9b70fc1c698a 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1994,21 +1994,21 @@ TRACE_EVENT(svc_xprt_create_err,
 TRACE_EVENT(svc_xprt_enqueue,
        TP_PROTO(
                const struct svc_xprt *xprt,
-               const struct svc_rqst *rqst
+               const struct svc_rqst *wakee
        ),

-       TP_ARGS(xprt, rqst),
+       TP_ARGS(xprt, wakee),

        TP_STRUCT__entry(
                SVC_XPRT_ENDPOINT_FIELDS(xprt)

-               __field(int, pid)
+               __field(pid_t, pid)
        ),

        TP_fast_assign(
                SVC_XPRT_ENDPOINT_ASSIGNMENTS(xprt);

-               __entry->pid = rqst ? rqst->rq_task->pid : 0;
+               __entry->pid = wakee->rq_task->pid;
        ),

        TP_printk(SVC_XPRT_ENDPOINT_FORMAT " pid=%d",
@@ -2039,6 +2039,39 @@ TRACE_EVENT(svc_xprt_dequeue,
                SVC_XPRT_ENDPOINT_VARARGS, __entry->wakeup)
 );

+#define show_svc_pool_flags(x)                                         \
+       __print_flags(x, "|",                                           \
+               { BIT(SP_TASK_PENDING),         "TASK_PENDING" },       \
+               { BIT(SP_CONGESTED),            "CONGESTED" })
+
+TRACE_EVENT(svc_pool_starved,
+       TP_PROTO(
+               const struct svc_serv *serv,
+               const struct svc_pool *pool
+       ),
+
+       TP_ARGS(serv, pool),
+
+       TP_STRUCT__entry(
+               __string(name, serv->sv_name)
+               __field(int, pool_id)
+               __field(unsigned int, nrthreads)
+               __field(unsigned long, flags)
+       ),
+
+       TP_fast_assign(
+               __assign_str(name, serv->sv_name);
+               __entry->pool_id = pool->sp_id;
+               __entry->nrthreads = pool->sp_nrthreads;
+               __entry->flags = pool->sp_flags;
+       ),
+
+       TP_printk("service=%s pool=%d flags=%s nrthreads=%u",
+               __get_str(name), __entry->pool_id,
+               show_svc_pool_flags(__entry->flags), __entry->nrthreads
+       )
+);
+
 DECLARE_EVENT_CLASS(svc_xprt_event,
        TP_PROTO(
                const struct svc_xprt *xprt
@@ -2109,16 +2142,16 @@ TRACE_EVENT(svc_xprt_accept,
 );

 TRACE_EVENT(svc_wake_up,
-       TP_PROTO(int pid),
+       TP_PROTO(const struct svc_rqst *wakee),

-       TP_ARGS(pid),
+       TP_ARGS(wakee),

        TP_STRUCT__entry(
-               __field(int, pid)
+               __field(pid_t, pid)
        ),

        TP_fast_assign(
-               __entry->pid = pid;
+               __entry->pid = wakee->rq_task->pid;
        ),

        TP_printk("pid=%d", __entry->pid)
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 05ee92b5fa1e..b79b8b41905d 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -516,6 +516,7 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
                percpu_counter_init(&pool->sp_sockets_queued, 0, GFP_KERNEL);
                percpu_counter_init(&pool->sp_threads_woken, 0, GFP_KERNEL);
                percpu_counter_init(&pool->sp_threads_timedout, 0, GFP_KERNEL);
+               percpu_counter_init(&pool->sp_threads_starved, 0, GFP_KERNEL);
        }

        return serv;
@@ -591,6 +592,7 @@ svc_destroy(struct kref *ref)
                percpu_counter_destroy(&pool->sp_sockets_queued);
                percpu_counter_destroy(&pool->sp_threads_woken);
                percpu_counter_destroy(&pool->sp_threads_timedout);
+               percpu_counter_destroy(&pool->sp_threads_starved);
        }
        kfree(serv->sv_pools);
        kfree(serv);
@@ -691,6 +693,7 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)

 /**
  * svc_pool_wake_idle_thread - Awaken an idle thread in @pool
+ * @serv: RPC service
  * @pool: service thread pool
  *
  * Returns an idle service thread (now marked BUSY), or NULL
@@ -698,7 +701,8 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
  * thread and marking it BUSY is atomic with respect to other
  * calls to svc_pool_wake_idle_thread().
  */
-struct svc_rqst *svc_pool_wake_idle_thread(struct svc_pool *pool)
+struct svc_rqst *svc_pool_wake_idle_thread(struct svc_serv *serv,
+                                          struct svc_pool *pool)
 {
        struct svc_rqst *rqstp;

@@ -714,6 +718,9 @@ struct svc_rqst *svc_pool_wake_idle_thread(struct svc_pool *pool)
                return rqstp;
        }
        rcu_read_unlock();
+
+       trace_svc_pool_starved(serv, pool);
+       percpu_counter_inc(&pool->sp_threads_starved);
        return NULL;
 }
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 89302bf09b77..a1ed6fb69793 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -455,7 +455,7 @@ static bool svc_xprt_ready(struct svc_xprt *xprt)
  */
 void svc_xprt_enqueue(struct svc_xprt *xprt)
 {
-       struct svc_rqst         *rqstp;
+       struct svc_rqst *rqstp;
        struct svc_pool *pool;

        if (!svc_xprt_ready(xprt))
@@ -476,9 +476,11 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
        list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
        spin_unlock_bh(&pool->sp_lock);

-       rqstp = svc_pool_wake_idle_thread(pool);
-       if (!rqstp)
+       rqstp = svc_pool_wake_idle_thread(xprt->xpt_server, pool);
+       if (!rqstp) {
                set_bit(SP_CONGESTED, &pool->sp_flags);
+               return;
+       }

        trace_svc_xprt_enqueue(xprt, rqstp);
 }
@@ -584,17 +586,16 @@ static void svc_xprt_release(struct svc_rqst *rqstp)
 void svc_wake_up(struct svc_serv *serv)
 {
        struct svc_pool *pool = &serv->sv_pools[0];
-       struct svc_rqst         *rqstp;
+       struct svc_rqst *rqstp;

-       rqstp = svc_pool_wake_idle_thread(pool);
+       rqstp = svc_pool_wake_idle_thread(serv, pool);
        if (!rqstp) {
                set_bit(SP_TASK_PENDING, &pool->sp_flags);
                smp_wmb();
-               trace_svc_wake_up(0);
                return;
        }

-       trace_svc_wake_up(rqstp->rq_task->pid);
+       trace_svc_wake_up(rqstp);
 }
 EXPORT_SYMBOL_GPL(svc_wake_up);

@@ -1436,16 +1437,17 @@ static int svc_pool_stats_show(struct seq_file *m, void *p)
        struct svc_pool *pool = p;

        if (p == SEQ_START_TOKEN) {
-               seq_puts(m, "# pool packets-arrived sockets-enqueued threads-woken threads-timedout\n");
+               seq_puts(m, "# pool packets-arrived xprts-enqueued threads-woken threads-timedout starved\n");
                return 0;
        }

-       seq_printf(m, "%u %llu %llu %llu %llu\n",
+       seq_printf(m, "%u %llu %llu %llu %llu %llu\n",
                        pool->sp_id,
                        percpu_counter_sum_positive(&pool->sp_sockets_queued),
                        percpu_counter_sum_positive(&pool->sp_sockets_queued),
                        percpu_counter_sum_positive(&pool->sp_threads_woken),
-                       percpu_counter_sum_positive(&pool->sp_threads_timedout));
+                       percpu_counter_sum_positive(&pool->sp_threads_timedout),
+                       percpu_counter_sum_positive(&pool->sp_threads_starved));

        return 0;
 }
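For reference, a starvation event renders per the TP_printk format
added above. With the nfsd service and pool 0 congested, a trace_pipe
line would look something like this (task, CPU, timestamp, and field
values are hypothetical):

    nfsd-1127  [002] ..s.  2741.083521: svc_pool_starved: service=nfsd pool=0 flags=CONGESTED nrthreads=8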