Date: Wed, 13 Sep 2023 17:50:09 +0200
From: Oleg Nesterov <oleg@redhat.com>
To: Boqun Feng, Ingo Molnar, Peter Zijlstra, Rik van Riel,
	Thomas Gleixner, Waiman Long, Will Deacon
Cc: Alexey Gladkov, "Eric W. Biederman", linux-kernel@vger.kernel.org
Subject: [PATCH 5/5] time,signal: turn signal_struct.stats_lock into seqcount_rwlock_t
Message-ID: <20230913155009.GA26255@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20230913154907.GA26210@redhat.com>
User-Agent: Mutt/1.5.24 (2015-08-30)

This way thread_group_cputime() doesn't exclude other readers on the
2nd pass.

thread_group_cputime() still needs to disable irqs because stats_lock
nests inside siglock. But once we change the getrusage()-like users to
rely on stats_lock we can remove this dependency, and after that there
will be no need for _irqsave.

And IIUC, this is the bugfix for CONFIG_PREEMPT_RT? Before this patch,
read_seqbegin_or_lock() could spin in __read_seqcount_begin() while the
write_seqlock(stats_lock) section was preempted.

While at it, change the main loop to use __for_each_thread(sig, t).

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/sched/signal.h |  4 +++-
 kernel/exit.c                | 12 ++++++++----
 kernel/fork.c                |  3 ++-
 kernel/sched/cputime.c       | 10 ++++++----
 4 files changed, 19 insertions(+), 10 deletions(-)
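Not part of the patch, just an illustration for review: a minimal sketch of
the reader pattern this conversion enables. The struct and function names
below are made up for the example, and the read_seqcount_*_or_lock helpers
used by the patch itself are presumably introduced earlier in the series;
this open-codes the same idea with the stock seqcount_rwlock_t primitives.
The point is that the fallback pass only takes the rwlock for reading, so
it no longer excludes other readers the way the old read_seqlock_excl()
fallback did.

	#include <linux/spinlock.h>
	#include <linux/seqlock.h>
	#include <linux/types.h>

	/* Hypothetical example structure, not signal_struct. */
	struct my_stats {
		rwlock_t		lock;
		seqcount_rwlock_t	seqc;
		u64			a, b;
	};

	static void my_stats_init(struct my_stats *s)
	{
		rwlock_init(&s->lock);
		seqcount_rwlock_init(&s->seqc, &s->lock);
	}

	/* Writer: serialize against other writers, bump the sequence. */
	static void my_stats_add(struct my_stats *s, u64 da, u64 db)
	{
		write_lock(&s->lock);
		write_seqcount_begin(&s->seqc);
		s->a += da;
		s->b += db;
		write_seqcount_end(&s->seqc);
		write_unlock(&s->lock);
	}

	/*
	 * Reader: lockless on the first pass; if that races with a writer,
	 * retry under read_lock(), which excludes writers but not other
	 * readers.
	 */
	static void my_stats_read(struct my_stats *s, u64 *a, u64 *b)
	{
		bool locked = false;
		unsigned int seq;

		seq = read_seqcount_begin(&s->seqc);
		for (;;) {
			*a = s->a;
			*b = s->b;
			if (locked) {
				read_unlock(&s->lock);
				break;
			}
			if (!read_seqcount_retry(&s->seqc, seq))
				break;
			read_lock(&s->lock);
			locked = true;
		}
	}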
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index d7fa3ca2fa53..c7c0928b877d 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -182,7 +182,9 @@ struct signal_struct {
 	 * Live threads maintain their own counters and add to these
 	 * in __exit_signal, except for the group leader.
 	 */
-	seqlock_t stats_lock;
+	rwlock_t stats_lock;
+	seqcount_rwlock_t stats_seqc;
+
 	u64 utime, stime, cutime, cstime;
 	u64 gtime;
 	u64 cgtime;
diff --git a/kernel/exit.c b/kernel/exit.c
index f3ba4b97a7d9..8dedb7138f9c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -182,7 +182,8 @@ static void __exit_signal(struct task_struct *tsk)
 	 * see the empty ->thread_head list.
 	 */
 	task_cputime(tsk, &utime, &stime);
-	write_seqlock(&sig->stats_lock);
+	write_lock(&sig->stats_lock);
+	write_seqcount_begin(&sig->stats_seqc);
 	sig->utime += utime;
 	sig->stime += stime;
 	sig->gtime += task_gtime(tsk);
@@ -196,7 +197,8 @@ static void __exit_signal(struct task_struct *tsk)
 	sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
 	sig->nr_threads--;
 	__unhash_process(tsk, group_dead);
-	write_sequnlock(&sig->stats_lock);
+	write_seqcount_end(&sig->stats_seqc);
+	write_unlock(&sig->stats_lock);
 
 	/*
 	 * Do this under ->siglock, we can race with another thread
@@ -1160,7 +1162,8 @@ static int wait_task_zombie(struct wait_opts *wo, struct task_struct *p)
 		 */
 		thread_group_cputime_adjusted(p, &tgutime, &tgstime);
 		spin_lock_irq(&current->sighand->siglock);
-		write_seqlock(&psig->stats_lock);
+		write_lock(&psig->stats_lock);
+		write_seqcount_begin(&psig->stats_seqc);
 		psig->cutime += tgutime + sig->cutime;
 		psig->cstime += tgstime + sig->cstime;
 		psig->cgtime += task_gtime(p) + sig->gtime + sig->cgtime;
@@ -1183,7 +1186,8 @@ static int wait_task_zombie(struct wait_opts *wo, struct task_struct *p)
 			psig->cmaxrss = maxrss;
 		task_io_accounting_add(&psig->ioac, &p->ioac);
 		task_io_accounting_add(&psig->ioac, &sig->ioac);
-		write_sequnlock(&psig->stats_lock);
+		write_seqcount_end(&psig->stats_seqc);
+		write_unlock(&psig->stats_lock);
 		spin_unlock_irq(&current->sighand->siglock);
 	}
 
diff --git a/kernel/fork.c b/kernel/fork.c
index b9d3aa493bbd..bbd5604053f8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1870,7 +1870,8 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 	sig->curr_target = tsk;
 	init_sigpending(&sig->shared_pending);
 	INIT_HLIST_HEAD(&sig->multiprocess);
-	seqlock_init(&sig->stats_lock);
+	rwlock_init(&sig->stats_lock);
+	seqcount_rwlock_init(&sig->stats_seqc, &sig->stats_lock);
 	prev_cputime_init(&sig->prev_cputime);
 
 #ifdef CONFIG_POSIX_TIMERS
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index af7952f12e6c..bd6a85bd2a49 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -333,12 +333,13 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 	nextseq = 0;
 	do {
 		seq = nextseq;
-		flags = read_seqbegin_or_lock_irqsave(&sig->stats_lock, &seq);
+		flags = read_seqcount_begin_or_lock_irqsave(&sig->stats_seqc,
+						&sig->stats_lock, &seq);
 		times->utime = sig->utime;
 		times->stime = sig->stime;
 		times->sum_exec_runtime = sig->sum_sched_runtime;
 
-		for_each_thread(tsk, t) {
+		__for_each_thread(sig, t) {
 			task_cputime(t, &utime, &stime);
 			times->utime += utime;
 			times->stime += stime;
@@ -346,8 +347,9 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 		}
 		/* If lockless access failed, take the lock. */
 		nextseq = 1;
-	} while (need_seqretry(&sig->stats_lock, seq));
-	done_seqretry_irqrestore(&sig->stats_lock, seq, flags);
+	} while (need_seqcount_retry(&sig->stats_seqc, seq));
+	done_seqcount_retry_irqrestore(&sig->stats_seqc, &sig->stats_lock,
+					seq, flags);
 	rcu_read_unlock();
 }
 
-- 
2.25.1.362.g51ebf55