Date: Fri, 27 Aug 2021 23:41:12 -0000
From: "tip-bot2 for Thomas Gleixner"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: sched/core] eventfd: Make signal recursion protection a task bit
Cc: Daniel Bristot de Oliveira, Thomas Gleixner, Jason Wang, Al Viro,
    x86@kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <87wnp9idso.ffs@tglx>
References: <87wnp9idso.ffs@tglx>
MIME-Version: 1.0
Message-ID: <163010767256.25758.8600942642007356589.tip-bot2@tip-bot2>
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     b542e383d8c005f06a131e2b40d5889b812f19c6
Gitweb:        https://git.kernel.org/tip/b542e383d8c005f06a131e2b40d5889b812f19c6
Author:        Thomas Gleixner
AuthorDate:    Thu, 29 Jul 2021 13:01:59 +02:00
Committer:     Thomas Gleixner
CommitterDate: Sat, 28 Aug 2021 01:33:02 +02:00

eventfd: Make signal recursion protection a task bit

The recursion protection for eventfd_signal() is based on a per CPU
variable and relies on the !RT semantics of spin_lock_irqsave() for
protecting this per CPU variable. On RT kernels spin_lock_irqsave() neither
disables preemption nor interrupts which allows the spin lock held section
to be preempted. If the preempting task invokes eventfd_signal() as well,
then the recursion warning triggers.

Paolo suggested to protect the per CPU variable with a local lock, but
that's heavyweight and actually not necessary.
The goal of this protection is to prevent the task stack from
overflowing, which can be achieved with a per task recursion protection
as well.

Replace the per CPU variable with a per task bit similar to other recursion
protection bits like task_struct::in_page_owner. This works on both !RT and
RT kernels and removes as a side effect the extra per CPU storage.

No functional change for !RT kernels.

Reported-by: Daniel Bristot de Oliveira
Signed-off-by: Thomas Gleixner
Tested-by: Daniel Bristot de Oliveira
Acked-by: Jason Wang
Cc: Al Viro
Link: https://lore.kernel.org/r/87wnp9idso.ffs@tglx
---
 fs/aio.c                |  2 +-
 fs/eventfd.c            | 12 +++++-------
 include/linux/eventfd.h | 11 +++++------
 include/linux/sched.h   |  4 ++++
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 76ce0cc..51b08ab 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1695,7 +1695,7 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 		list_del(&iocb->ki_list);
 		iocb->ki_res.res = mangle_poll(mask);
 		req->done = true;
-		if (iocb->ki_eventfd && eventfd_signal_count()) {
+		if (iocb->ki_eventfd && !eventfd_signal_allowed()) {
 			iocb = NULL;
 			INIT_WORK(&req->work, aio_poll_put_work);
 			schedule_work(&req->work);
diff --git a/fs/eventfd.c b/fs/eventfd.c
index e265b6d..3627dd7 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -25,8 +25,6 @@
 #include
 #include
 
-DEFINE_PER_CPU(int, eventfd_wake_count);
-
 static DEFINE_IDA(eventfd_ida);
 
 struct eventfd_ctx {
@@ -67,21 +65,21 @@ __u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n)
 	 * Deadlock or stack overflow issues can happen if we recurse here
 	 * through waitqueue wakeup handlers. If the caller users potentially
 	 * nested waitqueues with custom wakeup handlers, then it should
-	 * check eventfd_signal_count() before calling this function. If
-	 * it returns true, the eventfd_signal() call should be deferred to a
+	 * check eventfd_signal_allowed() before calling this function. If
+	 * it returns false, the eventfd_signal() call should be deferred to a
 	 * safe context.
 	 */
-	if (WARN_ON_ONCE(this_cpu_read(eventfd_wake_count)))
+	if (WARN_ON_ONCE(current->in_eventfd_signal))
 		return 0;
 
 	spin_lock_irqsave(&ctx->wqh.lock, flags);
-	this_cpu_inc(eventfd_wake_count);
+	current->in_eventfd_signal = 1;
 	if (ULLONG_MAX - ctx->count < n)
 		n = ULLONG_MAX - ctx->count;
 	ctx->count += n;
 	if (waitqueue_active(&ctx->wqh))
 		wake_up_locked_poll(&ctx->wqh, EPOLLIN);
-	this_cpu_dec(eventfd_wake_count);
+	current->in_eventfd_signal = 0;
 	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
 
 	return n;
diff --git a/include/linux/eventfd.h b/include/linux/eventfd.h
index fa0a524..305d5f1 100644
--- a/include/linux/eventfd.h
+++ b/include/linux/eventfd.h
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining
@@ -43,11 +44,9 @@ int eventfd_ctx_remove_wait_queue(struct eventfd_ctx *ctx, wait_queue_entry_t *w
 				  __u64 *cnt);
 void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt);
 
-DECLARE_PER_CPU(int, eventfd_wake_count);
-
-static inline bool eventfd_signal_count(void)
+static inline bool eventfd_signal_allowed(void)
 {
-	return this_cpu_read(eventfd_wake_count);
+	return !current->in_eventfd_signal;
 }
 
 #else /* CONFIG_EVENTFD */
@@ -78,9 +77,9 @@ static inline int eventfd_ctx_remove_wait_queue(struct eventfd_ctx *ctx,
 	return -ENOSYS;
 }
 
-static inline bool eventfd_signal_count(void)
+static inline bool eventfd_signal_allowed(void)
 {
-	return false;
+	return true;
 }
 
 static inline void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 3bb9fec..6421a9a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -864,6 +864,10 @@ struct task_struct {
 	/* Used by page_owner=on to detect recursion in page tracking. */
 	unsigned			in_page_owner:1;
 #endif
+#ifdef CONFIG_EVENTFD
+	/* Recursion prevention for eventfd_signal() */
+	unsigned			in_eventfd_signal:1;
+#endif
 
 	unsigned long			atomic_flags; /* Flags requiring atomic access. */
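
For readers who want to try the idea outside the kernel, the per-task guard described in the changelog can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: a thread-local flag stands in for current->in_eventfd_signal, and struct counter_ctx, counter_signal(), signal_allowed(), recursing_handler() and demo_nested_signal() are hypothetical names invented for this sketch. Locking and the WARN_ON_ONCE() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: one userspace thread plays the role of one kernel task, so
 * a thread-local flag stands in for the task_struct bit.
 */
static _Thread_local bool in_signal;

/* Hypothetical stand-in for struct eventfd_ctx. */
struct counter_ctx {
	uint64_t count;
	/* Hypothetical wakeup handler, called while the guard is set. */
	void (*on_wake)(struct counter_ctx *ctx);
};

/* Mirrors the shape of eventfd_signal_allowed(): true when not nested. */
static bool signal_allowed(void)
{
	return !in_signal;
}

/* Mirrors the shape of eventfd_signal(): refuse to recurse, else add n. */
static uint64_t counter_signal(struct counter_ctx *ctx, uint64_t n)
{
	if (in_signal)
		return 0;	/* nested call on the same thread: bail out */

	in_signal = true;
	if (UINT64_MAX - ctx->count < n)
		n = UINT64_MAX - ctx->count;	/* saturate, do not wrap */
	ctx->count += n;
	if (ctx->on_wake)
		ctx->on_wake(ctx);
	in_signal = false;

	return n;
}

/* Results recorded by the handler below so a caller can inspect them. */
static uint64_t nested_result = UINT64_MAX;
static bool allowed_inside = true;

/* A handler that tries to signal again from inside the wakeup path. */
static void recursing_handler(struct counter_ctx *ctx)
{
	allowed_inside = signal_allowed();
	nested_result = counter_signal(ctx, 1);
}

/* Runs one outer signal whose handler recurses; returns the final count. */
static uint64_t demo_nested_signal(void)
{
	struct counter_ctx ctx = { .count = 0, .on_wake = recursing_handler };

	assert(signal_allowed());
	counter_signal(&ctx, 5);
	return ctx.count;
}
```

Because the flag belongs to the thread rather than to a CPU, a different thread that preempts the section sees its own clear flag and is not mistaken for a recursion, which is the property the changelog relies on for RT kernels.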