From: Sebastian Andrzej Siewior
To: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
    Ian Rogers, Ingo Molnar, Jiri Olsa, Marco Elver, Mark Rutland,
    Namhyung Kim, Peter Zijlstra, Thomas Gleixner,
    Sebastian Andrzej Siewior
Subject: [PATCH v2 2/4] perf: Enqueue SIGTRAP always via task_work.
Date: Tue, 12 Mar 2024 19:01:50 +0100
Message-ID: <20240312180814.3373778-3-bigeasy@linutronix.de>
In-Reply-To: <20240312180814.3373778-1-bigeasy@linutronix.de>
References: <20240312180814.3373778-1-bigeasy@linutronix.de>

A signal is delivered by raising irq_work() which works from any context
including NMI. irq_work() can be delayed if the architecture does not
provide an interrupt vector. In order not to lose a signal, the signal
is injected via task_work during event_sched_out().

Instead of going via irq_work, the signal could be added directly via
task_work. The signal is sent to current and can be enqueued on its
return path to userland instead of triggering irq_work. A dummy IRQ is
required in the NMI case to ensure the task_work is handled before
returning to userland. For this, irq_work is used. An alternative would
be to just raise an interrupt, as arch_send_call_function_single_ipi()
does.

During testing with `remove_on_exec' it became visible that the event
can be enqueued via NMI during execve(). The task_work must not be kept
because free_event() will complain later. Also, the new task will not
have a sighandler installed.

Queue the signal via task_work. Remove perf_event::pending_sigtrap and
use perf_event::pending_work instead. Raise irq_work in the NMI case
for a dummy interrupt. Remove the task_work if the event is freed.

Signed-off-by: Sebastian Andrzej Siewior
---
 include/linux/perf_event.h |  3 +--
 kernel/events/core.c       | 45 +++++++++++++++++---------------------
 2 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index d2a15c0c6f8a9..24ac6765146c7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -781,7 +781,6 @@ struct perf_event {
 	unsigned int			pending_wakeup;
 	unsigned int			pending_kill;
 	unsigned int			pending_disable;
-	unsigned int			pending_sigtrap;
 	unsigned long			pending_addr;	/* SIGTRAP */
 	struct irq_work			pending_irq;
 	struct callback_head		pending_task;
@@ -959,7 +958,7 @@ struct perf_event_context {
 	struct rcu_head			rcu_head;
 
 	/*
-	 * Sum (event->pending_sigtrap + event->pending_work)
+	 * Sum (event->pending_work + event->pending_work)
 	 *
 	 * The SIGTRAP is targeted at ctx->task, as such it won't do changing
 	 * that until the signal is delivered.
diff --git a/kernel/events/core.c b/kernel/events/core.c
index c7a0274c662c8..e9926baaa1587 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2283,21 +2283,6 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
 		state = PERF_EVENT_STATE_OFF;
 	}
 
-	if (event->pending_sigtrap) {
-		bool dec = true;
-
-		event->pending_sigtrap = 0;
-		if (state != PERF_EVENT_STATE_OFF &&
-		    !event->pending_work) {
-			event->pending_work = 1;
-			dec = false;
-			WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
-			task_work_add(current, &event->pending_task, TWA_RESUME);
-		}
-		if (dec)
-			local_dec(&event->ctx->nr_pending);
-	}
-
 	perf_event_set_state(event, state);
 
 	if (!is_software_event(event))
@@ -6741,11 +6726,6 @@ static void __perf_pending_irq(struct perf_event *event)
 	 * Yay, we hit home and are in the context of the event.
 	 */
 	if (cpu == smp_processor_id()) {
-		if (event->pending_sigtrap) {
-			event->pending_sigtrap = 0;
-			perf_sigtrap(event);
-			local_dec(&event->ctx->nr_pending);
-		}
 		if (event->pending_disable) {
 			event->pending_disable = 0;
 			perf_event_disable_local(event);
@@ -9592,14 +9572,17 @@ static int __perf_event_overflow(struct perf_event *event,
 
 		if (regs)
 			pending_id = hash32_ptr((void *)instruction_pointer(regs)) ?: 1;
-		if (!event->pending_sigtrap) {
-			event->pending_sigtrap = pending_id;
+		if (!event->pending_work) {
+			event->pending_work = pending_id;
 			local_inc(&event->ctx->nr_pending);
-			irq_work_queue(&event->pending_irq);
+			WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
+			task_work_add(current, &event->pending_task, TWA_RESUME);
+			if (in_nmi())
+				irq_work_queue(&event->pending_irq);
 		} else if (event->attr.exclude_kernel && valid_sample) {
 			/*
 			 * Should not be able to return to user space without
-			 * consuming pending_sigtrap; with exceptions:
+			 * consuming pending_work; with exceptions:
 			 *
 			 * 1. Where !exclude_kernel, events can overflow again
 			 *    in the kernel without returning to user space.
@@ -9609,7 +9592,7 @@ static int __perf_event_overflow(struct perf_event *event,
 			 * To approximate progress (with false negatives),
 			 * check 32-bit hash of the current IP.
 			 */
-			WARN_ON_ONCE(event->pending_sigtrap != pending_id);
+			WARN_ON_ONCE(event->pending_work != pending_id);
 		}
 
 		event->pending_addr = 0;
@@ -13049,6 +13032,13 @@ static void sync_child_event(struct perf_event *child_event)
 		     &parent_event->child_total_time_running);
 }
 
+static bool task_work_cb_match(struct callback_head *cb, void *data)
+{
+	struct perf_event *event = container_of(cb, struct perf_event, pending_task);
+
+	return event == data;
+}
+
 static void
 perf_event_exit_event(struct perf_event *event, struct perf_event_context *ctx)
 {
@@ -13088,6 +13078,11 @@ perf_event_exit_event(struct perf_event *event, struct perf_event_context *ctx)
 		 * Kick perf_poll() for is_event_hup();
 		 */
 		perf_event_wakeup(parent_event);
+		if (event->pending_work &&
+		    task_work_cancel_match(current, task_work_cb_match, event)) {
+			put_event(event);
+			local_dec(&event->ctx->nr_pending);
+		}
 		free_event(event);
 		put_event(parent_event);
 		return;
-- 
2.43.0
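
For readers who want to poke at the bookkeeping outside the kernel, the
following is a rough user-space model of the scheme the patch describes:
queue the callback once per event, run it on the return-to-user path, and
cancel it by matching the embedded callback_head when the event exits
first. It is only a sketch. Every model_* name, the simplified
struct callback_head, the plain int refcount and the linked list are
assumptions made for illustration; none of this is the kernel's task_work
implementation, and NMI handling and atomics are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature of struct callback_head: one queued work item. */
struct callback_head {
	struct callback_head *next;
	void (*func)(struct callback_head *cb);
};

/* Hypothetical miniature of struct perf_event; only what the model needs. */
struct model_event {
	unsigned int pending_work;	/* non-zero while a callback is queued */
	int refcount;			/* plain int here, atomic in the kernel */
	int signals_sent;
	struct callback_head pending_task;
};

/* Hypothetical task: its work list plus the context's nr_pending counter. */
struct model_task {
	struct callback_head *works;
	int nr_pending;
};

/* Stand-in for signal delivery: just count it. */
static void model_sigtrap(struct callback_head *cb)
{
	struct model_event *event = container_of(cb, struct model_event, pending_task);

	event->signals_sent++;
}

/* Overflow path: queue the callback once; later overflows do not re-queue. */
static bool model_overflow(struct model_task *task, struct model_event *event,
			   unsigned int pending_id)
{
	if (event->pending_work)
		return false;
	event->pending_work = pending_id;
	task->nr_pending++;
	event->refcount++;		/* reference held by the queued work */
	event->pending_task.func = model_sigtrap;
	event->pending_task.next = task->works;
	task->works = &event->pending_task;
	return true;
}

/* Return-to-user path: run every queued callback and release its reference. */
static void model_return_to_user(struct model_task *task)
{
	while (task->works) {
		struct callback_head *cb = task->works;
		struct model_event *event = container_of(cb, struct model_event, pending_task);

		assert(event->pending_work);	/* queued implies pending */
		task->works = cb->next;
		cb->func(cb);
		event->pending_work = 0;
		task->nr_pending--;
		event->refcount--;
	}
}

/* Exit path: unlink the event's callback if it is still queued. */
static bool model_cancel(struct model_task *task, struct model_event *event)
{
	struct callback_head **pp;

	for (pp = &task->works; *pp; pp = &(*pp)->next) {
		if (container_of(*pp, struct model_event, pending_task) != event)
			continue;
		*pp = (*pp)->next;
		task->nr_pending--;
		event->refcount--;	/* drop the reference the work held */
		return true;
	}
	return false;
}
```

In this model a second model_overflow() before model_return_to_user() is
a no-op, and model_cancel() returns true only if the callback was still
queued, which is the case in which the patch drops the extra reference
and decrements nr_pending.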