Date: Tue, 5 Jul 2022 15:07:26 +0200
From: Peter Zijlstra
To: Yang Jihong
Cc: mingo@redhat.com, acme@kernel.org, mark.rutland@arm.com,
    alexander.shishkin@linux.intel.com, jolsa@kernel.org, namhyung@kernel.org,
    linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] perf/core: Fix data race between perf_event_set_output and perf_mmap_close
Message-ID:
References: <20220704120006.98141-1-yangjihong1@huawei.com>
In-Reply-To:

On Mon, Jul 04, 2022 at 05:26:04PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 04, 2022 at 08:00:06PM +0800, Yang Jihong wrote:
> > A data race exists between perf_event_set_output() and perf_mmap_close().
> > The scenario is as follows:
> >
> >                     CPU1                                        CPU2
> >                                               perf_mmap_close(event2)
> >                                                 if (atomic_dec_and_test(&event2->rb->mmap_count)) // mmap_count 1 -> 0
> >                                                   detach_rest = true;
> > ioctl(event1, PERF_EVENT_IOC_SET_OUTPUT, event2)
> >   perf_event_set_output(event1, event2)
> >                                                 if (!detach_rest)
> >                                                   goto out_put;
> >                                                 list_for_each_entry_rcu(event, &event2->rb->event_list, rb_entry)
> >                                                   ring_buffer_attach(event, NULL)
> >                                                   // because event1 has not been added to event2->rb->event_list,
> >                                                   // event1->rb is not set to NULL in these loops
> >     ring_buffer_attach(event1, event2->rb)
> >       list_add_rcu(&event1->rb_entry, &event2->rb->event_list)
> >
> > This race leaves event1->rb non-NULL while event1->rb->mmap_count is 0.
> > If perf_mmap() is then invoked on event1's fd, the kernel spins forever
> > in perf_mmap():
> >
> > again:
> > 	mutex_lock(&event->mmap_mutex);
> > 	if (event->rb) {
> >
> > 		if (!atomic_inc_not_zero(&event->rb->mmap_count)) {
> > 			/*
> > 			 * Raced against perf_mmap_close() through
> > 			 * perf_event_set_output(). Try again, hope for better
> > 			 * luck.
> > 			 */
> > 			mutex_unlock(&event->mmap_mutex);
> > 			goto again;
> > 		}
> >
> 
> Too tired, must look again tomorrow, little feedback below.

With brain more awake I ended up with the below. Does that work?

---
 kernel/events/core.c | 45 +++++++++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4d8c335a07db..c9d32d4d2e20 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6262,10 +6262,10 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
 
 		if (!atomic_inc_not_zero(&event->rb->mmap_count)) {
 			/*
-			 * Raced against perf_mmap_close() through
-			 * perf_event_set_output(). Try again, hope for better
-			 * luck.
+			 * Raced against perf_mmap_close(); remove the
+			 * event and try again.
 			 */
+			ring_buffer_attach(event, NULL);
 			mutex_unlock(&event->mmap_mutex);
 			goto again;
 		}
@@ -11840,14 +11840,25 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
 	goto out;
 }
 
+static void mutex_lock_double(struct mutex *a, struct mutex *b)
+{
+	if (b < a)
+		swap(a, b);
+
+	mutex_lock(a);
+	mutex_lock_nested(b, SINGLE_DEPTH_NESTING);
+}
+
 static int
 perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 {
 	struct perf_buffer *rb = NULL;
 	int ret = -EINVAL;
 
-	if (!output_event)
+	if (!output_event) {
+		mutex_lock(&event->mmap_mutex);
 		goto set;
+	}
 
 	/* don't allow circular references */
 	if (event == output_event)
@@ -11885,8 +11896,15 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 	    event->pmu != output_event->pmu)
 		goto out;
 
+	/*
+	 * Hold both mmap_mutex to serialize against perf_mmap_close(). Since
+	 * output_event is already on rb->event_list, and the list iteration
+	 * restarts after every removal, it is guaranteed this new event is
+	 * observed *OR* if output_event is already removed, it's guaranteed we
+	 * observe !rb->mmap_count.
+	 */
+	mutex_lock_double(&event->mmap_mutex, &output_event->mmap_mutex);
 set:
-	mutex_lock(&event->mmap_mutex);
 	/* Can't redirect output if we've got an active mmap() */
 	if (atomic_read(&event->mmap_count))
 		goto unlock;
@@ -11896,6 +11914,12 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 		rb = ring_buffer_get(output_event);
 		if (!rb)
 			goto unlock;
+
+		/* did we race against perf_mmap_close() */
+		if (!atomic_read(&rb->mmap_count)) {
+			ring_buffer_put(rb);
+			goto unlock;
+		}
 	}
 
 	ring_buffer_attach(event, rb);
@@ -11903,20 +11927,13 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 	ret = 0;
 unlock:
 	mutex_unlock(&event->mmap_mutex);
+	if (output_event)
+		mutex_unlock(&output_event->mmap_mutex);
 
 out:
 	return ret;
 }
 
-static void mutex_lock_double(struct mutex *a, struct mutex *b)
-{
-	if (b < a)
-		swap(a, b);
-
-	mutex_lock(a);
-	mutex_lock_nested(b, SINGLE_DEPTH_NESTING);
-}
-
 static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id)
 {
 	bool nmi_safe = false;
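
For reference, a rough user-space sketch of the call sequence involved in the
scenario above. This is illustrative only, not a reproducer: the two racing
calls are shown sequentially, whereas hitting the bug needs them issued from
concurrent threads that win a narrow window, and error handling is omitted.

	/*
	 * Sketch of the operations that race above: event2's ring buffer
	 * being unmapped (-> perf_mmap_close()) while event1 is redirected
	 * into that buffer with PERF_EVENT_IOC_SET_OUTPUT
	 * (-> perf_event_set_output()).
	 */
	#define _GNU_SOURCE
	#include <linux/perf_event.h>
	#include <string.h>
	#include <sys/ioctl.h>
	#include <sys/mman.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	static int perf_open(void)
	{
		struct perf_event_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		attr.type = PERF_TYPE_SOFTWARE;
		attr.config = PERF_COUNT_SW_CPU_CLOCK;
		attr.disabled = 1;

		/* current task, any CPU, no group leader, no flags */
		return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
	}

	int main(void)
	{
		long page = sysconf(_SC_PAGESIZE);
		size_t len = (1 + 8) * page;	/* 1 control page + 2^3 data pages */
		int fd1 = perf_open();		/* "event1" in the scenario */
		int fd2 = perf_open();		/* "event2" in the scenario */

		/* give event2 a ring buffer; rb->mmap_count becomes 1 */
		void *rb = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd2, 0);

		/* CPU2 side: drop the last mapping -> perf_mmap_close(event2) */
		munmap(rb, len);

		/* CPU1 side: redirect event1 into event2's buffer */
		ioctl(fd1, PERF_EVENT_IOC_SET_OUTPUT, fd2);

		/*
		 * If the interleaving shown in the scenario is hit, a
		 * subsequent mmap() on fd1 is where an unpatched kernel
		 * loops forever.
		 */
		close(fd1);
		close(fd2);
		return 0;
	}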
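
On the locking side, mutex_lock_double() sorts its two arguments by address
before acquiring them so that two tasks locking the same pair of mmap_mutex
in opposite argument order cannot ABBA-deadlock; the mutex_lock_nested(...,
SINGLE_DEPTH_NESTING) is only a lockdep annotation. A minimal stand-alone
illustration of the same address-ordering pattern, written here with pthreads
rather than kernel mutexes:

	#include <pthread.h>

	/*
	 * Address-ordered double lock: always acquire the lower-addressed
	 * mutex first, so the acquisition order is globally consistent no
	 * matter how callers order their arguments.
	 */
	static void lock_double(pthread_mutex_t *a, pthread_mutex_t *b)
	{
		if (b < a) {			/* compare addresses, not state */
			pthread_mutex_t *tmp = a;
			a = b;
			b = tmp;
		}
		pthread_mutex_lock(a);
		pthread_mutex_lock(b);
	}

	static void unlock_double(pthread_mutex_t *a, pthread_mutex_t *b)
	{
		/* unlock order does not affect correctness */
		pthread_mutex_unlock(a);
		pthread_mutex_unlock(b);
	}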