Subject: Re: [PATCH v2] perf/core: Fix data race between perf_event_set_output and perf_mmap_close
From: Yang Jihong
To: Peter Zijlstra
References: <20220704120006.98141-1-yangjihong1@huawei.com> <1e28533a-33ed-cae3-0389-c68e7c52cead@huawei.com>
Message-ID: <3ba262b0-a92f-e57d-af31-baccca765ac3@huawei.com>
Date: Sat, 9 Jul 2022 10:00:29 +0800
In-Reply-To: <1e28533a-33ed-cae3-0389-c68e7c52cead@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On 2022/7/6 20:29, Yang Jihong wrote:
> Hello,
>
> On 2022/7/5 21:07, Peter Zijlstra wrote:
>> On Mon, Jul 04, 2022 at 05:26:04PM +0200, Peter Zijlstra wrote:
>>> On Mon, Jul 04, 2022 at 08:00:06PM +0800, Yang Jihong wrote:
>>>> Data race exists between perf_event_set_output and perf_mmap_close.
>>>> The scenario is as follows:
>>>>
>>>>                 CPU1                                  CPU2
>>>>
>>>>                                                     perf_mmap_close(event2)
>>>>                                                       if (atomic_dec_and_test(&event2->rb->mmap_count))  // mmap_count 1 -> 0
>>>>                                                         detach_rest = true;
>>>> ioctl(event1, PERF_EVENT_IOC_SET_OUTPUT, event2)
>>>>   perf_event_set_output(event1, event2)
>>>>                                                       if (!detach_rest)
>>>>                                                         goto out_put;
>>>>                                                       list_for_each_entry_rcu(event, &event2->rb->event_list, rb_entry)
>>>>                                                         ring_buffer_attach(event, NULL)
>>>>                                                       // because event1 has not been added to event2->rb->event_list,
>>>>                                                       // event1->rb is not set to NULL in these loops
>>>>     ring_buffer_attach(event1, event2->rb)
>>>>       list_add_rcu(&event1->rb_entry, &event2->rb->event_list)
>>>>
>>>> The above data race causes a problem, that is, event1->rb is not
>>>> NULL, but event1->rb->mmap_count is 0.
>>>> If the perf_mmap interface is invoked for the fd of event1, the
>>>> kernel is stuck in an infinite loop in perf_mmap:
>>>>
>>>> again:
>>>>          mutex_lock(&event->mmap_mutex);
>>>>          if (event->rb) {
>>>>
>>>>                  if (!atomic_inc_not_zero(&event->rb->mmap_count)) {
>>>>                          /*
>>>>                           * Raced against perf_mmap_close() through
>>>>                           * perf_event_set_output(). Try again, hope for better
>>>>                           * luck.
>>>>                           */
>>>>                          mutex_unlock(&event->mmap_mutex);
>>>>                          goto again;
>>>>                  }
>>>>
>>>
>>> Too tired, must look again tomorrow, little feedback below.
>>
>> With brain more awake I ended up with the below. Does that work?

I have verified that this patch solves the problem.
Should I submit this patch, or will you submit it?

Thanks,
Yang
>
> Yes, I applied the patch to kernel 5.10 and to mainline,
> and it fixed the problem.
>
> Tested-by: Yang Jihong
>
> Thanks,
> Yang
> .
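
For readers following the thread, here is a minimal userspace sketch of why the quoted perf_mmap retry never terminates once event1->rb is non-NULL while event1->rb->mmap_count is 0. It is not kernel code: inc_not_zero() is a C11 stand-in for atomic_inc_not_zero(), and toy_rb, toy_event, toy_mmap_retry() and the max_tries bound are hypothetical names introduced only for illustration (the real loop has no bound and also takes mmap_mutex).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's atomic_inc_not_zero():
 * increment *v unless it is 0; return true if the increment happened.
 */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Hypothetical, stripped-down ring buffer: only the mmap counter. */
struct toy_rb {
	atomic_int mmap_count;
};

/* Hypothetical, stripped-down event: only the rb pointer. */
struct toy_event {
	struct toy_rb *rb;
};

/*
 * Models the "again:" retry from the quoted perf_mmap code, but gives
 * up after max_tries so the sketch terminates. Returns the number of
 * attempts used on success, or -1 if every attempt failed.
 */
static int toy_mmap_retry(struct toy_event *event, int max_tries)
{
	assert(max_tries > 0);

	for (int tries = 1; tries <= max_tries; tries++) {
		if (!event->rb)
			return tries;	/* no buffer attached, nothing to retry */
		if (inc_not_zero(&event->rb->mmap_count))
			return tries;	/* took a reference on the buffer */
		/* rb != NULL but mmap_count == 0: kernel does "goto again" */
	}
	return -1;
}
```

With rb pointing at a buffer whose mmap_count is 0, nothing in the loop ever changes the counter, so toy_mmap_retry() exhausts any bound it is given; with a live buffer it succeeds on the first attempt.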