Subject: Re: [PATCH v3 4/7] perf record: Track sideband events for all CPUs when tracing selected CPUs
To: Adrian Hunter
References: <20230722093219.174898-1-yangjihong1@huawei.com> <20230722093219.174898-5-yangjihong1@huawei.com> <4ec5cf9e-130d-4259-420f-420508186858@intel.com> <095df85c-e44e-9ff0-ad28-c3473a9a01e4@huawei.com> <25b32870-12e1-b237-648a-3c6fd9678bb9@intel.com>
From: Yang Jihong
Message-ID: <8c62094a-921c-2f3a-2130-3f083f4ac178@huawei.com>
Date: Mon, 31 Jul 2023 22:28:05 +0800
In-Reply-To: <25b32870-12e1-b237-648a-3c6fd9678bb9@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On 2023/7/31 21:01, Adrian Hunter wrote:
> On 31/07/23 15:38, Yang Jihong wrote:
>> Hello,
>>
>> On 2023/7/31 19:08, Adrian Hunter wrote:
>>> On 22/07/23 12:32, Yang Jihong wrote:
>>>> User space tasks can migrate between CPUs, we need to track side-band
>>>> events for all CPUs.
>>>>
>>>> The specific scenarios are as follows:
>>>>
>>>>           CPU0                                 CPU1
>>>>    perf record -C 0 start
>>>>                                taskA starts to be created and executed
>>>>                                  -> PERF_RECORD_COMM and PERF_RECORD_MMAP
>>>>                                     events only deliver to CPU1
>>>>                                ......
>>>>                                  |
>>>>                            migrate to CPU0
>>>>                                  |
>>>>    Running on CPU0    <----------/
>>>>    ...
>>>>
>>>>    perf record -C 0 stop
>>>>
>>>> Now perf samples the PC of taskA. However, perf does not record the
>>>> PERF_RECORD_COMM and PERF_RECORD_MMAP events of taskA.
>>>> Therefore, the comm and symbols of taskA cannot be parsed.
>>>>
>>>> The solution is to record sideband events for all CPUs when tracing
>>>> selected CPUs. Because this modifies the default behavior, add related
>>>> comments to the perf record man page.
>>>>
>>>> The sys_perf_event_open invoked is as follows:
>>>>
>>>>    # perf --debug verbose=3 record -e cpu-clock -C 1 true
>>>>
>>>>    Opening: cpu-clock
>>>>    ------------------------------------------------------------
>>>>    perf_event_attr:
>>>>      type                             1 (PERF_TYPE_SOFTWARE)
>>>>      size                             136
>>>>      config                           0 (PERF_COUNT_SW_CPU_CLOCK)
>>>>      { sample_period, sample_freq }   4000
>>>>      sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
>>>>      read_format                      ID|LOST
>>>>      disabled                         1
>>>>      inherit                          1
>>>>      freq                             1
>>>>      sample_id_all                    1
>>>>      exclude_guest                    1
>>>>    ------------------------------------------------------------
>>>>    sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 5
>>>>    Opening: dummy:u
>>>>    ------------------------------------------------------------
>>>>    perf_event_attr:
>>>>      type                             1 (PERF_TYPE_SOFTWARE)
>>>>      size                             136
>>>>      config                           0x9 (PERF_COUNT_SW_DUMMY)
>>>>      { sample_period, sample_freq }   1
>>>>      sample_type                      IP|TID|TIME|CPU|IDENTIFIER
>>>>      read_format                      ID|LOST
>>>>      inherit                          1
>>>>      exclude_kernel                   1
>>>>      exclude_hv                       1
>>>>      mmap                             1
>>>>      comm                             1
>>>>      task                             1
>>>>      sample_id_all                    1
>>>>      exclude_guest                    1
>>>>      mmap2                            1
>>>>      comm_exec                        1
>>>>      ksymbol                          1
>>>>      bpf_event                        1
>>>>    ------------------------------------------------------------
>>>>    sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 6
>>>>    sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 7
>>>>    sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 9
>>>>    sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 10
>>>>    sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 11
>>>>    sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 12
>>>>    sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 13
>>>>    sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 14
>>>>
>>>>
>>>> Signed-off-by: Yang Jihong
>>>> ---
>>>>   tools/perf/Documentation/perf-record.txt |  3 +++
>>>>   tools/perf/builtin-record.c              | 14 +++++++++++++-
>>>>   2 files changed, 16 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
>>>> index 680396c56bd1..dac53ece51ab 100644
>>>> --- a/tools/perf/Documentation/perf-record.txt
>>>> +++ b/tools/perf/Documentation/perf-record.txt
>>>> @@ -388,6 +388,9 @@ comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-
>>>>   In per-thread mode with inheritance mode on (default), samples are captured only when
>>>>   the thread executes on the designated CPUs. Default is to monitor all CPUs.
>>>>
>>>> +User space tasks can migrate between CPUs, so when tracing selected CPUs,
>>>> +a dummy event is created to track sideband for all CPUs.
>>>> +
>>>>   -B::
>>>>   --no-buildid::
>>>>   Do not save the build ids of binaries in the perf.data files. This skips
>>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>>>> index 3ff9d972225e..4e8e97928f05 100644
>>>> --- a/tools/perf/builtin-record.c
>>>> +++ b/tools/perf/builtin-record.c
>>>> @@ -912,6 +912,7 @@ static int record__config_tracking_events(struct record *rec)
>>>>   {
>>>>       struct record_opts *opts = &rec->opts;
>>>>       struct evlist *evlist = rec->evlist;
>>>> +    bool system_wide = false;
>>>>       struct evsel *evsel;
>>>>
>>>>       /*
>>>> @@ -921,7 +922,18 @@ static int record__config_tracking_events(struct record *rec)
>>>>        */
>>>>       if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
>>>>           perf_pmus__num_core_pmus() > 1) {
>>>> -        evsel = evlist__findnew_tracking_event(evlist, false);
>>>> +
>>>> +        /*
>>>> +         * User space tasks can migrate between CPUs, so when tracing
>>>> +         * selected CPUs, sideband for all CPUs is still needed.
>>>> +         *
>>>> +         * If all (non-dummy) evsel have exclude_user,
>>>> +         * system_wide is not needed.
>>>> +         */
>>>> +        if (!!opts->target.cpu_list && !opts->all_kernel)
>>>
>>> Not everyone uses all-kernel.  Can we check the evsels are either dummy
>>> or exclude_user?
>> For perf_record, exclude_user of all evsels is set in evsel__config(), and record__config_tracking_events() is before evsel__config().
>>
>> Uh..., it seems that only opts->all_kernel can be used to check exclude_user of evsels.
>>
>> void evsel__config()
>> {
>>   ...
>>   if (opts->all_kernel) {
>>     attr->exclude_kernel = 0;
>>     attr->exclude_user   = 1;
>>   }
>>   ...
>> }
>
> The parser updates attr in accordance with ":k" etc.  I guess

Yes, the ":k" situation also needs to be considered.

> opts->all_kernel or opts->all_user override that as well.

Yes, opts->all_kernel and opts->all_user will overwrite the original attr, see [1].

We may need to check all_user, all_kernel and the exclude_user of the non-dummy evsels at the same time:

if ((all_user && one_non_dummy_exist) ||
    (!all_user && !all_kernel && one_non_dummy_without_exclude_user)) {
	system_wide = true;
}

[1]
# perf --debug verbose=2 record -e cpu-clock:u --all-kernel true
------------------------------------------------------------
perf_event_attr:
  type                             1 (PERF_TYPE_SOFTWARE)
  size                             136
  config                           0 (PERF_COUNT_SW_CPU_CLOCK)
  { sample_period, sample_freq }   4000
  sample_type                      IP|TID|TIME|PERIOD
  read_format                      ID|LOST
  disabled                         1
  inherit                          1
  exclude_user                     1
  exclude_hv                       1
  mmap                             1
  comm                             1
  freq                             1
  enable_on_exec                   1
  task                             1
  sample_id_all                    1
  exclude_guest                    1
  mmap2                            1
  comm_exec                        1
  ksymbol                          1
  bpf_event                        1
------------------------------------------------------------

Thanks,
Yang
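
For illustration only, the combined check above could be sketched in isolation roughly as follows. This is a minimal sketch, not the patch: `struct sketch_evsel`, `struct sketch_opts` and `need_system_wide_sideband()` are hypothetical stand-ins invented for the example, not perf's real `struct evsel` or `struct record_opts`. It assumes, as discussed in this thread, that all_user and all_kernel override whatever exclude_user the parser set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an evsel (not perf's struct evsel). */
struct sketch_evsel {
	bool is_dummy;     /* true for a dummy (sideband-only) event */
	bool exclude_user; /* as set by the parser, e.g. via a ":k" modifier */
};

/* Hypothetical stand-in for the two record options under discussion. */
struct sketch_opts {
	bool all_user;
	bool all_kernel;
};

/*
 * Return true when at least one non-dummy event may sample user space,
 * i.e. when sideband events would be wanted from all CPUs.
 */
static bool need_system_wide_sideband(const struct sketch_opts *opts,
				      const struct sketch_evsel *evsels,
				      size_t nr)
{
	bool one_non_dummy_exist = false;
	bool one_non_dummy_without_exclude_user = false;
	size_t i;

	assert(opts && (evsels || nr == 0));

	for (i = 0; i < nr; i++) {
		if (evsels[i].is_dummy)
			continue;
		one_non_dummy_exist = true;
		if (!evsels[i].exclude_user)
			one_non_dummy_without_exclude_user = true;
	}

	/* Same truth table as the condition proposed in the mail. */
	return (opts->all_user && one_non_dummy_exist) ||
	       (!opts->all_user && !opts->all_kernel &&
		one_non_dummy_without_exclude_user);
}
```

With this shape, an event list containing only `:k` events and no override yields false, while the same list with all_user set yields true, which is the distinction the `!opts->all_kernel` test alone cannot make.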