Date: Tue, 5 Apr 2022 19:21:04 +0300
Subject: Re: [PATCH v13 01/16] perf record: Introduce thread affinity and mmap masks
To: Ian Rogers
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra, Ingo Molnar, linux-kernel, Andi Kleen, Adrian Hunter, Alexander Antonov, Alexei Budankov, Riccardo Mancini
References: <9042bf7daf988e17e17e6acbf5d29590bde869cd.1642440724.git.alexey.v.bayduraev@linux.intel.com>
From: "Bayduraev, Alexey V"
Organization: Intel Corporation
X-Mailing-List: linux-kernel@vger.kernel.org

On 05.04.2022 1:25, Ian Rogers wrote:
> On Mon, Jan 17, 2022 at 10:38 AM Alexey Bayduraev wrote:
>>
>> Introduce affinity and mmap thread masks.
>> Thread affinity mask
>
> In per-thread mode it is possible that cpus is the dummy CPU map here.
> This means that the cpu below has the value -1 and setting bit -1
> actually has the effect of setting bit 63. Here is a reproduction
> based on the acme/perf/core branch:
>
> ```
> $ make STATIC=1 DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=undefined -fno-sanitize-recover'
> $ perf record -o /tmp/perf.data --per-thread true
> tools/include/asm-generic/bitops/atomic.h:10:36: runtime error: shift exponent -1 is negative
> $ UBSAN_OPTIONS=abort_on_error=1 gdb --args perf record -o /tmp/perf.data --per-thread true
> (gdb) r
> tools/include/asm-generic/bitops/atomic.h:10:36: runtime error: shift exponent -1 is negative
> (gdb) bt
> #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
> #1 0x00007ffff71d2546 in __GI_abort () at abort.c:79
> #2 0x00007ffff640db9f in __sanitizer::Abort () at ../../../../src/libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cpp:151
> #3 0x00007ffff6418efc in __sanitizer::Die () at ../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:58
> #4 0x00007ffff63fd99e in __ubsan::__ubsan_handle_shift_out_of_bounds_abort (Data=<optimized out>, LHS=<optimized out>, RHS=<optimized out>) at ../../../../src/libsanitizer/ubsan/ubsan_handlers.cpp:378
> #5 0x0000555555c54405 in set_bit (nr=-1, addr=0x555556ecd0a0) at tools/include/asm-generic/bitops/atomic.h:10
> #6 0x0000555555c6ddaf in record__mmap_cpu_mask_init (mask=0x555556ecd070, cpus=0x555556ecd050) at builtin-record.c:3333
> #7 0x0000555555c7044c in record__init_thread_default_masks (rec=0x55555681b100, cpus=0x555556ecd050) at builtin-record.c:3668
> #8 0x0000555555c705b3 in record__init_thread_masks (rec=0x55555681b100) at builtin-record.c:3681
> #9 0x0000555555c7297a in cmd_record (argc=1, argv=0x7fffffffdcc0) at builtin-record.c:3976
> #10 0x0000555555e06d41 in run_builtin (p=0x555556827538, argc=5, argv=0x7fffffffdcc0) at perf.c:313
> #11 0x0000555555e07253 in handle_internal_command (argc=5, argv=0x7fffffffdcc0) at perf.c:365
> #12 0x0000555555e07508 in run_argv (argcp=0x7fffffffdb0c, argv=0x7fffffffdb00) at perf.c:409
> #13 0x0000555555e07b32 in main (argc=5, argv=0x7fffffffdcc0) at perf.c:539
> ```
>
> Not setting the mask->bits if the cpu map is dummy causes no data to
> be written. Setting mask->bits 0 causes a segv. Setting bit 63 works
> but feels like there are more invariants broken in the code.
>
> Here is a not good workaround patch:
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index ba74fab02e62..62727b676f98 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -3329,6 +3329,11 @@ static void record__mmap_cpu_mask_init(struct mmap_cpu_mask *mask, struct perf_c
>  {
>  	int c;
>
> +	if (cpu_map__is_dummy(cpus)) {
> +		set_bit(63, mask->bits);
> +		return;
> +	}
> +
>  	for (c = 0; c < cpus->nr; c++)
>  		set_bit(cpus->map[c].cpu, mask->bits);
>  }
>
> Alexey, what should the expected behavior be with per-thread mmaps?
>
> Thanks,
> Ian

Thanks a lot,

In the case of per-thread mmaps we should initialize
thread_data[0]->maps[i] from evlist->mmap[i]. It looks like this was
missed by this patchset.

Your patch works because it triggers test_bit() in
record__thread_data_init_maps(), so the thread_data maps get correctly
initialized.

However, it's better to ignore thread_data->masks in
record__mmap_cpu_mask_init() and set up the thread_data maps explicitly
for the per-thread case.

Also, to prevent more runtime crashes, the --per-thread and --threads
options should be mutually exclusive.

I will prepare a fix for this issue soon.

Regards,
Alexey

>
>> +static void record__free_thread_masks(struct record *rec, int nr_threads)
>> +{
>> +	int t;
>> +
>> +	if (rec->thread_masks)
>> +		for (t = 0; t < nr_threads; t++)
>> +			record__thread_mask_free(&rec->thread_masks[t]);
>> +
>> +	zfree(&rec->thread_masks);
>> +}
>> +
>> +static int record__alloc_thread_masks(struct record *rec, int nr_threads, int nr_bits)
>> +{
>> +	int t, ret;
>> +
>> +	rec->thread_masks = zalloc(nr_threads * sizeof(*(rec->thread_masks)));
>> +	if (!rec->thread_masks) {
>> +		pr_err("Failed to allocate thread masks\n");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	for (t = 0; t < nr_threads; t++) {
>> +		ret = record__thread_mask_alloc(&rec->thread_masks[t], nr_bits);
>> +		if (ret) {
>> +			pr_err("Failed to allocate thread masks[%d]\n", t);
>> +			goto out_free;
>> +		}
>> +	}
>> +
>> +	return 0;
>> +
>> +out_free:
>> +	record__free_thread_masks(rec, nr_threads);
>> +
>> +	return ret;
>> +}
>> +
>> +static int record__init_thread_default_masks(struct record *rec, struct perf_cpu_map *cpus)
>> +{
>> +	int ret;
>> +
>> +	ret = record__alloc_thread_masks(rec, 1, cpu__max_cpu().cpu);
>> +	if (ret)
>> +		return ret;
>> +
>> +	record__mmap_cpu_mask_init(&rec->thread_masks->maps, cpus);
>> +
>> +	rec->nr_threads = 1;
>> +
>> +	return 0;
>> +}
>> +
>> +static int record__init_thread_masks(struct record *rec)
>> +{
>> +	struct perf_cpu_map *cpus = rec->evlist->core.cpus;
>> +
>> +	return record__init_thread_default_masks(rec, cpus);
>> +}
>> +
>>  int cmd_record(int argc, const char **argv)
>>  {
>>  	int err;
>> @@ -2948,6 +3063,12 @@ int cmd_record(int argc, const char **argv)
>>  		goto out;
>>  	}
>>
>> +	err = record__init_thread_masks(rec);
>> +	if (err) {
>> +		pr_err("Failed to initialize parallel data streaming masks\n");
>> +		goto out;
>> +	}
>> +
>>  	if (rec->opts.nr_cblocks > nr_cblocks_max)
>>  		rec->opts.nr_cblocks = nr_cblocks_max;
>>  	pr_debug("nr_cblocks: %d\n", rec->opts.nr_cblocks);
>> @@ -2966,6 +3087,8 @@ int cmd_record(int argc, const char **argv)
>>  	symbol__exit();
>>  	auxtrace_record__free(rec->itr);
>>  out_opts:
>> +	record__free_thread_masks(rec, rec->nr_threads);
>> +	rec->nr_threads = 0;
>>  	evlist__close_control(rec->opts.ctl_fd, rec->opts.ctl_fd_ack, &rec->opts.ctl_fd_close);
>>  	return err;
>>  }
>> --
>> 2.19.0
>>
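
For illustration only: the failure discussed in this thread (set_bit() reached with cpu == -1 from a dummy CPU map) can be sketched outside perf as below. This is a standalone sketch under stated assumptions, not the perf code and not the fix proposed above. The names demo_mask, demo_set_bit() and demo_mask_init() are hypothetical, and a plain int array stands in for struct perf_cpu_map. It only shows a bit setter that rejects negative or out-of-range indices instead of shifting by them, so a dummy map yields an empty mask that the caller can detect and handle explicitly.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NBITS 128
#define DEMO_LONGS ((DEMO_NBITS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Hypothetical stand-in for a CPU bitmask; not perf's struct mmap_cpu_mask. */
struct demo_mask {
	unsigned long bits[DEMO_LONGS];
	size_t nbits;
};

/*
 * Set bit nr in the mask. Indices that are negative (such as the -1 of a
 * dummy CPU map) or beyond nbits are rejected rather than used as a shift
 * count, which would be undefined behaviour.
 */
static bool demo_set_bit(int nr, struct demo_mask *mask)
{
	if (nr < 0 || (size_t)nr >= mask->nbits)
		return false;

	mask->bits[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
	return true;
}

/*
 * Clear the mask and set one bit per valid CPU number in cpus[0..nr-1].
 * Returns how many bits were set; 0 means the map held no real CPU, so the
 * caller has to set up its per-thread maps some other way.
 */
static int demo_mask_init(struct demo_mask *mask, const int *cpus, int nr)
{
	int c, set = 0;

	assert(mask != NULL && cpus != NULL);

	memset(mask->bits, 0, sizeof(mask->bits));
	mask->nbits = DEMO_NBITS;

	for (c = 0; c < nr; c++)
		if (demo_set_bit(cpus[c], mask))
			set++;

	return set;
}
```

With a map of { 0, 3, 65 }, demo_mask_init() returns 3. With the dummy map { -1 } it returns 0 and leaves every bit clear, which a caller could treat as the signal to take a per-thread path.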