Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp5149642pxj; Wed, 9 Jun 2021 10:14:11 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzWQHVUPklz9q9lsYIciMUZO/as9m1H/Lkk4FQ8gMfcvcujtkdlmCTLFcJPMx/SI3V4pmKf X-Received: by 2002:aa7:da94:: with SMTP id q20mr483426eds.310.1623258851521; Wed, 09 Jun 2021 10:14:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1623258851; cv=none; d=google.com; s=arc-20160816; b=wJCItFemF35GMNgINLPwQ+Fohhj5ubF0nrh/uKN2B1NhI9Qjstgs+tYpFCobPyBvKD NUybwBJnSywoe3SB6QWmY06k0rtCAow9rqoMKBWWRR0de6ddPpXpparIqcSaFyUTFtHu 63BohsG6WA+x/rdXWS6H2+YiZ3d2mc9FVW5fCZC1Oj8ltRyDeDpQU7iwzTBeaZOb7cUb H/146PSO773pVYVx44kBjcT8JOFO/3flEpL0Y1SVBlLMQba1dg18lnT/3LyqBf3wfb4k kjAzgvreeLeGo2H9ypJas7je/kfUp5ZIbTtowE/HvP7C19zsELVDPYQbXw80RQl1DfCV 3f5w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:organization :from:references:cc:to:subject:ironport-sdr:ironport-sdr; bh=KcUFCulI74xR3+rGUwxL6qrTi1rB5QkzCCeMNpWezl4=; b=RCXiGW9DBRKQS/t0gGijiQM09Zer+hjXrhZ4bAb/vqF3rzcPn7ZkGR6HNIrhEMIvmo zABLi2ORjpgDQ1Xn5l6q6BS/9u7HPSog2HpqUsJtoZueVoy4RRbxnjy1C23BW89K39gU LStCc3elQZeiclUouKx/0MYHXV/8ehT98KhPt0XjfPvZqot8xyXNVT0vL3sLURiYSM9q qbBXGdd3L+zi5BSq7NUZtGjHkhAvwCLwcXs6RxkvV9JUe6+VPGMDE0pv5RpsPPdKJbTL zqGDe+bCy/OzV52os3lKM9EY0G52zjzEsRZhcro6rVJe4bV7fpEFX8kfHjRmtkR+ZXBk B4yA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id d5si231132edn.231.2021.06.09.10.13.47; Wed, 09 Jun 2021 10:14:11 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237592AbhFIIZG (ORCPT + 99 others); Wed, 9 Jun 2021 04:25:06 -0400 Received: from mga01.intel.com ([192.55.52.88]:7219 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236161AbhFIIZC (ORCPT ); Wed, 9 Jun 2021 04:25:02 -0400 IronPort-SDR: npEVE186Hlzc8j6uSYt+hBaEPhWWnzZ+YTkyfMWbnX0/GAD9f9meAKZnxpLbbrq7qpHlEMJ9Jn MtIoLOzh/Yvw== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="226395376" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="226395376" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Jun 2021 01:23:00 -0700 IronPort-SDR: 7PEud1cuUSnZaaYrwPItsgM29nX57AV8UOQZF+rIBpJ1HMHahxHxx4pGAdJBq15m7Pker0o5nl bzOIPse5OozQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482294965" Received: from ahunter-desktop.fi.intel.com (HELO [10.237.72.79]) ([10.237.72.79]) by orsmga001.jf.intel.com with ESMTP; 09 Jun 2021 01:22:56 -0700 Subject: Re: [PATCH v2 8/8] perf record: Directly bail out for compat case To: Leo Yan Cc: Arnaldo Carvalho de Melo , Peter Zijlstra , Ingo Molnar , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Thomas Gleixner , x86@kernel.org, "H. Peter Anvin" , Mathieu Poirier , Suzuki K Poulose , Mike Leach , Andi Kleen , linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org References: <20210602103007.184993-1-leo.yan@linaro.org> <20210602103007.184993-9-leo.yan@linaro.org> <20210602123847.GE10272@leoy-ThinkPad-X240s> <96e5fac6-17a2-ea03-9b15-338b84321ecf@intel.com> <20210607150903.GC1071897@leoy-ThinkPad-X240s> From: Adrian Hunter Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki, Business Identity Code: 0357606 - 4, Domiciled in Helsinki Message-ID: Date: Wed, 9 Jun 2021 11:23:25 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <20210607150903.GC1071897@leoy-ThinkPad-X240s> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 7/06/21 6:09 pm, Leo Yan wrote: > On Mon, Jun 07, 2021 at 01:23:43PM +0300, Adrian Hunter wrote: >> On 2/06/21 3:38 pm, Leo Yan wrote: >>> Hi Adrain, >>> >>> On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote: >>>> On 2/06/21 1:30 pm, Leo Yan wrote: >>>>> Since the 64-bit atomicity is not promised in 32-bit perf, directly >>>>> report the error and bail out for this case. >>>>> >>>>> Now only applies on x86_64 and Arm64 platforms. >>>>> >>>>> Suggested-by: Adrian Hunter >>>> >>>> Maybe we can do better for the compat case. >>>> >>>> We can assume the upper 32-bits change very seldom, >>>> and always increase. So for the 'read' case: >>>> >>>> u64 first, second, last; >>>> u64 mask = (u64)((u32)-1) << 32; >>>> >>>> do { >>>> first = READ_ONCE(pc->aux_head); >>>> rmb(); >>>> second = READ_ONCE(pc->aux_head); >>>> rmb(); >>>> last = READ_ONCE(pc->aux_head); >>>> } while ((first & mask) != (last & mask)); >>>> return second; >>>> >>>> For the write case, we can cause a fatal error only if the new >>>> tail has non-zero upper 32-bits. That gives up to 4GiB of data >>>> before aborting: >>>> >>>> if (tail & mask) >>>> return -1; >>>> smp_mb(); >>>> WRITE_ONCE(pc->aux_tail, tail); >>> >>> Seems to me, it's pointless to only support aux_head for 64-bit and >>> support aux_tail for 32-bit. I understand this can be helpful for the >>> snapshot mode which only uses aux_head, but it still fails to support >>> the normal case for AUX ring buffer using 64-bit head/tail. >> >> I am not sure why you say it is pointless. 'perf record' would still be >> able to capture up to 4GiB of data. Do you mean you usually capture more >> than 4GiB of data? > > Okay, understand. We can support 32-bit perf for compat mode when the > trace data is less than 4GiB. > >> I was thinking we would separate out the compat case: >> >> #if BITS_PER_LONG == 32 >> if (kernel_is_64_bit) >> return compat_auxtrace_mmap__[read_head/write_tail]() >> #endif >> >> So the non-compat cases would not be affected. > > Because I don't want to introduce the complexity for read/write head > and tail, and we also need to handle the same issue for the perf ring > buffer. So how about below change? > > The main idea for below change is it allows the perf to run normally > on the compat mode and exitly if detects the buffer head is close to > the low 32-bit's overflow: when detect the low 32-bit value is bigger > than 0xf0000000 (so we have 256MiB margin to the overflow), it reports > error and exit. > > diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c > index 1b4091a3b508..2a9965bfeab4 100644 > --- a/tools/perf/util/auxtrace.c > +++ b/tools/perf/util/auxtrace.c > @@ -1693,6 +1693,14 @@ static int __auxtrace_mmap__read(struct mmap *map, > pr_debug3("auxtrace idx %d old %#"PRIx64" head %#"PRIx64" diff %#"PRIx64"\n", > mm->idx, old, head, head - old); > > +#ifdef BITS_PER_LONG == 32 > + if (kernel_is_64bit() && head >= 0xf0000000) { You are assuming the head never increases by more than 256MiB which means you should limit the buffer size to 256MiB maximum. To me this seems a bit too far from an ideal solution. I would have thought separating out the compat case makes things simpler to understand. > + pr_err("32-bit perf cannot read 64-bit value atomically;\n"); > + pr_err("exit to avoid the 4GB (32-bit) AUX buffer overflow on compat mode.\n"); > + return -ENOMEM; > + } > +#endif > + > if (mm->mask) { > head_off = head & mm->mask; > old_off = old & mm->mask; > diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c > index 9130f6fad8d5..823b69895b85 100644 > --- a/tools/perf/util/env.c > +++ b/tools/perf/util/env.c > @@ -405,3 +405,20 @@ int perf_env__numa_node(struct perf_env *env, int cpu) > > return cpu >= 0 && cpu < env->nr_numa_map ? env->numa_map[cpu] : -1; > } > + > +int perf_kernel_is_64bit(void) > +{ > + struct utsname uts; > + int ret; > + > + ret = uname(&uts); > + if (ret < 0) > + return 0; > + > + if (!strncmp(uts.machine, "x86_64", 6) || > + !strncmp(uts.machine, "aarch64", 7) || > + !strncmp(uts.machine, "arm64", 5)) > + return 1; > + > + return 0; > +} Obviously, we don't need to keep checking uname. It could be a global variable that is always set up early. > diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h > index ca249bf5e984..c6c034fc08f6 100644 > --- a/tools/perf/util/env.h > +++ b/tools/perf/util/env.h > @@ -147,4 +147,6 @@ void perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node); > struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id); > > int perf_env__numa_node(struct perf_env *env, int cpu); > + > +int perf_kernel_is_64bit(void); > #endif /* __PERF_ENV_H */ > diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c > index ab7108d22428..f1d3725d599a 100644 > --- a/tools/perf/util/mmap.c > +++ b/tools/perf/util/mmap.c > @@ -323,6 +323,14 @@ int perf_mmap__push(struct mmap *md, void *to, > if (rc < 0) > return (rc == -EAGAIN) ? 1 : -1; > > +#ifdef BITS_PER_LONG == 32 > + if (kernel_is_64bit() && head >= 0xf0000000) { > + pr_err("32-bit perf cannot read 64-bit value atomically;\n"); > + pr_err("exit to avoid the 4GB (32-bit) buffer overflow on compat mode.\n"); > + return -ENOMEM; > + } > +#endif > + > size = md->core.end - md->core.start; > > if ((md->core.start & md->core.mask) + size != (md->core.end & md->core.mask)) { > > Thanks, > Leo >