From: Leo Yan
To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra, Ingo Molnar,
    Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
    Thomas Gleixner, Borislav Petkov, x86@kernel.org, "H. Peter Anvin",
    Mathieu Poirier, Suzuki K Poulose, Mike Leach,
    linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
    coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org
Cc: Leo Yan
Subject: [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
Date: Sun, 11 Jul 2021 18:41:05 +0800
Message-Id: <20210711104105.505728-12-leo.yan@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210711104105.505728-1-leo.yan@linaro.org>
References: <20210711104105.505728-1-leo.yan@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

When perf runs in compat mode (the kernel in 64-bit mode and perf in
32-bit mode), the atomicity of 64-bit accesses in user space cannot be
assured, e.g.
on some architectures a 64-bit access is split into two instructions,
one for the low 32-bit word and another for the high 32-bit word.

This patch introduces two functions, compat_auxtrace_mmap__read_head()
and compat_auxtrace_mmap__write_tail(). As their names indicate, the
perf tool uses them to access the AUX head and tail when it works in
compat mode.

These two functions allow the perf tool to work properly under certain
conditions, e.g. when it works in snapshot mode and uses only the AUX
head pointer, or when it uses the AUX buffer and the incremented tail
is not bigger than 4GB.

When the perf tool cannot handle the case where the AUX tail is bigger
than 4GB, compat_auxtrace_mmap__write_tail() returns -1 to tell the
caller to bail out with an error.

Suggested-by: Adrian Hunter
Signed-off-by: Leo Yan
---
 tools/perf/util/auxtrace.c |  9 ++--
 tools/perf/util/auxtrace.h | 94 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 98 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 6a63be8b2430..d6fc250fbf97 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1766,10 +1766,13 @@ static int __auxtrace_mmap__read(struct mmap *map,
 	mm->prev = head;
 
 	if (!snapshot) {
-		auxtrace_mmap__write_tail(mm, head);
-		if (itr->read_finish) {
-			int err;
+		int err;
 
+		err = auxtrace_mmap__write_tail(mm, head);
+		if (err < 0)
+			return err;
+
+		if (itr->read_finish) {
 			err = itr->read_finish(itr, mm->idx);
 			if (err < 0)
 				return err;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index d68a5e80b217..66de7b6e65ec 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -18,6 +18,8 @@
 #include
 #include
 
+#include "env.h"
+
 union perf_event;
 struct perf_session;
 struct evlist;
@@ -440,23 +442,111 @@ struct auxtrace_cache;
 
 #ifdef HAVE_AUXTRACE_SUPPORT
 
+/*
+ * In compat mode the kernel runs in 64-bit mode and the perf tool runs in
+ * 32-bit mode. A 32-bit perf tool cannot access a 64-bit value atomically,
+ * which can lead to the issue shown in the sequence below on multiple CPUs:
+ * when the perf tool loads or stores a 64-bit value, on some architectures
+ * the operation is divided into two instructions, one accessing the low
+ * 32-bit word and the other the high 32-bit word. The kernel can therefore
+ * access the 64-bit value between these two user space operations, which
+ * leads to unexpected load values.
+ *
+ *   kernel (64-bit)                        user (32-bit)
+ *
+ *   if (LOAD ->aux_tail) { --,             LOAD ->aux_head_lo
+ *       STORE $aux_data      |       ,--->
+ *       FLUSH $aux_data      |       |     LOAD ->aux_head_hi
+ *       STORE ->aux_head   --|-------`     smp_rmb()
+ *   }                        |             LOAD $data
+ *                            |             smp_mb()
+ *                            |             STORE ->aux_tail_lo
+ *                            `----------->
+ *                                          STORE ->aux_tail_hi
+ *
+ * For this reason, the perf tool cannot work correctly when the AUX head or
+ * tail is bigger than 4GB (longer than 32 bits). We cannot simply limit the
+ * AUX ring buffer to less than 4GB, because the pointers can increase
+ * monotonically (e.g. in snapshot mode): whatever the buffer size is, the
+ * head and tail can eventually grow bigger than 4GB and carry into the high
+ * 32 bits.
+ *
+ * To mitigate the issue and improve the user experience, we allow the perf
+ * tool to work under certain conditions and bail out with an error if it
+ * detects an overflow that cannot be handled.
+ *
+ * To read the AUX head, the value is read three times and the high 4 bytes
+ * of the first and the last read are compared. If the kernel has not
+ * changed the high 4 bytes during the user space read sequence, it is safe
+ * to use the second value.
+ *
+ * When updating the AUX tail, if a carry into the high 32 bits is detected,
+ * user space would need two store operations and cannot guarantee the
+ * atomicity of the 64-bit write, so '-1' is returned in this case to tell
+ * the caller that an overflow error has happened.
+ */
+static inline u64 compat_auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
+{
+	struct perf_event_mmap_page *pc = mm->userpg;
+	u64 first, second, last;
+	u64 mask = (u64)(UINT32_MAX) << 32;
+
+	do {
+		first = READ_ONCE(pc->aux_head);
+		/* Ensure all reads are done after we read the head */
+		smp_rmb();
+		second = READ_ONCE(pc->aux_head);
+		/* Ensure all reads are done after we read the head */
+		smp_rmb();
+		last = READ_ONCE(pc->aux_head);
+	} while ((first & mask) != (last & mask));
+
+	return second;
+}
+
+static inline int compat_auxtrace_mmap__write_tail(struct auxtrace_mmap *mm,
+						    u64 tail)
+{
+	struct perf_event_mmap_page *pc = mm->userpg;
+	u64 mask = (u64)(UINT32_MAX) << 32;
+
+	if (tail & mask)
+		return -1;
+
+	/* Ensure all reads are done before we write the tail out */
+	smp_mb();
+	WRITE_ONCE(pc->aux_tail, tail);
+	return 0;
+}
+
 static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-	u64 head = READ_ONCE(pc->aux_head);
+	u64 head;
+
+#if BITS_PER_LONG == 32
+	if (kernel_is_64_bit)
+		return compat_auxtrace_mmap__read_head(mm);
+#endif
+	head = READ_ONCE(pc->aux_head);
 
 	/* Ensure all reads are done after we read the head */
 	smp_rmb();
 	return head;
 }
 
-static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
+static inline int auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
 
+#if BITS_PER_LONG == 32
+	if (kernel_is_64_bit)
+		return compat_auxtrace_mmap__write_tail(mm, tail);
+#endif
 	/* Ensure all reads are done before we write the tail out */
 	smp_mb();
 	WRITE_ONCE(pc->aux_tail, tail);
+	return 0;
 }
 
 int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
-- 
2.25.1
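
The retry rule in compat_auxtrace_mmap__read_head() and the overflow check in compat_auxtrace_mmap__write_tail() can be tried outside the kernel tree with a small sketch. It is a simplified, single-threaded illustration and not part of the patch: struct fake_head, fake_load(), compat_read_head() and compat_check_tail() are hypothetical names, memory barriers are omitted, and concurrent kernel updates are modelled as a list of values that successive loads observe.

```c
#include <assert.h>
#include <stdint.h>

/* Selects the high 32 bits of a 64-bit pointer value. */
#define HI_MASK ((uint64_t)UINT32_MAX << 32)

/*
 * Hypothetical stand-in for the shared AUX head: each load observes the
 * next value "published by the kernel"; the final value repeats forever.
 */
struct fake_head {
	const uint64_t *values;	/* successive head values */
	unsigned int nr;	/* number of entries in values[] */
	unsigned int pos;	/* index of the next value to observe */
};

static uint64_t fake_load(struct fake_head *h)
{
	uint64_t v;

	assert(h && h->nr > 0);
	v = h->values[h->pos];
	if (h->pos + 1 < h->nr)
		h->pos++;
	return v;
}

/*
 * Read three times and retry while the high 32 bits of the first and the
 * last sample differ; the middle sample is returned once they agree.
 */
static uint64_t compat_read_head(struct fake_head *h)
{
	uint64_t first, second, last;

	do {
		first = fake_load(h);
		second = fake_load(h);
		last = fake_load(h);
	} while ((first & HI_MASK) != (last & HI_MASK));

	return second;
}

/* A tail that needs the high 32 bits cannot be stored atomically: -1. */
static int compat_check_tail(uint64_t tail)
{
	return (tail & HI_MASK) ? -1 : 0;
}
```

With the samples { 0xfffff000, 0x100000000, 0x100000800, 0x100001000 }, the first three loads straddle a carry into the high word, so the loop retries and returns 0x100001000 once the high word is stable. A tail of 0x100000000 is rejected with -1, while 0xffffffff is accepted.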