Received: by 2002:a25:824b:0:0:0:0:0 with SMTP id d11csp1189151ybn; Wed, 2 Oct 2019 12:10:26 -0700 (PDT) X-Google-Smtp-Source: APXvYqyDbevP2PZS3znRuEkKtt05TqhXSdhDZlq++uMmt3l/VtsVmVW+3NonCt6gXWkmjyIOOU9E X-Received: by 2002:a05:6402:1251:: with SMTP id l17mr5582066edw.270.1570043426625; Wed, 02 Oct 2019 12:10:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1570043426; cv=none; d=google.com; s=arc-20160816; b=iIKX2SfiZdNL68jl9SSMkFeeX0+uIkJXEfbi2hiT7q5nXONhuiOlB653cKp/G/2EBz vUVBpRNYhBQxe81HAlg09S9hOftRD5OO/cZqVP7l1Z+sei/FOd/xXW8aKzkHPSmW1fej gii0kqzXT7NhlA2cK0lPof35rOIPD8D9LBnRthpmrKUiwcnK6ByGEJszoXwhmdlEB7Nv pD5d0i131FhtNM8vL9aEcay91hPZ7PKpZ9TEksdIEGkweAW9y4wSltWkNDhjtdH8EDtL X7C7dMMIGPyz6Jrxs2A2aoMfbkjrlWDY47u8XyKxyaMAbq3IOtnOZykgaQrX1KCqVuUI K8fg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:subject:message-id:date:cc:to :from:mime-version:content-transfer-encoding:content-disposition; bh=eidWJIITrKPBia8+Z22UCR7Mh1iS3sJk/rKndw4/Ub0=; b=QAu2MNvZNAFGXkwNtsKpv3BNpdzooEg00Fu7X0VDMD62cuSqflAGlv0PWkFaufakxM YgVU7Sb+G3xsKvO2X0U150nS0JQ+TcAMTYbmmYwe88D7SKN8S98vtX/cfH3OijCY/ejc SWlZ15jqMnCm6jnSGwx5UTjJZP/dvwb7pAQDiyM8Oyig+0qUaJPHUPwYaugh+ogbZdk0 G42hLMofOYAR2G55792dzFGo3XWLL0Z2bTDFEKTpLSOmeLt4LMan5o03n0+vZigBC1w8 cZ3+Y1fy6NbI2NAP0LC7tBTUdykKtMUjogwZ/mi1ygE1IszdmhdGXkRG7PiqnK8V9L/7 CsQg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d32si21295eda.266.2019.10.02.12.10.01; Wed, 02 Oct 2019 12:10:26 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729587AbfJBTJL (ORCPT + 99 others); Wed, 2 Oct 2019 15:09:11 -0400 Received: from shadbolt.e.decadent.org.uk ([88.96.1.126]:36118 "EHLO shadbolt.e.decadent.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729380AbfJBTIW (ORCPT ); Wed, 2 Oct 2019 15:08:22 -0400 Received: from [192.168.4.242] (helo=deadeye) by shadbolt.decadent.org.uk with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1iFjyo-00035r-Eq; Wed, 02 Oct 2019 20:08:06 +0100 Received: from ben by deadeye with local (Exim 4.92.1) (envelope-from ) id 1iFjyn-0003bm-MR; Wed, 02 Oct 2019 20:08:05 +0100 Content-Type: text/plain; charset="UTF-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit MIME-Version: 1.0 From: Ben Hutchings To: linux-kernel@vger.kernel.org, stable@vger.kernel.org CC: akpm@linux-foundation.org, Denis Kirjanov , "Linus Torvalds" , "Jiri Olsa" , "Namhyung Kim" , "Vince Weaver" , "Arnaldo Carvalho de Melo" , "Alexander Shishkin" , "Yabin Cui" , mark.rutland@arm.com, "Peter Zijlstra (Intel)" , "Ingo Molnar" , "Stephane Eranian" , "Arnaldo Carvalho de Melo" , "Thomas Gleixner" Date: Wed, 02 Oct 2019 20:06:51 +0100 Message-ID: X-Mailer: LinuxStableQueue (scripts by bwh) X-Patchwork-Hint: ignore Subject: [PATCH 3.16 24/87] perf/ring_buffer: Fix exposing a temporarily decreased data_head In-Reply-To: X-SA-Exim-Connect-IP: 192.168.4.242 X-SA-Exim-Mail-From: ben@decadent.org.uk X-SA-Exim-Scanned: No (on shadbolt.decadent.org.uk); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 3.16.75-rc1 review patch. If anyone has any objections, please let me know. ------------------ From: Yabin Cui commit 1b038c6e05ff70a1e66e3e571c2e6106bdb75f53 upstream. In perf_output_put_handle(), an IRQ/NMI can happen in below location and write records to the same ring buffer: ... local_dec_and_test(&rb->nest) ... <-- an IRQ/NMI can happen here rb->user_page->data_head = head; ... In this case, a value A is written to data_head in the IRQ, then a value B is written to data_head after the IRQ. And A > B. As a result, data_head is temporarily decreased from A to B. And a reader may see data_head < data_tail if it read the buffer frequently enough, which creates unexpected behaviors. This can be fixed by moving dec(&rb->nest) to after updating data_head, which prevents the IRQ/NMI above from updating data_head. [ Split up by peterz. ] Signed-off-by: Yabin Cui Signed-off-by: Peter Zijlstra (Intel) Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Cc: mark.rutland@arm.com Fixes: ef60777c9abd ("perf: Optimize the perf_output() path by removing IRQ-disables") Link: http://lkml.kernel.org/r/20190517115418.224478157@infradead.org Signed-off-by: Ingo Molnar Signed-off-by: Ben Hutchings --- kernel/events/ring_buffer.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -50,11 +50,18 @@ again: head = local_read(&rb->head); /* - * IRQ/NMI can happen here, which means we can miss a head update. + * IRQ/NMI can happen here and advance @rb->head, causing our + * load above to be stale. */ - if (!local_dec_and_test(&rb->nest)) + /* + * If this isn't the outermost nesting, we don't have to update + * @rb->user_page->data_head. + */ + if (local_read(&rb->nest) > 1) { + local_dec(&rb->nest); goto out; + } /* * Since the mmap() consumer (userspace) can run on a different CPU: @@ -86,9 +93,18 @@ again: rb->user_page->data_head = head; /* - * Now check if we missed an update -- rely on previous implied - * compiler barriers to force a re-read. + * We must publish the head before decrementing the nest count, + * otherwise an IRQ/NMI can publish a more recent head value and our + * write will (temporarily) publish a stale value. + */ + barrier(); + local_set(&rb->nest, 0); + + /* + * Ensure we decrement @rb->nest before we validate the @rb->head. + * Otherwise we cannot be sure we caught the 'last' nested update. */ + barrier(); if (unlikely(head != local_read(&rb->head))) { local_inc(&rb->nest); goto again;