Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp3847946pxb; Tue, 17 Nov 2020 05:18:45 -0800 (PST) X-Google-Smtp-Source: ABdhPJzVuZOH4ggmWfXtI3VGUmUWCuQ/Miib1nFVtHcj8OiBt6U0Il8TxUofteJdmBc/TvN/HcE1 X-Received: by 2002:aa7:df04:: with SMTP id c4mr20710852edy.25.1605619125210; Tue, 17 Nov 2020 05:18:45 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1605619125; cv=none; d=google.com; s=arc-20160816; b=KxbpSvR5jJt4nPI7LgoeqiC3O48OKGftGt9cVEK9hdens8XgKIx1s9C/N92zDt/73R xMZ2/mfpqgPewylgvjCz/AAuXnc9Y9316dFsrJdZ1Gkzpt2CYn/56+U9WlmhHqueRNpd YSw2eZ1dXgoVZsYDTGgbRB+ohyJ5iu/Nbxd/jKQAixaB6ZSM8X4FgCo4weNmqEevJzu+ JWGKjLEcYLFKOJ36kzwUnis/uygEJ9Tu3kjUSYtZLBugj3eMGfbJa6QRJbHzlhwhE1XM 8L/35MMw6ysYyhQkXWVIa7TaDMqqmNz8M0JtWyJsxW22MJw4nPmHt0VmcOW1gAq5RoK2 KYfA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=9NL0WidOVHjsGOmDNGFwbvt8L2LXLOMn+JmdGB1+pNU=; b=ktZ4XQMjsL46GYZTedSkLptDrXewPc/nJTuuvuY21FfGfhXifA00kg3VqfUqHcMBBt Y1VMEXbvi8uxIZbI+mhXYiDhvr0XeY9XUZiZ3DLg07hZLvMgRqXs3NuhJUcuFWIc1Ms0 awiaWrqTLiPqliCXBoh3UiClfY+XISij5W/qgzT1/l2ZJIJ/UNfUo2RC/QJiqFTFIx2E 3o8Dkt2y+thbZJE0WjZtDytbzk5F65Tb0FBTpm6iuqPsIGSCB2kq+D2DZ/xQdHFcb+Nl 2PCZFVUDkQ5NueJD7H4okv2y48IwiK04pDlbkRHODIqC75nYkfJ3pSgpThon3TmbsQmx U4/A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=0v2YHzDz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id x3si717324ejj.269.2020.11.17.05.18.22; Tue, 17 Nov 2020 05:18:45 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=0v2YHzDz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729458AbgKQNNv (ORCPT + 99 others); Tue, 17 Nov 2020 08:13:51 -0500 Received: from mail.kernel.org ([198.145.29.99]:44326 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729722AbgKQNNo (ORCPT ); Tue, 17 Nov 2020 08:13:44 -0500 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 08614221EB; Tue, 17 Nov 2020 13:13:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1605618823; bh=8ob0CyxlvL7lTuw+NAc1Pp9r/zR7825ePp7woDTRSX4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=0v2YHzDzQ5r+xqnA49G6oibdkCU1/SeqoqFg0q6lzcIWMq+ESUX7eEek4u0IaPP+f aRhgz8zDVvboEnzDeWMPgNL9bLDf9XgLLoC1rEUU9TK8i7idQo5JxIcBcuCAtErup4 NIU12WIMRdsCPVQMGE1rVnfJ0OMTpARbbXeliWzE= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, "Steven Rostedt (VMware)" , Sasha Levin Subject: [PATCH 4.14 02/85] ring-buffer: Fix recursion protection transitions between interrupt context Date: Tue, 17 Nov 2020 14:04:31 +0100 Message-Id: <20201117122111.139672659@linuxfoundation.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201117122111.018425544@linuxfoundation.org> References: <20201117122111.018425544@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Steven Rostedt (VMware) [ Upstream commit b02414c8f045ab3b9afc816c3735bc98c5c3d262 ] The recursion protection of the ring buffer depends on preempt_count() to be correct. But it is possible that the ring buffer gets called after an interrupt comes in but before it updates the preempt_count(). This will trigger a false positive in the recursion code. Use the same trick from the ftrace function callback recursion code which uses a "transition" bit that gets set, to allow for a single recursion for to handle transitions between contexts. Cc: stable@vger.kernel.org Fixes: 567cd4da54ff4 ("ring-buffer: User context bit recursion checking") Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin --- kernel/trace/ring_buffer.c | 54 +++++++++++++++++++++++++++++++------- 1 file changed, 44 insertions(+), 10 deletions(-) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index b9b71e7fb6979..8082328eb01a4 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -416,14 +416,16 @@ struct rb_event_info { /* * Used for which event context the event is in. - * NMI = 0 - * IRQ = 1 - * SOFTIRQ = 2 - * NORMAL = 3 + * TRANSITION = 0 + * NMI = 1 + * IRQ = 2 + * SOFTIRQ = 3 + * NORMAL = 4 * * See trace_recursive_lock() comment below for more details. */ enum { + RB_CTX_TRANSITION, RB_CTX_NMI, RB_CTX_IRQ, RB_CTX_SOFTIRQ, @@ -2553,10 +2555,10 @@ rb_wakeups(struct ring_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer) * a bit of overhead in something as critical as function tracing, * we use a bitmask trick. * - * bit 0 = NMI context - * bit 1 = IRQ context - * bit 2 = SoftIRQ context - * bit 3 = normal context. + * bit 1 = NMI context + * bit 2 = IRQ context + * bit 3 = SoftIRQ context + * bit 4 = normal context. * * This works because this is the order of contexts that can * preempt other contexts. A SoftIRQ never preempts an IRQ @@ -2579,6 +2581,30 @@ rb_wakeups(struct ring_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer) * The least significant bit can be cleared this way, and it * just so happens that it is the same bit corresponding to * the current context. + * + * Now the TRANSITION bit breaks the above slightly. The TRANSITION bit + * is set when a recursion is detected at the current context, and if + * the TRANSITION bit is already set, it will fail the recursion. + * This is needed because there's a lag between the changing of + * interrupt context and updating the preempt count. In this case, + * a false positive will be found. To handle this, one extra recursion + * is allowed, and this is done by the TRANSITION bit. If the TRANSITION + * bit is already set, then it is considered a recursion and the function + * ends. Otherwise, the TRANSITION bit is set, and that bit is returned. + * + * On the trace_recursive_unlock(), the TRANSITION bit will be the first + * to be cleared. Even if it wasn't the context that set it. That is, + * if an interrupt comes in while NORMAL bit is set and the ring buffer + * is called before preempt_count() is updated, since the check will + * be on the NORMAL bit, the TRANSITION bit will then be set. If an + * NMI then comes in, it will set the NMI bit, but when the NMI code + * does the trace_recursive_unlock() it will clear the TRANSTION bit + * and leave the NMI bit set. But this is fine, because the interrupt + * code that set the TRANSITION bit will then clear the NMI bit when it + * calls trace_recursive_unlock(). If another NMI comes in, it will + * set the TRANSITION bit and continue. + * + * Note: The TRANSITION bit only handles a single transition between context. */ static __always_inline int @@ -2597,8 +2623,16 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer) } else bit = RB_CTX_NORMAL; - if (unlikely(val & (1 << bit))) - return 1; + if (unlikely(val & (1 << bit))) { + /* + * It is possible that this was called by transitioning + * between interrupt context, and preempt_count() has not + * been updated yet. In this case, use the TRANSITION bit. + */ + bit = RB_CTX_TRANSITION; + if (val & (1 << bit)) + return 1; + } val |= (1 << bit); cpu_buffer->current_context = val; -- 2.27.0