From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	"Steven Rostedt (VMware)" <rostedt@goodmis.org>
Subject: [PATCH 5.9 060/133] ring-buffer: Fix recursion protection transitions between interrupt context
Date: Mon, 9 Nov 2020 13:55:22 +0100
Message-Id: <20201109125033.618651446@linuxfoundation.org>
In-Reply-To: <20201109125030.706496283@linuxfoundation.org>
References: <20201109125030.706496283@linuxfoundation.org>
X-Mailer: git-send-email 2.29.2
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Steven Rostedt (VMware) <rostedt@goodmis.org>

commit b02414c8f045ab3b9afc816c3735bc98c5c3d262 upstream.

The recursion protection of the ring buffer depends on preempt_count()
being correct. But it is possible that the ring buffer gets called
after an interrupt comes in but before it updates the preempt_count().
This will trigger a false positive in the recursion code.

Use the same trick from the ftrace function callback recursion code,
which uses a "transition" bit that gets set, to allow a single extra
recursion when handling transitions between contexts.

Cc: stable@vger.kernel.org
Fixes: 567cd4da54ff4 ("ring-buffer: User context bit recursion checking")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
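As a quick illustration of the "transition" trick, here is a minimal
userspace sketch. The names are hypothetical (model_recursive_lock(),
model_recursive_unlock(), a plain global current_context), and the
cpu_buffer->nest offset is left out; this is not the kernel code itself,
only the bit pattern it relies on.

#include <stdio.h>

/* Mirrors the RB_CTX_* ordering: TRANSITION is the lowest bit. */
enum {
	CTX_TRANSITION,
	CTX_NMI,
	CTX_IRQ,
	CTX_SOFTIRQ,
	CTX_NORMAL,
};

static unsigned int current_context;

/* Return 0 if tracing may proceed, 1 if this is a real recursion. */
static int model_recursive_lock(int bit)
{
	unsigned int val = current_context;

	if (val & (1U << bit)) {
		/*
		 * The context bit is already set: either a genuine
		 * recursion, or an interrupt arrived before
		 * preempt_count() was updated. Allow exactly one
		 * extra nesting via the TRANSITION bit.
		 */
		bit = CTX_TRANSITION;
		if (val & (1U << bit))
			return 1;
	}

	current_context = val | (1U << bit);
	return 0;
}

/* Clear the least significant set bit: TRANSITION goes first. */
static void model_recursive_unlock(void)
{
	current_context &= current_context - 1;
}

int main(void)
{
	int ret;

	/* Normal context starts tracing. */
	ret = model_recursive_lock(CTX_NORMAL);
	printf("normal:   ret=%d ctx=%#x\n", ret, current_context);

	/*
	 * An interrupt fires, but the ring buffer is entered before
	 * preempt_count() reflects it: the check still lands on
	 * CTX_NORMAL, so the TRANSITION bit absorbs the false positive.
	 */
	ret = model_recursive_lock(CTX_NORMAL);
	printf("lagging:  ret=%d ctx=%#x\n", ret, current_context);

	/* A second hit in the same window is treated as real recursion. */
	ret = model_recursive_lock(CTX_NORMAL);
	printf("recurse:  ret=%d ctx=%#x\n", ret, current_context);

	model_recursive_unlock();	/* clears TRANSITION */
	model_recursive_unlock();	/* clears NORMAL */
	printf("unlocked: ctx=%#x\n", current_context);
	return 0;
}

Compiled with any C compiler, this prints the context mask after each
step: the first "recursion" is absorbed by the TRANSITION bit, the
second is rejected.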
 kernel/trace/ring_buffer.c | 58 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 46 insertions(+), 12 deletions(-)

--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -438,14 +438,16 @@ enum {
 };
 /*
  * Used for which event context the event is in.
- *  NMI     = 0
- *  IRQ     = 1
- *  SOFTIRQ = 2
- *  NORMAL  = 3
+ *  TRANSITION = 0
+ *  NMI     = 1
+ *  IRQ     = 2
+ *  SOFTIRQ = 3
+ *  NORMAL  = 4
  *
  * See trace_recursive_lock() comment below for more details.
  */
 enum {
+	RB_CTX_TRANSITION,
 	RB_CTX_NMI,
 	RB_CTX_IRQ,
 	RB_CTX_SOFTIRQ,
@@ -3014,10 +3016,10 @@ rb_wakeups(struct trace_buffer *buffer,
  * a bit of overhead in something as critical as function tracing,
  * we use a bitmask trick.
  *
- *  bit 0 =  NMI context
- *  bit 1 =  IRQ context
- *  bit 2 =  SoftIRQ context
- *  bit 3 =  normal context.
+ *  bit 1 =  NMI context
+ *  bit 2 =  IRQ context
+ *  bit 3 =  SoftIRQ context
+ *  bit 4 =  normal context.
  *
  * This works because this is the order of contexts that can
  * preempt other contexts. A SoftIRQ never preempts an IRQ
@@ -3040,6 +3042,30 @@ rb_wakeups(struct trace_buffer *buffer,
  * The least significant bit can be cleared this way, and it
  * just so happens that it is the same bit corresponding to
  * the current context.
+ *
+ * Now the TRANSITION bit breaks the above slightly. The TRANSITION bit
+ * is set when a recursion is detected at the current context, and if
+ * the TRANSITION bit is already set, it will fail the recursion.
+ * This is needed because there's a lag between the changing of
+ * interrupt context and updating the preempt count. In this case,
+ * a false positive will be found. To handle this, one extra recursion
+ * is allowed, and this is done by the TRANSITION bit. If the TRANSITION
+ * bit is already set, then it is considered a recursion and the function
+ * ends. Otherwise, the TRANSITION bit is set, and that bit is returned.
+ *
+ * On the trace_recursive_unlock(), the TRANSITION bit will be the first
+ * to be cleared. Even if it wasn't the context that set it. That is,
+ * if an interrupt comes in while NORMAL bit is set and the ring buffer
+ * is called before preempt_count() is updated, since the check will
+ * be on the NORMAL bit, the TRANSITION bit will then be set. If an
+ * NMI then comes in, it will set the NMI bit, but when the NMI code
+ * does the trace_recursive_unlock() it will clear the TRANSITION bit
+ * and leave the NMI bit set. But this is fine, because the interrupt
+ * code that set the TRANSITION bit will then clear the NMI bit when it
+ * calls trace_recursive_unlock(). If another NMI comes in, it will
+ * set the TRANSITION bit and continue.
+ *
+ * Note: The TRANSITION bit only handles a single transition between contexts.
  */
 static __always_inline int
 trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
@@ -3055,8 +3081,16 @@ trace_recursive_lock(struct ring_buffer_
 		bit = pc & NMI_MASK ? RB_CTX_NMI :
 			pc & HARDIRQ_MASK ? RB_CTX_IRQ : RB_CTX_SOFTIRQ;
 
-	if (unlikely(val & (1 << (bit + cpu_buffer->nest))))
-		return 1;
+	if (unlikely(val & (1 << (bit + cpu_buffer->nest)))) {
+		/*
+		 * It is possible that this was called by transitioning
+		 * between interrupt context, and preempt_count() has not
+		 * been updated yet. In this case, use the TRANSITION bit.
+		 */
+		bit = RB_CTX_TRANSITION;
+		if (val & (1 << (bit + cpu_buffer->nest)))
+			return 1;
+	}
 
 	val |= (1 << (bit + cpu_buffer->nest));
 	cpu_buffer->current_context = val;
@@ -3071,8 +3105,8 @@ trace_recursive_unlock(struct ring_buffe
 		cpu_buffer->current_context - (1 << cpu_buffer->nest);
 }
 
-/* The recursive locking above uses 4 bits */
-#define NESTED_BITS 4
+/* The recursive locking above uses 5 bits */
+#define NESTED_BITS 5
 
 /**
  * ring_buffer_nest_start - Allow to trace while nested
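The unlock ordering described in the long comment above can also be
walked through in miniature. Below is a self-contained sketch (again
with hypothetical names, and nest == 0 assumed) of how clearing the
least significant set bit releases TRANSITION first, then the NMI and
NORMAL bits, matching the NORMAL -> lagging-IRQ -> NMI scenario:

#include <stdio.h>

enum { TRANSITION, NMI, IRQ, SOFTIRQ, NORMAL };

/* One trace_recursive_unlock() step with nest == 0. */
static unsigned int unlock(unsigned int ctx)
{
	return ctx & (ctx - 1);	/* drop the least significant set bit */
}

int main(void)
{
	unsigned int ctx = 0;

	ctx |= 1U << NORMAL;		/* normal context is tracing          */
	ctx |= 1U << TRANSITION;	/* IRQ hits before preempt_count()    */
	ctx |= 1U << NMI;		/* NMI preempts the IRQ and traces    */

	ctx = unlock(ctx);	/* the NMI's unlock clears TRANSITION      */
	printf("after NMI unlock:    %#x\n", ctx);	/* 0x12: NMI still set */

	ctx = unlock(ctx);	/* the IRQ's unlock then clears NMI        */
	printf("after IRQ unlock:    %#x\n", ctx);	/* 0x10: NORMAL left   */

	ctx = unlock(ctx);	/* normal context clears NORMAL last       */
	printf("after NORMAL unlock: %#x\n", ctx);	/* 0 */
	return 0;
}

Note how the NMI's unlock releases a bit it did not set (TRANSITION);
the bookkeeping still balances because each context performs exactly
one unlock.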