From: Linus Torvalds
Date: Mon, 13 Mar 2023 11:21:44 -0700
Subject: Re: Linux 6.3-rc2
To: Guenter Roeck, "Paul E. McKenney", Frederic Weisbecker, Peter Zijlstra
Cc: Linux Kernel Mailing List

On Mon, Mar 13, 2023 at 8:53 AM Guenter Roeck wrote:
>
> Warning backtraces in calls from ct_nmi_enter(),
> seen randomly.

Hmm. I suspect this one is a bug in the warning, not in the kernel,
although I have no idea why it would have started happening now.

This happens from an irq event, but that check is not *supposed* to
happen at all from interrupts:

         * We dont accurately track softirq state in e.g.
         * hardirq contexts (such as on 4KSTACKS), so only
         * check if not in hardirq contexts:

but I think that the ct_nmi_enter() function was called before the
hardirq count had even been incremented.

> Sample decoded stack trace:

Hmm. That WARNING backtrace doesn't actually seem to follow the stack
chain, so it only shows the irq stack, not where the irq happened.

> Seen if CONFIG_DEBUG_LOCK_ALLOC=y and CONFIG_CONTEXT_TRACKING_IDLE=y.
> It seems that rcu_read_lock_sched_held() can be true when entering an interrupt.
>
> The problem is not seen in v6.2, but occurs randomly on ToT with various
> arm emulations.

Strange. I must be wrong about this being a race on the warning
itself, because that warning has been there for a long long time.

Adding in some people who might have more of a clue. I'm thinking
Frederic and Paul might know what's up with the context tracking, but
I don't see why this would be arm-related or have started recently.

But I do note that PeterZ did some rcuidle tracing cleanups that do
end up affecting arm too. So adding PeterZ too.

Original email with full details at

   https://lore.kernel.org/lkml/d915df60-d06b-47d4-8b47-8aa1bbc2aac7@roeck-us.net/

for added peeps. Anybody?

              Linus