Date: Tue, 18 May 2021 10:12:32 +0100
Message-ID: <878s4cv20v.wl-maz@kernel.org>
From: Marc Zyngier
To: John Stultz
Cc: Catalin Marinas, Kees Cook, Will Deacon, Sami Tolvanen, linux-arm-kernel, Linux Kernel Mailing List, Bjorn Andersson, YongQin Liu, Amit Pundir, Michael Walle
Subject: Re: REGRESSION: kernel BUG at arch/arm64/kernel/alternative.c:157!
X-Mailing-List: linux-kernel@vger.kernel.org

+ Michael

On Mon, 17 May 2021 22:52:59 +0100, John Stultz wrote:
>
> With v5.13-rc2, I've been seeing an odd boot regression with the
> DragonBoard 845c:
>
> Unfortunately, trying to bisect it down (v5.13-rc1 works ok) is giving
> me inconsistent results so far. It feels a bit like maybe some config
> option gets enabled moving forward, and then sticks around when we go
> back. I'll take another swing at bisecting it later today, but I have
> to move on to some other work right now, so I figured I'd share (with
> folks who better know the recent __apply_alternatives changes) in case
> folks have a better idea:
>
> [ 0.254384] CPU features: detected: RAS Extension Support
> [ 0.259928] CPU: All CPU(s) started at EL1
> [ 0.264127] alternatives: patching kernel code
> [ 0.268635] ------------[ cut here ]------------
> [ 0.273303] kernel BUG at arch/arm64/kernel/alternative.c:157!
> [ 0.279192] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [ 0.284736] Modules linked in:
> [ 0.287833] CPU: 0 PID: 14 Comm: migration/0 Not tainted 5.13.0-rc2-mainline #4501
> [ 0.295472] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 0.301182] Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x128/0x160
> [ 0.309020] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO BTYPE=--)
> [ 0.315086] pc : __apply_alternatives+0x1f0/0x270
> [ 0.319847] lr : __apply_alternatives+0xf4/0x270
> [ 0.324515] sp : ffffffc01020bca0
> [ 0.327874] x29: ffffffc01020bca0 x28: 00000000000000a0 x27: ffffffd7f5c11124
> [ 0.335086] x26: ffffffd7f5c11128 x25: 00000000001b0020 x24: ffffffd7f700ab90
> [ 0.342297] x23: 0000000000000000 x22: ffffffc01020bd20 x21: ffffffd7f7bea374
> [ 0.349508] x20: ffffffc01020bd30 x19: ffffffd7f72194fc x18: ffffffffffffffff
> [ 0.356718] x17: ffffffd7f7bdce40 x16: 000000005c8e1b43 x15: ffffffd7f76d9d10
> [ 0.363929] x14: ffffffc09020b967 x13: ffffffc01020b975 x12: ffffffd7f76d9e30
> [ 0.371140] x11: 0000000005f5e0ff x10: ffffffc01020b8c0 x9 : 00000000ffffffd0
> [ 0.378350] x8 : 6b20676e69686374 x7 : ffffffd7f79b9238 x6 : c0000000ffff7fff
> [ 0.385560] x5 : 0000000000000000 x4 : ffffffd7f5c22898 x3 : 0000000000000010
> [ 0.392771] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 000000000000003f
> [ 0.399982] Call trace:
> [ 0.402461] __apply_alternatives+0x1f0/0x270
> [ 0.406873] __apply_alternatives_multi_stop+0xc0/0xe0
> [ 0.412062] multi_cpu_stop+0xb8/0x1a0
> [ 0.415851] cpu_stopper_thread+0xac/0x120
> [ 0.419997] smpboot_thread_fn+0x200/0x238
> [ 0.424146] kthread+0x14c/0x158
> [ 0.427423] ret_from_fork+0x10/0x1c
> [ 0.431045] Code: 39402e61 39402a62 6b01005f 54fff500 (d4210000)
> [ 0.437199] ---[ end trace 523e13d9d60a992d ]---
> [ 0.441868] note: migration/0[14] exited with preempt_count 2
> [ 0.447739] migration/0 (14) used greatest stack depth: 12448 bytes left

[/me digs in my IRC logs]

This looks a lot like an issue that was reported
by Michael Walle a few days ago on IRC, leading to a crash that looked
like this:

[ 0.325238] alternatives: patching kernel code
[ 0.329735] ------------[ cut here ]------------
[ 0.334394] kernel BUG at arch/arm64/kernel/alternative.c:157!
[ 0.340300] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 0.345836] Modules linked in:
[ 0.348916] CPU: 0 PID: 14 Comm: migration/0 Not tainted 5.13.0-rc1-next-20210511+ #536
[ 0.356998] Hardware name: Kontron SMARC-sAL28 (Single PHY) on SMARC Eval 2.0 carrier (DT)
[ 0.365339] Stopper: multi_cpu_stop+0x0/0x1a8 <- stop_cpus.constprop.9+0x78/0xc8
[ 0.372820] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO BTYPE=--)
[ 0.378882] pc : __apply_alternatives.isra.1+0x1c4/0x270
[ 0.384246] lr : __apply_alternatives.isra.1+0x110/0x270
[ 0.389606] sp : ffff800012db3ca0
[ 0.392946] x29: ffff800012db3ca0 x28: 0000000000000000 x27: ffff800010011924
[ 0.400155] x26: ffff800010011928 x25: 00000000001b0020 x24: ffff8000115ad350
[ 0.407364] x23: ffff800012db3d28 x22: 0000000000000000 x21: ffff800011fb24cd
[ 0.414571] x20: ffff800012db3d30 x19: ffff800011840b38 x18: 0000000000000010
[ 0.421779] x17: 0000000044a56c23 x16: 0000000000000002 x15: ffffffffffffffff
[ 0.428986] x14: ffff800011d50a48 x13: ffff800092db3987 x12: ffff800011de6a70
[ 0.436193] x11: 0000000000000003 x10: ffff800011dcea30 x9 : ffff8000105d8928
[ 0.443401] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001
[ 0.450608] x5 : 0000000000000000 x4 : ffff800010024398 x3 : 0000000000000010
[ 0.457815] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 000000000000003f
[ 0.465022] Call trace:
[ 0.467483] __apply_alternatives.isra.1+0x1c4/0x270
[ 0.472493] __apply_alternatives_multi_stop+0xcc/0xe0
[ 0.477679] multi_cpu_stop+0xac/0x1a8
[ 0.481460] cpu_stopper_thread+0xa4/0x138
[ 0.485592] smpboot_thread_fn+0x12c/0x268
[ 0.489725] kthread+0x164/0x168
[ 0.492980] ret_from_fork+0x10/0x30
[ 0.496588] Code: 39402e61 39402a62 6b01005f 54fff6a0 (d4210000)
[ 0.502742] ---[ end trace 24ef7d65759ab825 ]---
[ 0.507398] note: migration/0[14] exited with preempt_count 2
[ 0.513290] ------------[ cut here ]------------

Michael subsequently reported that:

mhh nevermind, I can't reproduce it anymore. Maybe I should have
recompiled with a clean build dir at first

My gut feeling is that we can end up with some build leftovers when
going between -rc1 and -rc2, hence the screw-up when the capabilities
get reordered. Dependency issues?

Thanks,

	M.

--
Without deviation from the norm, progress is not possible.