From: Xianting Tian
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, anup@brainfault.org, heiko@sntech.de, guoren@kernel.org, mick@ics.forth.gr, alexandre.ghiti@canonical.com, bhe@redhat.com, vgoyal@redhat.com, dyoung@redhat.com, corbet@lwn.net, Conor.Dooley@microchip.com
Cc: kexec@lists.infradead.org, linux-doc@vger.kernel.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, crash-utility@redhat.com,
 huanyi.xj@alibaba-inc.com, heinrich.schuchardt@canonical.com, k-hagio-ab@nec.com, hschauhan@nulltrace.org, yixun.lan@gmail.com, Xianting Tian
Subject: [PATCH V5 6/6] RISC-V: Fixup schedule out issue in machine_crash_shutdown()
Date: Tue, 2 Aug 2022 20:18:18 +0800
Message-Id: <20220802121818.2201268-7-xianting.tian@linux.alibaba.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20220802121818.2201268-1-xianting.tian@linux.alibaba.com>
References: <20220802121818.2201268-1-xianting.tian@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Fix a schedule-out issue in machine_crash_shutdown() that is triggered by
an RCU stall:

[224521.877268] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[224521.883471] rcu: 0-...0: (3 GPs behind) idle=cfa/0/0x1 softirq=3968793/3968793 fqs=2495
[224521.891742] (detected by 2, t=5255 jiffies, g=60855593, q=328)
[224521.897754] Task dump for CPU 0:
[224521.901074] task:swapper/0 state:R running task stack: 0 pid: 0 ppid: 0 flags:0x00000008
[224521.911090] Call Trace:
[224521.913638] [] __schedule+0x208/0x5ea
[224521.918957] Kernel panic - not syncing: RCU Stall
[224521.923773] bad: scheduling from the idle thread!
[224521.928571] CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Tainted: G O 5.10.113-yocto-standard #1
[224521.938658] Call Trace:
[224521.941200] [] walk_stackframe+0x0/0xaa
[224521.946689] [] show_stack+0x32/0x3e
[224521.951830] [] dump_stack_lvl+0x7e/0xa2
[224521.957317] [] dump_stack+0x14/0x1c
[224521.962459] [] dequeue_task_idle+0x2c/0x40
[224521.968207] [] __schedule+0x41e/0x5ea
[224521.973520] [] schedule+0x34/0xe4
[224521.978487] [] schedule_timeout+0xc6/0x170
[224521.984234] [] wait_for_completion+0x98/0xf2
[224521.990157] [] __wait_rcu_gp+0x148/0x14a
[224521.995733] [] synchronize_rcu+0x5c/0x66
[224522.001307] [] rcu_sync_enter+0x54/0xe6
[224522.006795] [] percpu_down_write+0x32/0x11c
[224522.012629] [] _cpu_down+0x92/0x21a
[224522.017771] [] smp_shutdown_nonboot_cpus+0x90/0x118
[224522.024299] [] machine_crash_shutdown+0x30/0x4a
[224522.030483] [] __crash_kexec+0x62/0xa6
[224522.035884] [] panic+0xfa/0x2b6
[224522.040678] [] rcu_sched_clock_irq+0xc26/0xcb8
[224522.046774] [] update_process_times+0x62/0x8a
[224522.052785] [] tick_sched_timer+0x9e/0x102
[224522.058533] [] __hrtimer_run_queues+0x16a/0x318
[224522.064716] [] hrtimer_interrupt+0xd4/0x228
[224522.070551] [] riscv_timer_interrupt+0x3c/0x48
[224522.076646] [] handle_percpu_devid_irq+0xb0/0x24c
[224522.083004] [] __handle_domain_irq+0xa8/0x122
[224522.089014] [] riscv_intc_irq+0x38/0x60
[224522.094501] [] ret_from_exception+0x0/0xc
[224522.100161] [] rcu_eqs_enter.constprop.0+0x8c/0xb8

The trace shows panic() being called from the timer interrupt while CPU 2
is running its idle task (swapper/2). machine_crash_shutdown() then reaches
smp_shutdown_nonboot_cpus(), and the CPU hotplug path below it
(_cpu_down() -> percpu_down_write() -> synchronize_rcu()) tries to sleep,
which produces "bad: scheduling from the idle thread!".

With this patch, the system can enter the crash kernel when an RCU stall
occurs.

Signed-off-by: Xianting Tian
---
 arch/riscv/kernel/machine_kexec.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
index 86d1b5f9dfb5..ee79e6839b86 100644
--- a/arch/riscv/kernel/machine_kexec.c
+++ b/arch/riscv/kernel/machine_kexec.c
@@ -138,19 +138,37 @@ void machine_shutdown(void)
 #endif
 }
 
+/* Override the weak function in kernel/panic.c */
+void crash_smp_send_stop(void)
+{
+	static int cpus_stopped;
+
+	/*
+	 * This function can be called twice in panic path, but obviously
+	 * we execute this only once.
+	 */
+	if (cpus_stopped)
+		return;
+
+	smp_send_stop();
+	cpus_stopped = 1;
+}
+
 /*
  * machine_crash_shutdown - Prepare to kexec after a kernel crash
  *
  * This function is called by crash_kexec just before machine_kexec
- * below and its goal is similar to machine_shutdown, but in case of
- * a kernel crash. Since we don't handle such cases yet, this function
- * is empty.
+ * and its goal is to shutdown non-crashing cpus and save registers.
  */
 void machine_crash_shutdown(struct pt_regs *regs)
 {
+	local_irq_disable();
+
+	/* shutdown non-crashing cpus */
+	crash_smp_send_stop();
+
 	crash_save_cpu(regs, smp_processor_id());
-	machine_shutdown();
 	pr_info("Starting crashdump kernel...\n");
 }
-- 
2.17.1
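
For readers who want to try the "run only once" guard from crash_smp_send_stop() outside the kernel, here is a minimal user-space sketch. It is not kernel code: smp_send_stop_stub(), crash_smp_send_stop_sketch(), demo_panic_path() and the stop_calls counter are hypothetical names used only for illustration, and the stub merely counts calls instead of stopping CPUs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's smp_send_stop(); it only counts calls. */
static int stop_calls;

static void smp_send_stop_stub(void)
{
	stop_calls++;
	printf("stopping other CPUs (call %d)\n", stop_calls);
}

/* Same shape as crash_smp_send_stop() in the patch: do the work once. */
static void crash_smp_send_stop_sketch(void)
{
	static int cpus_stopped;

	/* A second call from the panic path returns without doing anything. */
	if (cpus_stopped)
		return;

	smp_send_stop_stub();
	cpus_stopped = 1;
}

/* Call the guard twice, as the panic path may, and return the stub's call count. */
static int demo_panic_path(void)
{
	crash_smp_send_stop_sketch();
	crash_smp_send_stop_sketch();
	assert(stop_calls == 1);
	return stop_calls;
}
```

Calling demo_panic_path() from a small test harness should return 1, because the static flag turns the second call into a no-op. Like the patch, the sketch uses a plain int without locking, so it only illustrates the single-caller case.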