Date: Thu, 6 Jul 2023 14:44:29 +0800
Subject: Re: [BUG REPORT] Triggering a panic in an x86 virtual machine does not wait
To: Thomas Gleixner
CC: yangerkun, Baoquan He, Baokun Li
References: <20230615193330.608657211@linutronix.de> <71578392-63ed-02a9-24da-2adf8cce38c7@huawei.com> <87ttui91jo.ffs@tglx>
From: Baokun Li
In-Reply-To: <87ttui91jo.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023/7/5 16:59, Thomas Gleixner wrote:
> On Mon, Jul 03 2023 at 11:44, Baokun Li wrote:
>
>> When I manually trigger a panic in a QEMU x86 VM with
>>
>>        `echo c > /proc/sysrq-trigger`,
>>
>> I find that the VM will probably reboot directly, even though
>> PANIC_TIMEOUT is 0.
>> This prevents us from exporting the vmcore via panic, and even if we
>> succeed in exporting the vmcore, the processes in the vmcore are mostly
>> in stop_this_cpu(). By bisecting we found the patch that introduced the
>> behavior change:
>>
>>    45e34c8af58f ("x86/smp: Put CPUs into INIT on shutdown if possible")
> Bah, I missed that this is used by crash too. So if this happens to be
> invoked on an AP, i.e. not on CPU 0, then the INIT will reset the
> machine. Fix below.
>
> Thanks,
>
>         tglx
> ---
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index ed2d51960a7d..e1aa2cd7734b 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1348,6 +1348,14 @@ bool smp_park_other_cpus_in_init(void)
>  	if (apic->wakeup_secondary_cpu_64 || apic->wakeup_secondary_cpu)
>  		return false;
>
> +	/*
> +	 * If this is a crash stop which does not execute on the boot CPU,
> +	 * then this cannot use the INIT mechanism because INIT to the boot
> +	 * CPU will reset the machine.
> +	 */
> +	if (this_cpu)
> +		return false;
> +
>  	for_each_present_cpu(cpu) {
>  		if (cpu == this_cpu)
>  			continue;

This patch does fix the problem of rebooting at panic, but the exported
stack stays at stop_this_cpu() like below, instead of showing what the
corresponding process is doing as before.

PID: 681      TASK: ffff9ac2429d3080  CPU: 2    COMMAND: "fsstress"
 #0 [ffffb00200184fd0] stop_this_cpu at ffffffff89a4ffd8
 #1 [ffffb00200184fe8] __sysvec_reboot at ffffffff89a94213
 #2 [ffffb00200184ff0] sysvec_reboot at ffffffff8aee7491
--- ---
    RIP: 0000000000000010  RSP: 0000000000000018  RFLAGS: ffffb00200f8bd08
    RAX: ffff9ac256fda9d8  RBX: 0000000009973a85  RCX: ffff9ac256fda078
    RDX: ffff9ac24416e300  RSI: ffff9ac256fda9e0  RDI: ffffffffffffffff
    RBP: ffff9ac2443a5f88   R8: 0000000000000000   R9: ffff9ac2422eeea0
    R10: ffff9ac256fda9d8  R11: 0000000000549921  R12: ffff9ac2422eeea0
    R13: ffff9ac251cd23c8  R14: ffff9ac24269a800  R15: ffff9ac251cd2150
    ORIG_RAX: ffffffff8a1719e4  CS: 0206  SS: ffffffff8a1719c8
bt: WARNING: possibly bogus exception frame

Do you know how this happened? I would be grateful if you could fix it.

Thanks!
-- 
With Best Regards,
Baokun Li