Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp2539808ybi; Mon, 1 Jul 2019 13:51:50 -0700 (PDT) X-Google-Smtp-Source: APXvYqwu5PhNVjrlmupNlmzQZxJW4/XCAo8VfIluwV2YjGzRvSwUqGhJcBnYtSFRfSVOU0q/kTaj X-Received: by 2002:a17:90a:9a83:: with SMTP id e3mr1285647pjp.105.1562014309941; Mon, 01 Jul 2019 13:51:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1562014309; cv=none; d=google.com; s=arc-20160816; b=THFEJXWVpzGFcJOPHYb5zD7a8ljY84mdT5OOFHJowToYFBz3ZTn2gKk9n0vSAShAr+ njOlf9xuCo9kbeHifkpq9WPVIL7PIIkKyHyTp13Cea1AmO/gnI1OrhtlOJcbs2gIMfi3 SJ9wsGUdAUFR6BocN29ybb6cSYgCJefot/p6D+8+GNZdRqTiFpppgPmh6scfSrXJAtcA pzwfl5Z5k2Wi1uLvTIM8V1Y4nD83zwZj+LtH4MsNrunvC3ck7HElUJkGtXlDv30I4Slk ZejuwhozE84IF+QlsU3aEWVdIIrv71fuxejFsVeekyIRfPpOjdNscmvPqK7mBu+kM1p+ qzwQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version; bh=wESivRaOzdReZ9r3dQBJ6+auCoe5qPpy86Hfp/fZ9E8=; b=LzJIoLj8QpvJg7S4MDI63xKCxkd2xdg/jex1MI+9tb/bTOYoBfLkB0nUSt+TMYjSXM f1UQrDjtW8HWc49q2UbHwNlLOSvu8s7Fb9q6jvrO3nsX58avFNU58cauBNUSP5N0FevM Fp48IUnq3ztF1etzLBRZYyJk3vSZ2nkbhA2WdOcx9MsKfsR6RbHgTbmi8BrRoLGMUy1Z AJOrpDicI/imtLJA/KwyJ3MtJA/yFvtT/qgaM8ePbpLRmJanOd9fjN8ZxbZpGMODB25J dmaenukgrSxvD1CadBgIMBIPHdNPu48MEZY0f5/oriP8+sss4DB6oplBtUwgI5IEG/s3 ZXHQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d12si10621748plo.68.2019.07.01.13.51.35; Mon, 01 Jul 2019 13:51:49 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726957AbfGAUvP (ORCPT + 99 others); Mon, 1 Jul 2019 16:51:15 -0400 Received: from mail-ua1-f67.google.com ([209.85.222.67]:36256 "EHLO mail-ua1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726664AbfGAUvP (ORCPT ); Mon, 1 Jul 2019 16:51:15 -0400 Received: by mail-ua1-f67.google.com with SMTP id v20so5626783uao.3 for ; Mon, 01 Jul 2019 13:51:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=wESivRaOzdReZ9r3dQBJ6+auCoe5qPpy86Hfp/fZ9E8=; b=E8qUnaRi86DOYe4ykNLKZsrbv0jdLVwI+pq8I2EsHYEs0CcBbovV1y0Hei5KDm2Fmz KmHtA8xJqJ/vwuMgStkXzR5exybnjqwJ0U017XcMZljSVrkorpLmhd/gwEnrWbBG1kq4 xaEWCeJQzSEXgr+vsfpwNtX9K4qT5eqLhACkUv7Sc4DxuoxAkeBHYXrhk/N4VR1SwZIg q3CfzDwMJTX+1EHGRMttwp/D6fSbZzJZChfSzouvgWW9ZiiGOLMyLm+9yiAmM/PDCN3c bW5auRal34ZjScoNNCcI0uPMPdlyhbHtV908sNtc+D0PYT0Y5SWeECSxjIt5+JSmAv10 m1RQ== X-Gm-Message-State: APjAAAV3g8WFMHb4Ti6Lm5P/3NEbJdysVDukxnx7kFOUbkzmUj8NLSwd Q5UHNCk09l+2Tm7/M9dkY1SxpxjeIYcO4Yv4euMqLQ== X-Received: by 2002:ab0:168a:: with SMTP id e10mr15443818uaf.87.1562014274511; Mon, 01 Jul 2019 13:51:14 -0700 (PDT) MIME-Version: 1.0 References: <20190626081522.GX24419@MiWiFi-R3L-srv> <20190626082907.GY24419@MiWiFi-R3L-srv> In-Reply-To: <20190626082907.GY24419@MiWiFi-R3L-srv> From: David Airlie Date: Tue, 2 Jul 2019 06:51:03 +1000 Message-ID: Subject: Re: mgag200 fails kdump kernel booting To: Baoquan He Cc: kexec@lists.infradead.org, x86@kernel.org, linux-kernel , dyoung@redhat.com, Lyude Paul Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jun 26, 2019 at 6:29 PM Baoquan He wrote: > > On 06/26/19 at 04:15pm, Baoquan He wrote: > > Hi Dave, > > > > We met an kdump kernel boot failure on a lenovo system. Kdump kernel > > failed to boot, but just reset to firmware to reboot system. And nothing > > is printed out. > > > > The machine is a big server, with 6T memory and many cpu, its graphic > > driver module is mgag200. > > > > When added 'earlyprintk=ttyS0' into kernel command line, it printed > > out only one line to console during kdump kernel booting: > > KASLR disabled: 'nokaslr' on cmdline. > > > > Then reset to firmware to reboot system. > > > > By further code debugging, the failure happened in > > arch/x86/boot/compressed/misc.c, during kernel decompressing stage. It's > > triggered by the vga printing. As you can see, in __putstr() of > > arch/x86/boot/compressed/misc.c, the code checks if earlyprintk= is > > specified, and print out to the target. And no matter if earlyprintk= is > > added or not, it will print to VGA. And printing to VGA caused it to > > reset to firmware. That's why we see nothing when didn't specify > > earlyprintk=, but see only one line of printing about the 'KASLR > > disabled'. > > Here I mean: > That's why we see nothing when didn't specify earlyprintk=, but see only > one line of printing about the 'KASLR disabled' message when > earlyprintk=ttyS0 added. Just to clarify, the original kernel is booted with mgag200 turned off, then kexec works, but if the original kernel loads mgag200, the kexec kernels resets hard when the VGA is used to write stuff out. This *might* be fixable in the controlled kexec case, but having an mgag200 shutdown path that tries to put the gpu back into a state where VGA doesn't die, but for the uncontrolled kexec it'll still be a problem, since once the gpu is up and running and VGA is disabled, it doesn't expect to see anymore VGA transactions. Dave. > > > > > To confirm it's caused by VGA printing, I blacklist the mgag200 by > > writting it into /etc/modprobe.d/blacklist.conf. The kdump kernel can > > boot up successfully. And add 'nomodeset' can also make it work. So it's > > for sure mgag driver or related code have something wrong when booting > > code tries to re-init it. > > > > This is the only case we ever see, tend to pursuit fix in mgag200 driver > > side. Any idea or suggestion? We have two machines to be able to > > reproduce it stablly. > > > > Thanks > > Baoquan