Date: Fri, 21 Jun 2019 21:33:36 +0200 (CEST)
From: Thomas Gleixner
To: Chris Wilson
cc: Linus Torvalds, Linux List Kernel Mailing, Steven Rostedt, Josh Poimboeuf, Joerg Roedel
Subject: Re: NMI hardlock stacktrace deadlock [was Re: Linux 5.2-rc5]

On Fri, 21 Jun 2019, Chris Wilson wrote:
> Quoting Thomas Gleixner (2019-06-21 16:30:52)
> > Chris, do you have the actual NMI lockup detector splats somewhere?
>
> Sorry, I'm having a hard time reproducing this at will now. The test
> case depends on the right timing of the wrong event to cause the GPU to
> hang.
>
> From memory, I got the
> "Watchdog detected hard LOCKUP on cpu foo"
> followed by the register dump and then nothing. At which point I had to
> power cycle the machine.

Hmm. Do you have a serial log of that incident?

Thanks,

	tglx